Help with parsing JSON data efficiently


This topic contains 10 replies, has 6 voices, and was last updated 1 month ago.

  • #120775

    Participant
    Points: 54
    Rank: Member

    Hey!

    I have written a function that writes information to a logfile in JSON format. Each log entry has lots of information that could help the person reading the logfile. For instance:

    {
      "id": 1,
      "date": "the date",
      "tags": [
        "tag1",
        "tag2",
        "tag3"
      ],
      "Message": "Creating user xyz",
      "CmdletUsed": "Get-ADUser -Identity xyz",
      "Module": "ActiveDirectory",
      "CmdletToReverseChange": "Remove-ADUser -Identity xyz"
    }

    So I call "Write-LogFileJson" whenever I want to log information to the logfile. The reason I am using JSON is that it is easier for someone else to read the information.

    The problem I have is that it is incredibly inefficient to read the entire JSON file just to append one new log entry (obviously).

    Is there a more efficient way to do this? I mean a way to append to the JSON file without reading it first every time the function is called.

    I don't want the script to slow down just because the logfile happens to be really large. I could use runspaces/jobs for that, but then I would most likely run into locked-file issues.

    I just need some ideas on how to do this more efficiently. Maybe I need to rethink the whole concept.

  • #120790

    Participant
    Points: 190
    Helping Hand
    Rank: Participant

    As far as I know, no.
    If you google it, the same question pops up no matter the language (Java, Python, C#, etc.), and the answer is always the same:
    load the file back into objects, add whatever object you have created, and convert back to JSON.

    I guess you could try appending directly, but you would constantly need to add an extra ',' to the last/previous entry, and you would need to check for the closing ']' and '}' and so forth.
    It might work, but most likely you'll break the formatting sooner or later.

    I would log to CSV or a custom Windows event log rather than JSON.
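    A minimal sketch of the CSV idea (the function name Write-LogFileCsv and the fields here are illustrative, not from the original function). The nice property is that Export-Csv -Append only writes the new row; the existing file is never read back in full:

    ```powershell
    # Illustrative appendable CSV logger; names and fields are assumptions.
    function Write-LogFileCsv {
        param(
            [string]$Path,
            [string]$Message
        )

        [pscustomobject]@{
            Date    = (Get-Date).ToString('s')
            Message = $Message
        } | Export-Csv -Path $Path -Append -NoTypeInformation
    }

    Write-LogFileCsv -Path .\log.csv -Message 'Creating user xyz'
    ```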

  • #120795

    Participant
    Points: 265
    Helping Hand
    Rank: Contributor

    CSV will be tricky with that array property. You may need to reconsider what needs to be stored and how to represent it in the data file if you want to have a readily-appendable file.

  • #120828

    Participant
    Points: 303
    Helping Hand
    Rank: Contributor

    You could also consider using a database rather than a file to centralize the logs; then you are just inserting a record.

    https://www.quackit.com/json/tutorial/list_of_json_databases.cfm
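    As a rough sketch of what "just inserting a record" could look like, assuming a SQL Server instance, the SqlServer module, and a pre-existing dbo.ScriptLog table (all names here are illustrative, not from the thread):

    ```powershell
    # Illustrative only: server, database, and table names are assumptions.
    Import-Module SqlServer

    $message = 'Creating user xyz'
    $tags    = @('tag1', 'tag2') -join ';'

    # For production code, prefer parameterized queries over string interpolation.
    $query = "INSERT INTO dbo.ScriptLog (LogDate, Message, Tags)
              VALUES (SYSDATETIME(), '$message', '$tags');"

    Invoke-Sqlcmd -ServerInstance 'sql01' -Database 'Ops' -Query $query
    ```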

  • #120840

    Participant
    Points: 54
    Rank: Member

    Yeah, CSV will be tricky. The data in my post is just an example; there will be more properties. The end goal is to centralize the logs for ease of management as well, so the event log approach could become cumbersome (unless we forward the logs to some central log system).

    I have considered the database approach, but I was going for simplicity (where I am working currently, the knowledge of these things is quite low). I suppose I could also write some type of wrapper that displays these logs in a readable way.

    Thanks for all the suggestions! I have some thinking to do 🙂

  • #120858

    Participant
    Points: 190
    Helping Hand
    Rank: Participant

    Yes, if you go the CSV route the log format would need to be reconsidered.
    On the other hand, it's appendable and easy to import in either PowerShell or e.g. Excel.
    You could use some other delimiter for the array, e.g. ';', so that the values stay in one column.
    Maybe not the prettiest solution for the array, but it might work.
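    That delimiter idea could look something like this (property names are just examples):

    ```powershell
    # Flatten the array into one CSV cell on the way out...
    [pscustomobject]@{
        Message = 'Creating user xyz'
        Tags    = @('tag1', 'tag2', 'tag3') -join ';'
    } | Export-Csv -Path .\log.csv -Append -NoTypeInformation

    # ...and split it back into an array on the way in.
    Import-Csv .\log.csv | ForEach-Object {
        $_.Tags = $_.Tags -split ';'
        $_
    }
    ```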

  • #120895

    Participant
    Points: 306
    Helping Hand
    Rank: Contributor

    As for this...

    I have written a function that writes information to a logfile in JSON format.

    … any reason why you chose a text file vs. creating a custom event log with the same info and granting access to whoever needs to read it from there? That way it's still just your text, in a central location.

    Just curious.

    Articles and Examples:

    Powershell Script To Create Custom Event Logs and Register App Sources

    This script creates a custom event log and registers a source in this log. If the log already exists, the script just registers the source to it. This allows custom code events to be logged to this custom log. To run the script, just change the path of the config file in the …

    https://gallery.technet.microsoft.com/Powershell-Script-To-b07a6e9d

    https://blogs.technet.microsoft.com/heyscriptingguy/2013/02/01/use-powershell-to-create-and-to-use-a-new-event-log

    https://blogs.msdn.microsoft.com/kenobonn/2014/03/29/create-event-log-sources-using-powershell

    https://blogs.technet.microsoft.com/heyscriptingguy/2013/06/20/how-to-use-powershell-to-write-to-event-logs
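    For reference, the basic pattern from those articles looks like this (the log and source names are made up; creating the log requires an elevated session the first time):

    ```powershell
    # Create the custom log/source once (elevated session required).
    if (-not [System.Diagnostics.EventLog]::SourceExists('MyScripts')) {
        New-EventLog -LogName 'MyScriptLog' -Source 'MyScripts'
    }

    # Append a log entry; the event log handles locking and retention for you.
    Write-EventLog -LogName 'MyScriptLog' -Source 'MyScripts' -EventId 1000 `
        -EntryType Information -Message 'Creating user xyz'

    # Read entries back.
    Get-EventLog -LogName 'MyScriptLog' -Newest 10
    ```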

    • #121809

      Participant
      Points: 54
      Rank: Member

      Hi postanote,

      Thanks for your reply.

      … any reason why you chose a text file, vs creating a custom event log with the same info, and granting access for whoever needs to read it from there, that way, it's still just you text, in a central location?

      The reason I am using a text file is that it is easier to consume and manage right now. An event log usually requires someone to have some sort of permissions on the machine where the events are written, which is not something we would want in all cases.

      When it comes to centralizing event logs, you usually need software that can parse the logs and display them to the user. That is the end goal, but we are not there yet, so centralizing logfiles is much easier.

  • #121822

    Participant
    Points: 190
    Helping Hand
    Rank: Participant

    Come to think of it, you could use YAML instead of JSON.
    If you want to convert it to objects to work with in PowerShell, there are modules in the PSGallery to do it.
    There are three modules, and I tested one of them (powershell-yaml).
    That module isn't great at ConvertTo-Yaml followed by writing to a file,
    since it creates output that doesn't round-trip through ConvertFrom-Yaml.
    It has a switch to output a 'JsonCompatible' format, but that's basically one long string with '\r\n' mixed into it.

    Perhaps one of the other modules is better at this, but as long as you don't need to import, manipulate, and then export, it should be fine.

    But the standard YAML syntax works, so if you just append to the file in that format it should be pretty good.
    E.g. from the Ansible documentation:
    E.g. from the Ansible documents.

    $yaml = @"
    -  martin:
        name: Martin D'vloper
        job: Developer
        skills:
          - python
          - perl
          - pascal
    -  tabitha:
        name: Tabitha Bitumen
        job: Developer
        skills:
          - lisp
          - fortran
          - erlang
    "@
    
    $obj = $yaml | ConvertFrom-Yaml # the powershell-yaml module needs to be installed
    

    The only gotcha if you use that format in the file is that you need to use the -Raw switch with Get-Content when you import it.
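    For the logging case, an append could then be as simple as this sketch (field names are illustrative; Add-Content only ever appends, so the existing file is never read before writing):

    ```powershell
    # Append one YAML list item per log entry; no read-before-write needed.
    $entry = @(
        "- id: 2"
        "  date: $(Get-Date -Format s)"
        "  message: Creating user xyz"
        "  tags:"
        "    - tag1"
        "    - tag2"
    )
    Add-Content -Path .\log.yaml -Value $entry

    # Later (with the powershell-yaml module installed):
    # Get-Content .\log.yaml -Raw | ConvertFrom-Yaml
    ```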

  • #121827

    Participant
    Points: 190
    Helping Hand
    Rank: Participant

    I have no idea where my post went, but I can't see it anymore; maybe it will turn up 🙂

    Anyway, the main idea is that you could use YAML instead of JSON.
    It should be easier to append information in that format.
    There is no native parser for it in PowerShell, but there are modules in the PSGallery.

  • #121963

    Participant
    Points: 60
    Rank: Member

    I guess there are quite a few things to look at here before even getting to the code:

    1) Management of the information
    At some point the data in the logfile is going to go stale unless there is some form of removal/archival. How do you plan to keep the information relevant and stop the file from ballooning in size?

    2) Access to the information
    Are the log files going to be stored on a central share, or locally on each system? Centralization wherever possible is best.

    3) Presentation of the information
    Are you talking about the user just opening the logfile and doing a search? The reason I'm asking is that if you detach it into a separate presentation layer, e.g. via web services, you are not locked in so much if you have to change the code at a later date. It also makes it feasible to have the script use HTTP requests for reading, searching, and writing information, which is quite simple to do and much more efficient and scalable.

    If you just want to append data to a JSON file, I guess there are a couple of options:

    1. Read it as a text file without converting from JSON (i.e. Get-Content), drop the closing bracket so you can append the new entry, then re-insert the closing bracket. It's not ideal, but you're still just reading a text file, without any other overhead like converting to objects. A bit of regex should make that relatively simple.
    2. Go one step further and use filestreams to read and write the data to a new file. Append the new entry, then write the closing brackets. Once complete, remove/archive the old file and rename the new file to the standard name you are using.
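    A filestream variant of option 1 can even rewrite just the tail of the existing file in place, overwriting the closing ']' instead of copying everything. This is a hedged sketch, not a hardened implementation: it assumes a UTF-8 file that ends with ']' and already contains at least one entry (the very first entry would need the leading comma skipped):

    ```powershell
    # Sketch: append $newEntry inside the final ']' of a JSON array
    # without parsing the whole file. Entry content is illustrative.
    $path     = '.\log.json'
    $newEntry = [pscustomobject]@{ id = 2; Message = 'Removing user xyz' }
    $json     = $newEntry | ConvertTo-Json -Compress

    $fs = [System.IO.FileStream]::new($path, [System.IO.FileMode]::Open,
                                      [System.IO.FileAccess]::ReadWrite)
    try {
        # Scan backwards for the final ']' so trailing whitespace is tolerated.
        $pos = $fs.Length - 1
        while ($pos -ge 0) {
            $fs.Position = $pos
            if ([char]$fs.ReadByte() -eq ']') { break }
            $pos--
        }

        # Overwrite from the bracket onwards with ",<entry>]".
        $fs.Position = $pos
        $bytes = [System.Text.Encoding]::UTF8.GetBytes(",`n$json`n]")
        $fs.Write($bytes, 0, $bytes.Length)
        $fs.SetLength($fs.Position)   # trim anything left over past the new ']'
    }
    finally {
        $fs.Dispose()
    }
    ```

    The cost of an append then stays constant no matter how large the logfile grows, which addresses the original concern about reading the whole file each time.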
