Manipulating imported CSV data points for new CSV export

This topic contains 4 replies, has 4 voices, and was last updated by Profile photo of Max Kozlov Max Kozlov 2 months, 1 week ago.

  • Author
    Posts
  • #66321
    Profile photo of alwayslearning
    alwayslearning
    Participant

    As a newbie to manipulating CSV files within PowerShell I need a foot up with the following problem.

    stats.csv contains data for a time series of two objects A,B. I need to create a report referencing some values from this and also need to calculate new values from the data it contains.

    stats.csv available here:

    The first value on each row is the time series (PT). There are two rows for each time event in the file, one for each object (ON) A and B. The second value for each row is the object (ON). As you can see if you scroll through the CSV, there will be two entries for each PT – one for ON A and another for ON B. Each ON has it's own unique data for values for the PT for TT,RIO,TB,RS,WS,RT that need to be calculated or passed directly to a new CSV.

    The following calculations are needed for the report (excuse the crudeness below):

    $total_tt = sum TT where PT value is the same for ON=A And ON=B
    e.g. for the same PT entries @ 02/06/2017 15:45
    $total_tt would be 12804.49537

    $total_tb = sum TB where PT value is the same for ON=A And ON=B
    e.g. for the same PT entries @ 02/06/2017 15:45
    $total_tb would be 180.226005

    $avg_rp = (if $total_tt>0(sum RIO where same PT for ON=A and=B)/$total_tt)
    e.g. for PT entries @ 02/06/2017 15:45
    $avg_rp would be 53.81826021 (percentage)

    $avg_rs = (if sum of RT from same PT value for ON=A And ON=B>0(RIO*RS for ON=A+RIO*RS where ON=B)/sum of RIO for A and B)
    e.g. for PT entries @ 02/06/2017 15:45
    $avg_rs would be 9.944376086

    $avg_ws = within the same TS(if $total_tt-RIOA-RIOB>0((TTA-RIOA)*WSA+(TTB-RIOB)*WSB/(RIOA-RIOB)
    e.g. for PT entries @ 02/06/2017 15:45
    $avg_ws would be 17.33865279

    $avg_rt = average RT where same PT value is observed for ON=A And ON=B
    e.g. for PT entries @ 02/06/2017 15:45
    $avg_rt would be 0.293554

    Create a 1st output file: report_detail.csv with the following values (some newly calculated, some from the original csv):

    PT,$total_tt,$total_tb,$avg_rp,$avg_rs,$avg_ws,$avg_rt,RT(ON:A from stats.csv),RT(ON:B from stats.csv)

    Next, return the row numbers from report_detail.csv that contain the maximum and 95th percentile value for $total_tt and send them to report_headlines.csv with first row headers for the csv:

    measurement,PT,$total_tt,$total_tb,$avg_rp,$avg_rs,$avg_ws,$avg_rt,RT(ON:A from original CSV),RT(ON:B from original CSV).

    Any help is appreciated to get me started.

  • #66330
    Profile photo of Olaf Soyk
    Olaf Soyk
    Participant

    You could start with "reading" your csv file with Import-CSV. The you iterate through all lines with a Foreach loop and calculate whatever you need. In the end you can output all this to the console or export it to a new csv with Export-CSV

    BTW: A few lines of your csv file would have been enough to show what it's about! 😉

    • #66331
      Profile photo of alwayslearning
      alwayslearning
      Participant

      Hi. No issues. I can trim back the csv 😉

      Would you be able to show me a working example of what you suggest of what I'm asking for with $total_tt? If I get a foot up I can work the remainder.

  • #66340
    Profile photo of Sam Boutros
    Sam Boutros
    Participant

    Here you go:

    # region Input
    $Data = Import-Csv .\stats.csv 
    #endegion
    
    
    #region Process
    $A = ($Data | group ON | ? { $_.Name -eq 'A' }).Group
    $B = ($Data | group ON | ? { $_.Name -eq 'B' }).Group
    
    $Analysis = foreach ($ONA in $A) { 
        Remove-Variable ONB -ErrorAction SilentlyContinue
        $ONB = $B | ? { $_.PT -eq $ONA.PT }
        if ($ONB) {
            $SumTT = [decimal]$ONA.TT + [decimal]$ONB.TT
            $SumRT = [decimal]$ONA.RT + [decimal]$ONB.RT
            New-Object -TypeName PSObject -Property @{
                PT         = [DateTime]$ONA.PT
                RTA        = [decimal]$ONA.RT
                RTB        = [decimal]$ONB.RT
                SumTT      = $SumTT
                SumTB      = [decimal]$ONA.TB + [decimal]$ONB.TB
                AvgRB      = $(
                    if ($SumTT -gt 0) {
                        ([decimal]$ONA.RIO + [decimal]$ONB.RIO)/$SumTT
                    } else { 0 }
                )
                'AvgRB(%)' = $(
                    if ($SumTT -gt 0) {
                        '{0:P2}' -f (([decimal]$ONA.RIO + [decimal]$ONB.RIO)/$SumTT) 
                    } else { 0 }
                )
                AvgRS      = $(
                    if ($SumRT -gt 0) {
                        (([decimal]$ONA.RIO * [decimal]$ONA.RS) + ([decimal]$ONB.RIO * [decimal]$ONB.RS)) /
                        ([decimal]$ONA.RIO * [decimal]$ONB.RIO) * 100                  
                    } else { 0 }
                )
                AvgWS      = $(
                    if ($SumTT - ([decimal]$ONA.RIO + [decimal]$ONB.RIO)) {
                        (([decimal]$ONA.TT-[decimal]$ONA.RIO)*[decimal]$ONA.WS + 
                            ([decimal]$ONB.TT-[decimal]$ONB.RIO)*[decimal]$ONB.WS) / 
                                ([decimal]$ONA.RIO-[decimal]$ONB.RIO)
                    } else { 0 }
                )
                AvgRT      = ([decimal]$ONA.RT+[decimal]$ONB.RT)/2
            } 
        }
    } 
    $Analysis = $Analysis | Sort SumTT -Descending 
    #endregion
    
    
    #region Output
    $Analysis | select PT, SumTT, SumTB, AvgRB, AvgRS, AvgWS, AvgRT, RTA, RTB | 
        Export-Csv .\report_detail.csv -NoType -Force
    
    $Index95 = [math]::ceiling($Analysis.Count * 0.95)
    $95PercentileValue = ($Analysis | Sort SumTT)[$Index95]
    $Maximum = $Analysis | select -First 1 
    $Maximum | Add-Member -MemberType NoteProperty -Name 'Measurement' -Value 'Maximum'
    $Maximum | select Measurement, PT, SumTT, SumTB, AvgRB, AvgRS, AvgWS, AvgRT, RTA, RTB | 
        Export-Csv .\report_headlines.csv -NoType -Force 
    $95PercentileValue | Add-Member -MemberType NoteProperty -Name 'Measurement' -Value '95Percentile'
    $95PercentileValue | select Measurement, PT, SumTT, SumTB, AvgRB, AvgRS, AvgWS, AvgRT, RTA, RTB | 
        Export-Csv .\report_headlines.csv -NoType -Append
    #endregion
    

    You owe me $50..lol

  • #66345
    Profile photo of Max Kozlov
    Max Kozlov
    Participant

    I think than grouping will be much more effective if
    instead of grouping by ON we directly group by PT

    $Data = Import-Csv D:\stats.csv | Group-Object -Property PT | Foreach-Object {
      $a = $_.Group | Where-Object { $_.ON -eq 'A' }
      $b = $_.Group | Where-Object { $_.ON -eq 'B' }
      [PSCustomObject]@{
         PT = $_.Name
         A = $a
         B = $b
      }
      #manipulate data 1
    }
    #manipulate data 2
    

You must be logged in to reply to this topic.