CSV import - Group, sort and finally Sum string values

Welcome Forums General PowerShell Q&A CSV import - Group, sort and finally Sum string values

This topic contains 11 replies, has 5 voices, and was last updated by

 
Participant
3 weeks, 5 days ago.

  • Author
    Posts
  • #130719

    Participant
    Points: 60
    Rank: Member

    Hi All,

    I am trying to minimise the time it takes me to monitor the amount time spent on certain tasks. I have a CSV file:
    Project,StartDate,StartTime,EndDate,EndTime,Duration
    Project 1,19122018,16:39:26,19122018,16:39:28,00:01.8
    Project 2,19122018,16:39:28,19122018,16:39:29,00:00.3
    Project 3,19122018,16:39:29,19122018,16:39:29,00:00.4
    Project 4,19122018,16:39:29,19122018,16:39:29,00:00.3
    Project 5,19122018,16:39:29,19122018,16:39:30,00:00.3
    The projects and dates will change week to week and are not known ahead of time. There may be multiple occurances of each project/date combination.

    I am trying to find the total duration of each project for each day.

    So far I am able to list each duration for each project/day but as they are strings I am struggling to generate a sum.

    $output = (Import-CSV -Path -Delimiter ",") | Group-Object -property StartDate,Project
    
    foreach ($name in $output)
    {
    write-host $name.name "," $name.Group.Duration
    }

    Ideally I would like
    StartDate,Project,Sum of Duration

    Thanks in advance for any assistance.

  • #131384

    Participant
    Points: 14
    Rank: Member

    If your CSV had a normal dat code it would be easier, but here is what I came up with based on your CSV

  • #131387

    Participant
    Points: 14
    Rank: Member
    $output = Import-CSV -path C:\TEMP\stuff.csv
    foreach ($project in $output)
    {
        $WTFStart = $project.StartDate.Insert(2,'-').Insert(5,'-')
        $Day = $WTFStart.Split('-')[0]
        $Month = $WTFStart.Split('-')[1]
        $Year = $WTFStart.Split('-')[2]
        $StartTime = $project.StartTime
        $StartDate = (Get-Date -Day $Day -Month $Month -Year $Year) + $StartTime
        $WTFEnd = $project.EndDate.Insert(2,'-').Insert(5,'-')
        $EndDay = $WTFEnd.Split('-')[0]
        $EndMonth = $WTFEnd.Split('-')[1]
        $EndYear = $WTFEnd.Split('-')[2]
        $EndTime = $project.EndTime
        $EndDate = (Get-Date -Day $EndDay -Month $EndMonth -Year $EndYear) + $EndTime
        $Elapsed = $EndDate - $StartDate
        $duration = [ordered]@{'StartDate'=$StartDate;
        'Name'=$project.Project;
        'Duration'= $Elapsed     
        }
    $obj = New-Object -TypeName PSObject -Property $duration
    Write-Output -InputObject $obj
    }
  • #131393

    Participant
    Points: 421
    Helping Hand
    Rank: Contributor

    Vern,
    obviously Rich posted this question three times ... ๐Ÿ˜‰

    https://powershell.org/forums/topic/grouping-sum-imported-csv-text/#post-130863

    https://powershell.org/forums/topic/grouping-sum-csv-content/#post-131100

    ... now he can choose the solution he likes. ๐Ÿ˜‰

  • #131394

    Participant
    Points: 192
    Helping Hand
    Rank: Participant

    You can use [TimeSpan]::ParseExact to convert the text...

    [TimeSpan]::ParseExact($duration, $format, $null)

    https://docs.microsoft.com/en-us/dotnet/api/system.timespan.parseexact
    https://docs.microsoft.com/en-us/dotnet/standard/base-types/custom-date-and-time-format-strings

    edit: Olaf was faster than me.

  • #131397

    Participant
    Points: 421
    Helping Hand
    Rank: Contributor

    You can use [TimeSpan]::ParseExact to convert the text..

    .... unfortunately the duration Rich showed in his CSV data has an "unconventional" format. Do you know a format string we could use to use with [TimeSpan]::ParseExact ?

  • #131399

    Participant
    Points: 192
    Helping Hand
    Rank: Participant

    The last one might be "f: The tenths of a second"...

    [TimeSpan]::ParseExact($duration, 'mm\:ss\.f', $null)

    So no more than an hour work per entry.

  • #131427

    Participant
    Points: 421
    Helping Hand
    Rank: Contributor

    Wow ... cool ... thanks a lot. I hope Rich will see this. That makes his task much easier.

    Happy holidays! ๐Ÿ˜‰

  • #131430

    Participant
    Points: 192
    Helping Hand
    Rank: Participant

    You too ๐Ÿ™‚

    I'm thinking it might be a bug in the csv... Only an hour per entry?

  • #131480

    Participant
    Points: 14
    Rank: Member

    Thanks Olaf, I was unaware of the other 2, this is the only one Don Jones tweeted about though.

  • #131468

    Participant
    Points: 9
    Rank: Member

    Hi, perhaps I've got a bit carried away, but you can try this:

    < #
    .Synopsis
       Aggregate project activity durations.
    .DESCRIPTION
       The script measures activity durations, based on a CSV time sheet entries.
    .EXAMPLE
       Measure-ProjectActivityDuration -Path .\TimeSheet.csv
    .EXAMPLE
       Measure-ProjectActivityDuration -TimeSheet .\TimeSheet_CurrentWeek.csv -Delimiter ';'
    .INPUTS
       [System.String]
    .OUTPUTS
       [System.Management.Automation.PSCustomObject]
    .NOTES
       In case the total duration of activities for a project exceeds 24 hours, the number of days will be set as a dot-delimited prefix to the HH:mm:ss entry, as shown below:
       TotalDuration StartDate Project
    
       ------------- --------- -------
       1.04:41:53    19122018  Project 2
       02:00:06      19122018  Project 1
       00:00:01      19122018  Project 5
       00:00:00      19122018  Project 4
       00:15:00      19122018  Project 3
    
    .COMPONENT
       The component this cmdlet belongs to
    .ROLE
       The role this cmdlet belongs to
    .FUNCTIONALITY
       The functionality that best describes this cmdlet
    
    #>
    
    [CmdletBinding()]
    [Alias()]
    [OutputType([System.Management.Automation.PSCustomObject])]
    
    Param
    (
        # Path to the timesheet file.
        [Parameter(Mandatory = $true, Position = 0)]
        [ValidateNotNull()]
        [ValidateNotNullOrEmpty()]
        [Alias("TimeSheet")]
        [String]$Path,
    
        # Symbol used to delimit the columns in the CSV file.
        [Parameter(Mandatory = $false, Position = 1)]
        [ValidateNotNull()]
        [ValidateNotNullOrEmpty()]
        [String]$Delimiter = ','
    )
    
    Begin {
        # Import and sort the time sheet.
        $timeSheet = Import-Csv -LiteralPath $Path -Delimiter $Delimiter | Sort-Object StartDate, StartTime;
        # Create and empty duration table.
        [System.Collections.ArrayList]$durationTable = @();
    }
    Process {
        foreach ($entry in $timeSheet) {
            # Create an entry in the duration table if the project has not been listed there yet.
            if ([bool]($durationTable | Where-Object {$_.Project -eq $entry.Project}) -eq $false) {
                $tableEntry = New-Object -TypeName PSCustomObject -Property @{Project = $entry.Project; StartDate = $entry.StartDate; TotalDuration = New-TimeSpan};
                $durationTable.Add($tableEntry) | Out-Null;
            }
            # Calculate the duration of the activities for the current entry.
            $startTime = [datetime]::ParseExact("$($entry.StartDate) $($entry.StartTime)", "ddMMyyyy HH:mm:ss", $null);
            $endTime = [datetime]::ParseExact("$($entry.EndDate) $($entry.EndTime)", "ddMMyyyy HH:mm:ss", $null);
            $activityDuration = New-TimeSpan -Start $startTime -End $endTime;
    
            # Add the duration for the current entry to the total duration for the project.
            $projectEntry = $durationTable | Where-Object {$_.Project -eq $entry.Project};
            $projectEntry.TotalDuration += $activityDuration;
        }
    }
    End {
        Return $durationTable;
    }

     

  • #131666

    Participant
    Points: 192
    Helping Hand
    Rank: Participant

    Example of using the duration column

    $csvfile = 'Project,StartDate,StartTime,EndDate,EndTime,Duration
    Project 1,19122018,16:39:26,19122018,16:39:28,00:01.8
    Project 2,19122018,16:39:28,19122018,16:39:29,00:00.3
    Project 3,19122018,16:39:29,19122018,16:39:29,00:00.4
    Project 4,19122018,16:39:29,19122018,16:39:29,00:00.3
    Project 1,22122018,16:39:26,19122018,16:39:28,00:03.8
    Project 1,24122018,16:39:26,19122018,16:39:28,00:11.1
    Project 5,19122018,16:39:29,19122018,16:39:30,00:00.3'
    
    $csv = ConvertFrom-Csv $csvfile
    
    # parse information
    $durationformat = 'mm\:ss\.f'
    
    # group by project
    $grouped = $csv | Group-Object -Property Project
    
    $result = foreach($prj in $grouped) {
        $sum = [TimeSpan]::new(0)
    
        # process all durations for this project
        foreach($entry in $prj.Group) {
            $sum += [TimeSpan]::ParseExact($entry.Duration, $durationformat, $null)
        }
    
        [pscustomobject]@{
            # !! needs to sort on startdate, if entries are not in order
            StartDate = $prj.Group[0].StartDate
            Project = $prj.Name
            'Sum of Duration' = $sum.ToString($durationformat)
        }
    }
    
    $result | ConvertTo-Csv -NoTypeInformation

You must be logged in to reply to this topic.