Remove entire rows from csv with same value under specific column heading.

Welcome Forums General PowerShell Q&A Remove entire rows from csv with same value under specific column heading.

This topic contains 11 replies, has 4 voices, and was last updated by

 
Participant
1 month, 2 weeks ago.

  • Author
    Posts
  • #169642

    Participant
    Topics: 1
    Replies: 5
    Points: -18
    Rank: Member

    Hi folks,

    I hope someone will be able to help me with this.

    I have a csv file called 'original.csv'. Within this file there are around 1000 rows of data. The issue I have is that there are many duplicate rows in this file whereby some rows are exactly the same as other rows except one column value being different – this column value (Col2) has a different time value but all other values in each of the columns are the same. Please see the example below:

     Col1,Col2,Col3,Col4,Col5,Col6

    1,11:56,3,4,Smith,6

    1,11:57,3,4, Smith,6

    1,11:58,3,4, Smith,6

    I'd like to keep the most recent value (11.58) and export this row and only this row to a separate csv file based on the value of Col5 (Smith). So, I need to have the most recent row for each surname (Smith). At the moment I have multiple rows with the same values whereby only the time value for Col2 differs. I only want one record per surname and this record has to be the most recent.  

    I have looked into input.csv, export csv, compare, sort and select but cannot get to the bottom of this. Any help would be greatly appreciated. My current csv file also has a header row at the top of the file.

     

    Thanks in advance

    Tess

  • #169651

    Participant
    Topics: 9
    Replies: 423
    Points: 676
    Helping Hand
    Rank: Major Contributor
    #region Input
    
    $CSV = '.\CSV1.csv'
    @'
    Col1,Col2,Col3,Col4,Col5,Col6
    1,11:56,3,4,Smith,6
    1,11:58,3,4, Smith,6
    1,11:57,3,4, Smith,6
    1,10:57,3,4, Tom,6
    1,10:58,3,4, Tom,6
    1,10:56,3,4,Tom,6
    '@ | Out-File $CSV
    
    #endregion
    
    #region read and View CSV
    
    $myData = Import-Csv $CSV 
    $myData | FT -a 
    < #
    Col1 Col2  Col3 Col4 Col5  Col6
    ---- ----  ---- ---- ----  ----
    1    11:56 3    4    Smith 6   
    1    11:57 3    4    Smith 6   
    1    11:58 3    4    Smith 6   
    1    10:56 3    4    Tom   6   
    1    10:57 3    4    Tom   6   
    1    10:58 3    4    Tom   6    
    #>
    
    #endregion
    
    #region Process
    
    $SurNameList = $myData.Col5 | select -Unique
    $SurNameList
    < #
    Smith
    Tom
    #>
    
    $myOutput = foreach ($SurName in $SurNameList) {
        ($myData | where Col5 -EQ $SurName | sort Col2)[-1] 
        # 'Where' selects the rows belonging to the current SurName in the loop
        # 'sort' sorts the resulting rows by Col2
        # [-1] selects the last member of this array
    }
    
    #endregion
    
    #region Output
    
    $myOutput | FT -a 
    < #
    Col1 Col2  Col3 Col4 Col5  Col6
    ---- ----  ---- ---- ----  ----
    1    11:58 3    4    Smith 6   
    1    10:58 3    4    Tom   6   
    #>
    
    #endregion
    
  • #169654

    Participant
    Topics: 9
    Replies: 423
    Points: 676
    Helping Hand
    Rank: Major Contributor

    less verbose..

    $CSV = '.\CSV1.csv'
    @'
    Col1,Col2,Col3,Col4,Col5,Col6
    1,11:56,3,4,Smith,6
    1,11:58,3,4, Smith,6
    1,11:57,3,4, Smith,6
    1,10:57,3,4, Tom,6
    1,10:58,3,4, Tom,6
    1,10:56,3,4,Tom,6
    '@ | Out-File $CSV
    
    
    
    $myData = Import-Csv $CSV 
    
    $myOutput = foreach ($SurName in ($myData.Col5 | select -Unique)) {
        ($myData | where Col5 -EQ $SurName | sort Col2)[-1] 
    }
    
    $myOutput | FT -a 
    
  • #169663

    Participant
    Topics: 1
    Replies: 5
    Points: -18
    Rank: Member

    Many thanks Sam for your speedy and very helpful reply. I already have a formatted csv file and I was therefore wondering how I would export the two lines in your filtered example to a new csv file. I'm trying to get to grips with PowerShell and my apologies if this seems very obvious. Also what does the | FT -a do?

    Thanks in advance.

  • #169669

    Participant
    Topics: 9
    Replies: 423
    Points: 676
    Helping Hand
    Rank: Major Contributor

    FT -a is short for Format-Table -Autosize
    You would export the output to CSV by piping it to Export-Csv

  • #169672

    Participant
    Topics: 1
    Replies: 5
    Points: -18
    Rank: Member

    okay thanks, this is what I have at present:

    I have changed the variable names to better suit their purpose:

    
    

    $CleanLog= "C:\_logs\CleanAppLog.csv"

    $myData = Import-Csv $CleanLog

    $myOutput = foreach ($recordRow in ($myData.HostIP | select -Unique)) {
    ($myData | where HostIP -EQ $recordRow | sort UserIdentifier)[-1] | Export-CSV $CleanLog
    # 'Where' selects the rows belonging to the current HostIP in the loop
    # 'sort' sorts the resulting rows by UserIdentifier column heading
    # [-1] selects the last member of this array
    }

    Please note that I have now changed Surname to HostIP and Col5 to UserIdentifier.

    Thanks again.

     

  • #169675

    Participant
    Topics: 1
    Replies: 5
    Points: -18
    Rank: Member

    I know that I'm currently overwriting the source file with the updated file but it would be better to create a completely new file using export csv rather than overwriting the source. Thanks

  • #169687

    Participant
    Topics: 1
    Replies: 1530
    Points: 2,591
    Helping Hand
    Rank: Community Hero

    Feel free to change the code suggestion you got here according to your needs. There is no copyright on it. 😉

  • #169693

    Participant
    Topics: 1
    Replies: 5
    Points: -18
    Rank: Member

    thanks, do you have any ideas how to solve this?

  • #169708

    Participant
    Topics: 1
    Replies: 1530
    Points: 2,591
    Helping Hand
    Rank: Community Hero

    ... do you have any ideas how to solve this?

    Of course. Take a look at Sams last example code ... instead of piping the result to FT (Format-Table) you could pipe it to Export-CSV.

    BTW: He already recommended that earlier on. 😉

  • #169801

    Participant
    Topics: 1
    Replies: 5
    Points: -18
    Rank: Member

    Thanks for the reply. I have tried piping it using export-csv to variable path $CleanLog (the code is located a few messages up) instead of a table but the output file does not contain the expected filtered/parsed data. This got me thinking that perhaps the export-csv was not the best/only way to go.

  • #169837

    Participant
    Topics: 1
    Replies: 3
    Points: 31
    Rank: Member

    If export-csv is giving you row counts or row numbers instead of your data you might need to add -NoTypeInformation, you could also tee-object or out-file depending on your script and it's purpose.

You must be logged in to reply to this topic.