Sorting Table, Results Not As Expected

This topic contains 0 replies, has 1 voice, and was last updated by Profile photo of Forums Archives Forums Archives 5 years, 5 months ago.

  • Author
    Posts
  • #5197

    by qjudge at 2012-10-16 06:56:31

    Hi All,

    I am trying to use the following code to import a .csv file as a table, then sort the table numerically:

    $ResultsTable = Import-CSV 'C:\Statistics.csv'
    $SortedTable = $ResultsTable | Sort-Object Occurrences

    The .csv file contains two columns, with the titles 'Results' and 'Occurences' in the first row. The code "functions', but the sorting is not correct. Output of $SortedTable shows:

    Results———-Occurrences
    45—————-10
    26—————-11
    28—————-2
    22—————-3

    ...and so on. How do I get the table to be sorted by the actual number in the 'Occurrences' column?

    Thanks!

    Quinn

    by Makovec at 2012-10-16 07:45:50

    Hi,
    the reason is that data from your file are loaded as string and therefore sorted as text. It's visible when you import CSV and send it to Get-Member:
    PS> Import-Csv .\statistics.csv | Get-Member

    TypeName: System.Management.Automation.PSCustomObject

    Name MemberType Definition
    ---- ---------- ----------
    Equals Method bool Equals(System.Object obj)
    GetHashCode Method int GetHashCode()
    GetType Method type GetType()
    ToString Method string ToString()
    Occurences NoteProperty System.String Occurences=10
    Results NoteProperty System.String Results=45

    See the last two records, both are Strings. There are more ways to do what you want, first I tried is this one:
    PS> Import-Csv .\statistics.csv | Select-Object @{l='Results';e={[int]$_.Results}},@{l='Occurences';e={[int]$_.Occurences}} |Sort Occurences

    I used calculated properties and converted Strings to numbers. As you see in my definition, I am taking Occurences (and Results) columns and using [int] converting it to number. Then it's passed to Sort-Object which then sort it as numbers. You can check it this way (if running Get-Member against result):

    TypeName: Selected.System.Management.Automation.PSCustomObject

    Name MemberType Definition
    ---- ---------- ----------
    Equals Method bool Equals(System.Object obj)
    GetHashCode Method int GetHashCode()
    GetType Method type GetType()
    ToString Method string ToString()
    Occurences NoteProperty System.Int32 Occurences=10
    Results NoteProperty System.Int32 Results=45

    Occurences and Results are now Int32 (numbers). There is an article about calculated properties here: http://technet.microsoft.com/en-us/library/ff730948.aspx

    David

    by qjudge at 2012-10-16 08:12:52

    David,

    That worked perfectly, thank you!

    I understand what '[int]' means, but I'm still a little foggy on what exactly is happening in the 'Select-Object' section of code in your example. Could you explain that more (or point me to a webpage)? Still trying to 'un-noob' myself...

    Thanks a bunch either way!

    Quinn

    by Makovec at 2012-10-16 15:52:07

    Good to hear 🙂 Sure, that Select-Object thing is a bit messy anyway 😉

    Let's have the same example but just with one column to have it shorter. This is input file input.csv:

    Number
    1
    5
    2
    10
    3

    so when you import it to PowerShell you'll receive this:
    PS> Import-Csv .\input.csv
    Number
    ------
    1
    5
    2
    10
    3

    Let's say you want to create completely new column based on that data (real world example should be converting kB to GB, file size to MB, etc.) I want to have second column showing power of given number. So I need these calculated properties for that. It has this pattern:

    @{ Label='Label name'; Expression={} }
    which is a hash table containing new column name (Label) and value (Expression). Label is just easy text – name of newly created property, but the whole thing happen in Expression part. In our case, I'd like to just write something like "Number*Number". If I would need this in console I'll write:
    PS> $number * $number
    and it's exactly (OK – almost 🙂 the same I can put to my Expression part. As I am working in pipeline I can access that Number property using $_ placeholder, so should be:
    $_.Number * $_.Number
    the last change we will do is related to string to number conversion, so final statement should be:
    [int]$_.Number * [int]$_.Number
    which we can add to Select-Object cmdlet:
    Import-Csv .\input.csv | Select-Object Number, @{Label='Power'; Expression={ [int]$_.Number * [int]$_.Number } }

    What will happen now: For every record in CSV file, Select-Object will take the Number (first property), then check Expression of second (calculated) property and will calculate power of this number. For this result creates property named Power . Then will do the same for all incoming numbers. So in general – you are taking properties from pipeline and doing some counts with them. Result is then saved to new property. Which we can see with Get-Member:
    TypeName: Selected.System.Management.Automation.PSCustomObject

    Name MemberType Definition
    ---- ---------- ----------
    Equals Method bool Equals(System.Object obj)
    GetHashCode Method int GetHashCode()
    GetType Method type GetType()
    ToString Method string ToString()
    Number NoteProperty System.String Number=1
    Power NoteProperty System.Int32 Power=1

    Hope it's clearer now 🙂
    You can see more at following addresses:
    http://technet.microsoft.com/en-us/library/ff730948.aspx
    http://powershell.com/cs/blogs/ebookv2/archive/2012/02/07/chapter-4-arrays-and-hashtables.aspx#using-hash-tables-to-calculate-properties
    In help for Format-Table

    David

You must be logged in to reply to this topic.