ConvertTo-Csv and Out-File vs Export-Csv

This topic contains 5 replies, has 4 voices, and was last updated by js (Participant) 3 months, 1 week ago.

  • #111107

    Participant
    Points: 1
    Rank: Member

    Hi,

    I was just playing with exporting directory listings to CSV and noticed something strange. I hope someone can explain why it happens.

    Here are two commands that seem to do the same thing:

    Get-ChildItem | Select-Object FullName, Length | ConvertTo-Csv | Out-File -FilePath dir-list.csv
    Get-ChildItem | Select-Object FullName, Length | Export-Csv -Path dir-list2.csv

    The resulting files look the same in Notepad++ and when displayed with cat. But when I check the file sizes, the first command always creates a file about twice the size of the one from the second command. I opened both files in a hex editor, and the larger file has a NUL byte (hex 00) between every character, which accounts for the size difference. Why is this happening?
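
    For anyone without a hex editor at hand, the same check can be sketched in PowerShell itself. This is only an illustration: the scratch file names are made up, and both encodings are named explicitly so the result does not depend on which PowerShell version runs it.

```powershell
# Hypothetical scratch files in the temp directory (names are illustrative).
$tmp   = [System.IO.Path]::GetTempPath()
$utf16 = Join-Path $tmp 'enc-demo-utf16.txt'
$ascii = Join-Path $tmp 'enc-demo-ascii.txt'

# Write the same text twice, naming the encoding explicitly.
'abc' | Out-File -FilePath $utf16 -Encoding unicode   # UTF-16 LE with BOM
'abc' | Out-File -FilePath $ascii -Encoding ascii     # one byte per character

# Read the raw bytes back and print them as hex.
$utf16Bytes = [System.IO.File]::ReadAllBytes($utf16)
$asciiBytes = [System.IO.File]::ReadAllBytes($ascii)
$utf16Hex = ($utf16Bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$asciiHex = ($asciiBytes | ForEach-Object { $_.ToString('X2') }) -join ' '
"UTF-16: $utf16Hex"
"ASCII : $asciiHex"
```

    The UTF-16 dump should start with FF FE (the byte order mark) and then show a 00 after each letter, which is the pattern described above.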

  • #111136

    Participant
    Points: 297
    Helping Hand
    Rank: Contributor

    Ran a few tests myself, and I can say that it's not the CSV cmdlets causing the difference. The difference comes from the Out-File cmdlet, and it only shows up in Windows PowerShell 5.1, not PS Core (6.1.0 RC1). Unsure of prior versions, but it looks like a long-standing Out-File default that was changed for PS Core at some point.

    Instead, I'd suggest using the Set-Content or Add-Content cmdlet.
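
    A minimal sketch of the Set-Content route, under a few assumptions of mine: the output file name is made up, $PSHOME is listed only so the example has some files to work with, and -Encoding utf8 is spelled out because Set-Content's default encoding also differs between Windows PowerShell and PS Core.

```powershell
# Hypothetical output file in the temp directory.
$out = Join-Path ([System.IO.Path]::GetTempPath()) 'dir-list3.csv'

# -NoTypeInformation drops the "#TYPE ..." line that Windows PowerShell 5.1
# writes by default; it is harmless on newer versions.
Get-ChildItem -Path $PSHOME -File |
    Select-Object FullName, Length |
    ConvertTo-Csv -NoTypeInformation |
    Set-Content -Path $out -Encoding utf8

# Inspect the result: size on disk and the header line.
(Get-Item $out).Length
Get-Content -Path $out -TotalCount 1
```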

  • #111137

    Participant
    Points: 269
    Helping Hand
    Rank: Contributor

    Use Notepad++ to check the encoding of the files. There you will see the difference. If you want both files to come out the same, use this:

    Get-ChildItem -Exclude 'dir-list*.csv' | Select-Object FullName, Length | ConvertTo-Csv -NoTypeInformation | Out-File -FilePath dir-list.csv -Encoding utf8

    Get-ChildItem -Exclude 'dir-list*.csv' | Select-Object FullName, Length | Export-Csv -Path dir-list2.csv -Encoding utf8 -NoTypeInformation
  • #111170

    Participant
    Points: 1
    Rank: Member

    I found out what is happening, but not the why. I went back to double-check the files in Notepad++ as @olaf-soyk suggested and couldn't see any differences in the content. I did notice that Notepad++ had detected different encodings for the two files:
    The smaller file was UTF-8
    The larger file was UCS-2 LE BOM

    I hadn't come across UCS-2 LE BOM before, but a quick web search showed it to be a 16-bit encoding (two bytes per character), as opposed to UTF-8, which uses a single byte for plain ASCII characters. I suppose it should have been obvious when I saw the extra empty bytes in the hex editor!

    Using Out-File with -Encoding utf8 gives files of equivalent size. There are still a few BOM bytes at the beginning of the file, though. Hope this helps someone.
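
    If the leftover BOM bytes are a problem, one workaround in Windows PowerShell 5.1 is to bypass the cmdlets and write the lines through .NET. The sketch below is not from the thread: the file name is made up and $PSHOME is listed only so there is something to export.

```powershell
# Hypothetical output file in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'dir-list-nobom.csv'

$lines = Get-ChildItem -Path $PSHOME -File |
    Select-Object FullName, Length |
    ConvertTo-Csv -NoTypeInformation

# Passing $false to the constructor asks for UTF-8 without a byte order mark.
$utf8NoBom = New-Object System.Text.UTF8Encoding $false
[System.IO.File]::WriteAllLines($path, [string[]]$lines, $utf8NoBom)

# Print the first three bytes as hex; a UTF-8 BOM would show as EF BB BF.
$first3 = [System.IO.File]::ReadAllBytes($path)[0..2]
($first3 | ForEach-Object { $_.ToString('X2') }) -join ' '
```

    On PowerShell 6 and later the cmdlets accept -Encoding utf8NoBOM directly, so the .NET call is only needed on the older version.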

  • #111196

    Participant
    Points: 269
    Helping Hand
    Rank: Contributor

    BTW: That does not affect the functionality of the files. They take a little more space, and they can hold more "exotic" characters from the Unicode table than an ANSI file can. But they will work the same as UTF-8 encoded files in common environments. 😉

  • #111206
    js

    Participant
    Points: 326
    Helping Hand
    Rank: Contributor

    Yep. I had a thread about this a little while ago. At first I thought it was Unix text. PS 5's Out-File (or ">") encodes in what Notepad calls "Unicode" (UTF-16 LE), and most other commands output in what Notepad calls "ANSI". Some applications won't like it, like Infoblox (for CSV import).
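
    For the ">" case, one option in Windows PowerShell 5.1 and later is to change Out-File's default encoding for the session, since the redirection operators go through Out-File there. A small sketch, with a made-up scratch file name:

```powershell
# Applies to the current session only; put it in a profile to make it stick.
$PSDefaultParameterValues['Out-File:Encoding'] = 'utf8'

# Hypothetical scratch file in the temp directory.
$demo = Join-Path ([System.IO.Path]::GetTempPath()) 'redirect-demo.txt'
'abc' > $demo

# Dump the bytes as hex to confirm there are no 00 bytes between the letters.
$bytes = [System.IO.File]::ReadAllBytes($demo)
($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
```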
