Hex changes after piping to findstr ???

This topic contains 11 replies, has 5 voices, and was last updated by  Danielle Boyd 8 months, 3 weeks ago.

  • #62634

    Danielle Boyd
    Participant

    Re: cat CPO0008.VDI | findstr HDRE > HDRfile

    Hex format of input file: 48 44 52 45 31 38 31 34
    Hex format of output file: FF FE 48 00 44 00 52 00 45 00 31 00 38 00 31 00 34 00

    Any help with why/how this is happening is so greatly appreciated. Thank you!

  • #62638

    Olaf Soyk
    Participant

    findstr is not PowerShell. I suggest sticking with the full, original cmdlet names. They are easier to read and understand, both for you and for the people willing to help you.

  • #62641

    nimms
    Participant

    Hello, Danielle.

    Binary data is not what PowerShell can easily handle. When a string (and that's what cat returns) goes through a pipeline, its encoding gets set to $OutputEncoding. PowerShell is just not the right tool in your case. So you need to find another method of finding things in a binary file. For example, you can write a C# program.

    Well, maybe someone can help you do this by using .NET methods from PowerShell.

    • #62649

      Danielle Boyd
      Participant

      Thank you sooo much Nimms for your fantastic explanation! I understand and have an idea of how to remedy it. I really appreciate it, very much!

    • #62658

      Danielle Boyd
      Participant

      Well....so I changed to

      select-string ./CPO0016.vdi -pattern "HDRE" | select-object Line

      And my output is still padded as before...

    • #62664

      Danielle Boyd
      Participant

      Okay, I'm hoping I'm actually going to learn something here 😀 I just did a little manual test in the PS console shell:

      echo test > test.txt

      When I view test.txt in binary mode, using TextPad, this is what I see:

      FF FE 74 00 65 00 73 00 74 00 0D 00 0A 00

    • #62674

      nimms
      Participant

      Yeah, that's because it saves it in Unicode (UTF-16 Little-Endian). And when PowerShell tries to save raw binary data in Unicode, it screws it up. So please, don't try to do this in PowerShell, it's a pain. Just use an external tool for this. I don't know the exact tool, but it's quite easy to write one yourself.

      Anyway, I think we're going off-topic here.
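To see where the extra bytes come from, here is a minimal sketch that encodes the sample text from the first post as UTF-16 LE with a byte order mark. It assumes Windows PowerShell, where `>` writes UTF-16 LE by default (PowerShell 6 and later default to UTF-8 without a BOM, so the padding would not appear there). The variable names are illustrative only.

```powershell
# Encode the sample text as UTF-16 LE, the encoding Windows PowerShell calls "Unicode".
$text  = 'HDRE1814'
$utf16 = [System.Text.Encoding]::Unicode

# GetPreamble() returns the byte order mark (FF FE);
# GetBytes() returns two bytes per character for this ASCII-range text.
$bytes = $utf16.GetPreamble() + $utf16.GetBytes($text)

# Format each byte as two uppercase hex digits and join them with spaces.
$hex = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$hex
# -> FF FE 48 00 44 00 52 00 45 00 31 00 38 00 31 00 34 00
```

The result matches the output file's hex from the first post: nothing is corrupted, each ASCII character is simply stored as two bytes after a two-byte marker.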

  • #62692

    Max Kozlov
    Participant

    You can just use Get-Content/Set-Content instead of redirection (>);
    they have an -Encoding parameter.

    And I think PowerShell is a good tool even for this, but Select-String is for strings, not for byte values 😉

  • #62695

    Danielle Boyd
    Participant

    I'm posting to a forum of experts? Which only continues with extended talk that doesn't directly relate.

    select-string ./CPO0016.vdi -pattern "HDRE" | select-object Line > ./HDRfile

    Why is output being excessively embedded? Makes no sense – anyone have a clue?

  • #62700

    Max Kozlov
    Participant

    No, this is a PowerShell community; not all of us are experts 🙂

    @nimms told you about Unicode, and I told you about Get-Content/Set-Content.
    Are you looking for more expertise here?

    OK, I'll try... but I want to mention that we have no crystal ball to see from a distance what your file looks like and what you want to get from it. We also don't know why you want to search for something literal in a binary file and keep it untouched.

    [expert mode on]
    You get a Unicode-encoded file because strings in .NET are internally Unicode (UTF-16), and you are using string-based cmdlets and redirection, which save that representation directly to the file.
    If you use Get-Content/Set-Content with the -Encoding parameter, you can directly control the encoding of your data, but you lose string-searching capabilities if you use the 'Byte' value.
    [expert mode off]

    Now you can read about Unicode and read the help for Set-Content, and finally do your work the way you like 🙂
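One way to keep both byte-exact data and a literal search is to read the file as raw bytes and search a Latin-1 view of them. This is only a sketch under assumptions: the file names, the sample content, and the 8-byte record length are hypothetical, and .NET file methods are used instead of `-Encoding Byte` because that value is not available in every PowerShell version.

```powershell
# Hypothetical paths; full paths are used because .NET's working directory
# can differ from PowerShell's current location.
$inPath  = Join-Path $PWD.Path 'sample.vdi'
$outPath = Join-Path $PWD.Path 'HDRfile.bin'

# Build a small sample input: two non-text bytes followed by "HDRE1814".
[System.IO.File]::WriteAllBytes($inPath,
    [byte[]](0x00, 0xFF, 0x48, 0x44, 0x52, 0x45, 0x31, 0x38, 0x31, 0x34))

# Read the file as raw bytes; no text decoding happens here.
$bytes = [System.IO.File]::ReadAllBytes($inPath)

# Latin-1 (code page 28591) maps each byte value 0-255 to exactly one character,
# so an index in the string equals an offset in the byte array.
$latin1 = [System.Text.Encoding]::GetEncoding(28591)
$view   = $latin1.GetString($bytes)

$offset = $view.IndexOf('HDRE', [System.StringComparison]::Ordinal)
if ($offset -ge 0) {
    # Assumed record length of 8 bytes, clamped to the end of the file.
    $end    = [Math]::Min($offset + 7, $bytes.Length - 1)
    $record = [byte[]]$bytes[$offset..$end]

    # Write the bytes back unchanged: no BOM, no padding, no added newline.
    [System.IO.File]::WriteAllBytes($outPath, $record)
}
```

Because the data never passes through a string-based cmdlet or `>`, the output bytes are identical to the matched bytes in the input.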

  • #62703

    Ron
    Participant

    Redirection is creating a Unicode file. Try this:

    "test"|Set-Content .\test.txt -Encoding Unicode
    Get-HexDump .\test.txt
    00000000  ff fe 74 00 65 00 73 00 74 00 0d 00 0a 00        ÿþt.e.s.t.....
    "test"|Set-Content .\test.txt -Encoding Ascii
    Get-HexDump .\test.txt
    00000000  74 65 73 74 0d 0a                                test..

    I grabbed the Get-HexDump function from poshcode if you need it.

    To fix your most recent attempt:

    Get-HexDump .\test.vdi
    00000000  48 44 52 45 31 38 31 34                          HDRE1814
    select-string ./test.vdi -pattern "HDRE" | select-object -expand Line|Set-Content .\test.txt -Encoding Ascii
    Get-HexDump .\test.txt
    00000000  48 44 52 45 31 38 31 34 0d 0a                    HDRE1814..
  • #62710

    Danielle Boyd
    Participant

    Awesome, Ron – thank you! That works and I learned 🙂 Much gratitude, Danielle
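For comparison with the earlier attempt, here is a self-contained sketch of the same idea as Ron's fix, using `Out-File -Encoding ascii` in place of `>`. Two things differ from `select-object Line > ./HDRfile`: `-ExpandProperty` emits the bare string rather than an object that gets formatted as a table with a `Line` header, and the explicit encoding avoids the UTF-16 default that `>` uses in Windows PowerShell. The file names are hypothetical, and `Get-HexDump` is replaced with a plain .NET call so nothing extra needs to be installed.

```powershell
# Hypothetical sample paths in the current location.
$inPath  = Join-Path $PWD.Path 'test.vdi'
$outPath = Join-Path $PWD.Path 'HDRfile'

# Recreate the 8-byte sample from the thread: "HDRE1814" as plain ASCII.
[System.IO.File]::WriteAllBytes($inPath,
    [System.Text.Encoding]::ASCII.GetBytes('HDRE1814'))

# -ExpandProperty emits the matching line as a bare string;
# Out-File -Encoding ascii then writes one byte per character.
Select-String -Path $inPath -Pattern 'HDRE' |
    Select-Object -ExpandProperty Line |
    Out-File -FilePath $outPath -Encoding ascii

# Dump the result as hex to check for a byte order mark or 00 padding.
$outHex = ([System.IO.File]::ReadAllBytes($outPath) |
    ForEach-Object { $_.ToString('X2') }) -join ' '
$outHex
```

The dump should start with `48 44 52 45 31 38 31 34`, followed only by the platform's line terminator, as in Ron's last hex dump.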
