How to read and parse a text file

This topic contains 23 replies, has 5 voices, and was last updated by Profile photo of Ankit Parmar Ankit Parmar 1 year, 11 months ago.

  • Author
    Posts
  • #28538
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    I have a text file in which data is in below given format. Its a text delimited file with 10000 lines of data. Main columns are given below though there are many more columns.

    Supplier_Name|Supplier_Address_1|Supplier_Postcode|VAT_Reg
    ACCOMMODATION|P O Box 123|3404|11009341296
    PROVIDERS|P O Box 543|6704|451895748

    this is a file format .. now through powershell script I want to see only those Supplier VAT_Reg id's those are below 11 digit. Kindly guide me how to do this through script as i am new in this field.

    Thanks
    Ankit

  • #28540
    Profile photo of Dave Wyatt
    Dave Wyatt
    Moderator

    That's a delimited text file, which is very easy to read in PowerShell with the Import-Csv cmdlet. (Even though it's not a comma-delimited file, there's a parameter which lets you switch that). Try this:

    Import-Csv -Path c:\path\to\your\file.txt -Delimiter '|' |
    Where-Object { $_.VAT_Reg.Length -lt 11 }
    
  • #28542
    Profile photo of Chris Wolfenden
    Chris Wolfenden
    Participant

    Are you just looking to see a column of VAT_Reg numbers?

    EDIT: Sorry read to quick and missed the the piece about less than 11 digits

  • #28545
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    @ Dave – I am trying to run your code but it is not giving me any output. I gave my file path where my file is placed but it is showing nothing on executing.

  • #28546
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    @chris – Yes I want to see the VAT_reg column with the all other columns name for that particular supplier whose VAT_Reg id is below than 11 digit.

  • #28550
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    @dave – Its working now.. but it is showing results for all supplier's but my requirement is to see only result for those supplier's whose VAT_Reg id is below then 11 digit..

    It is giving result in below format –

    Party_Identification_No :
    Company_Number :
    Company_Name :
    Supplier_Number : SSD0040271
    Supplier_Name : PROVIDERS
    Supplier_Address_1 :
    Supplier_Address_2 :
    Supplier_Address_3 :
    Supplier_Postcode :
    VAT_Reg : 2611341208
    Bank_Account_Number_1 :
    Bank_Account_Number_2 :
    Bank_Account_Number_3 :
    Bank_Account_Number_4 :
    Bank_Account_Number_5 :
    Bank_Account_Number_6 :
    Bank_Account_Number_7 :
    Bank_Account_Number_8 :
    IBAN_Number :
    Swift_Code_1 :
    Swift_Code_2 :
    Supplier_EDI_ID :
    Currency : AUD

  • #28561
    Profile photo of Rob Simmers
    Rob Simmers
    Participant

    Please post your current command you are running. Dave's command should do what you are asking, so we would need to look at the command you are running.

  • #28563
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    Hi Rob,

    This is the command what i am running on my workstation.

    Import-Csv -Path C:\Users\ankit\Downloads\SC\SupplierExtract.txt -Delimiter '|' |
    Where-Object { $_.VAT_Reg.Length -lt 11 }

    I have gave path with and without single quotes but both time giving the same results.

    Thanks
    Ankit

  • #28564
    Profile photo of Rob Simmers
    Rob Simmers
    Participant

    Ok, so let's troubleshoot and see what is going on. When importing from a CSV, everything is typically a string, which contains Length property. If for some reason the value is being converted to a different type, (e.g. int) it could be causing your issue. Try running this and see what the output is:

    Import-Csv -Path C:\Users\ankit\Downloads\SC\SupplierExtract.txt -Delimiter '|' | foreach{ "VAT_Reg {0} is {1} characters long and is type {2}" -f $_.VAT_Reg,$_.VAT_Reg.Length, $_VAT_Reg.GetType().Name)
    
  • #28581
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    Hi Simmers,

    Apologies for the late reply.. I just stuck with this site login problem and it was late night too.. really its a weird site.. Anyhow i executed your given query but again its giving error

    You cannot call a method on a null-valued expression.
    At line:1 char:195

    I tried to execute this query with both curly braces and parenthesis. In your code you have started with curly braces and end up with parenthesis so on running first time its giving error so I have run with both the parameters.

    Thanks
    Ankit

  • #28582
    Profile photo of Dave Wyatt
    Dave Wyatt
    Moderator

    Try this, as a troubleshooting step. Please post the output here:

    Import-Csv -Path C:\Users\ankit\Downloads\SC\SupplierExtract.txt -Delimiter '|' |
    Get-Member
    
  • #28585
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    Hi Dave,

    Please find the below given output... Thanks

    TypeName: System.Management.Automation.PSCustomObject

    Name MemberType Definition
    —- ———- ———-
    Equals Method bool Equals(System.Object obj)
    GetHashCode Method int GetHashCode()
    GetType Method type GetType()
    ToString Method string ToString()
    Bank_Account_Number_1 NoteProperty System.String Bank_Account_Number_1=
    Bank_Account_Number_2 NoteProperty System.String Bank_Account_Number_2=
    Bank_Account_Number_3 NoteProperty System.String Bank_Account_Number_3=
    Bank_Account_Number_4 NoteProperty System.String Bank_Account_Number_4=
    Bank_Account_Number_5 NoteProperty System.String Bank_Account_Number_5=
    Bank_Account_Number_6 NoteProperty System.String Bank_Account_Number_6=
    Bank_Account_Number_7 NoteProperty System.String Bank_Account_Number_7=
    Bank_Account_Number_8 NoteProperty System.String Bank_Account_Number_8=
    Company_Name NoteProperty System.String Company_Name=
    Company_Number NoteProperty System.String Company_Number=
    Currency NoteProperty System.String Currency=
    IBAN_Number NoteProperty System.String IBAN_Number=
    Party_Identification_No NoteProperty System.String Party_Identification_No=
    Supplier_Address_1 NoteProperty System.String Supplier_Address_1=
    Supplier_Address_2 NoteProperty System.String Supplier_Address_2=
    Supplier_Address_3 NoteProperty System.String Supplier_Address_3=
    Supplier_EDI_ID NoteProperty System.String Supplier_EDI_ID=
    Supplier_Name NoteProperty System.String Supplier_Name=UNKNOWN_SUPPLIER
    Supplier_Number NoteProperty System.String Supplier_Number=00000
    Supplier_Postcode NoteProperty System.String Supplier_Postcode=
    Swift_Code_1 NoteProperty System.String Swift_Code_1=
    Swift_Code_2 NoteProperty System.String Swift_Code_2=
    VAT_Reg NoteProperty System.String VAT_Reg=

    Thanks
    Ankit

  • #28586
    Profile photo of Curtis Smith
    Curtis Smith
    Participant

    Ankit, Dave's code should absolutely do the trick. I copied and pasted your data from the original question into a file called "input.txt", then ran Dave's online against it. It seemed to do the trick quite nicely.

    Here is the Import-Csv before filtering out the ones longer than 11 characters


    PS F:\temp\Powershell> Import-Csv input.txt -delimiter '|'

    Supplier_Name Supplier_Address_1 Supplier_Postcode VAT_Reg
    ------------- ------------------ ----------------- -------
    ACCOMMODATION P O Box 123 3404 11009341296
    PROVIDERS P O Box 543 6704 451895748

    And here it is after applying the where-object to only get those with less than 11 characters


    PS F:\temp\Powershell> Import-Csv input.txt -delimiter '|' | Where-Object {$_.VAT_Reg.Length -lt 11}

    Supplier_Name Supplier_Address_1 Supplier_Postcode VAT_Reg
    ------------- ------------------ ----------------- -------
    PROVIDERS P O Box 543 6704 451895748

  • #28589
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    @dave – Here is the below output by the command you have given..Thanks

    TypeName: System.Management.Automation.PSCustomObject

    Name MemberType Definition
    —- ———- ———-
    Equals Method bool Equals(System.Object obj)
    GetHashCode Method int GetHashCode()
    GetType Method type GetType()
    ToString Method string ToString()
    Bank_Account_Number_1 NoteProperty System.String Bank_Account_Number_1=
    Bank_Account_Number_2 NoteProperty System.String Bank_Account_Number_2=
    Bank_Account_Number_3 NoteProperty System.String Bank_Account_Number_3=
    Bank_Account_Number_4 NoteProperty System.String Bank_Account_Number_4=
    Bank_Account_Number_5 NoteProperty System.String Bank_Account_Number_5=
    Bank_Account_Number_6 NoteProperty System.String Bank_Account_Number_6=
    Bank_Account_Number_7 NoteProperty System.String Bank_Account_Number_7=
    Bank_Account_Number_8 NoteProperty System.String Bank_Account_Number_8=
    Company_Name NoteProperty System.String Company_Name=
    Company_Number NoteProperty System.String Company_Number=
    Currency NoteProperty System.String Currency=
    IBAN_Number NoteProperty System.String IBAN_Number=
    Party_Identification_No NoteProperty System.String Party_Identification_No=
    Supplier_Address_1 NoteProperty System.String Supplier_Address_1=
    Supplier_Address_2 NoteProperty System.String Supplier_Address_2=
    Supplier_Address_3 NoteProperty System.String Supplier_Address_3=
    Supplier_EDI_ID NoteProperty System.String Supplier_EDI_ID=
    Supplier_Name NoteProperty System.String Supplier_Name=UNKNOWN_SUPPLIER
    Supplier_Number NoteProperty System.String Supplier_Number=00000
    Supplier_Postcode NoteProperty System.String Supplier_Postcode=
    Swift_Code_1 NoteProperty System.String Swift_Code_1=
    Swift_Code_2 NoteProperty System.String Swift_Code_2=
    VAT_Reg NoteProperty System.String VAT_Reg=

    Thanks
    Ankit

  • #28593
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    @Curtis @dave – Your script is right. I am sorry.. I am wrong..I also made one dummy text delimited file and insert only below values to it and it fetched the actual output from the file i.e. only for "PROVIDERS"

    Supplier_Name Supplier_Address_1 Supplier_Postcode VAT_Reg
    ————- —————— —————– ——-
    ACCOMMODATION P O Box 123 3404 11009341296
    PROVIDERS P O Box 543 6704 451895748

    But the issue is that is my real file have data of more than 10,000 of suppliers and it has in some different format.. What I mean to say is that it has some blank delimited columns also. Might be because of that it is not picking up the right output over there. Could you please share with me your mail id's so that I can share the data with you to sort out the issue.

    Thanks
    Ankit

  • #28594
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    @Curtis @dave – My data is like in below given format in actual file – First row are columns name and then after that I have copied 3 rows of supplier data.

    Party_Identification_No|Company_Number|Company_Name|Supplier_Number|Supplier_Name|Supplier_Address_1|Supplier_Address_2|Supplier_Address_3|Supplier_Postcode|VAT_Reg|Bank_Account_Number_1|Bank_Account_Number_2|Bank_Account_Number_3|Bank_Account_Number_4|Bank_Account_Number_5|Bank_Account_Number_6|Bank_Account_Number_7|Bank_Account_Number_8|IBAN_Number|Swift_Code_1|Swift_Code_2|Supplier_EDI_ID|Currency

    |||00000|UNKNOWN_SUPPLIER||||||||||||||||||

    |||VN0000001|ACOMMODATION LINEN|P O Box 445|MARRICKVILLE|accounts@alinen.com.au|2204|86001098095|||||||||||||AUD

    |||VN0000008|AIR LIQUIDE|P O Box 2092|FORTITUDE VALLEY BC|ALHaccounts@airliquide.com|4006|99852644702|||||||||||||AUD

    Thanks
    Ankit

  • #28595
    Profile photo of Curtis Smith
    Curtis Smith
    Participant

    The exact same command works on this new dataset.

    Here is the dataset before it is filtered:


    PS F:\temp\Powershell> Import-Csv input.txt -delimiter '|'

    Party_Identification_No :
    Company_Number :
    Company_Name :
    Supplier_Number : 00000
    Supplier_Name : UNKNOWN_SUPPLIER
    Supplier_Address_1 :
    Supplier_Address_2 :
    Supplier_Address_3 :
    Supplier_Postcode :
    VAT_Reg :
    Bank_Account_Number_1 :
    Bank_Account_Number_2 :
    Bank_Account_Number_3 :
    Bank_Account_Number_4 :
    Bank_Account_Number_5 :
    Bank_Account_Number_6 :
    Bank_Account_Number_7 :
    Bank_Account_Number_8 :
    IBAN_Number :
    Swift_Code_1 :
    Swift_Code_2 :
    Supplier_EDI_ID :
    Currency :

    Party_Identification_No :
    Company_Number :
    Company_Name :
    Supplier_Number : VN0000001
    Supplier_Name : ACOMMODATION LINEN
    Supplier_Address_1 : P O Box 445
    Supplier_Address_2 : MARRICKVILLE
    Supplier_Address_3 : accounts@alinen.com.au
    Supplier_Postcode : 2204
    VAT_Reg : 86001098095
    Bank_Account_Number_1 :
    Bank_Account_Number_2 :
    Bank_Account_Number_3 :
    Bank_Account_Number_4 :
    Bank_Account_Number_5 :
    Bank_Account_Number_6 :
    Bank_Account_Number_7 :
    Bank_Account_Number_8 :
    IBAN_Number :
    Swift_Code_1 :
    Swift_Code_2 :
    Supplier_EDI_ID :
    Currency : AUD

    Party_Identification_No :
    Company_Number :
    Company_Name :
    Supplier_Number : VN0000008
    Supplier_Name : AIR LIQUIDE
    Supplier_Address_1 : P O Box 2092
    Supplier_Address_2 : FORTITUDE VALLEY BC
    Supplier_Address_3 : ALHaccounts@airliquide.com
    Supplier_Postcode : 4006
    VAT_Reg : 99852644702
    Bank_Account_Number_1 :
    Bank_Account_Number_2 :
    Bank_Account_Number_3 :
    Bank_Account_Number_4 :
    Bank_Account_Number_5 :
    Bank_Account_Number_6 :
    Bank_Account_Number_7 :
    Bank_Account_Number_8 :
    IBAN_Number :
    Swift_Code_1 :
    Swift_Code_2 :
    Supplier_EDI_ID :
    Currency : AUD

    And here is the data set filters for VAT_Reg values less than 11 characters in length. Only 1 of the 3 records is returned, and it's returned because because the VAT_Reg field for that record contains no value. In reality, your dataset has a bunch of fields in the records that have no value in them.


    PS F:\temp\Powershell> Import-Csv input.txt -delimiter '|' | Where-Object {$_.VAT_Reg.Length -lt 11}

    Party_Identification_No :
    Company_Number :
    Company_Name :
    Supplier_Number : 00000
    Supplier_Name : UNKNOWN_SUPPLIER
    Supplier_Address_1 :
    Supplier_Address_2 :
    Supplier_Address_3 :
    Supplier_Postcode :
    VAT_Reg :
    Bank_Account_Number_1 :
    Bank_Account_Number_2 :
    Bank_Account_Number_3 :
    Bank_Account_Number_4 :
    Bank_Account_Number_5 :
    Bank_Account_Number_6 :
    Bank_Account_Number_7 :
    Bank_Account_Number_8 :
    IBAN_Number :
    Swift_Code_1 :
    Swift_Code_2 :
    Supplier_EDI_ID :
    Currency :

  • #28596
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    Yes @Smith you are right.. that's why I have copy pasted some of the data in this blog.. hope you have seen it.. Please have a look and kindly suggest how to do in this case?

    Thanks
    Ankit

  • #28597
    Profile photo of Curtis Smith
    Curtis Smith
    Participant

    Ankit, I don't understand. Has the question changed? As I understand the current question it is how do you show all the records where the VAT_Reg value is less than 11 characters in length for your dataset. I believe we have have already shown how to do that. Put your data in a file and then use the command supplied to show only the records with less than 11 characters. I'm not sure what more you are trying to accomplish.

  • #28598
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    @Curtis – Sorry If i am wrong because i am newbie in PowerShell and not experts like you but the command you have given me that is showing that Suppliers name also in output which have VAT_Reg id's with blank values. I don't want to see the suppliers in output with blank VAT_Reg id's. Is there any way out to remove those suppliers from the output?

    Thanks
    Ankit

  • #28600
    Profile photo of Curtis Smith
    Curtis Smith
    Participant

    Ok, so to confirm, you want all records where the VAT_Reg number is less than 11 characters, and not blank correct?

  • #28601
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    yes.. exactly what I want .. thanks

  • #28602
    Profile photo of Curtis Smith
    Curtis Smith
    Participant

    That is a very simple addition to the existing command

    Import-Csv input.txt -delimiter '|' | Where-Object {$_.VAT_Reg -AND $_.VAT_Reg.Length -lt 11}
    

    The in the above code I added to the Where-Object Cmdlet and additional comparison, $_.VAT_Reg
    Leaving that value by itself tells the Where-Object Command to check the VAT_Reg field in the current record to see if there is a value there. If there is no value, it returns $false and is not sent to the out put. If there is a value there, it continues on and validates the second condition which requires the length to be less than 11 characters. So the whole command together does this:
    1) Get Imports the input.txt file and splits it into columns based on the '|' delimiter
    2) For each record/row of data in the input file, it checks to see if the VAT_Reg column/field for that row has a value and that the value is less than 11 characters
    3) If both conditions were met, the record/row is sent to the output and is subsequently displayed on the screen.

  • #28603
    Profile photo of Ankit Parmar
    Ankit Parmar
    Participant

    Thanks a ton Curtis and Dave for giving your valuable time. Finally it is started showing result and the output what I want and thanks for the detail explanation.

    Thanks & Regards
    Ankit

You must be logged in to reply to this topic.