# How to read and parse a text file

This topic contains 23 replies, has 5 voices, and was last updated by  Ankit Parmar 1 year, 5 months ago.

• Author
Posts
• #28538

Ankit Parmar
Participant

I have a text file in which data is in below given format. Its a text delimited file with 10000 lines of data. Main columns are given below though there are many more columns.

ACCOMMODATION|P O Box 123|3404|11009341296
PROVIDERS|P O Box 543|6704|451895748

this is a file format .. now through powershell script I want to see only those Supplier VAT_Reg id's those are below 11 digit. Kindly guide me how to do this through script as i am new in this field.

Thanks
Ankit

• #28540

Dave Wyatt
Moderator

That's a delimited text file, which is very easy to read in PowerShell with the Import-Csv cmdlet. (Even though it's not a comma-delimited file, there's a parameter which lets you switch that). Try this:

Import-Csv -Path c:\path\to\your\file.txt -Delimiter '|' |
Where-Object { $_.VAT_Reg.Length -lt 11 }  • #28542 Chris Wolfenden Participant Are you just looking to see a column of VAT_Reg numbers? EDIT: Sorry read to quick and missed the the piece about less than 11 digits • #28545 Ankit Parmar Participant @ Dave – I am trying to run your code but it is not giving me any output. I gave my file path where my file is placed but it is showing nothing on executing. • #28546 Ankit Parmar Participant @chris – Yes I want to see the VAT_reg column with the all other columns name for that particular supplier whose VAT_Reg id is below than 11 digit. • #28550 Ankit Parmar Participant @Dave – Its working now.. but it is showing results for all supplier's but my requirement is to see only result for those supplier's whose VAT_Reg id is below then 11 digit.. It is giving result in below format – Party_Identification_No : Company_Number : Company_Name : Supplier_Number : SSD0040271 Supplier_Name : PROVIDERS Supplier_Address_1 : Supplier_Address_2 : Supplier_Address_3 : Supplier_Postcode : VAT_Reg : 2611341208 Bank_Account_Number_1 : Bank_Account_Number_2 : Bank_Account_Number_3 : Bank_Account_Number_4 : Bank_Account_Number_5 : Bank_Account_Number_6 : Bank_Account_Number_7 : Bank_Account_Number_8 : IBAN_Number : Swift_Code_1 : Swift_Code_2 : Supplier_EDI_ID : Currency : AUD • #28561 Rob Simmers Participant Please post your current command you are running. Dave's command should do what you are asking, so we would need to look at the command you are running. • #28563 Ankit Parmar Participant Hi Rob, This is the command what i am running on my workstation. Import-Csv -Path C:\Users\ankit\Downloads\SC\SupplierExtract.txt -Delimiter '|' | Where-Object {$_.VAT_Reg.Length -lt 11 }

I have gave path with and without single quotes but both time giving the same results.

Thanks
Ankit

• #28564

Rob Simmers
Participant

Ok, so let's troubleshoot and see what is going on. When importing from a CSV, everything is typically a string, which contains Length property. If for some reason the value is being converted to a different type, (e.g. int) it could be causing your issue. Try running this and see what the output is:

Import-Csv -Path C:\Users\ankit\Downloads\SC\SupplierExtract.txt -Delimiter '|' | foreach{ "VAT_Reg {0} is {1} characters long and is type {2}" -f $_.VAT_Reg,$_.VAT_Reg.Length, $_VAT_Reg.GetType().Name)  • #28581 Ankit Parmar Participant Hi Simmers, Apologies for the late reply.. I just stuck with this site login problem and it was late night too.. really its a weird site.. Anyhow i executed your given query but again its giving error You cannot call a method on a null-valued expression. At line:1 char:195 I tried to execute this query with both curly braces and parenthesis. In your code you have started with curly braces and end up with parenthesis so on running first time its giving error so I have run with both the parameters. Thanks Ankit • #28582 Dave Wyatt Moderator Try this, as a troubleshooting step. Please post the output here: Import-Csv -Path C:\Users\ankit\Downloads\SC\SupplierExtract.txt -Delimiter '|' | Get-Member  • #28585 Ankit Parmar Participant Hi Dave, Please find the below given output... Thanks TypeName: System.Management.Automation.PSCustomObject Name MemberType Definition —- ———- ———- Equals Method bool Equals(System.Object obj) GetHashCode Method int GetHashCode() GetType Method type GetType() ToString Method string ToString() Bank_Account_Number_1 NoteProperty System.String Bank_Account_Number_1= Bank_Account_Number_2 NoteProperty System.String Bank_Account_Number_2= Bank_Account_Number_3 NoteProperty System.String Bank_Account_Number_3= Bank_Account_Number_4 NoteProperty System.String Bank_Account_Number_4= Bank_Account_Number_5 NoteProperty System.String Bank_Account_Number_5= Bank_Account_Number_6 NoteProperty System.String Bank_Account_Number_6= Bank_Account_Number_7 NoteProperty System.String Bank_Account_Number_7= Bank_Account_Number_8 NoteProperty System.String Bank_Account_Number_8= Company_Name NoteProperty System.String Company_Name= Company_Number NoteProperty System.String Company_Number= Currency NoteProperty System.String Currency= IBAN_Number NoteProperty System.String IBAN_Number= Party_Identification_No NoteProperty System.String Party_Identification_No= Supplier_Address_1 NoteProperty System.String Supplier_Address_1= Supplier_Address_2 NoteProperty System.String Supplier_Address_2= Supplier_Address_3 NoteProperty System.String Supplier_Address_3= Supplier_EDI_ID NoteProperty System.String Supplier_EDI_ID= Supplier_Name NoteProperty System.String Supplier_Name=UNKNOWN_SUPPLIER Supplier_Number NoteProperty System.String Supplier_Number=00000 Supplier_Postcode NoteProperty System.String Supplier_Postcode= Swift_Code_1 NoteProperty System.String Swift_Code_1= Swift_Code_2 NoteProperty System.String Swift_Code_2= VAT_Reg NoteProperty System.String VAT_Reg= Thanks Ankit • #28586 Curtis Smith Participant Ankit, Dave's code should absolutely do the trick. I copied and pasted your data from the original question into a file called "input.txt", then ran Dave's online against it. It seemed to do the trick quite nicely. Here is the Import-Csv before filtering out the ones longer than 11 characters  PS F:\temp\Powershell> Import-Csv input.txt -delimiter '|'   Supplier_Name Supplier_Address_1 Supplier_Postcode VAT_Reg ------------- ------------------ ----------------- ------- ACCOMMODATION P O Box 123 3404 11009341296 PROVIDERS P O Box 543 6704 451895748  And here it is after applying the where-object to only get those with less than 11 characters  PS F:\temp\Powershell> Import-Csv input.txt -delimiter '|' | Where-Object {$_.VAT_Reg.Length -lt 11}

 

Supplier_Name Supplier_Address_1 Supplier_Postcode VAT_Reg ------------- ------------------ ----------------- ------- PROVIDERS P O Box 543 6704 451895748 

• #28589

Ankit Parmar
Participant

@Dave – Here is the below output by the command you have given..Thanks

TypeName: System.Management.Automation.PSCustomObject

Name MemberType Definition
—- ———- ———-
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
Bank_Account_Number_1 NoteProperty System.String Bank_Account_Number_1=
Bank_Account_Number_2 NoteProperty System.String Bank_Account_Number_2=
Bank_Account_Number_3 NoteProperty System.String Bank_Account_Number_3=
Bank_Account_Number_4 NoteProperty System.String Bank_Account_Number_4=
Bank_Account_Number_5 NoteProperty System.String Bank_Account_Number_5=
Bank_Account_Number_6 NoteProperty System.String Bank_Account_Number_6=
Bank_Account_Number_7 NoteProperty System.String Bank_Account_Number_7=
Bank_Account_Number_8 NoteProperty System.String Bank_Account_Number_8=
Company_Name NoteProperty System.String Company_Name=
Company_Number NoteProperty System.String Company_Number=
Currency NoteProperty System.String Currency=
IBAN_Number NoteProperty System.String IBAN_Number=
Party_Identification_No NoteProperty System.String Party_Identification_No=
Supplier_EDI_ID NoteProperty System.String Supplier_EDI_ID=
Supplier_Name NoteProperty System.String Supplier_Name=UNKNOWN_SUPPLIER
Supplier_Number NoteProperty System.String Supplier_Number=00000
Supplier_Postcode NoteProperty System.String Supplier_Postcode=
Swift_Code_1 NoteProperty System.String Swift_Code_1=
Swift_Code_2 NoteProperty System.String Swift_Code_2=
VAT_Reg NoteProperty System.String VAT_Reg=

Thanks
Ankit

• #28593

Ankit Parmar
Participant

@Curtis @Dave – Your script is right. I am sorry.. I am wrong..I also made one dummy text delimited file and insert only below values to it and it fetched the actual output from the file i.e. only for "PROVIDERS"

————- —————— —————– ——-
ACCOMMODATION P O Box 123 3404 11009341296
PROVIDERS P O Box 543 6704 451895748

But the issue is that is my real file have data of more than 10,000 of suppliers and it has in some different format.. What I mean to say is that it has some blank delimited columns also. Might be because of that it is not picking up the right output over there. Could you please share with me your mail id's so that I can share the data with you to sort out the issue.

Thanks
Ankit

• #28594

Ankit Parmar
Participant

@Curtis @Dave – My data is like in below given format in actual file – First row are columns name and then after that I have copied 3 rows of supplier data.

|||00000|UNKNOWN_SUPPLIER||||||||||||||||||

|||VN0000001|ACOMMODATION LINEN|P O Box 445|MARRICKVILLE|accounts@alinen.com.au|2204|86001098095|||||||||||||AUD

|||VN0000008|AIR LIQUIDE|P O Box 2092|FORTITUDE VALLEY BC|ALHaccounts@airliquide.com|4006|99852644702|||||||||||||AUD

Thanks
Ankit

• #28595

Curtis Smith
Participant

The exact same command works on this new dataset.

Here is the dataset before it is filtered:

 PS F:\temp\Powershell> Import-Csv input.txt -delimiter '|'

 Party_Identification_No : Company_Number : Company_Name : Supplier_Number : 00000 Supplier_Name : UNKNOWN_SUPPLIER Supplier_Address_1 : Supplier_Address_2 : Supplier_Address_3 : Supplier_Postcode : VAT_Reg : Bank_Account_Number_1 : Bank_Account_Number_2 : Bank_Account_Number_3 : Bank_Account_Number_4 : Bank_Account_Number_5 : Bank_Account_Number_6 : Bank_Account_Number_7 : Bank_Account_Number_8 : IBAN_Number : Swift_Code_1 : Swift_Code_2 : Supplier_EDI_ID : Currency : Party_Identification_No : Company_Number : Company_Name : Supplier_Number : VN0000001 Supplier_Name : ACOMMODATION LINEN Supplier_Address_1 : P O Box 445 Supplier_Address_2 : MARRICKVILLE Supplier_Address_3 : accounts@alinen.com.au Supplier_Postcode : 2204 VAT_Reg : 86001098095 Bank_Account_Number_1 : Bank_Account_Number_2 : Bank_Account_Number_3 : Bank_Account_Number_4 : Bank_Account_Number_5 : Bank_Account_Number_6 : Bank_Account_Number_7 : Bank_Account_Number_8 : IBAN_Number : Swift_Code_1 : Swift_Code_2 : Supplier_EDI_ID : Currency : AUD 

Party_Identification_No : Company_Number : Company_Name : Supplier_Number : VN0000008 Supplier_Name : AIR LIQUIDE Supplier_Address_1 : P O Box 2092 Supplier_Address_2 : FORTITUDE VALLEY BC Supplier_Address_3 : ALHaccounts@airliquide.com Supplier_Postcode : 4006 VAT_Reg : 99852644702 Bank_Account_Number_1 : Bank_Account_Number_2 : Bank_Account_Number_3 : Bank_Account_Number_4 : Bank_Account_Number_5 : Bank_Account_Number_6 : Bank_Account_Number_7 : Bank_Account_Number_8 : IBAN_Number : Swift_Code_1 : Swift_Code_2 : Supplier_EDI_ID : Currency : AUD 

And here is the data set filters for VAT_Reg values less than 11 characters in length. Only 1 of the 3 records is returned, and it's returned because because the VAT_Reg field for that record contains no value. In reality, your dataset has a bunch of fields in the records that have no value in them.

 PS F:\temp\Powershell> Import-Csv input.txt -delimiter '|' | Where-Object {$_.VAT_Reg.Length -lt 11}   Party_Identification_No : Company_Number : Company_Name : Supplier_Number : 00000 Supplier_Name : UNKNOWN_SUPPLIER Supplier_Address_1 : Supplier_Address_2 : Supplier_Address_3 : Supplier_Postcode : VAT_Reg : Bank_Account_Number_1 : Bank_Account_Number_2 : Bank_Account_Number_3 : Bank_Account_Number_4 : Bank_Account_Number_5 : Bank_Account_Number_6 : Bank_Account_Number_7 : Bank_Account_Number_8 : IBAN_Number : Swift_Code_1 : Swift_Code_2 : Supplier_EDI_ID : Currency :  • #28596 Ankit Parmar Participant Yes @Smith you are right.. that's why I have copy pasted some of the data in this blog.. hope you have seen it.. Please have a look and kindly suggest how to do in this case? Thanks Ankit • #28597 Curtis Smith Participant Ankit, I don't understand. Has the question changed? As I understand the current question it is how do you show all the records where the VAT_Reg value is less than 11 characters in length for your dataset. I believe we have have already shown how to do that. Put your data in a file and then use the command supplied to show only the records with less than 11 characters. I'm not sure what more you are trying to accomplish. • #28598 Ankit Parmar Participant @Curtis – Sorry If i am wrong because i am newbie in PowerShell and not experts like you but the command you have given me that is showing that Suppliers name also in output which have VAT_Reg id's with blank values. I don't want to see the suppliers in output with blank VAT_Reg id's. Is there any way out to remove those suppliers from the output? Thanks Ankit • #28600 Curtis Smith Participant Ok, so to confirm, you want all records where the VAT_Reg number is less than 11 characters, and not blank correct? • #28601 Ankit Parmar Participant yes.. exactly what I want .. thanks • #28602 Curtis Smith Participant That is a very simple addition to the existing command Import-Csv input.txt -delimiter '|' | Where-Object {$_.VAT_Reg -AND $_.VAT_Reg.Length -lt 11}  The in the above code I added to the Where-Object Cmdlet and additional comparison,$_.VAT_Reg
Leaving that value by itself tells the Where-Object Command to check the VAT_Reg field in the current record to see if there is a value there. If there is no value, it returns \$false and is not sent to the out put. If there is a value there, it continues on and validates the second condition which requires the length to be less than 11 characters. So the whole command together does this:
1) Get Imports the input.txt file and splits it into columns based on the '|' delimiter
2) For each record/row of data in the input file, it checks to see if the VAT_Reg column/field for that row has a value and that the value is less than 11 characters
3) If both conditions were met, the record/row is sent to the output and is subsequently displayed on the screen.

• #28603

Ankit Parmar
Participant

Thanks a ton Curtis and Dave for giving your valuable time. Finally it is started showing result and the output what I want and thanks for the detail explanation.

Thanks & Regards
Ankit

You must be logged in to reply to this topic.