This topic contains 7 replies, has 4 voices, and was last updated by
March 6, 2015 at 11:47 am #23086
I get a file sent to me by our staff in Accounting with information for users who are no longer employed. My eventual goal is to be able to import the file into a script and have it confirm that the users are disabled. However, I'm stuck on cleaning up the file that gets exported from the program that is used to keep track of employees, and am wondering if there is a way in Powershell to clean this up a bit. An example line in the text that is exported is below (confidential information is obviously cleaned up):
1 000000 0000000000000000000000000NAME 0 SURNAME BOX 111 SOMEWHERE, OUTTHERE A1A 1A1 000000000000000000000000JOBTITLE - 00.00 U 0000000000.00000000.00000000.00000000.00 000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.0000000.0000 0000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.000000 00.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.000000000000 000 0000000000 000000.00 000
Everything that is a 0 in that line is a number or letter other than 0, otherwise this would be incredibly simple. Each line also contains the exact same amount of characters, with the capital variables being the main variance in length.
I've tried using Trim, Trimstart, and Trimend, but it seems that they need specific characters, not a numbered amount. I've tried Replace, but it also seems to need specific characters (unless I'm doing it wrong).
EDIT: Out of that string, I need NAME, SURNAME, BOX (address), SOMEWHERE,OUTTHERE (city and province), A1A 1A1 (postal code), and JOBTITLE.
March 6, 2015 at 11:55 am #23088ParticipantPoints: 0Rank: Member
Without seeing the original data, it's kinda hard to interpret what this data is and what is needed. Any way you can give an example of actual data (fake actual data of course)
March 6, 2015 at 11:59 am #23089ParticipantPoints: 21Rank: Member
Matt, I haven't tried with your data , but I've had success in the past with the Substring Method. Here's a link to an article ( the relevant portion is near the end) https://technet.microsoft.com/en-us/library/ee692804.aspx
March 6, 2015 at 11:59 am #23090MemberPoints: 0Rank: Member
I'm not sure I understand the question. Do you know exactly how many characters you need to remove from the beginning and end of the string, but don't know what those characters are? If that's the case, the Substring method of the string class would probably help.
Beyond that, I'd probably need to know more about the data and what you want to get out of it. In your example line, what do you want it to look like after you've manipulated it?
March 6, 2015 at 12:33 pm #23096
I have attached the file and sanitized it. Please ignore the goofy names – it's Friday 🙂
Also, it comes out as a .blk, not .txt – however I have changed its extension so it can be uploaded.
March 6, 2015 at 12:47 pm #23098ParticipantPoints: 0Rank: Member
Looks like the first, middle, and last names always start on the same index location. If that is the case, you could just strip out what you don't need.
March 6, 2015 at 12:47 pm #23099MemberPoints: 0Rank: Member
Looks to me like this is just a fixed-width data file. (Not sure what the ~0100 on the first line represents; you'd probably want to skip that.) Extracting data from a fixed-width file is fairly straightforward. You can use Substring() as I mentioned earlier (with an example below), or you can use ADO with a Schema.ini file (details here: https://msdn.microsoft.com/en-us/library/ms974559.aspx , though that's an old article with VBScript code that you'd need to convert to PowerShell.)
Example with Substring:
# first data line from your sample file $line = '1 000000 000000000RP00000000000000NAME SURNAME STREET ADDRESS CITY, PROVINCE A1A 1A1 000000000000000000000000JOB TITLE - 00.00 U 0000000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00000000.00ACCOUNTANT LN 5555555555 000000.00 EES ' $name = $line.Substring(40,20).Trim() $initial = $line.Substring(60,4).Trim() $surname = $line.Substring(64, 28).Trim() # etc
Extracting other bits of data would amount to the same thing (Substring(index, length).Trim()); you just need to figure out the starting index and length of each column.
March 6, 2015 at 12:55 pm #23100
Awesome. Thanks for the help!
The topic ‘Help removing random characters from multiple strings’ is closed to new replies.