November 13, 2017 at 9:56 am

Hi All

I need to remove all the text that comes before a particular XML tag. I also don't want any junk characters left in the file once this conversion happens.

Sample xml file sample.xml

As depicted in my sample.xml file, I want to create a new xml in a different path using sample.xml file, where I want to delete all the texts prior to tag . so my target xml would be as below:

or, in other words, I want my target xml file to have everything between the tags and

November 13, 2017 at 10:06 am

Sample xml file sample.xml

As depicted in my sample.xml file, I want to create a new xml in a different path using sample.xml file, where I want to delete all the texts prior to tag . so my target xml would be as below:

or, in other words, I want my target xml file to have everything between the tags and

November 13, 2017 at 10:09 am

Instead of my file starting with line 1 and line 2, I want PowerShell to trim off line 1 and line 2 so that I am left with just line 3 in all my XML files.

line 1 –
line 2-
line 3-

November 13, 2017 at 11:47 am

Did you try

Get-Content -Path 'Your Sample XML File' | Select-Object -Skip 2

?

November 13, 2017 at 12:53 pm

Hi Olaf

Thanks for your reply!

This will not work, since the XML files are not always formatted the same way. Sometimes this is spread across two different lines, and at times it is all on the same line, so -Skip 2 will not work.

I am looking for a program that can trim off all the text in an XML file before a particular keyword and write the result to a new file in a separate directory.

Thanks
Rahul Kumar

November 13, 2017 at 1:44 pm

If you know this particular keyword, you just have to search for it and delete everything in front of it. What's the actual problem?

November 13, 2017 at 2:17 pm

Hey,

Here is one way. Probably not the most elegant, and it does not export the new file at the end, but I imagine you can sort that part out.

#grab your file
$file = Get-Content -Path C:\MyScripts\myFile.txt

#put the array on one line
$oneLine = $file -join ''

#index your keyword here
$keyword = $oneLine.IndexOf("line 3")

#grab both sides - before and after your keyword
$beforeKeyword = $oneLine.Substring(0,$keyword)

#here is the string you want. export it somehow.
$afterKeyword = $oneLine.Substring($keyword)
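A hedged variation on the snippet above, as a sketch rather than a definitive fix: it reads the file as a single string so the line breaks survive, guards against the keyword being absent (IndexOf returns -1 in that case and Substring would throw), and writes the result to a new file. The paths, the sample content, and the '<root>' keyword are hypothetical placeholders, not taken from the original files.

```powershell
# Hypothetical paths and keyword, for illustration only.
$inPath  = Join-Path ([System.IO.Path]::GetTempPath()) 'sample_in.xml'
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'sample_out.xml'
$keyword = '<root>'

# Create a small input file so the sketch is self-contained.
[System.IO.File]::WriteAllText($inPath, "junk line 1`r`njunk <root>`r`n<a/>`r`n</root>")

# Read the whole file as one string, keeping its line breaks.
$text = [System.IO.File]::ReadAllText($inPath)

# IndexOf returns -1 when the keyword is absent, so check before calling Substring.
$index = $text.IndexOf($keyword, [System.StringComparison]::Ordinal)
if ($index -ge 0) {
    # Everything from the keyword to the end of the file.
    $afterKeyword = $text.Substring($index)
    [System.IO.File]::WriteAllText($outPath, $afterKeyword)
} else {
    Write-Warning "Keyword '$keyword' not found in $inPath"
}
```

Because this works on characters rather than lines, it also handles the case where the keyword shares a line with text that should be removed. The .NET calls used here should also be available on PowerShell 2.0.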

If you see this, Olaf, please show me the method you mentioned.

Thank you

*EDIT*

I just found this Where() method, and it is awesome. You can use 'SkipUntil', if you have a keyword to use. Like this...

#grab your file
$file = Get-Content -Path C:\MyScripts\myFile.txt

#set your keyword
$keyword = "my keyword"

#use the Where() method with a scriptblock to match on $keyword, and skip everything in the collection until keyword is found
$keepAfterKeyword = $file.Where({$_ -match $keyword}, 'SkipUntil')

Pretty awesome.
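To make the behaviour of 'SkipUntil' concrete, here is a small sketch on an in-memory array instead of a file. The sample lines and the '<root>' keyword are made up for illustration. It assumes PowerShell 4.0 or later, since that is where the Where() method was introduced.

```powershell
# Requires PowerShell 4.0 or later (the Where() method does not exist before that).
# Hypothetical lines standing in for the output of Get-Content.
$file = @('junk line 1', 'junk line 2', '<root>', '  <a/>', '</root>')
$keyword = '<root>'

# -match treats its right-hand side as a regular expression,
# so escape the keyword in case it contains special characters.
$pattern = [regex]::Escape($keyword)

# Skip elements until the first one that matches, then return it and everything after it.
$keepAfterKeyword = $file.Where({ $_ -match $pattern }, 'SkipUntil')
$keepAfterKeyword
```

Note that this works line by line: the whole line containing the keyword is kept, including any text that appears before the keyword on that same line.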

November 13, 2017 at 2:41 pm

Thanks, let me try this.

November 13, 2017 at 2:45 pm

Hi Olaf,

I imagine the problem to be that Rahul does not know how to do what you are suggesting. Please have a look at my methods above and share yours. We can all learn something :).

Thank you

November 13, 2017 at 5:09 pm

Skip until is not working.

error below:

Method invocation failed because [System.Object[]] doesn't contain a method named 'Where'.
At line:6 char:33
+ $keepAfterKeyword = $file1.Where <<<< ({$_ -match $keyword}, 'SkipUntil')
    + CategoryInfo          : InvalidOperation: (Where:String) [], RuntimeException
    + FullyQualifiedErrorId : MethodNotFound
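A "method not found" error like this is what one would expect on a PowerShell version older than 4.0, because the Where() method on collections was added in 4.0. A quick, hedged way to check which version a session is running is sketched below; the message text is only illustrative.

```powershell
# $PSVersionTable is available from PowerShell 2.0 onwards.
$version = $PSVersionTable.PSVersion

# The Where() and ForEach() collection methods were added in PowerShell 4.0.
if ($version.Major -ge 4) {
    $message = "PowerShell $($version): the Where() method is available."
} else {
    $message = "PowerShell $($version): no Where() method, use a loop or string methods instead."
}
$message
```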

November 13, 2017 at 5:11 pm

Where is your code?

November 13, 2017 at 5:14 pm

Hi

Thanks for your reply!

I think I am unable to post code here. I have XML files where I have to remove everything prior
to a particular tag. The way the XML files are created, that DTD tag has no fixed position.
It can be on line 1 or line 2, so we cannot always rely on line numbers. If we
can remove everything in the file prior to that specific tag and write the contents into a new file,
then that should be okay.

Thanks

November 13, 2017 at 5:16 pm

I am not referencing any line numbers. I am using Get-Content and looking for a keyword. That is what you are asking to do.

Post the code you are using. Thanks

November 13, 2017 at 5:16 pm



$file_temp = "C:\DTD_R2_RAW"
$xml_in = "C:\DTD_R2_REM"
$file_archive="C:\D2_RAW_ARCHIVE"

$xml_files = Get-ChildItem $file_temp *.XML 

if($xml_files)
{
foreach ($file in $xml_files){
$file1 = Get-Content -Path $file
$keyword = ""
$keepAfterKeyword = $file1.Where({$_ -match $keyword}, 'SkipUntil')
cat $keepAfterKeyword | sc $xml_in\$file
}
}

November 13, 2017 at 5:21 pm

Hi

Thanks for your inputs. One more comment worth mentioning: I am trying to load the XML files
into Oracle via SQL*Loader. I am not sure why most of the files are rejected with the error below,
apparently because of junk characters. Is there any way to deal with this?

Record 4: Rejected - Error on table TABLE_XML, column XMLDATA.
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00210: expected '<' instead of '¿'
Error at line 1
ORA-06512: at "SYS.XMLTYPE", line 5
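One common cause of a stray character before the first '<' is a byte order mark (BOM) at the start of the file; whether that is what is happening here is an assumption, not something the error output proves. If it is, writing the file through .NET with an encoding object that omits the BOM may help. The sketch below uses a hypothetical path and content, and assumes UTF-8 is the encoding the loader expects.

```powershell
# Hypothetical output path and content, for illustration only.
$path    = Join-Path ([System.IO.Path]::GetTempPath()) 'no_bom.xml'
$content = '<root><a/></root>'

# $false tells UTF8Encoding not to emit a byte order mark.
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText($path, $content, $utf8NoBom)

# Inspect the first byte of the file: 60 is the byte value of '<'.
$bytes = [System.IO.File]::ReadAllBytes($path)
$firstByte = $bytes[0]
$firstByte
```

In Windows PowerShell, Out-File writes UTF-16 with a BOM by default, so checking the first few bytes of a generated file this way is a cheap test of the BOM theory.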

November 13, 2017 at 5:38 pm

There are some strange things going on with your code. Look at this example and try to make it work for you. This works great for me. First create a directory to hold all of your output files. Then look at this...

#set your directory
$file_temp = "C:\DTD_R2_RAW"

#grab your files
$xml_files = Get-ChildItem $file_temp *.XML -Recurse

#designate your keyword
$keyword = "my keyword"

#create your new 'keep' folder
New-Item -ItemType Directory C:\DTD_R2_RAW\Keep

#if there are files, do something...
if ($xml_files) {

    #for each file, skip every line until you find the keyword, then output everything from that line on
    ForEach ($x in $xml_files) {

        #use FullName so that files found in subfolders by -Recurse resolve correctly
        $file = Get-Content -Path $x.FullName
        
        $keep = $file.Where({$_ -match $keyword}, 'SkipUntil') | Out-File C:\DTD_R2_RAW\keep\$($x.name)  

    }
}

November 14, 2017 at 4:35 am

Thanks for extending your help, much appreciated!

My PowerShell version is version 2, and apparently the Where() method is not present there.
Is there any workaround, please?

November 14, 2017 at 4:50 am

Hi,

Thanks for your reply, I appreciate your help.

I am getting the error below. Apparently the Where() method will not work with my version of PowerShell.

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
d----        14-Nov-17  10:00 AM            Keep
Method invocation failed because [System.Object[]] doesn't contain a method named 'Where'.
At C:\MYLAN\PROJECT\ARGUS_UPGRADE\BFC\EMA_Rule_Increased_Files\CODE\PROCESSING\Camel.ps1:33 char:28
+ $keep = $file.Where <<<< ({$_ -match $keyword}, 'SkipUntil') | Out-File

Thanks

November 14, 2017 at 1:45 pm

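Try piping to Where-Object instead, although it only filters matching lines and has no 'SkipUntil' mode, so it will not do the same job on its own. Or update PowerShell. You should be on version 5, or later if it is available.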

You could also try the more long-winded approach I gave first...

#grab your file
$file = Get-Content -Path C:\MyScripts\myFile.txt

#put the array on one line
$oneLine = $file -join ''

#index your keyword here
$keyword = $oneLine.IndexOf("line 3")

#grab both sides - before and after your keyword
$beforeKeyword = $oneLine.Substring(0,$keyword)

#here is the string you want. export it somehow.
$afterKeyword = $oneLine.Substring($keyword)

But if I were you, I would update your Windows Management Framework.
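If updating is not an option, the 'SkipUntil' behaviour can be approximated with a plain loop and a flag, which should run on PowerShell 2.0 as well. This is only a sketch: the sample lines and the '<root>' keyword are made up, and in practice $lines would come from Get-Content.

```powershell
# Hypothetical lines standing in for the output of Get-Content.
$lines = @('junk line 1', 'junk line 2', '<root>', '  <a/>', '</root>')
$keyword = '<root>'

$keep  = @()
$found = $false
foreach ($line in $lines) {
    # Flip the flag on the first line that contains the keyword (case-sensitive).
    if (-not $found -and $line.Contains($keyword)) { $found = $true }

    # Once the flag is set, keep this line and every line after it.
    if ($found) { $keep += $line }
}
$keep
```

Like the Where() version, this keeps the whole line on which the keyword appears, so any text before the keyword on that same line is not removed.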