Finding and renaming duplicate file names

This topic contains 7 replies, has 4 voices, and was last updated by  FastEddie 11 months ago.

  • #55820


    Greetings all. I've got a seemingly simple problem, but being a PS newbie I'm running into some obstacles. Here is the business logic: In a folder there are pdf files with two different signatures, half with underscores and the other half without. All of the file names consist of dates. For example:


    I need to modify the files without underscores and convert them to look like the ones with underscores.
    When the conversion is complete, there may be some files that share the same name. (not necessarily duplicates, just the same file name)
    I need to identify which of the newly changed names is the same as the underscored files.
    Flag them somehow and rename them to a different timestamp for that day.
    The new timestamp will be in the past. Specifically the first seconds of the day.
    For example the file name above would be renamed from 20160930_053010.pdf to 20160930_000001.pdf.
    If there is more than one duplicate, the renamed timestamp would just increment by 1 second for each.
    I'm able to create an array of the first two file signatures but I'm having trouble copying the Non-Underscore names and converting them to names with the underscore. Could someone tell me how to rename the non-underscored files properly?
    Any help would be appreciated.
    Also please let me know if I'm completely doing this wrong and should go a different direction.
    Here is the code:

    $src = "D:\PDFs\MyFolderA"
    $dest = "D:\PDFs\MyFolderB"
    $a= New-Object System.Collections.ArrayList
    $srcFiles=Get-ChildItem -Path $src -Filter *.pdf 
    foreach($file in $srcFiles) {
        ## only process files with correct signature. does not process files like 'yaddayadda.pdf'
        if ( $file.Name -match "\d{14}.pdf" -OR $file.Name -match "\d{8}_\d{6}.pdf" ) {
            #write-host $file
            ## process the acceptable files
            $myThing1 = Get-ChildItem $src | Where-Object {$_.Name -match "\d{14}.pdf" }  ## signature A (all numbers. no underscore)
            $myThing2 = Get-ChildItem $src | Where-Object {$_.Name -match "\d{8}_\d{6}.pdf" } ## signature b (with underscore)  
            $myThing3 = $myThing1  #copy the ones without underscore to mything3
            ## email the bad file name
            #$PSEmailServer = ""
            #Send-MailMessage -From "" -To "" -Subject "Bad TAAD File" -Body "This is a bad file name: $file"
        } # end if ( $file.Name -match "\d{14}.pdf" -OR $file.Name -match "\d{8}_\d{6}.pdf" )
    } # end foreach($file in $srcFiles)
    ### This is not working correctly ###
    foreach($item in $myThing3) {
        $firstPart = $item.Name.toString().Substring(0,8)
        $secondPart = $item.Name.toString().Substring(8,6)
        $myThing3 = $firstPart + "_" + $secondPart + ".pdf" 
        write-host "howdy" : $item
    } # end foreach($item in $myThing3)
    write-host $myThing1
    write-host $myThing2
    write-host $myThing3

    Here is the output :

    howdy : 20160927063055.pdf
    howdy : 20160930063000.pdf
    howdy : 20160930063009.pdf
    howdy : 20160930063010.pdf
    20160927063055.pdf 20160930063000.pdf 20160930063009.pdf 20160930063010.pdf
    20160928_061543.pdf 20160930_063000.pdf 20160930_063009.pdf 20160930_063010.pdf 20160930_063011.pdf

    It looks like only the last file is being renamed correctly. 20160930_063010.pdf is correct but none of the others got renamed.

  • #55823

    Max Kozlov

    First, the line $srcFiles=Get-ChildItem -Path $src -Filter *.pdf
    already gets the full file list, so the $myThing1 and $myThing2 assignments
    aren't needed. They just retrieve all the pdf files two more times on every pass through the loop.
    Also, $myThing3 is simply assigned the same $myThing1 on every cycle.
    Instead, you need to:

    1. Check whether $file.Name matches your mask (-match)
    2. Construct the new file name: $newName = ....
    3. Test it for existence (Test-Path); if it exists, modify the name and test it again
    4. Rename-Item $file -NewName $NewName
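
The four steps above could be sketched roughly as follows. This is only an illustration, not Max's own code: `ConvertTo-UnderscoreName` is a hypothetical helper name, the folder path is taken from the original question, and the fallback to `000001`, `000002`, ... follows the asker's description of the desired timestamps. `-WhatIf` makes `Rename-Item` preview the change instead of performing it.

```powershell
# Sketch only. ConvertTo-UnderscoreName is a hypothetical helper, not a built-in cmdlet.
$src = "D:\PDFs\MyFolderA"   # folder from the original question

function ConvertTo-UnderscoreName {
    param([string]$Name)
    # Step 1: check the name against the mask (exactly 14 digits, then .pdf)
    if ($Name -match '^(\d{8})(\d{6})\.pdf$') {
        # Step 2: construct the new name as date, underscore, time
        return ('{0}_{1}.pdf' -f $Matches[1], $Matches[2])
    }
    return $null   # not a "no underscore" file name
}

if (Test-Path -Path $src) {
    foreach ($file in Get-ChildItem -Path $src -Filter *.pdf) {
        $newName = ConvertTo-UnderscoreName -Name $file.Name
        if (-not $newName) { continue }

        # Step 3: while the name is taken, try 000001, 000002, ... for the same day
        $day    = $newName.Substring(0, 8)
        $second = 0
        while (Test-Path -Path (Join-Path $src $newName)) {
            $second++
            $newName = '{0}_{1:D6}.pdf' -f $day, $second
        }

        # Step 4: rename; remove -WhatIf once the preview looks right
        Rename-Item -Path $file.FullName -NewName $newName -WhatIf
    }
}
```

Because `-WhatIf` does not actually rename anything, two colliding files on the same day will both be previewed with the same `000001` name; the increment only takes effect once the renames are really performed.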
  • #55847

    mohit goyal

    It should be something like this:

    $src = "D:\dfdf"
    $dest = "D:\dfdf\new"
    $a = New-Object System.Collections.ArrayList
    #write-host $file
    ## process the acceptable files
    $myThing1 = Get-ChildItem $src | Where-Object {$_.Name -match "\d{14}.pdf" }  ## signature A (all numbers. no underscore)
    $myThing2 = Get-ChildItem $src | Where-Object {$_.Name -match "\d{8}_\d{6}.pdf" } ## signature b (with underscore)  
    foreach($item in $myThing1){
        $incrementer = 1
        $firstPart = $item.Name.toString().Substring(0,8)
        $secondPart = $item.Name.toString().Substring(8,6)
        $newname = $firstPart + "_" + $secondPart + ".pdf" 
        if(($mything2.Name.Contains($newName)) -or ($a.Contains($newname))){
            $secondPart = $incrementer++
            $secondPart = "00000" + [string] $secondPart
            $newname = $firstPart + "_" + $secondPart + ".pdf"
        }
        write-host "Converting $($item.Name) to $newname" 
        #rename file here
    }
  • #55885


    Thank you both for the suggestions. @mohit_goyal I tried the script, but the IF statement is never executed. $myThing2 doesn't seem to exist inside the IF statement. I tried to scope it by using $Global:myThing2, but that didn't work. Also, I don't think the .Contains method can be used here, so I tested with -contains instead. I would expect $myThing2 to hold data inside the IF statement because it's created above it, but I guess that's not the case. Is there a way around this? Here is the error I'm getting:

    You cannot call a method on a null-valued expression.
    At line:18 char:32
    +     if(($mything2.Name.Contains <<<< ($newName)) -or ($a.Contains($newname))){
        + CategoryInfo          : InvalidOperation: (Contains:String) [], RuntimeException
        + FullyQualifiedErrorId : InvokeMethodOnNull
    Converting 20160928065548.pdf to 20160928_065548.pdf
  • #55888

    Edmond Yee

    If there is nothing else in those folders with an underscore, you could use this line as $mything2

    $myThing2 = Get-ChildItem $src | Where-Object {$_.Name -match '(?<=\d{8})(?=_)' } ## signature b (with underscore)
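
As a side note on the patterns used in this thread: `\d{14}.pdf` is unanchored and its dot is unescaped, so it also matches longer names that merely contain such a run. A small sketch of a stricter variant, using made-up sample names (the `.bak` entry is hypothetical and only there to show the difference):

```powershell
# Sample names for illustration only; the last one is a made-up edge case.
$names = '20160930063010.pdf', '20160930_063010.pdf', 'yaddayadda.pdf', '120160930063010.pdf.bak'

# Anchored patterns with an escaped dot match the whole name or nothing.
$sigA = '^\d{14}\.pdf$'        # signature A: all digits, no underscore
$sigB = '^\d{8}_\d{6}\.pdf$'   # signature B: date, underscore, time

$plain      = @($names | Where-Object { $_ -match $sigA })
$underscore = @($names | Where-Object { $_ -match $sigB })

# The loose pattern from the thread accepts the .bak name as well.
$looseHit = '120160930063010.pdf.bak' -match '\d{14}.pdf'

Write-Host "Signature A: $plain"
Write-Host "Signature B: $underscore"
Write-Host "Loose pattern matches the .bak name: $looseHit"
```

With the anchored patterns, only `20160930063010.pdf` lands in `$plain` and only `20160930_063010.pdf` in `$underscore`.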
  • #55900


    Thanks Edmond Yee. I tried that but it still didn't work. The problem is not with populating the $myThing2 variable. That returns data just fine. It's just that I can't access the data from inside the IF clause. There is data but I just can't get to it for some crazy reason.

  • #55951

    mohit goyal

    It works in my environment and contains() should work. That's probably because $mything2 is empty in your environment to begin with? I assume you corrected the path for source and destination folders.

  • #56032


    @mohit_goyal This is very interesting. I had to upgrade my version of PowerShell from 2.0 to 4.0 in order for it to work. I'm now able to access the $myThing2 variable from within the foreach block, and the file name is being changed in the IF clause. There is still some tweaking to be done because the increment should be kept in a counter, but the essential part is working. Thank you for your help with this, and thanks to all who took the time to reply.
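
A likely explanation for the version difference: `$mything2.Name` on an array relies on member enumeration, which was added in PowerShell 3.0. On 2.0 the expression evaluates to `$null`, which fits the "null-valued expression" error above. The sketch below avoids that by working on plain name strings with `-contains`, and it keeps the increment in a per-day counter as mentioned in the last post. The sample names and the variables `$taken`, `$counter`, and `$plan` are assumptions for illustration, not code from the thread.

```powershell
# Hypothetical sample data standing in for the two Get-ChildItem results.
# With real files, the names could be collected in a 2.0-compatible way, e.g.:
#   $underscoreNames = @(Get-ChildItem $src | ForEach-Object { $_.Name })
$plainNames      = '20160930063000.pdf', '20160930063009.pdf', '20160927063055.pdf'
$underscoreNames = '20160930_063000.pdf', '20160930_063009.pdf', '20160928_061543.pdf'

$taken   = @($underscoreNames)   # names that already exist or have been assigned
$counter = @{}                   # last second handed out, per day
$plan    = @{}                   # old name -> new name

foreach ($name in $plainNames) {
    $day     = $name.Substring(0, 8)
    $newName = $day + '_' + $name.Substring(8, 6) + '.pdf'

    # On a collision, move to the first free second of that day
    while ($taken -contains $newName) {
        $counter[$day] = 1 + [int]$counter[$day]   # a missing key counts as 0
        $newName = '{0}_{1:D6}.pdf' -f $day, $counter[$day]
    }

    $taken += $newName
    $plan[$name] = $newName
    Write-Host "Converting $name to $newName"
    # rename file here
}
```

With this sample data, the two colliding files are mapped to `20160930_000001.pdf` and `20160930_000002.pdf`, while `20160927063055.pdf` simply becomes `20160927_063055.pdf`.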
