Performance Issue With Out-File

    • #42072
      Von

      I have a script that connects to eDirectory, and gives me a list of groups & group members. The script works, returning 570,515 rows of data. If I Write-Host the results to screen, it takes about 5 minutes to complete. However, I don’t want results to screen, I want the results in a CSV or TXT so I can get the data loaded into a SQL table.

      When I try to Out-File the results with $Query.SizeLimit = 100, I get that portion of the data in about 1.5 minutes.
      When I try to Out-File the results with no size limit, the script never finishes.

      This is the entire code. PI data is replaced with #

      #Setup Modules
      Import-Module ActiveDirectory
      Add-Type -AssemblyName System.DirectoryServices
      
      #Setup eDirectory Connection Variables
      $eDirPath = 'LDAP://####/o=####'
      $eDirUser = '########'
      $eDirPWD = '########'
      $eDirAuthType = 'None'
      
      #Establish eDirectory Connection and Enumerate
      $Root = New-Object System.DirectoryServices.DirectoryEntry -argumentlist $eDirPath,$eDirUser,$eDirPWD,$eDirAuthType
      $Query = New-Object System.DirectoryServices.DirectorySearcher
      $Query.SearchRoot = $Root
      #$Query.SizeLimit = 100 #limits results for testing purposes. Comment-out for full results
      $Query.Filter = "(|(ObjectClass=CCGroupApplication)(ObjectClass=CCGroupRole))"
      $SearchResults = $Query.FindAll()
      
      #Take all requested group names and group members, pipe them to CSV
      $CSVoutput = @() # creates an empty $CSVoutput variable
      ForEach ($Result in $SearchResults)
      {
          $CCGroupALL = [PSCustomObject]$Result.Properties
          ForEach ($Item in $CCGroupALL)
          {
              $Group = $Item.cn
              ForEach ($Member in $Item.member)
              {
                  #Replace strips the attribute prefix and the OU suffixes, leaving only the member ID
                  $Member = $Member -Replace "uid=",""
                  $Member = $Member -Replace "cn=",""
                  $Member = $Member -Replace ",ou=people",""
                  $Member = $Member -Replace ",ou=Clients",""
                  $Member = $Member -Replace ",ou=Employees",""
                  $Member = $Member -Replace ",ou=direct",""
                  #Write-Host "$Group",",","$Member"   #runs in 5 minutes
                  $CSVoutput += "$Group , $Member"    #adds each "cn , member" to the $CSVoutput variable
              }
          }
      }
      
      $CSVoutput | Out-File C:\Users\####\Desktop\CWSiGroupMembers.csv
      
      

      Each result is returned as “GroupName , GroupMember”, like:

      GroupA , jsmith
      GroupA , sjones
      GroupB , jsmith
      GroupC , bsmith

      Ultimately, I’m hoping someone can help me get the results into a file, even if the script takes a while to complete. I’m going to schedule it to run as a nightly job in SQL Server.

    • #42208

      I can’t test this on large data, but two improvements stand out:
      1. Don’t collect big data in arrays, because array += value rebuilds the entire array on every addition.
      2. Use the pipeline and the native Export-Csv cmdlet instead of building strings manually.

      So the code could be structured like this:

      # [initialization...]
      # Limit the properties the searcher loads (e.g. via $Query.PropertiesToLoad) to those you output; this can speed up the query
      $Query.FindAll() | ForEach-Object {
          # [data preparation if needed]
          $_ | Select-Object -Property property1, property2, @{n='Property3';e={$_.ValueForProperty3}}
      } | Export-Csv -Encoding Utf8 -Delimiter 'delimiterchar' -Path OutputPathTo\File.csv
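
      A minimal sketch of point 1, using synthetic data rather than a directory query. The row count, the "Group , user" strings, and the temp file name are made up for illustration. A generic list appends in place, whereas += on an array copies the whole array on every iteration, which gets very slow at several hundred thousand rows.

      ```powershell
      # Synthetic input: 50,000 fake "group , member" rows (illustrative only).
      $rowCount = 50000

      # Option A: a generic list grows in place instead of being copied on each add.
      $list = [System.Collections.Generic.List[string]]::new()
      foreach ($i in 1..$rowCount) {
          $list.Add("Group$($i % 100) , user$i")
      }

      # Option B: let the loop emit strings and capture the output once.
      $captured = foreach ($i in 1..$rowCount) {
          "Group$($i % 100) , user$i"
      }

      # Both hold the same rows; either can be piped to Out-File or Set-Content.
      $outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'GroupMembersDemo.txt'
      $list | Set-Content -Path $outFile
      ```

      Either option should drop into the original loop in place of $CSVoutput += ..., leaving the rest of the script unchanged.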
      

      By the way, if you Import-Module ActiveDirectory, why use DirectorySearcher?

      Get-ADObject -LdapFilter '(filter here)'  -Property CN, Member -ResultSetSize $null |
      Select-Object CN, @{n='Members'; e={$_.Member -replace 'r1' -replace 'r2' -replace 'r3' }} |
      Export-Csv -Path OutputPathTo\File.csv
      
    • #42216

      Also, I can’t see your data, but I believe you can do it with a single -replace using a regular expression.

      For example,
      cn=john,ou=management
      cn=ivan,ou=support
      can be reduced to
      john
      ivan
      like this:
      $data -replace '^cn=([^,]+),.*','$1'
      Here the group (...) captures all characters after cn=, which must be at the start of the line (^), up to the first comma, into $1. The greedy .* after the main pattern consumes the rest, so the whole string is replaced with the captured value.
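
      A quick way to try the pattern on sample strings. The sample values and the second pattern are assumptions for illustration: the original script also strips uid= prefixes, so the sketch adds an alternation (?:cn|uid) to cover both.

      ```powershell
      # Sample distinguished names (made up for illustration).
      $data = @(
          'cn=john,ou=management'
          'cn=ivan,ou=support'
          'uid=jsmith,ou=people,o=example'
      )

      # Pattern from the post: handles only the cn= prefix.
      $cnOnly = $data -replace '^cn=([^,]+),.*', '$1'

      # Assumed extension: accept either cn= or uid= as the leading attribute.
      $either = $data -replace '^(?:cn|uid)=([^,]+),.*', '$1'

      $cnOnly   # john, ivan, and the uid= entry left unchanged
      $either   # john, ivan, jsmith
      ```

      When -replace is applied to an array it runs on each element, and strings that do not match the pattern pass through untouched.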

    • #42338
      Von

      Hi Max,
      Thanks for the ideas! I used the regex you suggested. I also realized that a significant number of unneeded results could be eliminated using a Where clause in the ForEach loop.

                      ForEach ($Member in $Item.member | 
                          Where {$_ -notlike 'cn=[0-9]*' -and $_ -notlike 'uid=[0-9]*'})
                          {
      

      The new filter I added allows the Out-File process to complete in about 4.5 minutes. I understand that my code could be better, faster, etc., but the timing is acceptable for my purposes.
      Thanks again for your help.
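
      For reference, a sketch of how the filter, the single -replace, and Export-Csv might fit together in one pipeline. It runs against made-up stand-in objects rather than real search results; the GroupName and GroupMember column names, the sample DNs, and the temp file name are all assumptions.

      ```powershell
      # Stand-ins for search results: a group name plus its member DNs (made up).
      $groups = @(
          [PSCustomObject]@{ cn = 'GroupA'; member = @('uid=jsmith,ou=people', 'cn=12345,ou=Clients', 'cn=sjones,ou=Employees') }
          [PSCustomObject]@{ cn = 'GroupB'; member = @('uid=98765,ou=direct', 'uid=jsmith,ou=people') }
      )

      $csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'GroupMembersFiltered.csv'

      $groups | ForEach-Object {
          $group = $_.cn
          $_.member |
              # Skip members whose ID starts with a digit, as in the post above.
              Where-Object { $_ -notlike 'cn=[0-9]*' -and $_ -notlike 'uid=[0-9]*' } |
              ForEach-Object {
                  # Emit one object per group/member pair; nothing is accumulated in an array.
                  [PSCustomObject]@{
                      GroupName   = $group
                      GroupMember = $_ -replace '^(?:cn|uid)=([^,]+),.*', '$1'
                  }
              }
      } | Export-Csv -Path $csvPath -NoTypeInformation

      # Read the file back to check the rows.
      Import-Csv -Path $csvPath
      ```

      With the sample data this writes three rows (GroupA/jsmith, GroupA/sjones, GroupB/jsmith); the two numeric IDs are filtered out.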
