Script performance - Storing 10K+ in array

This topic contains 10 replies, has 3 voices, and was last updated by Jeremy Murrah 4 months ago.

  • #91931

    Jeff
    Participant

    I created a script that looks up a number of user accounts, along with the account that created each one, and stores them in an array. Once the script has looped through all users in the text file, it exports the objects in the array to a CSV file. With more than 10,000 users in the input file, the script takes over 3 hours to run and maxes out the memory on the machine. Any ideas to help speed this up? Could I use runspaces to run the lookups in parallel and, instead of saving each object to the same array, write a log file per process and then combine all the log files when it's done? Open to suggestions. Thanks!

    $customObjects = @()
    $Users = gc "C:\TEMP\AllManagedUsers.txt"
    
    foreach ($User in $Users)
    {
    	try
    	{
    		$userObjectLDAP = ''
    		$objectOwner = ''
    		$UserFilter = "(&(objectCategory=user)(objectClass=user)(distinguishedname=$User)(!(userAccountControl:1.2.840.113556.1.4.803:=2)))"
    		$userObject = ([adsisearcher]$Userfilter).FindOne().Path
    		If ($userObject)
    		{
    			$userObjectLDAP = [adsi]$userObject
    			$objectOwner = $userObjectLDAP.PSBase.get_ObjectSecurity().GetOwner([System.Security.Principal.NTAccount]).Value.ToString()
    			
    			$Object = [pscustomobject] @{
    				'User Account'        = $UserObjectLDAP.samaccountname[0]
    				'Created by'	      = $objectOwner
    				'Date created'        = $userObjectLDAP.whenCreated[0]
    				'Distinguished Name'  = $userObjectLDAP.distinguishedname[0]
    			}
    			
    			$customObjects += $Object
    		}
    	}
    	catch
    	{
    		"$User `r`n $($_.Exception.Message)" | Out-File C:\TEMP\UsersErrors.txt -Append
    	}
    }
    
    $customObjects | Export-csv 'C:\TEMP\ActiveADAccounts-CreatedBy.csv' -NoTypeInformation	
    
    
  • #91933

    Sam Boutros
    Participant

    There are 2 issues here:
    1. Slow performance
    It's clear that most of the script's time is spent on I/O, not CPU.
    1.1. Consider using a RAM disk. There are still a few good free tools out there.
    1.2. Consider using something other than gc (Get-Content) to read the file; see http://www.happysysadm.com/2014/10/reading-large-text-files-with-powershell.html

    2. Memory max issues
    2.1. Try rewriting your script so that it reads and processes one or a few records at a time, i.e. so that it operates in a paged fashion.
    2.2. Explicitly call the garbage collector so it reclaims RAM.
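
    Points 1.2 and 2.1 can be combined in a rough sketch: read the input lazily and flush results to the CSV in small pages, so that neither the whole file nor the whole result set sits in memory. This assumes PowerShell 3.0 or later (for Export-Csv -Append). Convert-Line, the batch size of 10, and the temp-file paths are hypothetical stand-ins for illustration and are not part of the original script.

    ```powershell
    # Hypothetical stand-in for the real per-user AD lookup.
    function Convert-Line {
        param([string]$Line)
        [pscustomobject]@{ 'User Account' = $Line; 'Length' = $Line.Length }
    }

    # Demo input and output paths (assumptions; substitute the real files).
    $inFile  = Join-Path ([System.IO.Path]::GetTempPath()) 'paged-demo-users.txt'
    $outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'paged-demo-out.csv'
    1..25 | ForEach-Object { "user$_" } | Set-Content -Path $inFile
    if (Test-Path $outFile) { Remove-Item $outFile }

    $batchSize = 10
    $batch = New-Object 'System.Collections.Generic.List[object]'

    # File.ReadLines yields one line at a time instead of loading the whole file.
    foreach ($line in [System.IO.File]::ReadLines($inFile)) {
        $batch.Add((Convert-Line -Line $line))
        if ($batch.Count -ge $batchSize) {
            # Flush a full page to disk, then reuse the list.
            $batch | Export-Csv -Path $outFile -NoTypeInformation -Append
            $batch.Clear()
        }
    }

    # Flush whatever is left in the final, partial page.
    if ($batch.Count -gt 0) {
        $batch | Export-Csv -Path $outFile -NoTypeInformation -Append
    }

    $rowCount = @(Import-Csv -Path $outFile).Count
    ```

    With this shape, at most one page of result objects is alive at any time, so an explicit garbage-collector call (point 2.2) should rarely be needed.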

  • #91943

    Jeremy Murrah
    Participant

    Your memory usage is caused by the way you're building up the $customObjects variable to hold everything before outputting it. To reduce memory usage, take advantage of the pipeline. If you rewrite your foreach loop as a function, it can output each user object to the pipeline one at a time, and you can then pipe that function's output to a CSV file or anything else. Basically something like this:

    Function Get-LDAPUser {
        Param(
            [Parameter(ValueFromPipeline = $True)]
            [String[]]$Username
        )
        BEGIN {}
        PROCESS {
            Foreach ($User in $Username) {
                # Put all your processing code here and just output the object
                # rather than adding it to an array variable.
                # do stuff that builds $output for this user
                Write-Output $output
            }
        }
    }
    

    Then to call that function use it in a pipeline fashion like this:

    gc "C:\TEMP\AllManagedUsers.txt" | Get-LDAPUser | Export-csv 'C:\TEMP\ActiveADAccounts-CreatedBy.csv' -NoTypeInformation
    # or
    $Users = gc "C:\TEMP\AllManagedUsers.txt"
    Get-LDAPUser -username $Users | Export-csv 'C:\TEMP\ActiveADAccounts-CreatedBy.csv' -NoTypeInformation
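
    To illustrate why this helps: `$array += $item` copies the whole array on every iteration, whereas a function that simply emits objects lets Export-Csv write each one as it arrives. A minimal sketch follows; Get-DemoUser, its placeholder property values, and the temp-file path are hypothetical stand-ins for the real LDAP lookup and output file.

    ```powershell
    # Hypothetical stand-in for Get-LDAPUser: emits one object per input name
    # without collecting anything in a variable.
    function Get-DemoUser {
        param(
            [Parameter(ValueFromPipeline = $true)]
            [string[]]$Username
        )
        process {
            foreach ($User in $Username) {
                # An object that is not assigned anywhere goes to the pipeline.
                [pscustomobject]@{
                    'User Account' = $User
                    'Created by'   = 'DEMO\creator'   # placeholder value
                }
            }
        }
    }

    # Demo output path (an assumption; substitute the real CSV path).
    $csv = Join-Path ([System.IO.Path]::GetTempPath()) 'pipeline-demo.csv'

    # Each object is written to the CSV as it is produced; no growing array.
    1..1000 | ForEach-Object { "user$_" } | Get-DemoUser |
        Export-Csv -Path $csv -NoTypeInformation

    $rows = @(Import-Csv -Path $csv)
    ```

    If the results really do have to be held in memory, a System.Collections.Generic.List[object] with .Add() avoids the copy that += makes on a plain array.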
    
    • #91951

      Jeff
      Participant

      Thanks, this does appear to be faster. However, it does not export in the format I want. I want a final CSV file with only the 4 columns I created for the custom object (I don't want everything displayed in the console). If I use Write-Output and then pipe it all to Export-Csv, the file comes out with a lot of unrelated info... I also cannot use:

      gc "C:\TEMP\AllManagedUsers.txt" | Get-LDAPUser | Export-csv 'C:\TEMP\ActiveADAccounts-CreatedBy.csv' -NoTypeInformation

      With the above, only the last user listed in the text file is processed. I have to pass the list through the -Username parameter to have it loop through all users in the text file. Any suggestions?

    • #91954

      Jeff
      Participant

      Never mind, I figured out what I did wrong. It now exports the objects to CSV without a problem and is FAST. However, the first example you suggested is still not working. How do I fix that so that I can pipe gc to the function and have it loop through each user?

      Not working (only uses last user)

       gc "C:\TEMP\AllManagedUsers.txt" | Get-LDAPUser | Export-csv 'C:\TEMP\ActiveADAccounts-CreatedBy.csv' -NoTypeInformation
  • #91982

    Jeremy Murrah
    Participant

    What does the AllManagedUsers.txt file look like? Does it have a header or any weird formatting?

    • #91985

      Jeff
      Participant

      It is just a user account name on each line. It works fine if I do:

      get-ldapuser -username (get-content C:\Temp\File.txt) | export-csv C:\Temp\output.csv

      but not:

      get-content C:\Temp\File.txt | get-ldapuser | export-csv C:\Temp\output.csv
  • #91988

    Jeremy Murrah
    Participant

    Weird, it should work. Post your new script; maybe it's something odd with the looping.

    • #91990

      Jeff
      Participant

      function Get-ADUserAccountCreator
      {
      	[CmdletBinding()]
      	param
      	(
      		[Parameter(Mandatory = $true,
      				   ValueFromPipeline = $true)]
      		[String[]]$Username
      	)
      	
      	foreach ($User in $Username)
      	{
      		try
      		{
      			$userObjectLDAP = ''
      			$objectOwner = ''
      			$UserFilter = "(&(objectCategory=user)(objectClass=user)(samaccountname=$User)(!(userAccountControl:1.2.840.113556.1.4.803:=2)))"
      			$userObject = ([adsisearcher]$Userfilter).FindOne().Path
      			If ($userObject)
      			{
      				$userObjectLDAP = [adsi]$userObject
      				$objectOwner = $userObjectLDAP.PSBase.get_ObjectSecurity().GetOwner([System.Security.Principal.NTAccount]).Value.ToString()
      				
      				$Object = [pscustomobject] @{
      					'User Account'	    = $UserObjectLDAP.samaccountname[0]
      					'Created by'	    = $objectOwner
      					'Date created'	    = $userObjectLDAP.whenCreated[0]
      					'Distinguished Name' = $userObjectLDAP.distinguishedname[0]
      				}
      				
      				Write-Output $Object
      			}
      			Else
      			{
      				Write-Host "$User not found in AD or is disabled." -ForegroundColor red	
      			}
      		}
      		catch
      		{
      			"$User - $($_.Exception.Message)" | Out-File C:\TEMP\UsersErrors.txt -Append
      		}
      	}
      }
      

      When piping into the function, it seems to only list the last user in the file.

  • #91993

    Jeff
    Participant

    Ahh, apparently I need the PROCESS block. Good to know. Think I'm good now. Thanks.

    This works:

    function Get-ADUserAccountCreator
    {
    	[CmdletBinding()]
    	param
    	(
    		[Parameter(Mandatory = $true,
    				   ValueFromPipeline = $true)]
    		[String[]]$Username
    	)
    	BEGIN{}
    	PROCESS
    	{
    		foreach ($User in $Username)
    		{
    			try
    			{
    				$userObjectLDAP = ''
    				$objectOwner = ''
    				$UserFilter = "(&(objectCategory=user)(objectClass=user)(samaccountname=$User)(!(userAccountControl:1.2.840.113556.1.4.803:=2)))"
    				$userObject = ([adsisearcher]$Userfilter).FindOne().Path
    				If ($userObject)
    				{
    					$userObjectLDAP = [adsi]$userObject
    					$objectOwner = $userObjectLDAP.PSBase.get_ObjectSecurity().GetOwner([System.Security.Principal.NTAccount]).Value.ToString()
    					
    					$Object = [pscustomobject] @{
    						'User Account'	    = $UserObjectLDAP.samaccountname[0]
    						'Created by'	    = $objectOwner
    						'Date created'	    = $userObjectLDAP.whenCreated[0]
    						'Distinguished Name' = $userObjectLDAP.distinguishedname[0]
    					}
    					
    					Write-Output $Object
    				}
    				Else
    				{
    					Write-Host "$User not found in AD or is disabled." -ForegroundColor red
    				}
    			}
    			catch
    			{
    				"$User - $($_.Exception.Message)" | Out-File C:\TEMP\UsersErrors.txt -Append
    			}
    		}
    	}
    	END{}
    }
    
  • #91994

    Jeremy Murrah
    Participant

    Wrap all that in a PROCESS{} block. When you write pipeline functions you have access to BEGIN, PROCESS, and END blocks. The BEGIN block runs once at the beginning, the PROCESS block automagically runs once for each pipeline object, and then the END block runs once at the end. If you leave them out, the whole function body is treated as the END block, so it runs only once, after the last object has come through the pipeline, and the parameter holds just that last object.
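
    A small demonstration of the block behavior described above, using two hypothetical functions (Test-NoProcess and Test-WithProcess) that are not part of the scripts in this thread:

    ```powershell
    # No named blocks: the body runs once, as an END block, after all input
    # has arrived, so $Name only holds the last piped value.
    function Test-NoProcess {
        param([Parameter(ValueFromPipeline = $true)][string[]]$Name)
        foreach ($n in $Name) { "saw $n" }
    }

    # With a PROCESS block the body runs once per piped object.
    function Test-WithProcess {
        param([Parameter(ValueFromPipeline = $true)][string[]]$Name)
        process {
            foreach ($n in $Name) { "saw $n" }
        }
    }

    $without = @('a', 'b', 'c' | Test-NoProcess)     # only 'saw c'
    $with    = @('a', 'b', 'c' | Test-WithProcess)   # 'saw a', 'saw b', 'saw c'

    # Passing the array by parameter works either way, because the foreach
    # loop then sees all values at once.
    $byParam = @(Test-NoProcess -Name 'a', 'b', 'c')
    ```

    This matches the symptom reported earlier in the thread: piping into the function without a PROCESS block only handled the last user, while passing the list through -Username handled all of them.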
