PowerShell Performance: The += Operator (and When to Avoid It)

In PowerShell, there are always many different ways to accomplish a given task. Sometimes these different options offer trade-offs in performance and code clarity: faster execution at the expense of higher memory usage (or vice versa), or better performance at the expense of code that isn't as easy to read. Depending on how much data you need to process, the differences between options may not really matter, and you can pick whatever is most aesthetically pleasing. However, if your script needs to scale well with large data sets, you'll want to know how to make sure your script isn't wasting a lot of CPU time or memory. This article, possibly the first in a series, touches on one such performance "gotcha": using the += operator on strings or arrays.

Every so often, I see a blog post or script posted online containing code that looks something like this:

$outputString = ""
$array = @()

for ($i = 0; $i -lt 10; $i++)
{
    $outputString += "Line $i`r`n"
    $array += "Array Element $i"
}

The author may not be appending to both a string and an array in the same block, but this example illustrates both ideas at once. When the loop only executes 10 times (or even 1,000 times), the performance of this block of code isn't so bad. It runs in less than 100 milliseconds on my computer when I change the loop limit to 1,000. When I bump it up to 10,000, though, it takes over 5 seconds to run, and it just gets worse from there: 12.5 seconds for 15,000 elements, 26 seconds for 20,000 elements, and so on. The increase in execution time is not linear; it is roughly quadratic, so doubling the number of elements multiplies the run time by four or more. In other words, this code does not scale well at all.

The reason for this is that, in the .NET Framework, arrays have a fixed size and strings are immutable, so neither can be appended to in place. Every one of those += operators causes a new Array or String to be created, the contents of the original to be copied over (plus the one new line or array element), and the original to be discarded. As the size of the string or array goes up, each new copy takes longer and longer to complete.
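The copy-on-append behavior described above can be observed directly. The following is a minimal sketch, not part of the original benchmark: the variable names and the Get-CopyCount helper are hypothetical, and the cost model in the second half is a simplifying assumption (each append copies every element already present).

```powershell
# Arrays are fixed-size: += builds a new array and rebinds the variable.
$array = @('a', 'b', 'c')
$original = $array              # second reference to the same array object
$array += 'd'

$array.IsFixedSize              # True: an array cannot grow in place
$original.Count                 # 3: the old array was left untouched
$array.Count                    # 4: $array now refers to a new, larger array

# Strings are immutable: += builds a new string in the same way.
$text = 'Line 0'
$originalText = $text
$text += "`r`n"
$originalText.Length            # 6: the old string is unchanged
$text.Length                    # 8

# Rough cost model (an assumption for illustration): append number k copies
# the k elements already present, so n appends copy 0 + 1 + ... + (n - 1).
function Get-CopyCount([long]$n) { $n * ($n - 1) / 2 }   # hypothetical helper

Get-CopyCount 10000             # 49995000
Get-CopyCount 20000             # 199990000
```

Under that model, doubling the element count from 10,000 to 20,000 roughly quadruples the number of elements copied, which is in line with the timings quoted earlier.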

The .NET Framework offers classes to address both of these performance problems. Instead of appending to Strings directly, you can use System.Text.StringBuilder. As an alternative to arrays, you can use either System.Collections.ArrayList or the generic System.Collections.Generic.List[T]. In a PowerShell script, the difference between the two usually doesn't matter; in the next example, I'll use List[T]. It requires me to specify the type of elements that the list will contain, but it performs better than ArrayList in some situations (and since this is a performance post, I may as well use the best option).
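One practical difference between these classes in a script is what their append methods return. The sketch below uses illustrative variable names and sample values of my own: ArrayList.Add returns the index of the new element, StringBuilder.Append returns the StringBuilder itself, and List[T].Add returns nothing.

```powershell
$arrayList     = New-Object System.Collections.ArrayList
$list          = New-Object System.Collections.Generic.List[System.String]
$stringBuilder = New-Object System.Text.StringBuilder

# ArrayList.Add returns the index of the new element, so the result has to
# be captured or discarded to keep it out of the pipeline.
$index = $arrayList.Add('first')     # $index is 0
$null  = $arrayList.Add('second')

# List[T].Add returns nothing, so no suppression is needed.
$list.Add('first')
$list.Add('second')

# StringBuilder.Append returns the StringBuilder itself (which allows
# chaining), so its result is discarded as well.
$null = $stringBuilder.Append('first').Append('second')

$arrayList.Count            # 2
$list.Count                 # 2
$stringBuilder.ToString()   # firstsecond
```

If the return values from ArrayList.Add or StringBuilder.Append are not discarded, they become part of the script's or function's output, which is usually not what you want.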

Here's how you can test the performance of the original example code, and compare it to the performance of StringBuilder and List:

Write-Host "Using += operators:"

$outputString = ""
$array = @()

Measure-Command {
    for ($i = 0; $i -lt 20000; $i++)
    {
        $outputString += "Line $i`r`n"
        $array += "Array Element $i"
    }
}

Write-Host "Using StringBuilder and List:"

$stringBuilder = New-Object System.Text.StringBuilder
$list = New-Object System.Collections.Generic.List[System.String]

Measure-Command {
    for ($i = 0; $i -lt 1000000; $i++)
    {
        # Notice that I'm assigning the result of $stringBuilder.Append to $null,
        # to avoid sending any unwanted data down the pipeline.

        $null = $stringBuilder.Append("Line $i`r`n")
        $list.Add("Array Element $i")
    }

    # These lines show you how to convert your StringBuilder and List objects back to String and Array types for later use.

    $outputString = $stringBuilder.ToString()
    $array = $list.ToArray()
}

In this case, the code clarity hasn't suffered at all, in my opinion. $list.Add and $stringBuilder.Append are both very clear in their meaning, just as easy to read as the += operator.

Notice that I snuck in a difference in scale, there. The "+=" block only had to process 20,000 elements, and the StringBuilder / List block was cranked up to a million. The results?

Using += operators:
TotalMilliseconds : 26024.1599

Using StringBuilder and List:
TotalMilliseconds : 8334.3011

Even though they had to process 50 times more data, the StringBuilder and List classes did the job in less than one third the time.
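For a like-for-like comparison, both approaches can also be run over the same number of elements. This is a sketch under my own assumptions: the $count value of 5,000 and the variable names are illustrative, and the timings will vary with the machine and the PowerShell version, so no expected numbers are shown.

```powershell
$count = 5000   # assumed size; small enough that the += loop finishes quickly

$plusEquals = Measure-Command {
    $slowString = ""
    $slowArray = @()

    for ($i = 0; $i -lt $count; $i++)
    {
        $slowString += "Line $i`r`n"
        $slowArray += "Array Element $i"
    }
}

$builders = Measure-Command {
    $stringBuilder = New-Object System.Text.StringBuilder
    $list = New-Object System.Collections.Generic.List[System.String]

    for ($i = 0; $i -lt $count; $i++)
    {
        $null = $stringBuilder.Append("Line $i`r`n")
        $list.Add("Array Element $i")
    }

    $fastString = $stringBuilder.ToString()
    $fastArray = $list.ToArray()
}

# Both approaches should have produced the same data.
$fastString -eq $slowString
$fastArray.Count -eq $slowArray.Count

"{0,-28} {1,12:N1} ms" -f 'Using += operators:', $plusEquals.TotalMilliseconds
"{0,-28} {1,12:N1} ms" -f 'Using StringBuilder and List:', $builders.TotalMilliseconds
```

Increasing $count should widen the gap between the two results, since the cost of the += loop grows much faster than the cost of the StringBuilder and List loop.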

About the Author

Dave Wyatt

Dave Wyatt is a Microsoft MVP (PowerShell) and a member of PowerShell.org's Board of Directors.

16 Comments

  1. PowerShell on Linux exists to help Windows admins transition away from Windows Server. But in a decade, when > 75% of workloads running on Azure are Linux, the tools used to manage those workloads will be designed for and meet the needs of the Linux administrator mindset. Linux administrators have a great set of mature tools and it isn't entirely clear why they'd want to transition to using PowerShell. Linux has won. The growth of Linux on Azure shows that. It's time for Azure to be a Linux First environment - and PowerShell on Linux doesn't really fit because it's primarily trying to graft a new and unproven Microsoft way of doing things onto successful practices and culture. Sure they can do it - but they'd be better off retraining everyone to "Think Linux".

    If Microsoft has (apparently) given up on Windows Server and understands that only a few legacy holdouts will want to run it in Azure in 10 years' time (with the majority of workloads being Linux), they are wasting resources trying to reinvent how Linux administrators do things as a way of placating legacy Windows administrators. Legacy Windows administrators should bite the bullet and go "all-in" and adopt existing successful open source administration paradigms. The clock is ticking on their relevance and if they spend precious time investing in a nascent administration technology instead of fully transitioning to an open source mindset, they'll be less employable in future.

    PowerShell on Linux would be a neat idea if Windows Server had a future. It'll remain around in the same way that mainframes are still with us - but Microsoft has no interest in making a compelling case for organizations to choose their product over the free alternative. The future of Windows Server is the current reality of Windows Phone.

    • Thank you SO much for contributing your perspective! I do think - and I'm not a Microsoft fanboy per se - that you're misinformed, or at least under-informed. Time will tell, of course, but I suspect you've brought some personal bias to your viewpoint.

    • Interesting perspective. But I think you missed the point of why Microsoft is actually open sourcing PowerShell. If you followed the talks that Jeffrey Snover gave over the last few years, it became clear that they want to be able to support heterogeneous environments. And from what I've seen now from Microsoft, and especially the PowerShell team, they don't have a hidden agenda. Microsoft is no longer the Microsoft of, let's say, 5-10 years ago.

      I agree with you that there's a lot of Linux on Azure. But many, many companies I come across have mainly Windows-based infrastructures. Also, people I know who work for other companies see almost only Windows-based infrastructure.

      Yes, Linux has its place in this world, but according to Microsoft there's no battle between Windows and Linux. They're both citizens in IT which Microsoft wants to support as best it can. And PowerShell is not a tool per se, it's meant to be a management framework. A framework that operates with built-in tools or, in the case of Windows, the .NET Framework.

      Windows Server has a big future, just look at the development Microsoft is doing on Nano Server.

    • Do you really think that in 20 years we'll be dealing with an "OS war"?

      Do you think we'll see Linux or Windows in 20-30 years? I don't.

      Do you think there is a losing side or a winning side? There are never winners in any war.

      From my POV, the shift towards lean kernels to accommodate the cloud will eventually get us to a unified kernel of some sort, taking the best of breed from all OSes and giving developers and IT the option to focus on the tools and the frameworks and less on the underlying layers.

      Running a business that creates an OS is becoming very expensive. No one wants to be limited in the tools they can use, so SQL on Linux is a huge, huge thing in that sense, and it will only get bigger with more such products going the same way. I have yet to see any major party on the Linux side offer anything similar, because it takes money and effort that very few companies have. So MS, in that sense, is helping transform the ecosystem again, and in a very good direction.

      PowerShell on Linux exists so that I, as a Windows admin, will have a lower barrier to entry if my boss decides one day to invest some of our company's assets in Linux. If I can help my company make the right decisions that will save it money and achieve more, and if that means going a non-MS way, guess what: I can still use my skills from the Windows side without the hassle of the learning curve.

      For a long time I've been an advocate of learning both Windows and Linux, no matter what I mostly do at work, as they are just tools to do the job, means to achieve a goal. They are not the goals themselves, and the move to the cloud just emphasizes that even more.

      I think your notion of what open source and free mean is what's leading you down this line of thought, and that's where I think you are wrong, imho.
      I'm not saying that my notion of what open source and free mean is better, but it's somewhat less biased. Nothing is free. Open source doesn't mean security (look at the horrible OpenSSL hole that was there for two years and was only recently closed) or support-when-you-NEED-it. That will always cost money, either through support contracts or by having devs who know that specific language deal with the bugs internally (which is itself even more limiting, given the number of languages and frameworks popping up every second day).

      As for hidden agendas, you need to remember this is still a business. There's always money involved and business opportunities to be made. Over the years, MS has always been good at creating those opportunities for itself and its partners, and it continues to do so. The bottom line will be the tools. If you have ones that do the job for you, keep using them. If MS puts money and effort into creating better tools with the community, who's the winner? Everyone.

      I've seen this in the heated arguments on the PS repo the second it went public. The lack of broader vision some of the Linux-based audience showed, the arrogance, the "It's mine, don't touch it" attitude is somewhat alarming. I'm just happy to know that the sysadmins of 20 years from now, the ones born today, will have a different starting point, where they will choose the tools rather than be told what to use by old retiring sysadmins who are trying to hold on to their precious seats instead of embracing change and supporting it in the evolving IT world.

  2. If I have to copy files from an installation folder to a destination, and I have to do exception handling because everything is automated, how do I do that? What errors may arise, and how do I recognize and handle them? Please help.

  3. PowerShell ought to get the credit it deserves for enabling a developer to rapidly create rich output, handling complex decisions based on datasets gathered by various means and implementing a nearly infinite number of actions based on these. Simply put, it can be, and it is, much more than an admin tool in the right hands.

  4. To start, this is a thing of beauty in its simplicity.
    Does anyone have experience with how much memory the results occupy when doing Get-Job | Receive-Job at the end? If I run, say, 1,000 or 10,000, will this cause memory problems? I am thinking that doing Get-Job -State Complete | Receive-Job and then | Remove-Job inside the loop (and logging it) would reduce the chance of running the host out of memory, or am I just overcomplicating it?

  5. a) Remoting
    The primary purpose of PS on *nix will be remoting to Win-Hosts, such as Bash on Windows vice versa.

    Due to the nature of *nix as document driven OS, an object based shell does not make that much sense. We're missing the API level. Jeffrey told us so, long ago.

    b) Religious affairs
    It's not about publishing the code (which is nevertheless great!).
    The GPL especially is the denial of the biz model that drives the revenue of Microsoft. So, indeed, haters will hate. Agree.

    But, in for a penny, in for a pound, PoSh is part of Windows which is an expensive, closed down product, increasingly incapacitating the user.

    c) The role of Community
    Sorry to say that, but the PS community is so much more than the few "Get-Expert -wellknown | Get-Random" MVPs. I know it's hard to see that inside the bubble.

    PoSh itself is gorgeous but - at the end - just a shell, such as Korn, C, Z and all the others.

    Far more important: the promise of a datacenter abstraction layer beyond the borders of specific vendors, automation and the refusal of a click UI.

    In this sense, publishing the underlying code is a statement which can't be exaggerated!

    Great Post, Don!
