Executing LINQ Queries in PowerShell - Part 2

And we're back!

Ok, so in the last blog we began a conversation about delegates and using LINQ in PowerShell. In today's post, I'm going to give an example of how it can be incredibly useful. Let's talk about Joins.

Joins

In my line of work, I'm constantly running into the need to combine datasets from multiple sources that relate to each other and pull out some specific properties. Say you have two internal services, one which is used to track production status and another which is used to monitor whether machines are online. To demonstrate this, let's initialize some mock data once again.

#Create empty arrays
$DatasetA = @()
$DatasetB = @()
#Initialize "status" arrays to pull random values from
$ProductionStatusArray = @('In Production','Retired')
$PowerStatusArray = @('Online','Offline')
#Loop 1000 times to populate our separate datasets
1..1000 | Foreach-Object {
    #Create one object with the current iteration attached to the name property
    #and a random power status
    $PropA = @{
        Name = "Server$_"
        PowerStatus = $PowerStatusArray[(Get-Random -Minimum 0 -Maximum 2)]
    }
    $DatasetA += New-Object -Type PSObject -Property $PropA
    #Create a second object with the same name and a random production status
    $PropB = @{
        Name = "Server$_"
        ProductionStatus = $ProductionStatusArray[(Get-Random -Minimum 0 -Maximum 2)]
    }
    $DatasetB += New-Object -Type PSObject -Property $PropB
}

Now we have two datasets with the same server names, one showing production status and the other showing power status. Our goal is to join that data together. In traditional PowerShell, we would likely iterate through one of the sets while doing a filter on the second set and then either add property members to the first set or create all new objects with a combination of properties from both sets. Something like this:

$JoinedData = @()
foreach($ServerA in $DatasetA) {
    $ServerB = $DatasetB | Where-Object Name -eq $ServerA.Name
    $Props = @{
        Name = $ServerA.Name
        PowerStatus = $ServerA.PowerStatus
        ProductionStatus = $ServerB.ProductionStatus
    }
    $JoinedData += New-Object -Type PSObject -Property $Props
}

This works fine. If I wrap it in a Measure-Command it takes right around 8.82 seconds to complete. Not awful, but at enterprise level where you're dealing with ten times that amount of data, you can see how that run time could get out of control. Now let's do the same with LINQ:

$LinqJoinedData = [System.Linq.Enumerable]::Join(
    $DatasetA, 
    $DatasetB, 
    [System.Func[Object,string]] {param ($x);$x.Name},
    [System.Func[Object,string]]{param ($y);$y.Name},
    [System.Func[Object,Object,Object]]{
        param ($x,$y); 
        New-Object -TypeName PSObject -Property @{
        Name = $x.Name; 
        PowerStatus = $x.PowerStatus; 
        ProductionStatus = $y.ProductionStatus}
    }
)
$OutputArray = [System.Linq.Enumerable]::ToArray($LinqJoinedData)

This completed for me in just over 0.4 seconds! Hopefully after last week this syntax doesn't look too daunting, but let's walk through what we just did. We're calling the Join method on System.Linq.Enumerable and then passing it five parameters.

  1. The first dataset we're going to join
  2. The second dataset to join
  3. The delegate which defines the key to compare against on the first dataset
  4. The delegate which defines the key to compare against on the second dataset
  5. Finally, we pass in the delegate which defines what the output should look like
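To make the five arguments concrete, here is a minimal sketch of the same call on two tiny hand-written datasets. The variable names ($Power, $Prod, $Joined) and the sample servers are made up for illustration. It also shows that Join behaves like an inner join: only names present in both datasets come out the other side.

```powershell
# Two small, hypothetical datasets that share only 'Server2'
$Power = [object[]]@(
    [pscustomobject]@{ Name = 'Server1'; PowerStatus = 'Online' }
    [pscustomobject]@{ Name = 'Server2'; PowerStatus = 'Offline' }
)
$Prod = [object[]]@(
    [pscustomobject]@{ Name = 'Server2'; ProductionStatus = 'In Production' }
    [pscustomobject]@{ Name = 'Server3'; ProductionStatus = 'Retired' }
)

# ToArray forces the join to run immediately
$Joined = [System.Linq.Enumerable]::ToArray(
    [System.Linq.Enumerable]::Join(
        $Power,                                              # 1. first dataset
        $Prod,                                               # 2. second dataset
        [System.Func[object,string]]{ param($x) $x.Name },   # 3. key on the first dataset
        [System.Func[object,string]]{ param($y) $y.Name },   # 4. key on the second dataset
        [System.Func[object,object,object]]{                 # 5. shape of each output object
            param($x, $y)
            [pscustomobject]@{
                Name             = $x.Name
                PowerStatus      = $x.PowerStatus
                ProductionStatus = $y.ProductionStatus
            }
        }
    )
)

$Joined   # a single object, for Server2
```

Server1 and Server3 are dropped because neither has a partner with the same key in the other dataset.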

So it looks complicated, but once you use it a few times, it's really not too bad. Now you're probably wondering why I added that final line where I called "[System.Linq.Enumerable]::ToArray($LinqJoinedData)." For that we need to talk about "Deferred Execution vs. Immediate Execution." When you call the Join method, it's not actually joining the data at that time; rather, it returns a lazy query object (an iterator) that holds on to the two datasets and the three delegates needed to perform the join. This defers the execution point to when the data is actually enumerated. So in the above example, I called "ToArray()" merely to provide an accurate timespan for how long the join actually takes as opposed to the more traditional PowerShell approach we used before it. If this were production code and I wanted to see machines with an offline status that are listed as in production, rather than that "ToArray()" line I could simply run this:

$LinqJoinedData.Where({($_.PowerStatus -eq "Offline") -and ($_.ProductionStatus -eq "In Production")})

The Join query would execute at that time and then "Where()" would filter down to just the objects I requested.
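If you want to see deferred execution for yourself, the following sketch counts how many times a delegate is actually invoked. It uses Select rather than Join to keep things short, and the names ($counter, $query, $doubled) are just for illustration.

```powershell
# A hashtable is a reference type, so the delegate can update it from its own scope
$counter = @{ Calls = 0 }
[int[]]$numbers = 1..5

# Building the query does not invoke the delegate yet
$query = [System.Linq.Enumerable]::Select(
    $numbers,
    [System.Func[int,int]]{ param($n) $counter.Calls++; $n * 2 }
)
$callsBeforeEnumeration = $counter.Calls

# Enumerating the query (here via ToArray) is what triggers the work
$doubled = [System.Linq.Enumerable]::ToArray($query)

"Calls before enumeration: $callsBeforeEnumeration"
"Calls after enumeration:  $($counter.Calls)"
"Result: $($doubled -join ', ')"
```

The first count should be zero and the second should match the number of input items, which is the whole point of calling ToArray() inside a timing test.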

And there you have it! If you found this interesting, I encourage you to check out these modules:

Feel free to reach out to me on Twitter or check out my personal site from time to time for other content. If you've seen my recent talk at PowerShell Summit, I'll soon be posting the blog I referenced there about turning my dog into a tea kettle (it's not PowerShell related, so it will be landing somewhere other than here).

Happy tinkering!

-Eli

Executing LINQ Queries in PowerShell - Part 1

Greetings PowerShellers!

Lately, I've been itching to write something up on Microsoft’s Language-Integrated Query (LINQ). You've likely encountered it if you've done any development in C#. LINQ is an incredibly powerful querying tool for performing look-ups, joins, ordering, and other common tasks on large data sets. We have a few similar cmdlets built into PowerShell, but other than the '.Where()' method on collection objects nothing that comes close to the speed at which LINQ operates.

To dig into this topic, we're going to have to do a quick high level overview of a couple of other .NET staples often encountered in the C# world. You see, unlike most .NET methods which accept object types like integers, strings, and the like, LINQ uses static extension methods which only accept delegate object types.

What are delegates? In application development, there is an occasional need for objects within memory to communicate with each other for things such as "button click events." To address this, the Windows API uses function pointers to create callback functions which then report back to other functions in your applications. Within the .NET Framework, these are called delegates.

Delegates are objects that point to another method, or possibly many methods, by storing three key pieces of information: the address of the method on which it makes calls, the parameters (if any) of this method, and the return type (if any) of this method. With this information, a delegate object is able to invoke these methods dynamically at runtime, either synchronously or asynchronously.

A simple example of this in C# looks like this:

using System;
namespace SimpleDelegate
{
    //Delegate declaration
    public delegate void PrintMessage(string msg);
    // Create a class with the method to bind to the delegate
    public class MessagePrinter
    {
        public static void PrintLine(string msg)
        { 
            Console.WriteLine(msg); 
        }
    }
    class Program
    {
        static void Main(string[] args)
        {
            // Create a PrintMessage delegate object that
            // "points to" MessagePrinter.PrintLine().
            PrintMessage p = new PrintMessage(MessagePrinter.PrintLine);
            p("Hi Animatronio!");
            Console.ReadLine();
        }
    }
}

Clearly, in this example the use of delegates is not necessary. I'm just trying to frame up how they would be declared and subsequently called. To simplify all of the above, Microsoft has created two generic delegate definitions. For delegates with no output, we can use Action<> and for delegates with output, we can use Func<>. These two beauties are what give us PowerShellers access to LINQ. Today we're going to use Func<> because we want output. The syntax for doing so looks like this:

[Func[int,int]]$Delegate = { param($i); return $i + 1 }

Let's break this down left to right:

  1. Declare Func<>
  2. Tell it the type of parameter(s) to expect. In this case we're passing a single integer parameter.
  3. Tell it the type of output to produce, again an integer will be returned.
  4. Name the delegate variable.
  5. Define the delegate with a scriptblock. We're just doing a very simple addition step on the parameter and returning the output.
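As a quick sanity check of the breakdown above, here is a small sketch that declares a few delegates and calls them directly with Invoke(). The variable names are made up for the example.

```powershell
# Func with one input and one output: the last type argument is always the return type
[Func[int,int]]$AddOne = { param($i) return $i + 1 }
$AddOne.Invoke(41)        # 42

# Func with two inputs and one output
[Func[int,int,int]]$Add = { param($a, $b) return $a + $b }
$Add.Invoke(2, 3)         # 5

# Action takes input but returns nothing, so it is used for side effects
$messages = New-Object 'System.Collections.Generic.List[string]'
[Action[string]]$Log = { param($msg) $messages.Add($msg) }
$Log.Invoke('hello')
$messages.Count           # 1
```

LINQ methods call Invoke() for you on every element, which is why they want a typed delegate rather than a bare scriptblock.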

And now we've finally arrived at the meat of this article. Let's initialize a mock dataset with ~2 million objects to play with:

$Dataset = @()
0..1000 | Foreach-Object { $Dataset += (Get-Verb)[(Get-Random -Maximum 98)] }
0..10 | ForEach-Object {$Dataset += $Dataset}

Next we'll measure how long it takes to filter down to only the objects whose verb equals "Use" using Where-Object on three different Windows Server versions running on the same Azure compute instances:

Measure-Command { ($Dataset | Where-Object Verb -eq "Use") }
# 2008 R2: TotalSeconds : 23.3399981
# 2012 R2: TotalSeconds : 61.7634027
# 2016   : TotalSeconds : 18.0190367

Now let's do the same query using LINQ:

[Func[object,bool]] $Delegate = { param($v); return $v.verb -eq "Use" }
Measure-Command { [Linq.Enumerable]::Where($Dataset,$Delegate) }
# 2008 R2: TotalSeconds : 11.3967464
# 2012 R2: TotalSeconds : 25.6511816
# 2016   : TotalSeconds : 12.8999417

As you can see, in the older operating systems, LINQ is over twice as fast (also, what's the deal with 2012 R2??). In 2016, it's only about 40% faster. But of course, calling '.Where()' directly on the object is still by far the fastest way to filter on a dataset:

Measure-Command { $Dataset.Where( {$_.Verb -eq "Use"}) }
# 2008 R2: TotalSeconds : 5.5102392
# 2012 R2: TotalSeconds : 17.5893828
# 2016   : TotalSeconds : 6.1834444

Initially I had suspected it was translating the scriptblock as an anonymous function and tapping into the LINQ extension method behind the scenes, but Bruce Payette set me straight. According to Bruce, it's using a very low level API to invoke the scriptblock. Source code is here.

So if '.Where()' is so much faster, why did I bother writing this? I wanted to open with a familiar concept. The true power in LINQ comes from its SQL-like ability to aggregate and manipulate data. In the next blog, we'll take a look at grouping data, using joins, and why that's awesome.

Until then, happy tinkering!

-Eli

Do Anything in One Line of PowerShell

PowerShell provides a tremendous boon to productivity for computer professionals of all types. But, you have to admit: it can be a bit daunting to get up to speed! Indeed, as someone who has a fair amount of experience using it, I still find myself having to look up how to do things--frequently. So I started keeping track of the recipes I was using the most. And came up with a list of 400 or so, published in 4 parts.

Though I actually wrote these a couple years back they are certainly still relevant today, just covering a bit less of the ever-expanding PowerShell universe of discourse!

(Note that at the end of each web article listed above is a link to download it as a PDF that is more tidily formatted.)

PowerShell Gotchas

You can certainly find a number of articles around that present PowerShell pitfalls that can easily trip you up if you are not careful. I took a different approach in my three-part series, A Plethora of PowerShell Pitfalls.

The first two parts are presented in quiz format, together covering the top 10 "gotchas". They will help you test your awareness: did you even realize the danger, or have you been skirting those traps for a while without knowing it? After you've had an opportunity to consider the conundrums presented, I then go into detailed explanations for why they happen and how to fix them.

The third and final part is a compendium of all the common "gotchas" that I put together after reviewing all the other lists out there. The more than 35 entries in the list cover, I believe, a good 98% of the issues you would likely encounter. Yes, there are more esoteric pitfalls as well, but I ran out of web page... 🙂

Part 1: Pesky Parameter Problems

Part 2: A Portion of Potential Puzzles

Part 3: The Compendium

 

Pitfalls of the Pipeline

Pipelining is an important concept in PowerShell. Though the idea did not originate with PowerShell (you can find it used decades earlier in Unix, for example), PowerShell does provide the unique advantage of being able to pipeline not just text, but first-class .NET objects.

Pipelining has several advantages:

  • It helps to conserve memory resources. Say you want to modify text in a huge file. Without a pipeline you might read the huge file into memory, modify the appropriate lines, and write the file back out to disk. If it is large enough you might not even have enough memory to read the whole thing.
  • It can substantially improve actual performance. Commands in a pipeline are run concurrently, even if you have only a single processor: when one process blocks (for example, while reading a large chunk of your file), another process in the pipeline can do a unit of work in the meantime.
  • It can have a significant effect on your end-user experience, enhancing the perceived performance dramatically. If your end-user executes a sequence of commands that takes 60 seconds, then until 60 seconds has elapsed he/she would see nothing without pipelining, whereas output could start appearing almost immediately with pipelining.

PowerShell provides a variety of techniques for using pipelining but it is all too easy to do it wrong, so you think you are pipelining but in fact you are not. In my article Ins and Outs of the PowerShell Pipeline, I discuss the most common things that can trip you up with implementing pipelining and how to avoid them.
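One simple way to check whether you really are pipelining is to log when each item is produced and when it is consumed. The sketch below is only an illustration with made-up names ($streamed, $collected); it is not taken from the article.

```powershell
# Case 1: a true pipeline. Each item flows to the next stage as soon as it is produced.
$streamed = New-Object 'System.Collections.Generic.List[string]'
1..3 |
    ForEach-Object { $streamed.Add("produce $_"); $_ } |
    ForEach-Object { $streamed.Add("consume $_") }

# Case 2: collecting into a variable first. Nothing is consumed until everything is produced.
$collected = New-Object 'System.Collections.Generic.List[string]'
$all = 1..3 | ForEach-Object { $collected.Add("produce $_"); $_ }
foreach ($item in $all) { $collected.Add("consume $item") }

$streamed -join ', '    # produce 1, consume 1, produce 2, consume 2, produce 3, consume 3
$collected -join ', '   # produce 1, produce 2, produce 3, consume 1, consume 2, consume 3
```

The interleaved order in the first case is what keeps memory use low and lets output appear early; the second case looks similar on the page but holds the whole intermediate result in memory.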

A Practical Guide for Using Regex in PowerShell

Regular Expressions is often referred to as wizardry or magic and for that reason I stayed away from it for most of my career. I used it only when I had to and most of the time just reused examples that I found online. There's nothing wrong with that of course, but I never took the time to learn it. I thought it was reserved for the elite. Turns out that it's not that complicated and that I had been using it for years without knowing it.

In an effort to shorten the learning curve for others and to show you the value of learning regular expressions, I've written a blog post titled A Practical Guide for Using Regex in PowerShell. It walks you through how to use regular expressions in PowerShell and gives you a glimpse into how powerful they are.

Below is an example of how to use a regular expression to extract a user's name from their distinguished name in Active Directory. To learn more check out this blog post.

matches
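For readers who cannot see the image, here is a rough sketch of the same idea. The distinguished name is a made-up example, and the pattern is a simplification that assumes the common name contains no escaped commas.

```powershell
# Hypothetical distinguished name, not taken from a real directory
$dn = 'CN=Jane Doe,OU=Users,DC=example,DC=com'

# -match fills the automatic $Matches variable; (?<Name>...) is a named capture group
if ($dn -match '^CN=(?<Name>[^,]+),') {
    $userName = $Matches['Name']
}
$userName     # Jane Doe

# The same extraction with -replace, using $1 to refer to the first capture group
$viaReplace = $dn -replace '^CN=([^,]+),.*$', '$1'
$viaReplace   # Jane Doe
```

Note the single quotes around '$1': in double quotes PowerShell would try to expand it as a variable before the regex engine ever saw it.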

Topics Covered

  • -match operator
  • -match operator with regular expression metacharacters
  • -notmatch with where-object
  • -replace operator
  • -split operator
  • Select-String
  • Switch Statements
  • Regex Object

Ultimate PowerShell Prompt Customization and Git Setup Guide

Do you spend hours a day in PowerShell? Switching back and forth between PowerShell windows getting you down? Have you ever wanted "Quake" mode for your terminal?

If we are going to spend so much time in PowerShell, we may as well make it pretty.

Check out the Ultimate PowerShell Prompt Customization and Git Setup Guide for how to:

  • Install and customize ConEmu
  • Enable Quake Mode for your terminal
  • Setup your PowerShell Profile
  • Install and use Posh-Git
  • Generate and use SSH Keys with GitHub
  • Squash Git commits

Create Custom Monitors with PowerShell

Sometimes, as a developer, you want to be able to keep track of free space on a drive, the size of a log, the load on your CPU, the number of users logged in, etc. With PowerShell, it is typically just a matter of finding the right cmdlet amidst the large (and rapidly growing) pool of cmdlets provided by Microsoft and by third parties. Then you just run Get-Foo to check details about the foo resource. And then you come back 5 minutes later and run it again because you want to see how it changes over time.

But wouldn't it be nice if you could just have it run automatically at regular intervals in a separate window that you could just keep in the corner of your screen? Well, I found the barebones of just such a utility some time ago (authored by Marc van Orsouw, aka ‘thePowerShellGuy’). His original post is no longer available, but I expanded upon his code and, over time, added features, bug fixes, and enhancements, making it more useful and more user-friendly. Here are a few screenshots of the Monitor Factory in action.

Monitor the size of a database

Start-Monitor -AsJob {
    Invoke-Sqlcmd 'DBCC SQLPERF(logspace)' |
    Select-Object 'Database Name','Log Size (MB)','Log Space Used (%)',HasErrors
}

Database Size Monitor

Monitor drives on a system
Drive Capacity Monitor

Monitor longest running DB queries
Long-running DB Query Monitor

Build Your Own Resource Monitor in a Jiffy reveals how quick and easy it is to get started with the Monitor Factory.
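The core idea of re-running a command at regular intervals can be sketched with a plain loop. To be clear, this is not the Monitor Factory code: Invoke-Repeatedly is a hypothetical helper invented here, with none of the windowing or job support that Start-Monitor provides.

```powershell
function Invoke-Repeatedly {
    # Hypothetical helper: runs a scriptblock a fixed number of times, pausing in between
    param(
        [Parameter(Mandatory = $true)]
        [scriptblock]$ScriptBlock,
        [int]$IntervalSeconds = 5,
        [int]$Count = 3
    )
    for ($i = 1; $i -le $Count; $i++) {
        & $ScriptBlock
        # No need to wait after the final run
        if ($i -lt $Count) { Start-Sleep -Seconds $IntervalSeconds }
    }
}

# Usage: sample the number of running processes three times, one second apart
$samples = Invoke-Repeatedly -IntervalSeconds 1 -Count 3 -ScriptBlock {
    (Get-Process).Count
}
$samples
```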

A date with PowerShell

At the beginning of July, we welcomed our 3rd son into the world. As days passed my wife and I would say, "wow, he's 11 days old. Can you believe it?!". I'm sure parents out there are relating to this!
This gave me an idea for a fun script that would get your age in years, months and days, tell you how many days until your birthday and your star sign.

I wanted date of birth passed to the function as 'dd/MM/yy'. To keep to this format, I’m using the 'ValidatePattern' Advanced Parameter with a Regular Expression (Regex). The regular expression, "^(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/(\d{2})$", will only allow a date in the format of 01/01/16, for example.

Briefly, here is the regex syntax I used in the expression:

^ Start of string
( .. ) Capturing group
0[1-9] Matches a two-digit day from 01 to 09
| Acts like a Boolean OR
\d Matches any digit character
[12] Matches any one character in the set (here, 1 or 2)
/ Literal slash used to divide the date numbers
{2} Exactly two times
$ End of string
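If you want to convince yourself of what the pattern accepts, you can try it against a few sample strings with the -match operator. The sample values below are made up for this sketch.

```powershell
# Same pattern as the ValidatePattern attribute, in single quotes so nothing is expanded
$pattern = '^(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/(\d{2})$'

$samples = '01/01/16', '31/12/99', '1/1/16', '32/01/16', '01/13/16', '01/01/2016'

$results = foreach ($s in $samples) {
    [pscustomobject]@{ Input = $s; IsMatch = ($s -match $pattern) }
}
$results
```

Only the first two samples match. Keep in mind that the pattern checks the shape of the string, not the calendar, so something like 31/02/16 still gets through and is left for get-date to reject.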

Now that my function parameter variable $Bday has a date, it's passed to get-date to be converted from a string to a date. The date in variable $cDate will look like this, '01 January 2016 00:00:00'. The next line in the code will use today's date and subtract the date passed in the $cDate variable. The $diff variable will contain the following data which we will use to get our age in years, months and days:

Days : 212
Hours : 12
Minutes : 40
Seconds : 20
Milliseconds : 533
Ticks : 183624205335135
TotalDays : 212.528015434184
TotalHours : 5100.67237042042
TotalMinutes : 306040.342225225
TotalSeconds : 18362420.5335135
TotalMilliseconds : 18362420533.5135
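To see where those properties come from, here is a small sketch with two fixed dates instead of today's date, so the numbers are repeatable. The variable names are only for illustration.

```powershell
# Fixed dates in ISO format so the result does not depend on the day you run it
$birth = [datetime]'2016-01-01'
$asOf  = [datetime]'2016-08-01'

# Subtracting one DateTime from another yields a TimeSpan
$diff = $asOf.Subtract($birth)

$diff.GetType().Name   # TimeSpan
$diff.Days             # 213 (2016 is a leap year)
$diff.TotalHours       # 5112
```

Days is the whole-day component of the TimeSpan, while the Total* properties express the entire span in a single unit.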

I've contained this first part in our Begin block. The Process block does the main code.

Now I need to get my age in years, months and days. This is where the [math] class is used. I'm using its 'Truncate' method as I don't want to do anything fancy like round up my numbers. Taking the Days property of my $diff variable and dividing it by the $daysInYear variable gives my age in years.

The next two, months and days required a tweak to the algorithm.

I ended up using a maths term called a 'Mod'. Now I’m not talking about youth culture and style in the sixties (Mods and rockers anyone ??), but the Modulus Math Operator. Basically the Modulus Operator returns the remainder when the first number is divided by the second. So for example:
1 mod 3 = 1 (or 1 % 3 = 1)
2 mod 3 = 2
3 mod 3 = 0
4 mod 3 = 1

The operator sign used for Modulus is %, not to be confused with the % alias of ForEach-Object in PowerShell. For days in a month, I used an average of 30.
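Here is the arithmetic worked through for a made-up span of 4000 days, using the same 365-day year and 30-day month approximations.

```powershell
$days         = 4000   # hypothetical age in days
$daysInYear   = 365
$averageMonth = 30

# Whole years: 4000 / 365 = 10.96, truncated to 10
$years = [math]::Truncate($days / $daysInYear)

# Days left over after the whole years: 4000 % 365 = 350
# % and / have equal precedence and run left to right, so this is (4000 % 365) / 30 = 11.67, truncated to 11
$months = [math]::Truncate($days % $daysInYear / $averageMonth)

# Days left over after the whole months: 350 % 30 = 20
$remaining = $days % $daysInYear % $averageMonth

"{0} year(s), {1} month(s) and {2} day(s)" -f $years, $months, $remaining
```

Because the year and month lengths are approximations, the result can be off by a few days compared with a calendar-accurate calculation.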

I thought it would be fun to add the star sign as well. I was after something that could tell me, "is this date in this date range?". One of the properties of the object returned by 'get-date' is DayOfYear.
Finding if a number is in a range is pretty straightforward. For example:

 5 -in 1..10 

Which gives a Boolean result.

Now if I convert my date ranges into days of the year then I can match the day of the year I was born against the ranges of days for star signs. I've used a switch statement to check against multiple conditions. Within a scriptblock I’ve asked if the value I’m passing is 'in' the array of dates for each star sign. The match will return the star sign and is held in the $starSign variable.
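The same switch technique can be shown on a smaller problem. The sketch below maps a day of the year to a rough quarter; Get-Quarter and its day ranges are invented for this example and have nothing to do with the star sign dates.

```powershell
function Get-Quarter {
    # Hypothetical helper: each condition is a scriptblock, and $_ is the value being tested
    param([int]$DayOfYear)

    switch ($DayOfYear) {
        { $_ -in 1..90 }    { 'Q1'; break }
        { $_ -in 91..181 }  { 'Q2'; break }
        { $_ -in 182..273 } { 'Q3'; break }
        { $_ -in 274..366 } { 'Q4'; break }
        default             { 'unknown' }
    }
}

Get-Quarter -DayOfYear 45    # Q1
Get-Quarter -DayOfYear 200   # Q3
Get-Quarter -DayOfYear 0     # unknown
```

A PowerShell switch keeps testing the remaining conditions after a match unless you break out, so overlapping ranges would return more than one value. The break statements here guard against that.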

The final part of the process block works out how many days until your next birthday. It does this by capturing the current date, formatting the date of birth to remove the year born, adding the current year, and finally subtracting the current date from the amended date of birth. Phew!
This will leave a number of days until your next birthday. The 'if' statement is added if your birthday has already happened at the time of the code, it simply reverses the sum to give a positive number.
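If what you want is always the number of days remaining until the next birthday, one alternative is to roll the birthday forward a year once it has passed. This is only a sketch of that idea, not the logic used in the script; Get-DaysUntilBirthday and its parameters are hypothetical.

```powershell
function Get-DaysUntilBirthday {
    # Hypothetical helper: $Today is a parameter so the calculation can be checked with fixed dates
    param(
        [Parameter(Mandatory = $true)]
        [datetime]$BirthDate,
        [datetime]$Today = (Get-Date)
    )
    $todayDate = $Today.Date
    $years     = $todayDate.Year - $BirthDate.Year

    # AddYears copes with 29 February by falling back to 28 February in non-leap years
    $next = $BirthDate.Date.AddYears($years)

    # If this year's birthday has already gone, use next year's instead
    if ($next -lt $todayDate) { $next = $BirthDate.Date.AddYears($years + 1) }

    ($next - $todayDate).Days
}

Get-DaysUntilBirthday -BirthDate ([datetime]'1990-12-25') -Today ([datetime]'2016-12-20')   # 5
Get-DaysUntilBirthday -BirthDate ([datetime]'2016-07-05') -Today ([datetime]'2016-08-01')   # 338
```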

The end block displays the three captured results to the host.

I hope you have enjoyed this post and can see the many options possible for dates in PowerShell.
Feel free to download the script from my GitHub https://github.com/Gbeer7/Get-Age.git

function Get-age {
    param( 
        [Parameter(Mandatory=$true,
                   HelpMessage="Date must be written as dd/mm/yy",
                   Position=0)]
        [ValidatePattern("^(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/(\d{2})$")]
        [string]$Bday    
    )

Begin {
    # use 'get-date' to convert '$Bday' Variable
    $cDate = (get-date -Date $Bday)

    # from today's date subtract birth date
    $diff = (Get-Date).Subtract($cDate)
}

Process {

    # Work out Years, months and days
    [int]$daysInYear = '365'
    [int]$averageMonth = '30'
      
    # years
    $totalYears = [math]::Truncate( $($diff.Days) / $daysInYear ) 
    
    
    # months
    $totalMonths = [math]::Truncate( $($diff.Days) % $daysInYear / $averageMonth ) 
    
    # days
    $remainingDays = [math]::Truncate( $($diff.Days) % $daysInYear % $averageMonth ) 

    # Your star sign
    $thisYear = (get-date).Year
     
    $starSign = 
    switch ($cDate.DayOfYear) {
    
        { $_ -in @( ((get-date 22/12/$thisYear).DayOfYear)..365; 0..((get-date 19/01/$thisYear).DayOfYear) ) } { "Capricorn" }
        { $_ -in @( ((get-date 20/01/$thisYear).DayOfYear)..((get-date 18/02/$thisYear).DayOfYear) ) } { "Aquarius" }
        { $_ -in @( ((get-date 19/02/$thisYear).DayOfYear)..((get-date 20/03/$thisYear).DayOfYear) ) } { "Pisces" }
        { $_ -in @( ((get-date 21/03/$thisYear).DayOfYear)..((get-date 19/04/$thisYear).DayOfYear) ) } { "Aries" }
        { $_ -in @( ((get-date 20/04/$thisYear).DayOfYear)..((get-date 20/05/$thisYear).DayOfYear) ) } { "Taurus" }
        { $_ -in @( ((get-date 21/05/$thisYear).DayOfYear)..((get-date 20/06/$thisYear).DayOfYear) ) } { "Gemini" }
        { $_ -in @( ((get-date 21/06/$thisYear).DayOfYear)..((get-date 22/07/$thisYear).DayOfYear) ) } { "Cancer" }
        { $_ -in @( ((get-date 23/07/$thisYear).DayOfYear)..((get-date 22/08/$thisYear).DayOfYear) ) } { "Leo" }
        { $_ -in @( ((get-date 23/08/$thisYear).DayOfYear)..((get-date 22/09/$thisYear).DayOfYear) ) } { "Virgo" }
        { $_ -in @( ((get-date 23/09/$thisYear).DayOfYear)..((get-date 22/10/$thisYear).DayOfYear) ) } { "Libra" }
        { $_ -in @( ((get-date 23/10/$thisYear).DayOfYear)..((get-date 21/11/$thisYear).DayOfYear) ) } { "Scorpio" }
        { $_ -in @( ((get-date 22/11/$thisYear).DayOfYear)..((get-date 21/12/$thisYear).DayOfYear) ) } { "Sagittarius" }
    } 
    
    # Work out how many days until birthday    
    $now = [DateTime]::Now   
    $dm = get-date $Bday -UFormat "%m/%d/" 
    $Days = [Datetime]($dm + $now.Year) - $now

    # If birthday has happened this year change sum
    if (!($Days -ge 0)) { $Days = $now - [Datetime]($dm + $now.Year) }               
}
        
End {
    # display
    "`nYou are {0} year(s), {1} month(s) and {2} day(s)" -f $totalYears, $totalMonths, $remainingDays
    "Your Star sign is: " + $starSign
    
    # and...
    if ($cDate.Year -eq (get-date).Year) { 
        "You have another $($daysInYear - $diff.Days) days until your birthday" # If you are under 1 years old
    } else { 
        "You have another $($Days.days) days until your birthday" # over the age of 1 
    }
    
}

}# Function End

Every pithy witticism begins with quotation marks

"To be or not to be". Without getting into a debate over whether Shakespeare was musing about being a logician, suffice to say that in writing prose, the rules of when and how to use quotation marks are relatively clear. In PowerShell, not so much. Sure, there is an about_Quoting_Rules documentation page, and that is a good place to start, but that barely covers half the topic. It assumes you need quotes and then helps you appreciate some of the factors to consider when choosing single quotes or double quotes.

But do you need quotes? Remember PowerShell is a shell/command language so "obviously" you can do things like this:

PS> Remove-Item C:\tmp\foobar.txt
PS> Get-ChildItem *.log
PS> Get-Process svchost, conhost, powershell

It would certainly be cumbersome if you needed to quote each of those arguments, so PowerShell was designed well, in that respect.

But what if you ran the same commands just slightly differently?

PS> "C:\tmp\foobar.txt" | Remove-Item
PS> "*.log" | Get-ChildItem 

Here you must use quotation marks or you will suffer the wrath of a terminating error from the PowerShell host most certainly!
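You can reproduce the difference safely with Write-Output, which touches no files. This is a minimal sketch with made-up file names, not an excerpt from the article.

```powershell
# As a command argument, a bare word is treated as a string
$fromArgument = Write-Output sample.log

# At the start of a pipeline, the same text must be quoted to be treated as a string
$fromPipeline = 'sample.log' | Write-Output

# Without quotes, PowerShell tries to run the bare word as a command and fails
$bareWordFailed = $false
try {
    no-such-file-7f3a.log | Write-Output | Out-Null
}
catch {
    $bareWordFailed = $true
}

$fromArgument     # sample.log
$fromPipeline     # sample.log
$bareWordFailed   # True, assuming no command with that name exists
```

The rule of thumb is that text at the start of a statement is parsed as a command name, while text after a command name is parsed as an argument.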

Those are just a couple of the many examples I consider in When to Quote in PowerShell. Accompanying the full article, I also included a wallchart that condenses all the article's salient points into a single-page reference. Here's a fragment of the wallchart:

Guide to PowerShell Quoting wall chart

Read the article and download the wallchart here.
