Dealing with DBNull

Tagged: , ,

This topic contains 5 replies, has 2 voices, and was last updated by Profile photo of Dave Wyatt Dave Wyatt 3 years, 2 months ago.

  • Author
    Posts
  • #13652
    Profile photo of Warren Frame
    Warren Frame
    Participant

    Hi all,

    Have spent a (small) bit of time trying to work around this, no luck thus far. My goal is to provide a set of functions that are essentially wrappers for SQL queries. The issue I am stuck on is working with output that includes System.DBNull value types. Two of the biggest issues (both with comparisons) and their ugly workarounds:

    
        #[DBNull]::Value returns... something... so no quick filters for values that exist or don't exist.
        Get-XXXXServer | Where-Object{$_.VCServerContact} 
        Get-XXXXServer | Where-Object{$_.VCServerContact -and $_.VCServerContact -notlike ""}
        Get-XXXXServer | Where-Object{-not $_.VCServerContact}     
        Get-XXXXServer | Where-Object{$_.VCServerContact -notlike ""} #Not is a little easier, still not intuitive.
    
        #[DBNull]::Value isn't comparable... so no using -gt -lt unless you want a slew of errors
        Get-XXXXServer | Where-Object{$_.VCNumCPUs -gt 8}
        Get-XXXXServer | Where-Object{$_.VCNumCPUs -notlike "" -and $_.VCNumCPUs -gt 8}
    

    You get the picture. Can I work with it? Sure. That being said, now that I'm about to present this to my team of mostly less-experienced PowerShellers, I'm realizing it's a bit of a PITA, and explaining it to people who still have trouble with the concepts of objects will be interesting. I've just been dealing with it because I'm familiar the basics of PowerShell.

    Is there a typical solution for this? Just deal with it? Unless I'm doing it wrong, creating a psobject for each row is incredibly inefficient (i.e. 100 milliseconds for 2000 rows, increased to well over 30 seconds). I'm assuming I'm doing it wrong (Get properties of first row returned, for each consecutive row create an object, loop through and set properties)?

    On a side note, I'm using Invoke-SQLCMD2 from Chad.

    Your insight would be greatly appreciated!

    rcm

  • #13658
    Profile photo of Dave Wyatt
    Dave Wyatt
    Moderator

    On a side note, this might make a good suggestion on the Connect site (https://connect.microsoft.com/PowerShell/). This morning I had the idea of trying to get DBNull to automatically cast to $false by setting up a TypeConverter (see http://blogs.msdn.com/b/powershell/archive/2013/06/11/understanding-powershell-s-type-conversion-magic.aspx ), but it looks like PowerShell's "Language-based conversion" kicks in much earlier in the process when the target type is Boolean. That means that we can't change the behavior now, but Microsoft could do so in a future release of PowerShell, if it gets enough interest and they don't think it's a bad idea for some reason.

  • #13665
    Profile photo of Dave Wyatt
    Dave Wyatt
    Moderator

    Np, glad it helped! 🙂

  • #13654
    Profile photo of Dave Wyatt
    Dave Wyatt
    Moderator

    It's a little bit of a hack, but casting the value to [string] seems to work without adding much code. DBNull and $null both become the empty string, which evaluates to $false:

    Get-XXXXServer | Where-Object { [string]$_.VCServerContact }
    

    That's essentially the same thing that you're doing with the -notlike "" option; PowerShell's casting the value to a string for you before evaluating the notlike operator.

    As for outputting custom objects and simultaneously scrubbing out DBNull (with a pure PowerShell implementation), here's the fastest option I've found, based on some quick tests (snip of modified Invoke-SQLCMD2):

        switch ($As) 
        { 
            'DataSet'     { Write-Output ($ds) } 
            'DataTable'   { Write-Output ($ds.Tables) } 
            'DataRow'     { Write-Output ($ds.Tables[0]) }
            'PSObject' {
                $properties = New-Object System.Collections.Specialized.OrderedDictionary
    
                foreach ($row in $ds.Tables[0].Rows)
                {
                    $properties.Clear()
                       
                    foreach ($property in $row.PSAdapted.PSObject.Properties)
                    {
                        if ($property.Value -is [System.DBNull])
                        {
                            $value = $null
                        }
                        else
                        {
                            $value = $property.Value
                        }
    
                        $properties[$property.Name] = $value
                    }
    
                    [pscustomobject] $properties
                }
            }
            'SingleValue' { Write-Output ($ds.Tables[0] | Select-Object -Expand $ds.Tables[0].Columns[0].ColumnName ) }
        } 
    

    This seems to take about 10x as long as the 'DataRow' option. Note that the [pscustomobject] syntax requires at least PowerShell 3.0, but going with the 2.0-compatible syntax of New-Object psobject -Property $properties more than doubled the execution time in my tests. Likewise, using Write-Output also added a pretty significant amount of time as compared to just letting PowerShell do the pipeline output itself.

    There's quite a bit of overhead involved in simply accessing members of objects in PowerShell. Taking the logic from that block and moving it out into a C# function is much, much faster:

    $cSharp = @'
        using System;
        using System.Data;
        using System.Management.Automation;
    
        public class DBNullScrubber
        {
            public static PSObject DataRowToPSObject(DataRow row)
            {
                PSObject psObject = new PSObject();
    
                if (row != null && (row.RowState & DataRowState.Detached) != DataRowState.Detached)
                {
                    foreach (DataColumn column in row.Table.Columns)
                    {
                        Object value = null;
                        if (!row.IsNull(column))
                        {
                            value = row[column];
                        }
    
                        psObject.Properties.Add(new PSNoteProperty(column.ColumnName, value));
                    }
                }
    
                return psObject;
            }
        }
    '@
    
    Add-Type -TypeDefinition $cSharp -ReferencedAssemblies 'System.Data','System.Xml'
    
    # Snip from Invoke-SqlCmd2
    
        switch ($As) 
        { 
            'DataSet'     { Write-Output ($ds) } 
            'DataTable'   { Write-Output ($ds.Tables) } 
            'DataRow'     { Write-Output ($ds.Tables[0]) }
            'PSObject' {
                foreach ($row in $ds.Tables[0].Rows)
                {
                    [DBNullScrubber]::DataRowToPSObject($row)
                }
            }
            'SingleValue' { Write-Output ($ds.Tables[0] | Select-Object -Expand $ds.Tables[0].Columns[0].ColumnName ) }
        } 
    
  • #13659
    Profile photo of Warren Frame
    Warren Frame
    Participant

    Dave – great suggestions! Thanks for the input, I think for our purposes, the performance hit from the C# method will be worth the added convenience.

    With all the loose 'let me make this behave the way a non-developer might expect' goodness in PowerShell, it would certainly be nice to have PowerShell take care of this for us! I'll submit a suggestion on Connect.

    *edit: submitted – if anyone agrees, please vote!

    I owe you a drink at the summit : ) Cheers!

  • #13664
    Profile photo of Warren Frame
    Warren Frame
    Participant

    In case anyone else comes across this thread, I've incorporated Dave's code (and a few features, details at the link) in the Invoke-Sqlcmd2.ps1 at here. Contributors / suggestions / feedback welcome!

    Thanks again Dave!

You must be logged in to reply to this topic.