Create lagged field in

This topic contains 3 replies, has 2 voices, and was last updated by  Don Jones 3 weeks, 2 days ago.

  • Author
    Posts
  • #75626

    Patrick77
    Participant

    Hello,
    I am new to PowerShell and am playing with basic data operations and have a question about creating a lagged variable within a datatable. I know how to do this in R, or using SQL ... but what would be a good native PowerShell approach when dealing with datatables?

    Below is a small example data set and the desired output below that. Note: it is lagged by group (ID).

    Please let me know if you have any ideas. Thank you!

    $table = New-Object system.Data.DataTable
    $col1 = New-Object system.Data.DataColumn ID,([string])
    $col2 = New-Object system.Data.DataColumn EFFDT,([datetime])
    $col3 = New-Object system.Data.DataColumn Var,([string])
    $table.columns.add($col1)
    $table.columns.add($col2)
    $table.columns.add($col3)
    $row = $table.NewRow()
    $table.Rows.Add("A", "1/1/2017", "1")
    $table.Rows.Add("A", "1/2/2017", "1")
    $table.Rows.Add("A", "1/3/2017", "2")
    $table.Rows.Add("B", "1/1/2017", "2")
    $table.Rows.Add("B", "1/2/2017", "3")
    $table.Rows.Add("B", "1/3/2017", "3")
    $table.Rows.Add("B", "1/5/2017", "4")
    $table | format-table -AutoSize

    # desired output
    # ID EFFDT Var Lag1Var
    # — —– — ——-
    # A 1/1/2017 12:00:00 AM 1 1
    # A 1/2/2017 12:00:00 AM 1 2
    # A 1/3/2017 12:00:00 AM 2 –
    # B 1/1/2017 12:00:00 AM 2 3
    # B 1/2/2017 12:00:00 AM 3 3
    # B 1/3/2017 12:00:00 AM 3 4
    # B 1/5/2017 12:00:00 AM 4 –

  • #75628

    Don Jones
    Keymaster

    I'm not a data expert – can you tell me what a "lagged" field is?

    • #75631

      Patrick77
      Participant

      Sure. I essentially mean to grab the value from the previous row in a time-series table.

      In the example below, there are IDs A and B. Each row has a date, followed by a field that is an integer in this example. The lag is simply the value from the previous row for each ID and is shown in bold in the last column below.

      ID, Date, Var, LaggedVar
      A, 1/1/2017, 1, 1
      A, 1/2/2017, 1, 2
      A, 1/3/2017, 2, –
      B, 1/1/2017, 2, 3
      B, 1/2/2017, 3, 3
      B, 1/3/2017, 3, 4
      B, 1/5/2017, 4, –

  • #75634

    Don Jones
    Keymaster

    So, PowerShell as a data manipulation language isn't so much a thing. Data tables in PowerShell, for example, aren't "in" PowerShell as much as they're "in" .NET Framework; PowerShell doesn't have anything native that does this. So you're obviously in .NET, but if it has some specific feature for that task, I'm afraid I don't know it (StackOverflow might be a good spot to ask). If I had to do what you're doing, I'd just save the relevant field in a variable each time through the loop, and then use that variable in setting the next row. Variables in PowerShell aren't scoped to loops, so they'll persist through each iteration.

You must be logged in to reply to this topic.