Is there a way to bulkcopy without sorting the data?

Welcome Forums General PowerShell Q&A Is there a way to bulkcopy without sorting the data?

Viewing 10 reply threads
  • Author
    Posts
    • #227845
      Participant
      Topics: 41
      Replies: 113
      Points: 647
      Rank: Major Contributor

      Given the following code, how can i achieve something like this:

      pseudocode:

      [/crayon]

      current code:

      [/crayon]

      Why do i want to do this?

      we are bulk copying data from a csv file to sql table using code below. it works great, but one suggestion we received is to retain the order of the source data, i.e. do not sort it, because they connect eventually to a database in excel that has this data as is from the sql table, and its sorted currently which they wish its not. So the question is: how can we bulkcopy the data into the sql table in the order it is in the csv file, i.e. without sorting?

      csv file (unsorted):

      [/crayon]

      sql table after bulkcopy (sorted):

      sql table

      of course, we have an option to add a column to the csv files that lists numbers sequentially, which would retain the order of the rest of the columns, for example:

      sequence Report Year_sfx
      1 2013
      2 2014
      3 2015
      4 2016
      5 2017
      6 2018
      7 2021
      8 2022
      9 2023
      10 2024
      11 2025
      12 2026
      13 2027
      14 2028
      15 2029
      16 2019 0+12
      17 2019 12+0
      18 2019 8+4
      19 2020 0+12
      20 2020 1+11
      21 BP19
      22 BP20

      but i prefer not having to do this because i have many csv files i’d have to do this for…

    • #227884
      Participant
      Topics: 8
      Replies: 562
      Points: 2,149
      Helping Hand
      Rank: Community Hero

      Good question. It looks like if you’re using bulk copy, there is no guarantee. This answer suggests you add an order column, which you might be able to programmatically while importing the CSV. Hopefully someone else has already figured this out and will chime in. I’ll keep thinking about it in the meantime.

    • #227893
      Participant
      Topics: 13
      Replies: 1758
      Points: 3,153
      Helping Hand
      Rank: Community Hero

      The SQL engine is controlling the bulk data import and doing what is the most efficient way to import the data. Not sure why it’s automatically sorting dates, but the goal is to get data into the database fast, so adding sorting is just extra time. Here is a similar question:

      https://www.codeproject.com/Questions/841643/Bulk-Copy-Sequential-insert-into-sql

      If you’re trying to retain a non-sortable order, then you should add a sortable column for the application data layer to sort it for presentation, such as the sequence number.

    • #227896
      Participant
      Topics: 41
      Replies: 113
      Points: 647
      Rank: Major Contributor

      Good question. It looks like if you’re using bulk copy, there is no guarantee. This answer suggests you add an order column, which you might be able to programmatically while importing the CSV. Hopefully someone else has already figured this out and will chime in. I’ll keep thinking about it in the meantime.

      thanks Doug, i checked out the link, the accepted answer is exactly what the other option i mentioned in my post, so thats something i could consider if theres no way to retain the same order while bulkcopying. I also tagged you in a previous thread you helped me out with in creating that export from excel to csv script. I figured since the code was perfected there, might as well tag you over there about ti instead of opening a new thread completely 🙂

      the second answer in the link suggests creating a staging table. but in my case, wouldnt the datatable technically be this “temporary” staging like table? i tested out the import-csv, it retains the order of the csv file, then i tested out the out-datatable after the import from csv, and it also retains the csv file order. so it looks like it gets sorted at the sqlbulkcopy part…really hoping its as simple as the pseudocode i mentioned, or at least close, because that would make things sooo much easier!

    • #227902
      Participant
      Topics: 41
      Replies: 113
      Points: 647
      Rank: Major Contributor

      The SQL engine is controlling the bulk data import and doing what is the most efficient way to import the data. Not sure why it’s automatically sorting dates, but the goal is to get data into the database fast, so adding sorting is just extra time. Here is a similar question:

      https://www.codeproject.com/Questions/841643/Bulk-Copy-Sequential-insert-into-sql

      If you’re trying to retain a non-sortable order, then you should add a sortable column for the application data layer to sort it for presentation, such as the sequence number.

      yep, thats what we’re trying to do, retain a non-sortable order, essentially keeping the same order of the csv file data.

      the sequence number is definitely an option, but if its avoidable in a much easier way such as piping a non-sort functionality to the sqlbulkcopy, then that would make things much simpler 🙂

    • #227923
      Participant
      Topics: 8
      Replies: 562
      Points: 2,149
      Helping Hand
      Rank: Community Hero

      All I was trying to say is you don’t have to do your order column manually.
      Given this CSV file

      You can add orderID programmatically.

      Output:

      And if you would rather the orderID column be on the end, just switch the position of the properties.

    • #227929
      Participant
      Topics: 41
      Replies: 113
      Points: 647
      Rank: Major Contributor

      All I was trying to say is you don’t have to do your order column manually.

      Given this CSV file

      PowerShell
      8 lines

      <textarea class=”ace_text-input” style=”opacity: 0; height: 17.9048px; width: 7.20119px; left: 45px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      3
      4
      5
      6
      7
      8
      columna,columnb,columnc
      a,b,c
      a,b,c
      a,b,c
      a,b,c
      a,b,c
      a,b,c
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      You can add orderID programmatically.

      PowerShell
      7 lines

      <textarea class=”ace_text-input” style=”opacity: 0; height: 17.9048px; width: 7.20119px; left: 45px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      3
      4
      5
      6
      7
      $count = 0
      import-csv C:\temp\original.csv | foreach{
      $_ | Select-Object @{N=‘OrderID’;E={$count}},*
      $count++
      } | Export-Csv c:\temp\exportwithorderid.csv NoTypeInformation
      import-csv C:\temp\exportwithorderid.csv
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      Output:

      PowerShell
      9 lines

      <textarea class=”ace_text-input” style=”opacity: 0; height: 17.9048px; width: 7.20119px; left: 45px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      3
      4
      5
      6
      7
      8
      9
      OrderID columna columnb columnc
      ——- ——- ——- ——-
      0 a b c
      1 a b c
      2 a b c
      3 a b c
      4 a b c
      5 a b c
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      And if you would rather the orderID column be on the end, just switch the position of the properties.

      <textarea class=”ace_text-input” style=”opacity: 0; height: 17.9048px; width: 7.20119px; left: 45px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      Select-Object *,@{N=‘OrderID’;E={$count}}
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      Ohhh that makes sense! Maybe I can add it to the import itself basically, not the csv file itself, and just pipe it directly to the out-datatable

      That’s actually another good idea, I like it! Will try it out and let you know!

      • This reply was modified 4 months, 1 week ago by cataster16.
    • #227941
      Participant
      Topics: 13
      Replies: 1758
      Points: 3,153
      Helping Hand
      Rank: Community Hero

      Another option:

      Output:

    • #228202
      Participant
      Topics: 41
      Replies: 113
      Points: 647
      Rank: Major Contributor

      Another option:

      PowerShell
      4 lines

      <textarea class=”ace_text-input” style=”opacity: 0; height: 18px; width: 6.59781px; left: 44px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      3
      4
      $services = Get-Service | Select First 5 Property Name, Status
      $services | Select @{Name=‘Index’;Expression={[array]::IndexOf($services, $_)}}, Name, Status
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      Output:

      PowerShell
      8 lines

      <textarea class=”ace_text-input” style=”opacity: 0; height: 18px; width: 6.59781px; left: 44px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      3
      4
      5
      6
      7
      8
      Index Name Status
      —– —- ——
      0 AarSvc_d253b Stopped
      1 AESMService Running
      2 AJRouter Stopped
      3 ALG Stopped
      4 AppIDSvc Stopped
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      All I was trying to say is you don’t have to do your order column manually.

      Given this CSV file

      PowerShell
      8 lines

      <textarea class=”ace_text-input” style=”opacity: 0; height: 18px; width: 6.59781px; left: 44px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      3
      4
      5
      6
      7
      8
      columna,columnb,columnc
      a,b,c
      a,b,c
      a,b,c
      a,b,c
      a,b,c
      a,b,c
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      You can add orderID programmatically.

      PowerShell
      7 lines

      <textarea class=”ace_text-input” style=”opacity: 0; height: 18px; width: 6.59781px; left: 44px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      3
      4
      5
      6
      7
      $count = 0
      import-csv C:\temp\original.csv | foreach{
      $_ | Select-Object @{N=‘OrderID’;E={$count}},*
      $count++
      } | Export-Csv c:\temp\exportwithorderid.csv NoTypeInformation
      import-csv C:\temp\exportwithorderid.csv
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      Output:

      PowerShell
      9 lines

      <textarea class=”ace_text-input” style=”opacity: 0; height: 18px; width: 6.59781px; left: 44px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      3
      4
      5
      6
      7
      8
      9
      OrderID columna columnb columnc
      ——- ——- ——- ——-
      0 a b c
      1 a b c
      2 a b c
      3 a b c
      4 a b c
      5 a b c
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      And if you would rather the orderID column be on the end, just switch the position of the properties.

      <textarea class=”ace_text-input” style=”opacity: 0; height: 18px; width: 6.59781px; left: 44px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      Select-Object *,@{N=‘OrderID’;E={$count}}
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      ok so it turns out that because the table has a clustered index (PK) it wont work even with a sequence column, because the ordering is determined based on the Primary Key (the Report Year)

      hi

      • This reply was modified 4 months, 1 week ago by cataster16.
    • #228214
      Participant
      Topics: 13
      Replies: 1758
      Points: 3,153
      Helping Hand
      Rank: Community Hero

      Yes, but how is the data being consumed? The sequence allows you to do:

      Also, you should not have spaces in your headers, just asking for pain. Storing the data in the database is not the same as consuming the data in a report or application. It would be like this:

      This is getting into Relational Data structure best practices, but variables and table headers should not have spaces.

    • #228253
      Participant
      Topics: 41
      Replies: 113
      Points: 647
      Rank: Major Contributor

      Yes, but how is the data being consumed? The sequence allows you to do:

      PowerShell
      4 lines

      <textarea class=”ace_text-input” style=”opacity: 0; height: 18px; width: 6.59781px; left: 44px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      3
      4
      SELECT Sequence, ‘Report Year’
      FROM myTable
      ORDER BY Sequence
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      Also, you should not have spaces in your headers, just asking for pain. Storing the data in the database is not the same as consuming the data in a report or application. It would be like this:

      PowerShell
      5 lines

      <textarea class=”ace_text-input” style=”opacity: 0; height: 18px; width: 6.59781px; left: 44px; top: 0px;” spellcheck=”false” wrap=”off”></textarea>

      1
      2
      3
      4
      5
      SELECT SEQ AS ‘Sequence’,
      REPORT_YEAR AS ‘Report Year’
      FROM myTable
      ORDER BY Sequence
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

      This is getting into Relational Data structure best practices, but variables and table headers should not have spaces.

      unfortunately, the data retrieval process is outta my scope because the tables are just the datasource that feeds data into a tabular cube. the data is fed through a process known as “Process Database“, which i can’t really specify a query in since its a backend microsoft process.

      the users end up connecting to the cube though in excel…maybe they can specify the query there somehow? MDX/DAX perhaps?

      as for the naming convention and spacing, ya i agree, i actually built our POC version by modifying every single column name to include underscores to eliminate the undesirable characters, but because we keep getting a new file everytime and they wanted us to recreate the cube with more dimensions, i just decided to roll it out however it is in the file. theres like 80 columns and i dont have time to redo this multiple times with different column names. so meh, it is what it is lol

Viewing 10 reply threads
  • The topic ‘Is there a way to bulkcopy without sorting the data?’ is closed to new replies.