export data from csv to sql database

This topic contains 14 replies, has 4 voices, and was last updated by Profile photo of michael glenn michael glenn 1 year, 10 months ago.

  • Author
    Posts
  • #22013
    Profile photo of michael glenn
    michael glenn
    Participant

    im trying to take info from a csv (9 column headers) and export it to an sql database using powershell. been looking at tons of examples from the internet and im confused.
    the brunt of the data will be a one time thing.. then, it will be a daily thing where new student names will be added as needed. hopefully doing a check to see if student id is in the database.. if not.. add this info to the database.
    suggestions would be appreciated.

  • #22015
    Profile photo of Dan Potter
    Dan Potter
    Participant

    do you have a table designed yet?

  • #22017
    Profile photo of michael glenn
    michael glenn
    Participant

    I have been looking at suggestions for that and I did design a table but was unable to get that to work correctly saw I deleted that table ( but could easily create one again).. then saw scripts that created the table prior to inputting the data. so I have been trying both ways.

  • #22018
    Profile photo of michael glenn
    michael glenn
    Participant
    # Database variables 
    $sqlserver = "pc1\sqlexpress" 
    $database = "students" 
    $table = "info" 
      
    # CSV variables 
    $csvfile = "C:\pathtofile\hs-stu-pass.csv" 
    $csvdelimiter = "," 
    $firstRowColumns = $false 
      
    ################### No need to modify anything below ################### 
    Write-Host "Script started..." 
    $elapsed = [System.Diagnostics.Stopwatch]::StartNew()  
    [void][Reflection.Assembly]::LoadWithPartialName("System.Data") 
    [void][Reflection.Assembly]::LoadWithPartialName("System.Data.SqlClient") 
      
    # 50k worked fastest and kept memory usage to a minimum 
    $batchsize = 50000 
      
    # Build the sqlbulkcopy connection, and set the timeout to infinite 
    $connectionstring = "Data Source=$sqlserver;Integrated Security=true;Initial Catalog=$database;" 
    $bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connectionstring, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock) 
    $bulkcopy.DestinationTableName = $table 
    $bulkcopy.bulkcopyTimeout = 0 
    $bulkcopy.batchsize = $batchsize 
      
    # Create the datatable, and autogenerate the columns. 
    $datatable = New-Object System.Data.DataTable 
      
    # Open the text file from disk 
    $reader = New-Object System.IO.StreamReader($csvfile) 
    $columns = (Get-Content $csvfile -First 1).Split($csvdelimiter) 
    if ($firstRowColumns -eq $true) { $null = $reader.readLine() } 
      
    foreach ($column in $columns) {  
        $null = $datatable.Columns.Add() 
    } 
      
    # Read in the data, line by line 
    while (($line = $reader.ReadLine()) -ne $null)  { 
        
        $null = $datatable.Rows.Add($line.Split($csvdelimiter)) 
     
        $i++; if (($i % $batchsize) -eq 0) {  
            $bulkcopy.WriteToServer($datatable)  
            Write-Host "$i rows have been inserted in $($elapsed.Elapsed.ToString())." 
            $datatable.Clear()  
        }  
    }  
      
    # Add in all the remaining rows since the last clear 
    if($datatable.Rows.Count -gt 0) { 
        $bulkcopy.WriteToServer($datatable) 
        $datatable.Clear() 
    } 
      
    # Clean Up 
    $reader.Close(); $reader.Dispose() 
    $bulkcopy.Close(); $bulkcopy.Dispose() 
    $datatable.Dispose() 
      
    Write-Host "Script complete. $i rows have been inserted into the database." 
    Write-Host "Total Elapsed Time: $($elapsed.Elapsed.ToString())" 
    # Sometimes the Garbage Collector takes too long to clear the huge datatable. 
    [System.GC]::Collect()
    

    the furthest I got was with this script.. but confused with table existing before running the script or letting the script build the tables.

  • #22022
    Profile photo of Don Jones
    Don Jones
    Keymaster

    You can go either way. T-SQL has an "IF EXISTS" statement you can run, so that a CREATE TABLE statement would only run if the table didn't exist. The difficulty here is really more in using T-SQL than PowerShell per se, I think.

    Take a look at our "Building Trend and Historical Reports" ebook – the accompanying module (in our GitHub repo) has examples that might prove useful.

  • #22033
    Profile photo of michael glenn
    michael glenn
    Participant

    ok.. thanks don.. i'll look into that. thanks for the direction.. its appreciated.

  • #22035
    Profile photo of michael glenn
    michael glenn
    Participant

    im having a hard time finding the examples.... is there a link?

  • #22039
    Profile photo of Dan Potter
    Dan Potter
    Participant

    Here's an example of inserting a row into a table.

    $dbconn = New-Object System.Data.SqlClient.SqlConnection
    $dbconn.connectionstring = "Server=$sqlserver; Database='Reporting'; Integrated Security=TRUE"
    $dbconn.Open()
    $dbwrite = $dbconn.CreateCommand()
    $dbwrite.CommandText = "INSERT INTO $table (UPN,AzureRMS,Office_PP,Lync,Office_Online,Sharepoint,Exchange,Visio,Project,ExchangeArchive,Department,LastModification,Added_by)`
    VALUES ('$col1','$col2','$col3','$col4','$col5','$col6','$col7','$col8','$col9','$col10','$col11','$col12','$col13')"
    $dbwrite.ExecuteNonQuery()
    $dbconn.Close()

  • #22043
    Profile photo of michael glenn
    michael glenn
    Participant

    ok...so UPN,AzureRMS,etc would be column names. and VALUES ('$col1','$col2', .... the $col1 would be the values to be inserted into those columns?

  • #22063
    Profile photo of michael glenn
    michael glenn
    Participant
    $logQuery = new-object -ComObject "MSUtil.LogQuery"
    
    $inputFormat = new-object -comobject "MSUtil.LogQuery.CSVInputFormat"
    
    $outputFormat = new-object -comobject "MSUtil.LogQuery.SQLOutputFormat"
    
    $outputFormat.server = "HP600-WIN8-MG\sqlexpress"
    
    $outputFormat.database = "studentinfo"
    
    $outputFormat.driver = "SQL Server"
    
    $outputFormat.createTable = $true
    
    $query = "SELECT stuid, lastname, firstname, gradelevel, status, homeroom, hrteacher, newpassword INTO stuinfo FROM c:\sqlproject\HS-Stu-pass.csv"
    
    $null = $logQuery.ExecuteBatch($query,$inputFormat,$outputFormat)
    
    

    ok..everyones suggestions and thoughts put me in this direction.....much appreciated. this is what I did to get the csv info into the sql database ( with no tables in it) along with "logparser"
    worked great.. created the table along with the columns, and 1400 lines exported to sql in 4 seconds.

    going forward im going to download a new csv on a daily basis that will contain new students along with existing students ( as far as the sql database is concerned)
    I will need to export new students into the database, and possibly change the data in the gradelevel, status, homeroom, hrteacher columns of the existing students in the sql database.
    my question is what would be the best way to approach this?

  • #22065
    Profile photo of Tim Pringle
    Tim Pringle
    Participant

    Personally, I prefer Dan's approach. Using this allows you to dynamically generate the values from the variables. Also, it then means that all you really need to worry about is getting the right SQL code.

    Why not forget about the PowerShell code for a bit, and try to build up the different type of TSQL that you'd need for these actions? You can use SQL Server Management Studio for this and validate your results straight away. Then once you've got the right TSQL, all you need to do is modify your code accordingly with the right queries.

  • #22068
    Profile photo of michael glenn
    michael glenn
    Participant

    thanks for the response Tim.

    my sql/tsql skills are in the beginning stages. so using sql server management studio for this is difficult at best... I just don't have enough grasp of it yet.. but im reading and learning in between posts here. combining this with powershell is really challenging but very enjoyable.
    is my assumption of dan's coding correct?
    UPN, AzureRMS, etc would be column names. and then VALUES ('$col1','$col2', …. the $col1 would be the values to be inserted into those columns?

    if so.. how would I assign values to those variables? im guessing a foreach block taking the info from a csv.. is this correct?

    thanks again for the response.. I appreciate the time you and everyone has taken to give pointers.

  • #22082
    Profile photo of Dan Potter
    Dan Potter
    Participant

    right

    $users = import-csv userlist.csv

    $dbconn = New-Object System.Data.SqlClient.SqlConnection
    $dbconn.connectionstring = "Server=$sqlserver; Database='Reporting'; Integrated Security=TRUE"
    $dbconn.Open()

    Foreach($i in $users){

    $col1 = $i.upn
    $col2 = $i.azurerms
    and so on.

    $dbwrite = $dbconn.CreateCommand()
    $dbwrite.CommandText = "INSERT INTO $table (UPN,AzureRMS,Office_PP,Lync,Office_Online,Sharepoint,Exchange,Visio,Project,ExchangeArchive,Department,LastModification,Added_by)`
    VALUES ('$col1','$col2','$col3','$col4','$col5','$col6','$col7','$col8','$col9','$col10','$col11','$col12','$col13')"
    $dbwrite.ExecuteNonQuery()

    }

    $dbconn.Close()

    Or something like it.

  • #22087
    Profile photo of michael glenn
    michael glenn
    Participant

    thanks dan for the response. its appreciated.

  • #22281
    Profile photo of michael glenn
    michael glenn
    Participant

    just wanted to drop a thank you to everyone that responded with pointers. my head has been buried in powershell / sql and I have accomplished so much. but of course there is more to learn and conquer.
    thanks again Dan, Tim, and Don for all suggestions.

You must be logged in to reply to this topic.