csv to sql database insert\update

This topic contains 23 replies, has 4 voices, and was last updated by Profile photo of Rob Simmers Rob Simmers 1 year, 12 months ago.

  • Author
    Posts
  • #25674
    Profile photo of michael glenn
    michael glenn
    Participant

    I had posted a questing back in jan. about updating a sql database from a csv file, and received some great suggestions. I tried adding to that question but it wouldn't let me( too old perhaps?).
    I was able to add rows to the database with the code I created, but it appeared to add duplicates. so I have tried the following code and I am getting errors.. such as "incorrect syntax near the key word 'from' " and "Exception calling "ExecuteNonQuery" with "0" argument(s): "Conversion failed when converting the varchar value 'luis' to data type int."

    $userlist = Import-CSV C:\scripts\HS-Student-with-pass.csv
    $database = 'HSStudents'
    $server = 'HP600-WIN8-MG\SQLEXPRESS'
    $table = 'dbo.StudentInfo'
    $SQLServer = "HP600-WIN8-MG\sqlexpress"
    $SQLDBName = "HSStudents"
    $SqlConnection = New-Object System.Data.SqlClient.SqlConnection
    $SqlConnection.ConnectionString = "Server = $SQLServer; Database = $SQLDBName; Integrated Security = True"
    $SqlConnection.Open()
    
    Foreach($user in $userlist){
    
        $col1 = $user.stuid 
        $col2 = $user.lastname
        $col3 = $user.firstname
        $col4 = $user.gradelevel
        $col5 = $user.status
        $col6 = $user.homeroom
        $col7 = $user.hrteacher
        $col8 = $user.newpassword
    
    
        $dbwrite = $SqlConnection.CreateCommand()
        $dbwrite.CommandText =  "IF NOT EXISTS (select from $table where stuid = $col1 )
                                INSERT INTO $table (stuid, lastname,firstname,gradelevel,status,homeroom,hrteacher,newpassword)
                                VALUES ('$col1','$col2','$col3','$col4','$col5','$col6','$col7','$col8')"
                                
        
        $dbwrite.ExecuteNonQuery() # | out-null
    
    }
    
    $SqlConnection.Close()
    

    Im trying to add an "if not exists... then insert new data row". the "stuid" column is the unique column.
    any suggestions appreciated as to what im doing wrong.... ive been googling for days and this is the best point ive gotten too as the best way to update an existing sql database with info that has been received via a csv file.

  • #25675
    Profile photo of Don Jones
    Don Jones
    Keymaster

    That isn't really how the EXISTS keyword works. Have you tried testing that query with static values, in a SQL Server Management Studio window?

    I generally break these into two queries – check to see if the value exists, and then either update or insert based on that.

  • #25677
    Profile photo of michael glenn
    michael glenn
    Participant

    no.. I haven't tried that.... will try it now. im in agreement on the 2 queries as an end result... I will need to see if the "stuid" exists in the database already.. if not insert the new stuid row... and if it does exists.. update the status and/or grade level fields if required.

  • #25678
    Profile photo of Rob Simmers
    Rob Simmers
    Participant

    I think your SQL would look more like this for what you are trying to do. I like here strings to build query because you can ident the query, but keep in mind you cannot indent the here string itself. So, in the example, $sqlCmd= @" and "@ cannot be indented or have any whitespace after them:

    $sqlCmd = @"
    select @stuID = stuid from $table where stuid = $col1
    
    IF @stuID IS NULL
    	BEGIN
    		INSERT INTO [dbo].[$table] ...
    "@  
    
    $dbwrite.CommandText = $sqlCmd
     
    
  • #25679
    Profile photo of michael glenn
    michael glenn
    Participant

    for some reason... I am now getting back an error that I would expect..."violation of primary key constraint....." where as before I wasn't getting that. both from the management studio window when running the query with static values, and from the script im trying to create to update/insert from the csv file. curiously I have 2 rows in the csv that aren't in the database and it doesn't look like its adding them.

    Rob.... would here strings be beneficial if my direction is to have a situation such as "I will need to see if the "stuid" exists in the database already.. if not insert the new stuid row… and if it does exists.. update the status and/or grade level fields if required." ?
    as opposed to the direction im am currently in? ( in essence I think its a temporary table.. im taking a csv... putting it in to a temp table, then acting on that data. am I correct in this?)

  • #25680
    Profile photo of Rob Simmers
    Rob Simmers
    Participant

    Here strings are just to help keep formatting of SQL to make it more readable. To do an Insert\Update, you could do something like this.

    # Create DB connection
    Get-Content C:\MyCSV.csv | foreach {
    
    $sqlCMD = @"
    Declare @STUDENT_ID nvarchar(50) = '$($_.stuID)'
    Declare @STUDENT_NAME nvarchar(250) = '$($_.Name)'
    Declare @GRADELEVEL smallint = '$($_.GradeLevel)'
    
    
    IF EXISTS (select stuid from $table where stuid = @STUDENT_ID )
    	BEGIN
            UPDATE [dbo].[$table]
               SET [stuid] = @STUDENT_ID
                  ,[sName] = @STUDENT_NAME
                  ,[sGrade] = @GRADE
             WHERE stuid = @STUDENT_ID
    	END
    ELSE
    	BEGIN
            INSERT INTO [dbo].[$table]
                ([stuid]
                ,[sName]
                ,[sGrade])
            VALUES
                (@STUDENT_ID
                ,@STUDENT_NAME
                ,@GRADE)
    	END
    "@
    
        $dbwrite = $SqlConnection.CreateCommand()
        $dbwrite.CommandText =  $sqlCMD
        $dbwrite.ExecuteNonQuery() # | out-null
    }
    #Close DB connection
    
  • #25706
    Profile photo of michael glenn
    michael glenn
    Participant

    ok.. been working on this all day yesterday.. and this morning. here is the code I got so far.

    $table = 'StudentInfo'
    $SQLServer = "HP600-WIN8-MG\sqlexpress"
    $SQLDBName = "HSStudents"
    
    $SqlConnection = New-Object System.Data.SqlClient.SqlConnection
    $SqlConnection.ConnectionString = "Server = $SQLServer; Database = $SQLDBName; Integrated Security = True"
    $SqlConnection.Open()
    
    
    # Create DB connection
    Get-Content C:\scripts\HS-Student-with-pass.csv | foreach {
    
    
    
    $sqlCMD = @"
    Declare @STUDENT_ID int = '$($_.stuid)'
    Declare @STUDENT_LASTNAME nvarchar(50) = '$($_.lastname)'
    Declare @STUDENT_FIRSTNAME nvarchar(50) = '$($_.firstname)'
    Declare @GRADELEVEL int = '$($_.gradelevel)'
    Declare @STATUS nvarchar(50) = '$($_.status)'
    Declare @HOMEROOM nvarchar(50) = '$($_.homeroom)'
    Declare @HR_TEACHER nvarchar(50) = '$($_.hrteacher)'
    Declare @NEW_PASSWORD int = '$($_.newpassword)' 
    
    IF EXISTS (select stuid from $table where stuid = @STUDENT_ID )
    	BEGIN
        UPDATE [dbo].[$table]
           SET [stuid] = @STUDENT_ID
              ,[lastname] = @STUDENT_LASTNAME
              ,[firstname] = @STUDENT_FIRSTNAME
              ,[gradelevel] = @GRADELEVEL
              ,[status] = @STATUS
              ,[homeroom] = @HOMEROOM
              ,]hrteacher] = @HR_TEACHER
              ,[newpassword] = @NEW_PASSWORD
         WHERE stuid = @STUDENT_ID
    	END
    ELSE
    	BEGIN
            INSERT INTO [dbo].[$table]
                ([stuid]
                ,[lastname]
                ,[firstname]
                ,[gradelevel
                ,[status]
                ,[homeroom]
                ,[hrteacher]
                ,[newpassword])
            VALUES
                (@STUDENT_ID
                ,@STUDENT_LASTNAME
                ,@STUDENT_FIRSTNAME
                ,@GRADELEVEL
                ,@STATUS
                ,@HOMEROOM
                ,@HR_TEACHER
                ,@NEW_PASSWORD)
    	    END
    "@
        
        $dbwrite = $SqlConnection.CreateCommand()
        $dbwrite.CommandText = $sqlCMD
        $dbwrite.ExecuteNonQuery()
    }
    $SqlConnection.close()
    

    im getting;
    Exception calling "ExecuteNonQuery" with "0" argument(s): "Incorrect syntax near ']'."
    At C:\scripts\sql project\update-insert-csv-to-db-beta-ver1.ps1:70 char:5
    + $dbwrite.ExecuteNonQuery()
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : SqlException

    ive checked for spaces and such. am I building the connection incorrectly? what am I missing?

  • #25708
    Profile photo of Simon Wåhlin
    Simon Wåhlin
    Participant

    I can see two typos in your here-string:

    Starting bracket is backwards here:

    ,]hrteacher] = @HR_TEACHER
    

    Missing ending bracket here:

    ,[gradelevel
    
  • #25709
    Profile photo of Rob Simmers
    Rob Simmers
    Participant

    Keep in mind, you should be able to just echo $sqlCMD and then copy and paste it into SQL mgmt. studio and run it. Validate that all of the variables at the top are being filled in with student information that you expect.

  • #25710
    Profile photo of michael glenn
    michael glenn
    Participant

    thank you for the assistance guys.. its much appreciated.

    wow.. thanks simon....ive been staring at the screen for hours.. perhaps its time I enlarge the type a little bit. I even cut and pasted the script into ssms and that didn't help me find them.
    once I made those corrections the script ran fine with no errors.... but did not create 2 users that are in the csv , but not yet in the database.

    in the following example;
    Declare @STUDENT_ID int = '$($_.stuid)'
    $_.stuid the stuid part should be representative of the header inside the csv? am I correct?

    because these variables are inside the here string.. I cant put in a write-host statement to troubleshoot.. how would I go about troubleshooting this?
    Rob... im looking at running the commands inside the ssms and im not understanding how to validate the variables being filled.. how can I do that?

  • #25712
    Profile photo of michael glenn
    michael glenn
    Participant

    I see how to show results options.... but nothing is showing up. am I too assume that means there are no variables being filled?

  • #25713
    Profile photo of Simon Wåhlin
    Simon Wåhlin
    Participant

    I think you should use Import-CSV, not Get-Content.
    If you use Import-CSV, referring to $($_.stuid) in the SQL command would refer to the value in the column 'stuid'.

    Add:

    Write-Verbose -Message $sqlCMD -Verbose
    

    or if you prefer write-host:

    Write-Host -Object $sqlCMD
    

    Directly after you defined $sqlCMD, that way you can validate the query being generated.

    Also I would recommend commenting out the rows actually executing your query until you are satisfied that it is correct.

  • #25714
    Profile photo of michael glenn
    michael glenn
    Participant

    ok... changed to "import-csv"

    import-csv C:\scripts\HS-Student-with-pass.csv | foreach {
    

    now im getting ;
    Exception calling "ExecuteNonQuery" with "0" argument(s): "Conversion failed when converting the varchar value 'dylan' to data type int."
    At C:\scripts\sql project\update-insert-csv-to-db-beta-ver1e.ps1:67 char:1
    + $dbwrite.ExecuteNonQuery() | Out-Null
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : SqlException
    im getting this for various user names in the (im assuming in the csv and not the db).
    along with this showing up occasionaly;

    Exception calling "ExecuteNonQuery" with "0" argument(s): "Incorrect syntax near 'ASIA'.
    Unclosed quotation mark after the character string '
    IF EXISTS (select stuid from StudentInfo where stuid = @STUDENT_ID )
    BEGIN
    UPDATE [dbo].[StudentInfo]
    SET [stuid] = @STUDENT_ID
    ,[lastname] = @STUDENT_LASTNAME
    ,[firstname] = @STUDENT_FIRSTNAME
    ,[gradelevel] = @GRADELEVEL
    ,[status] = @STATUS
    ,[homeroom] = @HOMEROOM
    ,[hrteacher] = @HR_TEACHER
    ,[newpassword] = @NEW_PASSWORD
    WHERE stuid = @STUDENT_ID
    END
    ELSE
    BEGIN
    INSERT INTO [dbo].[StudentInfo]
    ([stuid]
    ,[lastname]
    ,[firstname]
    ,[gradelevel]
    ,[status]
    ,[homeroom]
    ,[hrteacher]
    ,[newpassword])
    VALUES
    (@STUDENT_ID
    ,@STUDENT_LASTNAME
    ,@STUDENT_FIRSTNAME
    ,@GRADELEVEL
    ,@STATUS
    ,@HOMEROOM
    ,@HR_TEACHER
    ,@NEW_PASSWORD)
    END'."

    im guessing this is a good indication that it is attempting to fill variables where as before I would get no return on screen and nothing got created in the database that is in the csv that isn't in the db yet.

    if I change the for each statement to this;

    $userlist = import-csv C:\scripts\HS-Student-with-pass.csv 
     foreach ($user in $userlist) {
    

    those errors go away and the script runs.. but again.. nothing gets created in the database.

  • #25715
    Profile photo of michael glenn
    michael glenn
    Participant

    reviewing the csv I found the answer to why the Incorrect syntax near 'ASIA' " error is being tossed..... the name in the cell is D'Asia... so how would one deal with excepting a single apostrophe charactor in the process?

    haven't found anything for the "Conversion failed when converting the varchar value 'dylan' to data type int." error yet.

  • #25718
    Profile photo of Simon Wåhlin
    Simon Wåhlin
    Participant

    If you change the | foreach-object {} to foreach($User in $UserList){} you will have to replace all $_ with $User.

    The most important difference between the keyword foreach used like this:

    foreach($User in $UserList)
    {
        #Do something
    }
    

    and the cmdlet Foreach-Object (which has an alias that is Foreach, not to be mixed up with the keyword above) used like this:

    Import-CSV -Path $Path | Foreach-Object {
        #Do someting
    }
    

    is that the first one will load the whole $UserList in memory, then loop while the second one will loop through each object as they come through the pipeline.
    This means that the first example usually allocates much more memory.
    The second example typically uses a lot less memory.

    Which one that is best is something you have to figure out for each use case.

    For handling quotes within your data, I think they can be escaped in SQL by doubling them. Simply make sure that any single quote is replaced by two single quotes.
    This is probably something you want to do in a separate function, but for now you could do something like this:

    $($_.lastname -replace "'","''")
    

    And at last, the error about expecting datatype int is probably SQl expecting a number but getting letters.

  • #25732
    Profile photo of Rob Simmers
    Rob Simmers
    Participant

    The conversion error is because in the declaration of variables doesn't match what is defined in SQL. For instance, this:

    Declare @STATUS nvarchar(50) = '$($_.status)'
    

    If the status column is defined with datatype INT and your variable is varchar, you would get that kind of error. So, I'm lazy and I'm working on a SQL project right now and threw this together that automagically creates the variables. It basically will pull the table schema info and generates the SQL declarations with datatypes since I'm always adding\removing and adjusting data types as I develop the database:

    $SQLServer = "SERVER\INSTANCE" #use Server\Instance for named SQL instances!
    # Set DB Name
    $SQLDBName = "Database"
    # Set DB User ID
    $SQLDBUserID = "sa"
    # Set DB Password
    $SQLDBUserPW = "P@ssword"
    
    $tables = Invoke-Sqlcmd -ServerInstance $SQLServer -Database $SQLDBName -Username $SQLDBUserID -Password $SQLDBUserPW -Query "SELECT * FROM information_schema.tables" | Select TABLE_NAME
    $tables
    foreach ($table in $tables) {
        "/*{0}*/" -f $Table.TABLE_NAME
        $records = Invoke-Sqlcmd -ServerInstance $SQLServer -Database $SQLDBName -Username $SQLDBUserID -Password $SQLDBUserPW -Query "select * from information_schema.columns where table_name = '$($Table.TABLE_NAME)'" | Select Column_Name, Data_Type, CHARACTER_MAXIMUM_LENGTH
        $records | foreach{
            if (![string]::IsNullOrEmpty($_.CHARACTER_MAXIMUM_LENGTH)){
                $charLength = $_.CHARACTER_MAXIMUM_LENGTH
                if ($charLength -eq -1) { $charLength = "MAX"}
                "Declare @{0} {1}({2}) = '`$(POWERSHELL_VAR)'" -f $_.COLUMN_NAME, $_.DATA_TYPE, $charLength
            }
            else {
                "Declare @{0} {1} = '`$(POWERSHELL_VAR)'" -f $_.COLUMN_NAME, $_.DATA_TYPE
            }
        }
    }
    

    Which gives me this:

    /*tblAppDetails*/
    Declare @ID int = '$(POWERSHELL_VAR)'
    Declare @FILTER_ID int = '$(POWERSHELL_VAR)'
    Declare @COMPUTER_ID int = '$(POWERSHELL_VAR)'
    Declare @INSTALL_DATE datetime = '$(POWERSHELL_VAR)'
    Declare @VERSION_ID nvarchar(20) = '$(POWERSHELL_VAR)'
    Declare @DISPLAY_NAME nvarchar(50) = '$(POWERSHELL_VAR)'
    Declare @REPORT_DATE datetime = '$(POWERSHELL_VAR)'
    
  • #25733
    Profile photo of michael glenn
    michael glenn
    Participant

    thank you for the assistance simon..and everyone else who chimed in at various points. again.. much appreciated. I got the script to work... I think the issue with the exception errors was the csv file was being saved in Excel as "CSV MS-DOS"... I tried to save it as "CSV(Comma Delimited)" and ran the script with no other changes made. it not only ran.. but the 2 users I had in the csv that weren't in the database were added to the database.

    I still have to wrestle/research with the apostrophe issue in the names field and decide how I want to deal with that going forward.
    also..., I want to come up with some way to document feedback (perhaps to a text file) as to what users were newly created in the database ( with all fields being reported).

    suggestions on what ways I may be able to do that would be appreciated so as to cut down on my research time.

  • #25734
    Profile photo of michael glenn
    michael glenn
    Participant

    rob.. just saw your post and im looking it over now. thanks for your input.. much appreciated.

  • #25737
    Profile photo of michael glenn
    michael glenn
    Participant

    rob.... from researching, I found just what you mentioned as far as the conversion errors. I went over my sql table to make sure I wasn't making a mistake with that and I didn't find any. all the sql columns were created and matched up with the script correctly as far as int, nvarchar, etc.... again.. the only change I made was how the csv file was saved.. and those conversion errors went away. I am going to spend some more time looking over your script so I may learn from it... thanks again.

  • #25739
    Profile photo of Rob Simmers
    Rob Simmers
    Participant

    In your database, is there an (Primary) identity key? Basically an autogenerated ID for each record?

  • #25742
    Profile photo of michael glenn
    michael glenn
    Participant

    no rob. not in this version of the database... I have built.. dropped.. built and dropped the databaset numerous times during this creation/learning process. I did though, assign primary key / index to the "stuid" column name. because I did that.. I was unsure if assigning an ID which increments automatically would be a good idea or not.

  • #25744
    Profile photo of Rob Simmers
    Rob Simmers
    Participant

    You have two basic options. The first is you would break the queries into Select\Insert\Update into separate commands. Run the Select query and if nothing is returned, then you would run the Insert and have the current information that is being inserted and do something with it. The second, if stuID is a Identity key, then you can place this under the Insert:

    SELECT  scope_identity();
    

    This will get the last inserted row identity. You are using ExecNonQuery() which returns rows affected, so I believe that would need to be ExecReader() so that you could get the returned stuID that was created. You could place them in an array and then run another command like:

    $array = 1,4,5,6,7
    $query = "Select * From tablename Where stuID In ('{0}')" -f ($array -join "','")
    

    which would give you a command to return all columns from all of the ids:

    Select * From tablename Where stuID In ('1','4','5','6','7')
    
  • #25746
    Profile photo of michael glenn
    michael glenn
    Participant

    Rob,
    I kinda like the way the script is working now.... any searchs I do with this data is gonna go off the student ID .. it is unique throughout the district.
    moving forward with this design/project.... im going to be adding 7 other schools to the mix. so making another database or adding a table for each school will be my direction. ( not sure yet of how to design this) from what ive read of sql so far... with the student id being unique, im thinking a separate table for each building. so searchs can go off building and student id if I know the building a student is supposed to be in.
    basically the database is a long term storage of the assigned password for all users. we are rolling out google docs.. all students require passwords, and active directory accounts. the support of the network has escalated ten fold due to this.. so im trying to script as much as possible to make it easier to maintain. creation of new students each year.. not to mention changes throughout the year, and a "roll up" of students going up a grade(which hasn't been thought through yet) is a night mare. I have scripts that create students based off a daily downloaded csv file... I already create home drives and email addresses and assign 4 digit passwords.

    so im liking your second option.
    so I would replace execreader() and remove ExecnonQuery()? would this cause any issues with the way the script is currently written?
    your example of the newly created array is a little confusing to me. im going to have to research/read in order to understand it better.
    my powershell skills are much better then my sql skills... im not an expert in either.. but im learning. im also prone to building things way out of my skill level.. but hey.... how else does one learn. im loving the addition of sql to powershell skilss that I have been doing of late. I have so many ideas running through my head.
    once again.. thanks for the assistance.

  • #25748
    Profile photo of Rob Simmers
    Rob Simmers
    Participant

    So, I would highly suggest reading up on relational databases and normalization. If your student id's are unique in all schools, then that should be the primary key of that table. When reading about database, you should learn about one to one, one to many and many to many relationships. A student has one school so there would be a table with school information (tblSchools), like ID, Name, Address, Main Number that are associated with that school. You would add a School_ID to your Student table (tblStudent) and associate the Student with School. This would ensure your are not repeating the same thing over and over (e.g. Name, Address) for each student and simply have an ID associated with a single record that contains the school information. There are many free articles and resources explaining normalization, so I would read up and if you are unsure ask on a SQL forum. I would recommend everything in a single database from what you've mentioned

    A non-query insinuates it's not going to return data. So, you would need to execute the insert and get the ID back. Here is a snippet of the code I've used in the past to do DB queries:

    # Set DB server
    $SQLServer = "MyServer" #use Server\Instance for named SQL instances!
    # Set DB Name
    $SQLDBName = "MyDB"
    # Set DB User ID
    $SQLDBUserID = "user"
    # Set DB Password
    $SQLDBUserPW = "myPW"
    # Generate a query string
    $SqlQuery = @"
    Select * From Table
    "@
    
    # Create a SQL connection
    $SqlConnection = New-Object System.Data.SqlClient.SqlConnection
    # Set the connection string
    $SqlConnection.ConnectionString = "Server = $SQLServer;Database = $SQLDBName;User ID = $SQLDBUserID;Password = $SQLDBUserPW;"
    # Create a SQL command
    $SqlCmd = New-Object System.Data.SqlClient.SqlCommand
    # Set the command text to the query string
    $SqlCmd.CommandText = $SqlQuery
    # Set the command connection to the SQL connection
    $SqlCmd.Connection = $SqlConnection
    # Create a new SQL adapter
    $SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter
    # Set the adapter query to the SQL command
    $SqlAdapter.SelectCommand = $SqlCmd
    # Create a new SQL DataSet
    $SqlCmd.CommandTimeout = 120
    $DataSet = New-Object System.Data.DataSet
    # Fill the DataSet with the SQL Adapter info
    $nRecs = $SqlAdapter.Fill($DataSet)
    $nRecs | Out-Null
    # Close the SQL connection
    $SqlConnection.Close()
    
    $objTable = $DataSet.Tables[0]
    #Create a Powershell object with data
    $objTable | Select Column1, Column2
    

You must be logged in to reply to this topic.