Sql Server update/insert
I had a project where I needed to have a SQL Server database replicate/mirror another database. The goal was to import a huge set of ascii files (each file representing a table), and perform bulk inserts. This was done once a month to ensure that the entire database was exactly up to date. Additionally, every day there would be a set of addendum tables. This was a much smaller set of tables that contained a relatively small amount of data, and some records would need to be inserted while others needed to be updated.
In order to ensure that we do not add duplicate records, I needed a way to go through the entire addendum/change file and find records that already exist versus the ones which are new. The typical approach would be to loop through each record and check to see if the item already exists. If it does, we need to update, if not we need to insert. While this is a straightforward approach, the problem was that is that it required writing an UPDATE statement that would require naming every single column. There were over 50 tables, so it would have required writing a unique update statement for each of them. It also would involve using Cursors to loop through each record of the temporary addendum table. I needed an easier approach as the project was time sensitive.
Since the tables had their own keys, and since we did not use identity columns as the primary key, I took advantage of that. This is what I came up with:
DELETE from MAIN_TABLE Where KEYFIELD in (Select KEYFIELD from TEMP_TABLE). INSERT INTO MAIN_TABLE SELECT * FROM TEMP_TABLE
Where MAIN_TABLE is the full table and TEMP_TABLE is a temporary sql server table that holds the addendum/changes. It's pretty straightforward. I remove all records from the original table that exist in the daily addendum/change files, and then I insert them all back into the main table. This involves no UPDATEs, just inserts. Since the tables match up exactly, there's no need to specify column names, and I can do this very easily for all 50+ tables.
One thing to keep in mind is that the main database files were pretty big while the update files were pretty small. Tens of thousands of records in main table compared to tens or hundreds in "update" file/table. If this wasn't the case, I wouldn't recommend this approach.
