Wednesday, March 7, 2012

Best method or is it even worth the effort?

Hi,

I have a question regarding a project I am working on, and I'm not sure what the best way to approach it is. I am new to SSIS and would like to gain some exposure using it, but I don't know whether this project is overkill for it, or even possible with it.

*** PROBLEM ***

Basically, I have a large number of flat files; each file contains different columns and data. What I want to do is import all of the data from the flat files into corresponding tables in SQL Server. The tables in SQL Server are already defined with structures matching each file, and the first 10 characters of each flat file's name are the same as the name of its SQL table. So it seems easy enough to import the flat files into the tables one by one, but I cannot figure out how to match the flat file structure to each table without setting up a separate task for each table (there are roughly 50-60 tables). Another reason I want to do this with SSIS is that these flat files have to be imported on a weekly basis.
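Since the first 10 characters of each file name match the destination table, the table can be derived from the file name alone. A minimal sketch of that convention in Python (the sample file name is hypothetical, not from the original post):

```python
from pathlib import PureWindowsPath

def table_name_from_file(path):
    # Per the naming convention described above: the first 10 characters
    # of the flat file's name are the destination SQL Server table name.
    return PureWindowsPath(path).name[:10]

# Hypothetical file following the convention
print(table_name_from_file(r"C:\imports\CUSTOMER01_20120307.txt"))  # prints CUSTOMER01
```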

**********************

Is it in my best interest to just create a stored procedure that loops through the files and imports the data, or do you think SSIS is the best solution? I have looked at SSIS a bit and thought maybe I could use the Bulk Insert task, but it appears it cannot change dynamically to accommodate the different column specs. I figured out how to loop through the flat files using the Foreach Loop task, but I could only get it to import into one SQL table. I also started looking at the Data Flow task and thought about creating a Script task, but I'm not sure whether that would work, and I'd like to know before I spend a lot of time trying to figure it out.

Any suggestions or thoughts are appreciated.

Thank you

Are you saying that there are many record layouts/formats in the same file? Or that each file has only one format, but there are multiple formats across multiple files?

Either way, if you have 50 file formats, it seems like you're going to have to configure 50 flat file source connections. There may be a good way to do it with the Script task, but that's not my strength. I would still choose SSIS over the other methods - at the least, it sounds like you should look into the Conditional Split transformation and variables. Let us know about the question above.

|||

To implement this without scripting (I'm under the impression that your 50 files / tables have different formats), you'd need to create 50 data flows.

It certainly could be implemented via a script, but I'm not sure what benefit you'd get from writing the script in SSIS versus just creating a VB or C# application.

|||

Not sure exactly what you are doing, but it sounds EXACTLY like what I have recently done - except with 45 files/tables. I started to go down the route of individual data flows, but instead created an individual package for each of the files I have to process. Either way will work. Basically, we have a parent package that kicks off the child packages to process each file, iterating through the known files for each organization that submits these 45 files.

This is working as designed, and we have not run into any trouble. We are still working on getting some better error logging now. Please feel free to post if you have any questions.

Regards

|||

I was hoping to create something a little less cumbersome, although I think that will work and may be my last resort. I have something right now that seems to work; I need to do further testing and development to see whether it will meet my needs.

What I have done is create a simple Foreach Loop, and inside it I have a Script task that takes the loop's return value - the file name - and parses the string into values that I can pass to the Bulk Insert task.

In my script I set Dts.Connections.Item("BulkFile").ConnectionString to the filename variable from the Foreach Loop, then parse the string to get the name of my table and set a user-defined variable, "TableName", to the parsed value. I then open the Bulk Insert task and set its DestinationTableName expression to the "TableName" variable.
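The script task logic above - parse the table name out of the file name, then point a BULK INSERT at that table - can be sketched outside SSIS as well. A hedged Python illustration (the file path, delimiter, and WITH options are assumptions, not from the original post; in the actual package this work is done by the Script task and Bulk Insert task):

```python
from pathlib import PureWindowsPath

def build_bulk_insert(file_path):
    """Mirror the Script task: derive the table name from the first 10
    characters of the file name, then build the matching BULK INSERT."""
    table = PureWindowsPath(file_path).name[:10]
    sql = (f"BULK INSERT dbo.[{table}] FROM '{file_path}' "
           "WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n');")
    return table, sql

# In place of the Foreach Loop, one could iterate a folder of files
# (hypothetical folder) and execute each statement against SQL Server:
#   for path in files_in(r"C:\imports"):
#       table, sql = build_bulk_insert(path)
table, sql = build_bulk_insert(r"C:\imports\CUSTOMER01_20120307.txt")
print(table)
```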

My initial findings look good, but so far I have only imported six of the tables dynamically; I still have 50+ to go. I also have to create some stored procedures to clean out the tables before the bulk insert, due to primary key constraints etc. - but basically, once a week I am flushing all of the data in these tables and replacing it with the data from the flat files.
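Since the tables are emptied and fully reloaded each week, the pre-load cleanup can follow the same naming convention. A small sketch (the choice of DELETE over TRUNCATE is an assumption, since the constraints mentioned above can block TRUNCATE; the table name is hypothetical):

```python
def build_cleanup_sql(table):
    # DELETE rather than TRUNCATE: foreign-key references can prevent
    # TRUNCATE TABLE from running against a referenced table.
    return f"DELETE FROM dbo.[{table}];"

print(build_cleanup_sql("CUSTOMER01"))  # prints DELETE FROM dbo.[CUSTOMER01];
```

In the package, a statement like this (or a stored procedure wrapping it) would run just before the Bulk Insert task for each table.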
