Thursday, March 22, 2012

Best Practices: Recording Error Information (Beyond ErrorCode and ErrorColumn)

I'm sorry for asking this, as I'm sure tha the answer is in one of the 208 matches I found searching on "ErrorCode". Unfortunately, this project is overdue and I need a solution fairly soon.

I should add that I've only been developing in SSIS since August 2007.

I have a complicated package, loading about 17 outputs of the XML Source into staging tables. I have been using the error outputs of any standard components I use, out of faith that, if Microsoft provided them, then they should be useful for something. I've been directing all of the error outputs for one of the 17 "tables" from the XML Source into a Union All, and then into an "Error staging table", for each of the 17 outputs. This table includes all of the possible columns, but permits nulls for all of them. It also incldues the ErrorCode and ErrorColumn.

Unfortunately, if these latter two are useful for anyting, I haven't found it yet.

Right now, working with our first "mostly real" data, I'm getting 100% of my input rows written into error tables. Unfortunately, the information in the tables are of limited use in determining what went wrong. For instance, the ErrorColumn seems only to be populated if there was a specific error with a specific column. The Lookup component, doesn't seem to populate ErrorColumn, even if only one column was used for the lookup! No information about the component producing the error output is supplied in the error output, either, which makes it difficult to determine which of the five or so possible error outputs is the one that produced the particular error row.

This proves that I'm missing something simple. How do people handle errors? In my custom components, I learned to use the Fire* methods to produce detailed messages before redirecting the row to the error output, but this sort of thing is not available through the standard components.

Do I really have to create a separate "add useful information to the error output" component, and use it on each error output?

Sorry for the attitude, but just when I think I'm winning, SSIS brings me back down to earth!

If you want to identify which component produced an error then simply hook a derived column component up to the error path that adds the name of the component producing the error to the data path. Thats that one taken care of!

"Do I really have to create a separate "add useful information to the error output" component, and use it on each error output?"

Whats the problem with doing that? Granted you could argue that this information should be included anyway but the rationale for it NOT being there is that it could potentially take up A LOT of memory - a rationale that I agree with. hence you have the opportunity to do it yourself.

Note that if youre using a UNION ALL to collate all of your error rows into one place then you only need to do this once.

-Jamie

|||

Jamie Thomson wrote:

If you want to identify which component produced an error then simply hook a derived column component up to the error path that adds the name of the component producing the error to the data path. Thats that one taken care of!

"Do I really have to create a separate "add useful information to the error output" component, and use it on each error output?"

Whats the problem with doing that?

The main problem is that I don't know how!

The second problem is that there are about 50 or so places I'd have to add one of these components.

In particular, how can a downstream component learn about the component next-upstream? I also don't know how I'd "do it only once" downstream of the Union All.

|||

I've implemented a Derived Column transform for the first few of my error outputs, and it works well enough.

My next issue is that I have two custom components that are very heavily used. They produce good error messages through the Fire* methods. I now would like to take the messages they produce and place them into my new ErrorMessage column. I added a dummy Derived Column transform to the start of the chain of transforms in order to make the ErrorMesage column available with the correct data type and size throughout the transform chain.

The problem is that both of my components do nonstandard things with their collection of input columns, so I can't just add ErrorMessage as an input column. I know how to get the buffer index for an input column and how to use it to get values from the buffer for the column, and how to set them into the buffer.

My question is: can I get a buffer index for a column not in my input columns collection and use that index to set the value of that column? I see that there's a Virtual Input and Virtual Input Column Collection, but I haven't seen examples of how to use it.

This does sound like a hack, and I promise to fix it once I get the leisure to write a UI for these components so they don't have to do strange things with their input column collections.

Thanks.|||

John Saunders wrote:

Jamie Thomson wrote:

If you want to identify which component produced an error then simply hook a derived column component up to the error path that adds the name of the component producing the error to the data path. Thats that one taken care of!

"Do I really have to create a separate "add useful information to the error output" component, and use it on each error output?"

Whats the problem with doing that?

The main problem is that I don't know how!

Oh right. Sorry. Go here:

SSIS Nugget: Get error descriptions

(http://blogs.conchango.com/jamiethomson/archive/2005/08/08/SSIS-Nugget_3A00_-Get-error-descriptions.aspx)

John Saunders wrote:

The second problem is that there are about 50 or so places I'd have to add one of these components.

I'm not understanding why. If you push all the error outputs into a UNION ALL then you just need to add the code that I linked to above in a single component downstream of the UNION ALL. Unless I'm misunderstanding.

John Saunders wrote:

In particular, how can a downstream component learn about the component next-upstream?

It can't but I don't know why you'd want to.

-Jamie

|||

John Saunders wrote:

My question is: can I get a buffer index for a column not in my input columns collection and use that index to set the value of that column? I see that there's a Virtual Input and Virtual Input Column Collection, but I haven't seen examples of how to use it.

Spot on. Yes, you can do that. To be honest, the virtual input is what you see in the designer. The inputitself contais the columns that are selected at design-time. hence the columns in the input are a subset of teh volumns in the virtual input.

This code will return the virtual input and let you process the columns

Code Snippet

IDtsVirtualInput90 vInput = ComponentMetadata.InputCollection[0].GetVirtualInput();

foreach (IDtsVirtualInputColumn90 in vInput)

{

///blah blah blah

}

[that's done from memory so don't hang me if it doesn't work Smile ]

By the way, I will be publishing some code, via my blog and codeplex, in the next couple of days that does exactly this i.e. process data based on the virtual input. Keep a look out for that. Addrss is below.

-Jamie

No comments:

Post a Comment