I am getting a "Violation of UNIQUE KEY constraint 'AK_User'. Cannot insert duplicate key in object 'dbo.tblUsers'." error when trying to copy data from an Excel file to a SQL database using SSIS.
Is there any way of ignoring this error and letting the package continue to the next record without stopping?
What I need is: if it inserts three records and the first record is a duplicate, then instead of failing, it should continue with the other records and insert them.
There is a system variable called Propagate which can be used to continue or stop the execution of the package.
1. Create an OnError event handler for the task which is failing. Generally it is created for the entire Data Flow Task.
2. Press F4 to get the list of all variables and click the icon at the top to show system variables. By default the Propagate variable will be True; you need to change it to False, which basically means that SSIS won't propagate the error to other components and will let the execution continue.
Update 1:
To skip the bad rows there are basically 2 ways to do so:
1. Use a Lookup
Try to match the primary key column values in the source and destination and then connect the Lookup No Match Output to your destination. If the value doesn't match the destination then insert the row, else just skip it or redirect it to some table or flat file using the Lookup Match Output (the set-based equivalent is sketched at the end of this answer).
Example
For more details on Lookup, refer to this article.
2. Or you can redirect the error rows to a flat file or a table. Every SSIS Data Flow component has an Error Output.
For example, the Derived Column component exposes this in its error output dialogue box.
But this option may not be helpful in your case, as redirecting error rows from a destination doesn't work properly. If an error occurs it redirects the entire batch without inserting any rows into the destination. I think this happens because the OLE DB destination does a bulk insert or inserts data using transactions. So try to use the Lookup to achieve your functionality.
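For reference, the Lookup No Match pattern is doing the row-by-row equivalent of the set-based T-SQL below. This is only a sketch: dbo.tblUsers and the AK_User constraint come from your error message, but the UserName key column, the FullName column and the dbo.stgUsers staging table are assumed names for illustration.

-- Insert only the rows whose unique key value is not already in the destination.
-- UserName, FullName and dbo.stgUsers are hypothetical; replace them with the
-- column(s) covered by AK_User and with wherever your Excel data lands first.
INSERT INTO dbo.tblUsers (UserName, FullName)
SELECT s.UserName, s.FullName
FROM dbo.stgUsers AS s
WHERE NOT EXISTS
      (SELECT 1 FROM dbo.tblUsers AS d WHERE d.UserName = s.UserName);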
I am attempting to create a flow which will be used to update the members of various SharePoint Permission Groups. I ran into an issue with one of the actions not executing due to the fact it said that the value could not be found. After much trial and error I still could not figure out why it was failing so I started to remove actions and steps from the flow. I've taken it all the way back to my trigger and 1 action and I can't figure out what is causing my issue. Here is the setup.
I have a list with the following fields:
Employee Name - Person or Group
Folder - Choice Column
Action - Choice Column
Flow is triggered when an item is created or modified and has the trigger condition of @not(equals(triggerBody()?['Action'],'Updated'))
1st action is just a Get items
When I add an entry to the list and select a person, a Folder and an Action, the flow will run. But when it does, it deletes or removes the selected choice in the Folder column, leaving it blank. Why would it do that? In the 2 steps I'm not even specifically calling that field, and if it is because it is a choice field, why isn't the Action column value also removed? It is not my intent to delete or remove field values.
I need the value in that field to not be removed as I intend to call on it later in a concat string but I can't call what isn't there.
What is going on?
Update #1: As an update I deleted the original flow and rebuilt it again with just the 2 steps but without the trigger condition. Re-ran the flow and immediately the option selected in the "Folder" column is removed from the list. None of the list columns are set as "required" and the choice fields are not multi-select.
Update #2: In looking at the trigger action settings, the Split On statement is @triggerOutputs()?['body/value']. In looking at the sample I was using to build my flow, they show the statement to be @triggerBody()?['value']. There doesn't seem to be any way for me to change the statement; could this have anything to do with why my field value is being removed from the list?
I have 2 SharePoint sites (SiteA and SiteB). In SiteA I have an Excel file called LocationA; when this file is edited (rows added, rows edited or rows deleted) I want to reflect these changes in another Excel file called LocationB which is stored in SiteB (I have not added the delete operation yet, but suggestions on how I might do so are welcome).
The issue is that the flow is adding rows instead of updating the existing rows in LocationB.
Please find my flow below (it is running without errors but the output is the problem)
Note
The expression in the filter array is string(items('Apply_to_each')?['ID']) which changes the ID field to String
The expression in condition 2 is empty(body('Filter_array')); this condition checks if the list item exists in Excel
Because you are using the Add a row action in Power Automate, you need to change the action to, for example, Update a row. That will work.
What I want to do is take data from a DBF file and insert it in a table, which I've already done. Since there are many files, a For-Each Container is being used. However, before inserting it into a table, I want to look at the date fields and compare them to a date variable. If the dates match the variable, then move on to the next step of the flow. But if any of the dates don't match the variable, then that file and its contents are discarded and the next file is looked at.
How do I accomplish this in SSIS?
You're looking for the Conditional Split Component within your Data Flow Task.
Assuming your source column is MyDate and you have an SSIS Variable called @[User::ReferenceDate], then you'd apply an expression like
[MyDate] == @[User::ReferenceDate]
That will evaluate to True when the dates match, false otherwise.
In your Conditional Split, add a row into the component.
OutputName: DatesMatched
Condition: [MyDate] == @[User::ReferenceDate]
Default output name: DatesUnmatched
Now when you connect the output from this to your destination, it'll ask whether you want to route the data using the DatesMatched or DatesUnmatched path. Use the DatesMatched path.
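Purely as an illustrative analogy (the DBF rows are not in SQL at this point), the DatesMatched path carries the rows a query like the one below would return; dbo.StagedDbfRows and the literal date are hypothetical names used only for the sketch.

-- Rows satisfying the split condition flow down DatesMatched;
-- everything else falls through to DatesUnmatched (the default output).
DECLARE @ReferenceDate date = '2024-01-01';   -- stands in for @[User::ReferenceDate]
SELECT r.*
FROM dbo.StagedDbfRows AS r                   -- hypothetical staging of the DBF contents
WHERE r.MyDate = @ReferenceDate;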
As I re-read this, "if any of the dates don't match the variable, then that file and its contents are discarded", then you're looking at double processing the file. The first pass reads it all in and validates it. The second pass, which is optional, actually loads it to the database.
From your Conditional Split, add a RowCount to the DatesUnmatched path. Use a Variable of type Integer/Int32 named CountDatesUnmatched. In a perfect world, that will be zero when the validation of the file completes.
In the Precedence Constraint between the Validation Data Flow and the actual Import Data Flow, double click the connector line and change the evaluation criteria from Constraint to Expression and Constraint. Leave the value as Success and in the Expression use @[User::CountDatesUnmatched] == 0. That data flow will only light up if both conditions are true: parsing was successful and no rows were sent to the Row Count component.
Finally, you can cheat, and sometimes this approach makes sense. If you're using an OLE DB Destination, then you can leave MaximumInsertCommitSize at its default of 2 billion and use a data access mode of fast load. This translates to "everything is going to commit or none of it is". That can lock up your target table and cause your transaction log to grow heavily depending on how much data you're loading. Use the Conditional Split as described above, but for the DatesUnmatched path, induce a failure. A Derived Column with a divide by zero or a Script Task with an explicit FireError event will cause that transaction to go belly up. You'd need to do some magic in the OnError event handler to not abort the overall file processing, but it's a lazy hack (or one that is useful when double reading the file is prohibitive but impacting the database is less so).
Is there any way to use a variable in the FROM part (for example SELECT myColumn1 FROM ?) in a Data Flow Task source without having to give the variable a valid default value first?
To be more exact, in my situation I'm getting the table names out of a table, then using a control flow to foreach over the list of table names, and from within that loop calling a workflow which then gets data from each of these tables. In this workflow I have the before-mentioned SELECT statement.
To get it to work properly I had to set the variable to a valid default value (at package level), as otherwise I could not create the workflow itself (the data source couldn't be created because the SELECT was invalid without the default value).
So my question here is: Is there any workaround possible in this case where I don't need a valid default value for the variable?
The data tables:
The different tables which are selected in the data flow have the exact same structure in terms of columns (that is, which columns exist, their names and their data types). Only the data inside them is different (i.e. it is data for customer A, customer B, ...).
You're in luck as this is a trivial thing to implement with SSIS.
The base problem for most people is that they come at SSIS like it's still DTS where you could do whatever you want inside a data flow. They threw out the extreme flexibility with DTS in favor of raw processing performance.
You cannot parameterize the table in a SQL statement. It's simply not allowed.
Instead, the approach that people take is to use Expressions. In your case, assuming you had two Variables of type String created, @[User::QualifiedTableName] and @[User::QuerySource].
Assume that [dbo].[spt_values] is assigned to QualifiedTableName. As you loop through the table names, you will assign the value into this variable.
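If it helps, the query feeding the Foreach loop with qualified names could look something like the sketch below. It reads sys.tables rather than whatever metadata table you actually keep your table names in, and the LIKE filter is purely hypothetical.

-- Builds [schema].[table] strings to push into @[User::QualifiedTableName]
-- via a Foreach ADO enumerator; adjust the source and WHERE clause to your own list.
SELECT QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) AS QualifiedTableName
FROM sys.tables AS t
JOIN sys.schemas AS s
    ON s.schema_id = t.schema_id
WHERE t.name LIKE N'Customer%';   -- hypothetical filter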
The "trick" is to apply an expression to the #[User::QuerySource]. Make the expression
"SELECT T.* FROM " + #[User::QualifiedTableName] + " AS T;"
This allows you to change out your table name whenever the value of the other variable changes.
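With [dbo].[spt_values] sitting in QualifiedTableName, the expression above evaluates to the query below, which is what the source will end up executing:

SELECT T.* FROM [dbo].[spt_values] AS T;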
In your data flow, you will change your OLE DB Source to be driven by a query contained in a variable instead of the traditional table selection.
If you want an example of where I use QuerySource to drive a data flow, there's an example on mixing an integer and string in an ssis derived column
1. Create a second variable. Set its Expression to create the full Select statement, using the value of the first variable.
2. In the Data Source, use the "SQL command from variable" option for the Data Access Mode property.
3. If you can, set a default value for the variable you created in step 1. That will make filling out the columns from your data source much easier (a minimal example follows this list).
4. If you can't use a default value for the variable, set the Data Source's ValidateExternalMetadata property to False. You may have to open the data source with the Advanced Editor and create Output columns manually.
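As a sketch of step 3: the default value only needs to be a syntactically valid statement against a table with the shared column layout, so the source can resolve its column metadata at design time. The table name below is hypothetical; any one of your real customer tables will do.

-- Hypothetical design-time default for the query variable; it is never the
-- final query, it just lets the OLE DB Source validate and map its columns.
SELECT T.* FROM [dbo].[Customer_A_Data] AS T;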
I want to know which is the best strategy to approach the following problem in Talend:
I need to load data from a set of delimited files that are stored in a directory with names like (SAMPLE1.DAT, SAMPLE2.DAT, ... , SAMPLEX.DAT)
The target will be a table in a MySQL database
I have to load all data at once because after this task I need to work with all records in the same table
I'm a bit confused because I don't know if it is possible in Talend. I was looking at the tFileInputDelimited component but I didn't find a way to solve it.
Thanks
To read several files from one directory, you would use the tFileList component. It allows you to specify a directory and a file name pattern. All files in the directory matching the pattern will be processed, one after the other.
You need to use an "Iterate" link from the tFileList component to those components that describe what you want to do with each file. In your case, you would start with a tFileInputDelimited component (read the file) and connect the main output of that to a tMysqlOutput component. The MySQL component will, by default, just append the data to an existing table, so that should get you the result you want.
In the tFileInputDelimited component, you would not use a fixed filename, but a variable filename which is set by the tFileList component for each iteration (your loop variable, so to speak). The name of that loop variable can be seen in the Outline view in the studio, usually in the bottom left corner.
You would chain the components tFileInputDelimited into tMap (optional) into tMysqlOutput.
Step 1: configure some components like this, except you will use the delimited file input.
Step 2: configure the component settings for the delimited file; click the disk icon for the wizard.
Step 3: configure your database by right-clicking on Db Connection under Metadata, then following the wizard.
Step 4: right-click on each component and choose Row > Main, then drag to the next step in the flow.
Step 5: open your tMap and map the columns from the file schema to the database schema.
Step 6: run the job. It should work if you have followed the wizards; if there are errors, just hover over the red component and it usually describes them pretty well. You will see, as the job runs, how many records it has transferred.
Step 7: after you have made it that far, create a tFileOutputDelimited with the same schema as the input, right-click on the input, choose Row > Rejects, and drag that to the new delimited output; this is where any records that are rejected by the tMap will be sent.