Check the input file names against the file names in the config table - filter

I have a folder which contains many files, and I have a configuration table in a SQL database which contains the list of file names I need to load to Azure Blob Storage.
I tried getting the file names from the source folder using a Get Metadata activity and then used a Filter activity to filter the file names, but that way I have to hard-code the file name inside the filter.
Can someone please let me know a way to do this?

Here is an example:
I have the below files in a folder.
And the below entries in the SQL config table.
This is what the sample pipeline looks like.
1. Lookup the list of files from the SQL config table and, using a ForEach activity, append them to an array variable. In my example it is config_files.
2. Using Get Metadata, list the childItems in the folder and append the file names into another variable. In my example it is files.
3. Use a Set Variable activity to store the result, i.e. the files that match the entries in the config table.
Expression: @intersection(variables('files'),variables('config_files'))
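The three steps above boil down to a set intersection between the folder listing and the config-table listing. A minimal Python sketch of the same logic (file names here are made up for illustration):

```python
def matching_files(folder_files, config_files):
    """Return only the folder files that also appear in the config table,
    mirroring what ADF's intersection() function does."""
    config = set(config_files)
    return [f for f in folder_files if f in config]

# files      -> what Get Metadata's childItems would return
# config_files -> what the Lookup against the SQL config table would return
files = ["sales_01.csv", "sales_02.csv", "notes.txt"]
config_files = ["sales_01.csv", "sales_02.csv", "hr_01.csv"]

print(matching_files(files, config_files))  # → ['sales_01.csv', 'sales_02.csv']
```

The matched list can then drive a ForEach over a Copy activity, one iteration per file.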

Related

File And File Grouping SQL server

I have a filegroup named Year2020 which contains three different .ndf files, for example Summer.ndf, Winter.ndf, and Fall.ndf.
Now I want to create a Fall table, and I want the table to be saved in the Fall.ndf file, not in Summer.ndf or Winter.ndf. Is there a way to do things like this? I am using SQL Server.
The problem is that all of the files are in the same filegroup, named Year2020. How can we save it exactly where we want?
When I save the Fall table it goes into Summer.ndf, not Fall.ndf.

In SSIS, add the EXCEL file name as a column in the results dataset

Using SSIS, uploading from an EXCEL file to SQL Server database table, I need to add the EXCEL file name as a column in the results dataset. I am able to create an EXCEL FILE source, get the data from the EXCEL, and load it into an OLE DB Destination but I am missing the step to add the filename to the dataset.
Most files like this are loaded through a foreach loop where you are looking in a folder for a file that matches a pattern... d:\data*.xlsx for example.
In that foreach you are saving the path to a variable.
In the data flow you can add a derived column and add that variable to your data flow and ultimately to the database.
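The same foreach-plus-derived-column pattern can be sketched in plain Python as an analogue: enumerate the files that match a pattern (the ForEach loop), read each one, and tack the file name onto every row (the Derived Column step). All names and the folder layout here are hypothetical:

```python
import csv
import glob
import os

def load_with_filename(pattern):
    """Read every file matching `pattern` and add a 'source_file'
    column holding the file's name, like an SSIS derived column."""
    rows = []
    for path in glob.glob(pattern):              # the ForEach file enumerator
        with open(path, newline="") as fh:
            for row in csv.DictReader(fh):
                row["source_file"] = os.path.basename(path)  # derived column
                rows.append(row)
    return rows

# e.g. load_with_filename(r"d:\data*.csv") would yield each row plus
# the name of the file it came from, ready to insert into the table.
```

In SSIS the equivalent is mapping the loop's path variable into the data flow via a Derived Column transformation before the OLE DB Destination.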

Multiple field names

I have a txt file, which I have to insert into a database.
My problem is that in some files I have header "customer_" instead of "customer".
I don’t know how to fix this in Pentaho. I’ve tried "select values" but I have no idea how it works.
My transformation for now: Get File Names -> CSV File Input -> Text File Output -> Table Output.
You have Metadata Injection capabilities built into Pentaho Data Integration, but just "any" file won't work; you need some kind of logic to determine that "customer_", or whatever header you get, maps to the "customer" column in the database.
Once you have the logic to map the variations of possible columns in the origin file to columns in the table, you can inject that metadata into your transformation.
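The mapping logic itself is simple; a hedged Python sketch of what the injection step would need to compute (the header variants listed are assumptions for illustration, not from the original files):

```python
# Map of known header variants to the canonical database column name.
# Extend this dict as new variants ("customer_", "Customer", ...) appear.
HEADER_MAP = {
    "customer": "customer",
    "customer_": "customer",
}

def normalize_headers(headers):
    """Return headers with known variants replaced by their canonical
    column name; unknown headers pass through unchanged."""
    return [HEADER_MAP.get(h.strip().lower(), h) for h in headers]

print(normalize_headers(["customer_", "amount"]))  # → ['customer', 'amount']
```

In PDI this normalized header list is what you would feed into the Metadata Injection step so the downstream Table Output always sees the same column names.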

Copy most recent file only from one blob to another using Azure Data Factory

I am trying to copy the most recent file from a set of files with the format "filename_yyyymmdd" from one container to another using ADF. I want a single file to be copied from that set of files.
How can we achieve this?
Please try this:
1. Create two variables and set the value of latestFile to filename_19700101.
2. Use a Get Metadata activity to get the child items.
3. Use a ForEach activity to loop over the child items and check the Sequential option. Expression:
@activity('Get Metadata1').output.childItems
4. Set the value of currentFile to @item().name.
5. Use an If Condition activity to compare the dates with this expression:
@greater(split(variables('currentFile'),'_')[1],split(variables('latestFile'),'_')[1])
In the True case, set the latestFile value to currentFile.
6. Finally, pass @variables('latestFile') to the file path in the dataset within the Copy activity.
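The loop logic relies on yyyymmdd suffixes comparing lexicographically in date order. A minimal Python sketch of the same ForEach/If pattern (file names are made up):

```python
def latest_file(child_items):
    """Keep the file whose yyyymmdd suffix compares greatest,
    mirroring the ADF @greater(split(...)[1], split(...)[1]) check."""
    latest = "filename_19700101"          # seed value, as in step 1
    for name in child_items:              # ForEach over childItems
        if name.split("_")[1] > latest.split("_")[1]:
            latest = name                 # the If Condition's True branch
    return latest

files = ["filename_20230101", "filename_20240315", "filename_20231231"]
print(latest_file(files))
```

String comparison works here only because yyyymmdd is fixed-width and zero-padded; a suffix like d/m/yyyy would need parsing first.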

Power Query From Folder as Merge, Not Append

I need to import multiple files from a folder and I need each file's contents to be new columns in the resultant table.
There are multiple examples all over the web of how to include multiple files from a folder as an append (e.g., PowerQuery multiple files and add column) but I need the contents of each file to be merged as new columns in the original table.
Any help will be greatly appreciated.
I came up with my own answer. Once you append the files you can pivot on the file name to turn them into columns.
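The append-then-pivot idea can be sketched outside Power Query as well; here is a small Python analogue where rows carry their source file name and are then pivoted so each file becomes a column (data and file names are invented for illustration):

```python
def pivot_by_file(rows):
    """rows: (file_name, key, value) triples, as produced by an append
    that tags each row with its source file.
    Returns {key: {file_name: value}} -> one column per file."""
    table = {}
    for fname, key, value in rows:
        table.setdefault(key, {})[fname] = value
    return table

appended = [
    ("jan.csv", "revenue", 100),
    ("feb.csv", "revenue", 120),
    ("jan.csv", "cost", 80),
    ("feb.csv", "cost", 90),
]
print(pivot_by_file(appended))
```

In Power Query the equivalent is Table.Pivot on the Source.Name column after the standard From Folder append.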
