I am new to Informatica, so I need your help.
I have a staging table that receives data every day. I need to extract the data from this staging table, convert it into .dat file format, and place it in a folder so that these .dat files can feed another process.
I don't know how Informatica does this (converting data from a staging table to a .dat file). Please help me understand how Informatica fetches the data from the staging table, transforms it into a .dat file, and places it in a folder.
Thanks & Regards,
Vikram
To create a pipe-delimited flat file...
Go to the Target Designer - Select Target->Create then choose Flat File. Then double click on the file, and in the 'Table' tab, at the bottom right select 'Advanced' and choose your delimiter. Then you can add your columns, specify the file location and all is well!
You will need to define a source definition based on your staging table and a target definition based on your final file format, and then create the mapping, session, and workflow that link the two.
.Dat file is not a complete description for the file, since any file can be renamed to a .dat file. You'll need to decide how the data would be separated in this file (commas? tabs? pipes?). Remember all downstream processes will then use this file as input, so you need to publish this format too.
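To make the point about the separator concrete, here is a minimal sketch in Python (not Informatica) of what a pipe-delimited feed file could contain. The function name `write_delimited`, the column names, and the sample rows are all hypothetical; in Informatica the equivalent choice is made in the flat-file target definition.

```python
import csv
import io


def write_delimited(rows, header, out, delimiter="|"):
    """Write a header line and data rows to `out` using the given delimiter."""
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


# Hypothetical rows standing in for the staging table's contents.
buf = io.StringIO()
write_delimited([(1, "Alice"), (2, "Bob")], ["id", "name"], buf)
print(buf.getvalue())
```

Whatever delimiter is chosen, the downstream process has to be told about it, which is why the format needs to be published along with the file.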
I need to transfer around 20 CSV files from a folder named ActivityPointer in an Azure Blob Storage container to an Azure SQL database in a single Data Factory pipeline. ActivityPointer contains the 20 CSV files and also a subfolder named snapshots. When I create a pipeline and use * to select all the CSV files inside ActivityPointer, it includes the snapshots folder too, which should not be included. Is there any way to complete this task? I also can't create another folder to move the snapshots folder into. What can I do? Can anyone please help me out?
Assuming you want to copy all CSV files within the ActivityPointer folder,
You can use a wildcard expression as below:
you can provide the path up to the ActivityPointer folder and then *.csv
Copy data also considers the inner folder when using wildcards (even if we use *.csv in the wildcard file path). So, we have to check whether each item is a file or a folder. Please look at the following demonstration.
First use Get Metadata on the required folder with field list as Child items. The debug output will be:
Now use this to iterate through the child items using a ForEach activity.
@activity('Get Metadata1').output.childItems
Inside the ForEach, use an If Condition activity to check whether the current item is a file or not. Use the following condition.
@equals(item().type,'File')
When this is true, use a Copy data activity to copy the file to the target table (ignore the false case). I created a file_name parameter in my source dataset and pass @item().name as its value.
This will help you to achieve your requirement. The following is the debug output. I have 4 files and 1 folder. The folder will be ignored, and the rest will be copied into the target table.
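The same filtering logic can be sketched outside ADF in Python. This is only an illustration of what the ForEach plus If Condition does, not ADF code; the `files_only` helper and the sample item names are hypothetical, while the `name` and `type` keys mirror the child items returned by Get Metadata.

```python
def files_only(child_items):
    """Return the names of child items whose type is 'File', skipping folders."""
    return [item["name"] for item in child_items if item["type"] == "File"]


# Hypothetical child items, shaped like the Get Metadata output.
child_items = [
    {"name": "file1.csv", "type": "File"},
    {"name": "file2.csv", "type": "File"},
    {"name": "snapshots", "type": "Folder"},
]

for name in files_only(child_items):
    # In the pipeline, this is where the name is passed to the dataset parameter.
    print(name)
```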
I have a requirement where I receive different types of files every day, such as Excel, CSV, Avro, JSON, etc.
I need to fetch a list of file names like
tablea.xls
tablea.csv etc
I need to convert all the files from their different formats to CSV.
We need to do this using ADF.
Thanks ,
Use the Get Metadata activity to list files and the Copy activity to convert the format. Copy can change formats but cannot do much in the way of transformation. Specify the format you want in the Sink section of the Copy config. Try it out, work through some tutorials, and come back if you get specific errors.
I need to create a stored procedure that reads a .csv file from an Oracle server path using a file read operation, queries the data in some table X, and writes the output to a .csv file.
After reading the .csv file, I need to compare its data with the table data and update a few columns in the .csv file.
Oracle works best with data in the database. UPDATE is one of the most frequently used commands.
But, modifying a file which resides in some directory seems to be somewhat out of scope. There are other programming languages you should use, I believe. However, if a hammer is the only tool you have, every problem looks like a nail.
I can think of two options.
One is to load file into the database. Use SQL*Loader to do that if file resides on your PC, or - if you have access to the database server and DBA granted you read/write privileges on a directory (an Oracle object which points to a filesystem directory) - use it as an external table. Once you load data, modify it and export it back (i.e. create a new CSV file) using spool.
Another option is to use UTL_FILE package. It also requires access to the database server's directory. Using the A(ppend) option, you can add rows to the original file, but I don't think that you can edit it so this option - at the end - finishes like the previous one - with creating a new file (but this time using UTL_FILE).
Conclusion? Don't use a database management system to modify files. Use another tool.
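As an illustration of what "another tool" might look like, here is a minimal Python sketch that reads a CSV, overwrites one column for matching rows, and writes a new CSV. Everything here is an assumption for the example: the `update_csv` helper, the column names, and the `lookup` dictionary, which stands in for values that would really be queried from the table.

```python
import csv
import io


def update_csv(src, dst, lookup, key_col, target_col):
    """Copy rows from src to dst, replacing target_col where key_col is in lookup."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[key_col] in lookup:
            row[target_col] = lookup[row[key_col]]
        writer.writerow(row)


# Hypothetical input file and hypothetical values fetched from the table.
src = io.StringIO("id,status\n1,old\n2,old\n")
dst = io.StringIO()
update_csv(src, dst, {"1": "new"}, key_col="id", target_col="status")
print(dst.getvalue())
```

As with both options above, the result is a new file rather than an in-place edit of the original.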
How do I work on a specific part of a CSV file uploaded into HDFS?
I'm new to Hadoop and I have a question: if I export a relational database into a CSV file and then upload it into HDFS, how do I work on a specific part (table) of the file using MapReduce?
Thanks in advance.
I assume that the RDBMS tables are exported to individual CSV files, one per table, and stored in HDFS. I presume that by 'specific part (table)' you mean the column data within the tables. If so, place the individual CSV files in separate file paths, say /user/userName/dbName/tables/table1.csv
Now you can configure the job with the input path and the field positions. Consider using the default Input Format so that your mapper gets one line at a time as input. Based on the configuration/properties, you can read the specific fields and process the data.
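The per-line field selection described above could look like the following Python sketch, written in the style of a mapper that receives one CSV line at a time. The `select_fields` helper, the field positions, and the sample line are hypothetical; a real job would take the positions from its configuration.

```python
import csv


def select_fields(line, indexes):
    """Parse one CSV line and return only the fields at the given positions."""
    fields = next(csv.reader([line]))
    return [fields[i] for i in indexes]


# Hypothetical input line; the quoted comma shows why a CSV parser is safer
# than a plain split(",").
print(select_fields('1,"Smith, J",42', [0, 2]))
```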
Cascading allows you to get started very quickly with MapReduce. It has a framework that lets you set up Taps to access sources (your CSV file) and process them inside a pipeline, for example to add column A to column B and place the sum into column C by selecting them as Fields.
Use BigTable, which means converting your database into one big table.
I have to load a copy of the target table's data into a text file while the workflow is running (i.e., whatever data goes into the target table should also go into the text file).
You can have two targets. One can be your relational target and the other can be a flat-file target.