I have a csv flowfile with single record. I need to create its file name based on couple of column values in the csv file. Can you please let me know how we can do it by using the column name only not the position of the column as column position may change. Example
CSV File
Name , City, State, Country, Gender
John, Dallas, Texas, USA, M
File name should be John_USA.csv
I am trying extract text processor and pulling the first data row using -
row = ^.\r?\n(.)
And then updateattribute processor I am pulling the values from the columns using below expression
${row:getDelimitedField(1)}_${row:getDelimitedField(4)}.csv
But this use the position of the column not the column name. How can I build it using the column name not the position of columns
The way I will do it (maybe be not the efficient one):
Convert the CSV to json
Pass content to attributes (so you can access the field you want like dictionnary (key-value))
Update Attributes
Convert it back to CSV (thus you can control the schema, and the position of the fields).
Related
I'm having a csv file with two columns for example column A and Columb B. Column B consists of string value like this : I am, doing good. so when I try to insert this data into a database only the string I am is getting inserted. I just want to know what attribute I need to add to the process group so that I am, doing good will get inserted to the database
The attached image consists of the attributes in the current process group
I have column family with generated columns and heavy data in it.
Every column has name like PREFIX + TIMESTAMP.
I need to extract timestamp from column name before than I make a decision whether I need to get data from it.
How can I get only names of columns in CF?
As a beginner to TOS for BD, I am trying to read two csv files in Talend OS, i have inferred the metadata schema from the same CSV file, and setup the first row to be header, and delimiter as comma (,)
In my code:
The tMap will read the csv file, and do a lookup on another csv file, and generate two output files passed and reject records.
But while running the job i am getting below error.
Couldn't parse value for column 'Product_ID' in 'row1', value is '4569,Laptop,10'. Details: java.lang.NumberFormatException: For input string: "4569,Laptop,10"
I believe it is considering entire row as one string to be the value for "Product_ID" column
I don't know why that is happening when i have set the delimiter and row separator correctly.
Schema
I can see no rows are going from the first tInputFileDelimited due to above error.
Job Run
Input component
Any idea what else can i check?
Thanks in advance.
In your last screenshot, you can see that the Field separator of your tFileInputDelimited_1 is ; and not ,.
I believe that you haven't set up your component to use the metadata you created for your csv file.
So you need to configure the component to use the metadata you've created by selecting Repository under Property Type, and selecting the delimited file metadata.
To count the rows of csv file we can use Get Files Rows Count Input in etl. How to find the number columns of a csv file?
Just read the first row of the CSV file using Text-File-Input setting header rows to 0. Usually, the first row contains field names. If you read the whole row into a single field, you can use Split-Field-To-Rows to have a single fieldname per row and the number of rows tells you the number of fields. There are other ways, but this one easily prepares for a subsequent metadata injection - if that's what you have in mind.
No Need of Metadata injection , In Split-Field-To-Rows, check "Include rownum in output" and give some name to that Variable. Then apply sort rows on that Variable, use Sample rows, then you will get number of fields which are present in the file.
I have a couple of questions on reading data from excel sheet using vbscript code and storing
them in a dictionary object.
1) This is more excel specific. Can I do Data filtering on an excel based on what I select in a master column. For Ex: I have a column name TicketType which has a list of values namely Type1, Type2, Type3. If I select Type1, the data in the remaining columns should show data specific to Type 1, when I select Type2, the remaining columns should change to data specific to Type2.
2) If point one is possible in excel, I would want to use VBScript to select the value first in the master column and then read the filtered data in to a dictionary. I would request for some help here. My whole idea is not to create more rows and instead use one single row and keep filtering and reading.
Thanks
Srinivas