Talend tFileInputDelimited component java.lang.NumberFormatException for CSV file

As a beginner to TOS for Big Data, I am trying to read two CSV files in Talend Open Studio. I have inferred the metadata schema from the CSV file itself, set the first row to be the header, and set the delimiter to a comma (,).
In my job:
The tMap reads the CSV file, does a lookup on another CSV file, and generates two output files: passed and rejected records.
But while running the job I get the error below.
Couldn't parse value for column 'Product_ID' in 'row1', value is '4569,Laptop,10'. Details: java.lang.NumberFormatException: For input string: "4569,Laptop,10"
I believe it is treating the entire row as a single string and using it as the value for the "Product_ID" column.
I don't know why that is happening when I have set the delimiter and row separator correctly.
[screenshot: Schema]
I can see that no rows flow out of the first tFileInputDelimited because of the above error.
[screenshot: Job Run]
[screenshot: Input component]
Any idea what else I can check?
Thanks in advance.

In your last screenshot, you can see that the Field separator of your tFileInputDelimited_1 is ";" and not ",".
I believe you haven't set up your component to use the metadata you created for your CSV file.
So you need to configure the component to use that metadata by selecting Repository under Property Type, and then selecting the delimited file metadata.
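To see why the wrong separator produces exactly this exception, here is a minimal standalone Java sketch (not Talend-generated code; the class and variable names are made up) of the two parses:

public class DelimiterDemo {
    public static void main(String[] args) {
        String line = "4569,Laptop,10";   // the row from the error message

        // With the Field separator wrongly set to ';', the whole line stays in one field:
        String[] wrong = line.split(";");
        System.out.println(wrong.length);               // 1
        // Integer.parseInt(wrong[0]) would throw
        // java.lang.NumberFormatException: For input string: "4569,Laptop,10"

        // With the separator set to ',', the row splits into 3 fields as the schema expects:
        String[] right = line.split(",");
        int productId = Integer.parseInt(right[0]);     // 4569
        System.out.println(productId + " / fields: " + right.length);
    }
}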

Related

Adding column at the end to pipe delimited file in NiFi

I have this pipe-delimited file on an SFTP server:
PROPERTY_ID|START_DATE|END_DATE|CAPACITY
1|01-JAN-07|31-DEC-30|101
2|01-JAN-07|31-DEC-30|202
3|01-JAN-07|31-DEC-30|151
4|01-JAN-07|31-DEC-30|162
5|01-JAN-07|31-DEC-30|224
I need to transfer this data to an S3 bucket using NiFi. In the process, I need to add another column at the end containing today's date.
PROPERTY_ID|START_DATE|END_DATE|CAPACITY|AS_OF_DATE
1|01-JAN-07|31-DEC-30|101|20-10-2020
2|01-JAN-07|31-DEC-30|202|20-10-2020
3|01-JAN-07|31-DEC-30|151|20-10-2020
4|01-JAN-07|31-DEC-30|162|20-10-2020
5|01-JAN-07|31-DEC-30|224|20-10-2020
What is the simplest way to implement this in NiFi?
@Naga, here is a very similar post that describes ways to add a new column to a CSV:
Apache NiFi: Add column to csv using mapped values
The simplest way is ReplaceText, appending the same "|20-10-2020" to each line: set ReplaceText's Evaluation Mode to Line-by-Line and the Replacement Value to $1|20-10-2020. The other methods are ways to do this more dynamically, for example if the date isn't static.
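For illustration, here is roughly what that per-line append does, as a standalone Java sketch rather than NiFi configuration (the file names are hypothetical). It uses the dynamic-date variant, and handles the header separately so the output matches the AS_OF_DATE layout above:

import java.nio.file.*;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.*;

public class AppendDateColumn {
    public static void main(String[] args) throws Exception {
        // Same dd-MM-yyyy format as the 20-10-2020 example, computed so the date isn't static.
        String asOfDate = LocalDate.now().format(DateTimeFormatter.ofPattern("dd-MM-yyyy"));
        List<String> lines = Files.readAllLines(Paths.get("property.csv")); // hypothetical name
        List<String> out = new ArrayList<>();
        out.add(lines.get(0) + "|AS_OF_DATE");            // extend the header row
        for (String line : lines.subList(1, lines.size())) {
            out.add(line + "|" + asOfDate);               // ReplaceText-style per-line append
        }
        Files.write(Paths.get("property_out.csv"), out);  // hypothetical output name
    }
}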

Pentaho-spoon Reading text file

We have a text file with multiple fields, and it is used in different transformations. When I do 'Get Fields' in Text File Input, I get the following fields:
I don't need all these fields for the next step, so I kept only the required fields (i.e. the 1st, 3rd, 18th and 19th) and removed the other fields in Text File Input, since there is a '?' per parameter in the next step.
But it is picking up the values of the initial fields only.
I even tried using 'Position' as per the file, but no luck. Can anyone please tell me what I am missing here?
Text File Input reads the columns sequentially, even if you specify certain column names in the Fields tab.
Select all the fields in the Fields tab of Text File Input, then use Select Values as the next step and select only the required fields there.
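To illustrate the positional behaviour, here is a minimal Java sketch (not Kettle code; the delimiter and sample values are made up) of reading all columns in order and only then picking the required ones:

public class SelectFieldsDemo {
    public static void main(String[] args) {
        // Hypothetical 19-column row; the delimiter and values are made up.
        String line = String.join(",", java.util.Collections.nCopies(19, "x"));
        String[] all = line.split(",");                  // read ALL columns, in order
        // Required fields from the question: 1st, 3rd, 18th, 19th (zero-based 0, 2, 17, 18)
        String[] required = { all[0], all[2], all[17], all[18] };
        System.out.println(java.util.Arrays.toString(required));
    }
}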

Creating CSV within bash containing field which has commas

Hoping someone can assist with this query. I am attempting to replicate a CSV output, and everything is fine at the moment aside from one field that I am attempting to insert.
One of the fields contains multiple values separated by commas, yet they all belong in the same field.
Is there any way in bash to generate this CSV so that the output acknowledges that this particular field has multiple comma-delimited entries, and does not push those entries into different fields?
Cheers
wingZero
Details:
Field1, Field2, Field3
Data1, DATA2,DATA22,DATA222, Data3
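For reference, the usual CSV convention (RFC 4180) is to wrap such a field in double quotes, so that readers treat the embedded commas as data rather than separators, e.g.:

Field1,Field2,Field3
Data1,"DATA2,DATA22,DATA222",Data3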

Send a Flat file attachment in the workflow in Informatica Developer

In a mapping, we use a delimited flat file having 3 columns, separated by commas. But I have a requirement where one column contains 2 commas within it. How should I process that column in the mapping?
You should have the information quoted with "", so whatever is within quotes is skipped; this way you can differentiate between a comma inside a piece of information and a comma used as a column separator.
We don't know what you have tried, but you could count the number of commas on each line and split accordingly (if possible).
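A minimal Java sketch of that idea, assuming the extra commas sit inside a double-quoted field as suggested above (the sample row is made up): split only on commas that fall outside quotes.

public class QuotedSplitDemo {
    public static void main(String[] args) {
        // Hypothetical row: 3 columns, the middle one quoted and containing 2 commas.
        String line = "A,\"piece, of, information\",C";
        // Split on commas followed by an even number of double quotes up to the end
        // of the line, i.e. commas that are outside any quoted field.
        String[] cols = line.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");
        System.out.println(cols.length);        // 3
        for (String c : cols) System.out.println(c);
    }
}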

How to start reading from second row using CSV data config element in jmeter

CSV Data Set Config always reads from the first row. I want to add column headers to the CSV file, so I want the CSV config to start reading from the second row.
Below are the settings in the Thread Group:
No. of threads = 1
Loop Count = 10 (depends on the no. of rows in the CSV file)
What version of JMeter are you using? It seems that leaving the Variable Names field empty will do the trick. More info here:
Versions of JMeter after 2.3.4 support CSV files which have a header line defining the column names. To enable this, leave the "Variable Names" field empty. The correct delimiter must be provided.
Leave the Variable Names field empty, as in the attached snapshot.
In the CSV file, add a header. The header values will act as variable names, which can be used as parameters in requests.
Sample facility.csv file:
facility1,name
GG1LMD,test1
[screenshot: Request]
There is a field called "Ignore first line (only used if Variable Names is not empty)"; set this to true.
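For clarity, here is roughly what the CSV Data Set Config does when Variable Names is left empty, as a standalone Java sketch using the facility.csv sample above: the first row supplies the variable names and each later row supplies one iteration's values.

import java.nio.file.*;
import java.util.*;

public class CsvHeaderDemo {
    public static void main(String[] args) throws Exception {
        List<String> lines = Files.readAllLines(Paths.get("facility.csv"));
        String[] names = lines.get(0).split(",");            // header row -> variable names
        for (String row : lines.subList(1, lines.size())) {  // data starts at the second row
            String[] values = row.split(",");
            Map<String, String> vars = new LinkedHashMap<>();
            for (int i = 0; i < names.length; i++) {
                vars.put(names[i], values[i]);               // e.g. facility1=GG1LMD, name=test1
            }
            System.out.println(vars);
        }
    }
}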
