Adding a column at the end of a pipe-delimited file in NiFi - apache-nifi

I have this pipe-delimited file on an SFTP server:
PROPERTY_ID|START_DATE|END_DATE|CAPACITY
1|01-JAN-07|31-DEC-30|101
2|01-JAN-07|31-DEC-30|202
3|01-JAN-07|31-DEC-30|151
4|01-JAN-07|31-DEC-30|162
5|01-JAN-07|31-DEC-30|224
I need to transfer this data to an S3 bucket using NiFi. In the process I need to add another column at the end containing today's date:
PROPERTY_ID|START_DATE|END_DATE|CAPACITY|AS_OF_DATE
1|01-JAN-07|31-DEC-30|101|20-10-2020
2|01-JAN-07|31-DEC-30|202|20-10-2020
3|01-JAN-07|31-DEC-30|151|20-10-2020
4|01-JAN-07|31-DEC-30|162|20-10-2020
5|01-JAN-07|31-DEC-30|224|20-10-2020
What is the simplest way to implement this in NiFi?

@Naga, here is a very similar post that describes ways to add a new column to a CSV:
Apache NiFi: Add column to csv using mapped values
The simplest way is ReplaceText, appending the same "|20-10-2020" to each line. In the ReplaceText settings, set Evaluation Mode to Line-by-Line and use the regex replacement $1|20-10-2020. The other methods in that post are ways to do it more dynamically, for example if the date isn't static.
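A minimal sketch of those ReplaceText settings; the Expression Language variant for a dynamic date is my assumption, not part of the original answer, and note that the header line would also get the suffix appended, so it may need separate handling:

ReplaceText settings (sketch):
  Replacement Strategy: Regex Replace
  Evaluation Mode:      Line-by-Line
  Search Value:         (.+)
  Replacement Value:    $1|20-10-2020

  # Dynamic alternative using NiFi Expression Language (assumption):
  Replacement Value:    $1|${now():format('dd-MM-yyyy')}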

Related

How to set start and end row or interval rows for CSV in NiFi?

I want to get a particular part of an Excel file in NiFi. My NiFi template is like this:
GetFileProcessor
ConvertExcelToCSVProcessor
PutDatabaseRecordProcessor
I need to parse the data between steps 2 and 3.
Is there a solution for getting specific rows and columns?
Note: if there is an option to do the cutting in ConvertExcelToCSVProcessor, that would work for me.
You can use Record processors between ConvertExcelToCSV and PutDatabaseRecord.
To remove or override a column, use UpdateRecord. This processor can receive your data via a CSVReader and prepare output for PutDatabaseRecord or QueryRecord. Check View usage -> Additional Details...
To filter by column, use QueryRecord.
Here is an example: it receives data through a CSVReader and performs some aggregations; you can also do some filtering, per the documentation.
This post also helped me understand Records in NiFi.
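To make that concrete, a rough sketch of a QueryRecord dynamic property; the property name and column names are hypothetical, and the assumption that the Calcite dialect behind QueryRecord accepts LIMIT/OFFSET is mine, not from the original answer:

-- Dynamic property on QueryRecord, e.g. named "subset" (name is arbitrary);
-- each dynamic property becomes an outgoing relationship.
SELECT site_id, name              -- hypothetical column names
FROM FLOWFILE
LIMIT 10 OFFSET 5                 -- keep rows 6 through 15 only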

Problem in Reading Data from CSV file in Jmeter

I am facing an issue while reading data from a CSV file to pass values to a request. I have a CSV with 3 columns: userid, password, and type. When the data is passed to the username field, it takes the values of all 3 columns instead of just the username.
JMeter version: 5.0
CSV file value:
Can you please help me figure out if I am doing it wrong?
Double-check the delimiter in your CSV file, because if it's different from the default comma (,) you will need to change the "Delimiter" setting in the CSV Data Set Config accordingly.
To be 100% sure of the test data's integrity, I would recommend opening the file with a text editor like Notepad instead of Excel, as JMeter treats CSV files as plain text; this is also what you should be doing when validating your test data.
You may find the Debug Sampler useful for visualising the JMeter Variables generated by the CSV Data Set Config, Post-Processors, and some pre-defined variables.
Let me see the variable you insert into the login request. Everything looks fine with your config, so maybe it is about your CSV file. Please make sure the CSV file, opened with Notepad (not Microsoft Word), looks like this:
userid,password,type
test#test.com,Test#123,Vehicle
test#test.com,Test#123,Vehicle
test#test.com,Test#123,Vehicle
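For completeness, a sketch of how those three columns map to request parameters once the CSV Data Set Config is wired up; the filename and sampler field names below are assumptions:

CSV Data Set Config (sketch):
  Filename:        users.csv        (assumed name)
  Variable Names:  userid,password,type
  Delimiter:       ,

HTTP request parameters (hypothetical field names):
  username = ${userid}
  password = ${password}
  type     = ${type}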

Talend tFileInputDelimited component java.lang.NumberFormatException for CSV file

As a beginner to TOS for Big Data, I am trying to read two CSV files in Talend Open Studio. I have inferred the metadata schema from the CSV file itself, set the first row as the header, and set the delimiter to a comma (,).
In my code:
The tMap reads the CSV file, does a lookup on another CSV file, and generates two output files: passed and rejected records.
But while running the job I am getting the error below.
Couldn't parse value for column 'Product_ID' in 'row1', value is '4569,Laptop,10'. Details: java.lang.NumberFormatException: For input string: "4569,Laptop,10"
I believe it is treating the entire row as one string and taking it as the value for the "Product_ID" column.
I don't know why that is happening when I have set the delimiter and row separator correctly.
Schema
I can see that no rows are coming out of the first tFileInputDelimited due to the above error.
Job Run
Input component
Any idea what else I can check?
Thanks in advance.
In your last screenshot, you can see that the Field Separator of your tFileInputDelimited_1 is ";" and not ",".
I believe you haven't set up the component to use the metadata you created for your CSV file.
So you need to configure the component to use that metadata by selecting Repository under Property Type, and then selecting the delimited-file metadata.
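Putting both suggestions together, a sketch of the relevant tFileInputDelimited settings (labels may vary slightly by Talend version):

tFileInputDelimited_1 Basic settings (sketch):
  Property Type:    Repository     (pick the delimited-file metadata)
  Field Separator:  ","            (not ";")
  Header:           1              (first row contains column names)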

How to start reading from second row using CSV data config element in jmeter

CSV Data Set Config always reads from the first row. I want to add column headers to the CSV file, so I want it to start reading from the second row.
Below are the settings in the Thread Group:
Number of threads = 1
Loop Count = 10 (depends on the number of rows in the CSV file)
What version of JMeter are you using? It seems like leaving the Variable Names field empty will do the trick. More info here:
Versions of JMeter after 2.3.4 support CSV files which have a header line defining the column names. To enable this, leave the "Variable Names" field empty. The correct delimiter must be provided.
Leave the Variable Names field empty, as in the attached snapshot.
Add a header to the CSV file. The header values will act as variable names, which can be used as parameters in requests.
Sample facility.csv file:
facility1,name
GG1LMD,test1
Request
There is a field called "Ignore first line (only used if Variable Names is not empty)"; set it to true.
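As a concrete sketch of the header-based approach with the facility.csv above (the request usage is illustrative):

CSV Data Set Config (sketch):
  Filename:        facility.csv
  Variable Names:  (left empty - the header row supplies the names)
  Delimiter:       ,

In the request, reference the columns as ${facility1} and ${name}.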

Reading files in PIG where delemeter comes in data

I want to read a CSV file using Pig. What should I do? I used LOAD with PigStorage(','), but it fails to read the CSV file properly because it splits the data wherever it encounters a comma (,). How should I specify the delimiter when the data itself contains commas?
It's generally impossible to distinguish a comma in the data from a comma used as a delimiter.
You will need to escape the commas in your data and use a custom load function (for Pig) that can recognize escaped commas.
Take a look here:
http://ofps.oreilly.com/titles/9781449302641/load_and_store_funcs.html
http://pig.apache.org/docs/r0.7.0/udf.html#Load%2FStore+Functions
Have you had a look at the CSVLoader in the PiggyBank, if you want to read a CSV file? (Of course the file format needs to be valid.)
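A minimal sketch of the CSVLoader approach; the jar path and the schema below are assumptions for illustration:

-- Register the PiggyBank jar first (location depends on your installation)
REGISTER piggybank.jar;

-- CSVLoader understands quoted fields, so embedded commas survive
records = LOAD 'input.csv'
          USING org.apache.pig.piggybank.storage.CSVLoader()
          AS (product_id: int, name: chararray, quantity: int);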
First make sure you have a valid CSV file. If you don't, try changing the source file in Excel (if the file is small) or another tool and export a new CSV with a delimiter that suits your data (e.g. \t tab, ;, etc.). Even better, do another extract with a "good" delimiter.
An example of your LOAD could then be something like this:
TABLE = LOAD 'input.csv' USING PigStorage(';')
    AS (site_id: int, name: chararray, ... );
An example of your STORE:
STORE TABLE INTO 'clean.csv' USING PigStorage(','); -- delimiter that suits you best
