Given the CSV input file below:
name,amount
Abc,"1,234.56"
Def,"2,222,222.222222"
The amount field contains a decimal number with comma separators. How can I parse it into a number in NiFi? I don't want to keep it as a string.
I thought of using the UpdateRecord processor, Expression Language, and Java's NumberFormat to parse it, but it seems NumberFormat is inaccessible from Expression Language. Alternatively, I wanted to use ScriptedRecordSetWriter to do the parsing, but I couldn't find any working example out there.
Appreciate any help especially with a working example.
When reading the incoming data we still need to use the String type (as the data is enclosed in double quotes); when writing the data out of the UpdateRecord processor we can use int/decimal types for the output flowfile records.
1. Using Record Path Value:
You can read the incoming data as a String datatype and define the amount field as an integer in the output schema, then use the UpdateRecord processor to replace ',' with ''.
Add a new property to the UpdateRecord processor:
/amount
substringBefore(replace(/amount,',',''),'.')
Now the output flowfile will have an integer datatype for the amount field.
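Tracing the expression on the first sample row (my own walkthrough, not NiFi output): replace(/amount,',','') turns "1,234.56" into 1234.56, and substringBefore(...,'.') drops the fractional part, leaving 1234, which the writer emits as an integer.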
2. Using Literal Value:
If we use Literal Value as the Replacement Value Strategy, we can apply NiFi Expression Language functions to field.value; with the replace and toNumber functions we get an int value for the amount field.
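For example, the property could look like this (a sketch; it assumes the same /amount property name, Replacement Value Strategy set to Literal Value, and the same substringBefore truncation as the record-path version):
/amount
${field.value:replace(',',''):substringBefore('.'):toNumber()}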
Either way, the output flowfile in JSON format will be
[{"name":"Abc","amount":1234},{"name":"Def","amount":2222222}]
In the same way, if you want decimal as the output type, define an Avro schema with a decimal type and don't use the substringBefore and toNumber functions.
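A minimal sketch of such a schema (the record name is arbitrary, and the precision/scale of 13/6 is an assumption sized to the sample data):
{
  "type": "record",
  "name": "nifi_record",
  "fields": [
    { "name": "name", "type": "string" },
    { "name": "amount", "type": { "type": "bytes", "logicalType": "decimal", "precision": 13, "scale": 6 } }
  ]
}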
Related
I have a lookup table with one string column, and the input column from the Source Qualifier is a decimal data type. I need to use these two columns in the lookup condition. While doing this it throws an invalid datatype error, so I need to convert the input lookup column's datatype. Is there any other way to get this done?
Convert your input column into another column in an Expression transformation using the function below: TO_CHAR(Your_Column). Then link this new column to your lookup.
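As a sketch (the port name INPUT_COL_STR is hypothetical): in the Expression transformation, add an output port of type string
INPUT_COL_STR = TO_CHAR(Your_Column)
and use INPUT_COL_STR against the string column in the lookup condition.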
The datatypes of the 2 columns need to be the same, otherwise how can you expect there to be any matches in the lookup?
You can either convert the string to a decimal or the decimal to a string, whichever is easier
Have you tried adding an Expression transformation that first transforms your input to a string, then doing the lookup after the Expression transformation? Try this.
I am building a generic CSV output module with a variable number of columns. The DataFormat in BW (5.14) lets you define a repeating item and thus offers a list of items that I could use to map data to in the RenderCSV step.
But when I run this with data for more than one column (and loopings), only one column is generated.
Is the feature broken or do I use it wrongly?
Alternatively, I defined "enough" optional columns in the data format and mapped each field separately, which is not a really generic solution.
It looks like in BW 5, when using Data Format and Parse Data to parse text, repeating elements aren't supported.
Please see https://support.tibco.com/s/article/Tibco-KnowledgeArticle-Article-27133
The workaround is to use the Data Format resource, Parse Data, and Mapper activities together. First use Data Format and Parse Data to parse the text into XML where every element represents one line of the text. Then use the Mapper activity and the tib:tokenize-allow-empty XSLT function to tokenize every line and get sub-elements for each field in the lines. The linked article also has a workaround implementation attached.
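For illustration, the tokenizing step in the Mapper might look roughly like this (a sketch only; the line and field element names are hypothetical, and I'm assuming the function takes the input string and a delimiter):
<xsl:for-each select="tib:tokenize-allow-empty($Parse-Data/line, ';')">
    <field>
        <xsl:value-of select="."/>
    </field>
</xsl:for-each>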
I am using the UpdateRecord processor in NiFi, where I need to get the value from one record path (/amount), add some value to it, and put the resulting value into another record path (/result).
I could not find any way of doing this. Any help would be great!
Use UpdateRecord twice.
The first one is:
Record Reader: CSVReader
Record Writer: CSVRecordSetWriter
Replacement Value Strategy: Record Path Value
/result: /amount
and the second one is:
Record Reader: CSVReader
Record Writer: AvroRecordSetWriter
Replacement Value Strategy: Literal Value
/result: ${field.value:toNumber():plus(1000)}
This answer is based on my other answer, Add two columns together using apache-nifi.
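To make the two passes concrete (my own trace, assuming the schema defines both amount and result fields): for a record with amount = 1234, the first UpdateRecord copies /amount into /result via the record path, so result = 1234 (still a string); the second evaluates ${field.value:toNumber():plus(1000)} against /result, writing result = 2234 into the Avro output.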
I am using the NiFi ListAzureBlobStorage processor to get the available blob objects. The processor creates a flowfile for each object, with attributes containing the object metadata. I want to filter on the azure.timestamp attribute, but I do not know what the numeric value represents or how it relates to NiFi Expression Language's date data type. I want to compare it with a known date, so I need to convert it to a NiFi date-time value first. How do I do this?
Thanks
According to the code it is already in "NiFi format" which means a Unix timestamp.
Since it represents the number of milliseconds passed since 1/1/1970, you can compare this and the other timestamp using regular number comparison operators.
Example: ${azure.timestamp:ge(${now()})} will return true if azure.timestamp is later than (or equal to) the current timestamp (now).
If you'd like to compare it to another attribute you can do this:
${azure.timestamp:ge(${attribute.name})}.
If you'd like to convert a different date into a Unix timestamp, you can use toDate and then toNumber; to go the other way around, just use format.
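For example (sketches; the attribute name known.date and the format pattern are assumptions):
${known.date:toDate('yyyy-MM-dd HH:mm:ss'):toNumber()} turns a date string into epoch milliseconds, and
${azure.timestamp:format('yyyy-MM-dd HH:mm:ss')} renders the epoch milliseconds back as a date string.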
There is a date field in the record, in the format "YYYY-MM-DD HH:MM:SS.sss" (I'm using this date value as a string). In some records the milliseconds are rounded off from the source, for example:
2018-05-15 15:30:20.123
2018-05-15 15:30:20.12
2018-05-15 15:30:20.3
Is there a way in NiFi to pad the additional zeros in examples 2 and 3, like below?
2018-05-15 15:30:20.120
2018-05-15 15:30:20.300
Is there a way to loop in NiFi Expression Language?
PS: Right now I am using three different processors to do this loop: I keep the date as an attribute, check its length as a condition, and decide whether to add a '0' as needed. Another approach I tried is an ExecuteScript processor. But I'm trying to find out whether there is a better solution.
Assume you have an attribute date = 2018-05-15 15:30:20.3.
You can use UpdateAttribute with an expression like this:
${date:append('000'):replaceAll('(\\.\\d{3})(.*)$','$1')}
Append extra zeros and then remove the needless ones with a regex replace.
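Tracing it on the samples (my own walkthrough, not NiFi output):
2018-05-15 15:30:20.12 -> append('000') -> 2018-05-15 15:30:20.12000 -> replaceAll -> 2018-05-15 15:30:20.120
2018-05-15 15:30:20.3 -> append('000') -> 2018-05-15 15:30:20.3000 -> replaceAll -> 2018-05-15 15:30:20.300
The capture group (\.\d{3}) keeps the dot plus the first three millisecond digits, and $1 discards whatever follows, so values that already have three digits pass through unchanged.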