Passing parameter from different source into insert statement using Nifi - oracle

I'm still new to NiFi. What I want to achieve is to pass parameters from two different sources into one insert statement.
Scenario:
I have two data sources: JSON data and a record id (from an Oracle function). I extracted the record id with ExtractText as "${recid}", and the JSON string is available by default as "$1".
How can I insert both into a table using a SQL statement such as insert into table1 (json,recid) values ('$1','${recid}')?
After I run the processor, I'm not able to get both attributes into one insert statement.
Please help.
Nifi flowfile
Flowfile after mergecontent

You should merge these two flowfiles into one.
Use the MergeContent processor with Attribute Strategy set to Keep All Unique Attributes.
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.6.0/org.apache.nifi.processors.standard.MergeContent/index.html
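As a rough illustration of what Keep All Unique Attributes is meant to do, here is a minimal Python sketch (not NiFi code). It assumes the documented behaviour that an attribute is kept unless two bundled flowfiles carry conflicting values for it. The function name merge_unique_attributes and the sample attributes (json, source) are hypothetical.

```python
def merge_unique_attributes(flowfiles):
    """Combine attribute dicts, dropping any key whose values conflict."""
    merged = {}
    conflicting = set()
    for attrs in flowfiles:
        for key, value in attrs.items():
            if key in conflicting:
                continue  # already known to conflict, stays dropped
            if key in merged and merged[key] != value:
                del merged[key]
                conflicting.add(key)
            else:
                merged[key] = value
    return merged


# Hypothetical attributes on the two incoming flowfiles.
json_flowfile = {"json": '{"a": 1}', "source": "file"}
recid_flowfile = {"recid": "42", "source": "oracle"}

merged = merge_unique_attributes([json_flowfile, recid_flowfile])
print(merged)
```

Here both json and recid survive on the merged flowfile, while source is dropped because the two inputs disagree on its value.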

Take a look at LookupAttribute with a SimpleDatabaseLookupService. You can pass your JSON flow file into that, look up the recid into an attribute, then do the ExtractText -> ReplaceText to get it into SQL form.

Related

NiFi - Call Rest API for every row in the file

I have a dataset of IDs, and I've got a flow file that has one row per ID. I have an API that takes this ID as a parameter, and I want to harvest the results for all rows back into NiFi (example below).
https://service.com/api/thing/{ID}
How, in NiFi, can I call this API for all IDs in my dataset? Ideally using some parallelism if possible.
(For reference, in SSIS I could load these IDs into an array and then loop over an API call with a parameter for the ID.)
First, use SplitText to get each ID as its own flowfile.
Then copy the content to an attribute with ExtractText. Add a custom property, such as 'message.body' in this example,
so that ExtractText adds a message.body.0 attribute to the flowfile, which you can then reference in InvokeHTTP. Please note that since your endpoint is https, you may need to configure an SSL Context Service.
Finally, you can set the concurrent task count for each processor for parallelism.
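The same split, extract, call pattern can be sketched outside NiFi in Python. This is only an analogy for the flow above, not NiFi code: split_ids, build_url, call_all and fake_fetch are hypothetical names, and fake_fetch stands in for a real HTTP client so the sketch runs offline.

```python
from concurrent.futures import ThreadPoolExecutor

BASE_URL = "https://service.com/api/thing/{id}"  # endpoint pattern from the question


def split_ids(content):
    """Mimic SplitText: one non-empty line per ID."""
    return [line.strip() for line in content.splitlines() if line.strip()]


def build_url(record_id):
    """Mimic using the extracted attribute in the request URL."""
    return BASE_URL.format(id=record_id)


def call_all(ids, fetch, max_workers=4):
    """Run fetch(url) for every ID on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda i: fetch(build_url(i)), ids))


def fake_fetch(url):
    """Hypothetical stand-in for an HTTP GET."""
    return {"url": url, "status": 200}


ids = split_ids("101\n102\n\n103\n")
results = call_all(ids, fake_fetch)
for result in results:
    print(result["url"])
```

The max_workers argument plays roughly the role of the concurrent task count: it caps how many calls are in flight at once.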

NiFi - How can i change ReplacementValue on ReplaceText?

I'm using ExecuteSQL, SplitAvro, ConvertAvroToJSON, EvaluateJsonPath, ReplaceText, ExecuteSQL.
I am trying to replace the content in the flowfile using the ReplaceText processor.
Right now I can replace it like this: INSERT INTO x values (${id},'${name}'). ReplaceText then sends SQL to ExecuteSQL like this:
INSERT into x values(1,'xx')
INSERT into x values(2,'yy')
INSERT into x values(3,'zz')
So I'm sending one INSERT query per row.
But I want to send ExecuteSQL a single statement like this:
INSERT INTO x values (1,'xx'),(2,'yy'),(3,'zz')
I'm not sure it's the best approach, but you could do this:
SplitAvro # I guess you are splitting the records here (this is where you should get the fragment.* attributes)
ConvertAvroToJSON # converting each record to JSON
EvaluateJsonPath # getting the id and name values from the JSON
ReplaceText # (${id}, '${name}')
MergeContent # merge the rows back into a single file with a header and a demarcator
  Merge Format = Binary Concatenation
  Header = Insert into X values
  Demarcator = ,
ExecuteSQL
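To make the effect of the Header and Demarcator settings concrete, here is a small Python sketch of binary concatenation. It is an approximation of what MergeContent produces, not NiFi code; merge_content is a hypothetical helper, and the rows are the sample values from the question.

```python
def merge_content(fragments, header="", demarcator="", footer=""):
    """Concatenate fragments: header, then fragments joined by the demarcator, then footer."""
    return header + demarcator.join(fragments) + footer


rows = [(1, "xx"), (2, "yy"), (3, "zz")]

# Each fragment is what ReplaceText would leave as flowfile content.
fragments = ["({}, '{}')".format(row_id, name) for row_id, name in rows]

statement = merge_content(
    fragments,
    header="INSERT INTO x values ",  # note the trailing space
    demarcator=",",
)
print(statement)
```

Building SQL by string concatenation like this does not escape quotes inside values, so it is only safe when the data is trusted.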
Although it won't generate exactly the SQL you're looking for, take a look at ConvertRecord and/or JoltTransformRecord -> PutDatabaseRecord. The former is used to get each of your Avro records into the form you want (id, name), and PutDatabaseRecord will use a PreparedStatement in batches to send the records to the database. It might not be quite as efficient as a single INSERT, but should be much more efficient than a Split -> Convert -> ExecuteSQL with separate INSERTs per FlowFile.
To truly get the SQL you want, you'll likely need a scripted processor such as InvokeScriptedProcessor with a RecordReader. I have a blog post on the subject.

How to set start and end row or interval rows for CSV in Nifi?

I want to get a particular part of an Excel file in NiFi. My NiFi template looks like this:
GetFileProcessor
ConvertExcelToCSVProcessor
PutDatabaseRecordProcessor
I need to parse the data between steps 2 and 3.
Is there a solution for getting specific rows and columns?
Note: If there is an option for cutting the data in ConvertExcelToCSVProcessor itself, that will work for me too.
You can use Record processors between ConvertExcelToCSV and PutDatabaseRecord.
To remove or override a column, use UpdateRecord. This processor can receive your data via a CSVReader and prepare output for PutDatabaseRecord or QueryRecord. Check View usage -> Additional Details...
To filter by column, use QueryRecord.
Here is an example. It receives data through a CSVReader and does some aggregations; you can also do some filtering, according to the docs.
This post also helped me understand Records in NiFi.

Migrating table with PutDatabaseRecord with different column name at the target table

I need to migrate the data from a db2 table to a mssql table but one column has a different name, but the same datatype.
Db2 table:
NROCTA,NUMRUT,DIASMORA2
MSSQL table:
NROCTA,NUMRUT,DIAMORAS
As you see DIAMORAS is different.
I'm using the following flow:
ExecuteSQL -> SplitAvro -> PutDatabaseRecord
In PutDatabaseRecord I have an AvroReader as the Record Reader, configured this way:
Schema Access Strategy: Use Embedded Avro Schema.
Schema Text: ${avro.schema}
The flow inserts only the first two columns. How can I do the mapping between the DIASMORA2 and DIAMORAS columns?
Thanks in advance!
First thing, you probably don't need SplitAvro in your flow at all, unless there's some logical subset of rows that you are trying to send as individual transactions.
For the column name change, use UpdateRecord and set the field /DIAMORAS to the record path /DIASMORA2, and change the name of the field in the AvroRecordSetWriter's schema from DIASMORA2 to DIAMORAS.
That last part is a little trickier since you are using the embedded schema in your AvroReader. If the schema will always be the same, you can stop the UpdateRecord processor and put in an ExtractAvroMetadata processor to extract the embedded schema into the flowfile's avro.schema attribute.
Then, before you start UpdateRecord, start the ExecuteSQL and ExtractAvroMetadata processors and inspect a flow file in the queue to copy the schema out of the avro.schema attribute. In the AvroRecordSetWriter used by UpdateRecord, instead of inheriting the schema, choose Use Schema Text and paste in the schema from the attribute, changing DIASMORA2 to DIAMORAS. This approach puts values from the DIASMORA2 field into the DIAMORAS field, and since DIASMORA2 is not in the output schema, it is ignored, thereby effectively renaming the field (although under the hood it is a copy-and-remove).
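The copy-and-remove idea can be pictured with a short Python sketch. This is an analogy for the UpdateRecord plus writer-schema combination, not NiFi code. It uses the target column name DIAMORAS from the question's MSSQL table; the helper names and the sample row values are hypothetical.

```python
# Fields the writer schema declares (the target table's columns).
OUTPUT_FIELDS = ["NROCTA", "NUMRUT", "DIAMORAS"]


def update_record(record):
    """Mimic UpdateRecord: copy the value of DIASMORA2 into a new DIAMORAS field."""
    updated = dict(record)
    updated["DIAMORAS"] = updated["DIASMORA2"]
    return updated


def write_with_schema(record, fields):
    """Mimic the writer: only fields named in the output schema are emitted."""
    return {name: record[name] for name in fields}


# Hypothetical row as read from the Db2 table.
source_row = {"NROCTA": 1001, "NUMRUT": 12345678, "DIASMORA2": 15}

target_row = write_with_schema(update_record(source_row), OUTPUT_FIELDS)
print(target_row)
```

The old field is never deleted explicitly; it simply has no place in the output schema, which is why the result looks like a rename.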

Nifi add attribute from DB

I am currently getting files from FTP in NiFi, but I have to check some conditions before I fetch the file. The scenario goes something like this.
List FTP -> Check Condition -> Fetch FTP
In the Check Condition part, I have to fetch some values from the DB and compare them with the file name. So can I use UpdateAttribute to fetch some records from the DB and make it like this?
List FTP -> Update Attribute (from DB) -> Route on Attribute -> Fetch FTP
I think your flow would look something like this:
Flow:
1. ListFTP // to list the files
2. ExecuteSQL // to execute a query in the DB (sample query: select max(timestamp) db_time from table)
3. ConvertAvroToJSON // convert the ExecuteSQL result to JSON
4. EvaluateJsonPath // keep Destination as flowfile-attribute and add a new property db_time with value $.db_time
5. RouteOnAttribute // compare the filename timestamp with the extracted timestamp using NiFi Expression Language
6. FetchFile // if the condition is true, fetch the file
RouteOnAttribute Configs:
I have assumed the filename is something like fn_2017-08-2012:09:10 and ExecuteSQL has returned 2017-08-2012:08:10.
Expression:
${filename:substringAfter('_'):toDate("yyyy-MM-ddHH:mm:ss"):toNumber()
:gt(${db_time:toDate("yyyy-MM-ddHH:mm:ss"):toNumber()})}
In the expression above, filename is the attribute set by ListFTP and db_time is the attribute added by the EvaluateJsonPath processor; both timestamps are converted to numbers and then compared.
Refer to this link for more details regarding NiFi Expression Language.
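For readers who want to check the comparison logic outside NiFi, here is a Python sketch of the same test. It is not Expression Language: is_newer is a hypothetical helper, and the strptime pattern is my translation of the yyyy-MM-ddHH:mm:ss format used above.

```python
from datetime import datetime

# Python equivalent of the yyyy-MM-ddHH:mm:ss pattern in the expression.
TIMESTAMP_FORMAT = "%Y-%m-%d%H:%M:%S"


def is_newer(filename, db_time):
    """Return True when the timestamp after the first '_' is later than db_time."""
    file_part = filename.partition("_")[2]  # text after the first underscore
    file_time = datetime.strptime(file_part, TIMESTAMP_FORMAT)
    return file_time > datetime.strptime(db_time, TIMESTAMP_FORMAT)


# Sample values assumed in the answer above.
print(is_newer("fn_2017-08-2012:09:10", "2017-08-2012:08:10"))
```

With the sample values, the file timestamp (12:09:10) is one minute later than the database timestamp (12:08:10), so the file would be routed on to the fetch step.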
So if I understand your use case correctly, you are using the external DB only for tracking purposes, so I guess the latest processed timestamp alone is enough. In that case, I would suggest using the DistributedCache processors and Controller Services offered by NiFi instead of relying on an external DB.
With this method, your flow would be like:
ListFile --> FetchDistributedMapCache --(success)--> RouteOnAttribute -> FetchFile
Configure FetchDistributedMapCache
Cache Entry Identifier - This is the key for your Cache. Set it to something like lastProcessedTime
Put Cache Value In Attribute - Whatever name you give here will be added as a FlowFile attribute with its value being the Cache value. Provide a name, like latestTimestamp or lastProcessedTime
Configure RouteOnAttribute
Create a new dynamic relationship by clicking the (+) button in the Properties tab. Give it a name, like success or matches. Let's assume your filenames are of the format somefile_1534824139, i.e. a name, an _ and the epoch timestamp appended.
In that case, you can leverage NiFi Expression Language and the functions it offers. So for the new dynamic relationship, you can have an expression like:
success - ${filename:substringAfter('_'):gt(${lastProcessedTimestamp})}
This is with the assumption that, in FetchDistributedMapCache, you have configured the property Put Cache Value In Attribute with the value lastProcessedTimestamp.
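The routing rule reduces to a plain numeric comparison, sketched here in Python. This is not NiFi code: should_fetch is a hypothetical helper, and the sketch assumes the text after the first underscore is purely numeric (no file extension), as in the example filename.

```python
def should_fetch(filename, last_processed_timestamp):
    """Return True when the epoch suffix of the filename exceeds the cached value."""
    suffix = filename.partition("_")[2]  # text after the first underscore
    return int(suffix) > int(last_processed_timestamp)


# The cached value arrives as a string attribute; these values are hypothetical.
print(should_fetch("somefile_1534824139", "1534820000"))
print(should_fetch("somefile_1534824139", "1534824139"))
```

A file whose timestamp equals the cached value is not fetched again, since the comparison is strictly greater-than.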
Useful Links
https://community.hortonworks.com/questions/83118/how-to-put-data-in-putdistributedmapcache.html
https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#dates
