How to ingest a multi-record JSON array into SQL Server using PutDatabaseRecord - apache-nifi

The problem is that I am not able to process a batch of JSON records that come as output of the QueryCassandra processor. I am able to process record by record by using the SplitJson processor before PutDatabaseRecord.
I am trying to use a JsonPathReader in PutDatabaseRecord. How can I configure the PutDatabaseRecord processor or the JsonPathReader so that all records of the JSON array are processed at once?
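For reference, a JsonPathReader configured against a JSON array does not require splitting: each user-defined (dynamic) property on the reader maps a record field name to a JsonPath expression, which is evaluated against every element of the array. A minimal sketch, assuming a hypothetical input with `id` and `name` fields and a made-up target table:

```
# Flow file content (one JSON array, multiple records):
[
  {"id": 1, "name": "alice"},
  {"id": 2, "name": "bob"}
]

# JsonPathReader controller service -- dynamic properties
# (property name = record field, value = JsonPath):
id   -> $.id
name -> $.name

# PutDatabaseRecord processor:
Record Reader  -> JsonPathReader (configured above)
Statement Type -> INSERT
Table Name     -> my_table    # hypothetical
```

With the reader attached, PutDatabaseRecord treats the whole array as one record set and inserts all rows from a single flow file, so SplitJson is not needed.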

Related

Apache Nifi for data masking

We are using NiFi as our main data ingestion engine. NiFi is used to ingest data from multiple sources like databases, blob storage, etc., and all of the data is pushed to Kafka (with Avro as the serialization format). Now, one of the requirements is to mask specific fields (PII) in the input data.
Is NiFi a good tool to do that?
Does it have any processor to support data masking/obfuscation?
NiFi comes with the EncryptContent, CryptographicHashContent, and CryptographicHashAttribute processors, which can be used to encrypt or hash data. I would look into these first.
In addition, ReplaceText can do simple masking. An ExecuteScript processor can perform custom masking, or a combination of UpdateRecord with a ScriptedRecordSetWriter can easily mask certain fields in a record.
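As a concrete illustration of the scripted approach, here is a minimal masking function of the kind you might call from an ExecuteScript or ScriptedRecordSetWriter script body. The field names and the keep-last-4 masking rule are assumptions for the example, not NiFi APIs:

```python
# Hypothetical PII fields to mask; adjust to your schema.
PII_FIELDS = {"ssn", "email"}

def mask_value(value):
    """Keep the last 4 characters, mask the rest (simple obfuscation)."""
    s = str(value)
    if len(s) <= 4:
        return "*" * len(s)
    return "*" * (len(s) - 4) + s[-4:]

def mask_record(record):
    """Return a copy of the record dict with PII fields masked."""
    return {k: mask_value(v) if k in PII_FIELDS else v
            for k, v in record.items()}

record = {"name": "Jane", "ssn": "123-45-6789", "email": "jane@example.com"}
print(mask_record(record))  # ssn and email masked, name untouched
```

In a real flow, this logic would run per record inside the scripted writer; for irreversible de-identification you would swap the masking rule for a salted hash.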

Read data from multiple tables at a time and combine the data based where clause using Nifi

I have a scenario where I need to extract data from multiple database tables (including the schema), combine the data based on a where clause, and then write it to an Excel file.
In NiFi the general strategy is to read in from something like a fact table with ExecuteSQL or some other SQL processor, then use LookupRecord to enrich the data with a lookup table. The catch in NiFi is that you can only do one table at a time, so you'd need one LookupRecord for each enrichment table. You could then write to a CSV file that you could open in Excel. There might be extensions elsewhere that can write directly to Excel, but I'm not aware of any in the standard NiFi distribution.
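The per-table enrichment that a chain of LookupRecord processors performs is equivalent to this sketch (table names, keys, and fields are hypothetical):

```python
# Fact rows as they might arrive from ExecuteSQL (hypothetical schema).
fact_rows = [
    {"order_id": 1, "customer_id": 10, "product_id": 100},
    {"order_id": 2, "customer_id": 11, "product_id": 101},
]

# Each lookup table corresponds to one LookupRecord processor in the flow.
customers = {10: "Alice", 11: "Bob"}       # enrichment table 1
products = {100: "Widget", 101: "Gadget"}  # enrichment table 2

def enrich(rows, lookup, key_field, result_field):
    """One LookupRecord pass: add result_field from one lookup table."""
    for row in rows:
        row[result_field] = lookup.get(row[key_field])
    return rows

# One pass per enrichment table, just like one LookupRecord per table.
rows = enrich(fact_rows, customers, "customer_id", "customer_name")
rows = enrich(rows, products, "product_id", "product_name")
```

After the last pass every row carries all enrichment fields and can be written out with a CSVRecordSetWriter.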

Record Oriented InvokeHTTP Processor

I have a csv file
longtitude,lagtitude
34.094933,-118.30674
34.095028,-118.306625
(more to go)
I use the UpdateRecord processor (which supports record processing) with a CSVRecordSetWriter, using RecordPath (https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html) to prepare the gis field.
longtitude,lagtitude,gis
34.094933,-118.30674,"34.094933,-118.30674"
34.095028,-118.306625,"34.095028,-118.306625"
My next step is to pass gis as an input parameter to an HTTP API, which returns info (poi) that I would like to store.
longtitude,lagtitude,gis,poi
34.094933,-118.30674,"34.094933,-118.30674","Restaurant A"
34.095028,-118.306625,"34.095028,-118.306625","Cinema X"
It seems that the InvokeHTTP processor does not process data in a record-oriented way. Is there any way to prepare the above without splitting the file further?
When you want to enrich each record like this, it is typically handled in NiFi by using the LookupRecord processor with a LookupService. Basically, for each record in the incoming flow file, some fields of the record are passed to the lookup service, and the results of the lookup are stored back in the record.
For your example it sounds like you would want a RestLookupService:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-lookup-services-nar/1.9.1/org.apache.nifi.lookup.RestLookupService/index.html
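Conceptually, what LookupRecord with a RestLookupService does per record looks like the sketch below. The POI endpoint and its response shape are invented for illustration; in NiFi the HTTP call is configured on the service rather than coded:

```python
def fake_poi_api(gis):
    """Stand-in for the REST endpoint a RestLookupService would call."""
    known = {
        "34.094933,-118.30674": "Restaurant A",
        "34.095028,-118.306625": "Cinema X",
    }
    return {"poi": known.get(gis, "Unknown")}

def lookup_record(records, lookup):
    """For each record, pass the gis field to the lookup service and
    store the result back in the record (what LookupRecord does)."""
    for rec in records:
        rec["poi"] = lookup(rec["gis"])["poi"]
    return records

rows = [
    {"longtitude": "34.094933", "lagtitude": "-118.30674",
     "gis": "34.094933,-118.30674"},
    {"longtitude": "34.095028", "lagtitude": "-118.306625",
     "gis": "34.095028,-118.306625"},
]
print(lookup_record(rows, fake_poi_api))
```

Because the lookup happens per record inside one flow file, the CSV never needs to be split.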

How to pass values dynamically from one processor to another processor using Apache NiFi

I want to pass one processor's result as input to another processor using Apache NiFi.
I am getting values from MySQL using the ExecuteSQL processor. I want to pass this result dynamically to the SelectHiveQL processor.
ExecuteSQL outputs a result set as Avro. If you would like to process each row individually, you can use SplitAvro then ConvertAvroToJson, or ConvertAvroToJson then SplitJson. At that point you can use EvaluateJsonPath to extract values into attributes (for use with NiFi Expression Language), and at some point you will likely want ReplaceText where you set the content of the flow file to a HiveQL statement (for use by SelectHiveQL).
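The EvaluateJsonPath + ReplaceText step at the end can be sketched in plain code. The `customer_id` column and the `orders` Hive table are made up for the example:

```python
import json

# One row of the ExecuteSQL result after ConvertAvroToJson + SplitJson.
flowfile_content = '{"customer_id": 42, "region": "EU"}'

# EvaluateJsonPath: pull a value out of the content into an attribute
# (in NiFi: dynamic property customer.id -> $.customer_id).
attributes = {"customer.id": json.loads(flowfile_content)["customer_id"]}

# ReplaceText: overwrite the flow file content with a HiveQL statement
# that references the attribute via Expression Language, e.g.
# SELECT * FROM orders WHERE customer_id = ${customer.id}
hiveql = "SELECT * FROM orders WHERE customer_id = {}".format(
    attributes["customer.id"])
print(hiveql)
```

The resulting flow file content is then consumed by SelectHiveQL as the query to run.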

Data aggregation in Apache Nifi

I am using Apache NiFi to process data from different sources, and I have independent pipelines created for each data flow. I want to combine this data for further processing. Is there any way I can aggregate the data and write it to a single file? The data is present in the form of flowfile attributes in NiFi.
You should use the MergeContent processor, which accepts configuration values for min/max batch size, etc. and combines a number of flowfiles into a single flowfile according to the provided merge strategy.
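A minimal MergeContent configuration for this kind of aggregation might look like the following (the values are examples, not recommendations):

```
MergeContent processor:
  Merge Strategy            -> Bin-Packing Algorithm
  Merge Format              -> Binary Concatenation
  Minimum Number of Entries -> 100      # wait for at least 100 flowfiles
  Maximum Number of Entries -> 1000
  Max Bin Age               -> 5 min    # flush a partial bin after 5 minutes
  Delimiter Strategy        -> Text
  Demarcator                -> \n       # newline between merged payloads
```

Note that MergeContent merges flow file *content*; if the data lives only in attributes, write it into the content first (e.g. with AttributesToJSON or ReplaceText) before merging, then write the merged file out with PutFile.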
