Error while using ConvertJSONToSQL processor in NiFi - apache-nifi

I am using HDP 2.5. I ran into an issue with the ConvertJSONToSQL processor while trying to convert a bulk (1 GB) Avro file into SQL: first I convert the Avro into JSON (using the ConvertAvroToJSON processor), and then I convert the JSON into SQL (using the ConvertJSONToSQL processor). At that point I get the issue I mention below.

First, try splitting the bulk Avro file into smaller pieces using the SplitAvro processor that ships with NiFi, and then convert each piece into JSON.
Configure the SplitAvro processor accordingly; a minimal sketch is shown below.
For more information follow this link
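A minimal sketch of the SplitAvro configuration (property names are from the standard processor; the Output Size value is an assumption you should tune to your record size):

SplitAvro
  Split Strategy:    Record
  Output Size:       1000        (records per split flowfile; assumed value)
  Output Strategy:   Datafile
  Transfer Metadata: true

Then route the splits through ConvertAvroToJSON -> ConvertJSONToSQL as before, so each processor only ever sees a small flowfile.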

Related

How to convert an Avro schema into line protocol in order to insert data into InfluxDB with Apache NiFi

I am creating a data pipeline with Apache NiFi to copy data from a remote MySQL database into InfluxDB.
I use the QueryDatabaseTable processor to extract the data from the MySQL database, then UpdateRecord to do some data transformation, and I would like to use PutInfluxDB to insert the time series into my local Influx instance on Linux.
The data coming from the QueryDatabaseTable processor is in Avro format, and I need to convert it into line protocol, configuring which fields are the tags and which are the measurement values.
However, I cannot find any processor that does this conversion.
Any hints?
Thanks,
Bernardo
There is no built-in processor for InfluxDB line protocol conversions. You could write a ScriptedRecordSetWriter if you wanted to do it yourself; however, there is a project by InfluxData that already implements a line protocol reader for NiFi here, and it seems to be active and up-to-date.
See the documentation for adding it into NiFi here
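Line protocol itself is just text, so it can help to see what the conversion has to produce. A minimal Python sketch (the measurement, tag, and field names are hypothetical; a real implementation must also escape commas, spaces, and equals signs in tag keys and values):

def to_line_protocol(measurement, tags, fields, timestamp_ns):
    # Target format: measurement,tag1=v1,tag2=v2 field1=v1,field2=v2 timestamp
    tag_str = ",".join("%s=%s" % (k, v) for k, v in sorted(tags.items()))
    field_str = ",".join("%s=%s" % (k, v) for k, v in fields.items())
    return "%s,%s %s %d" % (measurement, tag_str, field_str, timestamp_ns)

# One hypothetical row from the MySQL extract:
print(to_line_protocol("cpu", {"host": "server01"}, {"usage": 0.64}, 1465839830100400200))
# -> cpu,host=server01 usage=0.64 1465839830100400200

Whatever reader/writer you end up using has to emit exactly this shape for PutInfluxDB to accept it.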

How to configure the ConvertRecord processor in Apache NiFi so that it converts JSON to Avro format

I am not able to figure out how to configure this processor to convert the incoming JSON Twitter data to Avro, so that I can put the data into a Hive table.
Please help.
Thank you in advance.
Rishi Pandey
ConvertJSONToAvro is the processor you need.
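If you do want to use ConvertRecord as you asked, here is a minimal configuration sketch (schema inference is only available on newer NiFi versions; on older ones you would supply the schema explicitly via the Schema Text property instead):

ConvertRecord
  Record Reader:  JsonTreeReader      (Schema Access Strategy: Infer Schema)
  Record Writer:  AvroRecordSetWriter (Schema Write Strategy: Embed Avro Schema)

The resulting Avro flowfiles can then go on to the Hive table via PutHDFS or PutHiveStreaming.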

Apache NiFi for data masking

We are using NiFi as our main data ingestion engine. NiFi ingests data from multiple sources like databases, blob storage, etc., and all of the data is pushed to Kafka (with Avro as the serialization format). Now, one of the requirements is to mask specific fields (PII) in the input data.
Is NiFi a good tool to do that?
Does it have any processor to support data masking/obfuscation?
NiFi comes with the EncryptContent, CryptographicHashContent, and CryptographicHashAttribute processors, which can be used to encrypt or hash data respectively.
I would look into those first.
In addition, ReplaceText can do simple masking. An ExecuteScript processor can perform custom masking (see the sketch below), or a combination of UpdateRecord with a ScriptedRecordSetWriter can easily mask certain fields in a record.
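As one concrete illustration of the ExecuteScript route, here is a minimal Jython sketch that masks a single field. It assumes the flowfile content is JSON (for example after a ConvertAvroToJSON step, since a plain script will not parse Avro for you), and the field name ssn is hypothetical:

import json
from org.apache.nifi.processor.io import StreamCallback
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets

class MaskFields(StreamCallback):
    def process(self, inputStream, outputStream):
        record = json.loads(IOUtils.toString(inputStream, StandardCharsets.UTF_8))
        # Hypothetical PII field: keep the last 4 characters, mask the rest
        if 'ssn' in record:
            value = str(record['ssn'])
            record['ssn'] = '*' * max(len(value) - 4, 0) + value[-4:]
        outputStream.write(bytearray(json.dumps(record).encode('utf-8')))

# 'session' and REL_SUCCESS are bindings provided by ExecuteScript
flowFile = session.get()
if flowFile is not None:
    flowFile = session.write(flowFile, MaskFields())
    session.transfer(flowFile, REL_SUCCESS)

Since your data is Avro on the way to Kafka, the UpdateRecord or ScriptedRecordSetWriter approach is the better fit; this script is only meant to show the masking logic itself.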

How to keep a Hive table in NiFi DistributedMapCache

I want to keep my Hive/MySQL table in the NiFi DistributedMapCache. Can someone please help me with an example?
Or please correct me if there is no way to cache a Hive table in the NiFi cache.
Thanks
You can use the SelectHiveQL processor to pull data from the Hive table, with the output format set to CSV and the header disabled.
Then use a SplitText processor to split each line into an individual flowfile.
Note
If your flowfile is big, you will have to chain several SplitText processors in series to get the flowfile down to individual lines.
Use an ExtractText processor to extract the key attribute from the flowfile content.
Then use a PutDistributedMapCache processor:
Configure and enable the DistributedMapCacheClientService and DistributedMapCacheServer controller services.
Set the Cache Entry Identifier property to the attribute extracted by the ExtractText processor.
You may need to increase the Max cache entry size depending on the flowfile size.
To fetch the cached data you can use the FetchDistributedMapCache processor, using exactly the same identifier value that was cached in PutDistributedMapCache.
In the same way, if you want to load data from external sources (where the data will be in Avro format), use a ConvertRecord processor to convert Avro to CSV and then load the data into the distributed cache. A sketch of the full flow follows.
However, it is not best practice to load huge datasets into the DistributedMapCache; for large datasets consider the LookupRecord processor instead.
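A minimal sketch of the flow described above (the regex and the cache key are assumptions; adapt them to your table layout):

SelectHiveQL            Output Format: CSV, header disabled
SplitText               Line Split Count: 1
ExtractText             added property  cache.key : ^([^,]+)   (first CSV column becomes the key)
PutDistributedMapCache  Cache Entry Identifier: ${cache.key}
                        Distributed Cache Service: DistributedMapCacheClientService

FetchDistributedMapCache then looks entries up with the same ${cache.key} expression.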

ExecuteSQL processor in NiFi returns data in Avro format

I just started working with Apache NiFi. I am trying to fetch data from Oracle and place it in HDFS, then build an external Hive table on top of it. The problem is that the ExecuteSQL processor returns data in Avro format. Is there any way I can get this data in a readable format?
Apache NiFi also has a ConvertAvroToJSON processor. That might help you get it into a readable format. We also really need to just knock out the ability for our content viewer to nicely render Avro data, which would help as well.
Thanks
joe
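A minimal sketch of the flow under the approach described above:

ExecuteSQL -> ConvertAvroToJSON -> PutHDFS

ConvertAvroToJSON emits one JSON document per Avro record (or a JSON array, depending on its container option), which is readable in the content viewer and can be handled by a JSON SerDe in the external Hive table.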
