How can we generate flow files using Java code in Apache NiFi - apache-nifi

Is there a way to generate flow files in Apache NiFi using Java code that I will invoke using ExecuteStreamCommand?

ExecuteStreamCommand starts a system command and passes the flow file to the command's STDIN, then takes the command's STDOUT and stores it as the content of the flow file.
So, in Java you have to write code that reads data from STDIN (System.in) and writes the processed data to STDOUT (System.out).
I'd also advise you to check the ExecuteScript Groovy examples, since Groovy is a Java-compatible scripting language that runs on the JVM.
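As a minimal sketch (the class name and the uppercase transform below are placeholders, not anything from NiFi itself), such a program only needs to copy STDIN to STDOUT, transforming the data along the way:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class FlowFileFilter {
    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        String line;
        while ((line = in.readLine()) != null) {
            // ExecuteStreamCommand pipes the incoming flow file content to STDIN;
            // whatever is printed to STDOUT becomes the new flow file content.
            System.out.println(line.toUpperCase());
        }
        // Keep diagnostics on STDERR so they do not end up in the flow file content.
        System.err.println("processing finished");
    }
}
```

You would then point ExecuteStreamCommand's Command Path at `java` and pass the jar (or class) in the Command Arguments.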

Related

Can I delete a file in NiFi after sending messages to Kafka?

Hi, I'm using NiFi as an ETL tool.
[image of the current process]
This is my current process. I use TailFile to detect a CSV file and then send messages to Kafka.
It works fine so far, but I want to delete the CSV file after I send its contents to Kafka.
Is there any way?
Thanks
This depends on why you are using TailFile. From the docs,
"Tails" a file, or a list of files, ingesting data from the file as it is written to the file
TailFile is used to get new lines that are added to the same file, as they are written. If you need to tail a file that is being written to, what condition determines that it is no longer being written to?
However, if you are just consuming complete files from the local file system, then you could use GetFile which gives the option to delete the file after it is consumed.
From a remote file system, you could use ListSFTP and FetchSFTP, where FetchSFTP has a Completion Strategy to move or delete the original file.

Logstash with XML file

We are using logstash to read an xml file. This xml file is generated when a Jenkins pipeline build commences and is written to with build data during the pipeline execution. We use file input mode 'read'.
CURRENT BEHAVIOR:
The xml file is created when the Jenkins pipeline starts. Logstash discovers this xml file, reads it, logs it, and does not return to the xml file again.
PROBLEM:
Logstash has read the xml file prematurely and misses all the subsequent data that is written to it.
DESIRED BEHAVIOR:
Logstash allows us to apply some condition to tell it when to read the xml file. Ideally a trigger would tell logstash the xml file is completed and ready to be read and logged.
We want this to work with file input mode 'read'. The xml file is written to for around 1.5 hours.
Is there a filter, plugin or some other functionality that will allow Logstash to return to the xml file when it is modified?

Returning an output from a JVM application to a bash script

I have a JVM app that is being run from a bash script. I would like for the app to return an output to the script, so that the script can use it as a parameter for other commands.
One suggestion I've read is to use System.out.print on the desired output. However, my application does a significant amount of logging using log4j. It also invokes other libraries which log other info as well. If my bash script tries to read from stdout, wouldn't it read all of that log output as well?
Another option I thought of is:
The script passes in a /tmp/${RANDOM}.out file-path to the application
The JVM application writes the desired output to the specified file
The script reads the value off the specified file, once the application has finished running
The above approach seems more cumbersome, and makes certain assumptions about the system's file system and write permissions. But it's the best option I can think of.
Is there a better way to do this?
This answer assumes some experience and knowledge about bash IO channels, Java out/err print, and log4j configuration files.
There are two techniques:
1. In the Java file, send the relevant output to System.err. In the bash script, capture channel 2, which is stderr. Channel 1 is stdout.
2. In the log4j config XML, use a file appender. This will send logging data to a file. Define the root logger to use this appender.
There are many subtleties that can affect these techniques. Hopefully, one of these options will suffice.
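As a rough sketch of technique 1 (the class and method names here are illustrative, not from the original post), the application keeps its log4j/console chatter on stdout and prints only the value the script needs to stderr:

```java
public class ResultEmitter {
    public static void main(String[] args) {
        String result = doWork();               // whatever the app actually computes
        System.out.println("INFO work done");   // stands in for log4j console output
        System.err.println(result);             // the only thing the script captures
    }

    private static String doWork() {
        return "42";
    }
}
```

On the bash side, something like `result=$(java -jar app.jar 2>&1 >/dev/null)` captures stderr only, i.e. just the value printed above. Technique 2 is the inverse: point the log4j root logger at a file appender so stdout carries nothing but the desired output and can be captured directly.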

Backend Java application testing using jmeter

I have a Java program that works on the backend. It's a kind of batch-processing job where I have a text file that contains messages. The Java program fetches messages from the text file and loads them into a DB or mainframe. Instead of sequential fetching we need to try parallel fetching. How can I do this through JMeter?
I tried converting the program to a JAR file and calling it through the class name.
I also tried pasting the code in, with the CSV (the text file converted to .CSV) in the argument field.
Both of these give a sampler client exception.
Can you please help me with how to proceed, or is there something we are missing, or another way to do it?
The easiest way to kick off multiple instances of your program is to run it via JMeter's OS Process Sampler, which can run arbitrary commands, print their output, and measure execution time.
If you have your program as an executable jar you can kick it off with a `java -jar` command in the sampler.
See the How to Run External Commands and Programs Locally and Remotely from JMeter article for more information on the approach.

How do I suppress the boilerplate 'WASX7209I: Connected to process...' from the wsadmin command?

I have a wsadmin script that needs to produce a very specific output format that will be consumed by a third-party monitoring tool. I've written my Jython script to produce the correct output except that wsadmin always seems to spit out this boilerplate at the beginning:
WASX7209I: Connected to process "dmgr" on node [node] using SOAP connector; The type of process is: DeploymentManager
Is there a way to suppress this output or will I need to do some post processing to strip off this superfluous info?
I'm not aware of any way to suppress that output from being generated. I think you're going to have to strip it out post execution if your consuming system can't handle it...
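If post-processing turns out to be necessary, one minimal sketch (assuming the boilerplate is just the WASX7209I banner shown above) is to pipe the wsadmin output through a small filter before the monitoring tool reads it:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class StripWsadminBanner {
    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        String line;
        while ((line = in.readLine()) != null) {
            // WASX7209I is the "Connected to process ..." informational message.
            if (line.startsWith("WASX7209I")) {
                continue;
            }
            System.out.println(line);
        }
    }
}
```

An equivalent `grep -v '^WASX7209I'` in the calling script would do the same job with less ceremony.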
