NiFi path is invalid - apache-nifi

I am running a NiFi server. I have a ListFile processor working perfectly OK with this path: /tmp/nifi/info/daily. There I can run the processor without any issue.
For a specific reason I had to create another ListFile, whose input directory is /tmp/nifi/info/last_month. When I add this second path, NiFi says the path doesn't exist.
I checked the permissions with ls -l; they are exactly the same, with the same user and group, so I'm confused:
drwxr-xr-x. 2 nifi hadoop
I even tried restarting NiFi to see if that was the problem, but it wasn't. Is there any way I can test (other than trying input paths in the config) which access NiFi has? Why doesn't it see the folder?
Thanks.

As Ben Yaakobi mentioned, I had forgotten to create the folder on every node.
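The fix, then, is to create the same directory with the same ownership on every node running NiFi. A minimal shell sketch of that step (the hostnames are placeholders, not from the question):
# placeholder hostnames -- replace with the actual NiFi nodes
for host in nifi-node1 nifi-node2 nifi-node3; do
  ssh "$host" 'sudo mkdir -p /tmp/nifi/info/last_month && sudo chown nifi:hadoop /tmp/nifi/info/last_month'
done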

Related

How to run spark-jobs outside the bin folder of spark-2.1.1-bin-hadoop2.7

I have an existing spark-job. Its functionality is to connect to a Kafka server, get the data and store it into Cassandra tables. Right now this spark-job runs on the server from inside spark-2.1.1-bin-hadoop2.7/bin, but whenever I try to run it from another location it doesn't run. This spark-job contains some JavaRDD-related code.
Is there any chance I can run this spark-job from outside that folder as well, for example by adding a dependency in the pom or something else?
whenever I try to run it from another location it doesn't run
spark-job is a custom launcher script for a Spark application, perhaps with some additional command-line options and packages. Open it, review the content and fix the issue.
If it's too hard to figure out what spark-job does and there's no one nearby to help you out, it's likely time to throw it away and replace it with the good ol' spark-submit.
Why don't you use it in the first place?!
Read up on spark-submit in Submitting Applications.
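For reference, a rough sketch of what launching such a job directly with spark-submit might look like; the main class, master URL and jar path below are assumptions, not details from the question:
# assumed main class, master and jar path -- adjust to your setup;
# Kafka and Cassandra dependencies must be on the classpath,
# e.g. shaded into the jar or added with --packages/--jars
spark-submit \
  --class com.example.KafkaToCassandraJob \
  --master "local[2]" \
  /path/to/my-spark-job.jar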

How to use OozieClient.doAs

I'm trying to start an Oozie workflow from a web service. One of the actions should delete and create some folders. Concretely, I want to empty a folder before starting a Java action (which actually serves as a driver for a MapReduce job). I know there is a "prepare" part of the Java action where you can specify paths to delete, but I need to delete all the files in a folder while keeping the folder itself. That's why I'm using an fs action to delete the folder and then recreate it.
The problem is that when I run this using oozieClient.run, I get an exception saying there is a problem with permissions, since I'm running the workflow as the root user.
I found that I can use OozieClient.doAs to impersonate a specific user, but I'm not able to use it for some reason; I get an internal Oozie exception.
Can anyone show me how to run a workflow as a specific user, or at least point me to a good example?
One way to run an Oozie job as a specific user is to secure Oozie using Kerberos with Active Directory. That way you can authenticate as any user from Active Directory and run the Oozie job as the authenticated user.
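In practice that means obtaining a ticket for the desired user and submitting the job with Kerberos authentication. A rough shell sketch, assuming the Oozie CLI is available and the workflow configuration sits in job.properties; the principal, realm and Oozie URL are placeholders:
# placeholder principal and Oozie URL
kinit someuser@EXAMPLE.COM
oozie job -oozie http://oozie-host:11000/oozie -auth KERBEROS -config job.properties -run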

Elasticsearch - how to store scripts in config/scripts directory

I'm trying to experiment with using scripts in the config/scripts directory. The Elasticsearch docs say this:
Save the contents of the script as a file called config/scripts/my_script.groovy on every data node in the cluster:
This seems like it's probably really easy, but I'm afraid I don't understand how exactly to put a Groovy file "on every data node in the cluster". Would this normally be done through the command line somehow, or can it be done by manually moving the Groovy file (in Finder on OS X, for example)? I have a test index, but when I look at the file structure on the nodes I'm confused about where to put the Groovy file. Help, pretty please.
You just need to copy the file to each server running Elasticsearch. If you're just running Elasticsearch on your computer, go to the folder you've installed Elasticsearch into and copy the file into config/scripts there (you may have to create the folder first). It doesn't matter how the file gets there.
You should see an entry in the logs (or the console, if you are running in the foreground) along the lines of
compiling script file [/path/to/elasticsearch/config/scripts/my_script.groovy]
This won't show up straight away: by default Elasticsearch checks for new/updated scripts every 60 seconds (you can change this with the watcher.interval setting).
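If the nodes are remote machines, one straightforward way to get the file onto each of them is scp. A minimal sketch, with placeholder hostnames and install path:
# placeholder hostnames and Elasticsearch install path
for host in es-node1 es-node2 es-node3; do
  ssh "$host" 'mkdir -p /path/to/elasticsearch/config/scripts'
  scp my_script.groovy "$host":/path/to/elasticsearch/config/scripts/
done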
Since file scripts are deprecated (elastic/elasticsearch#24552 & elastic/elasticsearch#24555), this approach is not going to work anymore.
The API is now the only way.
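For reference, a hedged sketch of storing a script through the API instead; the exact request body differs between Elasticsearch versions, and the script id and content here are made up:
# hypothetical script id and body; the "lang"/"source" fields vary by ES version
curl -XPOST "http://localhost:9200/_scripts/calculate_score" \
  -H 'Content-Type: application/json' \
  -d '{"script": {"lang": "painless", "source": "Math.log(_score * 2)"}}'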

HDInsight Hive not finding SerDe jar in ADD JAR statement

I've uploaded json-serde-1.1.9.2.jar to the blob store with path "/lib/" and added
ADD JAR /lib/json-serde-1.1.9.2.jar
But I am getting
/lib/json-serde-1.1.9.2.jar does not exist
I've tried it without the path and also provided the full URL to the ADD JAR statement, with the same result.
Would really appreciate some help on this, thanks!
If you don't include the scheme, then Hive is going to look on the local filesystem (you can see the code around line 768 of the source)
When you include the URI, make sure you use the full form:
ADD JAR wasb:///lib/json-serde-1.1.9.2.jar
If that still doesn't work, provide your updated command as well as some details about how you are launching the code. Are you RDP'd into the cluster and running via the Hive shell, or running remotely via PowerShell or some other API?
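Two quick checks, sketched for the case where you are logged in on a cluster node; the jar path is taken from the question, everything else is an assumption:
# 1) confirm the jar is actually in the default container at that path
hdfs dfs -ls wasb:///lib/json-serde-1.1.9.2.jar
# 2) add the jar from the Hive CLI using the full wasb URI and verify it registered
hive -e 'ADD JAR wasb:///lib/json-serde-1.1.9.2.jar; LIST JARS;'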

Check directory of Oracle logs

I'm using the check_logfiles nagios plugin to monitor Oracle alert logs. It works wonderfully for that purpose.
However, I also need to monitor an entire directory of Oracle trace logs for errors, because the Oracle database is always creating new log files with different names.
What I need to know is the best way to scan an entire directory of Oracle trace logs to find out which ones match the patterns that indicate Oracle alerts.
Using check_logfiles I tried specifying these options:
--criticalpattern='ORA-00600|ORA-00060|ORA-07445|ORA-04031|Shutting down instance'
and, to specify the directory of logs:
--logfile='/global/cms/u01/app/orahb/admin/opbhb/udump/'
and
--logfile="/global/cms/u01/app/orahb/admin/opbhb/udump/*"
Neither of which has any effect: the check runs but returns OK. Does anyone know whether the check_logfiles Nagios plugin can monitor a directory of files rather than just a single file? Or perhaps there is another, better way to achieve the same goal of monitoring a bunch of files that can't be specified ahead of time?
Use a script which (see the sketch after this list):
Opens each file
Copies entries which match the pattern
Outputs the matches to a file
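A minimal shell sketch of that idea, using the directory and patterns from the question; the .trc extension, the output file and the Nagios-style exit codes are assumptions:
#!/bin/bash
# scan every trace file in the udump directory for critical ORA- patterns
LOGDIR=/global/cms/u01/app/orahb/admin/opbhb/udump
PATTERN='ORA-00600|ORA-00060|ORA-07445|ORA-04031|Shutting down instance'
OUTFILE=/tmp/oracle_trace_matches.txt   # assumed output file
grep -E -H "$PATTERN" "$LOGDIR"/*.trc > "$OUTFILE" 2>/dev/null
if [ -s "$OUTFILE" ]; then
  echo "CRITICAL: $(wc -l < "$OUTFILE") matching lines found in $LOGDIR"
  exit 2   # Nagios critical
else
  echo "OK: no critical patterns found in $LOGDIR"
  exit 0
fi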
