MissingArgumentException while configuring Flume - hadoop

I installed Flume
and tried to run this command
flume-ng agent -n $agent_name -c conf -f /home/gautham/Downloads/apache-flume-1.5.0.1-bin/conf/flume-conf.properties.template
and I get this exception
ERROR node.Application: A fatal error occurred while running. Exception follows.
org.apache.commons.cli.MissingArgumentException: Missing argument for option: n
at org.apache.commons.cli.Parser.processArgs(Parser.java:343)
at org.apache.commons.cli.Parser.processOption(Parser.java:393)
at org.apache.commons.cli.Parser.parse(Parser.java:199)
at org.apache.commons.cli.Parser.parse(Parser.java:85)
at org.apache.flume.node.Application.main(Application.java:252)

Check that you have named your flume file with a .conf extension.
Then try the command below:
$ flume-ng agent \
--conf-file PathOfYourFlumeFile \
--name agentNameInFlumeFile \
--conf $FLUME_HOME/conf
Replace $agent_name with the name of the agent you have used in your flume file.
You have to give the path of your flume file with the .conf extension instead of /home/gautham/Downloads/apache-flume-1.5.0.1-bin/conf/flume-conf.properties.template

Instead of $agent_name use the actual name of the agent from your conf file.
I suspect that you do not have an $agent_name environment variable, so it is being replaced with an empty string.
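For a rough illustration (the file name example.conf and the agent name a1 below are made up, not from the original post), the value passed to -n/--name has to match the prefix used in the configuration file:
# example.conf (hypothetical) defines an agent named "a1":
#   a1.sources = r1
#   a1.channels = c1
#   a1.sinks = k1
echo "agent_name is: '$agent_name'"    # an empty string here means -n receives no value
flume-ng agent \
  --conf $FLUME_HOME/conf \
  --conf-file $FLUME_HOME/conf/example.conf \
  --name a1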

I had a similar issue. I later found out that after I re-typed all the hyphens (-), it started working. Most likely, when I copied the command, the ASCII hyphens had been replaced with look-alike dash/minus characters.
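If you suspect the same problem, a quick way to check (a sketch; --color and -P need GNU grep) is to paste the command into a file and highlight any non-ASCII characters, which is where a copied en dash or minus sign would show up:
# paste the copied command into cmd.txt first
grep --color=always -nP '[^\x00-\x7F]' cmd.txt   # no output means every character is plain ASCII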

Related

Logstash cannot start because of multiple instances even though there are no instances of it running

I keep getting this error [2019-02-26T16:50:41,329][FATAL][logstash.runner ] Logstash could not be started because there is already another instance using the configured data directory. If you wish to run multiple instances, you must change the "path.data" setting.
when I launch logstash. I am using the CLI to launch logstash. The command that I execute is:
screen -d -S logstash -m bash -c "cd;export JAVA_HOME=/nastools/jdk1.8.0_77/; export LS_JAVA_OPTS=-Djava.net.preferIPv4Stack=true; ~/monitoring/6.2.3/bin/logstash-6.2.3/bin/logstash -f ~/monitoring/6.2.3/config/logstash_forwarder/forwarder.conf"
I don't have any instance of logstash running. I tried running this:
ps xt | grep "logstash" and it didn't return any process. I tried killall logstash as well, but to no avail; it gives me the same error. I tried restarting my machine as well, but I still get the same error.
Has anyone experienced something similar? Kibana and Elasticsearch launch just fine.
Thanks in advance for your help!
The problem is solved now. I had to empty the contents of the Logstash data directory. I then restarted it and it generated the UUID and other files it needed.
To be more specific, you need to cd to the data folder of logstash (usually it is /usr/share/logstash/data) and delete the .lock file.
You can check whether this file exists with:
ls -lah
in the data folder.
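For example (assuming the default data directory; adjust the path if your path.data setting points elsewhere):
cd /usr/share/logstash/data
ls -lah                 # look for a file named .lock
sudo rm .lock           # remove the stale lock file, then start Logstash again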
I learned this from http://www.programmersought.com/article/2009814657/
Try this command; I hope it will work (but please check the .conf file path):
sudo /usr/share/logstash/bin/logstash --path.settings /etc/logstash/ --path.data sensor39 -f /etc/logstash/conf.d/company_dump.conf --config.reload.automatic

Script to back up namespace, deployment etc. from Kubernetes

I'm looking for a bash script that could back up all the Kubernetes objects in a YAML format (or JSON, that's good too :))
I already back up the Kubernetes conf files:
/etc/kubernetes
/etc/systemd/system/kubelet.service.d
etc...
Now I'm just looking to save the
namespaces
deployment
etc...
You can dump your entire cluster info into one file using:
kubectl cluster-info dump > cluster_dump.txt
The above command will dump all the YAML and container logs into one file.
Or, if you just want the YAML files, you can write a script around a few commands such as:
kubectl get deployment -o yaml > deployment.yaml
kubectl get statefulset -o yaml > statefulset.yaml
kubectl get daemonset -o yaml > daemonset.yaml
Then you also have to keep the namespace in mind while creating the script. This should give you a fair idea of what to do; a rough sketch follows.
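A minimal sketch along those lines (the resource types and output layout are just assumptions; adjust them to what you actually need to back up):
#!/usr/bin/env bash
# Dump selected resource types from every namespace as YAML, one file per type.
set -euo pipefail
backup_dir="./k8s-backup/$(date +%Y%m%d)"
for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
  mkdir -p "$backup_dir/$ns"
  for kind in deployment statefulset daemonset service configmap; do
    kubectl get "$kind" -n "$ns" -o yaml > "$backup_dir/$ns/$kind.yaml"
  done
done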
Try the command below. You can include all the namespaces that you want to back up:
mkdir -p /root/k8s-backup
kubectl cluster-info dump --namespaces default,kube-system --output-directory=/root/k8s-backup

Hive disable history logs and query logs

We are using hive on our production machines but it generates a lot of job logs in the /tmp/<user.name>/ directory. We would like to disable this logging as we don't need it, but can't find any option to disable it. Some of the answers we checked required us to modify the hive-log4j.properties file, but the only file available in /usr/lib/hive/conf is hive-site.xml.
While starting hive it gives the following information:
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-0.10.0-cdh4.7.0.jar!/hive-log4j.properties
Hive history file=/tmp/adqops/hive_job_log_79c7f1c2-b4e5-4b7b-b2d3-72b032697bb5_1000036406.txt
So it seems that hive-log4j.properties file is included in a jar and we can't modify it.
Hive Version: hive-hwi-0.10.0-cdh4.7.0.jar
Any help/solution is greatly appreciated.
Thanks.
Since Hive expects a custom properties file name, I guess you cannot use the usual trick of setting -Dlog4j.configuration=my_custom_log4j.properties on the command-line.
So I fear you would have to edit hive-common-xxx.jar with some ZIP utility to:
extract the default props file into /etc/hive/conf/ (or any other directory that will be at the head of the CLASSPATH)
delete the file from the JAR
edit the extracted file
Ex:
$ unzip -l /blah/blah/blah/hive-common-*.jar | grep 'log4j\.prop'
3505 12-02-2015 10:31 hive-log4j.properties
$ unzip /blah/blah/blah/hive-common-*.jar hive-log4j.properties -d /etc/hive/conf/
Archive: /blah/blah/blah/hive-common-1.1.0-cdh5.5.1.jar
inflating: /etc/hive/conf/hive-log4j.properties
$ zip -d /blah/blah/blah/hive-common-*.jar hive-log4j.properties
deleting: hive-log4j.properties
$ vi /etc/hive/conf/hive-log4j.properties
NB: proceed at your own risk... 0:-)
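Once the file is extracted, a typical edit (a sketch only; the property name and the DRFA appender are taken from a stock hive-log4j.properties and may differ in your build) is to raise the root logger threshold so only fatal errors are written:
cp /etc/hive/conf/hive-log4j.properties /etc/hive/conf/hive-log4j.properties.bak
sed -i 's/^hive.root.logger=.*/hive.root.logger=FATAL,DRFA/' /etc/hive/conf/hive-log4j.properties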
Effectively set the logging level to FATAL
hive --hiveconf hive.root.logger=DRFA --hiveconf hive.log.level=FATAL -e "<query>"
OR redirect logs into another directory and just purge the dir
hive --hiveconf hive.root.logger=DRFA --hiveconf hive.log.dir=./logs --hiveconf hive.log.level=DEBUG -e "<query>"
It will create a log file in the logs folder. Make sure that the logs folder exists in the current directory.

File not found exception while starting Flume agent

I have installed Flume for the first time. I am using hadoop-1.2.1 and flume 1.6.0
I tried setting up a flume agent by following this guide.
I executed this command : $ bin/flume-ng agent -n $agent_name -c conf -f conf/flume-conf.properties.template
It says log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: ./logs/flume.log (No such file or directory)
Isn't the flume.log file generated automatically? If not, how can I rectify this error ?
Try this:
mkdir -p ./logs
sudo chown `whoami` ./logs
bin/flume-ng agent -n $agent_name -c conf -f conf/flume-conf.properties.template
The first line creates the logs directory in the current directory if it does not already exist. The second one sets the owner of that directory to the current user (you) so that flume-ng running as your user can write to it.
Finally, please note that this is not the recommended way to run Flume, just a quick hack to try it.
You are probably getting this error because you are running the command directly from the console; first go to Flume's bin directory and run your command from there.
As @Botond says, you need to set the right permissions.
However, if you run Flume within a program, like supervisor or with a custom script, you might want to change the default path, as it's relative to the launcher.
This path is defined in your /path/to/apache-flume-1.6.0-bin/conf/log4j.properties. There you can change the line
flume.log.dir=./logs
to an absolute path of your choice - you still need the right permissions there, though.
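For example (a sketch; /var/log/flume is an arbitrary choice, and the directory must exist and be writable by the user running Flume):
sudo mkdir -p /var/log/flume
sudo chown `whoami` /var/log/flume
sed -i 's|^flume.log.dir=.*|flume.log.dir=/var/log/flume|' /path/to/apache-flume-1.6.0-bin/conf/log4j.properties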

Error: Could not find or load main class org.apache.flume.node.Application - Install flume on hadoop version 1.2.1

I have built a Hadoop cluster in which one node acts as both master and slave and the other is a slave. Now I want to set up Flume on the master machine to collect all the logs of the cluster. However, when I try to install Flume from the tarball, I always get:
Error: Could not find or load main class org.apache.flume.node.Application
So please help me find the answer, or the best way to install Flume on my cluster.
Many thanks!
It is basically because of FLUME_HOME.
Try this command
$ unset FLUME_HOME
I know it's been almost a year since this question was asked, but I saw it!
When you start your agent using sudo bin/flume-ng ..., make sure to specify the file where the agent configuration is:
--conf-file flume_Agent.conf -> -f conf/flume_Agent.conf
This did the trick!
It looks like you are running flume-ng from the /bin folder of the source tree. After a build, Flume ends up in
/flume-ng-dist/target/apache-flume-1.5.0.1-bin/apache-flume-1.5.0.1-bin
Run flume-ng from there instead.
I suppose you are trying to run Flume from Cygwin on Windows? If that is the case, I had a similar issue. The problem might be with the flume-ng script.
Find the following line in bin/flume-ng:
$EXEC java $JAVA_OPTS $FLUME_JAVA_OPTS "${arr_java_props[@]}" -cp "$FLUME_CLASSPATH" \
-Djava.library.path=$FLUME_JAVA_LIBRARY_PATH "$FLUME_APPLICATION_CLASS" $*
and replace it with this
$EXEC java $JAVA_OPTS $FLUME_JAVA_OPTS "${arr_java_props[@]}" -cp `cygpath -wp "$FLUME_CLASSPATH"` \
-Djava.library.path=`cygpath -wp $FLUME_JAVA_LIBRARY_PATH` "$FLUME_APPLICATION_CLASS" $*
Notice that the paths have been converted to Windows directories. Java cannot resolve the cygdrive paths, so they have to be converted to the correct Windows paths wherever applicable.
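For illustration (the paths below are made up), cygpath -wp turns a POSIX-style path list into a Windows-style one, which is what Java on Windows expects:
$ cygpath -wp "/cygdrive/c/flume/conf:/cygdrive/c/flume/lib/*"
C:\flume\conf;C:\flume\lib\*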
Maybe you are using the source files. You should first compile the source code to produce the binary distribution, and then, inside the binary files directory, you can execute: bin/flume-ng agent --conf ./conf/ -f conf/flume.conf -Dflume.root.logger=DEBUG,console -n agent1. You can follow all of this here: https://cwiki.apache.org/confluence/display/FLUME/Getting+Started
I got the same issue before; it was simply because FLUME_CLASSPATH was not set.
The best way to debug is to look at the java command being fired and make sure that the Flume lib directory is included in the classpath (-cp).
In the following command it is looking for /lib/*, which is where the flume-ng-*.jar files should be, but that is incorrect because there is nothing in /lib - see -cp '/staging001/Flume/server/conf://lib/*:/lib/*'. It has to be ${FLUME_HOME}/lib.
/usr/lib/jvm/java-1.8.0-ibm-1.8.0.3.20-1jpp.1.el7_2.x86_64/jre/bin/java -Xms100m -Xmx500m $'-Dcom.sun.management.jmxremote\r' \
-Dflume.monitoring.type=http \
-Dflume.monitoring.port=34545 \
-cp '/staging001/Flume/server/conf://lib/*:/lib/*' \
-Djava.library.path= org.apache.flume.node.Application \
-f /staging001/Flume/server/conf/flume.conf -n client
So, if you look at the flume-ng script, there is a FLUME_CLASSPATH setup; if it is absent, it is set up based on FLUME_HOME.
# prepend $FLUME_HOME/lib jars to the specified classpath (if any)
if [ -n "${FLUME_CLASSPATH}" ] ; then
  FLUME_CLASSPATH="${FLUME_HOME}/lib/*:$FLUME_CLASSPATH"
else
  FLUME_CLASSPATH="${FLUME_HOME}/lib/*"
fi
So make sure either of those environment variables is set. With FLUME_HOME set (I'm using systemd):
Environment=FLUME_HOME=/staging001/Flume/server/
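If you are not running under systemd, exporting the variable in the shell before calling flume-ng should achieve the same thing (the path is just the one from this example):
export FLUME_HOME=/staging001/Flume/server/
$FLUME_HOME/bin/flume-ng agent -f $FLUME_HOME/conf/flume.conf -n client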
Here's the working java exec.
/usr/lib/jvm/java-1.8.0-ibm-1.8.0.3.20-1jpp.1.el7_2.x86_64/jre/bin/java -Xms100m -Xmx500m \
$'-Dcom.sun.management.jmxremote\r' \
-Dflume.monitoring.type=http \
-Dflume.monitoring.port=34545 \
-cp '/staging001/Flume/server/conf:/staging001/Flume/server/lib/*:/lib/*' \
-Djava.library.path= org.apache.flume.node.Application \
-f /staging001/Flume/server/conf/flume.conf -n client
