How can I get job configuration in command line? - hadoop

I can get the running apps with yarn application -list -appStates RUNNING and then pick one applicationId from the list.
Then I can get the status of the app with: yarn application -status <applicationId>
I want to get the job configuration information on the command line. Is that possible?

That's not "Job Configuration". It is whole cluster config.
You can use cURL to parse it
$ curl -s http://localhost:8088/conf | grep defaultFS
<property><name>fs.defaultFS</name><value>file:///</value><final>false</final><source>core-default.xml</source></property>
<property><name>mapreduce.job.hdfs-servers</name><value>${fs.defaultFS}</value><final>false</final><source>mapred-default.xml</source></property>
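If what you actually want is the configuration of a single MapReduce job rather than the cluster config, the MapReduce Application Master and JobHistory REST APIs expose a per-job conf endpoint. A hedged sketch, assuming default web ports, with <application_id> and <job_id> as placeholders for the IDs you got from yarn application -list:
# configuration of a running job, proxied through the ResourceManager web UI
curl -s http://localhost:8088/proxy/<application_id>/ws/v1/mapreduce/jobs/<job_id>/conf
# configuration of a finished job, from the JobHistory server (default port 19888)
curl -s http://localhost:19888/ws/v1/history/mapreduce/jobs/<job_id>/conf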

Related

stdin is not a tty when trying to run multiple commands using the '&' operator in a shell script

I'm working on a microservice project and need to run all the services at once, so I set up a bash script, but it throws a "stdin is not a tty" error and only runs the last command:
yarn --cwd /d/offic_work/server/customer/ start:dev &
yarn --cwd /d/offic_work/server/admin start:dev &
yarn --cwd /d/offic_work/server/orders/ start:dev &
yarn --cwd /d/offic_work/server/product start:dev
I'm not particularly familiar with yarn, but "is not a tty" means the program is seeking input from the user and can't get any because you ran it in the background. So what you need to do is run it in the foreground, find out what input it is seeking, and then figure out what command-line arguments or other configuration will let it run without user intervention. Once you know that, you can alter your script or take appropriate action so that it can run in the background.
In some cases, programs expect a user confirmation "y" to some question. That's why in unix the "yes" command exists, which outputs endless "y"s for such a purpose. You could also try piping yes to your command:
yes | yarn --cwd /d/offic_work/server/customer/ start:dev &
First
Create a list of your services, e.g. list.txt:
customer
admin
orders
product
Second
Run them in parallel with xargs -P 0:
# dry-run - test
xargs -I SERVICE -P 0 echo "yarn --cwd /d/offic_work/server/SERVICE start:dev" < list.txt
# run - remove the echo
xargs -I SERVICE -P 0 yarn --cwd /d/offic_work/server/SERVICE start:dev < list.txt
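If you'd rather stay in plain bash, a minimal alternative sketch is a loop over the same service names plus wait; the < /dev/null redirect is only there so a background service can't hang waiting on stdin:
for svc in customer admin orders product; do
  yarn --cwd "/d/offic_work/server/$svc" start:dev < /dev/null &
done
wait   # keep the script alive until every background service exits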

How to execute gcloud command in bash script from crontab -e

I am trying to execute some gcloud commands in a bash script from crontab. The script executes successfully from the command shell but not from the cron job.
I have tried:
Setting the full path to gcloud, like:
/etc/bash_completion.d/gcloud
/home/Arturo/.config/gcloud
/usr/bin/gcloud
/usr/lib/google-cloud-sdk/bin/gcloud
Setting at the beginning of the script:
/bin/bash -l
Setting in the crontab:
51 21 30 5 6 CLOUDSDK_PYTHON=/usr/bin/python2.7; /home/myuser/folder1/myscript.sh param1 param2 param3 -f >> /home/myuser/folder1/mylog.txt
Setting inside the script:
export CLOUDSDK_PYTHON=/usr/bin/python2.7
Setting inside the script:
sudo ln -s /home/myuser/google-cloud-sdk/bin/gcloud /usr/bin/gcloud
Ubuntu version: 18.04.3 LTS
Command to execute: gcloud config set project myproject
But nothing is working; maybe I am doing something wrong. I hope you can help me.
You need to set your user in your crontab for it to run the gcloud command. As explained in this other post, you need to modify your crontab so that it can find the data in your Cloud SDK for the execution to occur properly - it doesn't seem that you have made this configuration.
Another option I would recommend you try is using Cloud Scheduler to run your gcloud commands. This way, you can use gcloud for your cron jobs in a more integrated and easy way. You can find more information about this option here: Creating and configuring cron jobs
Let me know if the information helped you!
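For illustration, a minimal sketch of such a crontab, assuming the SDK lives under /home/myuser/google-cloud-sdk as in the symlink attempt above (adjust the PATH to your installation):
# let cron find gcloud, then run the script and also capture stderr in the log
PATH=/usr/local/bin:/usr/bin:/bin:/home/myuser/google-cloud-sdk/bin
51 21 30 5 6 /home/myuser/folder1/myscript.sh param1 param2 param3 -f >> /home/myuser/folder1/mylog.txt 2>&1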
I found my error. The problem was only with the command "gcloud dns record-sets transaction start"; the other commands were executing successfully but logging nothing, which made me think they were not executing. This command creates a temp file, e.g. transaction.yaml, and that file could not be created in the default path for gcloud (snap/bin), but the log simply didn't write anything! I had to specify the path and name for that file with the flag --transaction-file=mytransaction.yaml. Thanks for your support and ideas.
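For illustration, the fixed invocation might look like the line below; myzone and the file path are hypothetical placeholders, only the --transaction-file flag comes from the answer above:
gcloud dns record-sets transaction start --zone=myzone --transaction-file=/home/myuser/mytransaction.yaml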
I have run into the same issue before. I fixed it by forcing the profile to load in my script.sh, loading the gcloud environment variables with it. Example below:
#!/bin/bash
source /etc/profile
gcloud config set project myprojectecho
echo "Project set to myprojectecho."
I hope this can help others in the future with similar issues, as this also helped me when trying to set GKE nodes from 0-4 on a schedule.
Adding the line below to the shell script fixed my issue:
#Execute user profile
source /root/.bash_profile

Logstash cannot start because of multiple instances even though there are no instances of it running

I keep getting this error when I launch Logstash:
[2019-02-26T16:50:41,329][FATAL][logstash.runner ] Logstash could not be started because there is already another instance using the configured data directory. If you wish to run multiple instances, you must change the "path.data" setting.
I am using the CLI to launch Logstash. The command that I execute is:
screen -d -S logstash -m bash -c "cd;export JAVA_HOME=/nastools/jdk1.8.0_77/; export LS_JAVA_OPTS=-Djava.net.preferIPv4Stack=true; ~/monitoring/6.2.3/bin/logstash-6.2.3/bin/logstash -f ~/monitoring/6.2.3/config/logstash_forwarder/forwarder.conf"
I don't have any instance of Logstash running. I tried running ps xt | grep "logstash" and it didn't return any process. I tried killall logstash as well, but to no avail; it gives me the same error. I tried restarting my machine as well, but I still get the same error.
Has anyone experienced something similar? Kibana and elastic search launch just fine.
Thanks in advance for your help!
The problem is solved now. I had to empty the contents of Logstash's data directory. I then restarted it and it generated the UUID and the other files it needed.
To be more specific, you need to cd into Logstash's data folder (usually /usr/share/logstash/data) and delete the .lock file.
You can check whether this file exists with:
ls -lah
in the data folder.
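As a concrete sketch, assuming the default install location mentioned above (adjust the path for your setup):
cd /usr/share/logstash/data   # default data directory
ls -lah                       # the hidden .lock file shows up here
rm .lock                      # remove the stale lock, then start Logstash again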
Learned it from http://www.programmersought.com/article/2009814657/
sudo /usr/share/logstash/bin/logstash --path.settings /etc/logstash/ --path.data sensor39 -f /etc/logstash/conf.d/company_dump.conf --config.reload.automatic
Try this command; I hope it will work (but please check the .conf file path).

Can't view any 'dependencies' inside zipkin UI dependencies tab

I have several services interacting with each other, all of them sending traces to openzipkin (https://github.com/openzipkin/docker-zipkin).
While I can see the system behaviour in detail, it looks like the 'dependencies' tab does not display anything at all.
The trace I check has 6 services, 21 spans and 43 spans, and I believe something should appear.
I'm using the latest (1.40.1) docker-zipkin, with Cassandra as storage, and just connecting to the Cassandra instance I can see there's no entry in the dependencies table. Why?
Thanks
Same problem here with the Docker images using Cassandra (1.40.1, 1.40.2, 1.1.4).
This is a problem specific to using Cassandra as the storage tier. MySQL and the in-memory storage generate the dependency graph on demand as expected.
There are references to the following project for generating the Cassandra graph data for the UI to display:
https://github.com/openzipkin/zipkin-dependencies-spark
This looks to be superseded by the ongoing work mentioned here:
https://github.com/openzipkin/zipkin-dependencies-spark/issues/22
If the storage type is anything other than in-memory storage, then for the Zipkin dependencies graph you have to start a separate cron job/scheduler that reads the storage database and builds the graph, because zipkin-dependencies is a separate Spark job.
For reference : https://github.com/openzipkin/docker-zipkin-dependencies
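Since the question itself uses Cassandra, for comparison a one-off run against Cassandra might look like the line below; the variable names follow the zipkin-dependencies README, the address is a placeholder, and whether you need STORAGE_TYPE=cassandra or cassandra3 depends on the schema your Zipkin version uses:
STORAGE_TYPE=cassandra3 CASSANDRA_CONTACT_POINTS=127.0.0.1 java -jar zipkin-dependencies.jar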
I have used Zipkin with Elasticsearch as the storage type. Here are the steps for setting up zipkin-dependencies with Elasticsearch and a cron job for it:
1. cd ~/
2. curl -sSL https://zipkin.io/quickstart.sh | bash -s io.zipkin.dependencies:zipkin-dependencies:LATEST zipkin-dependencies.jar
3. touch cron.sh (or vi cron.sh)
4. Paste this content:
STORAGE_TYPE=elasticsearch ES_HOSTS=https://172.0.0.1:9200 ES_HTTP_LOGGING=BASIC ES_NODES_WAN_ONLY=true java -jar zipkin-dependencies.jar
5. chmod a+x cron.sh (make the file executable)
6. crontab -e
A window will open; paste the content below:
0 * * * * cd && ./cron.sh
This runs the job every hour; if you need it every 5 minutes, change the entry to '*/5 * * * * cd && ./cron.sh'.
7. To check that the cron job is scheduled, run crontab -l
Another solution is to start a separate service and run the cron job using Docker.
To get the latest zipkin-dependencies jar, run the following commands in a terminal:
cd /zipkindependencies  # the directory where your Dockerfile lives
curl -sSL https://zipkin.io/quickstart.sh | bash -s io.zipkin.dependencies:zipkin-dependencies:LATEST
The jar file will be downloaded into that directory.
Dockerfile
FROM openjdk:8-jre-alpine
ENV STORAGE_TYPE=elasticsearch
ENV ES_HOSTS=http://172.0.0.1:9200
ENV ES_NODES_WAN_ONLY=true
ADD crontab.txt /crontab.txt
ADD script.sh /script.sh
COPY entry.sh /entry.sh
COPY zipkin-dependencies.jar /
RUN chmod a+x /script.sh /entry.sh
RUN /usr/bin/crontab /crontab.txt
CMD ["/entry.sh"]
EXPOSE 8080
entry.sh
#!/bin/sh
# start cron
/usr/sbin/crond -f -l 8
script.sh
#!/bin/sh
java ${JAVA_OPTS} -jar /zipkin-dependencies.jar
crontab.txt
0 * * * * /script.sh >> /var/log/script.log
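To tie the pieces together, the image could then be built and started roughly like this (the image and container names are just placeholders):
docker build -t zipkin-dependencies-cron /zipkindependencies
docker run -d --name zipkin-dependencies-cron zipkin-dependencies-cron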

File not found exception while starting Flume agent

I have installed Flume for the first time. I am using hadoop-1.2.1 and flume 1.6.0
I tried setting up a flume agent by following this guide.
I executed this command : $ bin/flume-ng agent -n $agent_name -c conf -f conf/flume-conf.properties.template
It says log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: ./logs/flume.log (No such file or directory)
Isn't the flume.log file generated automatically? If not, how can I rectify this error?
Try this:
mkdir ./logs
sudo chown `whoami` ./logs
bin/flume-ng agent -n $agent_name -c conf -f conf/flume-conf.properties.template
The first line creates the logs directory in the current directory if it does not already exist. The second one sets the owner of that directory to the current user (you) so that flume-ng running as your user can write to it.
Finally, please note that this is not the recommended way to run Flume, just a quick hack to try it.
You are probably getting this error because you are running the command directly from the console; you have to go to Flume's bin directory first and run your command from there.
As @Botond says, you need to set the right permissions.
However, if you run Flume within a program, like supervisor or with a custom script, you might want to change the default path, as it's relative to the launcher.
This path is defined in your /path/to/apache-flume-1.6.0-bin/conf/log4j.properties. There you can change the line
flume.log.dir=./logs
to an absolute path of your choice - you still need the right permissions, though.
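For instance, a minimal sketch assuming you pick /var/log/flume as the new location (any writable absolute path works):
sudo mkdir -p /var/log/flume
sudo chown `whoami` /var/log/flume
# then in conf/log4j.properties:
flume.log.dir=/var/log/flume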
