NodeManager not started in Hadoop Yarn - hadoop

I have set up Hadoop and YARN in standalone mode for now.
I am trying to start all the YARN processes. All of them start except the NodeManager, which throws a JVM error every time.
[root@ip-10-100-223-16 hadoop-0.23.7]# sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /root/hadoop-0.23.7/logs/yarn-root-nodemanager-ip-10-100-223-16.out
Unrecognized option: -jvm
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
What could be the issue? Any help is appreciated.

The following link has a patch for the above issue: https://issues.apache.org/jira/browse/MAPREDUCE-3879
In the bin/yarn script, we need to change the following lines. Here:
'-' marks lines to remove
'+' marks lines to add
elif [ "$COMMAND" = "nodemanager" ] ; then
CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/nm-config/log4j.properties
CLASS='org.apache.hadoop.yarn.server.nodemanager.NodeManager'
- if [[ $EUID -eq 0 ]]; then
- YARN_OPTS="$YARN_OPTS -jvm server $YARN_NODEMANAGER_OPTS"
- else
- YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"
- fi
+ YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"
elif [ "$COMMAND" = "proxyserver" ] ; then
CLASS='org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer'
YARN_OPTS="$YARN_OPTS $YARN_PROXYSERVER_OPTS"
The patch above is available at that location.
Courtesy of LorandBendig for helping me.
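For reference, after applying the patch the nodemanager branch of bin/yarn should end up looking roughly like this (indentation may differ from your copy of the script):
elif [ "$COMMAND" = "nodemanager" ] ; then
  CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/nm-config/log4j.properties
  CLASS='org.apache.hadoop.yarn.server.nodemanager.NodeManager'
  YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"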

Related

Troubleshooting PM2 Restart in Cron

Update: I was not able to get this working as I desired. I ended up using the pm2 --cron flag since cron couldn't find my pm2 processes.
I am using pm2 restart in a bash script, and it always fails on the pm2 restart projectName commands when run by cron (as root), but it works when run manually with sudo. I am not sure how to troubleshoot, as the pm2 log files don't show anything obvious. It looks like someone else has had the same issue, so perhaps it's just a bug? Has anybody else found a way to resolve this?
Thank you!
Edit: Adding some context for @kevinnls
Here's the .sh script; I've isolated the command so I can test cron
# Vars
NOW=$(date +"%Y-%m-%d")
LOGS_PATH="/scriptLoc/logs"
LOG_FILE="$LOGS_PATH/$NOW-log.txt"
BE_PATH="/beLoc"
# Start log file
date >> $LOG_FILE
echo "Running..." >> $LOG_FILE
# Temp
cd $BE_PATH
pm2 restart be >> $LOG_FILE
pm2 restart be || echo "ERROR: BE pm2 restart: $?" >> $LOG_FILE
echo "Done." >> $LOG_FILE
exit
If I run the command with sudo ./script.sh, it works.
If I run it with cron, I see the following output in my log file:
Fri Mar 26 17:35:01 UTC 2021
Running...
Use --update-env to update environment variables
ERROR: BE pm2 restart: 1
Done.
If I view pm2 logs:
I see it exit with code 0 and restart when I run the script manually.
I see no output for the failed restart from cron
The error you got:
[PM2][ERROR] Process or Namespace be not found
What is "be"? seems like pm2 from inside of cron has no idea what to make of it.
If it is a file, when you ran the script in your shell, was the file be in your working directory?
You can make these changes to log your error messages to $LOG_FILE.
This should help you get more details.
# redirect output *and* errors to the log
pm2 restart be &>> $LOG_FILE
exit_code=$?
# if the previous exit code != 0, run everything after `&&`
[[ $exit_code -ne 0 ]] && echo "ERROR: BE pm2 restart: $exit_code" >> $LOG_FILE
Then try running your script again and share the errors encountered.
I was not able to resolve this in the manner I desired. Ultimately I used the -c (cron) flag with pm2 itself.
pm2 cli reference
gitlab issue
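For reference, a minimal sketch of that approach: instead of restarting from crontab, let pm2 schedule the restart itself via its cron option. The app entry point, process name, and schedule below are placeholders; check the pm2 cli reference above for the exact flag spelling on your pm2 version.
# let pm2 handle the periodic restart itself (example schedule: every 6 hours)
pm2 start app.js --name be --cron "0 */6 * * *"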

docker logs: bash output redirection not working in case of 'exit 1'

I have an issue with exiting from bash in case of an error (exit 1).
The output of my bash script is redirected to a log file, but in case of an error the log file is not created. In the normal scenario I get a proper log file.
What complicates my issue is that my script is executed from a Dockerfile.
The relevant part of my Dockerfile:
RUN $ORACLE_HOME/$SCRIPT_FILE.sh param1 param2 >$ORACLE_HOME/$SCRIPT_FILE.sh.log 2>&1
Content of the bash script:
#!/bin/bash
ORACLE_HOME=$1
DOMAIN_NAME=$2
DOMAIN_HOME=$ORACLE_HOME/user_projects/domains/$DOMAIN_NAME
echo "Oracle home: "$ORACLE_HOME
echo "Domain home: "$DOMAIN_HOME
echo "Domain name: "$DOMAIN_NAME
# validation
PROPERTIES_FILE=$ORACLE_HOME/properties/domain.properties
if [ ! -e "$PROPERTIES_FILE" ]; then
echo "A properties file with the username and password needs to be supplied."
exit 1
fi
...
wlst.sh ... >$ORACLE_HOME/$SCRIPT_FILE.py.log 2>&1
RETURN_VALUE=$?
if [ $RETURN_VALUE -ne 0 ]; then
    echo "Domain creation failed. Return code: $RETURN_VALUE. For more details please check the '$SCRIPT_FILE.py.log' file."
    exit 1
fi
...
So in case of any error, my bash script exits with 1, and in that case the output redirection to the log file does not work. In other words, whenever there is an error I have no log file at all, which is really bad.
This is the output redirection I use:
something.sh >something.sh.log 2>&1
Thanks to the community, I was finally able to solve my issue.
According to @DavidMaze's comment, my issue was related to the behavior of the RUN command:
If the script exits with a status of 1, then the RUN command fails, and Docker won’t create an image from that step; the container filesystem is lost.
That was the reason why I could not see any log files in the container in case of an error in the bash script.
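To illustrate the point: because the filesystem of a failed RUN step is discarded, the only place the log can survive is the build output itself. A minimal sketch of that alternative (not what I ended up doing below; the script name and log path are placeholders):
# dump the log to the build output if the script fails, then still fail the step
RUN ./myscript.sh > /tmp/build.log 2>&1 || (cat /tmp/build.log; exit 1)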
The solution is to:
create a log file in the image,
symlink it to /proc/1/fd/1 (PID 1's stdout) so its content shows up in docker logs,
have the bash scripts write their logs into this log file.
This way, everything written to the log file is shown on the Docker standard output.
My final Dockerfile:
# set the home directory
WORKDIR ${ORACLE_HOME}
# piped logfile into PID 1 in order to see log content in docker logs
RUN touch $ORACLE_HOME/$SCRIPT_FILE.log
RUN ln -sf /proc/1/fd/1 $ORACLE_HOME/$SCRIPT_FILE.log
# execute a script
RUN $ORACLE_HOME/$SCRIPT_FILE.sh $ORACLE_HOME $DOMAIN_NAME $SCRIPT_FILE >$ORACLE_HOME/$SCRIPT_FILE.log 2>&1
bash script ($SCRIPT_FILE.sh):
#!/bin/bash
# define variables
ORACLE_HOME=$1
DOMAIN_NAME=$2
SCRIPT_FILE=$3
DOMAIN_HOME=$ORACLE_HOME/user_projects/domains/$DOMAIN_NAME
echo "Oracle home: "$ORACLE_HOME
echo "Domain home: "$DOMAIN_HOME
echo "Domain name: "$DOMAIN_NAME
echo "Script file: "$SCRIPT_FILE
# validation
PROPERTIES_FILE=$ORACLE_HOME/properties/domain.properties
if [ ! -e "$PROPERTIES_FILE" ]; then
echo "A properties file with the username and password needs to be supplied."
exit 1
fi
...
wlst.sh ... $ORACLE_HOME/$SCRIPT_FILE.py >$ORACLE_HOME/$SCRIPT_FILE.log 2>&1
RETURN_VALUE=$?
if [ $RETURN_VALUE -ne 0 ]; then
    echo "Domain creation failed. Return code: $RETURN_VALUE."
    exit 1
fi
The output during the image build process looks perfect:
Step 11/11 : RUN $ORACLE_HOME/$SCRIPT_FILE.sh $ORACLE_HOME $DOMAIN_NAME $SCRIPT_FILE >$ORACLE_HOME/$SCRIPT_FILE.log 2>&1
---> Running in 1e7d0d80821e
Oracle home: /u01/oracle
Domain home: /u01/oracle/user_projects/domains/DEV_DOMAIN
Domain name: DEV_DOMAIN
Script file: create-admin-server
Username : weblogic
Password : weblogic12
Creating password file for Administration server...
Creating WebLogic domain...
Initializing WebLogic Scripting Tool (WLST) ...
Welcome to WebLogic Server Administration Scripting Shell
Type help() for help on available commands
domain name : DEV_DOMAIN
domain path : /u01/oracle/user_projects/domains/DEV_DOMAIN
admin server name : AdminServer
admin server port : 7001
machine name : host_1
cluster name : DEV_DOMAIN_CLUSTER
managed server name prefix: managed_server
managed server port : 7011
production mode enabled : true
WebLogic domain has been created successfully
Script completed successfully.
Removing intermediate container 1e7d0d80821e
Successfully built b4f63eb00e62
Successfully tagged <my-image name>:<version>
---> b4f63eb00e62

How do I detect in Jython that a WebSphere application has been installed?

When I deploy an .ear application in WebSphere, I have a problem installing the shared libraries. I use a workaround to solve my issue, like this:
[... code to install the application]
&& sleep 60
&& /opt/IBM/WebSphere/AppServer/bin/wsadmin.sh -lang jython -c \
"AdminApp.edit('appname', ['-MapSharedLibForMod', [['.*','.*', 'ibm']]])"
because I need to be sure that the .ear file has been installed before calling AdminApp.edit.
How can I get rid of the sleep command? Is there a way to get a signal that the app has been installed?
In my deploy script (bash) I call:
#!/bin/bash
$DM_WAS_HOME/wsadmin.sh -f $SCRIPTS_HOME/application_deploy.jacl $WORKING_DIRECTORY/appServer/$EAR_NAME $dmserver
if [ $? -eq 0 ]
then
    $DM_WAS_HOME/wsadmin.sh -lang jython -f $SCRIPTS_HOME/link_shared_lib.jython
    if [ $? -ne 0 ]
    then
        echo "ERROR: could not link libraries."
        exit 2
    fi
else
    echo "ERROR: installation failed, fix it"
    exit 1
fi
If anything goes wrong in the wsadmin.sh installation, the exit status is not 0. This way, if your install takes more time for some reason, it will not be an issue, since you only move on once the first task is done.
The application installation jacl sets a bunch of variables and calls:
$AdminApp update $appname app $updateopts
$adminConfig save
foreach nodeName $SyncNode {
    puts "Syncing $nodeName"
    $AdminControl invoke $nodeName sync
}
So if anything does not work correctly in there, the exit status is != 0.
Yes, I know I have to rewrite my JACL into Jython (still on WAS 7 for this application).
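Applied to the inline command from the question, the same idea simply means dropping the sleep and chaining on the exit status of the install step (a sketch; the install command itself stays elided as in the question):
[... code to install the application] \
&& /opt/IBM/WebSphere/AppServer/bin/wsadmin.sh -lang jython -c \
"AdminApp.edit('appname', ['-MapSharedLibForMod', [['.*','.*', 'ibm']]])"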

executing command on vagrant-mounted

I'm trying to run a command after a share is mounted with Vagrant, but I've never written an Upstart script before. What I have so far is:
start on vagrant-mounted
script
    if [ "$MOUNTPOINT" = "/vagrant" ]
    then
        env CMD="echo $MOUNTPOINT mounted at $(date)"
    elif [ "$MOUNTPOINT" = "/srv/website" ]
    then
        env CMD ="echo execute command"
    fi
end script
exec "$CMD >> /var/log/vup.log"
Of course that's not the actual script I want to run; I haven't gotten that far yet, but the structure is what I need. My starting point has been this article. I've had a different version that was simply:
echo $MOUNTPOINT mounted at `date`>> /var/log/vup.log
that version did write to the log.
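For context, that simpler version was a complete job file along these lines (the conf file name is my guess; the event and log path are the ones above):
# /etc/init/vup.conf
start on vagrant-mounted
script
    echo "$MOUNTPOINT mounted at $(date)" >> /var/log/vup.log
end script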
Trying to use init-checkconf failed with "failed to ask Upstart to check conf file".

Why "hadoop -jar" command only launch local job?

I use "hadoop -jar" rather than "hadoop jar" by mistake when I submit a job.
In this case, my jar package cannot not be submit to the clusters, and only "local job runner" will be launched, which puzzled me so much.
Anyone knows the reason for that? Or the difference between "hadoop jar" and "hadoop -jar" command ?
Thank you!
/usr/bin/hadoop jar is what your Hadoop's $HADOOP_HOME/bin/hadoop script expects as an argument, where $HADOOP_HOME is where you have kept your Hadoop-related files.
Excerpt from the hadoop script:
elif [ "$COMMAND" = "jar" ] ; then
CLASS=org.apache.hadoop.util.RunJar
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
and,
elif [[ "$COMMAND" = -* ]] ; then
# class and package names cannot begin with a -
echo "Error: No command named \`$COMMAND' was found. Perhaps you meant \`hadoop ${COMMAND#-}'"
exit 1
Here COMMAND="jar" and when COMMAND=-*, or -jar it should throw an exception as coded above. I'm not sure how you can even run a local jar.
