Kafka in supervisor mode - apache-storm

I'm trying to run Kafka in supervised mode so that it can start automatically in case of a shutdown. But all the examples of running Kafka use shell scripts, and supervisord is not able to tell which PID to monitor. Can anyone suggest how to accomplish an automatic restart of Kafka?

If you are on a Unix or Linux machine, this is where /etc/inittab comes in handy. Or you might want to use daemontools. I don't know about Windows, though.

We are running Kafka under Supervisord (http://supervisord.org/), and it works like a charm. The run command looks like this (as specified in the supervisord.conf file):
command=/usr/local/bin/pidproxy /var/run/kafka.pid /usr/lib/kafka/bin/kafka-server.sh -f -p /var/run/kafka.pid
The -f flag tells Kafka to start in the foreground. If the -p flag is set, the Kafka process PID is written to the specified file.
The pidproxy command is part of the Supervisord distribution. Upon receiving a KILL signal, it reads the PID from the specified file and forwards the signal to the corresponding process.
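For completeness, here is a minimal sketch of the full program section built around that command; the autorestart option is what gives you the automatic restart after a crash, and the paths are the same assumptions as above:
[program:kafka]
command=/usr/local/bin/pidproxy /var/run/kafka.pid /usr/lib/kafka/bin/kafka-server.sh -f -p /var/run/kafka.pid
autostart=true
autorestart=true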

Related

Script invoked from remote server unable to run service correctly

I have a Unix script that invokes another script on a remote Unix server.
Among other commands, I am stopping a service. The stop command essentially translates to
ssh -t -t -q ${AEM_USER}@${SERVERIP} 'bash -l -c "service aem stop"'
The service gets stopped, but when I start the service back up it just creates the .pid file and does not perform the startup. When I run the command for starting, i.e.
ssh -t -t -q ${AEM_USER}@${SERVERIP} 'bash -l -c "service aem start"'
it does not show any error. On going to the server and checking the status with
service aemauthor status
the following message is displayed:
aem dead but pid file exists
Also, when starting the service by logging in to the server directly, it works as expected, along with the message
Removing stale pidfile (pid: 8701)
Starting aem
We don't know the details of the aem service script.
I guess the problem is related to the SIGHUP signal. When we log off from a shell or disconnect from ssh, the OS sends a HUP signal to all processes that were started in that terminated shell. If a process doesn't handle the HUP signal, it exits by default.
When we run a command via ssh remotely, the process started by that command receives a HUP signal after the ssh session is terminated.
We can use the nohup command to ignore the HUP signal.
You can try
ssh -t -t -q ${AEM_USER}#${SERVERIP} 'bash -l -c "nohup service aem start"'
If it works, you can use the nohup command to start aem in the service script.
As mentioned in the stale pidfile syndrome, there are different reasons for pidfiles going stale, for instance issues with the way your application handles the pidfile's removal when the process exits... but considering you're only experiencing this when running remotely, I would guess it might be related to what is or isn't being loaded by your profile. Check the most voted answer at the post below for some insights:
Why Does SSH Remote Command Get Fewer Environment Variables
As described in the comments of the mentioned post, you can try sourcing /etc/profile or ~/.bash_profile before executing your script to test it, or even try executing env locally and remotely to compare which variables are or aren't being sourced.
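For example, a quick way to diff the two environments (reusing the variables from the question; the file names are just illustrative):
env | sort > local.env
ssh ${AEM_USER}@${SERVERIP} 'env | sort' > remote.env
diff local.env remote.env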

Deploy a TCP Server written in Ruby

I've written a TCP server in Ruby running on port 2000 with EventMachine.
Right now, what I do is ssh to my server and run the command ruby lib/tcp_server.rb to turn on the server, but it shuts down when I log out.
I've tried nohup and using &, but nothing seems to keep the server up for long.
So my question is, how do I deploy this server on port 2000 and keep it running, like how we deploy Rails to nginx?
It's not a webserver, but a TCP server for a connected device, if that helps.
Thanks!
Solution 1: tmux or screen
This is the simplest approach: create a tmux or screen session, then start your server inside that session, as sketched below.
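For instance, with tmux (the session name is just illustrative):
tmux new -s tcp_server
ruby lib/tcp_server.rb
Detach with Ctrl-b d; the server keeps running, and you can reattach later with tmux attach -t tcp_server.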
Solution 2: nohup
nohup ruby lib/tcp_server.rb > stdout.log 2> stderr.log &
You've already tried nohup and &, so I suppose you know how this works.
Solution 3: daemonize
You can detach from the shell and daemonize the process by forking
it twice, setting the session ID and changing the current working directory.
def daemonize
  exit if fork          # parent exits; the first child keeps running
  Process.setsid        # become session leader, detaching from the controlling terminal
  exit if fork          # session leader exits; the grandchild can never reacquire a terminal
  Dir.chdir '/'         # stop holding on to the original working directory
end
With this approach, you will have to redirect stdout and stderr to keep logs.
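A minimal redirection sketch in the same vein (the log paths are assumptions):
$stdin.reopen '/dev/null'
$stdout.reopen '/var/log/tcp_server.out', 'a'
$stderr.reopen '/var/log/tcp_server.err', 'a'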
Another way to daemonize is to use gems like daemons.
Update:
To restart the process automatically after it is killed, you need a process manager like god or pm2.
To start the process automatically after booting, you need to compose an init script, but what it looks like depends on your service management system and operating system. One of the most well-known is System V. If you are using Ubuntu, you might want to take a look at Upstart or systemd.
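As an illustration, a minimal systemd unit for this server might look like the following; the unit name, interpreter path, and application path are all assumptions:
[Unit]
Description=Ruby TCP server on port 2000

[Service]
ExecStart=/usr/bin/ruby /path/to/app/lib/tcp_server.rb
Restart=always

[Install]
WantedBy=multi-user.target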

How to stop a logstash Config file running in Ubuntu?

I'm running my Logstash config file in Ubuntu using the following command:
/opt/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf
It's working. However, I recently realized that every time I run this command it starts another instance. Now I think there are six instances running, because each new record I create shows up six times in Elasticsearch.
How can I stop all these other instances and is there any way to check how many are running?
Thanks
You can use the pkill command and specify the name of the process(es) you want to kill:
pkill logstash
Or the killall command works the same way:
killall logstash
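To check how many instances are actually running before killing them, pgrep can help; the -c flag prints a count, and -l lists each PID together with the process name:
pgrep -c logstash
pgrep -l logstash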
As Val states, pkill should work to resolve what you are facing.
To avoid this in the future, why don't you create a small service file that uses a PID file, so you can't have multiple instances running? Here is what I did:
http://www.logstashbook.com/code/3/logstash-central.init

How can the strace logs of an ever-running binary started from rcS be logged?

I am trying to do profiling on my embedded Linux box, which runs a piece of software.
I want to profile my software using strace.
The application is the main software and keeps running forever.
How can I run strace and log its output to a file?
In my rcS script, I run the application like this:
./my_app
Now, with strace:
strace ./my_app -> I want to log this output to a file, and I should be able to access the file without killing the application. Remember, this application never terminates.
Please help!
Instead of a target filename, use the -p option to strace to specify the process ID of an already running process you wish to attach to.
Chris is actually right. strace takes the -p option, which enables you to attach to a running process just by specifying the process's PID.
Let's say your 'my_app' process runs with PID 2301 (you can see the PID by logging into your device and using 'ps'). Try 'strace -p 2301', and you will see all system calls for that PID. You can send the output to a file by redirecting stderr, which is where strace writes by default: 'strace -p 2301 2> /tmp/my_app-strace'.
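Alternatively, strace can write the trace to a file itself; a small sketch reusing the PID and path from above:
strace -f -o /tmp/my_app-strace -p 2301 &
The -o flag sends the trace to the named file, and -f also follows any child processes that my_app forks.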
Hope this helps.

supervisord stopping child processes

One of the problems I face with supervisord is that when I have a command which in turn spawns another process, supervisord is not able to kill it.
For example, I have a java process which, when run normally, looks like this:
$ zkServer.sh start-foreground
$ ps -eaf | grep zk
user 30404 28280 0 09:21 pts/2 00:00:00 bash zkServer.sh start-foreground
user 30413 30404 76 09:21 pts/2 00:00:10 java -Dzookeeper.something..something
The supervisord config file looks like:
[program:zookeeper]
command=zkServer.sh start-foreground
autorestart=true
stopsignal=KILL
These kinds of processes, which have multiple children, are not handled well by supervisord when it comes to stopping them from supervisorctl. So when I run this under supervisord and try to stop it from supervisorctl, only the top-level process gets killed, but not the actual java process.
The same problem was encountered by Rick Hanlon II here: https://coderwall.com/p/4tcw7w
Option stopasgroup=true should be set in the program section for supervisord to stop not only the parent process but also the child processes.
The example is given as:
[program:some_django]
command=python manage.py runserver
directory=/dir/to/app
stopasgroup=true
Also, keep in mind that you may have an older package of supervisord that does not have the "stopasgroup" functionality.
I tried these Debian packages on Raspberry Pi:
supervisor_3.0a8 does not work.
supervisor_3.0b2-1 works as expected.
Doing the following early in the main bash script called by supervisord fixed the problem for me:
trap "kill -- -$$" EXIT
This kills the entire process group when the main script exits, such as when it is killed by supervisord.
A feature was recently added to supervisord to send SIGKILL to the whole process group. It's on GitHub but not officially released yet.
If the process ID is available in a file, you can use the pidproxy program.
Try this supervisor program config:
stopasgroup=true
killasgroup=true
stopsignal=INT
The following article has an in-depth discussion of the problem:
http://veithen.github.io/2014/11/16/sigterm-propagation.html
You can also use priorities in your /conf.d/your-configuration.conf file. For example, if you want to run ZooKeeper first and then Kafka, you can specify two programs, as sketched below.
A lower priority value means that the program starts first and stops last.
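A minimal sketch of the two program sections (the commands reuse the ones shown earlier on this page, and the priority values are illustrative):
[program:zookeeper]
command=zkServer.sh start-foreground
priority=1

[program:kafka]
command=/usr/local/bin/pidproxy /var/run/kafka.pid /usr/lib/kafka/bin/kafka-server.sh -f -p /var/run/kafka.pid
priority=2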
