We recently deployed Ansible in our different environments and I'm running into a problem I can't find a solution to.
On two servers you have to start and stop the services by becoming a specific user.
su - itvmgr
Then you have to run a custom command to stop and start the services:
itvmgrctl stop dispatcher
itvmgrctl start dispatcher
One of the tasks looks like this:
- name: "Start Dispatcher Service"
  sudo_user: itvmgr
  command: su itvmgr -c '/itvmgr/bin/itvmgrctl start dispatcher'
- name: Pause
  pause: seconds=15
There's another task to stop it, which looks just like this one but uses stop instead of start.
The problem I'm running into is Ansible stops the service fine but it fails to start the service again. I'm not getting any errors while it's running but I can't find any reason why it would stop the service fine, but the same command fails to start it.
If anyone has any suggestions on how I can troubleshoot this problem it would be greatly appreciated.
Perhaps your start script needs an interactive shell or some environment variables?
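One way to check the environment-variable theory is to dump the environment in both situations and compare the results. The sketch below is only an illustration: `dump_env` and `env_only_in_first` are hypothetical helper names, and the file paths in the usage note are placeholders.

```shell
#!/bin/sh
# dump_env (hypothetical helper) writes the current environment,
# sorted line by line, to the file given as the first argument.
dump_env() {
    env | sort >"$1"
}

# env_only_in_first (hypothetical helper) prints the lines that appear
# in the first dump but not in the second. Both files must be sorted,
# which dump_env takes care of.
env_only_in_first() {
    comm -23 "$1" "$2"
}
```

For example, run `dump_env /tmp/login.env` after `su - itvmgr` in a manual session, run `dump_env /tmp/ansible.env` through the same `su itvmgr -c '...'` call the task uses, and then run `env_only_in_first /tmp/login.env /tmp/ansible.env` to list what the login shell sets and the task does not. Values that span several lines will show up as separate lines in the comparison.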
In a virtual box I have a Debian that I sometimes want to run without X. So I edited /etc/grub.d/10_linux and added another menu item with a kernel option "nox" appended. Then I added a line to /lib/systemd/system/lightdm.service, Section [Unit]:
ConditionKernelCommandLine=!nox
However, when starting this, it hangs with the message:
A start job is running for Hold until boot process finishes up (56min / no limit)
Thank you, systemd for informing me about that. I wouldn't have noticed. Yet, I would like to know, which job it is that's hanging.
The system allows me to connect via SSH, but none of the systemctl or journalctl commands I tried told me the name of the service causing the problem. lightdm.service itself seems to be satisfied.
I know it's a bit late, but I just found out that one can use:
systemctl list-jobs
to find out what units are waiting or running at any given moment.
By adding systemd.debug-shell=1 to the kernel command line, a root shell will be available on TTY9 (Ctrl+Alt+F9) to run the command above.
I first tried "systemd-analyze", and that gave me the message about "systemctl list-jobs".
Hope this helps someone with similar problems.
I am trying to run a Docker image on Amazon ECS. I am using a command that starts a shell script to boot up the program:
CMD ["sh","-c", "app/bin/app start; bash"]
in order to start it because for some reason when I run the app (elixir/phoenix app) in the background it was crashing immediately but if I run it in the foreground it is fine. If I run it this way locally, everything works fine but when I try to run it in my cluster, it shuts down. Please help!!
Docker keeps track of the container's foreground process: if that process stops, the container stops. Your container keeps running when the command ends with "bash" because bash has not exited.
I guess you use a shell script to start an application that runs in the background, like nginx or a daemon. So try to find an option that makes the app run in the foreground, which will keep your container alive. For example, nginx has a "daemon off" option for this.
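As a minimal sketch of that idea, an entrypoint script can `exec` the app so that it becomes the process Docker tracks. This assumes the release script offers a foreground mode: the `foreground` subcommand in the usage comment is an assumption about how the app was packaged, and `start_foreground` is a hypothetical helper name.

```shell
#!/bin/sh
# start_foreground (hypothetical helper) replaces the current shell with
# the given command. The command then becomes the container's main
# process, and the container stops exactly when the command exits.
start_foreground() {
    exec "$@"
}

# Assumed usage at the end of an entrypoint script
# (the subcommand name is an assumption, check your release script):
#   start_foreground app/bin/app foreground
```

Because `exec` does not return on success, nothing after the call runs, so the call belongs on the last line of the script.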
for some reason when I run the app (elixir/phoenix app) in the background it was crashing immediately
So you have a corrupted application and you are looking for a kludge to make it look like it somewhat works. This is not a reliable approach at all.
Instead you should:
make it work in the background
use systemd or upstart to manage restarts of the Erlang VM on crashes
Please note that it matters where your application is compiled. The build environment must be exactly the same architecture/container as production, with the same Erlang, Elixir, and OS versions; otherwise nothing guarantees it will be robust or even work.
I'm using Vagrant and Ansible to create my Bitbucket Server on Ubuntu 15.10. I have the server setup complete and working but I have to manually run the start-webapp.sh script to start the server each time I reprovision the server.
I have the following task in my Bitbucket role in Ansible. When I increase the verbosity, I can see that I get a positive response from the server saying it will be running at http://localhost/, but when I go to the URL the server isn't up. If I then SSH in to the server and run the script myself, I get the exact same response, and this time I can see the startup webpage.
- name: Start the Bitbucket Server
  become: yes
  shell: /bitbucket-server/atlassian-bitbucket-4.7.1/bin/start-webapp.sh
Any advice would be great on how to fix this.
Thanks,
Sam
Probably better to change that to an init script and use the service module to start it. For example, see this role for installing bitbucket...
Otherwise, you're subject to HUP and other issues from running processes under an ephemeral session.
We are in the middle of migrating from Jenkins to Concourse CI, and everything has been pretty smooth so far. But now I have an issue that I don't know how to solve, and I would appreciate any advice from the community.
What I am trying to do is a job that can run integration or functional (web) tests using Selenium. There are a few requirements:
To run web tests I need to set up the database (and optionally the search engine, proxy, etc.) to imitate the production environment as closely as possible.
Ideally, it should be set up by docker-compose.
This database service should run in parallel with my tests
This database service should not return anything, neither error nor success, because it only starts the database and nothing else
My web-tests should not be started until the database is ready
This database service should be stopped when all the web-tests have finished
As you can see, it's a pretty non-trivial task. Of course, I can create a big uber-container that contains everything I need, but this is a bad solution. Another option is to create a shell script for that, but this is not flexible enough.
Is there an example of how I could implement that, or are there good practices for this issue?
Thanks!
Since version 1.3.0 it appears you can run Docker-compose in a task: https://github.com/concourse/concourse/issues/324
This appears to work:
jobs:
- name: docker-compose
  public: true
  serial: true
  plan:
  - do:
    - task: docker-compose
      timeout: 20m
      privileged: true
      config:
        platform: linux
        image_resource:
          type: docker-image
          source: {repository: "mumoshu/dcind", tag: "latest"}
        run:
          path: sh
          args:
          - -exc
          - |
            source /docker-lib.sh
            start_docker
            docker ps
            docker-compose version
This is a comment from the author of Concourse:
There is no Docker binary or socket on the host - they're just running a Garden backend (probably Guardian). Concourse runs at an abstraction layer above Docker, so providing any sort of magic there doesn't really make sense.
The one thing missing post-1.3 is that Docker requires you to set up cgroups yourself. I forgot how annoying that is. I wish they did what Guardian does and auto-configure it, but what can ya do.
So, the full set of instructions is:
Use or build an image with docker in it, e.g. docker:dind.
Run the following at the start of your task: https://github.com/concourse/docker-image-resource/blob/master/assets/common.sh#L1-L40
Spin up Docker with docker daemon &.
Then you can run docker-compose and friends as normal.
The downside of this is that you'll be fetching the images every time. #230 will address that.
In the long run, #324 (comment) is the direction I want to go.
See here https://github.com/concourse/concourse/issues/324
As noted in the accepted answer, the Slack archive data has been deleted (due to the Slack limit).
A Docker image specialized for this use case: https://github.com/meAmidos/dcind
It does not sound that complicated to me. I wrote a post on how to get something similar up and running here. I use some different containers for the stack and the test runner and fire up everything from an official docker:dind image with docker-compose installed on it...
Beyond the usual Concourse CI stuff of fetching resources etc., performing a test run would consist of:
Starting the web, REST, and other services with docker-compose up.
Starting the test-runner service and firing the test suites at the web page, which communicates with the REST layer, which in turn depends on the other services for responses.
Running docker-compose down when the test runner completes, and deciding the return code of the task (0 = success, non-zero = failure) based on the return code of the test suite.
To cleanly set up and tear down the stack and test runner, you could do something like the below (you may need depends_on if your service has not started when the tests begin; for me it works without it):
# Set up the SUT stack:
docker-compose up -d
# Run the test-runner container outside of the SUT so the SUT can be torn down when testing is completed.
# Options such as --entrypoint must come before the service name:
docker-compose run --rm --entrypoint '/entrypoint.sh /protractor/project/conf-dev.js --baseUrl=http://web:9000/dist/ --suite=my_suite' test-runner
# Store the return code from the tests and tear down:
rc=$?
docker-compose down
echo "exit code = $rc"
# Stop background job 1 (presumably a Docker daemon started earlier in the task), if there is one:
kill %1 2>/dev/null || true
exit $rc
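If the tests do start before the services are ready, one option is to poll a readiness check before launching the test runner. This is a minimal sketch built around a hypothetical `wait_until_ready` helper. The service name `db` and the `pg_isready` check in the usage comment are placeholders, since the question does not say which database is used.

```shell
#!/bin/sh
# wait_until_ready (hypothetical helper) retries a readiness check until
# it succeeds or the attempts run out.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the check command to run
# Returns 0 as soon as the check succeeds, 1 if it never does.
wait_until_ready() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Assumed usage between "docker-compose up -d" and the test run
# (service name and check command are placeholders):
#   wait_until_ready 30 2 docker-compose exec -T db pg_isready || exit 1
```

Polling from the script keeps the wait logic in one place and makes the task fail with a clear exit code when the stack never comes up.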
I just started using Ansible and I am having trouble running a server.
I have a server which can be started using java -jar target/server-1.0-SNAPSHOT.jar. However, this will start the server and keep running forever displaying output, so Ansible never finishes.
This is what I tried that never finishes:
- name: Start server
  command: chdir=~/server java -jar target/server-1.0-SNAPSHOT.jar
What is the proper way to do this?
Either create a service, as @udondan suggests, or use an asynchronous task to launch your server. http://docs.ansible.com/ansible/playbooks_async.html
As @Petro026 suggested, your choices are asynchronous task or creating a service.
I would strongly suggest against the asynchronous task approach. It's a very fragile solution:
What if the host is restarted?
What if you run your playbook twice?
What if your server app just dies?
Your best bet is to create a service for it, and probably the easiest approach for it would involve using a process control system like supervisord, which is supported by ansible.
From the supervisor docs:
Supervisor is a client/server system that allows its users to monitor
and control a number of processes on UNIX-like operating systems.
Record the process ID and start the server with nohup, which sends the output to nohup.out.
Something like this:
nohup java -jar target/server-1.0-SNAPSHOT.jar &
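In your playbook (using the shell module, since the command module does not interpret & or redirections):
- name: Start server
  shell: chdir=~/server nohup java -jar target/server-1.0-SNAPSHOT.jar > nohup.out 2>&1 &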
If you want to kill the process, run kill -9 <pid>.
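A minimal sketch of the PID-file approach follows. The helper names `start_detached` and `stop_detached` and the file names in the usage comment are hypothetical and do not come from the answer above.

```shell
#!/bin/sh
# start_detached (hypothetical helper) runs a command under nohup in the
# background, sends its output to a log file, and records its PID.
#   $1 = PID file, $2 = log file, remaining arguments = command to run
start_detached() {
    pidfile=$1
    logfile=$2
    shift 2
    nohup "$@" >"$logfile" 2>&1 &
    echo $! >"$pidfile"
}

# stop_detached (hypothetical helper) sends SIGTERM to the recorded PID
# and removes the PID file if the signal could be delivered.
stop_detached() {
    pidfile=$1
    kill "$(cat "$pidfile")" && rm -f "$pidfile"
}

# Assumed usage (file names are placeholders):
#   start_detached server.pid server.log java -jar target/server-1.0-SNAPSHOT.jar
#   stop_detached server.pid
```

The sketch sends a plain SIGTERM so the server gets a chance to shut down cleanly; `kill -9` remains available as a last resort.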