AWS EC2 CentOS 7 User Data file fails to load on initial boot - amazon-ec2

I'm trying to pass a simple bash script to an AWS CentOS 7 instance. The User Data looks like:
#!/bin/bash
yum update -y
Here is a fragment of cloud init log:
Apr 1 19:03:01 ip-172-20-60-102 cloud-init: /usr/bin/env: bash yum update -y : No such file or directory
Apr 1 19:03:01 ip-172-20-60-102 cloud-init: 2016-04-01 19:03:01,604 - util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-001 [127]
Apr 1 19:03:01 ip-172-20-60-102 cloud-init: 2016-04-01 19:03:01,616 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
Apr 1 19:03:01 ip-172-20-60-102 cloud-init: 2016-04-01 19:03:01,617 - util.py[WARNING]: Running scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python2.7/site-packages/cloudinit/config/cc_scripts_user.pyc'>) failed
I tried a different approach as suggested in:
Bash script passed to AWS EC2 Instance as User Data file fails to load on initial boot
So I changed User Data to:
"UserData" : { "Fn::Base64" : { "Fn::Join" : ["", [
"#!/bin/bash\n\n",
"yum update -y"
]]}}
I got a different kind of error:
Apr 1 19:28:17 ip-172-20-60-102 cloud-init: 2016-04-01 19:28:17,450 - __init__.py[WARNING]: Unhandled non-multipart (text/x-not-multipart) userdata: '"UserData" : { "Fn::Base...'
Apr 1 19:28:20 ip-172-20-60-102 cloud-init: Cloud-init v. 0.7.5 running 'modules:config' at Fri, 01 Apr 2016 19:28:20 +0000. Up 48.56 seconds.
Apr 1 19:28:21 ip-172-20-60-102 cloud-init: Cloud-init v. 0.7.5 running 'modules:final' at Fri, 01 Apr 2016 19:28:20 +0000. Up 48.89 seconds.

Use an absolute path. Try /usr/bin/yum -y update. As for the second error: cloud-init received the literal JSON as user data (note the "Unhandled non-multipart" warning), so the Fn::Base64/Fn::Join expression belongs inside a CloudFormation template's UserData property; it can't be pasted directly into the instance's user data field.
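The fix as a full user-data script might look like the sketch below; resolve_yum is a hypothetical helper that prefers `command -v` and falls back to the stock CentOS 7 location of yum:

```shell
#!/bin/bash
# Sketch of the user-data script using an absolute path, per the answer above.
# resolve_yum is a hypothetical helper, not part of the original question.
resolve_yum() {
  command -v yum || echo /usr/bin/yum
}
# In real user data you would then run: "$(resolve_yum)" -y update
echo "yum path: $(resolve_yum)"
```

Using the absolute path directly (/usr/bin/yum -y update) is equivalent on a stock AMI; the helper just makes the fallback explicit.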

Related

Anaconda env. Python script autostart after reboot

I'm seeing some strange behaviour.
Setup:
Debian 11 server on Proxmox
added user mupsje (modded for sudo)
installed anaconda3 for both mupsje and root.
created env discordbot.
directory: /home/mupsje/discordbot/
shell script: on_startup.sh (made executable with chmod)
#!/bin/bash
/home/mupsje/anaconda3/envs/discordbot/bin/python /home/mupsje/discordbot/bot.py
and the first line of bot.py:
#!/home/mupsje/anaconda3/envs/discordbot/bin/python
import blablabla
Now I want to start this env and the shell script when the server reboots.
When I log in as mupsje or root and test the shell script:
(base) mupsje@debian:~$ cd /home/mupsje/discordbot/
(base) mupsje@debian:~/discordbot$ ./on_startup.sh
the Python script runs.
Now I'm trying to add this to systemd as a service:
discordbot.service
[Unit]
Description=Startup Discordbot
[Service]
ExecStart=/home/mupsje/discordbot/on_startup.sh
[Install]
WantedBy=multi-user.target
When I start the service and check its status:
sudo systemctl start discordbot.service
sudo systemctl status discordbot.service
I get an error back and the Python script is not running.
* discordbot.service - Startup Discordbot
Loaded: loaded (/etc/systemd/system/discordbot.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2022-09-05 13:05:18 UTC; 9s ago
Process: 752 ExecStart=/home/mupsje/discordbot/on_startup.sh (code=exited, status=1/FAILURE)
Main PID: 752 (code=exited, status=1/FAILURE)
CPU: 355ms
Sep 05 13:05:18 debian on_startup.sh[753]: return future.result()
Sep 05 13:05:18 debian on_startup.sh[753]: File "/home/mupsje/anaconda3/envs/discordbot/lib/python3.10/site-packages/discord/client.py", line 817, in runner
Sep 05 13:05:18 debian on_startup.sh[753]: await self.start(token, reconnect=reconnect)
Sep 05 13:05:18 debian on_startup.sh[753]: File "/home/mupsje/anaconda3/envs/discordbot/lib/python3.10/site-packages/discord/client.py", line 745, in start
Sep 05 13:05:18 debian on_startup.sh[753]: await self.login(token)
Sep 05 13:05:18 debian on_startup.sh[753]: File "/home/mupsje/anaconda3/envs/discordbot/lib/python3.10/site-packages/discord/client.py", line 577, in login
Sep 05 13:05:18 debian on_startup.sh[753]: raise TypeError(f'expected token to be a str, received {token.__class__!r} instead')
Sep 05 13:05:18 debian on_startup.sh[753]: TypeError: expected token to be a str, received <class 'NoneType'> instead
Sep 05 13:05:18 debian systemd[1]: discordbot.service: Main process exited, code=exited, status=1/FAILURE
Sep 05 13:05:18 debian systemd[1]: discordbot.service: Failed with result 'exit-code'.
What am I doing wrong?
regards
Mupsje
Found the solution myself.
I used crontab for this, with webmin's "execute cron job as":
@reboot /home/mupsje/discordbot/on_startup.sh
then in the on_startup.sh file:
#!/bin/bash
source ~/anaconda3/etc/profile.d/conda.sh
conda activate discordbot
echo $CONDA_DEFAULT_ENV
cd ~/discordbot/
python ppi_bot.py
conda deactivate
Works like a charm.
If you have better solutions, please let me know
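For the record, the original systemd unit was close; the TypeError means the bot token variable was not set in the service's environment, because systemd does not load your login shell profile or conda setup. A sketch of a unit that runs the env's Python directly and loads the token from a file (the EnvironmentFile path and its contents are assumptions):

```
[Unit]
Description=Startup Discordbot
After=network-online.target
Wants=network-online.target

[Service]
User=mupsje
WorkingDirectory=/home/mupsje/discordbot
# Hypothetical file containing e.g. DISCORD_TOKEN=... ; systemd services do
# not inherit your login shell's environment, which is why the token was None.
EnvironmentFile=/home/mupsje/discordbot/bot.env
ExecStart=/home/mupsje/anaconda3/envs/discordbot/bin/python /home/mupsje/discordbot/bot.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After saving it as /etc/systemd/system/discordbot.service, reload with systemctl daemon-reload and enable it, then the bot starts on boot without cron.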

Ruby execute system command while running as service (systemd) on ubuntu

I want to run a ruby program as a service in Ubuntu but in my code I have a system call to execute an external program:
system(frame3dd, 'file.csv', 'file.out')
It works just fine when I run the program in a terminal, but as soon as I run it as a service, frame3dd returns error code 12. According to the manual this means:
12 : error in opening the temporary cleaned input data file for writing
Oct 09 10:56:57 vps594898 ruby[6900]: FRAME3DD version: 20140514+
Oct 09 10:56:57 vps594898 ruby[6900]: Analysis of 2D and 3D structural
frames with elastic and geometric stiffness.
Oct 09 10:56:57 vps594898 ruby[6900]: http://frame3dd.sf.net
Oct 09 10:56:57 vps594898 ruby[6900]: GPL Copyright (C) 1992-2014, Henri P.
Gavin
Oct 09 10:56:57 vps594898 ruby[6900]: This is free software with absolutely
no warranty.
Oct 09 10:56:57 vps594898 ruby[6900]: For details, see the GPL license file,
LICENSE.txt
Oct 09 10:56:57 vps594898 ruby[6900]: false
Oct 09 11:42:19 vps594898 ruby[14314]: #<Process::Status: pid 14415 exit 12>
systemd config file:
[Unit]
Description=BK Geveldragers staging
[Service]
User=server
Group=server
PIDFile=/home/server/bk_projecten/staging/server.pid
WorkingDirectory=/home/server/bk_projecten/staging
Restart=always
ExecStart=/usr/bin/ruby server.rb
ExecStop=/bin/kill -s QUIT $MAINPID
[Install]
WantedBy=multi-user.target
The problem is solved. Frame3dd creates temporary files in /tmp. Some of these files were created by root. When I ran Frame3dd from a non-root account, the program exited with error code 12 because it couldn't overwrite those files.
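A systemd-level guard against this class of problem, if it fits your setup: PrivateTmp gives the service its own empty /tmp, so stale root-owned files can never collide with frame3dd's temporary files. A sketch of the addition to the existing [Service] section:

```
[Service]
# Mount a fresh, service-private /tmp and /var/tmp for this unit only;
# frame3dd then never sees root-owned leftovers in the shared /tmp.
PrivateTmp=true
```

The trade-off is that files the service writes to /tmp are invisible to other processes and removed when the service stops.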

Failed to start Elasticsearch. Error opening log file '/gc.log': Permission denied

Dear StackOverflow community,
I was running Kibana/Elasticsearch without a problem until installing a Kibana plugin. Then the service failed, and I noticed the problem is that Elasticsearch stopped. I tried several ways to fix it and then even reinstalled everything, but the problem persists: Elasticsearch still fails to launch, even with a fresh installation.
Installation on Debian 9 using apt install.
systemctl start elasticsearch.service
results in:
Exception in thread "main" java.lang.RuntimeException: starting java failed with [1]
[0.000s][error][logging] Error opening log file '/gc.log': Permission denied
Full log with journalctl -xe
-- Unit elasticsearch.service has begun starting up.
Feb 07 14:09:06 Debian-911-stretch-64-minimal kibana[576]: {"type":"log","#timestamp":"2020-02-07T13:09:06Z","tags":["warning","elasticsearch","admin"],"pid":576,"message":"Unable to revive connection: http://localhost:9200/"}
Feb 07 14:09:06 Debian-911-stretch-64-minimal kibana[576]: {"type":"log","#timestamp":"2020-02-07T13:09:06Z","tags":["warning","elasticsearch","admin"],"pid":576,"message":"No living connections"}
Feb 07 14:09:06 Debian-911-stretch-64-minimal kibana[576]: {"type":"log","#timestamp":"2020-02-07T13:09:06Z","tags":["warning","elasticsearch","admin"],"pid":576,"message":"Unable to revive connection: http://localhost:9200/"}
Feb 07 14:09:06 Debian-911-stretch-64-minimal kibana[576]: {"type":"log","#timestamp":"2020-02-07T13:09:06Z","tags":["warning","elasticsearch","admin"],"pid":576,"message":"No living connections"}
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: Exception in thread "main" java.lang.RuntimeException: starting java failed with [1]
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: output:
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: [0.000s][error][logging] Error opening log file '/gc.log': Permission denied
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: [0.000s][error][logging] Initialization of output 'file=/var/log/elasticsearch/gc.log' using options 'filecount=32,filesize=64m' failed.
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: error:
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: Invalid -Xlog option '-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m', see error log for details.
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: Error: Could not create the Java Virtual Machine.
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: Error: A fatal exception has occurred. Program will exit.
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: at org.elasticsearch.tools.launchers.JvmErgonomics.flagsFinal(JvmErgonomics.java:118)
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: at org.elasticsearch.tools.launchers.JvmErgonomics.finalJvmOptions(JvmErgonomics.java:86)
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: at org.elasticsearch.tools.launchers.JvmErgonomics.choose(JvmErgonomics.java:59)
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: at org.elasticsearch.tools.launchers.JvmOptionsParser.main(JvmOptionsParser.java:92)
Feb 07 14:09:06 Debian-911-stretch-64-minimal systemd[1]: elasticsearch.service: Main process exited, code=exited, status=1/FAILURE
Feb 07 14:09:06 Debian-911-stretch-64-minimal systemd[1]: Failed to start Elasticsearch.
-- Subject: Unit elasticsearch.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit elasticsearch.service has failed.
The mentioned gc.log file was not in that folder. And the permissions were:
drwxr-s--- 2 elasticsearch elasticsearch 4096 Jan 15 13:20 elasticsearch
I created the file and also played with permissions until having these:
-rwxrwxrwx 1 root elasticsearch 0 Feb 7 15:19 gc.log
...and even changed the ownership:
-rwxrwxrwx 1 root root 0 Feb 7 15:19 gc.log
But no success; I'm still having the same issue.
Thanks
Make sure you are running CMD as Administrator.
This error also happens if you are using docker and running the container as a different user. You have to add the --group-add flag to the docker command or set the TAKE_FILE_OWNERSHIP environment variable as mentioned here
Using docker-compose:
user: 1007:1007
group_add:
- 0
Using docker:
--group-add 0
Firstly, I don't know why the gc.log file was not present. Have you changed the logs folder path or something? The gc.log path can be set in the jvm.options file. By default, ES logs and Java garbage collection logs are written to the logs folder inside the $ES_HOME directory.
From the user perspective, Elasticsearch can't be run as the root user. The directory listing shows you have an elasticsearch user created, and the cluster is run by that user.
The problem here can be solved by changing the permissions of the files inside the ES directory to match. Right now the gc.log file is owned by the root user, so it cannot be accessed by the elasticsearch user.
Try this: sudo chown <user> <path/to/es/directory> -R
Here that becomes: sudo chown elasticsearch elasticsearch/ -R
If the issue still persists, check the jvm.options file to confirm it's all configured correctly. Unless you change the -Xloggc:logs/gc.log option, the gc.log won't be pushed to /var/log.
Feb 09 17:09:02 server elasticsearch[2199]: Invalid -Xlog option '-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m', see error log for details.
Your log says the option is given as file=/var/log/elasticsearch/gc.log. Correct any wrong configuration as per the documentation: https://www.elastic.co/guide/en/elasticsearch/reference/master/jvm-options.html
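As a quick pre-flight check, you can confirm who owns the log directory before starting the service. A minimal sketch, assuming the Debian package's default path; check_owner is a hypothetical helper:

```shell
# Print the owning user of a directory, or "missing" if it doesn't exist.
# gc.log must be creatable by the 'elasticsearch' user in this directory.
check_owner() {
  dir="$1"
  if [ -d "$dir" ]; then
    stat -c '%U' "$dir"
  else
    echo "missing"
  fi
}
check_owner /var/log/elasticsearch
# If this prints root instead of elasticsearch, run:
#   sudo chown -R elasticsearch:elasticsearch /var/log/elasticsearch
```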
sudo systemctl -l status elasticsearch.service
Returns this log:
● elasticsearch.service - Elasticsearch
Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/elasticsearch.service.d
└─override.conf
Active: failed (Result: exit-code) since Sun 2020-02-09 17:09:02 CET; 2min 48s ago
Docs: http://www.elastic.co
Process: 2199 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -p ${PID_DIR}/elasticsearch.pid --quiet (code=exited, status=1/FAILURE)
Main PID: 2199 (code=exited, status=1/FAILURE)
Feb 09 17:09:02 server elasticsearch[2199]: Invalid -Xlog option '-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m', see error log for details.
Feb 09 17:09:02 server elasticsearch[2199]: Error: Could not create the Java Virtual Machine.
Feb 09 17:09:02 server elasticsearch[2199]: Error: A fatal exception has occurred. Program will exit.
Feb 09 17:09:02 server elasticsearch[2199]: at org.elasticsearch.tools.launchers.JvmErgonomics.flagsFinal(JvmErgonomics.java:118)
Feb 09 17:09:02 server elasticsearch[2199]: at org.elasticsearch.tools.launchers.JvmErgonomics.finalJvmOptions(JvmErgonomics.java:86)
Feb 09 17:09:02 server elasticsearch[2199]: at org.elasticsearch.tools.launchers.JvmErgonomics.choose(JvmErgonomics.java:59)
Feb 09 17:09:02 server elasticsearch[2199]: at org.elasticsearch.tools.launchers.JvmOptionsParser.main(JvmOptionsParser.java:92)
Feb 09 17:09:02 server systemd[1]: elasticsearch.service: Main process exited, code=exited, status=1/FAILURE
Feb 09 17:09:02 server systemd[1]: elasticsearch.service: Failed with result 'exit-code'.
Feb 09 17:09:02 server systemd[1]: Failed to start Elasticsearch.
At this point I'm doing a fresh install. I'm unable to find the solution, and I need to continue working...

Marathon exited with status 1

I am installing Mesosphere on Ubuntu 16.04 (xenial). ZooKeeper, mesos-master and mesos-slave are running fine, but while starting Marathon I am getting this issue:
Required option 'master' not found. I have created the folder /etc/marathon/conf. These are the steps I am following for Marathon:
sudo mkdir -p /etc/marathon/conf
sudo cp /etc/mesos-master/hostname /etc/marathon/conf
sudo cp /etc/mesos/zk /etc/marathon/conf/master
sudo cp /etc/marathon/conf/master /etc/marathon/conf/zk
sudo nano /etc/marathon/conf/zk and change mesos to marathon at the end.
I am attaching the whole logs here,
Jan 25 14:18:01 master01 cron[859]: (*system*) INSECURE MODE (group/other writable) (/etc/crontab)
Jan 25 14:18:01 master01 cron[859]: (*system*popularity-contest) INSECURE MODE (group/other writable) (/etc/cron.d/popularity-contest)
Jan 25 14:18:01 master01 cron[859]: (*system*php) INSECURE MODE (group/other writable) (/etc/cron.d/php)
Jan 25 14:18:01 master01 cron[859]: (*system*anacron) INSECURE MODE (group/other writable) (/etc/cron.d/anacron)
Jan 25 14:18:29 master01 systemd[1]: marathon.service: Service hold-off time over, scheduling restart.
Jan 25 14:18:29 master01 systemd[1]: Stopped Scheduler for Apache Mesos.
Jan 25 14:18:29 master01 systemd[1]: Starting Scheduler for Apache Mesos...
Jan 25 14:18:29 master01 systemd[1]: Started Scheduler for Apache Mesos.
Jan 25 14:18:29 master01 marathon[29366]: No start hook file found ($HOOK_MARATHON_START). Proceeding with the start script.
Jan 25 14:18:30 master01 marathon[29366]: [scallop] Error: **Required option 'master' not found**
Jan 25 14:18:30 master01 systemd[1]: marathon.service: Main process exited, code=exited, status=1/FAILURE
Jan 25 14:18:30 master01 systemd[1]: marathon.service: Unit entered failed state.
Jan 25 14:18:30 master01 systemd[1]: marathon.service: Failed with result 'exit-code'.
Breaking Changes / Packaging standardized
We now publish more normalized packages that attempt to follow Linux Standard Base Guidelines and use sbt-native-packager to achieve this. As a result of this and the many historic ways of passing options into marathon, we will only read /etc/default/marathon when starting up. This file, like /etc/sysconfig/marathon, has all marathon command line options as "MARATHON_XXX=YYY" which will translate to --xxx=yyy. We no longer support /etc/marathon/conf, which was a set of files that would get translated into command line arguments. In addition, we no longer assume that if there is no zk/master argument passed in, then both are running on localhost.
Try to keep config in the environment.
cat << EOF > /etc/default/marathon
MARATHON_MASTER=zk://127.0.0.1:2181/mesos
MARATHON_ZK=zk://127.0.0.1:2181/marathon
EOF
Remember to replace 127.0.0.1:2181 with proper Zookeeper location.
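The mapping the changelog describes is mechanical: each MARATHON_XXX=YYY entry becomes a --xxx=yyy flag. A small sketch of that translation (to_flag is a hypothetical helper, shown only to illustrate the naming convention):

```shell
# Translate a MARATHON_XXX=YYY environment entry into its --xxx=yyy flag:
# strip the MARATHON_ prefix, lowercase, and turn underscores into dashes.
to_flag() {
  name="${1%%=*}"
  value="${1#*=}"
  flag="$(echo "${name#MARATHON_}" | tr 'A-Z_' 'a-z-')"
  echo "--${flag}=${value}"
}
to_flag "MARATHON_MASTER=zk://127.0.0.1:2181/mesos"
# → --master=zk://127.0.0.1:2181/mesos
```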
I am using Ubuntu 14.04. In my case janisz's solution did not work, as I needed to add export:
cat << EOF > /etc/default/marathon
export MARATHON_MASTER=zk://127.0.0.1:2181/mesos
export MARATHON_ZK=zk://127.0.0.1:2181/marathon
EOF

Chef isn't running the apt (apt-get update) recipe. Apt returns 100

Running Ubuntu 11.04 on Vagrant, Mac OS X 10.7.2. Running Chef server.
Trying to install the postgresql community Chef recipe, I get the following error, even though my base role looks something like this (I added the apt recipe to try to update apt-get):
name "base"
description "The base role for systems"
run_list(
"recipe[apt]",
"recipe[vim]"
)
Trying to do a chef run:
$ vagrant reload db1dev
[db1dev] Attempting graceful shutdown of linux...
[db1dev] Preparing host only network...
[db1dev] Clearing any previously set forwarded ports...
[db1dev] Forwarding ports...
[db1dev] -- ssh: 22 => 2222 (adapter 1)
[db1dev] Cleaning previously set shared folders...
[db1dev] Creating shared folders metadata...
[db1dev] Running any VM customizations...
[db1dev] Booting VM...
[db1dev] Waiting for VM to boot. This can take a few minutes.
[db1dev] VM booted and ready for use!
[db1dev] Enabling host only network...
[db1dev] Mounting shared folders...
[db1dev] -- v-root: /vagrant
[db1dev] Running provisioner: Vagrant::Provisioners::ChefClient...
[db1dev] Creating folder to hold client key...
[db1dev] Uploading chef client validation key...
[db1dev] Generating chef JSON and uploading...
[db1dev] Running chef-client...
[db1dev] stdin: is not a tty
: stderr
[db1dev] [Thu, 19 Jan 2012 21:44:45 -0800] INFO: *** Chef 0.10.2 ***
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:46 -0800] INFO: Client key /etc/chef/client.pem is not present - registering
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:46 -0800] INFO: HTTP Request Returned 404 Not Found: Cannot load node dev-vagrant-db1-andres
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Setting the run_list to ["role[base]", "role[db_master]"] from JSON
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Run List is [role[base], role[db_master]]
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Run List expands to [base_server, vim, postgresql::server]
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Starting Chef Run for dev-vagrant-db1-andres
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Loading cookbooks [base_server, openssl, postgresql, vim]
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Removing cookbooks/apt/resources/repository.rb from the cache; its cookbook is no longer needed on this client.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Removing cookbooks/apt/metadata.rb from the cache; its cookbook is no longer needed on this client.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Removing cookbooks/apt/providers/repository.rb from the cache; its cookbook is no longer needed on this client.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Removing cookbooks/apt/recipes/cacher.rb from the cache; its cookbook is no longer needed on this client.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Removing cookbooks/apt/recipes/cacher-client.rb from the cache; its cookbook is no longer needed on this client.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Removing cookbooks/apt/recipes/default.rb from the cache; its cookbook is no longer needed on this client.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Removing cookbooks/apt/metadata.json from the cache; its cookbook is no longer needed on this client.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Removing cookbooks/apt/README.md from the cache; its cookbook is no longer needed on this client.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Storing updated cookbooks/base_server/recipes/default.rb in the cache.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Storing updated cookbooks/base_server/README.rdoc in the cache.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Storing updated cookbooks/base_server/metadata.rb in the cache.
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:47 -0800] INFO: Processing package[postgresql-client] action install (postgresql::client line 37)
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:48 -0800] ERROR: package[postgresql-client] (postgresql::client line 37) has had an error
[Thu, 19 Jan 2012 21:44:48 -0800] ERROR: Running exception handlers
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:48 -0800] FATAL: Saving node information to /srv/chef/file_store/failed-run-data.json
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:48 -0800] ERROR: Exception handlers complete
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:48 -0800] FATAL: Stacktrace dumped to /srv/chef/file_store/chef-stacktrace.out
: stdout
[db1dev] [Thu, 19 Jan 2012 21:44:48 -0800] FATAL: Chef::Exceptions::Exec: package[postgresql-client] (postgresql::client line 37) had an error: apt-get -q -y install postgresql-client=8.4.8-0ubuntu0.10.04 returned 100, expected 0
: stdout
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!
chef-client -c /etc/chef/client.rb -j /etc/chef/dna.json
The output of the command prior to failing is outputted below:
[no output]
If I manually do a sudo apt-get update from inside the server, another run of chef-client installs postgres without a problem. Any ideas on why the apt recipe is not running? (I also know it's not running because the timestamp file for apt-get is not created in Ubuntu.) Any help would be much appreciated.
So, from the error you printed, it seems that running the command
apt-get -q -y install postgresql-client=8.4.8-0ubuntu0.10.04
is failing.
But you said that if you run
apt-get update
First, and then re-provision, it works fine?
My hunch is that when you first run chef, the version of postgresql client you are requesting is not in your downloaded list of apt packages, but running the apt-get update finds that version.
I see you are running a base_server recipe first, is that adding apt repositories to the list?
Regardless, I would recommend running an apt-get update prior to running the postgres recipe; that would probably be appropriate to put into the base_server recipe.
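The ordering fix, stripped of Chef, is just: refresh the package lists, then install the pinned version. A side-effect-free sketch (update_then_install is a hypothetical wrapper that only prints the commands it would run; the package pin is the one from the failing run):

```shell
# apt-get exits with status 100 when the pinned version is absent from the
# local package lists; refreshing the lists first is exactly what the apt
# cookbook's default recipe does for you.
update_then_install() {
  echo "apt-get update"
  echo "apt-get -q -y install $1"
}
update_then_install "postgresql-client=8.4.8-0ubuntu0.10.04"
```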