vertica database autostart not running - vertica

I have a Vertica cluster installed on 3 hosts. I want the database to start automatically when the hosts boot (that is, after all hosts were shut down and are powered on again). On every host I added a script, myscript.sh, to /etc/init.d/:
sudo -u myuser /opt/vertica/bin/admintools -t start_db -d test
When I run this script manually, it works and the database starts. But when the script is launched at OS startup, the database does not start. In adminTools.log I don't see any startup errors; I see only "pexpecting vsql command..." and "All nodes in db test are in state DOWN":
2020-06-08 16:42:08.526 agent/752:0x7f195dffb700 [vsql._just_connect] <INFO> pexpecting vsql command: /opt/vertica/bin/vsql --no-vsqlrc -n -p 5433 -U myuser -h 192.168.0.5 test -P pager -A
2020-06-08 16:42:08.984 agent/752:0x7f195dffb700 [vsql._just_connect] <INFO> pexpecting vsql command: /opt/vertica/bin/vsql --no-vsqlrc -n -p 5433 -U myuser -h 192.168.0.6 test -P pager -A
2020-06-08 16:42:09.459 agent/752:0x7f195dffb700 [vsql._just_connect] <INFO> pexpecting vsql command: /opt/vertica/bin/vsql --no-vsqlrc -n -p 5433 -U myuser -h 192.168.0.7 test -P pager -A
2020-06-08 16:43:05.639 admintools/3701:0x7f456298c740 [adminExec.getCollapsedClusterState] <INFO> All nodes in db test are in state DOWN
Why is that?

Yes, my OS is CentOS 7.
But the verticad service doesn't work.
I run systemctl start verticad
Then I run: systemctl status verticad
verticad.service - Vertica server restart oneshot
Loaded: loaded (/etc/systemd/system/verticad.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Thu 2020-06-11 07:19:43 MSK; 38min ago
Process: 650 ExecStart=/opt/vertica/sbin/verticad start (code=exited, status=0/SUCCESS)
Main PID: 650 (code=exited, status=0/SUCCESS)
Jun 11 07:19:09 verticaserv1 systemd[1]: Starting Vertica server restart oneshot...
Jun 11 07:19:13 verticaserv1 su[706]: (to mydba) root on none
Jun 11 07:19:43 verticaserv1 verticad[650]: Vertica: start OK for users: mydba
Jun 11 07:19:43 verticaserv1 verticad[650]: [ OK ]
Jun 11 07:19:43 verticaserv1 systemd[1]: Started Vertica server restart oneshot.
In the logs (/opt/vertica/log/verticad.log and /var/log/messages) I see only:
vertica process is not running
Vertica: start not OK
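Since adminTools.log shows no explicit error, one way to narrow this down is to capture what the boot-time invocation actually sees and prints. The following is a minimal sketch, not a Vertica feature: run_logged is a hypothetical helper name, and it only records the user, PATH, output and exit status of whatever command it is given.

```shell
#!/bin/bash
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper: appends a timestamp, the current user, PATH,
# the command's combined stdout/stderr and its exit status to LOGFILE.
run_logged() {
    local logfile="$1"
    local status
    shift
    {
        echo "=== $(date '+%F %T') user=$(id -un) PATH=$PATH"
        echo "+ $*"
        "$@"
        status=$?
        echo "exit status: $status"
    } >> "$logfile" 2>&1
    return "$status"
}
```

In the startup script, the original command could then be wrapped as run_logged /var/log/vertica_autostart.log sudo -u myuser /opt/vertica/bin/admintools -t start_db -d test (the log path is an assumption chosen for illustration). Comparing the log from a manual run with the one from boot may show whether the environment or the timing differs.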

Related

Problem debugging sudo shell script executed from cron

I have a shell script which renews a LetsEncrypt SSL certificate. If I run it from a bash prompt it works fine. If I execute from the crontab it gets triggered but none of the commands seem to be executed.
The script is:
sudo systemctl stop nginx
sudo docker stop home-assistant
sudo certbot renew
sudo docker start home-assistant
sudo systemctl start nginx
If I examine the letsencrypt log afterwards (/var/log/letsencrypt/letsencrypt.log) there is no trace of the "certbot renew" command having been executed if it's run from cron. If I run it manually I see entries in this log.
syslog has this for the time the cron job is executed:
Sep 1 19:27:01 HP-MICROSERVER cron[106193]: (myuser) RELOAD (crontabs/myuser)
Sep 1 19:27:01 HP-MICROSERVER CRON[107237]: (myuser) CMD (bash -l /usr/bin/sudo /home/myuser/renew_ssl.sh)
Sep 1 19:27:01 HP-MICROSERVER postfix/pickup[106963]: 6E68F132832: uid=1000 from=<myuser>
Sep 1 19:27:01 HP-MICROSERVER postfix/cleanup[107242]: 6E68F132832: message-id=<20220901182701.6E68F132832@HP-MICROSERVER.localdomain>
Sep 1 19:27:01 HP-MICROSERVER postfix/qmgr[102195]: 6E68F132832: from=<myuser@HP-MICROSERVER.localdomain>, size=717, nrcpt=1 (queue active)
Sep 1 19:27:01 HP-MICROSERVER postfix/local[107244]: warning: dict_nis_init: NIS domain name not set - NIS lookups disabled
Sep 1 19:27:01 HP-MICROSERVER postfix/local[107244]: 6E68F132832: to=<myuser@HP-MICROSERVER.localdomain>, orig_to=<myuser>, relay=local, delay=0.07, delays=0.04/0.01/0/0.01, dsn=2.0.0, status=sent (delivered to mailbox)
Sep 1 19:27:01 HP-MICROSERVER postfix/qmgr[102195]: 6E68F132832: removed
I've added this to /etc/sudoers to enable the script to run as sudo from the cron:
myuser ALL=(root) NOPASSWD: /home/myuser/renew_ssl.sh
Any idea what I need to do to get this to execute properly from the crontab?
The problem was resolved by using sudo crontab -e to run the script from root's crontab (it was a permissions issue).
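Once the script runs from root's crontab, the sudo prefixes inside it are no longer needed. A minimal sketch of that variant follows; the DRY_RUN switch and the run and renew_ssl function names are hypothetical additions for illustration, so the command sequence can be checked without touching nginx, Docker or certbot.

```shell
#!/bin/bash
# Sketch of renew_ssl.sh for root's crontab: no sudo prefixes.
# DRY_RUN is a hypothetical switch; when set to 1 the commands are
# printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

renew_ssl() {
    # Refuse to do real work unless running as root.
    if [ "${DRY_RUN:-0}" != "1" ] && [ "$(id -u)" -ne 0 ]; then
        echo "renew_ssl: must run as root" >&2
        return 1
    fi
    run systemctl stop nginx
    run docker stop home-assistant
    run certbot renew
    run docker start home-assistant
    run systemctl start nginx
}
```

Calling renew_ssl at the end of the script and running it once with DRY_RUN=1 prints the five commands in order. The root check makes a wrong-user invocation fail with a message that cron will mail, rather than failing silently.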

having issues getting systemd process to execute successfully?

I am currently trying to set up a bash script that records the IPv6 address of a host machine (a Raspberry Pi running Buster) and writes it to an env file that the script creates. The script runs fine if I execute it manually, but it doesn't seem to execute successfully on boot when enabled; I don't see the env file it is supposed to create in the appropriate directory.
I wonder if this is a permissions issue with creating the .env file. Could someone shed light on how to troubleshoot this? I have set the following permissions on the files and also tried passing the script to bash with -c in the ExecStart option.
OS and hardware:
OS: buster lite
host: raspberry pi 4
Permissions and prep:
# copy file to the bin folder and add permissions
cp /home/pi/my_project/ip_addr/linux_get_ip.sh /usr/local/bin/linux_get_ip.sh
sudo chmod 744 /usr/local/bin/linux_get_ip.sh
sudo chmod 644 /etc/systemd/system/get_ip.service
Bash script: linux_get_ip.sh
#!/bin/bash
# test existence of a directory
cd ./my_project/ip_addr/
# create an env file
touch .env
# assign the ip address to that env file
MACHINE_HOST_IP="$(hostname -I | cut -d " " -f 2)"
echo "${MACHINE_HOST_IP}"
echo "$(ls)"
destdir=.env
echo "MACHINE_HOST_IP=$MACHINE_HOST_IP" > "$destdir"
Service: get_ip.service
[Unit]
Description=Get IP address of local machine
After=multi-user.target
[Service]
Type=idle
ExecStart=/bin/bash -c /usr/local/bin/linux_get_ip.sh
[Install]
WantedBy=multi-user.target
systemctl status get_ip.service
● get_ip.service - Get IP address of local machine
Loaded: loaded (/etc/systemd/system/get_ip.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Mon 2022-04-04 12:26:31 EDT; 8min ago
Process: 1235 ExecStart=/bin/bash -c /usr/local/bin/linux_get_ip.sh (code=exited, status=0/SUCCESS)
Main PID: 1235 (code=exited, status=0/SUCCESS)
CPU: 51ms
Apr 04 12:26:31 raspberrypi bash[1235]: proc
Apr 04 12:26:31 raspberrypi bash[1235]: root
Apr 04 12:26:31 raspberrypi bash[1235]: run
Apr 04 12:26:31 raspberrypi bash[1235]: sbin
Apr 04 12:26:31 raspberrypi bash[1235]: srv
Apr 04 12:26:31 raspberrypi bash[1235]: sys
Apr 04 12:26:31 raspberrypi bash[1235]: tmp
Apr 04 12:26:31 raspberrypi bash[1235]: usr
Apr 04 12:26:31 raspberrypi bash[1235]: var
Apr 04 12:26:31 raspberrypi systemd[1]: get_ip.service: Succeeded.
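The journal output above lists entries such as proc, root, run and sbin, which are the contents of /. That suggests the relative cd ./my_project/ip_addr/ did not land in the project directory when systemd started the script. A minimal sketch that avoids depending on the working directory is shown here; write_ip_env is a hypothetical function name, and the target directory is passed in explicitly.

```shell
#!/bin/bash
# write_ip_env ENV_DIR
# Writes MACHINE_HOST_IP=<address> to ENV_DIR/.env using an explicit path,
# so the result does not depend on the directory the service starts in.
write_ip_env() {
    local env_dir="$1"
    local ip
    # Fail visibly if the directory is missing instead of writing elsewhere.
    if [ ! -d "$env_dir" ]; then
        echo "write_ip_env: no such directory: $env_dir" >&2
        return 1
    fi
    # Same extraction as the original script: second field of `hostname -I`.
    ip="$(hostname -I 2>/dev/null | cut -d ' ' -f 2)"
    echo "MACHINE_HOST_IP=${ip}" > "${env_dir}/.env"
}
```

A call such as write_ip_env /home/pi/my_project/ip_addr would use the project path from the cp command above (assuming that is where the .env file is wanted). Setting WorkingDirectory= in the unit is another option, but an absolute path keeps the script correct however it is launched.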

Run Bash script as root in startup on ubuntu 18.04

I wanted to run a bash script as root at startup. I first tried rc.local and crontab, but neither worked.
Create a service file based on the template below and place it in /etc/systemd/system/.
The template:
[Unit]
Description = ~Name of the service~
[Service]
WorkingDirectory= ~directory of working file~
ExecStart= ~directory~/filename.sh
[Install]
WantedBy=multi-user.target
Start the service by name using
systemctl start servicefile.service
To enable on startup
systemctl enable servicefile.service
To check the status
systemctl status servicefile.service
To stop
systemctl stop servicefile.service
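The template can also be generated from a small script, which avoids typos in the unit file. This is a minimal sketch: write_unit is a hypothetical helper, its three arguments are placeholders, and it assumes the working directory should be the directory that contains the script.

```shell
#!/bin/bash
# write_unit UNIT_DIR NAME SCRIPT_PATH
# Hypothetical helper: writes NAME.service into UNIT_DIR following the
# template above. SCRIPT_PATH should be an absolute path.
write_unit() {
    local unit_dir="$1" name="$2" script="$3"
    cat > "${unit_dir}/${name}.service" <<EOF
[Unit]
Description=${name}

[Service]
WorkingDirectory=$(dirname "$script")
ExecStart=${script}

[Install]
WantedBy=multi-user.target
EOF
}
```

Run as root, write_unit /etc/systemd/system servicefile /path/to/filename.sh followed by systemctl daemon-reload makes systemd pick up the new file before the start and enable commands above are used.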
Create a systemd unit file in /etc/systemd/system/ and use it to execute your script. (i.e. hello-world.service).
[Unit]
Description=Hello world
After=sysinit.target
StartLimitIntervalSec=0
[Service]
Type=simple
Restart=no
RemainAfterExit=yes
User=root
ExecStart=/bin/echo hello world
ExecStop=/bin/echo goodby world
[Install]
WantedBy=multi-user.target
Now you can use it through systemctl as you would with other services.
$ systemctl enable hello-world
$ systemctl start hello-world
$ systemctl stop hello-world
$ systemctl status hello-world
● hello-world.service - Hello world
Loaded: loaded (/etc/systemd/system/hello-world.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Wed 2019-10-09 13:54:58 CEST; 1min 47s ago
Process: 11864 ExecStop=/bin/echo goodby world (code=exited, status=0/SUCCESS)
Main PID: 11842 (code=exited, status=0/SUCCESS)
Oct 09 13:54:38 lnxclnt1705 systemd[1]: Started Hello world.
Oct 09 13:54:38 lnxclnt1705 echo[11842]: hello world
Oct 09 13:54:57 lnxclnt1705 systemd[1]: Stopping Hello world...
Oct 09 13:54:57 lnxclnt1705 echo[11864]: goodby world
Oct 09 13:54:58 lnxclnt1705 systemd[1]: Stopped Hello world.
Make sure that you use the full path to your script in the unit file (i.e. /bin/echo). Check out the documentation for the keys used in the [Unit] and [Service] sections of hello-world.service.
Place the script inside /etc/init.d
Ensure that it has the extension '.sh'
For the crontab, it makes a difference whether you set up the user's crontab or root's crontab:
$ crontab -e
@reboot sudo ...
^^ This is the user's crontab and won't work as it is.
$ sudo crontab -e
@reboot ...
^^ This is root's crontab and will run the command as root.
The @reboot entry should do the trick of running scripts after startup.

Cannot SSH to Docker Container Running on MAC

I cannot access SSH or HTTP-alt. The Ubuntu container is running on MacOSX. I assume both SSH and HTTP-alt are problematic for the same reason. I am using dockerfile and docker-compose for the setup. Because I am a novice with docker, there may be redundant commands. My host machine has the firewall disabled.
dockerfile
<-- output omitted for brevity -->
# ports
EXPOSE 22 8080
docker-compose
version: '3'
services:
base:
image: cox-nams:1.0
container_name: cox-nams
hostname: neteng-docker
stdin_open: true
ports:
- "10000:22" # ssh
- "10001:8080" # jupyter
<-- output omitted for brevity -->
Initializing Commands
$ docker exec -it cox-nams /bin/bash
Docker output
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b37789c4660c ba397d1c07cd "/bin/sh -c 'service…" 34 minutes ago Up 34 minutes 0.0.0.0:10000->22/tcp, 0.0.0.0:10001->8080/tcp cox-nams
Ports within the Container
duser@neteng-docker:~$ netstat -at | grep LISTEN
tcp 0 0 0.0.0.0:http-alt 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:ssh 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.11:46461 0.0.0.0:* LISTEN
tcp6 0 0 [::]:ssh [::]:* LISTEN
SSH from within the Container
duser@neteng-docker:~$ ssh duser@localhost -p 22
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:la2X7X8gZj7t8DQC7rwHTalMBHYC9oVggfYzATuzkyM.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
duser@localhost's password:
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.14.134-boot2docker x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
This system has been minimized by removing packages and content that are
not required on a system that users do not log into.
To restore this content, you can run the 'unminimize' command.
Last login: Fri Aug 30 18:38:54 2019 from 127.0.0.1
duser@neteng-docker:~$
SSH from the Host
$ ssh duser@localhost -p 10000
ssh: connect to host localhost port 10000: Connection refused
Services
root@neteng-docker:/# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 03:37 ? 00:00:00 /bin/sh -c service ssh restart && bash
root 18 1 0 03:37 ? 00:00:00 /usr/sbin/sshd
root 19 1 0 03:37 ? 00:00:00 bash
root 20 0 0 03:37 pts/0 00:00:00 /bin/bash
root 55 20 0 03:40 pts/0 00:00:00 ps -ef
root@neteng-docker:/# service --status-all
[ - ] dbus
[ ? ] hwclock.sh
[ - ] procps
[ + ] ssh
EDIT: Added services output
You can use this Dockerfile
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y openssh-server
RUN mkdir /var/run/sshd
RUN echo 'root:THEPASSWORDYOUCREATED' | chpasswd
RUN sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config
# SSH login fix. Otherwise user is kicked off after login
RUN sed 's@session\s*required\s*pam_loginuid.so@session optional pam_loginuid.so@g' -i /etc/pam.d/sshd
ENV NOTVISIBLE "in users profile"
RUN echo "export VISIBLE=now" >> /etc/profile
EXPOSE 22
CMD ["/usr/sbin/sshd", "-D"]
This exposes SSH on port 22 of the container. You can then run the following command to find out which host port is mapped to the container's port 22:
docker port <name of container> 22
This sample application provides a solution to your problem. Have a look at it.
https://docs.docker.com/engine/examples/running_ssh_service/
Sadly, this ended up being an appliance firewall issue, which I troubleshot using "nc -l 22" on the server and "telnet IP 22" on the client (Linux machines).
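When telnet or nc is not installed on the client, a similar reachability check can be sketched with bash alone. This assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available; port_open is a hypothetical helper name.

```shell
#!/bin/bash
# port_open HOST PORT [TIMEOUT_SECONDS]
# Succeeds if a TCP connection to HOST:PORT can be opened.
# Relies on bash's /dev/tcp redirection and coreutils `timeout`.
port_open() {
    local host="$1" port="$2" wait="${3:-3}"
    # In the inner shell, $0 is the host and $1 is the port.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For the setup in the question, port_open localhost 10000 && echo reachable || echo blocked would distinguish a refused or filtered port from an SSH-level problem. A timeout (rather than an immediate refusal) tends to point at a firewall dropping packets.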

systemd does not start service - Failed at step USER spawning

I am trying to write a systemd service unit. It starts with the root user creating a nologin user and giving it privileges. Then the nologin user starts the application.
I am on RHEL 7.5 (Maipo) with Linux-5.0.7-2019.05.28.x86_64. Here is what I tried.
/root/myhome/my_setup.sh:
#!/bin/bash
# Create a nologin user with a working directory. Make it the owner of the DB files and the binaries it runs.
crdb_setup() {
/bin/mkdir -p /var/lib/lsraj /root/crdb || return $?
/usr/bin/getent group lsraj || /usr/sbin/groupadd -g 990 lsraj|| return $?
/usr/bin/getent passwd lsraj || /usr/sbin/useradd -u 990 -g 990 \
-c 'CRDB User' -d /var/lib/lsraj -s /sbin/nologin -M -K UMASK=022 lsraj || return $?
/bin/chown lsraj:lsraj /var/lib/lsraj /root/crdb /root/myhome/cockroach || return $?
}
crdb_setup
[root@lsraj ~]#
total 99896
-rwxr-xr-x 1 root root 102285942 Jun 18 16:54 cockroach
-rwxr-xr-x 1 root root 521 Jun 18 17:07 my_setup.sh
[root@lsraj ~]#
Service script:
[root@lsraj~]# cat /usr/lib/systemd/system/lsraj.service
[Unit]
Description=Cockroach Database Service
After=network.target syslog.target
[Service]
Type=notify
# run the script with root privileges. The script creates user and gives him privileges.
ExecStartPre=+/root/myhome/my_setup.sh
User=lsraj
Group=lsraj
WorkingDirectory=/var/lib/lsraj
ExecStart=/root/myhome/cockroach start --insecure --host=localhost --store=/root/crdb
ExecStop=/root/myhome/cockroach quit --insecure --host=localhost
StandardOutput=journal
Restart=on-failure
RestartSec=60s
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=cockroachdb
[Install]
WantedBy=multi-user.target
[root@lsraj~]#
Jun 18 17:30:51 lsraj systemd: [/usr/lib/systemd/system/lsraj.service:8] Executable path is not absolute, ignoring: +/root/myhome/my_setup.sh
Jun 18 17:30:51 lsraj systemd: Starting Cockroach Database Service...
Jun 18 17:30:51 lsraj systemd: Failed at step USER spawning /root/myhome/cockroach: No such process
Jun 18 17:30:51 lsraj systemd: lsraj.service: main process exited, code=exited, status=217/USER
Jun 18 17:30:51 lsraj systemd: Failed at step USER spawning /root/myhome/cockroach: No such process
Jun 18 17:30:51 lsraj systemd: lsraj.service: control process exited, code=exited status=217
Jun 18 17:30:51 lsraj systemd: Failed to start Cockroach Database Service.
Jun 18 17:30:51 lsraj systemd: Unit lsraj.service entered failed state.
Jun 18 17:30:51 lsraj systemd: lsraj.service failed.
I've moved my comment here to support richer formatting.
I can not advise on your need for the '+', I am only reading the error message for you which says systemd is ignoring the ExecStartPre path because it is not absolute.
Maybe this is a feature that exists in freedesktop.org, but my Redhat 7.6 release (which is what you indicate that you are using) does not include a similar statement (or table) in the systemd.service unit file man page. Plus you are getting a very clear error message about that line in your unit file.
The man page mentions "-" and "@", but none of the others...
Here is an extract from the man page (and I've provided a link above to the full page).
ExecStartPre=, ExecStartPost=
Additional commands that are executed before or after the command in ExecStart=, respectively. Syntax is the same as for ExecStart=, except that multiple command lines are allowed and the commands are executed one after the other, serially.
If any of those commands (not prefixed with "-") fail, the rest are not executed and the unit is considered failed.
Note that ExecStartPre= may not be used to start long-running processes. All processes forked off by processes invoked via ExecStartPre= will be killed before the next service process is run.
I suggest trying to remove the "+" first and see what happens, then progress from there.
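Whether the "+" prefix is understood depends on the systemd version installed. As far as I recall, the prefix was introduced in systemd 231, while RHEL 7 ships systemd 219; treat both numbers as assumptions to verify against the release notes for your system. A minimal sketch of the check, with hypothetical function names:

```shell
#!/bin/bash
# supports_plus_prefix VERSION
# Succeeds if VERSION is at least 231 (assumed threshold, see the note above).
supports_plus_prefix() {
    local version="$1"
    [ "$version" -ge 231 ]
}

# systemd_version prints the numeric version from `systemctl --version`,
# whose first line looks like "systemd 219".
systemd_version() {
    systemctl --version | awk 'NR==1 {print $2}'
}
```

On the affected host, supports_plus_prefix "$(systemd_version)" || echo "'+' prefix not supported here" would confirm the diagnosis. Because the ExecStartPre line is ignored, the lsraj user is never created, which is consistent with the later "Failed at step USER" message.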

Resources