OKD4.4 compute node failed with "Internal Server Error" - etcd

I tried to deploy OKD 4.4 on my home cluster using the following doc =>
https://medium.com/@craig_robinson/openshift-4-4-okd-bare-metal-install-on-vmware-home-lab-6841ce2d37eb
The "services", "bootstrap" and "control-plane" nodes went smoothly (at least the output on screen is similar to those in the doc).
However, when I deployed the "compute" (worker) nodes, they failed to start up with the following error:
ignition[xxx]: GET https://api-int.lab.xxxtest.com:22623/config/worker: attempt #xxx
ignition[xxx]: GET result: Internal Server Error
A check on the bootstrap node (journalctl -u bootkube | grep bootkube.sh | tail):
[root@okd4-bootstrap openshift]# journalctl -u bootkube | grep bootkube.sh | tail
Apr 07 05:22:14 okd4-bootstrap.lab.xxxtest.com bootkube.sh[4838]: Error: unhealthy cluster
Apr 07 05:22:14 okd4-bootstrap.lab.xxxtest.com bootkube.sh[4838]: etcdctl failed. Retrying in 5 seconds...
Apr 07 05:22:24 okd4-bootstrap.lab.xxxtest.com bootkube.sh[4838]: {"level":"warn","ts":"2020-04-07T05:22:24.872Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-57584517-34e6-40c3-b945-0b920fb059e6/localhost:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp [::1]:2379: connect: connection refused\""}
Apr 07 05:22:24 okd4-bootstrap.lab.xxxtest.com bootkube.sh[4838]: https://localhost:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Apr 07 05:22:24 okd4-bootstrap.lab.xxxtest.com bootkube.sh[4838]: Error: unhealthy cluster
Apr 07 05:22:24 okd4-bootstrap.lab.xxxtest.com bootkube.sh[4838]: etcdctl failed. Retrying in 5 seconds...
Apr 07 05:22:35 okd4-bootstrap.lab.xxxtest.com bootkube.sh[4838]: {"level":"warn","ts":"2020-04-07T05:22:35.347Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-304bfb54-2184-4c01-acdb-86850fbe9b8d/localhost:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp [::1]:2379: connect: connection refused\""}
Apr 07 05:22:35 okd4-bootstrap.lab.xxxtest.com bootkube.sh[4838]: https://localhost:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Apr 07 05:22:35 okd4-bootstrap.lab.xxxtest.com bootkube.sh[4838]: Error: unhealthy cluster
Apr 07 05:22:35 okd4-bootstrap.lab.xxxtest.com bootkube.sh[4838]: etcdctl failed. Retrying in 5 seconds...
[root@okd4-bootstrap openshift]#
Any idea what could have gone wrong?
It seems bootstrap is trying to start/connect to "etcd" on the "localhost" (bootstrap node).
Thanks.
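The two symptoms above are both network-visible: the worker gets an error from port 22623 on api-int, and bootkube reports "connection refused" on localhost:2379. A minimal sketch for checking from any node whether a TCP port accepts connections at all is below. The helper name port_open is hypothetical, and it only tests reachability, not whether etcd or the machine config server is healthy.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and connects; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

For example, port_open("localhost", 2379) on the bootstrap node and port_open("api-int.lab.xxxtest.com", 22623) from a worker would show whether anything is listening on the ports named in the logs.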

Related

AWS Linux 2 AMI Failed to get D-Bus connection: No such file or directory

I have an AWS Linux 2 AMI EC2 instance.
When running systemctl --user status I get the message:
Failed to get D-Bus connection: No such file or directory
I then ran systemctl start dbus.socket, which gave me this message:
Failed to start dbus.socket: The name org.freedesktop.PolicyKit1 was not provided by any .service files See system logs and 'systemctl status dbus.socket' for details.
I then ran systemctl status dbus.socket -l which returned this:
dbus.socket - D-Bus System Message Bus Socket
Loaded: loaded (/usr/lib/systemd/system/dbus.socket; static; vendor preset: disabled)
Active: active (running) since Thu 2022-03-31 21:26:42 UTC; 14h ago
Listen: /run/dbus/system_bus_socket (Stream)
Mar 31 21:26:42 ip-10-0-0-193.ec2.internal systemd[1]: Listening on D-Bus System Message Bus Socket.
Mar 31 21:26:42 ip-10-0-0-193.ec2.internal systemd[1]: Starting D-Bus System Message Bus Socket.
Running sudo systemctl --user status gives a different error:
Failed to get D-Bus connection: Connection refused
I'm unsure of what to investigate next or what steps to take to resolve the issue.
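One thing that may be worth checking, as an assumption rather than a confirmed diagnosis: systemctl --user locates the per-user manager through the XDG_RUNTIME_DIR environment variable, and sudo usually changes the environment, which could explain the two different errors. The sketch below reports whether that directory exists and whether the usual socket paths are present in it. The function name user_bus_report is hypothetical.

```python
import os
from typing import Mapping


def user_bus_report(env: Mapping[str, str]) -> dict:
    """Summarise what a user-session client would find under XDG_RUNTIME_DIR."""
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    report = {
        "XDG_RUNTIME_DIR": runtime_dir,
        "runtime_dir_exists": False,
        "bus": False,              # $XDG_RUNTIME_DIR/bus (user D-Bus socket)
        "systemd_private": False,  # $XDG_RUNTIME_DIR/systemd/private
    }
    if runtime_dir and os.path.isdir(runtime_dir):
        report["runtime_dir_exists"] = True
        report["bus"] = os.path.exists(os.path.join(runtime_dir, "bus"))
        report["systemd_private"] = os.path.exists(
            os.path.join(runtime_dir, "systemd", "private")
        )
    return report


if __name__ == "__main__":
    print(user_bus_report(os.environ))
```

Running it once as the normal user and once under sudo shows whether the two invocations see the same runtime directory.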

Failed to start Elasticsearch. Error opening log file '/gc.log': Permission denied

Dear StackOverflow community,
I was running Kibana/Elasticsearch without a problem until I installed a Kibana plugin. Then the service failed, and I noticed that the cause was that Elasticsearch had stopped. I tried several ways to fix it and even reinstalled everything, but Elasticsearch still fails to launch, even with a fresh installation.
Installation on Debian 9 using apt install.
systemctl start elasticsearch.service
results in:
Exception in thread "main" java.lang.RuntimeException: starting java failed with [1]
[0.000s][error][logging] Error opening log file '/gc.log': Permission denied
Full log with journalctl -xe
-- Unit elasticsearch.service has begun starting up.
Feb 07 14:09:06 Debian-911-stretch-64-minimal kibana[576]: {"type":"log","@timestamp":"2020-02-07T13:09:06Z","tags":["warning","elasticsearch","admin"],"pid":576,"message":"Unable to revive connection: http://localhost:9200/"}
Feb 07 14:09:06 Debian-911-stretch-64-minimal kibana[576]: {"type":"log","@timestamp":"2020-02-07T13:09:06Z","tags":["warning","elasticsearch","admin"],"pid":576,"message":"No living connections"}
Feb 07 14:09:06 Debian-911-stretch-64-minimal kibana[576]: {"type":"log","@timestamp":"2020-02-07T13:09:06Z","tags":["warning","elasticsearch","admin"],"pid":576,"message":"Unable to revive connection: http://localhost:9200/"}
Feb 07 14:09:06 Debian-911-stretch-64-minimal kibana[576]: {"type":"log","@timestamp":"2020-02-07T13:09:06Z","tags":["warning","elasticsearch","admin"],"pid":576,"message":"No living connections"}
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: Exception in thread "main" java.lang.RuntimeException: starting java failed with [1]
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: output:
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: [0.000s][error][logging] Error opening log file '/gc.log': Permission denied
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: [0.000s][error][logging] Initialization of output 'file=/var/log/elasticsearch/gc.log' using options 'filecount=32,filesize=64m' failed.
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: error:
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: Invalid -Xlog option '-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m', see error log for details.
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: Error: Could not create the Java Virtual Machine.
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: Error: A fatal exception has occurred. Program will exit.
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: at org.elasticsearch.tools.launchers.JvmErgonomics.flagsFinal(JvmErgonomics.java:118)
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: at org.elasticsearch.tools.launchers.JvmErgonomics.finalJvmOptions(JvmErgonomics.java:86)
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: at org.elasticsearch.tools.launchers.JvmErgonomics.choose(JvmErgonomics.java:59)
Feb 07 14:09:06 Debian-911-stretch-64-minimal elasticsearch[2312]: at org.elasticsearch.tools.launchers.JvmOptionsParser.main(JvmOptionsParser.java:92)
Feb 07 14:09:06 Debian-911-stretch-64-minimal systemd[1]: elasticsearch.service: Main process exited, code=exited, status=1/FAILURE
Feb 07 14:09:06 Debian-911-stretch-64-minimal systemd[1]: Failed to start Elasticsearch.
-- Subject: Unit elasticsearch.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit elasticsearch.service has failed.
The mentioned gc.log file was not in that folder. And the permissions were:
drwxr-s--- 2 elasticsearch elasticsearch 4096 Jan 15 13:20 elasticsearch
I created the file and also played with permissions until having these:
-rwxrwxrwx 1 root elasticsearch 0 Feb 7 15:19 gc.log
...and even changed the ownership:
-rwxrwxrwx 1 root root 0 Feb 7 15:19 gc.log
But no success; I still have the same issue.
Thanks
Make sure you are running CMD as Administrator.
This error also happens if you are using Docker and running the container as a different user. You have to add the --group-add flag to the docker command or set the TAKE_FILE_OWNERSHIP environment variable as mentioned here
Using docker-compose:
user: 1007:1007
group_add:
- 0
Using docker:
--group-add 0
Firstly, I don't know why the gc.log file was not present. Have you changed the logs folder path or something? The gc.log path can be set in the jvm.options file. By default, ES logs and Java garbage collection logs are written to the logs folder inside the $ES_HOME directory.
As for the user, Elasticsearch can't be run as the root user. From the ES directory details, it looks like you have an elasticsearch user created and are trying to run the cluster as that user.
The problem can be solved by changing the ownership of the files inside the ES directory so that they all belong to that user. Right now the gc.log file is owned by root, so it cannot be accessed by the elasticsearch user.
Try this: sudo chown <user> <path/to/es/directory> -R
Here it becomes : sudo chown elasticsearch elasticsearch/ -R
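Before or after running chown, it can help to see the owner and mode of every directory on the way to the log file, since a restrictive parent directory also produces "Permission denied". The following is a minimal sketch for that; the helper name describe_path_chain is hypothetical, and it only reports what it finds without changing anything.

```python
import os
import stat


def describe_path_chain(path: str) -> list:
    """Return (path, mode string, uid, gid) for each component from / down to path."""
    current = os.path.abspath(path)
    parts = []
    while True:
        parts.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent

    rows = []
    for component in reversed(parts):
        try:
            st = os.stat(component)
        except OSError as exc:
            # The component does not exist or cannot be inspected by this user.
            rows.append((component, "missing or unreadable: " + str(exc), None, None))
            continue
        rows.append((component, stat.filemode(st.st_mode), st.st_uid, st.st_gid))
    return rows


if __name__ == "__main__":
    for row in describe_path_chain("/var/log/elasticsearch/gc.log"):
        print(row)
```

Running it as the elasticsearch user (rather than root) shows the view the service itself has of that path.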
If the issue persists, check whether the jvm.options file is configured correctly. Unless you change the -Xloggc:logs/gc.log option, gc.log won't be written to /var/log.
Feb 09 17:09:02 server elasticsearch[2199]: Invalid -Xlog option '-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m', see error log for details.
Your log says the option is given as file=/var/log/elasticsearch/gc.log. Correct any wrong configuration as per the documentation: https://www.elastic.co/guide/en/elasticsearch/reference/master/jvm-options.html
sudo systemctl -l status elasticsearch.service
Returns this log:
● elasticsearch.service - Elasticsearch
Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/elasticsearch.service.d
└─override.conf
Active: failed (Result: exit-code) since Sun 2020-02-09 17:09:02 CET; 2min 48s ago
Docs: http://www.elastic.co
Process: 2199 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -p ${PID_DIR}/elasticsearch.pid --quiet (code=exited, status=1/FAILURE)
Main PID: 2199 (code=exited, status=1/FAILURE)
Feb 09 17:09:02 server elasticsearch[2199]: Invalid -Xlog option '-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m', see error log for details.
Feb 09 17:09:02 server elasticsearch[2199]: Error: Could not create the Java Virtual Machine.
Feb 09 17:09:02 server elasticsearch[2199]: Error: A fatal exception has occurred. Program will exit.
Feb 09 17:09:02 server elasticsearch[2199]: at org.elasticsearch.tools.launchers.JvmErgonomics.flagsFinal(JvmErgonomics.java:118)
Feb 09 17:09:02 server elasticsearch[2199]: at org.elasticsearch.tools.launchers.JvmErgonomics.finalJvmOptions(JvmErgonomics.java:86)
Feb 09 17:09:02 server elasticsearch[2199]: at org.elasticsearch.tools.launchers.JvmErgonomics.choose(JvmErgonomics.java:59)
Feb 09 17:09:02 server elasticsearch[2199]: at org.elasticsearch.tools.launchers.JvmOptionsParser.main(JvmOptionsParser.java:92)
Feb 09 17:09:02 server systemd[1]: elasticsearch.service: Main process exited, code=exited, status=1/FAILURE
Feb 09 17:09:02 server systemd[1]: elasticsearch.service: Failed with result 'exit-code'.
Feb 09 17:09:02 server systemd[1]: Failed to start Elasticsearch.
At this point I'm doing a fresh install. Not able to find the solution I need to continue working...

Dante Socks5 proxy server doesn't start

I installed the Dante proxy server by following the steps from the website, but the server doesn't start and shows the following error. I have also tried the steps from other websites. I searched StackOverflow and found the same issue in one question, but it has not been solved yet. Can anyone solve it or suggest an alternative SOCKS5 proxy server?
Job for danted.service failed because the control process exited with error code. See "systemctl status danted.service" and "journalctl -xe" for details.
Error shown in systemctl status danted.service & journalctl -xe
steven@steven-VirtualBox:~$ systemctl status danted.service
● danted.service - LSB: SOCKS (v4 and v5) proxy daemon (danted)
Loaded: loaded (/etc/init.d/danted; bad; vendor preset: enabled)
Active: failed (Result: exit-code) since Sun 2019-03-10 18:12:42 IST; 2min 59s ago
Docs: man:systemd-sysv-generator(8)
Process: 3400 ExecStart=/etc/init.d/danted start (code=exited, status=1/FAILURE)
Mar 10 18:12:41 steven-VirtualBox systemd[1]: Starting LSB: SOCKS (v4 and v5) proxy daemon (danted)...
Mar 10 18:12:42 steven-VirtualBox danted[3405]: error: /etc/danted.conf: problem on line 11 near token "eth0": could not resolve hostname "eth0
Mar 10 18:12:42 steven-VirtualBox systemd[1]: danted.service: Control process exited, code=exited status=1
Mar 10 18:12:42 steven-VirtualBox danted[3400]: Starting Dante SOCKS daemon:
Mar 10 18:12:42 steven-VirtualBox systemd[1]: Failed to start LSB: SOCKS (v4 and v5) proxy daemon (danted).
Mar 10 18:12:42 steven-VirtualBox systemd[1]: danted.service: Unit entered failed state.
Mar 10 18:12:42 steven-VirtualBox systemd[1]: danted.service: Failed with result 'exit-code'.
steven@steven-VirtualBox:~$ journalctl -xe
-- The result is failed.
Mar 10 18:11:40 steven-VirtualBox systemd[1]: danted.service: Unit entered failed state.
Mar 10 18:11:40 steven-VirtualBox systemd[1]: danted.service: Failed with result 'exit-code'.
Mar 10 18:12:40 steven-VirtualBox sudo[3397]: steven : TTY=pts/18 ; PWD=/home/steven ; USER=root ; COMMAND=/bin/systemctl restart danted
Mar 10 18:12:41 steven-VirtualBox sudo[3397]: pam_unix(sudo:session): session opened for user root by (uid=0)
Mar 10 18:12:41 steven-VirtualBox systemd[1]: Stopped LSB: SOCKS (v4 and v5) proxy daemon (danted).
-- Subject: Unit danted.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit danted.service has finished shutting down.
Mar 10 18:12:41 steven-VirtualBox systemd[1]: Starting LSB: SOCKS (v4 and v5) proxy daemon (danted)...
-- Subject: Unit danted.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit danted.service has begun starting up.
Mar 10 18:12:42 steven-VirtualBox danted[3405]: error: /etc/danted.conf: problem on line 11 near token "eth0": could not resolve hostname "eth0
Mar 10 18:12:42 steven-VirtualBox danted[3405]: alert: mother[1/1]: shutting down
Mar 10 18:12:42 steven-VirtualBox systemd[1]: danted.service: Control process exited, code=exited status=1
Mar 10 18:12:42 steven-VirtualBox danted[3400]: Starting Dante SOCKS daemon:
Mar 10 18:12:42 steven-VirtualBox sudo[3397]: pam_unix(sudo:session): session closed for user root
Mar 10 18:12:42 steven-VirtualBox systemd[1]: Failed to start LSB: SOCKS (v4 and v5) proxy daemon (danted).
-- Subject: Unit danted.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit danted.service has failed.
--
-- The result is failed.
Mar 10 18:12:42 steven-VirtualBox systemd[1]: danted.service: Unit entered failed state.
Mar 10 18:12:42 steven-VirtualBox systemd[1]: danted.service: Failed with result 'exit-code'.
Mar 10 18:12:50 steven-VirtualBox sudo[3407]: steven : TTY=pts/18 ; PWD=/home/steven ; USER=root ; COMMAND=/bin/systemctl status danted
Mar 10 18:12:50 steven-VirtualBox sudo[3407]: pam_unix(sudo:session): session opened for user root by (uid=0)
Mar 10 18:14:38 steven-VirtualBox sudo[3407]: pam_unix(sudo:session): session closed for user root
I had the same issue and came across your question. I fixed it by adding a systemd dependency of network-online.target to the danted.service, based on reading this https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
Here's how:
sudo systemctl edit danted.service
add this:
[Unit]
After=network-online.target
Wants=network-online.target
save & exit, run this for good measure
sudo systemctl daemon-reload
sudo systemctl enable danted.service
This line is the telltale:
Mar 10 18:12:42 steven-VirtualBox danted[3405]: error: /etc/danted.conf: problem on line 11 near token "eth0": could not resolve hostname "eth0
It looks like there is no interface called eth0.
I had the same issue, found out what the actual interface is called using ifconfig and swapped out eth0 for that.
Find the interface of your device from the terminal with netstat -rn and look at the Iface column. Install netstat with sudo apt install net-tools if you don't have it. In the file /etc/danted.conf, change the setting external: eth0 to external: xxxx, where xxxx is your Iface value.
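If net-tools is not installed, the interface names can also be listed with Python's standard library. This is a small sketch, assuming a Unix-like system where socket.if_nameindex() is available; the helper names list_interfaces and interface_exists are hypothetical.

```python
import socket


def list_interfaces() -> list:
    """Return the names of the network interfaces known to the kernel."""
    return [name for _index, name in socket.if_nameindex()]


def interface_exists(name: str) -> bool:
    """Check whether an interface with this exact name exists."""
    return name in list_interfaces()


if __name__ == "__main__":
    print(list_interfaces())
    print("eth0 present:", interface_exists("eth0"))
```

If eth0 is reported as absent, one of the listed names is the value to use for external: in /etc/danted.conf.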
If you're just starting out and there are no saved rules in danted.conf yet, you can simply delete the file with sudo rm /etc/danted.conf and then create a new one with sudo nano /etc/danted.conf. If you use a firewall, you must open port 1080 with sudo ufw allow 1080. In the new empty danted.conf file, paste in:
logoutput: syslog
user.privileged: root
user.unprivileged: nobody
# The listening network interface or address.
internal: 0.0.0.0 port=1080
# The proxying network interface or address.
external: xxxx #Replace xxxx with the device's Iface
# socks-rules determine what is proxied through the external interface.
socksmethod: username
# client-rules determine who can connect to the internal interface.
clientmethod: none
client pass {
from: 0.0.0.0/0 to: 0.0.0.0/0
}
socks pass {
from: 0.0.0.0/0 to: 0.0.0.0/0
}
Save the file and run
sudo systemctl restart danted.service
sudo systemctl status danted.service

Apache server not working locally on Ubuntu, showing the error "Failed to start The Apache HTTP Server." in the terminal

user@user-desktop:~$ sudo service apache2 restart
Job for apache2.service failed because the control process exited with error code.
See "systemctl status apache2.service" and "journalctl -xe" for details.
user@user-desktop:~$ systemctl status apache2.service
● apache2.service - The Apache HTTP Server
Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
Drop-In: /lib/systemd/system/apache2.service.d
└─apache2-systemd.conf
Active: failed (Result: exit-code) since Tue 2018-09-18 16:45:15 IST; 7s ago
Process: 12099 ExecStart=/usr/sbin/apachectl start (code=exited, status=1/FAILURE)
Sep 18 16:45:14 user-desktop systemd[1]: Starting The Apache HTTP Server...
Sep 18 16:45:14 user-desktop apachectl[12099]: apache2: Syntax error on line 225 of /etc/apache2/apache2.conf: Syntax error on line 14 of /etc
Sep 18 16:45:15 user-desktop apachectl[12099]: Action 'start' failed.
Sep 18 16:45:15 user-desktop apachectl[12099]: The Apache error log may have more information.
Sep 18 16:45:15 user-desktop systemd[1]: apache2.service: Control process exited, code=exited status=1
Sep 18 16:45:15 user-desktop systemd[1]: apache2.service: Failed with result 'exit-code'.
Sep 18 16:45:15 user-desktop systemd[1]: Failed to start The Apache HTTP Server.

Why does tinyproxy require an upstream proxy?

Today I configured a basic tinyproxy.
I expected it to act as proxy for ubuntu repositories.
But when trying to download packages from the repositories, I got this in the tinyproxy log:
CONNECT Mar 27 17:30:46 [20348]: Connect (file descriptor 9): [unknown] [192.168.2.30]
CONNECT Mar 27 17:30:46 [20348]: Request (file descriptor 9): GET http://br.archive.ubuntu.com/ubuntu/pool/main/t/tdb/python-tdb_1.2.12-1_amd64.deb HTTP/1.1
INFO Mar 27 17:30:46 [20348]: No upstream proxy for br.archive.ubuntu.com
ERROR Mar 27 17:30:56 [20348]: opensock: Could not retrieve info for br.archive.ubuntu.com
INFO Mar 27 17:30:56 [20348]: no entity
I'm stuck on some misconception. Doesn't tinyproxy send requests to outside servers directly?
I supplied an external proxy server to fix this:
upstream 117.79.64.29:80
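The "opensock: Could not retrieve info for br.archive.ubuntu.com" line looks like a name-resolution failure on the machine running tinyproxy, which would also explain why pointing at an upstream by IP address works. That reading is an assumption based on the log text. A minimal sketch to test resolution on the proxy host is below; the helper name can_resolve is hypothetical.

```python
import socket


def can_resolve(hostname: str, port: int = 80) -> bool:
    """Return True if the system resolver can turn hostname into an address."""
    try:
        socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The resolver could not find an address for this name.
        return False
    return True


if __name__ == "__main__":
    print(can_resolve("br.archive.ubuntu.com"))
```

If this prints False on the proxy host, the DNS configuration there is worth checking before adding an upstream proxy.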
