Exception while logging using fluentd logger with Google Stackdriver on Amazon EC2

I'm using Google Stackdriver on AWS EC2 and followed all the steps in the documentation for installing Stackdriver on EC2.
I verified the following:
A) fluentd & collectd status
ps ax | grep fluentd
11429 pts/1 S+ 0:00 grep --color=auto fluentd
ps ax | grep collectd
1341 ? Ssl 0:02 /opt/stackdriver/collectd/sbin/stackdriver-collectd -C /opt/stackdriver/collectd/etc/collectd.conf -P /var/run/stackdriver-agent.pid
11431 pts/1 S+ 0:00 grep --color=auto collectd
B) Current credentials availability
sudo cat /etc/google/auth/application_default_credentials.json
However, I am still getting a StackOverflowError whenever I log any error in the system:
java.lang.StackOverflowError
java.lang.StringCoding$StringDecoder.decode(StringCoding.java:153)
java.lang.StringCoding.decode(StringCoding.java:193)
java.lang.String.<init>(String.java:426)
java.lang.String.<init>(String.java:491)
java.net.PlainSocketImpl.socketConnect(Native Method)
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
java.net.Socket.connect(Socket.java:589)
org.fluentd.logger.sender.RawSocketSender.connect(RawSocketSender.java:83)
org.fluentd.logger.sender.RawSocketSender.reconnect(RawSocketSender.java:95)
org.fluentd.logger.sender.RawSocketSender.flush(RawSocketSender.java:186)
org.fluentd.logger.sender.RawSocketSender.flushBuffer(RawSocketSender.java:152)
org.fluentd.logger.sender.RawSocketSender.send(RawSocketSender.java:164)
org.fluentd.logger.sender.RawSocketSender.emit(RawSocketSender.java:147)
org.fluentd.logger.sender.RawSocketSender.emit(RawSocketSender.java:129)
org.fluentd.logger.sender.RawSocketSender.emit(RawSocketSender.java:124)
org.fluentd.logger.FluentLogger.log(FluentLogger.java:101)
org.fluentd.logger.FluentLogger.log(FluentLogger.java:86)

Your item A does not show any trace of the google-fluentd service. The best way to find out the status of the Stackdriver logging agent is by running sudo service google-fluentd status. The same goes for the Stackdriver monitoring agent (sudo service stackdriver-agent status).
If your logging agent is indeed not running, that would explain why the Java logging library fails to connect to it. The StackOverflowError is probably due to some configuration that causes Java to log all errors, including errors raised by logging itself, which leads to infinite recursion.
To be able to answer this fully, we'd need to see the status of the logging service (as described above) and the configuration of your Java logger.
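For example, a quick way to check both agents and confirm that something is listening on the fluentd forward port (24224 is the fluent-logger default; adjust it if your agent configuration differs):
sudo service google-fluentd status
sudo service stackdriver-agent status
sudo netstat -tlnp | grep 24224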

Related

journalctl stops logging systemd service logs with systemd_unit field

This is a question about journalctl and how systemd log entries are produced.
My OS is RHEL8.
I've set up kubelet to run as a systemd service, and I would like to use journalctl to tail its logs with the unit flag: "journalctl -u kubelet".
When I initially start the systemd service, I can see kubelet logs show up in /var/log/messages and I can also filter them using "journalctl -u kubelet". However, very shortly after the service is started, journalctl goes quiet when filtering with "-u kubelet", yet kubelet logs continue to be dumped into /var/log/messages. If I filter journalctl with "journalctl --identifier kubelet" instead of the "-u kubelet" I do see all the logs that are in /var/log/messages.
Using "-o json-pretty", I can see that the initial logs produced by the kubelet process have journald log entries with:
"_SYSTEMD_CGROUP" : "/system.slice/kubelet.service",
"_SYSTEMD_INVOCATION_ID" : "b422558179854d55a44d0ea6f7240828",
"_SYSTEMD_SLICE" : "system.slice",
"_SYSTEMD_UNIT" : "kubelet.service",
Logs produced shortly after the service is started seem to drop the unit property and look like:
"_SYSTEMD_CGROUP" : "/systemd/system.slice",
"_SYSTEMD_INVOCATION_ID" : "b422558179854d55a44d0ea6f7240828",
"_SYSTEMD_SLICE" : "-.slice",
I think the fact that the logs start being produced without the "_SYSTEMD_UNIT" property indicates why filtering them with "-u" stops working, but I'd like to know why my service initially starts producing logs with the unit property, and then stops. Any clues would be appreciated.
Turns out this had more to do with kubelet configuration than with journald configuration: kubelet needed to designate the correct cgroup via the kubeletCgroups config option, as sketched below.
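As a hedged illustration, the relevant excerpt of a KubeletConfiguration file might look like this; the exact cgroup path is an assumption to adapt to your setup:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
# Keep kubelet inside its own unit's cgroup so journald keeps
# attributing its entries to kubelet.service (_SYSTEMD_UNIT stays set).
kubeletCgroups: /system.slice/kubelet.service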

Access k8s pod logs generated from ssh exec

I have Filebeat configured to send my k8s cluster logs to Elasticsearch.
When I connect to the pod directly (kubectl exec -it <pod> -- sh -c bash),
the generated output logs aren't being sent to the destination.
Digging through the k8s docs, I couldn't find how k8s handles STDOUT from a running shell.
How can I configure k8s to send live shell logs?
Kubernetes has (mostly) nothing to do with this, as logging is handled by the container environment used to support Kubernetes, which is usually docker.
Depending on the Docker version, container logs can be written via json-file, journald, or other drivers, with json-file being the default. You can run docker info | grep -i logging to check which logging driver Docker uses. If the result is json-file, logs are written to a file in JSON format. If there's another value, logs are handled in another way (and as there are various logging drivers, I suggest checking the documentation about them).
If the logs are written to a file, chances are that docker inspect container-id | grep -i logpath will show you the path on the node.
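For reference, a hedged pair of one-liners for those two checks (the --format selector is just a convenience; grepping the full inspect output works as well):
docker info | grep -i 'logging driver'
docker inspect --format '{{.LogPath}}' container-id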
Filebeat simply harvests the logs from those files; it's Docker that handles the redirection between the application's STDOUT inside the container and one of those files, via its driver.
Regarding exec commands not appearing in the logs, this is an open proposal ( https://github.com/moby/moby/issues/8662 ): not everything is redirected, only the output of the apps started by the entrypoint itself.
There's a suggested workaround which is ( https://github.com/moby/moby/issues/8662#issuecomment-277396232 )
In the mean time you can try this little hack....
echo hello > /proc/1/fd/1
Redirect your output into PID 1's (the docker container's) file descriptor for STDOUT
This works just fine but has the drawback of requiring a manual redirect for every command; a sketch that automates it follows.
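One hedged way to avoid per-command redirects is to redirect the whole shell session once, right after the kubectl exec (run inside the container; /proc/1/fd/1 and /proc/1/fd/2 are PID 1's stdout and stderr):
exec > /proc/1/fd/1 2> /proc/1/fd/2
echo "everything from this shell now lands in the container log"
Note that after the exec redirect you will no longer see command output in your own terminal, only in the container log.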
Use the following process:
Make changes in your application to push logs to STDOUT; you can configure this in your logging configuration file.
Configure Filebeat to read those STDOUT logs, which eventually end up in a Docker log file location on the node (with the json-file driver, under /var/lib/docker/containers/).
Run Filebeat as a DaemonSet, so that logs from new pods and nodes are automatically pushed to ES.
For better readability, make sure you push logs in JSON format; a minimal config sketch follows.
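As a minimal sketch, a filebeat.yml along these lines covers the Docker json-file location (the container input type and the Elasticsearch host are assumptions to adapt to your Filebeat version and cluster):
filebeat.inputs:
- type: container
  paths:
    - /var/lib/docker/containers/*/*.log
output.elasticsearch:
  hosts: ["elasticsearch:9200"]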

Unable To Login Via Default Login Credentials of Admin in Ambari

I installed the Hortonworks sandbox around two weeks ago on a Linode cloud server with 8 GB RAM. I access this Linode node via PuTTY.
Everything is working fine, and I am able to log in to Ambari with the default credentials such as maria_dev, raj_ops, holger_gov and amy_ds.
But I am unable to log in with the default admin credentials; I think I've forgotten the password. Since I am really new to this framework, I don't know how to recover it. On the command line, I tried the following commands:
ambari-server restart
ambari-admin-password-reset
But each time I get a "command not found" error.
Can someone please help me recover the password, or help me log in with the admin credentials?
If you get the "command not found" error for the ambari-server restart command, you are probably logged into the wrong cluster node: ambari-server is installed on a single host of the cluster.
Solution 1:
Check whether the Ambari server is running with the following command:
ps aux | grep ambari
If you cannot find an ambari-server process, check the ambari-server log at /var/log/ambari-server/ambari-server.log.
If ambari-server is running but missing from PATH, first confirm that it is actually running:
[root@sandbox ~]# ps aux | grep ambari-server
root 9 0.0 0.0 11356 1288 ? S+ 12:24 0:00 /bin/sh -c ambari-server start 1>/var/log/startup_script.log 2> "temp Ambari server.log" || touch temp.errors;
root 15 0.0 0.0 11360 1388 ? S+ 12:24 0:00 bash /usr/sbin/ambari-server start
root 46 0.8 0.3 116772 15168 ? D+ 12:24 0:01 /usr/bin/python /usr/sbin/ambari-server.py start
root 2324 0.0 0.0 8024 844 pts/0 R+ 12:26 0:00 grep ambari-server
Solution 2:
Try to find out where Ambari is installed:
find / -name "ambari-server"
Try running ambari-server --help with the full path. It will print a message like the following:
[root@sandbox ~]# /usr/sbin/ambari-server --help
Using python /usr/bin/python
Usage: /usr/sbin/ambari-server
{start|stop|restart|setup|setup-jce|upgrade|status|upgradestack|setup-ldap|sync-ldap|set-current|setup-security|setup-sso|refresh-stack-hash|backup|restore|update-host-names|enable-stack|check-database|db-cleanup} [options]
Use /usr/sbin/ambari-server.py <action> --help to get details on options available.
Or, simply invoke ambari-server.py --help to print the options.
If /usr/sbin is missing from your PATH, you have to add it manually with export PATH=$PATH:/usr/sbin, as shown below.
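For example, to apply it in the current shell and make it survive future logins (assuming a Bash login shell):
export PATH=$PATH:/usr/sbin
echo 'export PATH=$PATH:/usr/sbin' >> ~/.bashrc
ambari-server status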

Bluemix Docker Container deployment results in "No route to host"

We are deploying a Docker image using this command:
cf ic run -p 8080 -m 512 -e SPRING_PROFILES_ACTIVE=test -e logging.config=classpath:logback-docker-test.xml --name <container-name> registry.eu-gb.bluemix.net/<repository_name>/<container-name>:latest
Within that container we start a Java 8 Spring Boot application that uses a connection-pooling provider. The pool connects to an existing PostgreSQL database that is reachable on the standard port; we connect via IP address only, not via a domain name.
The deployment works on a machine running the standard Docker daemon and also works on Amazon Web Services (AWS) without any problems, using the same deployment mechanism.
However, when we deploy the image to the Bluemix Container Service, we get the following error at startup of the Spring Boot application:
Caused by: java.net.NoRouteToHostException: No route to host
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.postgresql.core.PGStream.<init>(PGStream.java:61)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:129)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:65)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:146)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:35)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:47)
at org.postgresql.jdbc42.AbstractJdbc42Connection.<init>(AbstractJdbc42Connection.java:21)
at org.postgresql.jdbc42.Jdbc42Connection.<init>(Jdbc42Connection.java:28)
at org.postgresql.Driver.makeConnection(Driver.java:415)
at org.postgresql.Driver.access$100(Driver.java:47)
at org.postgresql.Driver$ConnectThread.run(Driver.java:325)
... 1 more
We don't know why this happens: if we telnet from another Bluemix Docker machine to the PostgreSQL server on the same port, everything is fine.
This is very annoying, since we currently cannot use this Docker image on Bluemix, which is obstructing our planned roll-out.
Can you give us details on what might be wrong and how we can fix this?
Any help will be appreciated.
Regards,
Christian
Is this error raised when the container is starting up?
If so: Docker/IBM containers on Bluemix spend roughly 30 to 60 seconds in networking status, and during this phase the container is not able to connect to the network.
This is most likely the root cause of the error you are getting: if the Spring Boot application tries to connect to the PostgreSQL database while the container is still in the networking phase, it will fail with this error.
You should start your application only once the container has completed the networking phase, for example through a bash script that checks the availability of the PostgreSQL server (see the sketch below), or simply configure Spring Boot to handle this exception.
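A hedged sketch of such a startup script, polling the database port before launching the JVM (DB_HOST and DB_PORT are assumed environment variables, and nc must be available in the image):
#!/bin/sh
until nc -z "$DB_HOST" "$DB_PORT"; do
  echo "waiting for PostgreSQL at $DB_HOST:$DB_PORT ..."
  sleep 5
done
exec java $JVM_ARGS -cp /app org.springframework.boot.loader.JarLauncher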
Official Bluemix support gave the hint to wait 120 seconds before starting the Java application that needs network access. The suggested way is:
CMD ["/bin/sh", "-c", "sleep 120; exec java $JVM_ARGS -cp /app org.springframework.boot.loader.JarLauncher --spring.main.show_banner=false"]
With that we have got network access and everything is fine.

Where is the kibana error log? Is there a kibana error log?

QUESTION: how do I debug kibana? Is there an error log?
PROBLEM 1: kibana 4 won't stay up
PROBLEM 2: I don't know where/if kibana 4 is logging errors
DETAILS:
Here's me starting kibana, making a request to the port, getting nothing, and checking the service again. The service doesn't stay up, but I'm not sure why.
vagrant@default-ubuntu-1204:/opt/kibana/current/config$ sudo service kibana start
kibana start/running, process 11774
vagrant@default-ubuntu-1204:/opt/kibana/current/config$ curl -XGET 'http://localhost:5601'
curl: (7) couldn't connect to host
vagrant@default-ubuntu-1204:/opt/kibana/current/config$ sudo service kibana status
kibana stop/waiting
Here's the nginx log, reporting when I curl -XGET from port 80, which is forwarding to port 5601:
2015/06/15 17:32:17 [error] 9082#0: *11 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: kibana, request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:5601/", host: "localhost"
UPDATE: I may have overthought this a bit. I'm still interested in ways to view the kibana log, however! Any suggestions are appreciated!
I've noticed that when I run kibana from the command-line, I see errors that are more descriptive than a "Connection refused":
vagrant@default-ubuntu-1204:/opt/kibana/current$ bin/kibana
{"@timestamp":"2015-06-15T22:04:43.344Z","level":"error","message":"Service Unavailable","node_env":"production","error":{"message":"Service Unavailable","name":"Error","stack":"Error: Service Unavailable\n at respond (/usr/local/kibana-4.0.2/src/node_modules/elasticsearch/src/lib/transport.js:235:15)\n at checkRespForFailure (/usr/local/kibana-4.0.2/src/node_modules/elasticsearch/src/lib/transport.js:203:7)\n at HttpConnector.<anonymous> (/usr/local/kibana-4.0.2/src/node_modules/elasticsearch/src/lib/connectors/http.js:156:7)\n at IncomingMessage.bound (/usr/local/kibana-4.0.2/src/node_modules/elasticsearch/node_modules/lodash-node/modern/internals/baseBind.js:56:17)\n at IncomingMessage.emit (events.js:117:20)\n at _stream_readable.js:944:16\n at process._tickCallback (node.js:442:13)\n"}}
{"@timestamp":"2015-06-15T22:04:43.346Z","level":"fatal","message":"Service Unavailable","node_env":"production","error":{"message":"Service Unavailable","name":"Error","stack":"Error: Service Unavailable\n at respond (/usr/local/kibana-4.0.2/src/node_modules/elasticsearch/src/lib/transport.js:235:15)\n at checkRespForFailure (/usr/local/kibana-4.0.2/src/node_modules/elasticsearch/src/lib/transport.js:203:7)\n at HttpConnector.<anonymous> (/usr/local/kibana-4.0.2/src/node_modules/elasticsearch/src/lib/connectors/http.js:156:7)\n at IncomingMessage.bound (/usr/local/kibana-4.0.2/src/node_modules/elasticsearch/node_modules/lodash-node/modern/internals/baseBind.js:56:17)\n at IncomingMessage.emit (events.js:117:20)\n at _stream_readable.js:944:16\n at process._tickCallback (node.js:442:13)\n"}}
vagrant@default-ubuntu-1204:/opt/kibana/current$
Kibana 4 logs to stdout by default. Here is an excerpt of the config/kibana.yml defaults:
# Enables you specify a file where Kibana stores log output.
# logging.dest: stdout
So when invoking it with service, use the log capture method of that service. For example, on a Linux distribution using Systemd / systemctl (e.g. RHEL 7+):
journalctl -u kibana.service
One way may be to modify the init scripts to use the --log-file option (if it still exists), but I think the proper solution is to configure logging in your instance's YAML file. For example, add this to your config/kibana.yml:
logging.dest: /var/log/kibana.log
Note that the Kibana process must be able to write to the file you specify, or the process will die without information (it can be quite confusing).
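For example, to create the file with the right ownership up front (assuming Kibana runs as a kibana user; adjust to your service account):
sudo touch /var/log/kibana.log
sudo chown kibana:kibana /var/log/kibana.log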
As for the --log-file option, I think this is reserved for CLI operations, rather than automation.
In kibana 4.0.2 there is no --log-file option. If I start kibana as a service with systemctl start kibana, I find the log in /var/log/messages.
It seems that you need to pass a flag "-l, --log-file"
https://github.com/elastic/kibana/issues/3407
Usage: kibana [options]
Kibana is an open source (Apache Licensed), browser based analytics and search dashboard for Elasticsearch.
Options:
-h, --help output usage information
-V, --version output the version number
-e, --elasticsearch <uri> Elasticsearch instance
-c, --config <path> Path to the config file
-p, --port <port> The port to bind to
-q, --quiet Turns off logging
-H, --host <host> The host to bind to
-l, --log-file <path> The file to log to
--plugins <path> Path to scan for plugins
If you use the init script to run as a service, maybe you will need to customize it.
Kibana doesn't have a log file by default, but you can set one up using the log_file Kibana server property - https://www.elastic.co/guide/en/kibana/current/kibana-server-properties.html
For Kibana 6.x on Windows, edit the shortcut to run "kibana -l "; the target folder must exist.
