Save pod metadata in an external log analysis tool - Elasticsearch

We currently ship all Kubernetes logs to a central log analysis tool, using Fluent Bit as the shipper.
Although we are able to retrieve and analyze the logs the containers write to stdout, we cannot find the information shown when we describe a pod (kubectl describe pod somepod). We want to save this information too, because it includes the Exit Code and Reason of a terminated container, such as exit code 137 and OOMKilled.
Having this information in Prometheus would also be acceptable. In Prometheus some of it is already exposed, for example kube_pod_container_status_terminated_reason, but the exit code is missing.
How could this be done?
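For reference, the Exit Code and Reason that kubectl describe shows come from the pod's status, which can be read directly with jsonpath (somepod and the first-container index here are just placeholders):

# Prints the reason and exit code of the last terminated state of the first container,
# e.g. "OOMKilled 137" for an OOM-killed container.
kubectl get pod somepod -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason} {.status.containerStatuses[0].lastState.terminated.exitCode}'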

Related

How to output logs in a containerized application?

Normally, in Docker/Kubernetes, it is recommended to write logs directly to stdout.
Then we can use kubectl logs or docker logs to see them.
For example: time=123 action=write msg=hello world; on a TTY it might be colorized for human friendliness.
However, if we want to export the logs to a log processing stack such as EFK (Elasticsearch-Fluentd-Kibana), we need them in JSON format.
For example: {"time":123,"action":"write","msg":"hello world"}
What I want
Is there a logging approach that can satisfy both human friendliness and the JSON format?
I'm looking for a way to get human-readable output from docker logs while the log collector still receives the logs in JSON format.
Conclusion
Thanks to the answers below, I ended up with two methods:
1. Different log formats in different environments:
1.1 Use text format in development: docker logs prints colorized, human-readable logs.
1.2 Use JSON format in production: EFK can process JSON well.
2. Format conversion in the log collector:
2.1 Keep text format in the application, and let the log collector (e.g. Fluentd) run a small script or parser that translates the text-format key-value pairs into JSON key-value pairs (a toy sketch of this translation follows).
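As a toy illustration of method 2.1, this shell one-liner translates a simple key=value line into JSON the way a collector-side script would (it does not handle values containing spaces, and it quotes every value as a string; real setups do this inside Fluentd):

echo 'time=123 action=write msg=hello' |
  awk '{ printf "{"; for (i = 1; i <= NF; i++) { split($i, kv, "="); printf "%s\"%s\":\"%s\"", (i > 1 ? "," : ""), kv[1], kv[2] }; print "}" }'
# -> {"time":"123","action":"write","msg":"hello"}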
Kubernetes has such a structured-logging option for its system components.
The klog library supports the --logging-format=json flag, which switches the components' log output to JSON - more information about it here and here.
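To be clear, --logging-format=json is a flag of the Kubernetes system components themselves (kubelet, kube-apiserver, and so on), not of application containers. A hedged sketch for the kubelet (the config file path is a typical kubeadm default and may differ on your nodes):

# Start the kubelet with structured JSON log output.
kubelet --logging-format=json --config=/var/lib/kubelet/config.yaml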
Yes, you can do that with Fluentd. Below are the basic action items you need to take to finalize this setup:
Configure the Docker container to log to stdout (you can use any format you like).
Configure Fluentd to tail the Docker log files under /var/lib/docker/containers/*/*-json.log.
Parse the logs with Fluentd and convert them to JSON.
Output the logs to Elasticsearch.
This article shows exactly how to do this setup, and this one explains how to parse key-value logs. A minimal configuration sketch for these steps follows.
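A minimal Fluentd configuration sketch for those four steps, written as a shell heredoc; the Elasticsearch host/port, the file paths and the presence of the fluent-plugin-elasticsearch gem are assumptions to adapt:

# Write a minimal fluent.conf that tails the Docker JSON log files,
# parses each line as JSON and forwards the records to Elasticsearch.
cat <<'EOF' > /etc/fluent/fluent.conf
<source>
  @type tail
  # Docker's json-file driver writes one JSON object per line
  path /var/lib/docker/containers/*/*-json.log
  pos_file /var/log/fluentd-containers.pos
  tag docker.*
  read_from_head true
  <parse>
    @type json
  </parse>
</source>

<match docker.**>
  @type elasticsearch
  host elasticsearch
  port 9200
  logstash_format true
</match>
EOF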

What events are triggered for PV/PVC and from where?

kubectl get events lists the events for K8s objects.
Where are the events for PV/PVC actually triggered from?
There is a list of volume events:
https://docs.openshift.com/container-platform/4.5/nodes/clusters/nodes-containers-events.html
but it does not identify which events belong to which resource.
Let's start with what exactly a Kubernetes event is. Events are objects that provide insight into what is happening inside a cluster, such as what decisions were made by the scheduler or why some pods were evicted from a node. These API objects are persisted in etcd.
You can read more about them here and here.
There is also an excellent tutorial about Kubernetes events which you may find here.
There are a couple of ways to view/fetch more detailed events from Kubernetes:
Use kubectl get events -o wide. This will give you information about the object, subobject and source of the event. Here's an example:
LAST SEEN TYPE REASON OBJECT SUBOBJECT SOURCE MESSAGE
<unknown> Warning FailedScheduling pod/web-1 default-scheduler running "VolumeBinding" filter plugin for pod "web-1": pod has unbound immediate PersistentVolumeClaims
6m2s Normal ProvisioningSucceeded persistentvolumeclaim/www-web-1 k8s.io/minikube-hostpath 2481b4d6-0d2c-11eb-899d-02423db39261 Successfully provisioned volume pvc-a56b3f35-e7ac-4370-8fda-27342894908d
Using kubectl get events --output json will give you a list of the events in JSON format, containing other details such as the selfLink.
---
"apiVersion": "v1",
"count": 1,
"eventTime": null,
"firstTimestamp": "2020-10-13T12:07:17Z",
"involvedObject": {
---
"kind": "Event",
"lastTimestamp": "2020-10-13T12:07:17Z",
"message": "Created container nginx",
"metadata": {
---
The selfLink can be used to determine the API location from which the data is being fetched.
We can take as an example /api/v1/namespaces/default/events/ and fetch the data from API server using kubectl proxy:
kubectl proxy --port=8080 & curl http://localhost:8080/api/v1/namespaces/default/events/
Using all this information, you can narrow things down to specific details of the underlying object using a field selector:
kubectl get events --field-selector type!=Normal
or
kubectl get events --field-selector involvedObject.kind=PersistentVolumeClaim
LAST SEEN TYPE REASON OBJECT MESSAGE
44m Normal ExternalProvisioning persistentvolumeclaim/www-web-0 waiting for a volume to be created, either by external provisioner "k8s.io/minikube-hostpath" or manually created by system administrator
44m Normal Provisioning persistentvolumeclaim/www-web-0 External provisioner is provisioning volume for claim "default/www-web-0"
44m Normal ProvisioningSucceeded persistentvolumeclaim/www-web-0 Successfully provisioned volume pvc-815beb0a-b5f9-4b27-94ce-d21f2be728d5
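Field selectors can also be combined (comma-separated, AND semantics) to narrow things down to a single object, for example only the events of the PVC named www-web-0:

kubectl get events --field-selector involvedObject.kind=PersistentVolumeClaim,involvedObject.name=www-web-0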
Please also remember that all the information provided by kubectl get events is the same as what kubectl describe <object> shows.
Lastly, if you look carefully into the event.go code you can see all the event references for volumes. If you compare those with Table 13. Volumes you can see that they are almost the same (except for WaitForPodScheduled and ExternalExpanding).
This means that OpenShift provides an aggregated overview of the possible volume events from different Kubernetes resources that may occur in the cluster.

How do I set google stackdriver to respect logging severity from kubernetes?

I deployed a Go application on Google Cloud using Kubernetes, and it automatically logs to Google Stackdriver. Oddly, all log statements are being tagged with severity "ERROR".
For example:
log.Println("This should have log level info")
will be tagged as an error.
Their docs say "Severities: By default, logs written to the standard output are on the INFO level and logs written to the standard error are on the ERROR level."
Anyone know what could be wrong with my setup?
Take a look at this logging package: github.com/teltech/logger, with an accompanying blog post. It will output your logs in a JSON format, including the severity, that is readable by the Stackdriver Fluentd agent.
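Note that Go's standard log package writes to stderr by default, which is why plain log.Println output ends up tagged as ERROR under the stdout/stderr rule quoted above. Independently of the package you pick, the Stackdriver/Cloud Logging agent recognizes a top-level "severity" field in JSON lines written to stdout and uses it as the entry's severity; a minimal sketch:

# A JSON line like this on stdout is ingested with severity INFO.
echo '{"severity":"INFO","message":"This should have log level info"}'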

Viewing TeamCity service messages

I'm troubleshooting a build step in TeamCity 9.0.4. The problem seems to lie within the service message output. Is it possible to view these after the build has completed? They are not included in the build log.
The documentation on service messages simply says In order to be processed by TeamCity, they should be printed into a standard output stream of the build.
https://confluence.jetbrains.com/display/TCD9/Build+Script+Interaction+with+TeamCity
(To some extent the service messages can be viewed by manually rerunning the build step and monitoring standard output, but this is not always feasible.)
The documentation for service messages implies that you need to write them to standard out/error rather than to a log file. If you write them to standard out, TeamCity will automatically pick them up and show them in the Build Log tab.
What this means is that if you have a
shell script, use echo for your service messages
Java class, use System.out.println
and so on
Different languages also have plugins for this; for example, Perl has TapHarness.pl to write TeamCity messages to the console.
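As a concrete example of the echo approach in a shell build step (buildStatisticValue is just one of the available service messages; the key and value here are arbitrary):

# TeamCity parses any ##teamcity[...] line printed to stdout and shows it in the Build Log.
echo "##teamcity[buildStatisticValue key='my_stat_value' value='125']"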
EDIT:
If you just want to view service messages, you can find them in the build logs on the TeamCity agent that the build ran on. If you do not find them there, either the build log has rolled over or you need to increase the verbosity or debug level of your logs (this depends on the language).
There was a problem with nested service messages which has since been solved:
TeamCity now parses service messages inside other service messages, but only if original message was tagged with tc:parseServiceMessagesInside. Example:
##teamcity[testStdOut name='test1' out='##teamcity|[buildStatisticValue key=|'my_stat_value|' value=|'125|'|]' tc:tags='tc:parseServiceMessagesInside']
A link to JetBrains bug tracker:
https://youtrack.jetbrains.com/issue/TW-45311

Stop Logstash agent on inactivity?

The central log server I'm working on uses two Logstash agents, each running in its own screen:
a shipper to collect logs from front servers
an indexer to send the logs into Elasticsearch
Sometimes, it can be useful to re-import some logs (on failure, to re-format the logs etc...). For this purpose, I execute a third agent called importer whose job is to re-import old logs.
The problem I'm facing is that I have to monitor the re-import process until it's completely done, at which point the agent can be killed.
So I would like to know if there's some kind of option to stop an agent once it becomes idle.
You might be able to do something with the exec input (http://logstash.net/docs/1.4.2/inputs/exec). I'm thinking of something like cat /some/reload/file; sleep 30; kill $LS_PID, though I'm not quite sure how you'd get $LS_PID assigned.
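A hedged sketch of that idea: run the importer agent in the background, capture its PID with $!, and kill it after a window assumed to be long enough for the re-import (importer.conf and the 30-minute sleep are placeholders to tune):

# Start the importer pipeline in the background and remember its PID.
logstash agent -f importer.conf &
LS_PID=$!
# Crude idle handling: wait for the assumed duration of the re-import, then stop the agent.
sleep 1800
kill "$LS_PID"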

Resources