How do systemd journal cursors work? - go

I'm having trouble with systemd journal cursors.
If I SeekTail(), I get a value for the cursor and can keep calling Next() and it behaves exactly as expected.
However, if I SeekCursor() and then call Next(), it jumps back to the Head() and starts reading over again. Why would it do that? I can verify that it did locate the cursor correctly, but it's as though SeekCursor() only worked for that specific item and that's all. This is not what I would expect from reading the man pages and other documentation.
I'm using go-systemd from the CoreOS project which is a simple wrapper for the systemd C-API.
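The calls I'm making look roughly like this. It's a trimmed-down sketch against the sdjournal package (NewJournal, SeekCursor, Next, GetEntry), not my actual program, with minimal error handling and a placeholder for the cursor string saved from an earlier GetCursor() call:

package main

import (
    "fmt"
    "log"

    "github.com/coreos/go-systemd/sdjournal"
)

func main() {
    j, err := sdjournal.NewJournal()
    if err != nil {
        log.Fatal(err)
    }
    defer j.Close()

    // Cursor saved from an earlier GetCursor() call (placeholder value).
    savedCursor := "s=...;i=...;b=...;m=...;t=...;x=..."

    // Position the read pointer at the saved cursor.
    if err := j.SeekCursor(savedCursor); err != nil {
        log.Fatal(err)
    }

    // The first Next() after SeekCursor() should land on the entry the
    // cursor refers to, and further Next() calls should continue forward
    // from there -- not jump back to the head of the journal.
    for {
        n, err := j.Next()
        if err != nil {
            log.Fatal(err)
        }
        if n == 0 {
            break // no more entries
        }
        entry, err := j.GetEntry()
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(entry.Fields["MESSAGE"])
    }
}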
But the Go wrapper is not the issue; the underlying C library behaves the same way. I can see that journalctl does the same thing on Ubuntu.
For example: append to the journal, show the tail output, get the full entry detail in JSON, then jump to the cursor and show the tail again:
matthewh@xen:~$ echo "Cursor example" | systemd-cat
matthewh@xen:~$ journalctl -f
-- Logs begin at Mon 2017-07-03 08:56:12 NZST. --
May 31 17:50:31 xen code.desktop[6771]: [main 17:50:31] update#setState idle
May 31 17:55:01 xen CRON[4468]: pam_unix(cron:session): session opened for user root by (uid=0)
May 31 17:55:01 xen CRON[4469]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May 31 17:55:01 xen CRON[4468]: pam_unix(cron:session): session closed for user root
May 31 17:55:03 xen smokeping[2839]: RRDs::update ERROR: /var/lib/smokeping/Local/LocalMachine.rrd: illegal attempt to update using time 1527746103 when last update time is 4073847643 (minimum one second step)
May 31 17:55:22 xen cat[4479]: Hello
May 31 17:59:28 xen cat[4539]: Cursor example
May 31 18:00:03 xen smokeping[2839]: RRDs::update ERROR: /var/lib/smokeping/Local/LocalMachine.rrd: illegal attempt to update using time 1527746403 when last update time is 4073847643 (minimum one second step)
May 31 18:00:06 xen cat[4547]: Cursor example
May 31 18:01:09 xen cat[4597]: Cursor example
^C
matthewh@xen:~$ journalctl -f -o json-pretty -n1
{
"__CURSOR" : "s=b7f2a0f19c9946abab26788729a244c5;i=52a5;b=1ba1d5cabb5840adb02eedc4aba5b4d6;m=2d96b77f94;t=56d7a319ee462;x=8afac4ada39ae1fb",
"__REALTIME_TIMESTAMP" : "1527746469487714",
"__MONOTONIC_TIMESTAMP" : "195802136468",
"_BOOT_ID" : "1ba1d5cabb5840adb02eedc4aba5b4d6",
"_UID" : "1000",
"_GID" : "1000",
"_CAP_EFFECTIVE" : "0",
"_MACHINE_ID" : "f899a862e4aa4775b8995564d8da565d",
"_HOSTNAME" : "xen",
"_TRANSPORT" : "stdout",
"PRIORITY" : "6",
"_COMM" : "cat",
"MESSAGE" : "Cursor example",
"_STREAM_ID" : "d1fbcc3ff027401e9dc95b5648f9322e",
"_PID" : "4597"
}
^C
matthewh@xen:~$ journalctl -f --cursor="s=b7f2a0f19c9946abab26788729a244c5;i=52a5;b=1ba1d5cabb5840adb02eedc4aba5b4d6;m=2d96b77f94;t=56d7a319ee462;x=8afac4ada39ae1fb"
-- Logs begin at Mon 2017-07-03 08:56:12 NZST. --
May 31 18:01:09 xen cat[4597]: Cursor example
-- Reboot --
Feb 04 13:03:03 xen systemd-journald[420]: Runtime journal (/run/log/journal/) is 8.0M, max 241.0M, 233.0M free.
Feb 04 13:03:03 xen kernel: Initializing cgroup subsys cpuset
Feb 04 13:03:03 xen kernel: Initializing cgroup subsys cpu
Feb 04 13:03:03 xen kernel: Initializing cgroup subsys cpuacct
Feb 04 13:03:03 xen kernel: Linux version 4.4.0-116-generic (buildd@lgw01-amd64-021) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.9) ) #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 (Ubuntu 4.4.0-116.140-generic 4.4.98)
Feb 04 13:03:03 xen kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-116-generic root=UUID=f95a581f-2afb-4428-bade-c913f1c51741 ro quiet splash vt.handoff=7
Feb 04 13:03:03 xen kernel: KERNEL supported cpus:
Feb 04 13:03:03 xen kernel: Intel GenuineIntel
Feb 04 13:03:03 xen kernel: AMD AuthenticAMD
^C
Note the "--reboot--" text and the fact that it jumped back several days in the past. But prior to that, it located my entry via systemd-cat so it was found.
What am I doing wrong? is it a bug or an oversight on my part?
Oddly enough, I have a CoreOS server I was able to test this on and it behaves differently. It behaves as expected. The version of journalctl is the same on both. All the configuration is untouched stock standard.

Related

Memory builds up overtime on Kubernetes pod causing JVM unable to start

We are running a kubernetes environment and we have a pod that is encountering memory issues. The pod runs only a single container, and this container is responsible for running various utility jobs throughout the day.
The issue is that this pod's memory usage grows slowly over time. There is a 6 GB memory limit for this pod, and eventually, the memory consumption grows very close to 6GB.
A lot of our utility jobs are written in Java, and when the JVM spins up for them, they require -Xms256m in order to start. Yet, since the pod's memory is growing over time, eventually it gets to the point where there isn't 256MB free to start the JVM, and the Linux oom-killer kills the java process. Here is what I see from dmesg when this occurs:
[Thu Feb 18 17:43:13 2021] Memory cgroup stats for /kubepods/burstable/pod4f5d9d31-71c5-11eb-a98c-023a5ae8b224/921550be41cd797d9a32ed7673fb29ea8c48dc002a4df63638520fd7df7cf3f9: cache:8KB rss:119180KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:119132KB inactive_file:8KB active_file:0KB unevictable:4KB
[Thu Feb 18 17:43:13 2021] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[Thu Feb 18 17:43:13 2021] [ 5579] 0 5579 253 1 4 0 -998 pause
[Thu Feb 18 17:43:13 2021] [ 5737] 0 5737 3815 439 12 0 907 entrypoint.sh
[Thu Feb 18 17:43:13 2021] [13411] 0 13411 1952 155 9 0 907 tail
[Thu Feb 18 17:43:13 2021] [28363] 0 28363 3814 431 13 0 907 dataextract.sh
[Thu Feb 18 17:43:14 2021] [28401] 0 28401 768177 32228 152 0 907 java
[Thu Feb 18 17:43:14 2021] Memory cgroup out of memory: Kill process 28471 (Finalizer threa) score 928 or sacrifice child
[Thu Feb 18 17:43:14 2021] Killed process 28401 (java), UID 0, total-vm:3072708kB, anon-rss:116856kB, file-rss:12056kB, shmem-rss:0kB
Based on the research I've been doing, it seems to be normal on Linux for memory consumption to grow over time as various caches grow. From what I understand, cached memory should also be freed when new processes (such as my java process) begin to run.
My main question is: should this pod's memory be getting freed in order for these java processes to run? If so, are there any steps I can take to begin to debug why this may not be happening correctly?
Aside from this concern, I've also been trying to track down what is responsible for the growing memory in the first place. I was able to narrow it down to a certain job that runs every 15 minutes. I noticed that after every run, used memory for the pod grew by ~0.1 GB.
I was able to figure this out by running this command (inside the container) before and after each execution of the job:
cat /sys/fs/cgroup/memory/memory.usage_in_bytes | numfmt --to si
From there I narrowed down the piece of bash code from which the memory seems to consistently grow. That code looks like this:
while [ "z${_STATUS}" != "z0" ]
do
RES=`$CURL -X GET "${TS_URL}/wcs/resources/admin/index/dataImport/status?jobStatusId=${JOB_ID}"`
_STATUS=`echo $RES | jq -r '.status.status' || exit 1`
PROGRES=`echo $RES | jq -r '.status.progress' || exit 1`
[ "x$_STATUS" == "x1" ] && exit 1
[ "x$_STATUS" == "x3" ] && exit 3
[ $CNT -gt 10 ] && PrintLog "WC Job ($JOB_ID) Progress: $PROGRES Status: $_STATUS " && CNT=0
sleep 10
((CNT++))
done
[ "z${_STATUS}" == "z0" ] && STATUS=Success || STATUS=Failed
This piece of code seems innocuous to me at first glance, so I do not know where to go from here.
I would really appreciate any help; I've been trying to get to the bottom of this issue for days now.
I did eventually get to the bottom of this, so I figured I'd post my solution here. I mentioned in my original post that I narrowed my issue down to the while loop posted above in my question. Each time the job in question ran, that while loop would iterate maybe 10 times. After the while loop completed, I noticed that utilized memory increased by about 100 MB each time, pretty consistently.
On a hunch, I had a feeling the CURL command within the loop could be the culprit. And in fact, it did turn out that CURL was eating up my memory and not releasing it for whatever reason. Instead of looping and running the following CURL command:
RES=`$CURL -X GET "${TS_URL}/wcs/resources/admin/index/dataImport/status?jobStatusId=${JOB_ID}"`
I replaced this command with a simple python script that utilized the requests module to check our job statuses instead.
I am still not sure why curl was the culprit in this case. After running curl --version, it appears that the underlying library being used is libcurl/7.29.0. Maybe there is a bug in that library version causing some issues with memory management, but that is just a guess.
In any case, switching from curl to Python's requests module has resolved my issue.
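There is nothing special about the replacement; the same polling loop can be written in any language with a native HTTP client. Below is a rough sketch of the idea in Go, with a placeholder URL; the JSON shape and the 0/1/3 status codes are inferred from the jq filters in the bash loop above, and my actual replacement was a small Python script using requests:

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "net/http"
    "time"
)

// jobStatus mirrors the fields the jq filters read. Status and progress are
// treated as strings here; adjust the types if the API returns numbers.
type jobStatus struct {
    Status struct {
        Status   string `json:"status"`
        Progress string `json:"progress"`
    } `json:"status"`
}

func main() {
    // Placeholder endpoint, standing in for ${TS_URL}/wcs/.../status?jobStatusId=...
    url := "http://example.internal/wcs/resources/admin/index/dataImport/status?jobStatusId=123"

    for {
        resp, err := http.Get(url)
        if err != nil {
            log.Fatal(err)
        }
        var js jobStatus
        err = json.NewDecoder(resp.Body).Decode(&js)
        resp.Body.Close() // close every response body so connections are not leaked
        if err != nil {
            log.Fatal(err)
        }

        fmt.Printf("progress: %s status: %s\n", js.Status.Progress, js.Status.Status)

        switch js.Status.Status {
        case "0":
            return // success
        case "1", "3":
            log.Fatalf("job failed with status %s", js.Status.Status)
        }
        time.Sleep(10 * time.Second)
    }
}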

bash variable doubles in value - why?

I have a simple shell script set up to capture images every X seconds. For some reason the value of X seems to double each time through the loop.
#!/bin/bash
# basic setup for time-lapse
SECONDS=1
while true
do
    DATE=$(date +"%Y-%m-%d_%H%M%S")
    filename=${DATE}_img.jpg
    # fswebcam -r 1280x720 --timestamp "%a %Y-%b-%d %H:%M (%Z)" /home/pi/JPGS/$filename
    date
    echo "pausing for ${SECONDS} seconds"
    sleep $SECONDS
    date
    echo "====="
done
This is the output I get. The value of SECONDS is not manipulated inside the loop, so I'm confused about what is happening here. Also, the original interval was 30 seconds; I changed it to 1 second for testing purposes, and the date calls are for testing/debugging too.
Sun Mar 3 17:51:57 CST 2019
pausing for 1 seconds
Sun Mar 3 17:51:58 CST 2019
=====
Sun Mar 3 17:51:58 CST 2019
pausing for 2 seconds
Sun Mar 3 17:52:00 CST 2019
=====
Sun Mar 3 17:52:00 CST 2019
pausing for 4 seconds
Sun Mar 3 17:52:04 CST 2019
=====
Sun Mar 3 17:52:04 CST 2019
pausing for 8 seconds
Sun Mar 3 17:52:12 CST 2019
=====
Sun Mar 3 17:52:12 CST 2019
pausing for 16 seconds
Sun Mar 3 17:52:28 CST 2019
=====
Sun Mar 3 17:52:28 CST 2019
pausing for 32 seconds
Sun Mar 3 17:53:00 CST 2019
=====
Sun Mar 3 17:53:00 CST 2019
pausing for 64 seconds
Sun Mar 3 17:54:04 CST 2019
=====
Sun Mar 3 17:54:04 CST 2019
pausing for 128 seconds
What am I missing here?
This is on a Raspberry Pi.
Pick a different name for $SECONDS.
$SECONDS is a built-in shell variable. It expands to the number of seconds since the shell was started.
From the Bash manual:
'SECONDS'
This variable expands to the number of seconds since the shell was
started. Assignment to this variable resets the count to the value
assigned, and the expanded value becomes the value assigned plus the
number of seconds since the assignment.
$SECONDS is actually a special Bash variable for timing the number of seconds a script has been running. Because it's a timer, it increments automatically every second without the script doing anything. That is also why the value doubles here: sleep $SECONDS pauses for N seconds, during which the timer itself advances by another N, so the next iteration sees roughly twice the previous value. Just change the variable name to something else and you should be fine.

What does [143x40] mean in the output of tmux list-sessions?

I have 4 tmux sessions present. When I use
tmux list-sessions
It shows the sessions with some numbers in the brackets. That is:
t128_1: 1 windows (created Thu Jul 19 12:20:44 2018) [71x38]
t128_2: 1 windows (created Thu Jul 19 12:20:54 2018) [71x38]
t3: 1 windows (created Thu Jul 19 12:19:59 2018) [143x40]
t6: 1 windows (created Thu Jul 19 12:20:27 2018) [71x38]
What does the number [AxB] mean? And why does the t3 session have a different value than the others? Thanks for any explanation.
That's the size of the terminal (143 columns, 40 rows) the last time a client attached to the session.

Debugging with apport for proprietary program (ubuntu)

A program I compiled and executed (by a shell script, as another user) sometimes crashes:
./run.sh: line 19: 7964 Segmentation fault (core dumped) ./Program ARG1 ARG2 ARG3 2>&1
I wanted to take a look at the core file to figure out where the crash might have happened. Unfortunately, there's no standard core file to be found, but apparently Ubuntu called its default crash handler, apport, which says in its log:
ERROR: apport (pid 8841) Mon Jun 2 17:59:04 2014: called for pid 7964, signal 11, core limit 0
ERROR: apport (pid 8841) Mon Jun 2 17:59:04 2014: executable: /path/to/Program (command line "./Program ARG1 ARG2 ARG3")
ERROR: apport (pid 8841) Mon Jun 2 17:59:04 2014: is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment
ERROR: apport (pid 8841) Mon Jun 2 17:59:16 2014: wrote report /var/crash/_path_to_Program.1001.crash
I've been trying to process the crash dump file with apport-retrace, but apport doesn't handle the file very well because it apparently expects Ubuntu-specific packages:
ERROR: report file does not contain one of the required fields: CoreDump DistroRelease Package ExecutablePath
Looking at the crash dump file, I think there's important debugging information inside, so my question is: is there another way to process this file, either with gdb or by extracting a core dump file from it, if the core dump is indeed stored inside?
For reference, here's (partially) the .crash file:
ProblemType: Crash
Architecture: amd64
Date: Mon Jun 2 17:59:04 2014
DistroRelease: Ubuntu 13.04
ExecutablePath: /path/to/Program
ExecutableTimestamp: 1401723071
ProcCmdline: ./Program ARG1 ARG2 ARG3
ProcCwd: /path/to
ProcEnviron: PATH=(custom, no user)
ProcMaps:
... (memory map left out)
ProcStatus:
Name: Program
State: S (sleeping)
Tgid: 7964
Pid: 7964
PPid: 7963
TracerPid: 0
Uid: 1001 1001 1001 1001
Gid: 1001 1001 1001 1001
FDSize: 64
Groups: 4 27 1001
VmPeak: 1009888 kB
VmSize: 1009884 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 205400 kB
VmRSS: 205400 kB
VmData: 762620 kB
VmStk: 136 kB
VmExe: 3312 kB
VmLib: 64144 kB
VmPTE: 852 kB
VmSwap: 0 kB
Threads: 9
SigQ: 0/127009
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001206
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000001fffffffff
Seccomp: 0
Cpus_allowed: ff
Cpus_allowed_list: 0-7
Mems_allowed: 00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 3669360
nonvoluntary_ctxt_switches: 85456
Signal: 11
Uname: Linux 3.8.0-35-generic x86_64
UserGroups: adm sudo
CoreDump: base64
H4sICAAAAAAC/0NvcmVEdW1wAA==
... (huge base64 encoded string left out)
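The leading "H4sIC" of the CoreDump field is what a gzip header looks like in base64, which suggests the core really is embedded in the report. If so, something along these lines might recover it by hand; this is a rough sketch that assumes the field is a base64-encoded gzip stream split across the continuation lines (apport-unpack, if it will accept this report, should do the same job without any code):

package main

import (
    "bufio"
    "bytes"
    "compress/gzip"
    "encoding/base64"
    "io"
    "log"
    "os"
    "strings"
)

func main() {
    in, err := os.Open("_path_to_Program.1001.crash")
    if err != nil {
        log.Fatal(err)
    }
    defer in.Close()

    // Collect the base64 continuation lines of the CoreDump field.
    var compressed []byte
    scanner := bufio.NewScanner(in)
    scanner.Buffer(make([]byte, 1024*1024), 1024*1024) // report lines can be long
    inCore := false
    for scanner.Scan() {
        line := strings.TrimSpace(scanner.Text())
        if strings.HasPrefix(line, "CoreDump:") {
            inCore = true
            continue
        }
        if !inCore {
            continue
        }
        if line == "" || strings.Contains(line, ": ") {
            break // next report field reached, CoreDump data is finished
        }
        chunk, err := base64.StdEncoding.DecodeString(line)
        if err != nil {
            log.Fatal(err)
        }
        compressed = append(compressed, chunk...)
    }
    if err := scanner.Err(); err != nil {
        log.Fatal(err)
    }

    // Decompress the concatenated payload and write it out as "core".
    gz, err := gzip.NewReader(bytes.NewReader(compressed))
    if err != nil {
        log.Fatal(err)
    }
    out, err := os.Create("core")
    if err != nil {
        log.Fatal(err)
    }
    defer out.Close()
    if _, err := io.Copy(out, gz); err != nil {
        log.Fatal(err)
    }
}

If that works, the recovered file should be loadable with gdb /path/to/Program core.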

hadoop multiline mixed records

I would like to parse logfiles produced by the Fidonet mailer binkd. They are multi-line and, much worse, mixed: several instances can write into one logfile, for example:
27 Dec 16:52:40 [2484] BEGIN, binkd/1.0a-545/Linux -iq /tmp/binkd.conf
+ 27 Dec 16:52:40 [2484] session with 123.45.78.9 (123.45.78.9)
- 27 Dec 16:52:41 [2484] SYS BBSName
- 27 Dec 16:52:41 [2484] ZYZ First LastName
- 27 Dec 16:52:41 [2484] LOC City, Country
- 27 Dec 16:52:41 [2484] NDL 115200,TCP,BINKP
- 27 Dec 16:52:41 [2484] TIME Thu, 27 Dec 2012 21:53:22 +0600
- 27 Dec 16:52:41 [2484] VER binkd/0.9.6a-173/Win32 binkp/1.1
+ 27 Dec 16:52:43 [2484] addr: 2:1234/56.78@fidonet
- 27 Dec 16:52:43 [2484] OPT NDA CRYPT
+ 27 Dec 16:52:43 [2484] Remote supports asymmetric ND mode
+ 27 Dec 16:52:43 [2484] Remote requests CRYPT mode
- 27 Dec 16:52:43 [2484] TRF 0 0
*+ 27 Dec 16:52:43 [1520] done (from 2:456/78@fidonet, OK, S/R: 0/0 (0/0 bytes))*
+ 27 Dec 16:52:43 [2484] Remote has 0b of mail and 0b of files for us
+ 27 Dec 16:52:43 [2484] pwd protected session (MD5)
- 27 Dec 16:52:43 [2484] session in CRYPT mode
+ 27 Dec 16:52:43 [2484] done (from 2:1234/56.78@fidonet, OK, S/R: 0/0 (0/0 bytes))
So the logfile is not only multi-line, with an unpredictable number of lines per session, but several records can also be interleaved: session 1520 finished in the middle of session 2484 above.
What would be the right direction in Hadoop to parse such a file? Or shall I just parse it line by line, somehow merge the lines into records later, and then write those records into a SQL database using another set of jobs?
Thanks.
The right direction for Hadoop is to develop your own input format, whose record reader will read the input line by line and produce logical records.
It can be noted that you could actually do it in the mapper as well; that might be a bit simpler. The drawback is that it is not the standard place for such code in Hadoop, and thus it is less reusable.
The other direction you mention is not "natural" for Hadoop, in my view. Specifically: why use all the complicated (and expensive) shuffle machinery to join together several lines that are already at hand?
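To make the "logical record" idea concrete: such a record reader keys each line on the PID in square brackets, buffers lines per session, and emits one record when that session's "done" line arrives. The sketch below illustrates only that grouping logic; it is plain Go for brevity, not Hadoop code, and an actual RecordReader would of course be Java:

package main

import (
    "bufio"
    "fmt"
    "os"
    "regexp"
    "strings"
)

// pidRe extracts the bracketed PID, e.g. "[2484]", that tags every log line.
var pidRe = regexp.MustCompile(`\[(\d+)\]`)

func main() {
    sessions := make(map[string][]string) // PID -> buffered lines

    scanner := bufio.NewScanner(os.Stdin)
    for scanner.Scan() {
        line := scanner.Text()
        m := pidRe.FindStringSubmatch(line)
        if m == nil {
            continue // line without a [pid] tag
        }
        pid := m[1]
        sessions[pid] = append(sessions[pid], line)

        // A "done (..." line closes the session; emit it as one logical record.
        if strings.Contains(line, "] done (") {
            fmt.Printf("--- record for session %s (%d lines) ---\n", pid, len(sessions[pid]))
            for _, l := range sessions[pid] {
                fmt.Println(l)
            }
            delete(sessions, pid)
        }
    }
    if err := scanner.Err(); err != nil {
        fmt.Fprintln(os.Stderr, err)
    }
}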
First of all, parsing the file is not what you are trying to do; you are trying to extract some information from your data.
In your case, you can consider a multi-step MR job, where the first MR job essentially (partially) sorts your input by session_id (do some filtering? some aggregation? multiple reducers?), and then a reducer or the next MR job does the actual calculation.
Without an explanation of what you are trying to extract from your log files, it is hard to give a more definitive answer.
Also, if your data is small, maybe you can process it without the MR machinery at all?
