Problem description:
I have a relatively big /var/log/messages file which is rotated.
The file list looks like this:
ls -l /var/log/messages*
-rw-------. 1 root 928873050 Mar 5 10:37 /var/log/messages
-rw-------. 1 root 889843643 Mar 5 07:49 /var/log/messages.1
-rw-------. 1 root 890148183 Mar 5 07:50 /var/log/messages.2
-rw-------. 1 root 587333632 Mar 5 07:51 /var/log/messages.3
My filebeat configuration snippet:
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/messages
    - /var/lib/ntp/drift
    - /var/log/syslog
    - /var/log/secure
  tail_files: True
With multiple /var/log/messages* files present as shown above, each time Filebeat is restarted it starts harvesting and ingesting the old log files.
When there is just one /var/log/messages file, the issue is not observed.
On Linux systems, Filebeat keeps track of files not by filename but by inode number, which doesn't change when a file is renamed. The following is from the Filebeat documentation:
The harvester is responsible for opening and closing the file, which
means that the file descriptor remains open while the harvester is
running. If a file is removed or renamed while it’s being harvested,
Filebeat continues to read the file. This has the side effect that the
space on your disk is reserved until the harvester closes. By default,
Filebeat keeps the file open until close_inactive is reached.
This means the following happens in your case:
1. Filebeat reads the current messages file (inode#1) and keeps track of its inode number in the registry.
2. Filebeat stops; the messages file is rotated to messages.1 (still inode#1) and a new messages file (inode#2) is created.
3. When Filebeat restarts, it starts reading both:
   - messages.1 (inode#1), from where it left off, and
   - messages (inode#2), since it matches the path you configured (/var/log/messages).
If your plan is to harvest all of the messages files, even the rotated ones, then it is better to configure the path as:
/var/log/messages*
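For example, the prospector could look roughly like this (a sketch that only swaps the /var/log/messages entry for the glob and leaves the rest of your prospector unchanged):
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/messages*
    - /var/lib/ntp/drift
    - /var/log/syslog
    - /var/log/secure
  tail_files: True
With the glob, the rotated messages.N files match the configured path explicitly, so harvesting them becomes a deliberate choice rather than a side effect of the inode tracking described above.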
It seems the syslog and security plugins were enabled in the configuration, which triggered the loading of the rotated syslog files.
Related
I was trying to run a Hadoop job to do word shingling, and all my nodes soon went into an unhealthy state because their storage was used up.
Here is my mapper part:
#!/usr/bin/env python
import sys

shingle = 5
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # emit every 5-character shingle with a count of 1
    for i in range(0, len(line) - shingle + 1):
        print('%s\t%s' % (line[i:i+shingle], 1))
My understanding is that 'print' generates an intermediate (temp) file on each node, which takes up storage space. If I take a txt file as an example:
cat README.txt |./shingle_mapper.py >> temp.txt
I can see the size of the original and temp file:
-rw-r--r-- 1 root root 1366 Nov 13 02:46 README.txt
-rw-r--r-- 1 root root 9744 Nov 14 01:43 temp.txt
The temp file is over 7 times the size of the input file, so I guess this is why each of my nodes has used up all its storage.
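A rough back-of-the-envelope check supports that (a sketch, assuming plain ASCII text and the 5-character shingles produced by the mapper above): each record is 5 characters plus a tab, a '1', and a newline, i.e. 8 bytes, and a line of length n yields about n-4 such records, so for long lines the output approaches 8 times the input size; shorter and blank lines pull the ratio down, which is consistent with the ~7x measured here.
# Sketch: estimate the mapper's output size for one (hypothetical) input line
shingle = 5
line = "hello world example"                 # example input line, 19 characters
n_records = max(0, len(line) - shingle + 1)  # 15 shingle records
bytes_per_record = shingle + 1 + 1 + 1       # shingle + '\t' + '1' + '\n' = 8 bytes
print(n_records * bytes_per_record)          # 120 bytes out for ~20 bytes in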
My question is: do I understand the temp files correctly? If so, is there a better way to reduce their size (adding additional storage is not an option for me)?
I have tried almost everything but still can't find a solution to this issue, so I wanted to ask for a little help:
I have logrotate (v. 3.7.8) configured to rotate based on the size of the log files:
/home/test/logs/*.log {
    missingok
    notifempty
    nodateext
    size 10k
    rotate 100
    copytruncate
}
Rotation is based only on size and is invoked whenever a message arrives at the rsyslog daemon (v. 5.8.10). Configuration of rsyslog:
$ModLoad omprog
$ActionOMProgBinary /usr/local/bin/log_rotate.sh
$MaxMessageSize 64k
$ModLoad imuxsock
$ModLoad imklog
$ModLoad imtcp
$InputTCPServerRun 514
$template FORMATTER, "%HOSTNAME% | %msg:R,ERE,4,FIELD:(.*)\s(.*)(:::)(.*)--end%\n"
$ActionFileDefaultTemplate FORMATTER
$Escape8BitCharactersOnReceive off
$EscapeControlCharactersOnReceive off
$SystemLogRateLimitInterval 0
$SystemLogRateLimitBurst 0
$FileOwner test
$FileGroup test
$DirOwner test
$DirGroup test
# Log each module execution to separate log files and don't use the prepending module_execution_ in the log name.
$template CUSTOM_LOGS,"/home/test/logs/%programname:R,ERE,1,FIELD:^module_execution_(.*)--end%.log"
if $programname startswith 'module_execution_' then :omprog:
if $programname startswith 'module_execution_' then ?CUSTOM_LOGS
& ~
The script invoked by omprog just runs logrotate and, for test purposes, appends a line to the logrot file:
#!/bin/bash
echo "1" >> /home/test/logrot
/usr/sbin/logrotate /etc/logrotate.conf -v
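One thing worth checking (this is an assumption, not a confirmed diagnosis): with omprog, rsyslog starts the configured binary and writes the matching messages to its standard input, and - at least in the rsyslog versions where this is documented - it expects that process to keep running and keep reading. A script that exits immediately without touching stdin can leave the action stuck once the pipe fills or breaks, which would look exactly like rsyslog no longer calling the external script. A minimal long-running variant of the script might look like this sketch:
#!/bin/bash
# Sketch only: assumes omprog feeds one syslog message per line on stdin
# and expects this process to stay alive for the lifetime of the action.
while read -r line; do
    echo "1" >> /home/test/logrot
    /usr/sbin/logrotate /etc/logrotate.conf -v
done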
How to reproduce:
execute:
for i in {1..50000}; do logger -t "module_execution_test" "test message"; done;
check the rotated files - there will be a lot of files (test.log.1, 2, 3, etc.) with sizes close to 10 kB, and one test.log that is much bigger than predicted
check:
wc -l /home/test/logrot
The count grows for some time but then stops, even though messages are still arriving (it hangs at exactly the moment rotation stops happening) - meaning that rsyslog no longer calls the external script.
So IMO it looks like a bug in rsyslog or the omprog plugin. Any idea what is going on?
br
I've spent some hours trying to figure out why logrotate won't successfully upload my logs to S3, so I'm posting my setup here. Here's the thing: logrotate uploads the log file correctly to S3 when I force it like this:
sudo logrotate -f /etc/logrotate.d/haproxy
Starting S3 Log Upload...
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
/var/log/haproxy-2014-12-23-044414.gz -> s3://my-haproxy-access-logs/haproxy-2014-12-23-044414.gz [1 of 1]
315840 of 315840 100% in 0s 2.23 MB/s done
But it does not succeed as part of the normal logrotate process. The logs are still compressed by my postrotate script, so I know that it is being run. Here is my setup:
/etc/logrotate.d/haproxy =>
/var/log/haproxy.log {
    size 1k
    rotate 1
    missingok
    copytruncate
    sharedscripts
    su root root
    create 777 syslog adm
    postrotate
        /usr/local/admintools/upload.sh 2>&1 /var/log/upload_errors
    endscript
}
/usr/local/admintools/upload.sh =>
echo "Starting S3 Log Upload..."
BUCKET_NAME="my-haproxy-access-logs"
# Perform Rotated Log File Compression
filename=/var/log/haproxy-$(date +%F-%H%M%S).gz
tar -czPf "$filename" /var/log/haproxy.log.1
# Upload log file to Amazon S3 bucket
/usr/bin/s3cmd put "$filename" s3://"$BUCKET_NAME"
And here is the output of a dry run of logrotate:
sudo logrotate -fd /etc/logrotate.d/haproxy
reading config file /etc/logrotate.d/haproxy
Handling 1 logs
rotating pattern: /var/log/haproxy.log forced from command line (1 rotations)
empty log files are rotated, old logs are removed
considering log /var/log/haproxy.log
log needs rotating
rotating log /var/log/haproxy.log, log->rotateCount is 1
dateext suffix '-20141223'
glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
renaming /var/log/haproxy.log.1 to /var/log/haproxy.log.2 (rotatecount 1, logstart 1, i 1),
renaming /var/log/haproxy.log.0 to /var/log/haproxy.log.1 (rotatecount 1, logstart 1, i 0),
copying /var/log/haproxy.log to /var/log/haproxy.log.1
truncating /var/log/haproxy.log
running postrotate script
running script with arg /var/log/haproxy.log : "
/usr/local/admintools/upload.sh 2>&1 /var/log/upload_errors
"
removing old log /var/log/haproxy.log.2
Any insight appreciated.
It turned out that my s3cmd was configured for my user, not for root.
ERROR: /root/.s3cfg: No such file or directory
ERROR: Configuration file not available.
ERROR: Consider using --configure parameter to create one.
The solution was to copy my config file over. – worker1138
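In other words, because the postrotate script runs as root, s3cmd looks for /root/.s3cfg instead of the per-user config. Two common ways to handle that (a sketch; the /home/ubuntu path below is only an example of where a user-level config might live):
# Option 1: give root its own copy of the s3cmd configuration
sudo cp /home/ubuntu/.s3cfg /root/.s3cfg
# Option 2: point s3cmd at an explicit config file inside upload.sh
/usr/bin/s3cmd --config /home/ubuntu/.s3cfg put "$filename" s3://"$BUCKET_NAME"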
I have a question about syslog, fifos, and log files.
For example, I have my gc.log and I have this configuration in syslog-ng
source s_splunk {
    udp(ip("127.0.0.1") port(514));
    file("/logs/gc.log" follow_freq(1));
};
destination d_splunk {
    tcp (my.splunk.intranet port (1514));
};
log {
    source (s_splunk);
    destination (d_splunk);
};
to index this gc.log in Splunk. But this way I get high CPU consumption, and I would like to change how I'm indexing this log file.
I would like to do the indexing through a fifo file, but I can't change how the application generates this log file.
How can I do this?
I found a way to solve my problem. I deleted my gc.log file, recreated it as a fifo, and changed its permissions.
So the JVM writes its log to the fifo, and in syslog-ng I configured destinations to write the log to a file and to send it to my Splunk VIP (my.splunk.intranet).
With this solution my syslog no longer has high CPU usage.
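In case it is useful to others, the change can be sketched roughly like this (paths and the user name are examples, not the exact setup):
# Recreate the log target as a fifo that the JVM writes into
rm /logs/gc.log
mkfifo /logs/gc.log
chown jvmuser /logs/gc.log   # 'jvmuser' is a placeholder for whatever user runs the JVM
and on the syslog-ng side, read from the fifo with the pipe() source driver instead of following a regular file:
source s_gc {
    pipe("/logs/gc.log");
};
log {
    source(s_gc);
    destination(d_splunk);
};
Reading from a fifo means syslog-ng is woken up by new data instead of polling the file, which is presumably why the CPU usage dropped.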
I'm using the following simple code to upload files to HDFS.
FileSystem hdfs = FileSystem.get(config);
hdfs.copyFromLocalFile(src, dst);
The files are generated by a webserver Java component and are rotated and closed by logback in .gz format. I've noticed that sometimes a .gz file is corrupted.
> gunzip logfile.log_2013_02_20_07.close.gz
gzip: logfile.log_2013_02_20_07.close.gz: unexpected end of file
But the following command does show me the content of the file
> hadoop fs -text /input/2013/02/20/logfile.log_2013_02_20_07.close.gz
The impact of having such files is disastrous - the aggregation for the whole day fails, and several slave nodes get blacklisted when it happens.
What can I do in such a case?
Can the Hadoop copyFromLocalFile() utility corrupt the file?
Has anyone met a similar problem?
It shouldn't - this error is normally associated with gzip files which weren't closed out when originally written to local disk, or which were copied to HDFS before they had finished being written.
You should be able to check by running an md5sum on the original file and on the one in HDFS - if they match, then the original file is corrupt:
hadoop fs -cat /input/2013/02/20/logfile.log_2013_02_20_07.close.gz | md5sum
md5sum /path/to/local/logfile.log_2013_02_20_07.close.gz
If they don't match, then check the timestamps on the two files - the one in HDFS should have been modified after the one on the local file system.
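As an extra safeguard on the upload side, each archive can be verified before it is copied to HDFS - for example with gzip -t - so that half-written files never reach the daily aggregation. A rough sketch (the local path pattern is an example, not your actual layout):
# Sketch: ship only archives that pass a gzip integrity check
for f in /var/log/app/*.close.gz; do
    if gzip -t "$f" 2>/dev/null; then
        hadoop fs -copyFromLocal "$f" /input/2013/02/20/
    else
        echo "skipping incomplete archive: $f" >&2
    fi
done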