Pipe stdout to loggly via simple bash script - bash

I want to log stdout from some programs to loggly. There are a few simple utilities out there that I've found (e.g. https://github.com/meatballhat/loggly-pipe and https://github.com/segmentio/loggly-cat), but they seem like they're overkill.
Could I get away with doing something this simple:
log.sh:
#!/bin/bash
while read line
do
echo "$line"
curl -H "content-type:text/plain" -d "$line" https://logs-01.loggly.com/inputs/<my-token>/tag/tag1,tag2/ >/dev/null 2>&1
done < /dev/stdin
Then I run my program and pipe it to my loggly logging script:
./my_script.sh | ./log.sh
This seems to work okay, but I wonder whether all the complexity of the other solutions out there is necessary for some reason.
Could anything go wrong here?
Thanks!

Think about what your script does. It runs curl once per line of input.
Think about what that means.
If you log 10K lines, you'll then spawn 10K processes. Which will initiate 10K TCP connections. This is a massive waste of computing resources.
Also, you don't handle errors at all (in fact, you actively hide them by sending curl's stderr to /dev/null!). This means the script is not only inefficient, it is unreliable.
I recommend starting all Bash scripts with set -eu to exit on unhandled errors, but that's just one step, not a complete fix for the above.
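One way to address both points is to batch lines and report failures instead of hiding them. The following is only a minimal sketch: LOG_URL, BATCH_SIZE, send_batch and forward_lines are hypothetical names, and the /bulk/ URL is an assumption (check it against Loggly's documentation for sending several newline-separated events in one request).

```shell
#!/bin/bash
# Sketch: forward stdin to an HTTP log endpoint in batches,
# so that one curl process handles many lines instead of one.
# LOG_URL and BATCH_SIZE are hypothetical settings; the bulk URL is an assumption.
LOG_URL="${LOG_URL:-https://logs-01.loggly.com/bulk/<my-token>/tag/tag1,tag2/}"
BATCH_SIZE="${BATCH_SIZE:-100}"

# Send one batch (read from stdin) with a single curl process.
# --fail makes curl exit non-zero on HTTP errors, so failures are visible.
send_batch() {
    curl --silent --show-error --fail \
         -H "content-type:text/plain" \
         --data-binary @- "$LOG_URL" >/dev/null
}

# Echo every input line, collect lines into batches,
# and hand each full batch to send_batch.
forward_lines() {
    local line batch="" count=0
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
        batch+="$line"$'\n'
        count=$((count + 1))
        if [ "$count" -ge "$BATCH_SIZE" ]; then
            printf '%s' "$batch" | send_batch || echo "log.sh: failed to send batch" >&2
            batch=""
            count=0
        fi
    done
    # Flush whatever is left when the input ends.
    if [ "$count" -gt 0 ]; then
        printf '%s' "$batch" | send_batch || echo "log.sh: failed to send batch" >&2
    fi
}
```

With these definitions in log.sh followed by a final forward_lines call, ./my_script.sh | ./log.sh would start one curl per 100 lines rather than one per line. A batch is only sent once it is full or the input ends, so a slow producer sees delayed delivery; that trade-off is part of why the dedicated tools are more elaborate.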

Related

Is there a way to redirect all stdout and stderr to systemd journal from within script?

I like the idea of using systemd's journal to view and manage the logs of my own scripts. I have learned that you can log to the journal from user scripts on a per-message basis:
echo 'hello' | systemd-cat -t myscript -p emerg
Is there a way to redirect all messages to journald, even those generated by other commands? Like..
exec &> systemd-cat
Update:
Some partial success.
Tried Inian's suggestion from terminal.
~/scripts/myscript.sh 2>&1 | systemd-cat -t myscript.sh
and it worked, stdout and stderr were directed to systemd's journal.
Curiously,
~/scripts/myscript.sh &> | systemd-cat -t myscript.sh
didn't work in my Bash terminal.
I still need to find a way to do this inside my script for when other programs call my script.
I tried..
exec 2>&1 | systemd-cat -t myscript.sh
but it doesn't work.
Update 2:
From terminal
systemd-cat ~/scripts/myscript.sh
works. But I'm still looking for a way to do this from within the script.
A pipe to systemd-cat is a process which needs to run concurrently with your script. Bash offers a facility for this, though it's not portable to POSIX sh.
exec > >(systemd-cat -t myscript -p emerg) 2>&1
The >(command) process substitution starts another process and returns a pseudo-filename (something like /dev/fd/63) which you can redirect into. This is basically a wrapper for the mkfifo hacks you could do if you wanted to port this to POSIX sh.
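For reference, the mkfifo approach mentioned above might look roughly like this in plain sh. It is a sketch, not a drop-in replacement: log_to_fifo is a hypothetical helper name, and mktemp -d is assumed to be available (it is common but not required by POSIX).

```shell
#!/bin/sh
# Sketch: send all later stdout and stderr of this shell through a command,
# using a named pipe instead of bash's >(...) process substitution.
# Usage (hypothetical): log_to_fifo systemd-cat -t myscript
log_to_fifo() {
    fifo_dir=$(mktemp -d) || return 1
    mkfifo "$fifo_dir/log" || return 1
    # Start the reader in the background; it blocks until a writer opens the pipe.
    "$@" < "$fifo_dir/log" &
    # Point this shell's stdout and stderr at the pipe.
    exec > "$fifo_dir/log" 2>&1
    # Both ends are open now, so the name on disk is no longer needed.
    rm -rf "$fifo_dir"
}
```

As with the process substitution, the reader runs concurrently and may still be flushing the last lines when the script itself exits.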
If your script happens not to be a shell script, but is written in some other programming language that allows loading extension modules linked with -lsystemd, there is another way. The library function sd_journal_stream_fd matches the task at hand quite precisely. Calling it from bash itself (as opposed to some child) seems difficult at best. In Python, for instance, it is available as systemd.journal.stream. In essence, this function connects a Unix domain stream socket and communicates what kind of data is being transmitted (e.g. the priority). The difficult part with a shell is making the shell itself connect to a Unix domain socket (as opposed to connecting in a child).
The key idea to this answer was given by Freenode/libera.chat user grawity.
Apparently, and for reasons that are beyond me, you can't really redirect all stdout and stderr to journald from within a script because it has to be piped in. To work around that I found a trick people were using with syslog's logger which works similarly.
You can wrap all your code into a function and then pipe the function into systemd-cat.
#!/bin/bash
mycode(){
echo "hello world"
echor "echo typo producing error"
}
mycode 2>&1 | systemd-cat -t myscript.sh
exit 0
And then to search journal logs..
journalctl -t myscript.sh --since yesterday
I'm disappointed there isn't a more direct way of doing this.

log of parallel computations, how do I prevent interleaved write? lockfile or flock?

I see that how to prevent scripts from running concurrently has been discussed several times, but I have not seen the topic of concurrent writes.
I am doing some parallel computation, with xargs launching the commands for the actual computations. At the end of each computation I want that process to open a file and put its results in there. I am running into trouble because every process can write to the log file at the same time, resulting in interleaved entries: one line from one run, then a line from another run that finished at about the same time (which is likely to happen given the parallel nature of the run with xargs).
So in practice, let's say that using xargs I run in parallel several instances of a script that reads:
#!/bin/bash
#### do something that takes some time
#### define content of the log
folder="<folder>"$PWD"</folder>\n"
datetag="<enddate>"`date`"</enddate>\n"
#### store log in XML ####
echo -e "<myrun>\n""$folder""$datetag""</myrun>" >> "$outputfile"
At present I get output file with interleaved runs log like this
<myrun>
<myrun>
<folder>./generations/test/run1</folder>
<folder>./generations/test/run2</folder>
<enddate>Sun Jul 6 11:17:58 CEST 2014</enddate>
</myrun>
<enddate>Sun Jul 6 11:17:58 CEST 2014</enddate>
</myrun>
Is there a way to give "exclusive access" to one instance of the script at a time, so that each script is writing its log without interference with the others?
I have seen flock and lockfile, but I am not sure which fits my case best, and I am looking for advice or suggestions.
Thanks,
Roberto
I will use traceroute as an example, as it prints output slowly, but any other command would also work. Compare:
(echo 8.8.8.8;echo 8.8.4.4) | xargs -P6 -n1 traceroute > traceroute.xarg
to:
(echo 8.8.8.8;echo 8.8.4.4) | parallel traceroute > traceroute.para
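By default GNU Parallel buffers each job's output and prints it only when the job has finished, so output from different jobs is not mixed. Make sure you install GNU Parallel and not another parallel, and that /etc/parallel/config is empty.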
I think this does the job in the end. lockfile retries once per second (-1) until this instance of the script can lock the log file for itself. The script then writes its block and unlocks the file.
The other instances of the script that are running in parallel and might be trying to write will either find the lock and wait, or will be able to lock the file for themselves.
if lockfile -1 log.lock; then
echo "accessing file at $(date)"
echo -e "$logblock" >> log
rm -f log.lock
fi
Does anybody see any drawbacks in this type of solution?
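For comparison, here is roughly what the flock variant mentioned in the question could look like. This is a sketch that assumes flock(1) from util-linux is installed; append_locked and the ".lock" suffix are hypothetical choices.

```shell
#!/bin/bash
# Sketch: append one block to a log file while holding an exclusive lock.
# Usage (hypothetical): append_locked "$outputfile" "$logblock"
append_locked() {
    local logfile=$1 block=$2
    (
        # Wait until this process holds the exclusive lock on file descriptor 9.
        flock -x 9 || exit 1
        printf '%s\n' "$block" >> "$logfile"
    ) 9>"$logfile.lock"   # the lock is released when fd 9 is closed
}
```

One difference from lockfile: the lock is tied to the open file descriptor, so it goes away when the process exits or is killed, and there is no stale log.lock to clean up by hand.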

Bash script to start Solr deltaimporthandler

I am after a bash script which I can use to trigger a delta import of XML files via CRON. After a bit of digging and modification I have this:
#!/bin/bash
# Bash to initiate Solr Delta Import Handler
# Setup Variables
urlCmd='http://localhost:8080/solr/dataimport?command=delta-import&clean=false'
statusCmd='http://localhost:8080/solr/dataimport?command=status'
outputDir=.
# Operations
wget -O $outputDir/check_status_update_index.txt ${statusCmd}
2>/dev/null
status=`fgrep idle $outputDir/check_status_update_index.txt`
if [[ ${status} == *idle* ]]
then
wget -O $outputDir/status_update_index.txt ${urlCmd}
2>/dev/null
fi
Can I get any feedback on this? Is there a better way of doing it? Any optimisations or improvements would be most welcome.
This certainly looks usable. Just to confirm, you intend to run this every X minutes from your crontab? That seems reasonable.
The only major quibble (IMHO) is discarding STDERR information with 2>/dev/null. Of course, it depends on what your expectations are for this system. If this is for a paying customer or employer, do you want to have to explain to the boss, "Gosh, I didn't know I was getting the error message 'Can't connect to host X' for the last 3 months because we redirect STDERR to /dev/null"? If this is for your own project, and you're monitoring the work via other channels, then it's not so terrible, but why not capture STDERR to a file and check that there are no errors? As a general idea:
myStdErrLog=/tmp/myProject/myProg.stderr.$(/bin/date +%Y%m%d.%H%M)
wget -O $outputDir/check_status_update_index.txt ${statusCmd} 2> ${myStdErrLog}
# -s is true when the file exists and is not empty, i.e. something was written to STDERR
if [[ -s ${myStdErrLog} ]] ; then
mail -s "error on myProg" me@myself.org < ${myStdErrLog}
fi
rm ${myStdErrLog}
Depending on what wget includes in its STDERR output, you may need to filter what is in the StdErrLog to see if there are "real" error messages that you need to have sent to you.
A medium quibble is your use of backticks for command substitution. If you're using double square brackets for evaluations, then why not embrace complete ksh93/bash semantics? The only reason to use backticks is if you think you need to be ultra-backwards compatible and that you'll be running this script under the original Bourne shell (POSIX shells such as dash already support the $( ... ) form). Backticks have been deprecated in ksh since at least 1993. Try
status=$(fgrep idle $outputDir/check_status_update_index.txt)
The $( ... ) form of command substitution makes it very easy to nest multiple command substitutions, e.g. echo $(echo one $(echo two ) ). (Bad example, as the need to nest command substitutions is pretty rare, but I can't think of a better one right now.)
Depending on your situation, in a large production environment where new software is installed to version-numbered directories, you might want to construct your paths from variables, i.e.
hostName=localhost
portNum=8080
SOLRPATH=/solr
SOLRCMD='delta-import&clean=false'
urlCmd="http://${hostName}:${portNum}${SOLRPATH}/dataimport?command=${SOLRCMD}"
The final, minor quibble ;-). Are you sure ${status} == *idle* does what you want?
Try using something like
case "${status}" in
*idle* ) .... ;;
* ) echo "unknown status = ${status} or similar" 1>&2 ;;
esac
Yes, your if ... fi certainly works, but if you want to start doing more refined processing of the information that you put in your ${status} variable, then case ... esac is the way to go.
EDIT
I agree with @alinsoar that 2>/dev/null on a line by itself will be a no-op. I assumed that it was a formatting issue, but looking at your code in edit mode I see that it appears to be on its own line. If you really want to discard STDERR messages, then you need cmd ... 2>/dev/null all on one line, OR, as alinsoar advocates, the shell will accept redirections at the front of the line, but again, all on one line ;-!.
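A quick way to convince yourself of the difference, with a hypothetical noisy function standing in for wget (the variable names are made up for the demonstration):

```shell
#!/bin/sh
# noisy is a hypothetical stand-in for wget: one line on stdout, one on stderr.
noisy() { echo "out"; echo "err" >&2; }

# Redirection at the end of the same line: stderr is discarded.
trailing=$( { noisy 2>/dev/null; } 2>&1 )

# Redirection at the front of the same line: accepted by the shell, same effect.
leading=$( { 2>/dev/null noisy; } 2>&1 )

# Redirection on the following line: it is a separate, empty command,
# so noisy's stderr is not affected at all.
split=$( { noisy
2>/dev/null; } 2>&1 )

printf 'trailing: %s\n' "$trailing"
printf 'leading:  %s\n' "$leading"
printf 'split:    %s\n' "$split"
```

Only the third capture should contain the err line.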
IHTH

Getting stdout+stderr in a log file

I am trying to implement something which my logic says can't be done, but I need your help to understand why it can't be.
Short Version of Question
Is it possible to log stdout+stderr of a script in csh without using file redirection ( >& or tee )?
Detailed Explanation of Question
I have a requirement with a csh script (script1) where I am not allowed to use file redirection.(I will give the reason in a while)
So that means I can't use something like
echo just checking >& logfile
hence I can't use this or tee to create my logfile.
I also have a another script (script2) which is a top level script.
I can either run script1 in standalone mode or through script2.
In either case i need to create a log(stdout+stderr) of script1 in logfile.
There are two possible(but not complete) option for that
write this line in script2
./script1 >& logfile
But then I can't log script1 in logfile when script1 is run in standalone mode.
Another option is to use file redirections in script1 like this:
echo test starting >> logfile
echo test over
In this case there are two disadvantages:
1) "test over" prints before "test starting", i.e. the order in which the command logs appear is not certain.
2) It's tedious to put >>& after every statement if I intend to cover the whole script.
Now, is there any other way I can get what I need? That is, can I run script1 without file redirection and still log its stdout+stderr in logfile?
You mention csh, so this may not help you. On the other hand, it may motivate you to stop using csh for scripts, a task for which it is notoriously inappropriate. In sh, you can simply do:
#!/bin/sh
exec > logfile 2>&1
echo foo
This writes foo (and the output and errors of all subsequent commands) to the logfile.
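If only part of the script should be logged, the same exec trick can be switched on and off by first saving the original streams. A small sketch in sh; start_logging and stop_logging are hypothetical helper names, and file descriptors 3 and 4 are assumed to be unused by the rest of the script.

```shell
#!/bin/sh
# Sketch: redirect this shell's stdout and stderr to a log file, and undo it later.
start_logging() {
    exec 3>&1 4>&2              # keep copies of the original stdout and stderr
    exec >>"$1" 2>&1            # from here on, everything is appended to the log file
}

stop_logging() {
    exec 1>&3 2>&4 3>&- 4>&-    # restore the original streams and close the copies
}
```

Calling start_logging logfile at the top of script1 covers both cases from the question: the log is written whether script1 runs standalone or is started by script2.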

starting remote script via ssh containing nohup

I want to start a script remotely via ssh like this:
ssh user@remote.org -t 'cd my/dir && ./myscript data my@email.com'
The script does various things which work fine until it comes to a line with nohup:
nohup time ./myprog $1 >my.log && mutt -a ${1%.*}/`basename $1` -a ${1%.*}/`basename ${1%.*}`.plt $2 < my.log 2>&1 &
It is supposed to start the program myprog, write its output to my.log, and send an email with some data files created by myprog as attachments and the log as the body. But when the script reaches this line, ssh outputs:
Connection to remote.org closed.
What is the problem here?
Thanks for any help
Your command runs a pipeline of processes in the background, so the calling script will exit straight away (or very soon afterwards). This will cause ssh to close the connection. That in turn will cause a SIGHUP to be sent to any process attached to the terminal that the -t option caused to be created.
Your time ./myprog process is protected by a nohup, so it should carry on running. But your mutt isn't, and that is likely to be the issue here. I suggest you change your command line to:
nohup sh -c "time ./myprog $1 >my.log && mutt -a ${1%.*}/`basename $1` -a ${1%.*}/`basename ${1%.*}`.plt $2 < my.log 2>&1 " &
so the entire pipeline gets protected. (If that doesn't fix it it may be necessary to do something with file descriptors - for instance mutt may have other issues with the terminal not being around - or the quoting may need tweaking depending on the parameters - but give that a try for now...)
This answer may be helpful. In summary, to achieve the desired effect, you have to do the following things:
Redirect all I/O on the remote nohup'ed command
Tell your local SSH command to exit as soon as it's done starting the remote process(es).
Quoting the answer I already mentioned, in turn quoting wikipedia:
Nohuping backgrounded jobs is for example useful when logged in via SSH, since backgrounded jobs can cause the shell to hang on logout due to a race condition [2]. This problem can also be overcome by redirecting all three I/O streams:
nohup myprogram > foo.out 2> foo.err < /dev/null &
UPDATE
I've just had success with this pattern:
ssh -f user@host 'sh -c "( (nohup command-to-nohup >output.file 2>&1 </dev/null) & )"'
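On the remote side, the "redirect all three streams, then background" part can be wrapped in a small helper. This is a sketch only; start_detached is a hypothetical name.

```shell
#!/bin/sh
# Sketch: start a command that survives the end of the ssh session.
# Usage (hypothetical): start_detached LOGFILE command [args...]
start_detached() {
    log=$1
    shift
    # stdout and stderr go to the log file, stdin comes from /dev/null,
    # so nothing stays attached to the terminal that ssh is about to close.
    nohup "$@" >"$log" 2>&1 </dev/null &
}
```

Note the order of the redirections: >"$log" has to come before 2>&1, otherwise stderr is duplicated from the old stdout and still points at the ssh session.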
I managed to solve this for a use case where I need to start backgrounded scripts remotely via ssh. The technique is similar to other answers here, but in a way I feel is simpler and cleaner (at least, it makes my code shorter and, I believe, better-looking): explicitly close all three streams using the stream-close redirection syntax, rather than the more widely used but (IMHO) hackier "redirect to/from /dev/null". The syntax is discussed at the following locations:
https://unix.stackexchange.com/questions/131801/closing-a-file-descriptor-vs
https://unix.stackexchange.com/questions/70963/difference-between-2-2-dev-null-dev-null-and-dev-null-21
http://www.tldp.org/LDP/abs/html/io-redirection.html#CFD
https://www.gnu.org/software/bash/manual/html_node/Redirections.html
This results in the deceptively simple:
nohup script.sh >&- 2>&- <&-&
2>&1 works just as well as 2>&-, but I feel the latter is ever-so-slightly more clear. ;) Most people might have a space preceding the final "background job" ampersand, but since it is not required (as the ampersand itself functions like a semicolon in normal usage), I prefer to omit it. :)
