What does && do in redirecting an output to a file? - bash

I found a bash script that lists all files with a blank line at the beginning or end of the file.
for f in `find . -type f`; do
for t in head tail; do
$t -1 $f |egrep '^[ ]*$' >/dev/null && echo "blank line at the $t of $f" ;
done;
done
I would like to pass the output to a file. I changed the echo line to:
$t -1 $f |egrep '^[ ]*$' >/dev/null && echo "blank line at the $t of $f" > blank_lines.log
But this didn't work.
I would like to ask what the symbol && does and why the above line does not pass the output to the file.

This does not redirect output; it adds another command to execute if and only if the first one succeeded. That is, command1 && command2 means: execute command1 and, if it is successful, execute command2. The result of executing the whole statement is success if and only if both commands succeeded.
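For example (mkdir and cd here are just arbitrary commands to illustrate the behaviour):
mkdir /tmp/somedir && cd /tmp/somedir   # cd runs only if mkdir succeeded
false && echo "this is never printed"   # the right-hand side is skipped because false fails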

The reason the redirect doesn't work is that you used > to redirect output. This erases the file before writing the command's output to it, meaning that you'll only ever see the last thing written to it. To fix this, either use >> to append (although you may need to truncate the file at the beginning to prevent the output from multiple runs of the script accumulating), or redirect output for the entire loop.
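For example, the append variant of the line in question would look like this (a sketch, keeping the rest of your loop unchanged):
$t -1 "$f" | egrep '^[ ]*$' >/dev/null && echo "blank line at the $t of $f" >> blank_lines.log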
BTW, the for ... find construct you're using to iterate over files will fail horribly if any filenames have spaces or other funny characters in them. It's much better to use find ... -print0 | while IFS= read -d $'\0' ... (and then use double-quotes around $f):
find . -type f -print0 | while IFS= read -d $'\0' f; do
for t in head tail; do
$t -1 "$f" |egrep '^[ ]*$' >/dev/null && echo "blank line at the $t of $f"
done
done > blank_lines.log

It's a short-circuited 'and' operator. It takes no part in output redirection.
In this context, it performs the right hand side only if the left hand side succeeded. I.e. it only prints out 'blank line at the ...' if there was indeed a blank line reported by egrep.

As the other answers mention, && is a short-circuiting 'and' operator:
command1 && command2
executes command1 and, if that command ran without errors (exit code 0), executes command2. It then returns the exit code of command2.
echo "foo" > /dev/null && echo "bar" > tmp.txt
does work, so the reason your script does not work should be something else.
Perhaps try executing the
$t -1 $f |egrep '^[ ]*$' >/dev/null && echo "blank line at the $t of $f" > blank_lines.log
line without the variables and the for loop.
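For example, a concrete instance of that line to test in isolation might be (somefile.txt is just a placeholder):
head -1 somefile.txt | egrep '^[ ]*$' >/dev/null && echo "blank line at the head of somefile.txt" > blank_lines.log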

Related

Speed up shell script/Performance enhancement of shell script

Is there a way to speed up the below shell script? It's taking me a good 40 mins to update about 150000 files every day. Sure, given the volume of files to create & update, this may be acceptable. I don't deny that. However, if there is a much more efficient way to write this or to rewrite the logic entirely, I'm open to it. I'm looking for some help, please.
#!/bin/bash
DATA_FILE_SOURCE="<path_to_source_data>/${1}"
DATA_FILE_DEST="<path_to_dest>"
for fname in $(ls -1 "${DATA_FILE_SOURCE}")
do
for line in $(cat "${DATA_FILE_SOURCE}"/"${fname}")
do
FILE_TO_WRITE_TO=$(echo "${line}" | awk -F',' '{print $1"."$2".daily.csv"}')
CONTENT_TO_WRITE=$(echo "${line}" | cut -d, -f3-)
if [[ ! -f "${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}" ]]
then
echo "${CONTENT_TO_WRITE}" >> "${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}"
else
if ! grep -Fxq "${CONTENT_TO_WRITE}" "${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}"
then
sed -i "/${1}/d" "${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}"
"${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}"
echo "${CONTENT_TO_WRITE}" >> "${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}"
fi
fi
done
done
There are still parts of your published script that are unclear, like the sed command. Still, I rewrote it with saner practices and far fewer external calls, which should really speed it up.
#!/usr/bin/env sh
DATA_FILE_SOURCE="<path_to_source_data>/$1"
DATA_FILE_DEST="<path_to_dest>"
for fname in "$DATA_FILE_SOURCE/"*; do
while IFS=, read -r a b content || [ "$a" ]; do
destfile="$DATA_FILE_DEST/$a.$b.daily.csv"
if grep -Fxq "$content" "$destfile"; then
sed -i "/$1/d" "$destfile"
fi
printf '%s\n' "$content" >>"$destfile"
done < "$fname"
done
Make it parallel (as much as you can).
#!/bin/bash
set -e -o pipefail
declare -ir MAX_PARALLELISM=20 # pick a limit
declare -i pid
declare -a pids
# ...
for fname in "${DATA_FILE_SOURCE}/"*; do
if ((${#pids[@]} >= MAX_PARALLELISM)); then
wait -p pid -n || echo "${pids[pid]} failed with ${?}" 1>&2 # wait -p needs bash 5.1+
unset 'pids[pid]'
fi
while IFS= read -r line; do
FILE_TO_WRITE_TO="..."
# ...
done < "${fname}" & # forking here
pids[$!]="${fname}"
done
for pid in "${!pids[@]}"; do
wait -n "$((pid))" || echo "${pids[pid]} failed with ${?}" 1>&2
done
Here’s a directly runnable skeleton showing how the harness above works (with 36 items to process and 20 parallel processes at most):
#!/bin/bash
set -e -o pipefail
declare -ir MAX_PARALLELISM=20 # pick a limit
declare -i pid
declare -a pids
do_something_and_maybe_fail() {
sleep $((RANDOM % 10))
return $((RANDOM % 2 * 5))
}
for fname in some_name_{a..f}{0..5}.txt; do # 36 items
if ((${#pids[@]} >= MAX_PARALLELISM)); then
wait -p pid -n || echo "${pids[pid]} failed with ${?}" 1>&2
unset 'pids[pid]'
fi
do_something_and_maybe_fail & # forking here
pids[$!]="${fname}"
echo "${#pids[@]} running" 1>&2
done
for pid in "${!pids[@]}"; do
wait -n "$((pid))" || echo "${pids[pid]} failed with ${?}" 1>&2
done
Strictly avoid spawning external processes (such as awk, grep and cut) once per line; fork()ing is extremely inefficient in comparison to:
Running one single awk / grep / cut process on an entire input file (to preprocess all lines at once for easier processing in bash) and feeding the whole output into (e.g.) a bash loop.
Using Bash expansions instead, where feasible, e.g. "${line/,/.}" and other tricks from the EXPANSION section of the man bash page, without fork()ing any further processes.
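For instance, a small sketch of those expansions (the field layout here is hypothetical, mirroring the comma-separated data in the question):
line='host1,metric2,value3,value4'
field1="${line%%,*}"   # "host1" – what cut -d, -f1 would return, without a fork
rest="${line#*,}"      # "metric2,value3,value4"
field2="${rest%%,*}"   # "metric2" – like cut -d, -f2
content="${rest#*,}"   # "value3,value4" – like cut -d, -f3-
destname="$field1.$field2.daily.csv"   # what the awk call in the question builds
echo "$destname <- $content"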
Off-topic side notes:
ls -1 is unnecessary. First, ls won’t write multiple columns unless the output is a terminal, so a plain ls would do. Second, bash expansions are usually a cleaner and more efficient choice. (You can use nullglob to correctly handle empty directories / “no match” cases.)
Looping over the output from cat is a (less common) useless use of cat case. Feed the file into a loop in bash instead and read it line by line. (This also gives you more line format flexibility.)
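A minimal bash sketch of that pattern, combined with nullglob from the previous note (variable names reused from the question):
shopt -s nullglob   # a pattern with no matches expands to nothing instead of itself
for fname in "${DATA_FILE_SOURCE}"/*; do
    while IFS= read -r line; do
        printf 'processing: %s\n' "$line"   # handle "$line" here
    done < "${fname}"
done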

Display output of command in terminal while using command substitution

So I'm trying to check the output of a command, but I also want to be able to display the output directly in the terminal.
#!/bin/bash
while :
do
OUT=$(streamlink -o "$NAME" "$STREAM" best)
echo "$OUT"
if [[ $OUT == *"No playable streams"* ]]; then
echo "Delaying!"
sleep 15s
fi
done
This is what I tried to do.
The code checks whether the output of the command contains that error substring; if so, it adds a delay. That part works well.
But it doesn't work well when the command is actually downloading a file successfully, since it won't perform that echo until the download is finished (which could take hours). Until then I have no way of checking the command's output.
Plus, the output of this particular command displays and updates the speed and file size in real time, something echo wouldn't be able to replicate.
So is there a way to display the output of a command in real time, while also command-substituting it in order to check the output for substrings after the command is finished?
Use a temporary file:
TEMP=$(mktemp) || exit 1
while true
do
streamlink -o "$NAME" "$STREAM" best |& tee "$TEMP"
OUT=$( cat "$TEMP" )
#echo "$OUT" # no longer needed
if [[ $OUT == *"No playable streams"* ]]; then
echo "Delaying!"
sleep 15s
fi
done
# not really needed here because of endless loop
rm -f "$TEMP"
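If you would rather avoid the temporary file, a minimal alternative sketch (assuming the script is attached to a terminal, so /dev/tty is writable) is to tee to the terminal inside the command substitution:
OUT=$(streamlink -o "$NAME" "$STREAM" best |& tee /dev/tty)
The output still appears live on the terminal via tee, while the substitution captures it for the substring check afterwards.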

SoF2 shell script not running

I've got the following code in my shell script:
SERVER=`ps -ef | grep -v grep | grep -c sof2ded`
if ["$SERVER" != "0"]; then
echo "Already Running, exiting"
exit
else
echo "Starting up the server..."
cd /home/sof2/
/home/sof2/crons/start.sh > /dev/null 2>&1
fi
I did chmod a+x status.sh
Now I try to run the script but it's returning this error:
./status.sh: line 5: [1: command not found
Starting up the server...
Any help would be greatly appreciated.
Could you please try changing a few things in your script as follows and let me know if that helps? (I changed the backticks to $(...), [ to [[, and the string comparison to a numeric -ne test in the code.)
SERVER=$(ps -ef | grep -v grep | grep -c sof2ded)
if [[ "$SERVER" -ne 0 ]]; then
echo "Already Running, exiting"
exit
else
echo "Starting up the server..."
cd /home/sof2/
/home/sof2/crons/start.sh > /dev/null 2>&1
fi
The problem is with the test command. "But", I hear you say, "I am not using the test command". Yes you are, it is also known as [.
if statement syntax is if command. The brackets are not part of if syntax.
Commands have arguments separated (tokenized) by whitespace, so:
[ "$SERVER" != "0" ]
The whitespace is needed because the command is [ and then there are 4 arguments passed to it (the last one must be ]).
A more robust way of comparing numerics is to use double parentheses,
(( SERVER == 0 ))
Notice that you don't need the $ or the quotes around SERVER. Also the spacing is less important, but useful for readability.
[[ is used for comparing text patterns.
As a comment, backticks ` ` are considered deprecated because they are difficult to read; they have been replaced by $( ... ).
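Putting that together, a sketch of the original check in both corrected forms (keeping the question's intent, where a non-zero count means the server is already running):
if [ "$SERVER" != "0" ]; then   # note the spaces around [ and ]
    echo "Already Running, exiting"
    exit
fi
if (( SERVER != 0 )); then      # arithmetic form of the same test
    echo "Already Running, exiting"
    exit
fi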

Safe shell redirection when command not found

Let's say we have a text file named text (it doesn't matter what it contains) in the current directory. When I run the following command (on Ubuntu 14.04, bash version 4.3.11):
nocommand > text # make sure nocommand doesn't exist on your system
It reports a 'command not found' error and it erases the text file! I just wonder if I can avoid clobbering the file when the command doesn't exist.
I tried set -o noclobber, but the same problem happens if I run:
nocommand >| text # make sure nocommand doesn't exist on your system
It seems that bash sets up the redirection before looking for the command to run. Can anyone give me some advice on how to avoid this?
Actually, the shell first looks at the redirection and creates the file. It evaluates the command after that.
Thus what happens exactly is: Because it's a > redirection, it first replaces the file with an empty file, then evaluates a command which does not exist, which produces an error message on stderr and nothing on stdout. It then stores stdout in this file (which is nothing so the file remains empty).
I agree with Nitesh that you simply need to check if the command exists first, but according to this thread, you should avoid using which. I think a good starting point would be to check at the beginning of your script that you can run all the required functions (see the thread, 3 solutions), and abort the script otherwise.
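A minimal sketch of such an up-front check, using the shell built-in command -v rather than which (the command names listed are only placeholders):
for cmd in wget sed awk; do
    command -v "$cmd" >/dev/null 2>&1 || { echo "required command '$cmd' not found, aborting" >&2; exit 1; }
done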
Write to a temporary file first, and only move it into place over the desired file if the command succeeds.
nocommand > tmp.txt && mv tmp.txt text
This avoids errors not only when nocommand doesn't exist, but also when an existing command exits before it can finish writing its output, so you don't overwrite text with incomplete data.
With a little more work, you can clean up the temp file in the event of an error.
{ nocommand > tmp.txt || { rm tmp.txt; false; } } && mv tmp.txt text
The inner command group ensures that the exit status of the outer command group is non-zero so that even if the rm succeeds, the mv command is not triggered.
A simpler command that carries the slight risk of removing the temp file when nocommand succeeds but the mv fails is
nocommand > tmp.txt && mv tmp.txt text || rm tmp.txt
This would write to file only if the pipe sends at least a single character:
nocommand | (
IFS= read -d '' -n 1 || exit
exec >myfile
[[ -n $REPLY ]] && echo -n "$REPLY" || printf '\x00'
exec cat
)
Or using a function:
function protected_write {
IFS= read -d '' -n 1 || exit
exec >"$1"
[[ -n $REPLY ]] && echo -n "$REPLY" || printf '\x00'
exec cat
}
nocommand | protected_write myfile
Note that if the lastpipe option is enabled, you'll have to run it in a subshell:
nocommand | ( protected_write myfile )
Optionally, you can also make the function spawn a subshell by default:
function protected_write {
(
IFS= read -d '' -n 1 || exit
exec >"$1"
[[ -n $REPLY ]] && echo -n "$REPLY" || printf '\x00'
exec cat
)
}
( ) spawns a subshell. A subshell is a fork and runs in a separate process space. In x | y, y also runs in a subshell by default unless the lastpipe option (try shopt lastpipe) is enabled.
IFS= read -d '' -n 1 waits for a single character (see help read) and returns a zero exit code when it reads one, which bypasses the exit.
exec >"$1" redirects stdout to the file. This makes everything that prints to stdout print to the file instead.
Everything read besides \x00 is stored in REPLY, which is why we printf '\x00' when REPLY has a null (empty) value.
exec cat replaces the subshell's process with cat, which sends everything it receives to the file and finishes the remaining work. See help exec.
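Assuming the function above is defined, a quick way to see the behaviour (the file names are placeholders):
printf 'hello\n' | protected_write out1.txt       # data arrives on the pipe, so out1.txt is written
nocommand 2>/dev/null | protected_write out2.txt  # nothing arrives on the pipe, so out2.txt is left untouched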
If you do:
set -o noclobber
then
invalidcmd > myfile
if myfile exists in the current directory then you will get:
-bash: myfile: cannot overwrite existing file
Check using the "which" command:
#!/usr/bin/env bash
command_name="npm2" # Add your command here
command=`which $command_name`
if [ -z "$command" ]; then # command was not found
echo "Command not found"
else # command exists, go ahead with your logic
echo "$command"
fi
Hope this helps

Lynx is stopping loop?

I'll just apologize beforehand; this is my first ever post, so I'm sorry if I'm not specific enough, if the question has already been answered and I just didn't look hard enough, and if I use incorrect formatting of some kind.
That said, here is my issue: In bash, I am trying to create a script that will read a file that lists several dozen URL's. Once it reads each line, I need it to run a set of actions on that, the first being to use lynx to navigate to the website. However, in practice, it will run once perfectly on the first line. Lynx goes, the download works, and then the subsequent renaming and organizing of that file go through as well. But then it skips all the other lines and acts like it has finished the whole file.
I have tested to see if it was lynx causing the issue by eliminating all the other parts of the code, and then by just eliminating lynx. It works without Lynx, but, of course, I need lynx for the rest of the output to be of any use to me. Let me just post the code:
!#/bin/bash
while read line; do
echo $line
lynx -accept_all_cookies $line
echo "lynx done"
od -N 2 -h *.zip | grep "4b50"
echo "od done, if 1 starting..."
if [[ $? -eq 0 ]]
then ls *.*>>logs/zips.log
else
od -N 2 -h *.exe | grep "5a4d"
echo "if 2 starting..."
if [[ $? -eq 0 ]]
then ls *.*>>logs/exes.log
else
od -N 2 -h *.exe | grep "5a4d, 4b50"
echo "if 3 starting..."
if [[ $? -eq 1 ]]
then
ls *.*>>logs/failed.log
fi
echo "if 3 done"
fi
echo "if 2 done"
fi
echo "if 1 done..."
FILE=`(ls -tr *.* | head -1)`
NOW=$(date +"%m_%d_%Y")
echo "vars set"
mv $FILE "criticalfreepri/${FILE%%.*}(ZCH,$NOW).${FILE#*.}" -u
echo "file moved"
rm *.zip *.exe
echo "file removed"
done < "lynx"
$SHELL
Just to be sure, I do have a file called "lynx" that contains the urls separated by a return each. Also, I used all those "echo"s to do my own sort of debugging, but I have tried it with and without the echo's. When I execute the script, the echo's all show up...
Any help is appreciated, and thank you all so much! Hope I didn't break any rules on this post!
PS: I'm on Linux Mint running things through the "terminal" program. I'm scripting with bash in Gedit, if any of that info is relevant. Thanks!
EDIT: Actually, the echo tests repeat for all three lines. So it would appear that lynx simply can't start again in the same loop?
Here is a simplified version of the script, as requested:
!#/bin/bash
while read -r line; do
echo $line
lynx $line
echo "lynx done"
done < "ref/url"
read "lynx"
$SHELL
Note that I have changed the sites the "url" file goes to:
www.google.com
www.majorgeeks.com
http://www.sophos.com/en-us/products/free-tools/virus-removal-tool.aspx
Lynx is not designed to be used in scripts because it locks the terminal; Lynx is an interactive console browser.
If you want to access URLs in a script, use wget, for example:
wget http://www.google.com/
For exit codes see: http://www.gnu.org/software/wget/manual/html_node/Exit-Status.html
To parse the HTML content, use:
VAR=`wget -qO- http://www.google.com/`
echo $VAR
I found a way that may fulfil your requirement to run the lynx command in a loop, substituting a different URL each time.
Use
echo `lynx $line`
(that is, echo the output of lynx $line wrapped in backquotes)
instead of lynx $line. Compare the two versions below.
Your code:
!#/bin/bash
while read -r line; do
echo $line
lynx $line
echo "lynx done"
done < "ref/url"
read "lynx"
$SHELL
Try this instead:
!#/bin/bash
while read -r line; do
echo $line
echo `lynx $line`
echo "lynx done"
done < "ref/url"
I should have answered this question a long time ago. I got the program working; it's now on GitHub!
Anyway, I simply had to wrap the loop inside a function. Something like this:
progdownload () {
printlog "attempting download from ${URL}"
if echo "${URL}" | grep -q "http://www.majorgeeks.com/" ; then
lynx -cmd_script="${WORKINGDIR}/support/mgcmd.txt" --accept-all-cookies ${URL}
else wget ${URL}
fi
}
URL="something.com"
progdownload
