Bash script: support piping data into it

I have a bash script that I want to extend to support piping JSON into it.
Example:
echo '{}' | myscript store
So, I tried the following:
local value="$1"
if [[ -z "$value" ]]; then
while read -r piped; do
value=$piped
done;
fi
This works in the simple case above, but doing:
cat input.json | myscript store
only gets the last line of input.json, because each iteration of the loop overwrites value rather than appending to it.
How can I support all the piping cases?

The following works:
if [[ -z "$value" && ! -t 0 ]]; then
while read -r piped; do
value+=$piped
done;
fi
The trick was using += to append each line, and checking ! -t 0, which tests whether stdin is a terminal (it is not when data is piped in). Note that read -r strips each newline, so the lines end up concatenated without separators.

If you want to behave like cat, why not use it?
#!/bin/bash
value="$(cat "$@")"
(Note "$@", not "$#": "$#" is the argument count; "$@" passes any file arguments through to cat, which reads stdin when there are none.)
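Putting the two approaches together, a minimal sketch (the store subcommand name is from the question; the function wrapper is added here for illustration):

```shell
#!/usr/bin/env bash
# Sketch: use the first argument if given; otherwise, if stdin is not
# a terminal, slurp all of it with cat (which keeps the newlines that
# the read loop would drop).
store() {
  local value="$1"
  if [[ -z "$value" && ! -t 0 ]]; then
    value="$(cat)"
  fi
  printf '%s\n' "$value"
}
```

With this, echo '{}' | store and store '{}' behave the same, and cat input.json | store keeps every line of the file.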

Related

Speed up shell script/Performance enhancement of shell script

Is there a way to speed up the shell script below? It takes a good 40 minutes to update about 150,000 files every day. Sure, given the volume of files to create and update, this may be acceptable; I don't deny that. However, if there is a much more efficient way to write this, or to rewrite the logic entirely, I'm open to it. I'm looking for some help, please.
#!/bin/bash
DATA_FILE_SOURCE="<path_to_source_data>/${1}"
DATA_FILE_DEST="<path_to_dest>"
for fname in $(ls -1 "${DATA_FILE_SOURCE}")
do
for line in $(cat "${DATA_FILE_SOURCE}"/"${fname}")
do
FILE_TO_WRITE_TO=$(echo "${line}" | awk -F',' '{print $1"."$2".daily.csv"}')
CONTENT_TO_WRITE=$(echo "${line}" | cut -d, -f3-)
if [[ ! -f "${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}" ]]
then
echo "${CONTENT_TO_WRITE}" >> "${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}"
else
if ! grep -Fxq "${CONTENT_TO_WRITE}" "${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}"
then
sed -i "/${1}/d" "${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}"
"${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}"
echo "${CONTENT_TO_WRITE}" >> "${DATA_FILE_DEST}"/"${FILE_TO_WRITE_TO}"
fi
fi
done
done
There are still parts of your published script that are unclear, like the sed command. Nonetheless, I rewrote it with saner practices and far fewer external calls, which should really speed it up.
#!/usr/bin/env sh
DATA_FILE_SOURCE="<path_to_source_data>/$1"
DATA_FILE_DEST="<path_to_dest>"
for fname in "$DATA_FILE_SOURCE/"*; do
while IFS=, read -r a b content || [ "$a" ]; do
destfile="$DATA_FILE_DEST/$a.$b.daily.csv"
if grep -Fxq "$content" "$destfile"; then
sed -i "/$1/d" "$destfile"
fi
printf '%s\n' "$content" >>"$destfile"
done < "$fname"
done
Make it parallel (as much as you can).
#!/bin/bash
set -e -o pipefail
declare -ir MAX_PARALLELISM=20 # pick a limit
declare -i pid
declare -a pids
# ...
for fname in "${DATA_FILE_SOURCE}/"*; do
if ((${#pids[@]} >= MAX_PARALLELISM)); then
wait -p pid -n || echo "${pids[pid]} failed with ${?}" 1>&2
unset 'pids[pid]'
fi
while IFS= read -r line; do
FILE_TO_WRITE_TO="..."
# ...
done < "${fname}" & # forking here
pids[$!]="${fname}"
done
for pid in "${!pids[@]}"; do
wait -n "$((pid))" || echo "${pids[pid]} failed with ${?}" 1>&2
done
Here’s a directly runnable skeleton showing how the harness above works (with 36 items to process and 20 parallel processes at most):
#!/bin/bash
set -e -o pipefail
declare -ir MAX_PARALLELISM=20 # pick a limit
declare -i pid
declare -a pids
do_something_and_maybe_fail() {
sleep $((RANDOM % 10))
return $((RANDOM % 2 * 5))
}
for fname in some_name_{a..f}{0..5}.txt; do # 36 items
if ((${#pids[@]} >= MAX_PARALLELISM)); then
wait -p pid -n || echo "${pids[pid]} failed with ${?}" 1>&2
unset 'pids[pid]'
fi
do_something_and_maybe_fail & # forking here
pids[$!]="${fname}"
echo "${#pids[@]} running" 1>&2
done
for pid in "${!pids[@]}"; do
wait -n "$((pid))" || echo "${pids[pid]} failed with ${?}" 1>&2
done
Strictly avoid external processes (such as awk, grep and cut) when running a one-liner for each input line. fork()ing is extremely inefficient in comparison to:
Running one single awk / grep / cut process on an entire input file (to preprocess all lines at once for easier processing in bash) and feeding the whole output into (e.g.) a bash loop.
Using Bash expansions instead, where feasible, e.g. "${line/,/.}" and other tricks from the EXPANSION section of the man bash page, without fork()ing any further processes.
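For example, the per-line awk and cut calls from the original script can be replaced with a single built-in read (the variable names mirror the original script; the sample line is made up):

```shell
#!/usr/bin/env bash
# Split one CSV record with read instead of forking awk and cut.
line='host1,cpu,2024-01-01,42'
IFS=, read -r f1 f2 content <<<"$line"
FILE_TO_WRITE_TO="$f1.$f2.daily.csv"   # replaces: awk -F',' '{print $1"."$2".daily.csv"}'
CONTENT_TO_WRITE=$content              # replaces: cut -d, -f3-
```

No subshells, no forks: read splits on the commas and leaves everything after the second field in the last variable.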
Off-topic side notes:
ls -1 is unnecessary. First, ls won’t write multiple columns unless the output is a terminal, so a plain ls would do. Second, bash expansions are usually a cleaner and more efficient choice. (You can use nullglob to correctly handle empty directories / “no match” cases.)
Looping over the output from cat is a (less common) useless use of cat case. Feed the file into a loop in bash instead and read it line by line. (This also gives you more line format flexibility.)

How do you add a .txt file to a shell script as a variable

Hello guys, I am trying to write a basic shell script that adds, deletes or lists multiple user accounts from a provided list, in the form of a file specified at the command line. I am very new to this and have been banging my head on the keyboard for the last few hours. Below is an example of the syntax and the code so far. (I called this script buser.)
./buser.sh -a userlist (-a is the option and userlist is the filename, it is only an example)
file=$(< `pwd`/$2)
while :
do
case $1 in
-a)
useradd -m "$file"
break
;;
--add)
useradd -m "$file"
break
;;
--delete)
userdel -r "$file"
break
;;
-d)
userdel -r "$file"
break
;;
-l)
cat /etc/passwd | grep "$file"
break
;;
--list)
cat /etc/passwd | grep "$file"
break
;;
esac
done
When the useradd command reads $file, it gets all the names as a single argument and I get an error.
Any help would be greatly appreciated, thank you.
Not sure if I understood correctly.
But assuming you have a file with the following content:
**file.txt**
name1
name2
name3
You would like to call buser.sh -a file.txt and run useradd for name1, name2 and name3? I'm also assuming you're using Linux and that useradd is the native program; if so, I suggest reading its man page, because it does not support adding a list of users at once (https://www.tecmint.com/add-users-in-linux/).
You have to call useradd multiple times instead.
while read -r user
do
useradd -m "$user"
done < "$2"
A few simplifications, plus an error handler if the option doesn't exist:
while read -r file ; do
case "$1" in
-a|--add)
useradd -m "$file"
;;
-d|--delete)
userdel -r "$file"
;;
-l|--list)
grep -f `pwd`/"$2" /etc/passwd
break
;;
*)
echo "no such option as '$1'..."
exit 2
;;
esac
done < `pwd`/"$2"
Note: the above logic is a bit redundant... case "$1" keeps doing the same test, (with the same result), every pass. OTOH, it works, and it's less code than a while loop in each command list.
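One way to remove that redundancy, as a sketch: decide the action once, then loop over the list. The buser name is from the question; echo is used as a dry-run stand-in so the sketch runs without root (drop it to really add or delete users):

```shell
#!/usr/bin/env bash
buser() {
  local opt=$1 list=$2 user cmd
  case "$opt" in
    -a|--add)    cmd='useradd -m' ;;
    -d|--delete) cmd='userdel -r' ;;
    -l|--list)   grep -f "$list" /etc/passwd; return ;;
    *)           echo "no such option as '$opt'..." >&2; return 2 ;;
  esac
  while IFS= read -r user; do
    # $cmd is deliberately unquoted so "useradd -m" splits into words
    [ -n "$user" ] && echo $cmd "$user"
  done <"$list"
}
```

The case statement now runs exactly once, and the option error is caught before any file is read.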
You can use sed to create the commands, and eval to run them:
var=$( sed -e 's/^/useradd -m /' -e 's/$/;/' "$file" )
eval "$var"
(Edited to put in the -m flag.)

Bash command runs only once in a while loop [duplicate]

This question already has answers here:
While loop stops reading after the first line in Bash
(5 answers)
Closed 6 years ago.
I am writing a Bash file to execute two PhantomJS tasks.
I have two tasks written in external JS files: task1.js & task2.js.
Here's my Bash script so far:
#!/bin/bash
url=$1
cd $(cd $(dirname ${BASH_SOURCE}); pwd -P)
dir=../temp
mkdir -p $dir
file=$dir/file.txt
phantomjs "task1.js" $url > $file
while IFS="" read -r line || [[ -n $line ]]; do
dir=../build
file=$dir/$line.html
mkdir -p $(dirname $file)
phantomjs "task2.js" $url $line > $file
done < $file
For some unknown reason, task2 runs only once, then the script stops.
If I remove the PhantomJS command, the while loop runs normally until all lines are read from the file.
Maybe someone knows why that is?
Cheers.
Your loop is reading contents from stdin. If any other program you run consumes stdin, the loop will terminate.
Either fix any program that may be consuming stdin to read from /dev/null, or use a different FD for the loop.
The first approach looks like this:
phantomjs "task2.js" "$url" "$line" >"$file" </dev/null
The second looks like this (note the 3< on establishing the redirection, and the <&3 to read from that file descriptor):
while IFS="" read -r line <&3 || [[ -n $line ]]; do
dir=../build
file=$dir/$line.html
mkdir -p "$(dirname "$file")"
phantomjs "task2.js" "$url" "$line" >"$file"
done 3<"$file"
By the way, consider taking file out of the loop altogether, by having the loop read directly from the first phantomjs program's output:
while IFS="" read -r line <&3 || [[ -n $line ]]; do
dir=../build
file=$dir/$line.html
mkdir -p "$(dirname "$file")"
phantomjs "task2.js" "$url" "$line" >"$file"
done 3< <(phantomjs "task1.js" "$url")
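To see why the original loop stops, here is a minimal reproduction: cat stands in for phantomjs, and because it reads from the same stdin the loop does, the loop body sees only the first line:

```shell
#!/usr/bin/env bash
# Demonstrate a loop body consuming the loop's own stdin.
tmp=$(mktemp)
printf 'a\nb\nc\n' >"$tmp"
count=0
while IFS= read -r line; do
  count=$((count+1))
  cat >/dev/null        # stand-in for phantomjs; eats the rest of $tmp
done <"$tmp"
rm -f "$tmp"
echo "iterations: $count"
```

The loop runs exactly once: read gets "a", then cat silently consumes "b" and "c" before the next read.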

Lynx is stopping loop?

I'll just apologize beforehand; this is my first ever post, so I'm sorry if I'm not specific enough, if the question has already been answered and I just didn't look hard enough, and if I use incorrect formatting of some kind.
That said, here is my issue: in bash, I am trying to create a script that reads a file listing several dozen URLs. For each line, it needs to run a set of actions, the first being to use lynx to navigate to the website. In practice, however, it runs perfectly once, on the first line: lynx goes, the download works, and the subsequent renaming and organizing of that file go through as well. But then it skips all the other lines and acts as if it has finished the whole file.
I have tested to see if it was lynx causing the issue by eliminating all the other parts of the code, and then by just eliminating lynx. It works without Lynx, but, of course, I need lynx for the rest of the output to be of any use to me. Let me just post the code:
#!/bin/bash
while read line; do
echo $line
lynx -accept_all_cookies $line
echo "lynx done"
od -N 2 -h *.zip | grep "4b50"
echo "od done, if 1 starting..."
if [[ $? -eq 0 ]]
then ls *.*>>logs/zips.log
else
od -N 2 -h *.exe | grep "5a4d"
echo "if 2 starting..."
if [[ $? -eq 0 ]]
then ls *.*>>logs/exes.log
else
od -N 2 -h *.exe | grep "5a4d, 4b50"
echo "if 3 starting..."
if [[ $? -eq 1 ]]
then
ls *.*>>logs/failed.log
fi
echo "if 3 done"
fi
echo "if 2 done"
fi
echo "if 1 done..."
FILE=`(ls -tr *.* | head -1)`
NOW=$(date +"%m_%d_%Y")
echo "vars set"
mv $FILE "criticalfreepri/${FILE%%.*}(ZCH,$NOW).${FILE#*.}" -u
echo "file moved"
rm *.zip *.exe
echo "file removed"
done < "lynx"
$SHELL
Just to be sure: I do have a file called "lynx" that contains the URLs, one per line. Also, I used all those echos for my own sort of debugging; I have tried it with and without them. When I execute the script, the echos all show up...
Any help is appreciated, and thank you all so much! Hope I didn't break any rules on this post!
PS: I'm on Linux Mint running things through the "terminal" program. I'm scripting with bash in Gedit, if any of that info is relevant. Thanks!
EDIT: Actually, the echo tests repeat for all three lines. So it would appear that lynx simply can't start again in the same loop?
Here is a simplified version of the script, as requested:
#!/bin/bash
while read -r line; do
echo $line
lynx $line
echo "lynx done"
done < "ref/url"
read "lynx"
$SHELL
Note that I have changed the sites the "url" file goes to:
www.google.com
www.majorgeeks.com
http://www.sophos.com/en-us/products/free-tools/virus-removal-tool.aspx
Lynx is not designed to be used in scripts, because it locks the terminal; Lynx is an interactive console browser.
If you want to access URLs in a script, use wget, for example:
wget http://www.google.com/
For exit codes see: http://www.gnu.org/software/wget/manual/html_node/Exit-Status.html
To parse the HTML content, use:
VAR=`wget -qO- http://www.google.com/`
echo $VAR
I found a way which may fulfill your requirement to run the lynx command in a loop, substituting a different url each time.
Use
echo `lynx $line`
(echo the lynx $line wrapped in backquotes (`))
instead of lynx $line. You may refer below:
your code
#!/bin/bash
while read -r line; do
echo $line
lynx $line
echo "lynx done"
done < "ref/url"
read "lynx"
$SHELL
try the below:
#!/bin/bash
while read -r line; do
echo $line
echo `lynx $line`
echo "lynx done"
done < "ref/url"
I should have answered this question a long time ago. I got the program working; it's now on GitHub!
Anyway, I simply had to wrap the loop inside a function. Something like this:
progdownload () {
printlog "attempting download from ${URL}"
if echo "${URL}" | grep -q "http://www.majorgeeks.com/" ; then
lynx -cmd_script="${WORKINGDIR}/support/mgcmd.txt" --accept-all-cookies ${URL}
else wget ${URL}
fi
}
URL="something.com"
progdownload
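Note that wrapping the call in a function does not by itself stop lynx from consuming the loop's stdin; the robust fix is the same as in the PhantomJS question above: read the url list on a separate file descriptor (or feed lynx from /dev/null). A sketch, with the lynx call turned into a no-op via the ':' builtin so it runs anywhere:

```shell
#!/usr/bin/env bash
# Read the url list on fd 3 so an interactive program in the loop
# body cannot consume it.
tmp=$(mktemp)
printf 'url1\nurl2\nurl3\n' >"$tmp"
count=0
while IFS= read -r line <&3; do
  count=$((count+1))
  : lynx -accept_all_cookies "$line"   # ':' ignores its arguments; drop it to really run lynx
done 3<"$tmp"
rm -f "$tmp"
echo "visited $count urls"
```

With the list on fd 3, every line is processed even if the loop body reads freely from stdin.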

how to create the option for printing out statements vs executing them in a shell script

I'm looking for a way to add a switch to this bash script so that I have the option of either printing (echo) the command to stdout or executing it, for debugging purposes. As you can see below, I currently do this manually by commenting out one statement or the other.
Code:
#!/usr/local/bin/bash
if [ $# != 2 ]; then
echo "Usage: testcurl.sh <localfile> <projectname>" >&2
echo "sample:testcurl.sh /share1/data/20110818.dat projectZ" >&2
exit 1
fi
echo /usr/bin/curl -c $PROXY --certkey $CERT --header "Test:'${AUTH}'" -T $localfile $fsProxyURL
#/usr/bin/curl -c $PROXY --certkey $CERT --header "Test:'${AUTH}'" -T $localfile $fsProxyURL
I'm simply looking for an elegant/better way to create a switch from the command line: print or execute.
One possible trick, though it will only work for simple commands (e.g., no pipes or redirection (a)) is to use a prefix variable like:
pax> cat qq.sh
${PAXPREFIX} ls /tmp
${PAXPREFIX} printf "%05d\n" 72
${PAXPREFIX} echo 3
What this will do is insert your specific variable (PAXPREFIX in this case) before the commands. If the variable is empty, it will not affect the command, as follows:
pax> ./qq.sh
my_porn.gz copy_of_the_internet.gz
00072
3
However, if it's set to echo, it will prefix each line with that echo string.
pax> PAXPREFIX=echo ./qq.sh
ls /tmp
printf %05d\n 72
echo 3
(a) The reason why it will only work for simple commands can be seen if you have something like:
${PAXPREFIX} ls -1 | tr '[a-z]' '[A-Z]'
When PAXPREFIX is empty, it will simply give you the list of your filenames in uppercase. When it's set to echo, it will result in:
echo ls -1 | tr '[a-z]' '[A-Z]'
giving:
LS -1
(not quite what you'd expect).
In fact, you can see a problem even in the simple case above, where %05d\n is no longer surrounded by quotes.
If you want a more robust solution, I'd opt for:
if [[ ${PAXDEBUG:-0} -eq 1 ]] ; then
echo /usr/bin/curl -c $PROXY --certkey $CERT --header ...
else
/usr/bin/curl -c $PROXY --certkey $CERT --header ...
fi
and use PAXDEBUG=1 myscript.sh to run it in debug mode. This is similar to what you have now but with the advantage that you don't need to edit the file to switch between normal and debug modes.
For debugging output from the shell itself, you can run it with bash -x or put set -x in your script to turn it on at a specific point (and, of course, turn it off with set +x).
#!/usr/local/bin/bash
if [[ "$1" == "--dryrun" ]]; then
echoquoted() {
printf "%q " "$@"
echo
}
maybeecho=echoquoted
shift
else
maybeecho=""
fi
if [ $# != 2 ]; then
echo "Usage: testcurl.sh <localfile> <projectname>" >&2
echo "sample:testcurl.sh /share1/data/20110818.dat projectZ" >&2
exit 1
fi
$maybeecho /usr/bin/curl "$1" -o "$2"
Try something like this:
show=echo
$show /usr/bin/curl ...
Then set/unset $show accordingly.
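A variant of the same idea that survives arguments with spaces: make the prefix a function that either prints the command (quoted with printf %q) or runs it. DRYRUN is a made-up variable name here:

```shell
#!/usr/bin/env bash
# run: print the command when DRYRUN=1, otherwise execute it.
run() {
  if [ "${DRYRUN:-0}" = 1 ]; then
    printf '%q ' "$@"; echo     # print, with shell-safe quoting
  else
    "$@"                        # execute as-is
  fi
}
DRYRUN=1 run mkdir -p /tmp/demo-dir   # only prints the command
DRYRUN=0 run true                     # actually executes
```

Because the arguments pass through "$@" unchanged, pipes and redirections still can't be deferred, but quoting is preserved, which the plain echo-prefix trick loses.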
This does not directly answer your specific question, but I guess you're trying to see what command gets executed, for debugging. If you replace #!/usr/local/bin/bash with #!/usr/local/bin/bash -x, bash will run and echo the commands in your script.
I do not know of a way to do "print vs execute", but I do know a way to do "print and execute": using "bash -x". See this link for example.