Save multiple variables from bash script to text file - bash

I have a simple bash script I have written to count the number of lines in a collection of text files, and I store each number of lines as a variable using a for loop. I would like to print each variable to the same text file, so that I may access all the line counts at once, from the same file.
My code is:
for f in *Daily.txt; do
    lines=$(cat $f | wc -l);
    lines=$(($num_lines -1));
    echo $lines > /destdrive/linesTally2014.txt;
done
When I run this, the only output I receive is of the final file, not all the other files.
If anyone could help me with this I would really appreciate it. I am new to bash scripting, so please excuse this novice question.

You create the file on each iteration. Move the I/O redirection after the done. Use:
for f in *Daily.txt
do
    echo $(( $(wc -l < "$f") - 1 ))
done > /destdrive/linesTally2014.txt
This avoids the variable; if you have a need for it, you can use a fixed version of the original code (use $lines throughout, instead of using $num_lines once). Note that the code in the question has a UUoC (Useless Use of cat) that this version avoids.
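For reference, a fixed version of the original loop that keeps the variable might look like this (a sketch, using the same path and the same subtract-one step as the question):
for f in *Daily.txt; do
    lines=$(wc -l < "$f")   # no cat needed
    lines=$((lines - 1))    # use lines consistently, not num_lines
    echo "$lines"
done > /destdrive/linesTally2014.txt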

You can avoid the loop with
wc -l *Daily.txt | awk '{ print $1 }' > /destdrive/linesTally2014.txt
or (when you want 1 less)
wc -l *Daily.txt | awk '{ print $1 -1 }' > /destdrive/linesTally2014.txt
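One caveat: when more than one file matches, wc also prints a final "total" line, which would end up in the tally. Assuming none of your files is itself named "total", you can filter it out, e.g.:
wc -l *Daily.txt | awk '$2 != "total" { print $1 - 1 }' > /destdrive/linesTally2014.txt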

The above suggestions are probably better, but the problem you're having with your script is your use of the > for redirection, which overwrites the file. Use >> and it will append to the file.
echo $lines >> /destdrive/linesTally2014.txt
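Note that with >> each rerun of the script keeps appending to the old tally; if that matters, truncate the file once before the loop. A minimal sketch:
: > /destdrive/linesTally2014.txt   # empty the tally once, up front
for f in *Daily.txt; do
    lines=$(( $(wc -l < "$f") - 1 ))
    echo "$lines" >> /destdrive/linesTally2014.txt
done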

Related

Evaluating a log file using a sh script

I have a log file with a lot of lines with the following format:
IP - - [Timestamp Zone] 'Command Weblink Format' - size
I want to write a script.sh that gives me the number of times each website has been clicked.
The command:
awk '{print $7}' server.log | sort -u
should give me a list which puts each unique weblink in a separate line. The command
grep 'Weblink1' server.log | wc -l
should give me the number of times Weblink1 has been clicked. I want a command that converts each line created by the awk command above into a variable, and then a loop that runs the grep command on each extracted weblink. I could use
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "Text read from file: $line"
done
(source: Read a file line by line assigning the value to a variable) but I don't want to save the output of the Awk script in a .txt file.
My guess would be:
while IFS='' read -r line || [[ -n "$line" ]]; do
grep '$line' server.log | wc -l | ='$variabel' |
echo " $line was clicked $variable times "
done
But I'm not really familiar with connecting commands in a loop, as this is my first time. Would this loop work and how do I connect my loop and the Awk script?
Shell commands in a loop connect the same way they do without a loop, and you aren't very close. But yes, this can be done in a loop if you want the horribly inefficient way for some reason such as a learning experience:
awk '{print $7}' server.log |
sort -u |
while IFS= read -r line; do
    n=$(grep -c "$line" server.log)
    echo "$line" clicked $n times
done
# you only need the read || [ -n ] idiom if the input can end with an
# unterminated partial line (is illformed); awk print output can't.
# you don't really need the IFS= and -r because the data here is URLs
# which cannot contain whitespace and shouldn't contain backslash,
# but I left them in as good-habit-forming.
# in general variable expansions should be doublequoted
# to prevent wordsplitting and/or globbing, although in this case
# $line is a URL which cannot contain whitespace and practically
# cannot be a glob. $n is a number and definitely safe.
# grep -c does the count so you don't need wc -l
or more simply
awk '{print $7}' server.log |
sort -u |
while IFS= read -r line; do
    echo "$line" clicked $(grep -c "$line" server.log) times
done
However if you just want the correct results, it is much more efficient and somewhat simpler to do it in one pass in awk:
awk '{n[$7]++}
END{for(i in n){
print i,"clicked",n[i],"times"}}' server.log |
sort
# or GNU awk 4+ can do the sort itself, see the doc:
awk '{n[$7]++}
END{PROCINFO["sorted_in"]="#ind_str_asc";
for(i in n){
print i,"clicked",n[i],"times"}}' server.log
The associative array n collects the values from the seventh field as keys, and on each line, the value for the extracted key is incremented. Thus, at the end, the keys in n are all the URLs in the file, and the value for each is the number of times it occurred.
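As a quick sanity check, here is the one-pass version run on a few invented sample lines in the question's log format (field 7 is the weblink):
printf '%s\n' \
  '1.2.3.4 - - [01/Jan/2018 +0000] "GET /index.html HTTP/1.1" - 512' \
  '5.6.7.8 - - [01/Jan/2018 +0000] "GET /about.html HTTP/1.1" - 256' \
  '1.2.3.4 - - [02/Jan/2018 +0000] "GET /index.html HTTP/1.1" - 512' |
awk '{n[$7]++} END{for(i in n) print i,"clicked",n[i],"times"}'
# expected output (order may vary):
# /about.html clicked 1 times
# /index.html clicked 2 times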

BASH output from grep

I am relatively new to bash and I am testing my code for the first case.
counter=1
for file in not_processed/*.txt; do
    if [ $counter -le 1 ]; then
        grep -v '2018-07' $file > bis.txt;
        counter=$(($counter+1));
    fi;
done
I want to remove all the lines containing '2018-07' from my file. The new file needs to be named $file_bis.txt.
Thanks
With sed or awk it's much easier and faster to process complex files.
sed '/2018-07/d' not_processed/*.txt
then you get the output, with the matching lines removed, in your console. If you want, you can redirect the output to a new file.
sed '/2018-07/d' not_processed/*.txt > out.txt
This is to do it on all files in not_processed/*.txt:
for file in not_processed/*.txt
do
    grep -v '2018-07' "$file" > "$file"_bis.txt
done
And this is to do it only on the first 2 files in not_processed/*.txt:
for file in $(ls not_processed/*.txt | head -2)
do
    grep -v '2018-07' "$file" > "$file"_bis.txt
done
Don't forget to add "" on $file, because otherwise bash considers $file_bis as a new variable, which has no assigned value.
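A quick illustration of that quoting point (the filename here is hypothetical):
file=not_processed/jan.txt
echo $file_bis.txt      # bash expands the unset variable $file_bis, printing just ".txt"
echo "$file"_bis.txt    # not_processed/jan.txt_bis.txt
echo "${file}_bis.txt"  # same result, using braces instead of quotes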
I don't understand why you are using a counter and an if condition for this simple requirement. Use the below script, which fulfills your requirement:
#first store all the files in a variable
files=$(ls /your/path/*.txt)
# now use a for loop
for file in $files
do
    grep -v '2018-07' "$file" >> bis.txt
done
Better to avoid the for loop here, as the single line below suffices:
grep -vh '2018-07' /your/path/*.txt > bis.txt
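And if you really do want only the first two files, here is a glob-safe sketch that avoids parsing ls (the counter idea is borrowed from the question):
count=0
for file in not_processed/*.txt; do
    (( count++ >= 2 )) && break          # stop after two files
    grep -v '2018-07' "$file" > "${file}_bis.txt"
done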

Passing input to sed, and sed info to a string

I have a list of files (~1000), one file per line, in my text file named 'files.txt'
I have a macro that looks something like the following:
#!/bin/sh
b=$(sed "${1}q;d" files.txt)
cat > MyMacro_${1}.C << +EOF
myFile = new TFile("/MYPATHNAME/$b");
+EOF
and I use this input script by doing
./MakeMacro.sh 1
and later I want to do
./MakeMacro.sh 2
./MakeMacro.sh 3
...etc
So that it reads the n'th line of my files.txt and feeds that string to my created .C macro.
So that it reads the n'th line of my files.txt and feeds that string to my created .C macro.
Given this statement and your tags, I'm going to answer using shell tools and not really address the issue of the .C macro.
The first line of your script contains a sed script. There are numerous ways to get the Nth line from a text file. The simplest might be to use head and tail.
$ head -n "${i}" files.txt | tail -n 1
This takes the first $i lines of files.txt, and shows you the last line of that set.
$ sed -ne "${i}p" files.txt
This use of sed uses -n to avoid printing by default, then prints line $i. For better performance, try:
$ sed -ne "${i}{p;q;}" files.txt
This does the same, but quits after printing the line, so that sed doesn't bother traversing the rest of the file.
$ awk -v i="$i" 'NR==i' files.txt
This passes the shell variable $i into awk, then evaluates an expression that tests whether the number of records processed is the same as that variable. If the expression evaluates true, awk prints the line. For better performance, try:
$ awk -v i="$i" 'NR==i{print;exit}' files.txt
Like the second sed script above, this will quit after printing the line, so as to avoid traversing the rest of the file.
Plenty of ways you could do this by loading the file into an array as well, but those ways would take more memory and perform less well. I'd use one-liners if you can. :)
To take any of these one-liners and put it into your script, you already have the notation:
if expr "$i" : '[0-9][0-9]*$' >/dev/null; then
b=$(sed -ne "${i}{p;q;}" files.txt)
else
echo "ERROR: invalid line number" >&2; exit 1
fi
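Putting one of these one-liners together with the heredoc from the question, the whole MakeMacro.sh might look like this (a sketch; /MYPATHNAME and the TFile line are taken from the question):
#!/bin/sh
i=$1
if expr "$i" : '[0-9][0-9]*$' >/dev/null; then
    b=$(sed -ne "${i}{p;q;}" files.txt)   # grab line i of files.txt
else
    echo "ERROR: invalid line number" >&2; exit 1
fi
cat > "MyMacro_${i}.C" <<EOF
myFile = new TFile("/MYPATHNAME/$b");
EOF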
If I am understanding you correctly, you can do a for loop in bash to call the script multiple times with different arguments.
for i in $(seq 1 n); do ./MakeMacro.sh "$i"; done
Based on the OP's comment, it seems that he wants to submit the generated files to Condor. You can modify the loop above to include the condor submission.
for i in $(seq 1 n); do ./MakeMacro.sh "$i"; condor_submit <OutputFile>; done
i=0
while read -r file
do
    ((i++))
    cat > "MyMacro_${i}.C" <<-EOF
	myFile = new TFile("$file");
	EOF
done < files.txt
Beware: the heredoc delimiter must be left unquoted here so that $file expands, and with <<- the indents on the body and the EOF line must be tabs, not spaces.
I'm puzzled about why this is the way you want to do the job. You could have your C++ code read files.txt at runtime and it would likely be more efficient in most ways.
If you want to get the Nth line of files.txt into MyMacro_N.C, then:
{
echo
sed -n -e "${1}{s/.*/myFile = new TFile(\"&\");/p;q;}" files.txt
echo
} > MyMacro_${1}.C
Good grief. The entire script should just be (untested):
awk -v nr="$1" 'NR==nr{printf "\nmyFile = new TFile(\"/MYPATHNAME/%s\");\n\n",$0 > ("MyMacro_"nr".C")}' files.txt
You can throw in a ;exit before the } if performance is an issue but I doubt if it will be.

How to process lines read from standard input in a UNIX shell script?

I get stuck by this problem:
I wrote a shell script that gets a large file with many lines from stdin; this is how it is executed:
./script < filename
I want to use the file as input to another operation in the script, but I don't know how to store the file's name in a variable.
The script takes a file from stdin and then runs an awk operation on that same file. Say I write in the script:
script:
#!/bin/sh
...
read file
...
awk '...' < "$file"
...
it only reads the first line of the input file.
And I found a way to write it like this:
Min=-1
while read line; do
    n=$(echo "$line" | awk -F"$delim" '{print NF}')
    if [ $Min -eq -1 ] || [ $n -lt $Min ]; then
        Min=$n
    fi
done
but it takes a very long time to process; it seems the repeated awk calls are what is slow.
So how can I improve this?
/dev/stdin can be quite useful here.
In fact, it's just a chain of links to your input.
So, writing cat /dev/stdin will give you all the input from your file, and you can avoid using the input filename at all.
Now to answer the question :) recursively read the links, beginning at /dev/stdin, and you will get the filename. Bash code:
r(){
    l=$(readlink "$1")
    if [ $? -ne 0 ]
    then
        echo "$1"
    else
        r "$l"
    fi
}
filename=$(r /dev/stdin)
echo "$filename"
UPD: on Ubuntu I found that readlink has an -f option, i.e. readlink -f /dev/stdin gives the same output. This option may be absent on some systems.
UPD2: tests (test.sh is the code above):
$ ./test.sh <input # that is a file
/home/sfedorov/input
$ ./test.sh <<EOF
> line
> EOF
/tmp/sh-thd-214216298213
$ echo 1 | ./test.sh
pipe:[91219]
$ readlink -f /dev/stdin < input
/home/sfedorov/input
$ readlink -f /dev/stdin << EOF
> line
> EOF
/tmp/sh-thd-3423766239895 (deleted)
$ echo 1 | readlink -f /dev/stdin
/proc/18489/fd/pipe:[92382]
You're overdoing this. The way you invoke your script:
the file contents are the script's standard input
the script receives no argument
But awk already takes input from stdin by default, so all you need to do to make this work is:
not give awk any file name argument, it's going to be the wrapping shell's stdin automatically
not consume any of that input before the wrapping script reaches the awk part. Specifically: no read
If that's all there is to your script, it reduces to the awk invocation, so you might consider doing away with it altogether and just call awk directly. Or make your script directly an awk one instead of a sh one.
Aside: the reason your while read line/multiple awk variant (the one in the question) is slow is because it spawns an awk process for each and every line of the input, and process spawning is order of magnitudes slower than awk processing a single line. The reason why the generate tmpfile/single awk variant (the one in your answer) is still a bit slow is because it's generating the tmpfile line by line, reopening to append every time.
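A rough way to see that cost difference for yourself (the input is generated for the test; timings will vary by machine):
seq 10000 > big.txt
time while read -r line; do n=$(echo "$line" | awk '{print NF}'); done < big.txt   # spawns 10000 awks
time awk 'NF < min || min == "" { min = NF } END { print min }' big.txt            # one awk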
Modify your script so that it takes the input file name as an argument, then read from the file in your script:
$ ./script filename
In script:
filename=$1
awk '...' < "$filename"
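If you want the script to work both ways, a common sketch is to fall back to stdin when no argument is given (the delimiter value here is hypothetical):
delim=','                     # stand-in for the question's $delim
filename=${1:-/dev/stdin}     # use the argument if given, else read stdin
awk -F"$delim" 'NF < min || min == "" { min = NF } END { print min }' "$filename"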
If your script just reads from standard input, there is no guarantee that there is a named file providing the input; it could just as easily be reading from a pipe or a network socket.
How about invoking the script differently: pipe the contents of your file into your script, so that the standard output of cat filename becomes standard input to your script (in this case, to the awk command).
For example, with a file Names.data and a script showNames.sh, execute as follows:
cat Names.data | ./showNames.sh
Contents of Names.data:
Huckleberry Finn
Jack Spratt
Humpty Dumpty
Contents of script showNames.sh:
#!/bin/bash
# whatever awk commands you need
awk '{ print }'
Well, I finally found this way to solve my problem, although it takes several seconds.
grep '.*' >> /tmp/tmpfile
Min=$(awk -F"$delim" 'NF < min || min == "" { min = NF } END { print min }' < /tmp/tmpfile)
Just append each line to a temporary file so that, after reading from stdin, the tmpfile is the same as the input file.
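For what it's worth, the temporary file can be skipped entirely by letting the single awk read stdin itself (same computation, same $delim as your script):
Min=$(awk -F"$delim" 'NF < min || min == "" { min = NF } END { print min }')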

BASH - How to retrieve a single line from the file?

How to retrieve a single line from the file?
file.txt
"aaaaaaa"
"bbbbbbb"
"ccccccc"
"ddddddd"
I need to retrieve the line 3 ("ccccccc")
Thank you.
sed is your friend. sed -n 3p prints the third line (-n: no automatic print, 3p: print when line number is 3). You can also have much more complex patterns, for example sed -n 3,10p to print lines 3 to 10.
If the file is very big, you may prefer not to cycle through the whole file, but to quit after the print: sed -n '3{p;q;}' (the trailing semicolon before } keeps non-GNU seds happy).
If you know you need line 3, one approach is to use head to get the first three lines, and tail to get only the last of these:
varname="$(head -n 3 file.txt | tail -n 1)"
Another approach, using only Bash builtins, is to call read three times:
{ read; read; IFS= read -r varname; } < file.txt
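That read-three-times idea generalizes to any line number; here is a small sketch with an invented helper name:
# get_line N: print the Nth line of stdin
get_line() {
    local n=$1 line
    while (( --n > 0 )); do
        read -r line || return   # skip the first N-1 lines
    done
    IFS= read -r line && printf '%s\n' "$line"
}
get_line 3 < file.txt   # prints "ccccccc"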
Here's a way to do it with awk:
awk 'FNR==3 {print; exit}' file.txt
Explanation:
awk '...' : Invoke awk, a tool for manipulating files line-by-line. Instructions enclosed by single quotes are executed by awk.
FNR==3 {print; exit}: FNR is awk's record (line) number within the current file; think of it as "the number of lines read so far from this file" (see the sketch after this list for how it differs from NR). Here we are saying: if we are on the 3rd line of the file, print the entire line and then exit awk immediately, so we don't waste time reading the rest of a large file.
file.txt: specify the input file as an argument to awk to save a cat.
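The FNR/NR distinction only starts to matter when awk is given several files; a quick sketch with hypothetical file names:
awk 'FNR==3' file1.txt file2.txt   # FNR resets per file: prints the 3rd line of each file
awk 'NR==3'  file1.txt file2.txt   # NR counts across files: prints the 3rd line overall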
There are many possibilities. Try:
sed '3!d' file.txt
Here is a very fast version:
sed '1d; 2d; 3q' file.txt
Are other tools than bash allowed? On systems that include bash, you'll usually find sed and awk or other basic tools:
$ line="$(sed -ne 3p input.txt)"
$ echo "$line"
or
$ read line < <(awk 'NR==3' input.txt)
$ echo "$line"
or if you want to optimize this by quitting after the 3rd line is read:
$ read line < <(awk 'NR==3{print;nextfile}' input.txt)
$ echo "$line"
or how about even simpler tools (though less optimized):
$ line="`head -n 3 input.txt | tail -n 1`"
$ echo "$line"
Of course, if you really want to do this all within bash, you can still make it a one-liner, without using any external tools.
$ for (( i=3 ; i-- ; )); do read line; done < input.txt
$ echo "$line"
There are many ways to achieve the same thing. Pick one that makes sense for your task. Next time, perhaps explain your overall needs a bit better, so we can give you answers more applicable to your situation.
Since, as usual, all the other answers involve trivial and usual stuff (pipe through grep then awk then sed then cut or you-name-it), here's a very unusual and (sadly) not very well-known one (so, I hereby claim that I have the most original answer):
mapfile -s2 -n3 -t < input.txt
echo "$MAPFILE"
I would say this is fairly efficient (mapfile is quite efficient and it's a bash builtin).
Done!
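The same mapfile trick generalizes to any line number (bash 4+; the variable names are mine):
n=3
mapfile -s $((n-1)) -n 1 -t picked < input.txt   # skip n-1 lines, then read one
echo "${picked[0]}"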
Fast bash version:
i=0
while (( i < 3 )); do
    read -r line
    (( i++ ))
done < file.txt
Output
echo "$line" # Third line
"ccccccc"
Explanation
while (( i < 3 )) - Loop until three lines have been read.
read -r line - Read the next line of the file into the variable $line, overwriting the previous value; the last value kept is line 3.
(( i++ )) - Increment $i by 1 on each iteration.
done < file.txt - Redirect file.txt into the while loop.
