how to use compound bash functions from different directories - bash

I have a directory, e.g. /ice/cream, which contains some files that I want to sort by size and then find a minimum value in the largest file; however, I want to do this from the parent directory /ice.
The bash line I wrote only works from within /ice/cream; I'd like to make it work from /ice. I tried
awk 'BEGIN {min = 0} {if($7<min) min=$7} END {print min}' $(ls -lS cream/ | head -n 2 | awk '{print $9}')
which does not work because awk doesn't get the path to the file found by the $( ) command substitution; please help! Cheers

A safer way to get the largest file is a small shell function; the call to stat may differ depending on your implementation:
max_file () {
    local f max_file max_size size
    max_size=0
    for f in "$1"/*; do
        size=$(stat -c %s "$f")
        if (( size > max_size )); then
            max_file="$f"
            max_size="$size"
        fi
    done
    echo "$max_file"
}
awk '...' "$(max_file cream/)"

Your ls pipeline is way too complicated, and you need a * after cream/ so that ls prints the relative paths:
awk 'BEGIN {min = 0} {if($7<min) min=$7} END {print min}' $(ls -S cream/* | head -1)
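If any of the names in cream/ can contain spaces, the unquoted command substitution will split them; a sketch that avoids parsing ls altogether, assuming GNU find is available:
biggest=$(find cream/ -maxdepth 1 -type f -printf '%s\t%p\n' | sort -rn | head -n 1 | cut -f2-)
awk 'BEGIN {min = 0} {if ($7 < min) min = $7} END {print min}' "$biggest"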

As first answered by @Etan Reisner in a comment, the line was missing a *; the working code is:
awk 'BEGIN {min = 0} {if($7<min) min=$7} END {print min}' $(ls -lS cream/* | head -n 1 | awk '{print $9}')
Thank you.

Related

Extend filename with word from file

I can change the filename for a file to the first word in the file.
for fname in lrccas1
do
cp $fname $(head -1 -q $fname|awk '{print $1}')
done
But I would like to extend the filename with that word instead.
for fname in lrccas1
do
cp $fname $(head -1 -q $fname|awk '{print $1 FILENAME}')
done
I have tried different variations of this, but none seem to work.
Is there an easy solution?
Kind regards Svend
Firstly, let's understand why you did not get the desired result:
head -1 -q $fname|awk '{print $1 FILENAME}'
You are piping the standard output of the head command into the awk command, so awk is reading standard input and FILENAME is not the name of your file (it is the empty string, or "-" in GNU awk). Asking awk for FILENAME when it is consuming standard input does not make much sense: only data goes through the pipe, and there might be no input file at all, e.g.
seq 10 | awk '{print $1*10}'
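A quick way to see this for yourself (any throwaway file will do):
printf 'hello\n' > demo.txt
cat demo.txt | awk '{print "FILENAME=[" FILENAME "]"}'   # empty string, or "-" in GNU awk
awk '{print "FILENAME=[" FILENAME "]"}' demo.txt         # FILENAME=[demo.txt]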
Secondly, let's find a way to get the desired result. You have access to the file name and you have successfully extracted the word, so you can concatenate them:
for fname in lrccas1
do
cp $fname "$(head -1 -q $fname|awk '{print $1}')$fname"
done
Thirdly, I must warn you that your command copies (cp) rather than renames (mv) the file, and it does not care whether the target name already exists - if it does, it will be overwritten.
You can do it in pure bash (or sh)
for fname in lrccas1
do
read -r word rest < "$fname" && cp "$fname" "$word$fname"
done
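As a hypothetical illustration: if the first line of lrccas1 starts with the word apple, one pass of that loop body produces a copy named applelrccas1:
printf 'apple pie recipe\nsecond line\n' > lrccas1
read -r word rest < lrccas1 && cp lrccas1 "${word}lrccas1"
ls applelrccas1   # the new copy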
This would do what your shell script appears to be trying to do:
awk 'FNR==1{close(out); out=$1 FILENAME} {print > out}' lrccas1
but you might want to consider something like this instead:
awk 'FNR==1{close(out); out=$1 FILENAME "_new"} {print > out}' *.txt
so your newly created files don't overwrite your existing ones and then to also remove the originals would be:
awk 'FNR==1{close(out); out=$1 FILENAME "_new"} {print > out}' *.txt &&
rm -f *.txt
That assumes your original files have some suffix like .txt or some other way of identifying them, or that you have all of your original files in a directory such as $HOME/old and can put the new files in a new directory such as $HOME/new:
cd "$HOME/old" &&
mkdir -p "$HOME/new" &&
awk -v newDir="$HOME/new" 'FNR==1{close(out); out=newDir "/" $1 FILENAME} {print > out}' * &&
echo rm -f *
Remove the echo when you are done testing and happy with the result.
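As a hypothetical illustration of the output name the FNR==1 rule builds (first word, then FILENAME, then the suffix), using a throwaway file:
printf 'apple pie\nmore text\n' > lrccas1
awk 'FNR==1{close(out); out=$1 FILENAME "_new"} {print > out}' lrccas1
ls applelrccas1_new   # holds the same two lines as lrccas1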
Try to execute this (bash):
for fname in file_name
do
cp $fname "$(head -1 -q $fname|awk '{print $1}')$fname"
done

Bash script to read specific values from all the files of a folder

I have a problem creating a script that reads specific values from all the files in a folder.
I have a number of email files in a directory and I need to extract 2 specific values from each file.
After that I have to put them into a new file that looks like this:
--------------
To: value1
value2
--------------
This is what I want to do, but I don't know how to create the script:
# I am putting the name of the files into a temp file
`ls -l | awk '{print $9 }' >tmpfile`
# use for the name of a file
date=`date +"%T"`
# The first specific value from file (phone number)
var1=`cat tmpfile | grep "To: 0" | awk '{print $2 }' | cut -b -10 `
# The second specific value from file(subject)
var2=cat file | grep Subject | awk '{print $2$3$4$5$6$7$8$9$10 }'
# Put the first value in a new file on the first row
echo "To: 4"$var1"" > sms-$date
# Put the second value in the same file on the second row
echo ""$var2"" >>sms-$date
.......
and do the same for every file in the directory
I tried using while and for loops but I couldn't finish the script.
Thank You
I've made a few changes to your script, hopefully they will be useful to you:
#!/bin/bash
for file in *; do
    var1=$(awk '/To: 0/ {print substr($2,1,10)}' "$file")
    var2=$(awk '/Subject/ {for (i=2; i<=10; ++i) s=s$i; print s}' "$file")
    date=$(date +"%T")
    outfile="sms-$date"
    i=0
    while [ -f "$outfile" ]; do outfile="sms-$date-"$((i++)); done
    echo "To: 4$var1" > "$outfile"
    echo "$var2" >> "$outfile"
done
The for loop just goes through every file in the folder that you run the script from.
I have added an additional suffix $i to the end of the file name. If no file with the same date already exists, the file is created without the suffix; otherwise the value of $i keeps increasing until there is no file with the same name.
I'm using $( ) rather than backticks, this is just a personal preference but it can be clearer in my opinion, especially when there are other quotes about.
There's not usually any need to pipe the output of grep to awk. You can do the search in awk using the / / syntax.
I have removed the cut -b -10 and replaced it with substr($2, 1, 10), which prints the first 10 characters of column 2.
It's not much shorter, but I used a loop rather than the $2$3...; I think it looks a bit neater.
There's no need for all the extra " in the two output lines.
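As a toy illustration of those last points (a made-up To: line, not one of your real emails), matching and trimming in a single awk call:
printf 'To: 07123456789 something\n' | awk '/To: 0/ {print substr($2, 1, 10)}'
# prints 0712345678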
I suggest trying the following:
#!/bin/sh
RESULT_FILE=sms-`date +"%T"`
DIR=.
fgrep -l 'To: 0' "$DIR"/* | while read -r FILE; do
    var1=`fgrep 'To: 0' "$FILE" | awk '{print $2}' | cut -b -10`
    var2=`fgrep 'Subject' "$FILE" | awk '{print $2$3$4$5$6$7$8$9$10}'`
    echo "To: 4$var1" >>"$RESULT_FILE"
    echo "$var2" >>"$RESULT_FILE"
done

Sum of file sizes with awk on a list of files

I have a list of files and want to sum their file sizes.
So, I created a (global) variable as a counter and am trying to loop over the list, get each file size with ls, and cut & add it with
export COUNTER=1
for x in $(cat ./myfiles.lst); do ls -all $x | awk '{COUNTER+=$5}'; done
However, my counter never gets updated:
> echo $COUNTER
> 1
Does someone have an idea of what I am missing here?
Cheers and thanks,
Thomas
OK, I found a way by piping the result of the awk command into a variable
(which is probably not elegant, but it works ;) )
for x in $(cat ./myfiles.lst); do a=$(ls -all $x |awk '{print $5}'); COUNTER=$(($COUNTER+$a)) ; done
> echo $COUNTER
> 4793061514
awk is invoked separately for every file, and the COUNTER inside the awk script is an awk variable, not your exported shell variable, so the shell's COUNTER never changes.
A better solution is:
xargs ls -l < myfiles.lst | awk '{COUNTER+=$5} END {print COUNTER}'
But you are reinventing the wheel here. You can do something like
du -c $(cat myfiles.lst)
(If you have du installed. Note: see the comments below my answer about du. I had tested this with cygwin and with that it worked like a charm.)
Shorter version of the last:
ls -l | awk '{sum += $5} END {print sum}'
Now, say you want to filter by certain types of files, age, etc... Just throw the ls -l into a find, and you can filter using find's extensive filter parameters:
find . -type f -exec ls -l {} \; | awk '{sum += $5} END {print sum}'
ls -ltS | awk '{print $5}' | awk '{s+=$1} END {print s}'
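For completeness, a sketch that sums only the files named in myfiles.lst without parsing ls at all, assuming GNU stat and xargs and one file name per line in the list:
xargs -d '\n' stat -c %s < myfiles.lst | awk '{total += $1} END {print total + 0}'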

How to list all files and put a number in front of them, using shell

I want to list all the files in my directory and put a number in front of each of them, each on a new line; for example, given:
file.txt nextfile.txt example.txt
and the output to be :
1.file.txt
2.nextfile.txt
3.example.txt
and so on.
I am trying something with: ls -L |
You can do this if you have nl installed:
ls -1 | nl
(Note that when ls writes to a pipe it prints one name per line, so the -1 part is not strictly needed. This applies to the solutions below too.)
Or with awk:
ls -1 | awk '{print NR, $0}'
Or with a single awk command:
awk 'BEGIN {for (i = 1; i < ARGC; i++) print i, ARGV[i]}' *
Or with cat:
cat -n <(ls -1)
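If you want the exact 1.file.txt format shown in the question, nl's number width and separator options can be tweaked (a small variation on the first command above):
ls -1 | nl -w1 -s.
# 1.example.txt
# 2.file.txt
# 3.nextfile.txt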
You can do this by using the shell built-in printf in a for loop:
n=0
for i in *; do
    printf "%d.%s\n" $((++n)) "$i"
done

How can I specify a row in awk in a for loop?

I'm using the following awk command:
my_command | awk -F "[[:space:]]{2,}+" 'NR>1 {print $2}' | egrep "^[[:alnum:]]"
which successfully returns my data like this:
fileName1
file Name 1
file Nameone
f i l e Name 1
So as you can see some file names have spaces. This is fine as I'm just trying to echo the file name (nothing special). The problem is calling that specific row within a loop. I'm trying to do it this way:
i=1
for num in $rows
do
fileName=$(my_command | awk -F "[[:space:]]{2,}+" 'NR==$i {print $2}' | egrep "^[[:alnum:]])"
echo "$num $fileName"
$((i++))
done
But my output is always null
I've also tried using awk -v record=$i and then printing $record but I get the below results.
f i l e Name 1
EDIT
Sorry for the confusion: rows is a variable that lists ids like this: 11 12 13
and each one of those ids ties to a file name. My command, without doing any parsing, outputs something like this (the columns are separated by two or more spaces):
id    File Info      OS
11    File Name1     OS1
12    Fi leNa me2    OS2
13    FileName 3     OS3
I can only use the id field to run the command that I need, but I want to use the File Info field to tell the user which file the command is being executed against.
I think your $i does not expand as expected because it is inside single quotes. You should quote your arguments this way:
fileName=$(my_command | awk -F "[[:space:]]{2,}+" "NR==$i {print \$2}" | egrep "^[[:alnum:]]")
And you forgot the other ).
EDIT
As an update to your requirement you could just pass the rows to a single awk command instead of a repeatitive one inside a loop:
#!/bin/bash
ROWS=(11 12)
function my_command {
    # This function just emulates my_command and should be removed later.
    echo " id    File Info      OS
11    File Name1     OS1
12    Fi leNa me2    OS2
13    FileName 3     OS3"
}
awk -- '
BEGIN {
    input = ARGV[1]
    while (getline line < input) {
        sub(/^ +/, "", line)
        split(line, a, /  +/)
        for (i = 2; i < ARGC; ++i) {
            if (a[1] == ARGV[i]) {
                printf "%s %s\n", a[1], a[2]
                break
            }
        }
    }
    exit
}
' <(my_command) "${ROWS[@]}"
That awk command could be condensed to one line as:
awk -- 'BEGIN { input = ARGV[1]; while (getline line < input) { sub(/^ +/, "", line); split(line, a, /  +/); for (i = 2; i < ARGC; ++i) { if (a[1] == ARGV[i]) { printf "%s %s\n", a[1], a[2]; break } } } exit }' <(my_command) "${ROWS[@]}"
Or better yet just use Bash instead as a whole:
#!/bin/bash
shopt -s extglob  # needed for the +( ) pattern used below
ROWS=(11 12)
while IFS=$' ' read -r LINE; do
    IFS='|' read -ra FIELDS <<< "${LINE// +( )/|}"
    for R in "${ROWS[@]}"; do
        if [[ ${FIELDS[0]} == "$R" ]]; then
            echo "${R} ${FIELDS[1]}"
            break
        fi
    done
done < <(my_command)
It should give an output like:
11 File Name1
12 Fi leNa me2
Shell variables aren't expanded inside single-quoted strings. Use the -v option to set an awk variable to the shell variable:
fileName=$(my_command | awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]")
This method avoids having to escape all the $ characters in the awk script, as required in konsolebox's answer.
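A minimal illustration of -v passing a shell value into awk, using toy input rather than your my_command output:
i=2
printf 'a\nb\nc\n' | awk -v i="$i" 'NR==i {print "row", i, "is", $0}'
# row 2 is b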
As you already heard, you need to populate an awk variable from your shell variable to be able to use the desired value within the awk script, so this:
awk -F "[[:space:]]{2,}+" 'NR==$i {print $2}' | egrep "^[[:alnum:]]"
should be this:
awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]"
Also, though, you don't need awk AND grep since awk can do anything grep can do, so you can change this part of your script:
awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]"
to this:
awk -v i="$i" -F "[[:space:]]{2,}+" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}'
and you don't need a + after a numeric range so you can change {2,}+ to just {2,}:
awk -v i="$i" -F "[[:space:]]{2,}" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}'
Most importantly, though, instead of invoking awk once for every invocation of my_command, you can just invoke it once for all of them, i.e. instead of this (assuming this does what you want):
i=1
for num in $rows
do
fileName=$(my_command | awk -v i="$i" -F "[[:space:]]{2,}" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}')
echo "$num $fileName"
((i++))
done
you can do something more like this:
for num in $rows
do
my_command
done |
awk -F '[[:space:]]{2,}' '$2~/^[[:alnum:]]/{print NR, $2}'
I say "something like" because you don't tell us what "my_command", "rows" or "num" are so I can't be precise but hopefully you see the pattern. If you give us more info we can provide a better answer.
It's pretty inefficient to rerun my_command (and awk) every time through the loop just to extract one line from its output. Especially when all you're doing is printing out part of each line in order. (I'm assuming that my_command really is exactly the same command and produces the same output every time through your loop.)
If that's the case, this one-liner should do the trick:
paste -d' ' <(printf '%s\n' $rows) <(my_command |
    awk -F '[[:space:]]{2,}+' '($2 ~ /^[[:alnum:]]/) {print $2}')
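To see the paste-plus-process-substitution pattern on its own, here is a toy run with made-up ids and names standing in for $rows and the my_command output:
paste -d' ' <(printf '%s\n' 11 12 13) <(printf '%s\n' fileA fileB fileC)
# 11 fileA
# 12 fileB
# 13 fileC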
