execute an if statement on every folder - shell

I have, for example, 3 files (it could be 1 or it could be 30) like this:
name_date1.tgz
name_date2.tgz
name_date3.tgz
When extracted it will look like :
name_date1/data/info/
name_date2/data/info/
name_date3/data/info/
Here is how it looks inside each folder:
name_date1/data/info/
you.log
you.log.1.gz
you.log.2.gz
you.log.3.gz
name_date2/data/info/
you.log
name_date3/data/info/
you.log
you.log.1.gz
you.log.2.gz
What I want to do is concatenate all the you.log* files within each folder, and then concatenate those per-folder results into one single file.
1st step: extract all the archives
for a in *.tgz
do
a_dir=${a%.tgz}
mkdir $a_dir 2>/dev/null
tar -xvzf $a -C $a_dir >/dev/null
done
2nd step: executing an if statement on each folder available and cat everything
myarray=(`find */data/info/ -maxdepth 1 -name "you.log.*.gz"`)
ls -d */ | xargs -I {} bash -c "cd '{}' &&
if [ ${#myarray[@]} -gt 0 ];
then
find data/info -name "you.log.*.gz" -print0 | sort -z -rn -t. -k4 | xargs -0 zcat | cat - data/info/you.log > youfull1.log
else
cat - data/info/you.log > youfull1.log
fi "
cat */youfull1.log > youfull.log
My issue: when I put in multiple name_date*.tgz files, it gives me this error:
gzip: stdin: unexpected end of file
Despite the error, I still get all my files concatenated, but why the error message?
When I put in only one .tgz file, I don't have any issue, regardless of the number of you.log files.
Any suggestions, please?

Try something simpler. No need for myarray. Pass files one at a time as they are inputted and decide what to do with them one at a time. Try:
find */data/info -type f -maxdepth 1 -name "you.log*" -print0 |
sort -z |
xargs -0 -n1 bash -c '
if [[ "${1##*.}" == "gz" ]]; then
zcat "$1";
else
cat "$1";
fi
' --
If you have to iterate over directories, don't use ls, still use find.
find . -maxdepth 1 -type d -name 'name_date*' -print0 |
sort -z |
while IFS= read -r -d '' dir; do
cat "$dir"/data/info/you.log
find "$dir"/data/info -type f -maxdepth 1 -name 'you.log.*.gz' -print0 |
sort -z -t'.' -n -k4,4 |
xargs -r -0 zcat
done
or (if you have to) with xargs, which should give you the idea how it's used:
find . -maxdepth 1 -type d -name 'name_date*' -print0 |
sort -z |
xargs -0 -n1 bash -c '
cat "$1"/data/info/you.log
find "$1"/data/info -type f -maxdepth 1 -name "you.log.*.gz" -print0 |
sort -z -t"." -n -k4,4 |
xargs -r -0 zcat
' --
Use -t option with xargs to see what it's doing.
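The -r in the xargs calls above is probably what makes the error message from the question go away. A minimal sketch, assuming GNU xargs and a folder that has you.log but no you.log.*.gz files (like name_date2); the scratch directory and the variable names are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: a folder with you.log but no you.log.*.gz, like name_date2 in the question.
tmp=$(mktemp -d)                      # hypothetical scratch directory
mkdir -p "$tmp/data/info"
echo "plain" > "$tmp/data/info/you.log"

# Without -r, GNU xargs starts zcat once even though find printed nothing,
# so zcat falls back to reading an empty standard input and complains.
without_r=$(find "$tmp/data/info" -name 'you.log.*.gz' -print0 | xargs -0 zcat 2>&1)

# With -r, zcat is never started when the list is empty.
with_r=$(find "$tmp/data/info" -name 'you.log.*.gz' -print0 | xargs -r -0 zcat 2>&1)

echo "without -r: ${without_r:-<no output>}"
echo "with -r:    ${with_r:-<no output>}"
rm -rf "$tmp"
```

On a GNU system the first line should show a gzip complaint about stdin, while the second stays empty. BSD xargs already skips the command on empty input, so the first result may differ there.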

Related

Finding most recent file from a list of directories from find command

I use find . -type d -name "Financials" to find all the directories called "Financials" under the current directory. Since I am on Mac, I can use the following (which I found from another stackoverflow question) to find the latest modified file in my current directory: find . -type f -print0 | xargs -0 stat -f "%m %N" | sort -rn | head -1 | cut -f2- -d" ". What I would like to do is find a way to pipe the results of the first command into the second command--i.e. to find the most recently modified file in each "Financials" directory. Is there a way to do this?
I think you could:
find . -type d -name "Financials" -print0 |
xargs -0 -I{} find {} -type f -print0 |
xargs -0 stat -f "%m %N" | sort -rn | head -1 | cut -f2- -d" "
But if you want separately for each dir, then... why not just loop it:
find . -type d -name "Financials" |
while IFS= read -r dir; do
echo "newest file in $dir is $(
find "$dir" -type f -print0 |
xargs -0 stat -f "%m %N" | sort -rn | head -1 | cut -f2- -d" "
)"
done
Nest the 2nd find+xargs inside a first find+xargs:
find . -type d -name "Financials" -print0 \
| xargs -0 sh -c '
for d in "$@"; do
find "$d" -type f -print0 \
| xargs -0 stat -f "%m %N" \
| sort -rn \
| head -1 \
| cut -f2- -d" "
done
' sh
Note the trailing "sh" in sh -c '...' sh -- that word becomes "$0" inside the shell script, so every directory lands in the positional parameters and the for-loop can iterate over all of them.
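The role of that placeholder word can be checked with a small sketch. The directory names "dir one" and dir2 are invented for the demonstration:

```shell
#!/usr/bin/env bash
# The word after the script ("sh") becomes $0; the remaining words become "$@".
out=$(sh -c 'echo "0=$0"; for d in "$@"; do echo "arg=$d"; done' sh "dir one" dir2)
echo "$out"

# Without the placeholder, the first directory is swallowed as $0
# and the loop never sees it.
out_missing=$(sh -c 'for d in "$@"; do echo "arg=$d"; done' "dir one" dir2)
echo "$out_missing"
```

The second call only reports dir2, which is why the extra word is needed whenever xargs appends the file names after the script.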
A robust way that also avoids problems with funny filenames containing special characters (it relies on -printf, which GNU find provides but the BSD find shipped with macOS does not) is:
find all files within this particular subdirectory, and extract the inode number and modification time
$ find . -type f -ipath '*/Financials/*' -printf "%T@ %i\n"
extract youngest file's inode number
$ ... | awk '($1>t){t=$1;i=$2}END{print i}'
search file information by inode number
$ find . -type f -inum "$inode" -ipath '*/Financials/*'
So this gives you:
$ inode="$(find . -type f -ipath '*/Financials/*' -printf "%T@ %i\n" | awk '($1>t){t=$1;i=$2}END{print i}')"
$ find . -type f -inum "$inode" -ipath '*/Financials/*'

Find and count compressed files by extension

I have a bash script that counts compressed files by file extension and prints the count.
#!/bin/bash
FIND_COMPRESSED=$(find . -type f | sed -e 's/.*\.//' | sort | uniq -c | sort -rn | grep -Ei '(deb|tgz|tar|gz|zip)$')
COUNT_LINES=$($FIND_COMPRESSED | wc -l)
if [[ $COUNT_LINES -eq 0 ]]; then
echo "No archived files found!"
else
echo "$FIND_COMPRESSED"
fi
However, the script works only if there are NO files with .deb .tar .gz .tgz .zip.
If there are some, say test.zip and test.tar in the current folder, I get this error:
./arch.sh: line 5: 1: command not found
Yet, if I copy the contents of the FIND_COMPRESSED variable into the COUNT_LINES, all works fine.
#!/bin/bash
FIND_COMPRESSED=$(find . -type f | sed -e 's/.*\.//' | sort | uniq -c | sort -rn | grep -Ei '(deb|tgz|tar|gz|zip)$')
COUNT_LINES=$(find . -type f | sed -e 's/.*\.//' | sort | uniq -c | sort -rn | grep -Ei '(deb|tgz|tar|gz|zip)$'| wc -l)
if [[ $COUNT_LINES -eq 0 ]]; then
echo "No archived files found!"
else
echo "$FIND_COMPRESSED"
fi
What am I missing here?
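When you write $FIND_COMPRESSED unquoted at the start of a command like that, the shell tries to execute its contents as a command, which is why it fails when the variable has contents (the first word of the output, the count 1, is taken as the command name). When it's empty, there is nothing to run, wc simply returns 0, and the script marches on.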
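Thus, you need to feed the variable's contents to the counting command instead of running them. Quote the variable so the newlines survive, and use grep -c '' so that an empty variable counts as 0 lines (echo would still emit one empty line):
COUNT_LINES=$(printf '%s' "$FIND_COMPRESSED" | grep -c '')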
But, while we're at it, you can also simplify the other line with something like this:
FIND_COMPRESSED=$(find . -type f \( -iname "*deb" -or -iname "*tgz" -or -iname "*tar*" \)) #etc
Alternatively, you can read the matches into an array:
mapfile -t FIND_COMPRESSED < <(find . -type f -regextype posix-extended -regex ".*(deb|tgz|tar|gz|zip)$" -exec bash -c '[[ "$(file "$1")" =~ compressed ]] && echo "$1"' _ {} \;)
COUNT_LINES=${#FIND_COMPRESSED[@]}
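The difference between running a variable and counting its lines can be sketched as follows. The helper count_lines and the sample value are hypothetical stand-ins, not part of the original script:

```shell
#!/usr/bin/env bash
# count_lines is a hypothetical helper, not part of the original script.
count_lines() {
    # grep -c '' counts lines and prints 0 for empty input,
    # whereas `echo "$var" | wc -l` would report 1 for an empty variable.
    printf '%s' "$1" | grep -c ''
}

# Stand-in for the output of the find | sed | sort | uniq -c | grep pipeline.
FIND_COMPRESSED='      1 zip
      1 tar'

COUNT_LINES=$(count_lines "$FIND_COMPRESSED")
if [ "$COUNT_LINES" -eq 0 ]; then
    echo "No archived files found!"
else
    echo "$FIND_COMPRESSED"
fi
```

If only the empty/non-empty distinction matters, testing the variable directly with [[ -z $FIND_COMPRESSED ]] would avoid the count altogether.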

Why does my xargs command with a pipe work only for a single file, but not multiple?

I am trying to pipe a few commands in a row; it works with a single file, but gives me an error once I try it on multiple files at once.
On a single file in my working folder:
find . -type f -iname "summary.5runs.*" -print0 | xargs -0 cut -f1-2 | head -n 2
#It works
Now I want to scan all files with a certain prefix/suffix in the name in all subdirectories of my working folder, then write the results to text file
find . -type f -iname "ww.*.out.txt" -print0 | xargs -0 cut -f3-5 | head -n 42 > summary.5runs.txt
#Error: xargs: cut: terminated by signal 13
I guess my problem is to reiterate through multiple files, but I am not sure how to do it.
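Signal 13 is SIGPIPE: head exits after reading 42 lines of the combined output, and cut is killed when it keeps writing to the closed pipe. So your final head stops after 42 lines in total, whereas you want it to operate per file. You could fudge around with a subshell in xargs: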
xargs -0 -I{} bash -c 'cut -f3-5 "$1" | head -n 42' _ {} > summary.5runs.txt
or you could make it part of an -exec action:
find . -type f -iname "ww.*.out.txt" \
-exec bash -c 'cut -f3-5 "$1" | head -n 42' _ {} \; > summary.5runs.txt
Alternatively, you could loop over all the files in the subshell so you have to spawn just one:
find . -type f -iname "ww.*.out.txt" \
-exec bash -c 'for f; do cut -f3-5 "$f" | head -n 42; done' _ {} + \
> summary.5runs.txt
Notice the {} + instead of {} \;.
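To see the per-file behaviour of the {} + variant, here is a small sketch on two generated sample files. The scratch directory, the file contents, and the smaller numbers (2 lines, fields 1-2) are assumptions chosen to keep the output short:

```shell
#!/usr/bin/env bash
# Sketch: apply head to each file separately, not to the combined stream.
tmp=$(mktemp -d)   # hypothetical scratch directory with two sample files
for n in 1 2; do
    # Five tab-separated lines per file.
    for i in 1 2 3 4 5; do
        printf 'f%s\tline%s\tx\n' "$n" "$i"
    done > "$tmp/ww.$n.out.txt"
done

# One bash process loops over all files; each file gets its own cut | head pipeline.
per_file=$(find "$tmp" -type f -iname 'ww.*.out.txt' \
    -exec bash -c 'for f; do cut -f1-2 "$f" | head -n 2; done' _ {} +)

echo "$per_file"
rm -rf "$tmp"
```

Each file contributes its own first two lines, so the result has four lines in total. The order of the files follows whatever order find reports them in.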

How to count files in subdir and filter output in bash

Hi hoping someone can help, I have some directories on disk and I want to count the number of files in them (as well as dir size if possible) and then strip info from the output. So far I have this
find . -type d -name "*,d" -print0 | xargs -0 -I {} sh -c 'echo -e $(find "{}" | wc -l) "{}"' | sort -n
This gets me all the dir's that match my pattern as well as the number of files - great!
This gives me something like
2 ./bob/sourceimages/psd/dzv_body.psd,d
2 ./bob/sourceimages/psd/dzv_body_nrm.psd,d
2 ./bob/sourceimages/psd/dzv_body_prm.psd,d
2 ./bob/sourceimages/psd/dzv_eyeball.psd,d
2 ./bob/sourceimages/psd/t_zbody.psd,d
2 ./bob/sourceimages/psd/t_gear.psd,d
2 ./bob/sourceimages/psd/t_pupil.psd,d
2 ./bob/sourceimages/z_vehicles_diff.tga,d
2 ./bob/sourceimages/zvehiclesa_diff.tga,d
5 ./bob/sourceimages/zvehicleswheel_diff.jpg,d
From that I would like to filter on the number of files, keeping for example only results with more than 4, and capture the filetype as a variable for each remaining result, e.g. for ./bob/sourceimages/zvehicleswheel_diff.jpg,d
I guess I could use awk for this?
Then finally I would like to remove all the results from disk. With find I normally just do something like -exec rm -rf {} \; but I'm not clear how it would work here.
Thanks a lot
EDITED
While this is clearly not the answer, these commands get me the info I want in the form I want it. I just need a way to put it all together and not search multiple times as that's total rubbish
filetype=$(find . -type d -name "*,d" -print0 | awk 'BEGIN { FS = "." }; { print $3 }' | cut -d',' -f1)
filesize=$(find . -type d -name "*,d" -print0 | xargs -0 -I {} sh -c 'du -h {};' | awk '{ print $1 }')
filenumbers=$(find . -type d -name "*,d" -print0 | xargs -0 -I {} sh -c 'echo -e $(find "{}" | wc -l);')
files_count=`ls -keys | nl`
For instance:
ls | nl
nl numbers each line of its input, so the last number printed is the count of entries.
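One way to combine the count, size, and file type in a single pass might look like the sketch below. The function name report_big_dirs and its two arguments are hypothetical, the file type is assumed to be the extension in front of ",d", and the rm line is deliberately left commented out:

```shell
#!/usr/bin/env bash
# report_big_dirs is a hypothetical helper.
#   $1: directory to search
#   $2: report only "*,d" directories holding more files than this number
report_big_dirs() {
    find "$1" -type d -name '*,d' -exec sh -c '
        threshold=$1; shift
        for dir; do
            # Count regular files only; the arithmetic strips padding from wc.
            count=$(( $(find "$dir" -type f | wc -l) ))
            [ "$count" -gt "$threshold" ] || continue
            base=${dir##*/}        # zvehicleswheel_diff.jpg,d
            stem=${base%,d}        # zvehicleswheel_diff.jpg
            filetype=${stem##*.}   # jpg
            size=$(du -sh "$dir" | cut -f1)
            printf "%s\t%s\t%s\t%s\n" "$count" "$size" "$filetype" "$dir"
            # rm -rf -- "$dir"     # left commented out: check the list first
        done
    ' sh "$2" {} +
}

# Example call: report_big_dirs . 4 | sort -n
```

Unlike the one-liner in the question, this counts only regular files, so the directory itself is not included in the number. File names containing newlines would be miscounted.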

Bash Script interactive mv issues

I'm working on a bash script to help organize files and I want to use mv -i to make sure I don't write over something important.
The script is working right now except for the -i for the mv.
It shows the (y/n [n]) not overwritten part, but then goes on and doesn't allow me to interact with it.
createList()
{
ls *.epub | sed 's/-.*//' |uniq >> list.txt
ls *.mobi | sed 's/-.*//' |uniq >> list2.txt
}
atag()
{
find /Users/j/Desktop/Source -maxdepth 1 -iname "*.epub" -type f -print0 | xargs -0 -I '{}' tag -a Purple {}
find /Users/j/Desktop/Source -maxdepth 1 -iname "*.mobi" -type f -print0 | xargs -0 -I '{}' tag -a Purple {}
}
moveEpub()
{
while read -r line; do
if [ -d "/Users/j/Desktop/Dest/$line" ]; then
if [ -d "/Users/j/Desktop/Dest/$line/EPUB" ]; then
find /Users/j/Desktop/Source/ -maxdepth 1 -iname "*$line*" -and ! -iname ".*$line*" -type f -print0 | xargs -0 -I '{}' mv -i {} /Users/j/Desktop/Dest/"$line"/EPUB/
else
mkdir "/Users/j/Desktop/Dest/$line/EPUB"
find /Users/j/Desktop/Source/ -maxdepth 1 -iname "*$line*" -and ! -iname ".*$line*" -type f -print0 | xargs -0 -I '{}' mv -i {} /Users/j/Desktop/Dest/"$line"/EPUB/
fi
fi
done < "list.txt"
}
moveMobi()
{
while read -r line; do
if [ -d "/Users/j/Desktop/Dest/$line" ]; then
if [ -d "/Users/j/Desktop/Dest/$line/MOBI" ]; then
find /Users/j/Desktop/Source/ -maxdepth 1 -iname "*$line*" -and ! -iname ".*$line*" -type f -print0 | xargs -0 -I '{}' mv -i {} /Users/j/Desktop/Dest/"$line"/MOBI/
else
mkdir "/Users/j/Desktop/Dest/$line/MOBI"
find /Users/j/Desktop/Source/ -maxdepth 1 -iname "*$line*" -and ! -iname ".*$line*" -type f -print0 | xargs -0 -I '{}' mv --interactive {} /Users/j/Desktop/Dest/"$line"/MOBI/
fi
fi
done < "list2.txt"
}
clear
createList
atag
moveEpub
moveMobi
rm list.txt
rm list2.txt
If you want mv -i to interact with the terminal, that means its stdin needs to be attached to that terminal. There are several places, here, where you're overriding stdin.
For instance:
# THIS LOOP OVERRIDES STDIN
while read -r line
...
done <list.txt
...redirects stdin for the entire duration of the loop, so instead of reading from the user, mv reads from list.txt. To change this, use a different file descriptor:
# This loop uses FD 3 for stdin
while read -r line <&3
...
done 3<list.txt
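A small sketch of the effect, with typed answers simulated by a pipe. The helper process_list and the sample names are hypothetical, and the inner read stands in for an interactive command such as mv -i:

```shell
#!/usr/bin/env bash
# process_list is a hypothetical helper: names arrive on FD 3, answers on stdin.
process_list() {
    while IFS= read -r line <&3; do
        # Stands in for an interactive command such as `mv -i`:
        # it can still read the reply from the untouched stdin.
        IFS= read -r answer
        echo "$line -> $answer"
    done 3<"$1"
}

list=$(mktemp)
printf '%s\n' alpha beta > "$list"
# Typed answers are simulated here by piping two lines into stdin.
result=$(printf '%s\n' yes no | process_list "$list")
echo "$result"
rm -f "$list"
```

Had the list been redirected onto stdin instead, the inner read would have consumed the next list entry as its answer, which mirrors what happens to mv -i in the original script.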
Another place is in calling xargs. Instead of:
# Overrides stdin for xargs and mv to contain output from find
find ... -print0 | xargs -0 -I '{}' mv -i '{}' "$dest"
...use:
# directly executes mv from find, stdin not modified
find ... -exec mv -i '{}' "$dest" ';'
That said, I would suggest ditching list.txt and list2.txt altogether; you simply don't need them; for that matter, you don't need find either.
dest=/Users/j/Desktop/Dest
source=/Users/j/Desktop/Source
moveEpub() {
local -A finished=( ) # WARNING: This requires bash 4.0 or newer.
for name in *.epub; do
prefix=${name%%-*} # remove everything past the first dash
[[ ${finished[$prefix]} ]] && continue # skip if already done with this prefix
finished[$prefix]=1 # set flag to skip other files w/ this prefix
[[ -d $dest/$prefix ]] || continue # skip if no directory exists for this prefix
mkdir -p "$dest/$prefix/EPUB" # create destination if not existing
mv -i "$source"/*"$prefix"* "$dest/$prefix/EPUB"
done
}
You can use find's built-in -exec action instead of piping to xargs:
find /Users/j/Desktop/Source/ -maxdepth 1 \
-iname "*$line*" -and ! -iname ".*$line*" -type f \
-exec mv -i {} /Users/j/Desktop/Dest/"$line"/EPUB/ \;
