Tar compress files when some can be missing - bash

I am writing a bash script that pulls files from another server to the current directory. The issue is that I get a lot of files and I only need ~3 of them; however, all 3 might not be there.
For example, I make a server call:
server call --> file1.txt file2.txt file3.xls file4.json .... (etc)
Then compress files with tar:
tar zcf needed_files.tgz file4.json file23.doc *.txt
But file4.json was not there, so I would expect tar to compress file23.doc and all .txt files, but instead the script fails with:
tar: file4.json: Cannot stat: No such file or directory
I have tried other combinations of tar commands like czvf but no luck.

Despite the "No such file or directory" errors, tar should still have compressed the files that do exist; the archive is created with whatever was found, and tar just exits with a nonzero status.
Anyway, you could also use nullglob in combination with the extglob @() to get only the existing files. Appending @() (an extended glob that matches the empty string) turns each literal filename into a glob pattern, so nullglob drops it from the expansion when the file does not exist:
shopt -s extglob nullglob
files=( "fileA"@() "fileB"@() *.txt )
(( ${#files[@]} )) && tar zcf needed_files.tgz -- "${files[@]}"
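Applied to the filenames from the question, the array line would be, e.g.:
files=( "file4.json"@() "file23.doc"@() *.txt )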

Try an extended glob.
shopt -s extglob # set extended globbing on
if echo file[1234].+(txt|xls|json) | grep -vq '\['
then tar cvzf needed_files.tgz file[1234].+(txt|xls|json)
else echo No matching files for extglob 'file[1234].+(txt|xls|json)'
fi
If matching files exist, it will list them.
If not, it will literally echo back the pattern.
Grepping out the pattern metacharacters tells you whether any files in the set exist. If they do, use the same glob to provide the files to tar, and it will receive exactly the set of matching files. If they don't, the condition test lets you skip the tar command.
Of course, it breaks if you make files with [ in the names, etc...
Or, you could do it in a loop....
for f in file[1234].+(txt|xls|json)
do  if [[ -e "$f" ]]
    then
        # appending (r) only works on an uncompressed archive, hence .tar here
        [[ -e needed_files.tar ]] && c=r || c=c
        tar ${c}vf needed_files.tar "$f"
    fi
done
Not perfect, but might suit your tastes better.
Neither is a great solution, but one of them ought to get you rolling.

tar zcf needed_files.tgz $(ls -d file4.json file23.doc *.txt 2>/dev/null)
Notice that this prints only the files that exist:
ls -d file4.json file23.doc *.txt 2>/dev/null
You can also use the --ignore-failed-read option, but note that it will ignore other read errors too.
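For instance, a sketch assuming GNU tar, where the option downgrades unreadable or missing members to warnings:
# GNU tar: report missing files but still build the archive
tar zcf needed_files.tgz --ignore-failed-read file4.json file23.doc *.txt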

Related

Check if a filename has a string in it

I'm having problems creating an if statement to check the files in my directory for a certain string in their names.
For example, I have the following files in a certain directory:
file_1_ok.txt
file_2_ok.txt
file_3_ok.txt
file_4_ok.txt
other_file_1_ok.py
other_file_2_ok.py
other_file_3_ok.py
other_file_4_ok.py
another_file_1_not_ok.sh
another_file_2_not_ok.sh
another_file_3_not_ok.sh
another_file_4_not_ok.sh
I want to copy all files that contain 1_ok to another directory:
#!/bin/bash
directory1=/FILES/user/directory1/
directory2=/FILES/user/directory2/
string="1_ok"
cd $directory
for every file in $directory1
do
if [$string = $file]; then
cp $file $directory2
fi
done
UPDATE:
The simplest answer was made by Faibbus, but refer to Inian's if you want to remove, or simply move, the files that don't contain the specific string you want.
The other answers are valid as well.
cp directory1/*1_ok* directory2/
Use find for that:
find directory1 -maxdepth 1 -name '*1_ok*' -exec cp -v {} directory2 \;
The advantage of using find over the glob solution posted by Faibbus is that it can deal with an unlimited number of files containing 1_ok, whereas the glob solution will lead to an "argument list too long" error when calling cp with too many arguments.
Conclusion: For interactive use with a limited number of input files the glob will be fine, for a shell script, which has to be stable, I would use find.
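If the per-file cp invocations bother you, find can also batch the copies; a sketch assuming GNU cp's -t option (the {} + terminator passes many files to a single cp call):
find directory1 -maxdepth 1 -name '*1_ok*' -exec cp -vt directory2 {} +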
With your script I suggest:
#!/bin/bash
source="/FILES/user/directory1"
target="/FILES/user/directory2"
regex="1_ok"
for file in "$source"/*; do
if [[ $file =~ $regex ]]; then
cp -v "$file" "$target"
fi
done
From help [[:
When the =~ operator is used, the string to the right of the operator
is matched as a regular expression.
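Since =~ does regular-expression matching, a plain substring test can also be written with a glob pattern instead, e.g. (same variables as in the script above):
if [[ $file == *"$regex"* ]]; then
    cp -v "$file" "$target"
fi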
Please take a look: http://www.shellcheck.net/
Using extglob matching in bash with the below pattern,
+(pattern-list)
Matches one or more occurrences of the given patterns.
First enable extglob by
shopt -s extglob
cp -v directory1/+(*not_ok*) directory2/
An example,
$ ls *.sh
another_file_1_not_ok.sh another_file_3_not_ok.sh
another_file_2_not_ok.sh another_file_4_nnoot_ok.sh
$ shopt -s extglob
$ cp -v +(*not_ok*) somedir/
another_file_1_not_ok.sh -> somedir/another_file_1_not_ok.sh
another_file_2_not_ok.sh -> somedir/another_file_2_not_ok.sh
another_file_3_not_ok.sh -> somedir/another_file_3_not_ok.sh
To remove all files except those containing the pattern, do
$ rm -v !(*not_ok*) 2>/dev/null
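Similarly, to move rather than delete the non-matching files, a sketch (somedir/ is a placeholder, and the redirect hides the error from somedir matching the pattern itself):
$ mv -v !(*not_ok*) somedir/ 2>/dev/null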

Shell Script to list files in a given directory and if they are files or directories

Currently learning some bash scripting and having an issue with a question involving listing all files in a given directory and stating whether they are a file or directory. The issue I am having is that I only get either my current directory or, if I specify a directory, it will just say that it is a directory, e.g. /home/user/shell_scripts will return "shell_scripts is a directory" rather than listing the files contained within it.
This is what I have so far:
dir=$dir
for file in $dir; do
if [[ -d $file ]]; then
echo "$file is a directory"
if [[ -f $file ]]; then
echo "$file is a regular file"
fi
done
Your line:
for file in $dir; do
will expand $dir just to a single directory string. What you need to do is expand that to a list of files in the directory. You could do this using the following:
for file in "${dir}/"* ; do
This will expand the "${dir}/"* section into a name-only list of the files in that directory. As Biffen points out, the quoting should guarantee that the file list won't end up with partial file names split on whitespace.
If you want to recurse into the directories in dir then using find might be a better approach. Simply use:
for file in $( find ${dir} ); do
Note that while simple, this will not handle files or directories with spaces in them. Because of this, I would be tempted to drop the loop and generate the output in one go. This might be slightly different from what you want, but is likely to be easier to read and a lot more efficient, especially with large numbers of files. For example, to list all the directories:
find ${dir} -maxdepth 1 -type d
and to list the files:
find ${dir} -maxdepth 1 -type f
If you want to iterate into the directories below, remove the -maxdepth 1.
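Both listings can also be combined into a single invocation; a sketch assuming GNU find's -printf:
find "${dir}" -mindepth 1 -maxdepth 1 \( -type d -printf '%p is a directory\n' \) -o \( -type f -printf '%p is a regular file\n' \)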
This is a good use for globbing:
for file in "$dir/"*
do
[[ -d "$file" ]] && echo "$file is a directory"
[[ -f "$file" ]] && echo "$file is a regular file"
done
This will work even if files in $dir have special characters in their names, such as spaces, asterisks and even newlines.
Also note that variables should be quoted ("$file"). But * must not be quoted. And I removed dir=$dir since it doesn't do anything (except break when $dir contains special characters).
ls -F ~ | \
sed 's#.*/$#/& is a Directory#;t quit;s#.*#/& is a File#;:quit;s/[*/=>#|] / /'
The -F "classify" switch appends a "/" if a file is a directory. The sed code prints the desired message, then removes the suffix.
for file in $(ls $dir)
do
[ -f $file ] && echo "$file is File"
[ -d $file ] && echo "$file is Directory"
done
or replace the
$(ls $dir)
with
`ls $dir`
If you want to list files that also start with . use:
for file in "${dir}/"* "${dir}/"/.[!.]* "${dir}/"/..?* ; do

Collapse nested directories in bash

Often after unzipping a file I end up with a directory containing nothing but another directory (e.g., mkdir foo; cd foo; tar xzf ~/bar.tgz may produce nothing but a bar directory in foo). I wanted to write a script to collapse that down to a single directory, but if there are dot files in the nested directory it complicates things a bit.
Here's a naive implementation:
mv -i $1/* $1/.* .
rmdir $1
The only problem here is that it'll also try to move . and .. and ask overwrite ./.? (y/n [n]). I can get around this by checking each file in turn:
IFS=$'\n'
for file in $1/* $1/.*; do
if [ "$file" != "$1/." ] && [ "$file" != "$1/.." ]; then
mv -i $file .
fi
done
rmdir $1
But this seems like an inelegant workaround. I tried a cleaner method using find:
for file in $(find $1); do
mv -i $file .
done
rmdir $1
But find $1 will also give $1 as a result, which gives an error of mv: bar and ./bar are identical.
While the second method seems to work, is there a better way to achieve this?
Turn on the dotglob shell option, which allows your pattern to match files beginning with a dot:
shopt -s dotglob
mv -i "$1"/* .
rmdir "$1"
First, consider that many tar implementations provide a --strip-components option that allows you to strip off that first path component. Not sure whether the archive has a single top-level directory?
tar -tf yourball.tar | awk -F/ '!s[$1]++{print $1}'
will show you all the first-level contents. If there is only that one directory, then
tar --strip-components=1 -xf yourball.tar
will extract the contents of that directory into the current directory.
So that's how you can avoid the problem altogether. But it also solves your immediate problem: having already extracted the files, you have
foo/bar/stuff
foo/bar/.otherstuff
you can do
tar -cf- foo | tar --strip-components=2 -C final_destination -xf-
The --strip-components feature is not part of the POSIX specification for tar, but it is on both the common GNU and OSX/BSD implementations.
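Putting the listing and the extraction together, a sketch (yourball.tar stands in for the real archive) that strips the top directory only when there is exactly one:
top_count=$(tar -tf yourball.tar | awk -F/ '!s[$1]++{print $1}' | wc -l)
if [ "$top_count" -eq 1 ]
then tar --strip-components=1 -xf yourball.tar
else tar -xf yourball.tar
fi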

Recursively rename image collection with subfolders

I'm trying to rename files in a huge folder of images, that contains lots of subfolders and within them images.
Something like this:
ImageCollection/
    January/
        Movies/
            123123.jpg
            asd.jpg
        Landscapes/
            qweqas.jpg
    February/
        Movies/
            ABC.jpg
            QWY.jpg
        Landscapes/
            t.jpg
And I want to run the script and rename them in ascending order but keeping them in their corresponding folder, like this:
ImageCollection/
    January/
        Movies/
            0.jpg
            1.jpg
        Landscapes/
            2.jpg
    February/
        Movies/
            3.jpg
            4.jpg
        Landscapes/
            5.jpg
Until now I have the following:
#!/usr/bin/env bash
x=0
for i in path/to/dir/*/*.jpg; do
new=$(printf path/to/dir/%d ${x})
mv ${i} ${new}
let x=x+1
done
But my problem, relies on not being able to keep the files in their corresponding subfolders, instead everything is moved to the path/to/dir root folder.
A pure Bash solution (except from the mv, of course):
#!/bin/bash
shopt -s nullglob
### Optional: if you also want the .JPG (uppercase) files
# shopt -s nocaseglob
i=1
for file in ImageCollection/*/*/*.jpg; do   # two levels deep: month/category
    dirname=${file%/*}
    newfile=$dirname/$i.jpg
    echo mv "$file" "$newfile" && ((++i))
done
This will not perform the renaming, only show what's going to happen. Remove the echo if you're happy with the result you see.
You can use the -n option to mv too, so as to not overwrite existing files. (I would definitely use it in this case!). If -n is not available, you may use:
[[ ! -e $newfile ]] && mv "$file" "$newfile" && ((++i))
This is 100% safe regarding filenames (or dirnames) containing spaces or other funny symbols.
#!/bin/bash
x=0
for f in `find path_to_main_dir_or_top_folder | grep "\.jpg$"`
do
    mv $f $(dirname $f)/$x.jpg && ((x++))
done
echo shenzi
$f will hold the full path of each *.jpg file found.
The dirname command gives you the full path excluding the filename for a given $f.
$x.jpg will do the trick: the $x value increments on each iteration of the loop.
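Note that word-splitting the output of find breaks on filenames with spaces; a safer sketch using -print0 and a null-delimited read (the folder name is a placeholder):
x=0
find path_to_main_dir_or_top_folder -name '*.jpg' -print0 |
while IFS= read -r -d '' f; do
    mv "$f" "$(dirname "$f")/$x.jpg" && ((x++))
done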

Extract file using bash script

I created a script which will extract all *.tar.gz files. The file is decompressed five times (a .tar.gz within a .tar.gz, and so on), but the problem is that only the first *.tar.gz layer is being extracted.
for file in *.tar.gz; do
gunzip -c "$file" | tar xf -
done
rm -vf "$file"
What should I do? Answers are greatly appreciated.
If your problem is that the tar.gz file contains another tar.gz file which should be extracted as well, you need a different sort of loop. The wildcard at the top of the for loop is only evaluated when the loop starts, so it doesn't include anything extracted from the tar.gz
You could try something like
while true; do
    for f in *.tar.gz; do
        case $f in '*.tar.gz') exit 0;; esac
        tar zxf "$f"
        rm -v "$f"
    done
done
The case depends on the fact that (by default) when no files match the wildcard, it remains unexpanded. You may have to change your shell's globbing options if they differ from the default.
If you really mean that it is compressed (not decompressed) five times, despite the single .gz extension, perhaps you need instead
for i in 1 2 3 4; do
    gunzip file.tar.gz
    mv file.tar file.tar.gz
done
tar zxf file.tar.gz
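If you don't know the nesting depth in advance, a sketch that keeps unwrapping while the file still tests as gzip data:
while gzip -t file.tar.gz 2>/dev/null; do
    gunzip file.tar.gz          # strip one gzip layer, leaving file.tar
    mv file.tar file.tar.gz     # rename so the next test and gunzip can run
done
tar xf file.tar.gz              # by now this is a plain tar despite the name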
