How to restrict pattern matching to the first matching file - bash

I have a very delicate bash problem:
given a directory foo
$ ls
foo.tgz bar.tgz baz.tgz
I want a bash one-liner that extracts from only the first tarball matching a pattern, like:
bash -c "tar -zxvf foo.tgz file1" # fine
bash -c 'tar -zxvf *.tgz file1' # oops! the other tarballs are treated as members to extract from the first match!
Is there a possibility to restrict pattern matching to the first expanded parameter?
Refinement:
find -iname '*.tgz' | xargs tar -zxvf # oops! no way to restrict the extraction to just file1

Use the -quit primary with find:
find -iname '*.tgz' -exec tar -zxvf '{}' \; -quit
As the actions are processed from left-to-right, this will run tar on the first match, then end the find command.
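Since the question only wants file1 out of the archive, the member name can also go inside the -exec. One caveat with this variant: if tar exits non-zero (say file1 is missing from that archive), -exec evaluates to false, -quit is skipped, and find moves on to the next match.
find -iname '*.tgz' -exec tar -zxvf '{}' file1 \; -quit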

Sure, try this:
tar -zxvf $(ls *.tgz | head -1) file1
You should consider what happens if nothing matches the pattern....
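One way to guard against an empty match is bash's nullglob option plus an array; a minimal sketch:
shopt -s nullglob                       # a non-matching *.tgz now expands to nothing
tarballs=(*.tgz)
if [ "${#tarballs[@]}" -gt 0 ]; then
    tar -zxvf "${tarballs[0]}" file1    # use only the first (alphabetical) match
else
    echo "no .tgz archive found" >&2
fi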

Related

Create archive from difference of two folders

I have the following problem.
There are two nested folders A and B. They are mostly identical, but B has a few files that A does not. (These are two mounted rootfs images).
I want to create a shell script that does the following:
Find out which files are contained in B but not in A.
copy the files found in 1. from B and create a tar.gz that contains these files, keeping the folder structure.
The goal is to import the additional data from image B afterwards on an embedded system that contains the contents of image A.
For the first step I put together the following code snippet (note on the grep "Nur": the diff output is German, where "Nur in" means "Only in"):
diff -rq <A> <B>/ 2>/dev/null | grep Nur | awk '{print substr($3, 1, length($3)-1) "/" substr($4, 1, length($4)-1)}'
The result is the output of the paths relative to folder B.
I have no idea how to implement the second step. Can someone give me some help?
Using diff for finding files which don't exist is severe overkill; you are doing a lot of calculations to compare the contents of the files, where clearly all you care about is whether a file name exists or not.
Maybe try this instead.
tar zcf newfiles.tar.gz $(comm -13 <(cd A && find . -type f | sort) <(cd B && find . -type f | sort) | sed 's/^\./B/')
The find commands produce a listing of the file name hierarchies; comm -13 extracts the elements which are unique to the second input file (which here isn't really a file at all; we are using the shell's process substitution facility to provide the input) and the sed command adds the path into B back to the beginning.
Passing a command substitution $(...) as the argument to tar is problematic; if there are a lot of file names, you will run into "command line too long", and if your file names contain whitespace or other irregularities in them, the shell will mess them up. The standard solution is to use xargs but using xargs tar cf will overwrite the output file if xargs ends up calling tar more than once; though perhaps your tar has an option to read the file names from standard input.
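For instance, GNU tar can read the member list from standard input with -T - (short for --files-from=-), which avoids both the argument-length limit and the single-invocation problem. A sketch built on the same comm pipeline (names containing newlines would still need the null-terminated variant):
comm -13 <(cd A && find . -type f | sort) <(cd B && find . -type f | sort) |
  sed 's/^\./B/' |
  tar -czf newfiles.tar.gz -T -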
With find:
$ mkdir -p A B
$ touch A/a A/b
$ touch B/a B/b B/c B/d
$ cd B
$ find . -type f -exec sh -c '[ ! -f ../A/"$1" ]' _ {} \; -print
./c
./d
The idea is to use the exec action with a shell script that tests the existence of the current file in the other directory. There are a few subtleties:
The first argument of sh -c is the script to execute, the second (here _ but could be anything else) corresponds to the $0 positional parameter of the script and the third ({}) is the current file name as set by find and passed to the script as positional parameter $1.
The -print action at the end is needed, even if it is normally the default with find, because the use of -exec cancels this default.
Example of use to generate your tarball with GNU tar:
$ cd B
$ find . -type f -exec sh -c '[ ! -f ../A/"$1" ]' _ {} \; -print > ../list.txt
$ tar -c -v -f ../diff.tar --files-from=../list.txt
./c
./d
Note: if you have unusual file names the --verbatim-files-from GNU tar option can help. Or a combination of the -print0 action of find and the --null option of GNU tar.
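For example, still working from inside B as above, that combination might look like this with GNU tar:
find . -type f -exec sh -c '[ ! -f ../A/"$1" ]' _ {} \; -print0 |
  tar -c -v -f ../diff.tar --null --files-from=-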
Note: if the shell is POSIX-compliant (e.g., bash), you can also run find from the parent directory and get the paths of the files relative to there, if you prefer:
$ mkdir -p A B
$ touch A/a A/b
$ touch B/a B/b B/c B/d
$ find B -type f -exec sh -c '[ ! -f A"${1#B}" ]' _ {} \; -print
B/c
B/d

How to cd into grep output?

I have a shell script which basically searches all folders inside a location and I use grep to find the exact folder I want to target.
for dir in /root/*; do
grep "Apples" "${dir}"/*.* || continue
While grep successfully finds my target directory, I'm stuck on how to move the folders I want into that target directory. An idea I had was to cd into the grep output, but that's where I got stuck. I tried some Google results; none helped with my case.
Example grep output: Binary file /root/ant/containers/secret/Documents/2FD412E0/file.extension matches
I want to cd into 2FD412E0 and move two folders inside that directory.
dirname is the key to that:
cd $(dirname $(grep "...." ...))
will let you enter the directory.
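If more than one file can match, or the path contains spaces, a slightly more defensive sketch is to let grep print only file names (-l), take the first one, and quote the substitutions:
cd "$(dirname "$(grep -Rl "Apples" /root/ | head -n 1)")"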
As people mentioned, dirname is the right tool to strip off the file name from the path.
I would use find for this kind of task:
while read -r file
do
    target_dir=$(dirname "$file")
    # do something with "$target_dir"
done < <(find /root/ -type f \
         -exec grep "Apples" --files-with-matches {} \;)
Consider using find's -maxdepth option. See the man page for find.
Well, there is actually a simpler solution :) I just like to write bash scripts. You might simply use a single find command like this:
find /root/ -type f -exec grep Apples {} ';' -exec ls -l {} ';'
Note the second -exec. It will be executed if the previous -exec command exited with status 0 (success). From the man page:
-exec command ;
Execute command; true if 0 status is returned. All following arguments to find are taken to be arguments to the command until an argument consisting of ; is encountered. The string {} is replaced by the current file name being processed everywhere it occurs in the arguments to the command, not just in arguments where it is alone, as in some versions of find.
Replace the ls -l command with your stuff.
And if you want to execute dirname within the -exec command, you may do the following trick:
find /root/ -type f -exec grep -q Apples {} ';' \
  -exec sh -c 'cd "$(dirname "$0")" && pwd' {} ';'
Replace pwd with your stuff.
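Since the original goal was to move two folders into the matched directory, pwd can be swapped for an mv; the two source paths below are placeholders, not paths from the question:
find /root/ -type f -exec grep -q Apples {} ';' \
  -exec sh -c 'mv /path/to/folderA /path/to/folderB "$(dirname "$0")"/' {} ';'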
When find is not available
In the comments you write that find is not available on your system. The following solution works without find:
grep -R --files-with-matches Apples "${dir}" | while read -r file
do
    target_dir=$(dirname "$file")
    # do something with "$target_dir"
    echo "$target_dir"
done

Moving a directory tree with only some of the files

I have files structured as follows:
folder1
-file1.a
-file2.b
-file3.c
folder2
-file1.a
-file2.b
folder3
-file1.a
-file2.b
-file3.c
I want to copy just the files with the .a extension, preserving the tree structure:
folder1
-file1.a
folder2
-file1.a
folder3
-file1.a
How can I accomplish this using bash?
If you have GNU cp (Linux) you can:
cp --parents folder*/**/*.a /path/to/destination
or with two piped tar invocations:
tar czf - folder*/**/*.a | tar -C /path/to/dest -xvf -
or
find folder* -name \*.a -print | cpio -o | (cd /path/to/dest ; cpio -idv)
or better, from @JonathanLeffler's comment:
find folder* -name '*.a' -print | cpio -pvd /path/to/dest
# and with null-terminated names
find ... -print0 | cpio -p0dv /path/to/dest
The ** means (from man bash):
Matches any string, including the null string. When the globstar shell option is enabled, and * is used in a pathname expansion context, two adjacent *s used as a single pattern will match all files and zero or more directories and subdirectories. If followed by a /, two adjacent *s will match only directories and subdirectories.
and the globstar shell option (note: it is off by default in bash):
globstar
If set, the pattern ** used in a pathname expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a /, only directories and subdirectories match.
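So, for the ** patterns above to recurse, globstar usually has to be switched on first, e.g.:
shopt -s globstar
cp --parents folder*/**/*.a /path/to/destination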
One way is to create a tar file with the files you need and then extract that tar wherever you want it, e.g.:
$ tar -cvf allfiles.tar `find . -name "*.a" -print`
$ cp allfiles.tar /newdirectory
$ cd /newdirectory
$ tar -xvf allfiles.tar
Another way would be:
# below command will find all files with .a extension and then
# copy them to the newdir directory with folder structure.
$ cp --parents `find . -name "*.a" -print` newdir
You could use a for loop:
for f in folder{1..3}; do mkdir -p "dest/$f" && cp "src/$f/file1.a" "dest/$f"; done
For example:
$ ls src/*
src/folder1:
file1.a file2.b file3.c
src/folder2:
file1.a file2.b file3.c
src/folder3:
file1.a file2.b file3.c
$ for f in folder{1..3}; do mkdir -p "dest/$f" && cp "src/$f/file1.a" "dest/$f"; done
$ ls dest/*
dest/folder1:
file1.a
dest/folder2:
file1.a
dest/folder3:
file1.a
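If the folders hold more .a files than just file1.a, the loop can be driven by a glob instead of a fixed name; a sketch using bash's globstar, keeping the src/dest layout from the example above:
shopt -s globstar nullglob
for f in src/**/*.a; do
    mkdir -p "dest/$(dirname "${f#src/}")"
    cp "$f" "dest/${f#src/}"
done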

Extract a tar file with no directory information

I have a folder I am turning into a tar file for transfer over the network. It looks like this:
/foo/bar/file1
/foo/file2
/foo/baz/bin/file3
I want to extract it with no directory information, so the extracted contents look like this:
file1
file2
file3
How does one accomplish this?
The GNU tar manual elaborates on --transform:
tar --show-transformed-names --transform 's,.*/,,' -tvf myarchive.tar
tar --transform 's,.*/,,' -xf myarchive.tar
What about the Unix philosophy of stringing together tools?
Use tar to extract as usual, then
find foo -type f | xargs -I{} mv {} .
rm -rf foo # optional, nuke the empty dirs.
or variations thereof with find ... -print0 and xargs -0.
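For file names containing whitespace, that null-terminated variation could look like:
find foo -type f -print0 | xargs -0 -I{} mv {} .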

Unix script to find all folders in the directory, then tar and move them

Basically I need to run a Unix script to find all folders in the directory /fss/fin, if it exists; then I have to tar them and move them to another directory, /fs/fi.
This is my command so far:
find /fss/fin -type d -name "essbase" -print
Here I have directly mentioned the folder name essbase. But instead, I would like to find all the folders in /fss/fin and use them all.
How do I find all folders in the /fss/fin directory & tar them to move them to /fs/fi?
Clarification 1:
Yes, I need to find all the folders in the /fss/fin directory using a Unix shell script, tar them, and move them to another directory, /fs/fi.
Clarification 2:
I want to make the requirement clear. The shell script should:
Find all the folders in the directory /fss/fin
Tar the folders
Move the folders to another directory, /fs/fi, which is located on the server s11003232sz.net
On user request, untar the folders and move them back to the original directory /fss/fin
Here is an example I am working with that may lead you in the correct direction:
BackUpDIR="/srv/backup/"
SrvDir="/srv/www/"
DateStamp=$(date +"%Y%m%d");
for Dir in $(find $SrvDir* -maxdepth 0 -type d );
do
    FolderName=$(basename $Dir);
    tar zcf "$BackUpDIR$DateStamp.$FolderName.tar.gz" -P $Dir
done
Since tar does directories automatically, you really don't need to do very much. Assuming GNU tar:
tar -C /fss/fin -cf - essbase |
tar -C /fs/fi -xf -
The '-C' option changes directory before operating. The first tar writes to standard output (the lone '-') everything found in the essbase directory. The output of that tar is piped to the second tar, which reads its standard input (the lone '-'; fun isn't it!).
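Since Clarification 2 says /fs/fi is located on the server s11003232sz.net, the same pipe can be pushed through ssh; user below is a placeholder for whatever account you have on that machine:
tar -C /fss/fin -cf - essbase | ssh user@s11003232sz.net 'tar -C /fs/fi -xf -'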
Assuming GNU find, you can also do:
(cd /fss/fin; tar -cf - $(find . -maxdepth 1 -type d | sed '/^\.$/d')) |
tar -xf - -C /fs/fi
This changes directory to the source directory; it runs 'find' with a maximum depth of 1 to find the directories and removes the current directory from the list with 'sed'; the first 'tar' then writes the output to the second one, which is the same as before (except I switched the order of the arguments to emphasize the parallelism between the two invocations).
If your top-level directories (those actually in /fss/fin) have spaces in the names, then there is more work to do again - I'm assuming none of the directories to be backed up start with a '.':
(cd /fss/fin; find * -maxdepth 0 -type d -print0 | xargs -0 tar -cf -) |
tar -xf - -C /fs/fi
This weeds out the non-directories from the list generated by '*', and writes them with NUL '\0' (zero bytes) marking the end of each name (instead of a newline). The output is written to 'xargs', which is configured to expect the NUL-terminated names, and it runs 'tar' with the correct directory names. The output of this ensemble is sent to the second tar, as before.
If you have directory names starting with a '.' to collect, then add '.[a-z]*' or another suitable pattern after the '*'; it is crucial that what you use does not list '.' or '..'. If you have names starting with dashes in the directory, then you need to use './*' and './.[a-z]*'.
If you've got still more perverse requirements, enunciate them clearly in an amendment to the question.
find /fss/fin -mindepth 1 -maxdepth 1 -type d -print
The above command gives you the list of first-level subdirectories of /fss/fin.
Then you can do anything with this. E.g. tar them to your output directory as in the command below:
tar -czf /fs/fi/outfile.tar.gz `find /fss/fin -mindepth 1 -maxdepth 1 -type d -print`
Original directory structure will be recreated after untarring.
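For the restore step (untar and put things back under /fss/fin), one possible sketch, assuming GNU tar, which strips the leading / at creation time so members are stored as fss/fin/...:
cd / && tar -xzf /fs/fi/outfile.tar.gz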
Here is a bash example (replace /fss/fin and /fs/fi with your paths):
dirs=($(find /fss/fin -type d))
for dir in "${dirs[@]}"; do
    tar zcf "$dir.tgz" "$dir" -P -C /fs/fi && mv -v "$dir" /fs/fi/
done
which finds all the folders, tars them separately, and, if successful, moves them into the other folder.
This should do it:
#!/bin/sh
list=`find . -type d`
for i in $list
do
    if [ "$i" != "." ]; then
        tar -czf "${i}.tar.gz" "${i}"
    fi
done
mv *.tar.gz ~/tardir
