TAR-ing on-the-fly - bash

I'm trying to get a listing of all files within all directories on our SAN. I'm starting with my local machine to test out how I want to do it. So, in my Documents directory:
ls -sR > documents_tree.txt
With just my local, that's fine. It gives the exact output I want. But since I'm doing it on our SAN, I'm going to have to compress on-the-fly, and I'm not sure the best way of doing this. So far I have:
ls -sR > documents_tree.txt | tar -cvzf documents_tree.tgz documents_tree.txt
When I try to check the output, I can't un-tar the file using tar -xvf documents_tree.tar after I have gunzipped it.
So, what is the correct way to compress on-the-fly? How can I accurately check my work? Will this work when performing the same process on a SAN?

You don't need tar to compress a single file; just use gzip:
ls -sR | gzip > documents_tree.txt.gz
You can then use gunzip documents_tree.txt.gz to uncompress it, or tools like gzcat and zless to view it without having to uncompress it first.
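For example (the exact tool names vary by platform: gzcat may be spelled zcat, and zless may not be installed):
zless documents_tree.txt.gz
gzcat documents_tree.txt.gz | head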

Building upon your comment on the OP and using your initial command, the following works for me:
ls -sR > documents_tree.txt && tar -cvzf documents_tree.tgz documents_tree.txt
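To check your work without extracting anything to disk (assuming GNU tar; -O writes the extracted data to stdout):
tar -tzf documents_tree.tgz
tar -xzOf documents_tree.tgz documents_tree.txt | head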

Related

How to `scp` directory preserving structure but only pick certain files?

I need to secure copy (scp) a directory, with its sub-structure preserved, from the UNIX command line. The subdirectories have identically named files that I WANT and a bunch of other stuff that I don't. Here is what the structure looks like:
directorytocopy
    subdir1
        1.wanted
        2.wanted
        ...
        1.unwanted
        2.notwanted
    subdir2
        1.wanted
        2.wanted
        ...
        1.unwanted
        2.notwanted
    ..
I just want the .wanted files preserving the directory structure. I realize that it is possible to write a shell (I am using bash) script to do this. Is it possible to do this in a less brute force way? I cannot copy the whole thing and delete the unwanted files because I do not have enough space.
Adrian has the best idea to use rsync. You can also use tar to bundle the wanted files:
cd directorytocopy
shopt -s nullglob globstar
tar -cf - **/*.wanted | ssh destination 'cd dirToPaste && tar -xvf -'
Here, tar's -f option is given the filename -, which makes it use stdin/stdout as the archive file.
This is untested, and may fail because the archive may not contain the actual subdirectories that hold the "wanted" files.
Assuming GNU tar on the source machine, and assuming that filenames of the wanted files won't contain newlines and they are short enough to fit the tar headers:
find /some/directory -type f -name '*.wanted' | \
tar cf - --files-from - | \
ssh user@host 'cd /some/other/dir && tar xvpf -'
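If you cannot rule out unusual filenames, a safer variant passes NUL-terminated names instead (this assumes GNU find and GNU tar, and is likewise untested here):
find /some/directory -type f -name '*.wanted' -print0 | \
tar cf - --null --files-from - | \
ssh user@host 'cd /some/other/dir && tar xvpf -'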
rsync with an --exclude/--include list, following Adrian Frühwirth's suggestion, would be a way to do this.
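A sketch of that rsync approach (host and destination path are placeholders): the first --include keeps the directory skeleton, the final --exclude drops everything else, and -m prunes directories that would end up empty:
rsync -avm --include='*/' --include='*.wanted' --exclude='*' directorytocopy user@host:/dirToPaste/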

Searching a file inside a .tar.gz file by date in unix

Is there a way to search for a file inside a .tar.gz file without extracting it? If there is, is there a way to search for that file by date?
My OS is AIX.
Thanks!
tar can be instructed to preserve atimes on files it archives, but not all tars do this, and I am unfortunately not familiar with AIX-specific tar in this case. What you need to know is whether tar was invoked with --atime-preserve (AIX tar may not support this; be sure to check), and when you call an extraction you must use the -p flag. So, you'd have something like this:
tar zxpf file.tar.gz the/file/you/want.txt
You will likely find that a traditional Unix (as opposed to GNU/Linux) tar won't support the -j and -z options, so you would have to use:
gzip -dc file.tar.gz | tar xf - the/file/you/want.txt
to run the command from a pipe. In this case, you would need to know the name of the file you want extracted, which you can get from:
tar tf file.tar.gz
using compression as required. Obviously you can tack on a | grep foo if you are looking for a file named foo.
It is not, I think, possible to extract a file from tar based upon the modification date of the file in the tarball – at least I was not able to find support for such in the documentation. Remember, tar is just the tape archiver and is not meant to do such fancy things. :-)
Lastly, you can do this:
tar xvf file.tar `tar tf file.tar | grep foo`
if you want to pull out all the files matching 'foo' from file.tar (compression above yada yada). I do not suggest running that command on an actual tape drive!
Alternatively, to search for a file by name without extracting anything, list the archive contents and grep the output:
$ tar tzf archive.tar.gz | grep "search"
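For the date part of the question, a verbose listing includes each member's modification time, which you can then filter; the exact date format varies between AIX tar and GNU tar, so check one line of output first:
gzip -dc file.tar.gz | tar tvf - | grep 'May 14'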

Bash create .tar.gz on Solaris

I'm writing a bash script which should create a .tar.gz archive from a specified directory, including the file structure. This is homework and I need it to work on the Solaris system my school uses. That means I can't use tar like this
tar cvzf archive.tar.gz ./
because that system has an older version of tar without the -z option. So I use it like this:
tar cvf - ./ | gzip > archive.tar.gz
It works fine, except for one strange thing: when I examine the contents of the archive, it contains itself (in other words, it contains "archive.tar.gz"). How can I prevent this?
This works:
tar cvf - ./ | gzip > archive.tar.gz
The thing is, the file named "archive.tar.gz" is created immediately when you run your command, before gzip is even called. It's just a blank file at that point, but it is in the directory. To prevent it from being included in the resulting archive, you can modify your command in one of the following ways:
tar cvf - ./ | gzip > ../archive.tar.gz
tar cvf - {path_to_dir_you_want_to_compress_files_from} | gzip > archive.tar.gz
Sadly, I can't check whether either of these works, because I don't have Solaris anywhere. Please let me know if either of them does.
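Another way to sidestep the problem on an old tar with no exclude support is to run the command from the parent directory, so the output file is never inside the tree being archived (the directory name below is a placeholder):
cd /path/to/parent
tar cvf - dir_to_compress | gzip > archive.tar.gz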

Unix tar: do not preserve full pathnames

When I try to compress files and directories with tar using absolute paths, the absolute path is preserved in the resulting compressed file. I need to use absolute paths to tell tar where the folder I wish to compress is located, but I only want it to compress that folder – not the whole path.
For example, tar -cvzf test.tar.gz /home/path/test – where I want to compress the folder test. However, what I actually end up compressing is /home/path/test. Is there anything that can be done to avoid this? I have tried playing with the -C operand to no avail.
This is ugly... but it works...
I had this same problem, but with multiple folders; I just wanted to flatten all the files out. You can use the --transform option to pass a sed expression, and it works as expected.
This is the expression:
's/.*\///g' (deletes everything up to and including the last '/')
This is the final command:
tar --transform 's/.*\///g' -zcvf tarballName.tgz */*/*.info
Use -C to specify the directory from which the files look like you want, and then specify the files as seen from that directory:
tar -cvzf test.tar.gz -C /home/path test
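A quick check (the listing shown is illustrative) confirms that only the folder name is stored, not the absolute path:
$ tar -tzf test.tar.gz
test/
test/file1.txt
test/subdir/file2.txt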
A multi-directory example:
tar cvzf my.tar.gz example.zip -C dir1 files_under_dir1 -C dir2 files_under_dir2
The files under dir1 and dir2 are stored without their leading paths. (Note that with GNU tar, a relative path given to a later -C option is resolved from the previous -C directory.)
tar can perform transformations on the filenames on the way in and out of the archive. Consider this example that stores a bunch of files in a flat tarfile:
From the home (~/) directory:
find logger -name \*.sh | tar -cvf test.tar -T - --xform='s|^.*/||' --show-transformed
The -T - option tells tar to read the list of files from stdin; --xform='s|^.*/||' applies the sed expression to all the filenames after they are read and before they are stored. --show-transformed is just a nicety that shows you the file names after they are transformed; the default is to show the names as they are read.
There are no doubt other ways besides using find to specify files to archive. For instance, if you have globstar set in bash (shopt -s globstar), you can use ** patterns to wildcard any number of directories, shortening the previous command to this:
tar -cvf test.tar --xform='s|^.*/||' --show-transformed logger/**/*.sh
You’ll have to judge what is best for your situation given the files you’re after.
find -type f | tar --transform 's/.*\///g' -zcvf comp.tar.gz -T -
Here find -type f finds all the files in the directory tree, and tar with --transform archives them without including the folder structure. This is very useful if you want to compress only the files that are the result of a certain search, or the files of a specific type, like:
find -type f -name "*.txt" | tar --transform 's/.*\///g' -zcvf comp.tar.gz -T -
Unlike the other answers, you don't have to include */*/* specifying the depth of the directory. find handles that for you.

HP-UX - How can I read a text file from tar archive without extracting it?

I have a tar archive which contains several text files. I would like to write a script to display (stdout) the content of a file without extracting it to the current directory.
Actually I would like to do the same as:
tar xf myArchive.tar folder/someFile.txt
cat folder/someFile.txt
rm -R folder
but without the rm...
I tried this way but it didn't work:
tar tf myArchive.tar folder/someFile.txt | cat
Thanks
Use x to extract and f to read from an archive file, then add the -O option to direct the extracted files to standard output:
tar -xOf myArchive.tar folder/someFile.txt
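You can then pipe the output to whatever you need, for example (assuming your tar supports -O; on HP-UX you may need GNU tar for this):
tar -xOf myArchive.tar folder/someFile.txt | more
tar -xOf myArchive.tar folder/someFile.txt | grep 'pattern'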
