How to extract only if it is not already extracted? - makefile

I have a makefile which has a rule to extract an archive. It does this even when the archive has already been extracted (and there were no changes to it).
all:
	tar zxvf soplex-1.7.2.tgz
Is there a way to prevent this? I tried using the k flag so that tar keeps the existing files, but it gives me this:
soplex-1.7.2/src/vector.cpp
tar: soplex-1.7.2/src/vector.cpp: Cannot open: File exists

This isn't exactly good make practice, but this sort of operation doesn't really fit into the way make does things either (unless you want to use a known sentinel file from inside the tarball as your marker; there is a sketch of that below).
all:
	tar -df soplex-1.7.2.tgz 2>/dev/null || tar -xvf soplex-1.7.2.tgz
(You can manually supply the z flag to tar if your tar can't figure out that it needs it itself.)
Also note that this is very expensive in the case that one of the later files in the tarball is the one that is missing/modified since it requires two sequential scans of the entire tarball and the related disk activity.
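If you do want the sentinel approach instead, here is a minimal sketch of it, using soplex-1.7.2/src/vector.cpp (one file known from the output above to be inside the tarball; any file from it would do) as the marker:

# Use one file known to be inside the tarball as the "already extracted" marker.
all: soplex-1.7.2/src/vector.cpp

soplex-1.7.2/src/vector.cpp: soplex-1.7.2.tgz
	tar zxvf soplex-1.7.2.tgz
	touch $@

The trailing touch guards against the timestamp stored in the tarball being older than the tarball itself, which would otherwise make the rule re-run every time.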

make rebuild target depending on zip file

Why does make rebuild the target (I suppose) if the dependency is a binary file?
To reproduce:
create (and enter) a new empty directory
download the GameLift SDK (it is just an example: the Makefile in this question merely uses this file as an example)
create a simple Makefile with the content below
run the make command several times
all: GameLift_12_22_2020/GameLift-SDK-Release-4.0.2/GameLift-Cpp-ServerSDK-3.4.1/CMakeLists.txt

GameLift_12_22_2020/GameLift-SDK-Release-4.0.2/GameLift-Cpp-ServerSDK-3.4.1/CMakeLists.txt: GameLift_12_22_2020.zip
	unzip -oq GameLift_12_22_2020.zip
I would have expected the unzip command to be executed only the first time I issue the make command, but it continues to be executed on subsequent make runs... why?
There are two possibilities; we cannot know which is the case from the information you've provided.
The first is that the file GameLift_12_22_2020/GameLift-SDK-Release-4.0.2/GameLift-Cpp-ServerSDK-3.4.1/CMakeLists.txt is not present in the zip file, so the second time make runs it looks to see if that file exists and it doesn't, so it re-runs the rule. If, in the same directory you run make, you use ls GameLift_12_22_2020/GameLift-SDK-Release-4.0.2/GameLift-Cpp-ServerSDK-3.4.1/CMakeLists.txt (after the unzip runs) and you get "file not found" or similar, this is your problem.
If that's not it, then the problem is that the timestamp of the file in the zip file is older than the zip file itself, and when unzip unpacks the file it sets the timestamp to this older time.
So when make goes to build, it finds the CMakeLists.txt file, but its modification time is older than that of the zip file, so make unpacks the zip file again to try to update it.
You can use ls -l to see the modification time on that file. If this is the case, you should touch the file when you unpack it so that it's newer:
GameLift_12_22_2020/GameLift-SDK-Release-4.0.2/GameLift-Cpp-ServerSDK-3.4.1/CMakeLists.txt: GameLift_12_22_2020.zip
	unzip -oq GameLift_12_22_2020.zip
	touch $@

How to tar a folder while files inside it might be being written by some other process

I am trying to create a script for a cron job. I have a folder of around 8 GB containing thousands of files. I am trying to create a bash script which first tars the folder and then transfers the tarred file to an FTP server.
But I am not sure what happens if, while tar is tarring the folder, some other process is accessing files inside it or writing to them.
It is fine for me if the tarred file does not contain the most recent changes made while tar was tarring the folder.
Please suggest the proper way. Thanks.
tar will happily tar "whatever it can". But you will probably have some surprises when untarring, as tar records the size of each file before tarring it.
A very unpleasant surprise would be: if a file was truncated while it was being tarred, tar will "fill" it with NUL characters to match its recorded size. This can have very unpleasant side effects. In some cases, tar will say nothing when untarring and silently add as many NUL characters as it needs to match the size (in fact, on Unix it doesn't even need to do that: the OS does it, see "sparse files"). In other cases, if truncation occurred while the file was being tarred, tar will complain about an Unexpected End of File when untarring (as it expected XXX bytes but read fewer than that), but will still say that the file should be XXX bytes (and Unix OSes will then create it as a sparse file, with NUL characters magically appended at the end to match the expected size when you read it).
(To see the NUL characters: an easy way is to less thefile (or cat -v thefile | more on a very old Unix). Look for any ^@.)
On the contrary, if files are only appended to (logs, etc.), the side effect is less problematic: you will only miss the last bits of them (which you say you're OK with), and you won't have that unpleasant "fill with NUL characters" side effect. tar may complain when untarring the file, but it will untar it.
I think tar fails (and so does not create the archive) when an archived file is modified during archiving. As Etan said, the solution depends on what you finally want in the tarball.
To avoid a tar failure, you can simply COPY the folder elsewhere before calling tar. But in this case, you cannot be confident in the consistency of the backed-up directory: it is NOT an atomic operation, so some files will be up to date while others will be outdated. Whether that is a severe issue depends on your situation.
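A rough sketch of that copy-first idea (the /data and /tmp/backup paths are placeholders, not from the question):

# Copy first so that tar reads a tree nothing else is writing to.
# /data is the live folder, /tmp/backup is scratch space with enough room.
mkdir -p /tmp/backup
cp -a /data /tmp/backup/data
tar czf /tmp/backup/data.tar.gz -C /tmp/backup data
rm -rf /tmp/backup/data

Note that this needs roughly twice the space of the original folder while it runs.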
If you can, I suggest you configure how these files are created. For example: "only recent files are appended to; files older than 1 day are never changed". In that case you can easily back up only the old files, and the backup will be consistent.
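As a sketch of that "only old files" idea, assuming GNU find and GNU tar and using /data as a placeholder path:

# Archive only files whose last modification is more than 24 hours old.
cd /data || exit 1
find . -type f -mtime +0 -print0 > /tmp/stable-files.list
tar czf /tmp/backup-old-files.tar.gz --null -T /tmp/stable-files.list

The -print0/--null pair keeps file names containing spaces or newlines intact.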
More generally, you either have to accept losing the latest data AND being inconsistent (each file is backed up at a different time), or you have to act at a different level. I suggest:
Configure the software that produces the data so that you get a consistent point to back up from
Or use OS/virtualization features. For example, it is possible to take a consistent snapshot of a storage volume on some virtual storage...

Faster Alternatives to cp -l for a Whole File Structure?

Okay, so I need to create a copy of a file structure; however, the structure is huge (millions of files) and I'm looking for the fastest way to copy it.
I'm currently using cp -lR "$original" "$copy" however even this is extremely slow (takes several hours).
I'm wondering if there are any faster methods I can use? I know of rsync --link-dest, but that isn't any quicker either, and I really need it to be quicker, as I want to create these snapshots every hour or so.
The alternative is copying only changes (which I can find quickly) into each folder then "flattening" them when I need to free up space (rsync newer folders into older ones until the last complete snapshot is reached), but I would really rather that each folder be its own complete snapshot.
Why are you discarding --link-dest? I use a script with that option to take snapshots pretty often, and the performance is pretty good.
In case you reconsider, here's the script I use: https://github.com/alvaroreig/varios/blob/master/incremental_backup.sh
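In case it helps, the heart of such a script is a single rsync call along these lines (the snapshot paths are placeholders; --link-dest makes every unchanged file a hard link into the previous snapshot instead of a fresh copy):

# Previous snapshot: /snapshots/prev   New snapshot: /snapshots/new
rsync -a --delete --link-dest=/snapshots/prev "$original"/ /snapshots/new/

Only changed files cost copy time and disk space; everything else becomes a hard link, which is the same effect cp -lR aims for.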
If you have pax installed, you can use it. Think of it as tar or cpio, but standard, per POSIX.
#!/bin/sh
# Copy "somedir/tree" to "$odir/tree".
itree=$1
odir=$2
base_tree=$(basename "$itree")
# -rw is pax's copy ("pass") mode; the -s substitution rewrites the stored
# path so the tree lands directly under $odir instead of under $odir/$itree.
pax -rw "$itree" -s "#$itree#$base_tree#g" "$odir"
The -s replstr is an unfortunate necessity (you'll get $odir/$itree otherwise), but it works nicely, and it has been quicker than cp for large structures thus far.
As someone already suggested, tar is of course another option if you don't have pax.
Depending on the files, you may achieve performance gains by compressing:
cd "$original && tar zcf - . | (cd "$copy" && tar zxf -)
This creates a tarball of the "original" directory, sends the data to stdout, then changes to the "copy" directory (which must exist) and untars the incoming stream.
For the extraction, you may want to watch the progress: tar zxvf -

Batch script to move files into a zip

Is anybody able to point me in the right direction for writing a batch script for a UNIX shell to move files into a zip one at a time and then delete the originals?
I can't use the standard zip command because I don't have enough space to fit the zip being created.
So, any suggestions please?
Try this:
zip -r -m source.zip *
Not a great solution but simple: I ended up finding a Python script that recursively zips a folder, and I just added a line to delete each file after it is added to the zip.
You can achieve this using find, as follows:
find . -type f -print0 | xargs -0 -n1 zip -m archive
This will move every file into the zip preserving the directory structure. You are then left with empty directories that you can easily remove. Moreover using find gives you a lot of freedom on what files you want to compress.
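For the leftover empty directories, one pass with GNU find is enough:

# Remove the now-empty directories left behind after the files were moved into the zip.
find . -mindepth 1 -depth -type d -empty -delete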
I use:
zip --move destination.zip src_file1 src_file2
Here is the description of the --move option from the man page:
--move
Move the specified files into the zip archive; actually, this
deletes the target directories/files after making the specified zip
archive. If a directory becomes empty after removal of the files, the
directory is also removed. No deletions are done until zip has
created the archive without error. This is useful for conserving disk
space, but is potentially dangerous so it is recommended to use it in
combination with -T to test the archive before removing all input
files.

bash scripting..copying files without overwriting [closed]

I would like to know if it is possible to copy/move files to a destination based on the origin name.
Basically, I have a /mail folder, which has several subfolders such as cur and new etc. I then have an extracted backup in /mail/home/username that is a duplicate. mv -f will not work, as I do not have permission to overwrite the directories, but only the files within.
I get errors such as mv: cannot overwrite directory `/home/username/mail/username.com'
What I want to do is, for each file in the directory username.com, move it to the folder of the same name in /mail. There could be any number of folders in place of username.com, with separate subdirectories of their own.
What is the best way to do this?
I have to do it this way as due to circumstances I only have access to my host with ftp and bash via php.
edit: clarification
I think I need to clarify what happened. I am on a shared host, and apparently do not have write access to the directories themselves. At least the main ones such as mail and public_html. I made a backup of ~/mail with tar, but when trying to extract it extracted to ~/mail/home/mail etc, as I forgot about the full path. Now, I cannot simply untar because the path is wrong, and I cannot mv -f because I only have write access to files, not directories.
For copying, you should consider using cpio in 'pass' mode (-p):
cd /mail; find . -type f | cpio -pvdmB /home/username/mail
The -v is for verbose; -d creates directories as necessary; -m preserves the modification times on the files; -B means use a larger block size, and may be irrelevant here (it used to make a difference when messing with tape devices). Omitted from this list is the -u flag that does unconditional copying, overwriting pre-existing files in target area. The cd command ensures that the path names are correct; if you just did:
find /mail -type f | cpio -pvdmB /home/username
you would achieve the same result, but only by coincidence - because the sub-directory under /home/username was the same as the absolute pathname of the original. If you needed to do:
find /var/spool/mail -type f | cpio -pvdmB /home/username/mail
then the copied files would be found under /home/username/mail/var/spool/mail, which is unlikely to be what you had in mind.
You can achieve a similar effect with (GNU) tar:
(cd /mail; tar -cf - . ) | (cd /home/username/mail; tar -xf - )
This copies directories, not just files. To copy only the files, you need GNU-only facilities:
(cd /mail; find . -type f | tar -cf - -T - ) | (cd /home/username/mail; tar -xf - )
The first solo dash means 'write the archive to stdout'; the '-T' option means 'read the file names to copy from the named file', and the dash after it means 'read them from stdin'.
I'm not entirely clear on what it is that you want to do, but you could try the following:
for file in /mail/*; do
    mv -f "$file" /home/username/mail/"$(basename "$file")"
done
This will move every file and subdirectory in /mail from there into /home/username/mail.
Is using tar an option? You could tar up the directory and extract it under /mail/ (I am assuming that is roughly what you want), with tar overwriting existing files and directories.
I'm a bit confused about what it is exactly that you want to do. But you should be able to use the approach of Adam's solution and redirect the errors to a file.
for file in /mail/*; do
    mv -f "$file" /home/username/mail/"$(basename "$file")" 2>> /tmp/mailbackup.username.errors
done
Directories will not be overwritten, and you can check the file to make sure it only contains errors you anticipated.
Can you untar it again? The -P option to tar will not strip leading "/", so the absolute pathnames will be respected. From your edit, it sounds like this'll fix it.
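For example, something like this (the archive name is a placeholder), assuming the archive really does store the full /home/username/... paths:

# Re-extract while keeping any leading "/" stored in the archive.
tar -xPvf mailbackup.tar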
Even with your clarification I'm still having a problem understanding exactly what you're doing. However, any chance you can use rsync? The src and dest hosts can be the same host for rsync. As I recall, you can tell rsync to only update files that already exist in the destination area (--existing) and also to ignore directory changes (--omit-dir-times).
Again, I'm not quite understanding your needs here, but rsync is very flexible in backing up files and directories.
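A rough sketch of that, with the stray extracted copy and the real mail directory as placeholder paths:

# Update only files that already exist under /mail; never create new files
# or directories there, and leave directory timestamps alone.
rsync -av --existing --omit-dir-times /mail/home/username/mail/ /mail/

rsync will then quietly skip anything it would otherwise have had to create.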
Good luck.
