Decompressed gzipped file disappears? - amazon-ec2

I have an Amazon EC2 instance running CentOS. Unfortunately I don't have a GUI. I tried setting up X11 forwarding, but apparently it works differently with Ubuntu than it does with CentOS. But that's not the point. I downloaded a pretty large .gz file (8.7 GB) and extracted it using the following command:
gzip -d [filename] &
It took nearly an hour to decompress, and using ls -l I could see that the uncompressed directory was going to be nearly 30 GB. Anyway, the process finishes, and when I ls again the directory is nowhere to be found. I tried ls -a as well, but still nothing. Any thoughts on this?

This sounds like gzip is silently failing when it runs out of space. How large is your instance's EBS volume / local disk that you're unzipping onto? (run df -h and figure out which device you're unzipping in.)
Additionally, you could try running gzip in verbose mode to catch any errors it might not otherwise show. I don't have a CentOS machine handy, but you might be able to use gzip -l [filename] to figure out whether the decompressed file is too big for the target filesystem.
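For example, a quick way to check (a hedged sketch; the filename is a placeholder for your actual .gz, run from the directory you're decompressing into):
# How big will the output be? gzip -l shows compressed vs. uncompressed size.
# (Note: for files this large the reported uncompressed size can be inaccurate,
# since the gzip header only stores it modulo 4 GiB.)
gzip -l yourfile.gz
# How much free space does the target filesystem have?
df -h .
# Decompress verbosely and print the exit status so a failure isn't silent.
gzip -dv yourfile.gz; echo "gzip exit status: $?"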

Related

wget hangs after large file download

I'm trying to download a large file (5 GB) over FTP. Here is my script:
read ZipName
wget -c -N -q --show-progress "ftp://Password#ftp.server.com/$ZipName"
unzip $ZipName
The file downloads to 100% but never gets to the unzip command. There's no special error message and no output in the terminal, just a blank new line. I have to send CTRL+C and rerun the script to unzip, since wget then detects that the file is already fully downloaded.
Why does it hang like this? Is it because of the large file, or because of passing an argument to the command?
By the way, I can't use the ftp client because it's not installed on the VM I'm working on, and it's a temporary VM, so I have no root privileges to install anything.
I've run some tests, and I think the disk size was the reason.
I tried curl -O instead, and it worked with the same disk space.
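For reference, a hedged sketch of the curl-based version described above; the host and credentials are placeholders standing in for the ones in the original script:
read -r ZipName
curl -O "ftp://user:password@ftp.example.com/$ZipName" && unzip "$ZipName"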

Bash script: get folder size through FTP

I searched a lot and googled a lot too but I didn't find anything...
I'm using bash on Linux...
I have to download a certain folder from an FTP server (I know FTP is deprecated; I can't use FTPS or SFTP right now, and by the way I'm on a local network).
I want to do a sort of integrity check of the downloaded folder, which has a lot of subfolders and files, so I chose to compare folder sizes as a test.
I'm downloading through wget, but my question is... how can I check the folder size BEFORE downloading it, so that I can store the size in a file and then compare it with the size of the downloaded copy? Over FTP, that is.
I tried a simple curl against the parent directory, but there is no size information there...
Thanks for the help!!
wget will recursively download all the directory content, one file at a time. There's no way to determine the size of an entire directory using FTP, though it's possible using SSH.
I recommend installing an SSH server on that machine; after gaining access, you can use the following command to get the size of the desired directory:
du -h desired_directory | tail -n 1
I don't really recommend this method, though; it's more reliable to get checksums of the remote content and compare them with your downloaded content. That's what many download clients already do to check file integrity.
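A hedged sketch of that checksum approach, assuming you do get SSH access to the server (user, host and paths are placeholders):
# On the server: hash every file under the directory you plan to download.
ssh user@server 'cd /path/to/desired_directory && find . -type f -exec sha256sum {} +' > remote.sha256
# After downloading: verify the local copy against those hashes.
(cd desired_directory && sha256sum -c ../remote.sha256)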
It basically depends on what your ftp client and your ftp server can do. With some I know, the default ls does the job and they even have a size command:
ftp> help size
size show size of remote file
ftp> size foo
213 305
ftp> ls foo
200 PORT command successful.
150 Opening ASCII mode data connection for file list
-rw-r--r-- 1 foo bar 305 Aug 1 2013 foo
226 Transfer complete.
You can't check folder size via the FTP protocol. Try mounting the remote folder from the main server via curlftpfs (if that's possible).
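A minimal sketch of that idea, assuming curlftpfs (and FUSE) is available; host, credentials and paths are placeholders:
mkdir -p /tmp/remote-ftp
curlftpfs ftp://user:password@ftp.example.com/ /tmp/remote-ftp
du -sh /tmp/remote-ftp/desired_directory   # folder size, as if it were local
fusermount -u /tmp/remote-ftp              # unmount when done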

I can unzip on a remote machine but not on my computer

On a cluster I zipped a large (61 GB, 9.2 GB when zipped) directory.
zip -r zzDirectory Directory
I then scp'd zzDirectory.zip to my personal computer.
scp -r name#host.com:/path/to/zzDirectory.zip path/in/my/computer/zzDirectory.zip
And finally I unzipped it. I tried to unzip it from bash, but it failed:
warning [zzDirectory.zip]: 5544449626 extra bytes at beginning or within zipfile
(attempting to process anyway)
error [zzDirectory.zip]: start of central directory not found;
zipfile corrupt.
(please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)
So I double-clicked the icon in the Finder and the system started to unzip zzDirectory.zip. However, some files are missing, and it looks like (I am not 100% sure yet) some newline characters (\n) are missing as well. unzip used to work fine on my computer before.
In order to investigate where the problem comes from, I unzipped zzDirectory.zip on the cluster, and everything seems to work fine there (no missing files).
I repeated the transfer and unzipped again, but the problem persists. Note that the transfers are made over the internet. My OS is Mac OS X Yosemite 10.10.2.
How can I solve this issue? I would prefer not to transfer the data unzipped because of bandwidth issues. Do you think I should try tar instead, or should I use specific options with the unzip command line?
On OS X you could try:
ditto -x -k the_over4gb.zip /path/to/dir/where/want/unzip
e.g:
ditto -x -k zzDirectory.zip .
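Since the question also asks about tar: a hedged sketch of that route (archive name and paths are placeholders mirroring the question), which may sidestep the large-zip handling that the stock OS X unzip sometimes struggles with:
# On the cluster: build a gzipped tarball instead of a zip.
tar czf zzDirectory.tar.gz Directory
# On the personal computer: copy it down and extract it.
scp name@host.com:/path/to/zzDirectory.tar.gz path/in/my/computer/
tar xzf path/in/my/computer/zzDirectory.tar.gz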

Rsync bash script and hard linking files

I am creating a bash script to backup my files with rsync.
Backups all come from a single directory.
I only want new or modified files to be backed up.
Currently, I am telling rsync to back up the dir and to check the files against the last backup.
The way I am doing this is:
THE_TIME=`date "+%Y-%m-%dT%H:%M:%S"`
rsync -aP --link-dest=/Backup/Current /usr/home/user/backup /Backup/Backup-$THE_TIME
rm -f /Backup/Current
ln -s /Backup/Backup-$THE_TIME /Backup/Current
I am pretty sure I have the syntax correct for this. Each backup will check against the "Current" folder and upload only as necessary. It will then delete the Current symlink and re-create it, pointing at the newest backup it just made.
I am getting an error when I run the script:
rsync: link "/Backup/Backup-2010-08-04-12:21:15/dgs1200series_manual_310.pdf"
=> /Backup/Current/dgs1200series_manual_310.pdf
failed: Operation not supported (45)
The host OS is running HFS filesystem, which supports hard linking. I am trying to figure out if something else is not supporting this, or if I have a problem in my code.
Thanks for any help
Edit:
I am able to create a hard link on my local machine.
I am also able to create a hard link on the remote server (when logged in locally)
I am NOT able to create a hard link on the remote server when mounted via afp. Even if both files exist on the server.
I am guessing this is a limitation of afp.
Just in case your command line is only an example: Be sure to always specify the link-dest directory with an absolute pathname! That’s something which took me quite some time to figure out …
Two things from the man page stand out that are worth checking:
If file's aren't linking, double-check their attributes. Also
check if some attributes are getting forced outside of rsync's
control, such a mount option that squishes root to a single
user, or mounts a removable drive with generic ownership (such
as OS X's “Ignore ownership on this volume” option).
and
Note that rsync versions prior to 2.6.1 had a bug that could
prevent --link-dest from working properly for a non-super-user
when -o was specified (or implied by -a). You can work-around
this bug by avoiding the -o option when sending to an old rsync.
Do you have the "ignore ownership" option turned on? What version of rsync do you have?
Also, have you tried manually creating a similar hardlink using ln at the command line?
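For example, a quick manual test (a hedged sketch; the mount point is a placeholder for wherever the AFP share is mounted):
# If hard links aren't supported on the mounted destination, this fails with
# "Operation not supported", just as rsync's --link-dest does.
touch /Volumes/Backup/linktest
ln /Volumes/Backup/linktest /Volumes/Backup/linktest.hardlink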
I don't know if this is the same issue, but I know that rsync can't sync a file when the destination is a FAT32 partition and the filename has a ":" (colon) in it. [The source filesystem is ext3, and the destination is FAT32]
Try reconfiguring the date command so that it doesn't use a colon and see if that makes a difference.
e.g.
THE_TIME=`date "+%Y-%m-%dT%H_%M_%S"`
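Putting the two suggestions together, a hedged rework of the script from the question (same paths as the original; only the colon-free timestamp and the quoting are new):
THE_TIME=$(date "+%Y-%m-%dT%H_%M_%S")
rsync -aP --link-dest=/Backup/Current /usr/home/user/backup "/Backup/Backup-$THE_TIME"
rm -f /Backup/Current
ln -s "/Backup/Backup-$THE_TIME" /Backup/Current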

Untar multipart tarball on Windows

I have a series of files named filename.part0.tar, filename.part1.tar, … filename.part8.tar.
I guess tar can create multiple volumes when archiving, but I can't seem to find a way to unarchive them on Windows. I've tried to untar them using 7zip (GUI & commandline), WinRAR, tar114 (which doesn't run on 64-bit Windows), WinZip, and ZenTar (a little utility I found).
All programs run through the part0 file, extracting 3 rar files, then quit reporting an error. None of the other part files are recognized as .tar, .rar, .zip, or .gz.
I've tried concatenating them using the DOS copy command, but that doesn't work, possibly because part0 through part6 and part8 are each 100 MB, while part7 is 53 MB and therefore likely the last part. I've tried several different logical orders for the files in the concatenation, but no joy.
Other than installing Linux, finding a live distro, or tracking down the guy who left these files for me, how can I untar these files?
Install 7-zip. Right click on the first tar. In the context menu, go to "7zip -> Extract Here".
Works like a charm, no command-line kung-fu needed:)
EDIT:
I only now noticed that you mention already having tried 7zip. It might have balked if you tried to "open" the tar by going "Open with" -> 7zip; its command-line handling for opening files is a little unorthodox, so you have to associate via 7zip instead of via the file association system built into Windows. If you try right-click -> "7-Zip" -> "Extract Here", though, that should work. I tested the solution myself (albeit on a 32-bit Windows box; I don't have a 64-bit one available).
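For reference, the command-line equivalent of that right-click extract (a hedged example, assuming 7z.exe is on your PATH):
7z x filename.part0.tar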
1) Download gzip for Windows (http://www.gzip.org/) and unpack it.
2) gzip -c filename.part0.tar > foo.gz
gzip -c filename.part1.tar >> foo.gz
...
gzip -c filename.part8.tar >> foo.gz
3) Unpack foo.gz.
Worked for me.
As above, I had the same issue and ran into this old thread. For me it was a severe case of RTFM when installing a Siebel VM. These instructions were straight from the manual:
cat \
OVM_EL5U3_X86_ORACLE11G_SIEBEL811ENU_SIA21111_PVM.tgz.1of3 \
OVM_EL5U3_X86_ORACLE11G_SIEBEL811ENU_SIA21111_PVM.tgz.2of3 \
OVM_EL5U3_X86_ORACLE11G_SIEBEL811ENU_SIA21111_PVM.tgz.3of3 \
| tar xzf -
Worked for me!
The tar -M switch should do it for you on Windows (I'm using tar.exe).
tar --help says:
-M, --multi-volume create/list/extract multi-volume archive
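A minimal sketch of that, assuming GNU tar and that the parts really were created as one multi-volume archive; tar will prompt for part1, part2, ... as it needs them:
tar --extract --multi-volume --file=filename.part0.tar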
I found this thread because I had the same problem with these files. Yes, the same exact files you have. Here's the correct order: 042358617 (i.e. start with part0, then part4, etc.)
Concatenate in that order and you'll get a tarball you can unarchive. (I'm not on Windows, so I can't advise on what app to use.) Note that of the 19 items contained therein, 3 are zip files that some unarchive utilities will report as being corrupted. Other apps will allow you to extract 99% of their contents. Again, I'm not on Windows, so you'll have to experiment for yourself.
Enjoy! ;)
This works well for me with multi-volume tar archives (numbered .tar.1, .tar.2 and so on) and even lets you --list or --get specific folders or files in them:
#!/bin/bash
TAR=/usr/bin/tar
ARCHIVE=bkup-01Jun
RPATH=home/user
RDEST=restore/
EXCLUDE=.*
mkdir -p $RDEST
$TAR vf $ARCHIVE.tar.1 -F 'echo '$ARCHIVE'.tar.${TAR_VOLUME} >&${TAR_FD}' -C $RDEST --get $RPATH --exclude "$EXCLUDE"
Copy to a script file, then just change the parameters:
TAR=location of tar binary
ARCHIVE=Archive base name (without .tar.multivolumenumber)
RPATH=path to restore (leave empty for full restore)
RDEST=restore destination folder (relative or absolute path)
EXCLUDE=files to exclude (with pattern matching)
The interesting thing for me is that you really DON'T pass the -M option yourself (in GNU tar, -F implies -M); using -M interactively would only prompt you with questions (insert next volume, etc.).
Hello, perhaps this will help.
I had the same problem...
An automatic backup of my web site, run on CentOS at 4 am, created multiple files in multi-volume tar format (saveblabla.tar, saveblabla.tar1.tar, saveblabla.tar2.tar, etc.).
After downloading these files to my PC (Windows), I couldn't extract them with either the Windows cmd or 7zip (unknown error).
I first did a binary copy to reassemble the tar files (as mentioned above in this thread):
copy /b file1+file2+file3 destination
After that, 7zip worked!!! Thanks for your help.
