using "dd" to capture and restore fails? - amazon-ec2

I used dd to capture the two local VM partitions like this:
# dd if=/dev/sda1 | gzip >mySda1.gz
# dd if=/dev/sda2 | gzip >mySda2.gz
Then I attached two volumes of sufficient size to an already running instance and mounted them (as /mnt/one and /mnt/two), copied the .gz files up to the instance, and used these commands to restore the partitions:
# gunzip -c mySda1.gz | dd of=/dev/xvdk
# gunzip -c mySda2.gz | dd of=/dev/xvdl
The gunzip commands do not report any failure, but when I then go to /mnt/one and run ls -a, there is nothing there. Why is this? The .gz files are very large, so why does the mounted partition show as empty even though the gunzip command completed?

Before you can write directly to a partition, you must first ensure that it is unmounted.
Linux will not notice if you write directly to the disk behind its back; more importantly, it assumes that this will not happen, and it will likely get very confused if you modify a mounted file system that way.
So, the correct procedure would be as follows:
umount /dev/xvdk
gunzip -c mySda1.gz | dd of=/dev/xvdk
mount /dev/xvdk /mnt/one
and again for /dev/xvdl.
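For completeness, a minimal sketch of the whole restore for both volumes, assuming the mount points /mnt/one and /mnt/two from the question; the bs=1M block size is only a throughput hint, and file -s is just a sanity check that a file system actually landed on each device:
umount /mnt/one /mnt/two                      # stop using the raw devices before writing to them
gunzip -c mySda1.gz | dd of=/dev/xvdk bs=1M   # restore the first partition image
gunzip -c mySda2.gz | dd of=/dev/xvdl bs=1M   # restore the second partition image
file -s /dev/xvdk /dev/xvdl                   # should report a file system type, not just "data"
mount /dev/xvdk /mnt/one
mount /dev/xvdl /mnt/two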

Related

rsync over ssh results in 0 files, but no error message

I'm trying to rsync a large directory of around 200 GB from a server to my local external hard drive. I can ssh onto the server and see the directory fine, and I can cd into the external hard drive fine. When I try to rsync the files across, I don't get an error, but the last line of the rsync output is 'total size is 0 speedup is 0.00', and there are no files in the destination directory.
Here's how I ssh onto the server successfully:
ssh skgtmdf@live.rd.ucl.ac.uk
Here's my rsync command:
rsync -avrt -progress -e "ssh skgtmdf@live.rd.ucl.ac.uk:/mnt/gpfs/live/rd01__/ritd-ag-project-rd012x-smmca23/" "/Volumes/DUAL DRIVE/ONP/2022.08.10_results/"
And here's the rsync output:
sending incremental file list
drwxrwxrwx 65,536 2022/08/10 21:32:06 .
sent 57 bytes received 64 bytes 242.00 bytes/sec
total size is 0 speedup is 0.00
What am I doing wrong?
The way you have it quoted, the source path is part of the remote shell option (-e value) rather than a separate argument as it should be.
rsync -avrt -progress -e "ssh skgtmdf@live.rd.ucl.ac.uk:/mnt/gpfs/live/rd01__/ritd-ag-project-rd012x-smmca23/" "/Volumes/DUAL DRIVE/ONP/2022.08.10_results/"
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is all part of the `-e` option value
This means rsync doesn't see that as a sync source at all, but just part of the command it'll use to connect to the remote system. I'm not sure why this doesn't lead to an error. In any case, the fix is simple: don't include ssh with the source path.
As I noticed later (see comments) the --progress option needs a double-dash or it'll be wildly misparsed. Fixing both of these things gives:
rsync -avrt --progress -e ssh "skgtmdf@live.rd.ucl.ac.uk:/mnt/gpfs/live/rd01__/ritd-ag-project-rd012x-smmca23/" "/Volumes/DUAL DRIVE/ONP/2022.08.10_results/"
In fact, since ssh is the default command for making a remote connection, you can leave off -e ssh entirely:
rsync -avrt --progress "skgtmdf@live.rd.ucl.ac.uk:/mnt/gpfs/live/rd01__/ritd-ag-project-rd012x-smmca23/" "/Volumes/DUAL DRIVE/ONP/2022.08.10_results/"
rsync -azve ssh user@host:/src/ target/
Normally you don't need to wrap the -e flag's value in quotes; the quoting is probably what is mangling the connection string.
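To illustrate when quoting -e is actually useful: quote only the remote-shell command itself (for instance to pass extra ssh options), and keep the remote source path as its own argument. A sketch reusing the paths from the question; the ssh options shown are placeholders, not something the question requires:
rsync -avrt --progress -e "ssh -p 22 -o ServerAliveInterval=30" "skgtmdf@live.rd.ucl.ac.uk:/mnt/gpfs/live/rd01__/ritd-ag-project-rd012x-smmca23/" "/Volumes/DUAL DRIVE/ONP/2022.08.10_results/"
# a dry run first shows what would be transferred without copying anything
rsync -avrt --dry-run "skgtmdf@live.rd.ucl.ac.uk:/mnt/gpfs/live/rd01__/ritd-ag-project-rd012x-smmca23/" "/Volumes/DUAL DRIVE/ONP/2022.08.10_results/"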

Assign value to a variable 'during' command - bash

Apologies if the title isn't worded very well; it's hard to explain exactly what I'm trying to do without an example.
I am running a database backup command that creates a file with a timestamp. In the same command I am then uploading that file to a remote location.
pg_dump -U postgres -W -F t db > $backup_dir/db_backup_$(date +%Y-%m-%d-%H.%M.%S).tar && gsutil cp $backup_dir/db_backup_$(date +%Y-%m-%d-%H.%M.%S).tar $bucket_dir
As you can see, it creates the timestamp during the pg_dump command. However, by the second half of the command the timestamp will be different, so it won't find the file.
I'm looking for a way to 'save' or assign the value of the backup file name from the first half of the command, so that I can then use it in the 2nd half of the command.
Ideally this would be done across 2 separate commands however in this particular use case I'm limited to 1.
A variation of the advice already given in the comments:
fn=db_backup_$(date +%Y-%m-%d-%H.%M.%S).tar &&
pg_dump -U postgres -W -F t db > "$backup_dir/$fn" &&
gsutil cp "$backup_dir/$fn" "$bucket_dir"
The $fn var makes the whole thing shorter and more readable, too.
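If it really has to be a single command line (a cron entry, for example), the same idea still works, because the variable is assigned before either half runs. A sketch, assuming $backup_dir and $bucket_dir are already set:
fn="db_backup_$(date +%Y-%m-%d-%H.%M.%S).tar" && pg_dump -U postgres -W -F t db > "$backup_dir/$fn" && gsutil cp "$backup_dir/$fn" "$bucket_dir"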

Is there a way through which I can access, use and manipulate files from one server using shell scripting on another server?

I tried accessing files from the remote server "10.101.28.83" and manipulating them to create folders on the host server where the script is run. But the output of the echo "$(basename "$file")" command is *, which implies that the files are not being read from the remote server.
#!/bin/bash
#for file in /root/final_migrated_data/*; do
for file in root@10.101.28.83:/root/final_migrated_data/* ; do
    echo "$(basename "$file")"
    IN="$(basename "$file")"
    IFS='_'
    read -a addr <<< "$(basename "$file")"
    # addr[0] is case_type, addr[1] is case_no, addr[2] is case_year
    dir_path="/newdir1";
    backup_date="${addr[0]}_${addr[1]}_${addr[2]}";
    backup_dir="${dir_path}/${backup_date}";
    mkdir -p "${backup_dir}";
    cp /root/final_migrated_data/"${backup_date}"_* "${backup_dir}"
done
I expect the output of echo "$(basename "$file")" to be the list of files present at /root/final_migrated_data/ on the remote server, but the actual output is *.
You can use sshfs. As the name suggests, sshfs lets you mount locally (for both reading and writing) a remote filesystem to which you have SSH access. As long as you already know SSH, its usage is very straightforward:
# mount the remote directory on your local machine:
sshfs user@server.com:/my/directory /local/mountpoint
# manipulate the filesystem just like any other (cd, ls, mv, rm…)
# unmount:
umount /local/mountpoint
If you need to mount the same remote filesystem often, you can even add it to your /etc/fstab; refer to the documentation for how to do that.
Note, however, that an SSH filesystem is slow, because every filesystem operation involves fetching or sending data over the network, and sshfs is not particularly optimized to mitigate that (it does not cache file contents, for instance). Other solutions exist that may be more complex to set up but offer better performance.
See for yourself whether speed is a problem. In your case, if you are simply copying files from one place on your server to another place on your server, it seems rather absurd to make them transit through your home computer and back again. It may be more advantageous to simply run your script on the server directly.
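For instance, the loop from the question could run unchanged over an sshfs mount. Here /mnt/remote is an arbitrary mount point chosen for this sketch; the remote host and path are taken from the question:
mkdir -p /mnt/remote
sshfs root@10.101.28.83:/root/final_migrated_data /mnt/remote
# the files now appear as ordinary local paths, so the original glob works
for file in /mnt/remote/*; do
    echo "$(basename "$file")"
done
# unmount when finished (fusermount -u /mnt/remote also works)
umount /mnt/remote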

BASH 'df' command showing the same numbers for all directories?

I'm trying to get the disk usage of everything within certain directories, which I've been attempting to do with commands like this:
df -h -k /var/www/html/exampledirectory1
df -h -k /var/www/html/exampledirectory2
df -h -k /var/www/html/exampledirectory3
The problem is that every single directory on the server (even if I just run 'df -h' while inside a particular directory) gives me the exact same numbers, down to the kilobyte.
Obviously this can't be correct, but I have no idea what it is I'm doing wrong. Can anyone help me out?
(I'm using BASH version 4.2.25 and I'm running Ubuntu 14.10)
You want to use the du command. df reports the usage of the whole partition (filesystem) that contains the path you give it, which is why every directory on the same partition shows identical numbers. Here is an example that determines the disk space used by a directory and all of its sub-directories:
du -sh /home/darwin
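For the directories from the question, something like this should give per-directory totals; -s prints one summary line per argument and -h makes the sizes human-readable:
du -sh /var/www/html/exampledirectory1 /var/www/html/exampledirectory2 /var/www/html/exampledirectory3
# or one line per immediate subdirectory of /var/www/html, sorted by size
du -sh /var/www/html/* | sort -h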

Instance of Google Compute Engine freezes trying to upload files to Google Cloud Storage

I have written this shell script, which downloads archives from a URL list, decompresses them, and finally moves them into a Cloud Storage bucket.
#!/bin/bash
# declare STRING variable
for iurl in $(cat ./html-rdfa.list); do
    filename=$(basename "$iurl")
    file="${filename%.*}"
    if gsutil ls gs://rdfa/$file; then
        echo "yes"
    else
        wget $iurl
        gunzip $filename
        gsutil cp -n $file gs://rdfa
        rm $file
        sleep 2
    fi
done
html-rdfa.list contains the URL list. The instance was created using the Debian 7 image provided by Google.
The script runs correctly for the first 5 or 6 files, but then the instance freezes and I have to delete it. Neither the RAM nor the disk of the instance is full when it freezes.
I think the problem is caused by the gsutil cp command, but it is strange that the CPU load is practically 0 and the RAM is free, yet it is impossible to use the instance without restarting it.
Are you writing the temporary files to the default 10GB root disk? If so, you may be running into the Persistent Disk throughput caps. To see if this is the case, create a new Persistent Disk, then mount it as a data disk and use that disk for the temporary files. Consider starting with ~200GB disk and see if that is enough throughput for your workload. Also, see the docs on Persistent Disk performance.
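A rough sketch of that setup; the disk name, instance name, zone and mount point below are all placeholders, and the device may appear under a different name than /dev/sdb on your instance:
# create a 200GB persistent disk and attach it to the instance (names and zone are placeholders)
gcloud compute disks create scratch-disk --size 200GB --zone us-central1-a
gcloud compute instances attach-disk my-instance --disk scratch-disk --zone us-central1-a
# on the instance: format the new disk, mount it, and run the download/gunzip/gsutil loop from there
sudo mkfs.ext4 -F /dev/sdb
sudo mkdir -p /mnt/scratch
sudo mount /dev/sdb /mnt/scratch
cd /mnt/scratch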

Resources