I was under the impression that MinIO is well suited for small-file storage and reads (https://blog.min.io/minio-optimizes-small-objects/). I finally migrated my 2 million small text files, but the read speed is surprisingly slower than reading directly from the disk... Is there a way to compact/merge those small files? Or is there something I am doing wrong?
My usual use case: reading 10,000 random files.
Directly from the disk, I average around 120 seconds.
I transferred them to a local network solution: it took around 500-600 seconds to read.
Now with MinIO it's around 600 seconds.
Note: the disk is capable of much higher throughput, but only with large files; likewise, MinIO works great with large files.
Do you guys have any idea? I am really stuck :(
MinIO was never a good system for that kind of problem, I think; you just need to get faster hardware (a faster drive). An SSD should work fine for you.
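That said, reading 10,000 small objects one after another is mostly latency-bound, so overlapping the requests usually helps regardless of the backend. A minimal sketch using the Python minio client (the endpoint, credentials, bucket and object names are placeholders, not taken from the question):

```python
from concurrent.futures import ThreadPoolExecutor
from minio import Minio

# Placeholder endpoint/credentials/bucket - adjust to the actual deployment.
client = Minio("localhost:9000", access_key="minioadmin",
               secret_key="minioadmin", secure=False)

def fetch(name):
    """Download one small object and return its contents."""
    resp = client.get_object("textfiles", name)
    try:
        return resp.read()
    finally:
        resp.close()
        resp.release_conn()

# Hypothetical object names; substitute the real 10,000 keys.
object_names = [f"doc-{i}.txt" for i in range(10_000)]

# Overlap the GETs so per-request latency is hidden instead of paid 10,000 times.
with ThreadPoolExecutor(max_workers=32) as pool:
    contents = list(pool.map(fetch, object_names))
```

Whether this gets close to raw-disk speed depends on the deployment, but it is a cheaper experiment than new hardware.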
We have a weekly process that archives a large number of frequently changing files into a single tar file and synchronizes it to another host using rsync as follows (resulting in a very low speedup metric, usually close to 1.00):
rsync -avr <src> <dst>
Over the years, this archive has steadily grown and is now over 200 GB. With the increasing file size, rsync has come to a point where it takes about 20 hours to finish the synchronization. However, deleting the file at the destination before the rsync process starts causes the transfer to complete in only about 1 hour.
I understand that rsync's delta-transfer algorithm introduces some overhead, but the cost seems to grow much faster than linearly with very large files. If the actual transfer of bytes over the network takes 1 hour, what exactly is rsync doing in the remaining 19 hours?
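For context on where the time goes: with the delta-transfer algorithm the receiver checksums every block of its existing 200 GB copy, and the sender then slides a window over the new file, computing a weak checksum at essentially every byte offset while it searches for blocks the receiver already has. Both sides therefore do work proportional to the full file size even when little has changed. A much-simplified toy sketch of that matching loop (block size and checksum choices are for brevity, not what rsync actually uses):

```python
import hashlib
import zlib

BLOCK = 4096  # toy block size; rsync derives its own from the file size

def block_checksums(old_data):
    """Receiver side: weak + strong checksum for every block of the existing file."""
    table = {}
    for off in range(0, len(old_data), BLOCK):
        blk = old_data[off:off + BLOCK]
        table.setdefault(zlib.adler32(blk), []).append(hashlib.md5(blk).digest())
    return table

def count_matches(new_data, table):
    """Sender side: slide over the new file looking for blocks the receiver already has."""
    i = matches = 0
    while i + BLOCK <= len(new_data):
        window = new_data[i:i + BLOCK]
        weak = zlib.adler32(window)  # real rsync updates this rolling checksum in O(1)
        if weak in table and hashlib.md5(window).digest() in table[weak]:
            matches += 1
            i += BLOCK               # matched block: reference it and jump a whole block
        else:
            i += 1                   # miss: send one literal byte and slide the window
    return matches
```

Even in this toy form, every byte of both the old and the new copy gets hashed at least once, which is one reason the runtime tracks the archive size rather than the amount of changed data.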
Situation:
To replace a 10+ year old Windows 2000 2-node cluster with shared MSA SCSI storage with a newer Windows 2003 2-node cluster with shared FC storage.
The shared storage is currently split into two drives: X (data) and Q (quorum).
The X Drive holds a flat-file DB consisting of 13.1 million+ files in 1.3 million+ folders. These files need to be copied from the old cluster to the new cluster with minimal downtime.
File Count: 13,023,328
Total Filesize: 8.43 GB (file size, not size on disk)
Folder Count: 1,308,153
The old Win 2000 cluster has been up for over 10 years, continually reading/writing, and is now also heavily fragmented. The X Drive on the Win 2000 cluster also contains 7 backups of the DB, which are created/updated via Robocopy once per day; this currently takes 4-5 hours and adds a real lag to system performance.
Old Cluster
- 2 x HP DL380 G4 | 1 x HP MSA 500 G2 (SCSI) | RAID 5 (4 disks + spare) | Win 2k
New Cluster
- 2 x HP DL380 G7 | 1 x HP StorageWorks P2000 G2 MSA (Fibre Channel) | Win 2k3
The database can be offline for 5 to 8 hours comfortably, and 15 hours absolute maximum, due to the time-sensitive data it provides.
Options We've Tried:
Robocopy / FastCopy both seemed to sit at around 100-300 files copied per second, with the database offline.
PeerSync copy from a local node backup (D: drive): this completed in 17 hours with an average of 250 files per second.
Question/Options:
Block-by-Block Copy - We think this might be the fastest, but it will also copy the backups from the original X drive.
Redirect Daily Backup - Redirect the daily backup from the local X Drive to a network share on the new X Drive. Slow to begin with, but it would then only be up to 12 hours out of date when we come to switch over, as it could run while the old system is live. The final sync on the move day should take no more than 10 hours, to 100% confirm the old and new systems are identical.
Custom Copy Script - We have access to C# and Python; a rough sketch of this approach follows at the end of this post.
Robocopy / FastCopy / other file copy tools - open to suggestions and settings.
Disk Replace / RAID Rebuild - The risky or impossible option: replace each of the older disks with a new smaller-form-factor disk in an old G2 caddy, let the RAID rebuild, and repeat until all drives are replaced. On the day of migration, move the 4 disks to the new P2000 MSA in the same RAID order?
Give Up - And leave it running on the old hardware until it dies a fiery death.*
We seem to be gravitating to Option 2, but thought we should put this to some of the best minds in the world before committing.
P.S. Backups on the new cluster go to a new (M:) drive using Shadow Copy.
* Unfortunately not a real option, as we do need to move to the newer hardware; the old storage and cluster can no longer cope with demand.
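For what it's worth, here is the rough, hypothetical Python sketch mentioned under the custom-script option: the usual approach with this many tiny files is to hand out one directory per worker so many small copies are in flight at once (the source root and destination share below are placeholders):

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

SRC = Path(r"X:\flatfiledb")                 # placeholder source root
DST = Path(r"\\newcluster\X$\flatfiledb")    # placeholder destination share

def copy_dir(src_dir: Path) -> None:
    """Copy the plain files of one directory, preserving timestamps."""
    out = DST / src_dir.relative_to(SRC)
    out.mkdir(parents=True, exist_ok=True)
    for f in src_dir.iterdir():
        if f.is_file():
            shutil.copy2(f, out / f.name)

# One task per directory gives ~1.3 million small work units that can overlap.
dirs = [SRC] + [d for d in SRC.rglob("*") if d.is_dir()]
with ThreadPoolExecutor(max_workers=16) as pool:
    list(pool.map(copy_dir, dirs))  # list() so worker exceptions surface
```

Whether this beats Robocopy or PeerSync depends mostly on how hard the source RAID set can be pushed, so treat it as an experiment rather than a plan.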
We went with Option 2, and redirected the twice-daily backup from the original cluster to the new MSA RAID on the new cluster.
It was run as a pull from the new cluster using PeerSync and a Windows share on the old cluster.
We tried to use the PeerSync TCP client which would have been faster / more efficient, but it wasn't compatible with Windows 2000. PeerSync was chosen over most other copy tools out there due to its compatibility and non-locking file operations, allowing the original cluster to be online throughout with minimal performance impact.
This took around 13.5 hours for the initial copy, and then around 5.5 hours for the incremental diff copies. The major limiting factor was the original cluster's shared MSA RAID set; the drives were online and being accessed during the backups, so normal operation slowed the backup runs down.
The final sync took about 5 hours, and that was the total time the database was offline for the hardware upgrade.
Is it true that large directories can cause increased I/O wait?
I was told to put no more than 1,000 images per directory.
Thanks.
Yes!!!
You can divide the files into subdirectories under the main directory; it makes it faster for the I/O layer to find files.
The maximum number of files may differ with the file system you are using; you can find it here.
You can divide your file/folder structure as needed; 1,000 to 3,000 files per directory is a good number if you are going to have a lot of files.
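One common way to do that split automatically is to derive the subdirectory from a hash of the filename, so the files spread evenly and no single directory grows too large. A small sketch (the two-level layout and the images root are just an example):

```python
import hashlib
from pathlib import Path

ROOT = Path("images")  # example storage root

def path_for(filename: str) -> Path:
    """Map a filename to a stable two-level subdirectory, e.g. images/3f/a2/cat.jpg."""
    h = hashlib.md5(filename.encode()).hexdigest()
    return ROOT / h[:2] / h[2:4] / filename

# 256 * 256 = 65,536 buckets, so even millions of images stay
# well below a few thousand files per directory.
target = path_for("cat.jpg")
target.parent.mkdir(parents=True, exist_ok=True)
```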
I am working on a cluster where I submit jobs through the qsub engine.
I am granted a maximum of 72 hours of computational time at once. The output of my simulation is a folder which typically contains about 1,000 files (about 10 GB). I copy my output back after 71 h 30 min of simulation, which means that everything produced after 71 h 30 min (plus the time to copy?) is lost. Is there a way to make the process more efficient, i.e. not having to manually estimate the time needed to copy the output back?
Also, before copying my output back I compress the files with bzip2; what resources are used to do that? Should I ask for one node more than what I need to run the simulation, just to compress the files?
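Scheduler details vary, but one way to avoid estimating the copy time by hand is to let a small wrapper enforce the budget inside the job itself: run the solver with a timeout that leaves a fixed reserve, then always archive and copy whatever exists. A rough sketch in Python (the executable name, output folder, destination and reserve are placeholders; the bzip2 compression runs on the CPUs of the node the job already has):

```python
import shutil
import subprocess
import tarfile
import time

WALLTIME = 72 * 3600           # wall time granted by the scheduler, in seconds
RESERVE  = 30 * 60             # kept back for compressing + copying; tune to your output size
OUTDIR   = "output"            # placeholder: folder the simulation writes into
DEST     = "/home/me/results"  # placeholder: where the archive should end up

start = time.time()
try:
    # Stop the solver once only the reserve is left, so the copy step always runs.
    subprocess.run(["./simulation"], timeout=WALLTIME - RESERVE)  # placeholder executable
except subprocess.TimeoutExpired:
    pass  # whatever was produced up to this point is still archived below

# The bzip2 compression happens here, on this node, inside the same job.
archive = "results.tar.bz2"
with tarfile.open(archive, "w:bz2") as tar:
    tar.add(OUTDIR)

shutil.copy(archive, DEST)
print(f"archived and copied after {time.time() - start:.0f} s")
```

In this layout the compression does not need an extra node; it only needs to fit inside the reserve you set.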