I'm synchronizing my Sites folder between two Macs on the local network. Some WordPress sites in this folder contain readme.txt files that differ because of different plugin or theme versions on the two Macs. They differ in file size and timestamp.
My default.prf has a setting for merging .txt files:
merge = Name *.txt -> diff3 CURRENT1 CURRENTARCHOPT CURRENT2 > NEW
which works fine for most text files, but not for these readme.txt files. When I start syncing, Unison gives me merge errors for these files. Here's a typical output:
Contacting server...
Connected [//mac1.local//Users/timm -> //mac2.local//Users/timm]
Looking for changes
Waiting for changes from server
Reconciling changes
local mac1.local
new file <-M-> new file Sites/wp-sites/example.dev/wp-content/plugins/kocuj-sitemap/readme.txt
Proceed with propagating updates? [] y
Propagating updates
UNISON 2.48.4 started propagating changes at 15:21:22.86 on 16 May 2017
Merge command: diff3 '/Users/timm/Sites/wp-sites/example.dev/wp-content/plugins/kocuj-sitemap/.unison.merge1-readme.txt' '/Users/timm/Sites/wp-sites/example.dev/wp-content/plugins/kocuj-sitemap/.unison.merge2-readme.txt' > '/Users/timm/Sites/wp-sites/example.dev/wp-content/plugins/kocuj-sitemap/.unison.mergenew1-readme.txt'
Merge result (exited (2)):
diff3: missing operand after `/Users/timm/Sites/wp-sites/example.dev/wp-content/plugins/kocuj-sitemap/.unison.merge2-readme.txt'
diff3: Try `diff3 --help' for more information.
Saving synchronizer state
Synchronization incomplete at 15:21:24 (0 items transferred, 0 skipped, 1 failed)
failed: Sites/wp-sites/example.dev/wp-content/plugins/kocuj-sitemap/readme.txt
I realise that merge must fail on new files, but I'd think that the program should copy them over rather than trying to merge. I even tried ignoring the (pretty much useless) files:
ignore = Name {Sites/*/readme.txt}
but for some reason Unison doesn't ignore them. The thing is, I need text file merging because I rely on simple text files for my documentation. The default.prf looks like this:
# default profile
servercmd=/opt/local/bin/unison
root = /Users/timm/
root = ssh://mac2.local//Users/timm/
path = Sites
auto = true
times = true
# ignore permissions
perms = 0
rsrc = false
ignore = Name {.DS_Store}
ignore = Name {.localized}
ignore = Name {*/*.app/*}
ignore = Name {*/temp/*}
ignore = Name {*/cache/*}
ignore = Name {Sites/*/wp-content/languages/*}
ignore = Name {Sites/*/wp-content/plugins/*}
ignore = Name {Sites/*/wp-content/themes/twenty*}
ignore = Name {Sites/*/wp-sites/wordpress/*}
# ???
ignore = Name {Sites/*/readme.txt}
# diff and merge
diff = diff
merge = Name *.txt -> diff3 CURRENT1 CURRENTARCHOPT CURRENT2 > NEW
backup = Name *.txt
backupcurrent = Name *.txt
maxbackups = 10
log = true
logfile = /Users/timm/.unison/unison.log
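Side note on the readme.txt ignore above: if Name patterns match only the last path component, a pattern containing slashes would never match anything, which might explain why that line has no effect. A variant I could try, ignoring every readme.txt by bare filename, would be:
ignore = Name {readme.txt}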
I'm running Unison 2.48.4 on OS X El Capitan 10.11.6.
Did I overlook a setting? Is there any other way to make Unison copy and not merge new files?
I am using NiFi to transfer files between FTP locations.
I have to transfer files from an SFTP location to an FTP directory.
I have the following folder structure in the remote SFTP location:
/rootfolder/
/subfolder1
/subfolder2
/subfolder3
I need to download the respective files from each subfolder to a local directory which has a similar structure.
My workflow includes
ListSFTP -> FetchSFTP (3) -> PutFTP
In ListSFTP
Remote path: /rootfolder
In FetchSFTP1
Remote path: /rootfolder/subfolder1
In FetchSFTP2
Remote path: /rootfolder/subfolder2
In FetchSFTP3
Remote path: /rootfolder/subfolder3
But this does not seem to work. Can someone help me with how I can transfer files from remote SFTP sub-folder(s)?
Thanks,
Aadil
You should be able to set ListSFTP to recursive search; then, coming out of ListSFTP, each flow file will have attributes for "path" and "filename".
Let's say you had one file under each directory in your example; you should get three flow files like the following:
ff 1
path = /rootfolder/subfolder1
filename = file1
ff 2
path = /rootfolder/subfolder2
filename = file2
ff 3
path = /rootfolder/subfolder3
filename = file3
You should only need one FetchSFTP processor with Remote Filename set to ${path}/${filename}.
If you have the same structure on your destination system, just set PutFTP's Remote Path to ${path}.
If you have a slightly different structure, use UpdateAttribute to modify "path" right before PutFTP.
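For example, if the destination tree hangs under a different root, a sketch of an UpdateAttribute property could look like this (the destination root /data/incoming is just an assumption):
path = ${path:replace('/rootfolder', '/data/incoming')}
PutFTP's Remote Path stays ${path}, and with its Create Directory property set to true the matching sub-folders are created on the FTP server as needed.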
I have the following code to download a torrent off of a magnet URI.
#python
import time
import libtorrent as lt

#lt.storage_mode_t(0) ## tried this, didn't work
ses = lt.session()
params = {'save_path': "/save/here"}
ses.listen_on(6881, 6891)
ses.add_dht_router("router.utorrent.com", 6881)

link = "magnet:?xt=urn:btih:395603fa..hash..."
handle = lt.add_magnet_uri(ses, link, params)
while not handle.has_metadata():
    time.sleep(1)

handle.pause()  # got metadata; pause and set priorities
handle.file_priority(0, 1)
handle.file_priority(1, 0)
handle.file_priority(2, 0)
print handle.file_priorities()
# output is [1, 0, 0]
# I checked: no files have been written to disk yet.

handle.resume()
while not handle.is_finished():
    time.sleep(1)  # wait until the download finishes
It works. However, in this specific torrent there are 3 files: file 0 is 2 KB, file 1 is 300 MB, and file 2 is 2 KB.
As can be seen from the code, file 0 has a priority of 1, while the rest have priority 0 (i.e. don't download).
The problem is that when file 0 finishes downloading, I want the download to stop and not fetch anything more. But it will sometimes download file 1 partially: sometimes 100 MB, sometimes 200 MB, sometimes a couple of KB, and sometimes the entire file.
So my question is: how can I make sure only file 0 is downloaded, and not files 1 and 2?
EDIT: I added a check for whether I have the metadata, then set the priorities and resume, however this still downloads the second file partially.
The reason this happens is the race between adding the torrent (which starts the download) and setting the file priorities.
To avoid this you can set the file priorities along with adding the torrent, something like this:
p = parse_magnet_uri(link)
p['file_priorities'] = [1, 0, 0]
handle = ses.add_torrent(p)
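Put together with the original script, the whole thing might look roughly like this (a sketch only; it assumes libtorrent is imported as lt and that parse_magnet_uri / add_torrent accept the dict-style parameters used above):
import time
import libtorrent as lt

ses = lt.session()
ses.listen_on(6881, 6891)
ses.add_dht_router("router.utorrent.com", 6881)

link = "magnet:?xt=urn:btih:395603fa..hash..."
p = lt.parse_magnet_uri(link)
p['save_path'] = "/save/here"
p['file_priorities'] = [1, 0, 0]   # only file 0 is downloaded
handle = ses.add_torrent(p)

while not handle.is_finished():
    time.sleep(1)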
UPDATE:
You don't need to know the number of files; it's OK to provide file priorities for more files than end up being in the torrent. The remaining entries will just be ignored. However, if you don't want to download anything from the swarm (except for the metadata/.torrent), a better way is to set the flag_upload_mode flag. See the documentation.
p = parse_magnet_uri(link)
p['flags'] |= add_torrent_params_flags_t.flag_upload_mode
handle = ses.add_torrent(p)
I need to compare two directories to validate a backup.
Say my directory looks like the following:
user@main_server:~/mydir/               user@backup_server:~/mydir/
Filename       Filesize                 Filename       Filesize
file1000.txt   4182410737               file1000.txt   4182410737
file1001.txt   8241410737               -                           <-- missing on backup_server!
...                                     ...
file9999.txt   2410418737               file9999.txt   1111111111   <-- size != main_server
Is there a quick one-liner that would get me close to output like:
Invalid Backup Files:
file1001.txt
file9999.txt
(with the goal to instruct the backup script to refetch these files)
I've tried variations of the following to no avail:
[main_server] $ rsync -n ~/mydir/ user@backup_server:~/mydir
I cannot use rsync to back up the directories themselves because it takes way too long (8-24 hrs). Instead I run multiple threads of scp to fetch files in batches. This regularly completes in under an hour. However, occasionally I find a few files that were somehow missed (perhaps a dropped connection).
Speed is a priority, so file sizes should be sufficient. But I'm open to including a checksum, provided it doesn't slow the process down like I find with rsync.
Here's my test process:
# Generate Large Files (1GB)
for i in {1..100}; do head -c 1073741824 </dev/urandom >foo-$i ; done
# SCP them from src to dest
for i in {1..100}; do ( scp ~/mydir/foo-$i user@backup_server:~/mydir/ & ) ; sleep 0.1 ; done
# Confirm destination has everything from source
# This is the point of the question. I've tried:
rsync -Sa ~/mydir/ user@backup_server:~/mydir
# Way too slow
What do you recommend?
By default, rsync uses the quick-check method, which only transfers files that differ in size or last-modified time. As you report that the sizes are unchanged, that would seem to indicate that the timestamps differ. Two options to handle this are:
Use -t to preserve modification times when transferring files.
Use --size-only to ignore timestamps and transfer only files that differ in size.
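For example, using the paths from the question (a sketch; adjust as needed):
# option 1: preserve modification times so the quick check no longer flags unchanged files
rsync -rt ~/mydir/ user@backup_server:~/mydir/
# option 2: dry run comparing sizes only; the files listed are the ones to refetch
rsync -rvn --size-only ~/mydir/ user@backup_server:~/mydir/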
I need to capture the content of the module log window of OMNeT++/Tkenv in a file, so I added the following to omnetpp.ini:
cmdenv-express-mode = false
cmdenv-output-file = log.txt
but I have two problems:
1) after the simulation, I cannot find "log.txt" unless I create it myself
2) when I create it before launching the simulation, under ../omnetpp-4.6/log.txt, it remains empty
I use EV << to display the contents of the variables I'm interested in, and I need this output in a file in order to analyze the traffic. How can I do that?
You have to start your simulation in Cmdenv mode. To do that go to Run | Run Configurations, select your configuration, then select Command line as the User interface. The log file is created in the simulation's directory by default.
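If you prefer to run outside the IDE, the equivalent from a terminal would be something like this (a sketch; the executable name mysim and the path are placeholders, and the configuration is assumed to be [General]):
cd /path/to/your/simulation      # the directory containing omnetpp.ini
./mysim -u Cmdenv -c General omnetpp.ini
With the cmdenv-output-file = log.txt setting from the question, the log then ends up as log.txt in that directory.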
So, here's the problem: I have regular files, and they are put into a ZIP file (see below for details on the ZIP tool). Then I unzip them (see below for details on the tool used), and the files are restored. The date of each file is restored, as is standard in the ZIP/UNZIP tools used. When querying with DIR, or in Windows Explorer, the files involved have the same date as they had before being handled by the ZIP/UNZIP process.
So, all OK.
But then I use the XCOPY /D command to further manipulate different copies of those files on the disk, and XCOPY says one file is newer than the other. Given that the date, hour, and minutes are the same, the difference must involve something smaller, like seconds?
All involved disks have NTFS file system.
Example:
C:\my>dir C:\windows\Background_mycomputer.cmd C:\my\directory\Background_mycomputer.cmd
Volume in drive C is mycomputerC
Volume Serial Number is 1234-5678
Directory of C:\windows
31/12/2014 19:50 51 Background_mycomputer.cmd
1 File(s) 51 bytes
Directory of C:\my\directory
31/12/2014 19:50 51 Background_mycomputer.cmd
1 File(s) 51 bytes
0 Dir(s) 33.655.316.480 bytes free
C:\my>xcopy C:\windows\Background_mycomputer.cmd C:\my\directory\Background_mycomputer.cmd /D
Overwrite C:\my\directory\Background_mycomputer.cmd (Yes/No/All)? y
C:\windows\Background_mycomputer.cmd
1 File(s) copied
C:\my>xcopy C:\my\directory\Background_mycomputer.cmd C:\windows\Background_mycomputer.cmd /D
0 File(s) copied
C:\my>xcopy C:\windows\Background_mycomputer.cmd C:\my\directory\Background_mycomputer.cmd /D
0 File(s) copied
C:\my>unzip -v
UnZip 6.00 of 20 April 2009, by Info-ZIP. Maintained by C. Spieler. Send
bug reports using http://www.info-zip.org/zip-bug.html; see README for details.
Latest sources and executables are at ftp://ftp.info-zip.org/pub/infozip/ ;
see ftp://ftp.info-zip.org/pub/infozip/UnZip.html for other sites.
Compiled with Microsoft C 13.10 (Visual C++ 7.1) for
Windows 9x / Windows NT/2K/XP/2K3 (32-bit) on Apr 20 2009.
UnZip special compilation options:
ASM_CRC
COPYRIGHT_CLEAN (PKZIP 0.9x unreducing method not supported)
NTSD_EAS
SET_DIR_ATTRIB
TIMESTAMP
UNIXBACKUP
USE_EF_UT_TIME
USE_UNSHRINK (PKZIP/Zip 1.x unshrinking method supported)
USE_DEFLATE64 (PKZIP 4.x Deflate64(tm) supported)
UNICODE_SUPPORT [wide-chars] (handle UTF-8 paths)
MBCS-support (multibyte character support, MB_CUR_MAX = 1)
LARGE_FILE_SUPPORT (large files over 2 GiB supported)
ZIP64_SUPPORT (archives using Zip64 for large files supported)
USE_BZIP2 (PKZIP 4.6+, using bzip2 lib version 1.0.5, 10-Dec-2007)
VMS_TEXT_CONV
[decryption, version 2.11 of 05 Jan 2007]
UnZip and ZipInfo environment options:
UNZIP: [none]
UNZIPOPT: [none]
ZIPINFO: [none]
ZIPINFOOPT: [none]
C:\my>ver
Microsoft Windows [Version 6.1.7601]
C:\my>zip -?
Copyright (c) 1990-2006 Info-ZIP - Type 'zip "-L"' for software license.
Zip 2.32 (June 19th 2006). Usage:
zip [-options] [-b path] [-t mmddyyyy] [-n suffixes] [zipfile list] [-xi list]
The default action is to add or replace zipfile entries from list, which
can include the special name - to compress standard input.
If zipfile and list are omitted, zip compresses stdin to stdout.
-f freshen: only changed files -u update: only changed or new files
-d delete entries in zipfile -m move into zipfile (delete files)
-r recurse into directories -j junk (don't record) directory names
-0 store only -l convert LF to CR LF (-ll CR LF to LF)
-1 compress faster -9 compress better
-q quiet operation -v verbose operation/print version info
-c add one-line comments -z add zipfile comment
-# read names from stdin -o make zipfile as old as latest entry
-x exclude the following names -i include only the following names
-F fix zipfile (-FF try harder) -D do not add directory entries
-A adjust self-extracting exe -J junk zipfile prefix (unzipsfx)
-T test zipfile integrity -X eXclude eXtra file attributes
-! use privileges (if granted) to obtain all aspects of WinNT security
-R PKZIP recursion (see manual)
-$ include volume label -S include system and hidden files
-e encrypt -n don't compress these suffixes
C:\my>
Question: I do not want XCOPY to make updates that I know are invalid because the time format is doing something wrong. How do I prevent that?
As I see it, there are different things involved: XCOPY, very specific ZIP and UNZIP versions, and the NTFS file system. Which one is doing something wrong?
I must stress that apart from ZIP and UNZIP, no other changes are made to the files, such as changing one file and then making a change to another one within 60 seconds.
At the moment of the test, the time shown was NOT the current time, and not close to it either. No file is adjusted to the current time; the times refer to the last changes of the file in question, which may be any time in the past. In this case it is one day later, but it can be anything.
I noticed the peculiar behavior Raymond Chen describes when writing a PowerShell script (GitHub link) to freshen a zip archive using the System.IO.Compression and System.IO.Compression.FileSystem libraries.
Interestingly, Zip archives can store multiple copies of the same file with identical metadata (name, relative path, modification dates). Extracting the second copy of the file will fail in Windows Explorer because the file already exists.
When trying to prevent re-zipping a file that was already archived, I checked the relative path and date and noticed a discrepancy of up to two seconds in the LastWriteTime. This workaround compensates for the loss of precision:
$AlreadyArchivedFile = ($WriteArchive.Entries | Where-Object { # zip will store multiple copies of the exact same file - prevent this by checking if already archived.
    (($_.FullName -eq $RelativePath) -and ($_.Length -eq $File.Length)) -and
    ([math]::Abs(($_.LastWriteTime.UtcDateTime - $File.LastWriteTimeUtc).TotalSeconds) -le 2) # ZipFileExtensions timestamps are only precise to within 2 seconds.
})
Also, the IsDaylightSavingTime flag is not stored in the Zip archive. As a result I was surprised when extracted files became an hour newer than the original archived file. I tried this several times and saw the extracted file's timestamp incremented by an hour every time it was compressed and extracted.
Here's a very ugly workaround that decreases the archived file time by one hour to make the original source file and extracted file timestamps consistent:
If ($File.LastWriteTime.IsDaylightSavingTime() -and $ArchivedFile) { # HACK: fix for buggy date - adds an hour inside archive when the zipped file was created during PDT (files created during PST are not affected).
    $entry = $WriteArchive.GetEntry($RelativePath)
    $entry.LastWriteTime = ($File.LastWriteTime.ToLocalTime() - (New-TimeSpan -Hours 1))
}
There's probably a better way to handle this. Unfortunately I'm not aware of any way to store a Daylight Savings indicator for a file in a .Zip archive, and that information is lost.