I am trying to compare two folders in silent mode and get the exit code.
However, when I run it I always get error 100:
C:\Users\admin>"C:\Program Files (x86)\Beyond Compare 3\BComp.com" /qc c:\Temp\source c:\Temp\destination
What am I doing wrong?
Quick Compare (/qc) is only for files and does not work for folders.
Source: Scooter Software
According to the Beyond Compare Command Line Reference, the /qc switch means a quick comparison of two files, and its syntax is /qc=<type> | /quickcompare=<type>. The reference says that it performs a quick comparison of two files and sets the DOS error level on exit. The specified type can be size, crc, or binary. If a type is not specified, a rules-based comparison will be performed. Error levels are documented in the docs linked above.
First: Is a /qc switch allowed for folders as well?
And if so, how should the "If a type is not specified" condition be interpreted? I'd say:
either omit the type but keep the equals sign (/qc=),
or omit the type and the equals sign (/qc),
or omit the switch altogether?
I study genetic data from 288 fish samples (Fish_one, Fish_two ...)
I have four files per fish, each with a different suffix.
eg. for sample_name Fish_one:
file 1 = "Fish_one.1.fq.gz"
file 2 = "Fish_one.2.fq.gz"
file 3 = "Fish_one.rem.1.fq.gz"
file 4 = "Fish_one.rem.2.fq.gz"
I would like to apply the following concatenate instructions to all my samples, using maybe a text file containing a list of all the sample_name, that would be provided to a loop?
cp sample_name.1.fq.gz sample_name.fq.gz
cat sample_name.2.fq.gz >> sample_name.fq.gz
cat sample_name.rem.1.fq.gz >> sample_name.fq.gz
cat sample_name.rem.2.fq.gz >> sample_name.fq.gz
In the end, I would have only one file per sample, ideally in a different folder.
I would be very grateful to receive a bit of help on this one, even though I'm sure the answer is quite simple for a non-novice!
Many thanks,
Noé
I would like to apply the following concatenate instructions to all my
samples, using maybe a text file containing a list of all the
sample_name, that would be provided to a loop?
In the first place, the name of the cat command is mnemonic for "concatenate". It accepts multiple command-line arguments naming sources to concatenate together to the standard output, which is exactly what you want to do. It is poor form to use a cp and three cats where a single cat would do.
In the second place, although you certainly could use a file of name stems to drive the operation you describe, it's likely that you don't need to go to the trouble to create or maintain such a file. Globbing will probably do the job satisfactorily. As long as there aren't any name stems that need to be excluded, then, I'd probably go with something like this:
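# other_dir must be set beforehand to an existing output directory
for f in *.rem.1.fq.gz; do
    stem=${f%.rem.1.fq.gz}    # strip the suffix to recover the sample name
    cat "$stem".{1,2,rem.1,rem.2}.fq.gz > "${other_dir}/${stem}.fq.gz"
done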
That recognizes the groups present in the current working directory by the members whose names end with .rem.1.fq.gz. It extracts the common name stem from that member's name, then concatenates the four members to the correspondingly-named output file in the directory identified by ${other_dir}. It relies on brace expansion to form the arguments to cat, so as to minimize code and (IMO) improve clarity.
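If you do prefer to drive the operation from a text file of sample names, as the question suggests, the same idea can be sketched in Python. This is only an illustration: the function name concatenate_samples, the one-name-per-line list format, and the directory arguments are all assumptions, not part of the question. It relies on the fact that gzip files concatenated byte for byte still form a valid gzip stream, which is also what the cat approach depends on.

```python
import shutil
from pathlib import Path

# The four per-sample suffixes, in the order they should be concatenated.
SUFFIXES = (".1.fq.gz", ".2.fq.gz", ".rem.1.fq.gz", ".rem.2.fq.gz")

def concatenate_samples(list_file, in_dir, out_dir):
    """Concatenate the four files of each listed sample into out_dir/<sample>.fq.gz."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for line in Path(list_file).read_text().splitlines():
        stem = line.strip()
        if not stem:
            continue  # skip blank lines in the list file
        target = out_dir / f"{stem}.fq.gz"
        with open(target, "wb") as out:
            for suffix in SUFFIXES:
                # Raw byte copy: the members are appended without recompression.
                with open(in_dir / f"{stem}{suffix}", "rb") as part:
                    shutil.copyfileobj(part, out)
        written.append(target)
    return written
```

A call such as concatenate_samples("samples.txt", ".", "merged") would then write one file per sample into a hypothetical merged directory; a missing input file raises FileNotFoundError rather than being silently skipped.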
Whenever I use the Goto Anything search in Sublime Text and start typing to search the files in my current project I get a whole bunch of results based on Sublime Text's fuzzy-search algorithm, each prepended with a number.
I assume this is some sort of score for the search "strength", but I just wanted to confirm this. What is this number based on?
It seems like the numbers are indeed representative of match strength, as you assumed.
I noticed an odd effect when testing your hypothesis, and then proceeded to create the dummy files CustomCompletions.CustomCompletions & CustomCompletions (a file with no extension) for further comparison.
Here are the results:
As you can see,
CustomCompletions has the highest ranking with 1524
CustomCompletions.py & CustomCompletions.todo share a rank of 1507
CustomCompletions.CustomCompletions & CustomCompletions.sublime-settings share a rank of 1490
All of the remaining files, which contain additional text in the base name, continue to receive lower rankings.
What I found odd was that the 2nd & 3rd groups had different rankings, despite sharing a base file name that exactly matches the query.
I figured that it might be due to the number of characters in the file extension, so I tested that assumption by creating the following files:
CustomCompletions.a
CustomCompletions.ab
CustomCompletions.abc
CustomCompletions.abcd
CustomCompletions.abcde
CustomCompletions.abcdef
CustomCompletions.abcdefg
CustomCompletions.abcdefgh
CustomCompletions.abcdefghi
CustomCompletions.abcdefghij
CustomCompletions.1
CustomCompletions.12
CustomCompletions.123
CustomCompletions.1234
CustomCompletions.12345
CustomCompletions.123456
CustomCompletions.1234567
CustomCompletions.12345678
CustomCompletions.123456789
CustomCompletions.1234567890
But it turns out they all ranked at 1507, the same ranking as the 2nd group.
Because of that outcome, I am still unsure which criteria affect the ranking of files that share a base name exactly matching the Goto Anything query but have differing file extensions.
The FindNextFile WinAPI function is used to list the contents of directories. Microsoft's documentation states that the order is file-system dependent, although on NTFS the names should be in alphabetical order most of the time:
The order in which this function returns the file names is dependent on the file system type. With the NTFS file system and CDFS file systems, the names are usually returned in alphabetical order. With FAT file systems, the names are usually returned in the order the files were written to the disk, which may or may not be in alphabetical order. However, as stated previously, these behaviors are not guaranteed.
My application needs the objects in a directory to be ordered. Because the majority of Windows users use NTFS, I would like to optimize my application for that case, so I use the _wcsicmp function to compare names. Most of the time this is correct, and the results from FindNextFile are already sorted according to _wcsicmp. Sometimes, however, the results are not sorted. I thought that was natural, because FindFirstFile doesn't guarantee the order and I must sort anyway (in case of another file system). Then I noticed a strange pattern: the character '_' appears to be returned after letters. A folder containing (a.txt, b.txt, _.txt) is returned in the order a, b, _, whereas _wcsicmp sorts it as _, a, b. This was tested on Windows 8.1; I ran some tests and the behavior is consistent.
Can someone explain to me what comparison criteria NTFS uses? Or why FindNextFile returns names out of alphabetical order?
Because the NTFS sort rules are not as simple as plain alphabetical order. Here is an MSDN blog article that sheds some light on the problem:
Why do NTFS and Explorer disagree on filename sorting?
One reason for this can be that NTFS captures the case mapping table at the time the drive is formatted and continues to use that table, even if the OS's case mapping tables change subsequently.
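The a, b, _ order from the question can be reproduced with a small sketch. It assumes (this is not stated in the thread) that NTFS compares names after mapping them to upper case, in line with the case mapping table mentioned above, whereas _wcsicmp compares lower-cased characters. The underscore (code 95) sits between the upper-case letters (65 to 90) and the lower-case letters (97 to 122), so the two conventions disagree about where it belongs. The function names are illustrative, and Python's str.upper is only an approximation of the on-disk table.

```python
names = ["_.txt", "a.txt", "b.txt"]

def ntfs_like_order(names):
    # Assumption: names are compared code point by code point after upper-casing.
    return sorted(names, key=str.upper)

def wcsicmp_like_order(names):
    # _wcsicmp-style comparison: names are compared after lower-casing.
    return sorted(names, key=str.lower)

print(ntfs_like_order(names))     # ['a.txt', 'b.txt', '_.txt']
print(wcsicmp_like_order(names))  # ['_.txt', 'a.txt', 'b.txt']
```

If that assumption holds, sorting with an upper-casing comparison instead of _wcsicmp would match the NTFS order more often for plain ASCII names, though sorting explicitly is still required because the order is not guaranteed.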
You can use CompareStringEx with the SORT_DIGITSASNUMBERS flag.
The minimum system requirement for this function is Windows Vista (the SORT_DIGITSASNUMBERS flag itself requires Windows 7 or later).
int result = CompareStringEx(LOCALE_NAME_USER_DEFAULT, SORT_DIGITSASNUMBERS,
    lpString1, cchCount1, lpString2, cchCount2, NULL, NULL, 0);
The return value of this function is unusual: it is 1, 2, or 3 (0 indicates failure):
#define CSTR_LESS_THAN 1 // string 1 less than string 2
#define CSTR_EQUAL 2 // string 1 equal to string 2
#define CSTR_GREATER_THAN 3 // string 1 greater than string 2
You can also try _wcsicoll for older systems. If I recall correctly _wcsicoll works better but not the same as Windows's sort.
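Because the result is 1, 2, or 3 rather than negative, zero, or positive, it usually has to be converted before it can drive a sort; subtracting 2 gives the familiar C runtime convention. A minimal sketch of that conversion follows. The helper cstr_to_cmp and the stand-in comparer fake_compare are hypothetical names used only for illustration; fake_compare merely imitates the return convention and does not call the Windows API.

```python
from functools import cmp_to_key

CSTR_LESS_THAN, CSTR_EQUAL, CSTR_GREATER_THAN = 1, 2, 3

def cstr_to_cmp(result):
    """Map a CompareStringEx-style result (1, 2, 3) to -1, 0, 1."""
    if result not in (CSTR_LESS_THAN, CSTR_EQUAL, CSTR_GREATER_THAN):
        # 0 means the call failed, so it must not be treated as "less than".
        raise ValueError(f"comparison failed or returned an unexpected value: {result}")
    return result - 2

def fake_compare(a, b):
    # Hypothetical stand-in for the real call: a plain case-insensitive comparison.
    a, b = a.lower(), b.lower()
    if a < b:
        return CSTR_LESS_THAN
    if a > b:
        return CSTR_GREATER_THAN
    return CSTR_EQUAL

sorted_names = sorted(
    ["b.txt", "A.txt", "c.txt"],
    key=cmp_to_key(lambda a, b: cstr_to_cmp(fake_compare(a, b))),
)
print(sorted_names)  # ['A.txt', 'b.txt', 'c.txt']
```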
I've been writing a program in R that outputs randomization schemes for a research project I'm working on with a few other people this summer, and I'm done with the majority of it, except for one feature. Part of what I've been doing is making it really user friendly, so that the program will prompt the user for certain pieces of information, and therefore know what needs to be randomized. I have it set up to check every piece of user input to make sure it's a valid input, and give an error message/prompt the user again if it's not. The only thing I can't quite figure out is how to get it to check whether or not the file name for the .csv output is valid. Does anyone know if there is a way to get R to check if a string makes a valid Windows file name? Thanks!
These characters aren't allowed: /\:*?"<>|. So warn the user if it contains any of those.
Some other names are also disallowed: CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9.
You probably want to check that the filename is valid using a regular expression. See this other answer for a Java example that should take minimal tweaking to work in R.
https://stackoverflow.com/a/6804755/134830
You may also want to check the filename length (260 characters for maximum portability, though longer names are allowed on some systems).
Finally, in R, if you try to create a file in a directory that doesn't exist, it will still fail, so you need to split the name up into the filename and directory name (using basename and dirname) and try to create the directory first, if necessary.
That said, David Heffernan gives good advice in his comment to let Windows do the work in deciding whether or not it can create the file: you don't want to erroneously tell the user that a filename is invalid.
You want something a little like this:
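nice_file_create <- function(filename)
{
  directory_name <- dirname(filename)
  if(!file.exists(directory_name))
  {
    # recursive = TRUE also creates any missing parent directories
    ok <- dir.create(directory_name, recursive = TRUE)
    if(!ok)
    {
      warning("The directory of that path could not be created.")
      return(invisible(FALSE))
    }
  }
  # file.create() reports failure by returning FALSE (with a warning),
  # not by throwing an error, so check its return value
  ok <- suppressWarnings(file.create(filename))
  if(!ok)
  {
    warning("The file could not be created.")
  }
  invisible(ok)
}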
But test it thoroughly first! There are all sorts of edge cases where things can fall over: try UNC network path names, "~", and paths with "." and ".." in them.
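The character and reserved-name checks listed earlier in this answer can be sketched as follows. The example is in Python rather than R purely for illustration, and the same regular expression and name list should port to R's grepl and toupper with little change. The function name filename_problems is an assumption, and so is the choice to treat a reserved name followed by an extension (such as NUL.csv) as invalid. It checks a bare file name, not a full path, and it does not replace actually trying to create the file.

```python
import re

# Characters that may not appear in a Windows file name.
FORBIDDEN = re.compile(r'[/\\:*?"<>|]')

# Reserved device names; compared case-insensitively.
RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}

def filename_problems(name):
    """Return a list of reasons why `name` looks invalid (empty list if none found)."""
    problems = []
    if not name:
        problems.append("name is empty")
    if FORBIDDEN.search(name):
        problems.append("contains a forbidden character")
    # Assumption: a reserved name followed by an extension is rejected as well.
    if name.split(".")[0].upper() in RESERVED:
        problems.append("is a reserved device name")
    return problems

print(filename_problems("results.csv"))   # []
print(filename_problems("what?.csv"))     # ['contains a forbidden character']
```

Returning a list of reasons rather than a single TRUE/FALSE makes it easier to tell the user what to change when prompting again.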
I'd suggest that the easiest way to make sure a filename is valid is to use fs::path_sanitize().
It removes control characters, reserved characters, and Windows-reserved filenames, truncating the string at 255 bytes in length.
I have a folder with these files:
alongfilename1.txt <--- created first
alongfilename3.txt <--- created second
When I run DIR /x in command prompt, I see these short names assigned:
ALONGF~1.TXT alongfilename1.txt
ALONGF~2.TXT alongfilename3.txt
Now, if I add another file:
alongfilename1.txt
alongfilename2.txt <--- created third
alongfilename3.txt
I see this:
ALONGF~1.TXT alongfilename1.txt
ALONGF~3.TXT alongfilename2.txt
ALONGF~2.TXT alongfilename3.txt
Fine. It seems to be assigning the "~#" according to the date/time that I created the file. Is this correct?
Now, if I delete "alongfilename1.txt", the other two files keep their short names.
ALONGF~3.TXT alongfilename2.txt
ALONGF~2.TXT alongfilename3.txt
When will that ID (in this case, ~1) be released for use in another short name? Will it ever?
Also, is it possible that a file on my machine has a short name of X, whereas the same file has a short name of Y on another machine? I'm particularly concerned for installations whose custom actions utilize DOS short names.
Thanks, guys.
If I were you, I would never rely on any version of any file system driver (be it Microsoft's, be it another OS's) to be consistent about the algorithm it uses to generate short file names. The exact behavior of the Microsoft Fastfat and NTFS drivers is not "officially" documented (except as somewhat high-level overviews) and thus is not part of the API contract. What works today might not work tomorrow if you update the driver.
In addition, there is absolutely no requirement that short names contain tilde characters - see for example this post by Raymond Chen.
There's a treasure trove of info to be found about this topic in the MSDN blogs - for example:
Registry key to force Windows to use short filenames
NTFS curiosities (Part I): Short file names
Also, do not rely on the sole presence of alphanumerical characters. Look at the Linux VFAT driver which says, for example, that any combination of uppercase letters, digits, and the following characters is valid: $ % ' ` - @ { } ~ ! # ( ) & _ ^. NTFS will operate in compatibility mode with that...
The short filename is created with the file. The algorithm works like this (usually, but see moocha's reply):
counter = 1
stripped_filename = strip_dots(strip_non_ascii_characters(filename))
shortfn = first_6_characters(stripped_filename)
while (file_exists(shortfn + "~" + counter + "." + extension)) {
    increment counter by 1
    if more digits are added to counter, shorten shortfn by 1
    /* e.g. if counter comes to 9 and shortf~9.txt is taken, try short~10.txt next */
}
This means that once the file is created, it will keep its short name until it's deleted.
As soon as the file is deleted, the short name may be used again.
If you move the file somewhere else, it may get a new short name (e.g. you're moving c:\somefilewithlongname.txt ("c:\somefi~1.txt") to d:\stuff\somefilewithlongname.txt, if there's d:\stuff\somefileelse.txt ("d:\stuff\somefi~1.txt"), the short name of the moved file will be somefi~2.txt). It seems that the short name is only persistent within a given directory on a given machine.
So: the short filenames will be generated by the filesystem, usually by the method outlined above. It is better to assume that short filenames are not persistent, as c:\longfi~1.txt on one machine might be "c:\longfilename.txt", whereas on another it might be "c:\longfish_story.txt"; also, when a file is deleted, the short name is immediately available again.
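For experimenting with the behavior described in this answer, the pseudocode can be turned into a runnable sketch. This is a simplified model, not the actual driver algorithm: the function name short_name, the set of taken names standing in for the directory contents, and the rule of keeping only ASCII letters and digits are all assumptions made for illustration, and real file systems may behave differently, as the other replies point out.

```python
def _clean(text):
    # Simplification: keep only ASCII letters and digits, upper-cased.
    return "".join(c for c in text if c.isascii() and c.isalnum()).upper()

def short_name(long_name, taken):
    """Return an 8.3-style name for long_name that is not in the set `taken`."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:  # no period at all, so there is no extension
        base, ext = long_name, ""
    base, ext = _clean(base), _clean(ext)[:3]
    counter = 1
    while True:
        tail = f"~{counter}"
        # The base shrinks as the counter gains digits, so base + tail stays within 8 characters.
        candidate = base[: 8 - len(tail)] + tail + (f".{ext}" if ext else "")
        if candidate not in taken:
            return candidate
        counter += 1

# Replaying the scenario from the question: files 1 and 3 first, then file 2.
taken = set()
for name in ["alongfilename1.txt", "alongfilename3.txt", "alongfilename2.txt"]:
    short = short_name(name, taken)
    taken.add(short)
    print(short, name)
```

Under this model the third file created receives ~3 regardless of its name, and removing ALONGF~1.TXT from taken makes ~1 available to the next file created, which matches the observations above.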
I believe MSDOS stores the association between the long and the short name in a per directory file.
It does not depend on the date/time.
If you move your files into a new directory, the numbering is reset: the algorithm mentioned by Piskvor is applied again.
In the new directory (after a move), you will get:
ALONGF~1.TXT alongfilename1.txt
ALONGF~2.TXT alongfilename2.txt
ALONGF~3.TXT alongfilename3.txt
even though alongfilename2.txt has initially been created third.
This link says how NTFS does it. I would guess it's still the same idea on more recent versions.
In Windows 2000, both FAT and NTFS use the Unicode character set for their names, which contain several forbidden characters that MS-DOS cannot read. To generate a short MS-DOS-readable file name, Windows 2000 deletes all of these characters from the LFN and removes any spaces. Because an MS-DOS-readable file name can have only one period, Windows 2000 also removes all extra periods from the file name. Next, Windows 2000 truncates the file name, if necessary, to six characters and appends a tilde (~) and a number. For example, each non-duplicate file name is appended with ~1. Duplicate file names end with ~2, then ~3, and so on. After the file names are truncated, the file name extensions are truncated to three or fewer characters. Finally, when displaying file names at the command line, Windows 2000 translates all characters in the file name and extension to uppercase.
When the files are provided by a network server which is running Samba, then the short names are generated by the server, and they do not follow a predictable pattern.
So it is not safe to assume that you can predict the form of the short name.
G:\>dir /x *.txt
Directory of G:\
08/25/2009 12:34 PM 1,848 S2XYYV~1.TXT strace_output.txt
03/01/2010 05:32 PM 325,428 TEY7IH~O.TXT tomcat-dump-march-1.txt
03/11/2010 12:01 AM 5,811 DI356A~S.TXT ddmget-output.txt
01/23/2009 01:03 PM 313,880 DLA94Q~K.TXT ddm-log-fn.txt
04/20/2010 07:42 PM 7,491 A50QZP~A.TXT april-20-2010.txt