What does DUPO indicate in MS-DOS? - dos

While sorting in MS-DOS, what does DUPO do in the code sample below?
SORT BLDAPPA.JMP BLDAPPA.J%1 /S(1,13,C,A) DUPO(1,13)

The MS-DOS sort command doesn't support a second filename argument, nor does it support the /S option. This command is apparently meant to invoke a third party program that isn't part of MS-DOS (or Windows) that's also unfortunately named SORT.
Googling turns up Opttech Sort, which supports a DUPOUT parameter that can be shortened to DUPO. This option apparently eliminates duplicate lines, keeping only one of them. So DUPO(1,13) would eliminate lines whose first 13 bytes are the same as those of another line in the file.
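To make the described effect concrete, here is a minimal Python sketch. It is not Opttech Sort and makes two assumptions: that /S(1,13,C,A) means "sort ascending on the 13 characters starting at column 1", and that DUPO(1,13) keeps the first line seen for each such key. The function name is hypothetical.

```python
def sort_and_drop_duplicates(lines, start=1, length=13):
    """Sort lines on a fixed-width key and keep one line per key.

    Assumption: columns are 1-based, as in DUPO(1,13).
    """
    def key(line):
        return line[start - 1:start - 1 + length]

    result = []
    seen = set()
    # sorted() is stable, so among equal keys the earliest input line wins.
    for line in sorted(lines, key=key):
        k = key(line)
        if k not in seen:
            seen.add(k)
            result.append(line)
    return result
```

Which duplicate the real program keeps is not stated in the answer above, so treat the "first one wins" rule as a guess.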

Related

GNU split (UNIX command) creating files not matching pattern after reaching "z"

I was splitting some large files, and everything worked properly until an 81 GB file came along. The split command seems to have done its job, but the last files have names that don't follow the pattern. Look at the bottom right of the picture.
And I'm using the command like this:
split -b 125M ./2014.txt 2014/2014_
Does anyone know why, instead of creating the file 2014_za, it created 2014_zaaa?
You can only have 676 files named [a-z][a-z], while your command required more.
Here are some options for what split could do:
1. Fail with an error. This is the behavior mandated by POSIX, and followed by macOS.
2. Start writing larger suffixes. This is a bad choice because after _zz comes _aaa, so the files show up in the wrong order in ls, and cat * no longer joins them in the correct order.
3. Save the last range, _z, for longer suffixes. This is a good choice because after _yz comes _zaaa, which has room to grow while still remaining in alphabetical order. This is what GNU does, and the behavior you're seeing.
If you want all the names to be uniform without triggering any of these behaviors, just use a larger suffix length with -a 6 to ensure you have enough room.
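As an illustration of why the reserved _z range keeps names in order, here is a small Python sketch that models the suffix sequence described above (aa ... yz, then zaaa ...). It is a model of the behavior, not GNU's actual code, and the function name is hypothetical.

```python
from itertools import islice, product
from string import ascii_lowercase


def gnu_style_suffixes(start_len=2):
    """Yield suffixes aa..yz, then zaaa..zyzz, then zzaaaa.., and so on."""
    level = 0
    while True:
        prefix = "z" * level              # one extra 'z' per widening step
        tail_len = start_len - 1 + level  # letters after the first free letter
        for first in ascii_lowercase[:-1]:  # a..y only; z is reserved
            for tail in product(ascii_lowercase, repeat=tail_len):
                yield prefix + first + "".join(tail)
        level += 1


names = list(islice(gnu_style_suffixes(), 700))
print(names[648:652])
```

Under this model only 650 two-letter names are available before widening starts, and the generated sequence is already in alphabetical order, which is the property that matters for ls and cat *.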

Pipe and Filter

I am new to the command prompt and want to know how all the commands work …
I want to know how to apply pipe and filter in cmd to go through a directory and print only those files/folders that were created on a particular date, and I want to use a data that occurs 2 -3 times within the files and folders of the test directory.
Go to Start > Help and Support, enter command prompt in the box, and press the magnifying glass symbol at the end. Then select Command Reference Overview [H1S] from the list displayed. This will show the commands available.
Other entries in that list may also be of help.
Generally, typing
commandname /?
from the prompt will show (often cryptic) help
Essentially, the pipe symbol, |, is used to direct the output of one command to the input of a second, so
dir | sort
for instance takes the screen-output of the DIR command and SORTs it.
The next question uses the critical term created. Each file may have THREE different times: the time the file was created, the time the file was last written to, and the time the file was last accessed. It's possible to access all three times, but the default is the time last written. This is the normal time reported by DIR, and can variously be referred to as the file time or the update time, amongst other terms.
Hence, to list the files (using the common written date) and select on a particular date, try
dir | find "dateyourequire"
where you need to replace dateyourequire with the target date, in the format matching that displayed by your DIR command. BTW commands are NOT case-sensitive - with one important exception.
Now date-format is a whole new ballgame, and you need to be very careful because the date shown is according to local convention. Some people use DD/MM/YY for instance - others use MM-DD-YY and there are many others. If you are discussing date and time, you need to say EVERY TIME the convention you are using.
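For comparison, the same "list, then filter on the last-written date" idea can be sketched in Python, which sidesteps the date-format problem by comparing date objects instead of text. This is only an illustration of the filter step, not a cmd solution; the function name is hypothetical.

```python
import os
from datetime import date


def entries_written_on(directory, target):
    """Return names of entries in `directory` last written on the date `target`."""
    matches = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # st_mtime is the last-written time, the one DIR shows by default.
            written = date.fromtimestamp(entry.stat().st_mtime)
            if written == target:
                matches.append(entry.name)
    return sorted(matches)
```

For example, entries_written_on(".", date(2020, 1, 15)) would list the entries of the current directory last written on 15 January 2020 (local time).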
You need to explain what you mean by your data that occurs 2 -3 times point. I can make neither head nor tail of it. Examples are usually a good way.
<your_command> | findstr "search_string*"
You can use parameters like /I, /B, etc. You can also use regular expressions. For more information about findstr, see the built-in reference: help findstr

FindFirstFile Multiple file types

Is it possible to use Windows API function FindFirstFile to search for multiple file types, e.g *.txt and *.doc at the same time?
I tried to separate patterns with '\0' but it does not work - it searches only the first pattern (I guess, that's because it thinks that '\0' is the end of string).
Of course, I can call FindFirstFile with *.* pattern and then check my patterns or call it for every pattern, but I don't like this idea - I will use it only if there no other solutions.
This is not supported. Run it twice with different wildcards, or use *.* and filter the result. Filtering is definitely the better choice, since wildcards are ambiguous anyway because of support for legacy MS-DOS 8.3 filenames. A wildcard like *.doc will find both .doc and .docx files, for example: a filename like longfilename.docx also creates an entry named LONGFI~1.DOC
The MSDN docs mention nothing about FindFirstFile allowing multiple search patterns, hence it doesn't exist.
In this case your best bet is to scan using an open selection (like C:\some directory\* or *) and then filter based on WIN32_FIND_DATA's cFileName member, using strrchr (or the appropriate Unicode variant) to find the extension. It should run pretty fast for the small set of characters that make up a file extension.
If you know that all the extensions are, say, 3 characters long, you should be able to mask them off as *.??? to speed things up.
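The "enumerate everything, then filter on the extension" approach can be sketched as follows. This is Python rather than the Win32 API, so os.scandir stands in for the FindFirstFile/FindNextFile loop and os.path.splitext for the strrchr step; the function names are hypothetical.

```python
import os


def filter_by_extension(names, extensions):
    """Keep names whose extension matches exactly, ignoring case."""
    wanted = {ext.lower() for ext in extensions}
    return [name for name in names
            if os.path.splitext(name)[1].lower() in wanted]


def find_files(directory, extensions):
    """List files in `directory`, then filter them by extension."""
    with os.scandir(directory) as entries:
        names = [entry.name for entry in entries if entry.is_file()]
    return filter_by_extension(names, extensions)
```

Because the comparison is done on the full extension, find_files(".", [".txt", ".doc"]) does not pick up .docx files the way a *.doc wildcard can.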

very long lines - windows grep character (not a line based) tool

Is there a grep-like tool for Windows that lets me restrict the number of characters it outputs per line when a searched-for pattern is found?
One of the upstream software systems generates huge text files which we then feed as the input to our system.
Sometimes the input files get corrupted and I need to do a quick textual search to find out whether particular bits of data are missing or not. To make it even worse, the input file is just one very, very long line of text, and when I use grep or findstr the result of the search is a huge chunk of text.
I am wondering how I can limit the number of characters grep shows before/after the pattern I searched for.
Cheers.
Two things spring to my mind:
Call grep with the --only-matching option so that only the text that matches is emitted. Depending on your regex, this may or may not help.
Write a very simple executable, call it trunc, which reads from stdin line by line and outputs the first n characters of each line to stdout. Then simply pipe the output from grep to trunc.
The latter option is relatively simple. If you didn't want to go the whole hog and produce a proper native exe it could be quite easily achieved with a Perl/Python/Ruby etc. script.
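As a rough sketch of the script route, the Python function below goes one step further than trunc: instead of cutting grep's output, it searches the text itself and returns each match with a limited number of characters on either side. The function name and the default width are arbitrary choices for illustration.

```python
import re


def context_matches(text, pattern, width=40):
    """Return each regex match with up to `width` characters of context per side."""
    snippets = []
    for match in re.finditer(pattern, text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippets.append(text[start:end])
    return snippets
```

It could be called on the whole file contents, for example context_matches(open("input.txt").read(), "orderId", 30), where both the file name and the pattern are made up. This reads the entire file into memory, which may or may not be acceptable for very large inputs.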

"descript.ion" file spec?

There appears to be a somewhat standard "descript.ion" file in the world of Windows programs which provides metadata for all or some of the files in a given directory.
I know there are various programs which write this file (example: NewsBin, UseNet downloader) and read it (Example: "FAR", a file manager mimicking old Norton Commander).
I'm writing my own file indexer, and would like to add the ability to parse and use the info from "descript.ion" files.
The problem I have is that I have not been able to find an actual spec for the file, despite much googling.
I reverse engineered it as best I could, but I'm not certain whether I captured 100% of the possible details, so I figured I'd ask SO.
Here are example lines from the file:
"Rus Song1.mp3" SovietMus 1/2, rus_song#gmail.com, Fri Aug 08 00:46:27 2008
RusSong2.mp3 SovietMus 2/2, rus_song#gmail.com, Fri Aug 08 01:46:22 2008
As it seems the structure is:
First "token" is a file name.
If the token starts with any character other than a double quote, the token ends at the first space character.
If the token starts with a double quote, the token ends at the next double quote.
(Not sure what happens if the filename contains a double quote; IIRC it's illegal in Windows filesystems, so escaping the quote may be a moot question.)
Last token (end of line to the very last comma moving backwards) is a timestamp.
Second to last token (the very last comma to second-to-last comma moving backwards) is the name of the poster from the Usenet newsgroup. I'm not quite sure what happens in generic format since the only descript.ion files I saw were from NewsBin that is obviously Usenet centric.
Everything in between is a description, in NewsBin's case coming from post's subject.
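The reverse-engineered rules above can be written down as a short Python sketch. It encodes only the structure guessed at in this question, not an official spec, and the function names are hypothetical.

```python
def parse_description_line(line):
    """Split a descript.ion line into (filename, remainder)."""
    line = line.rstrip("\r\n")
    if line.startswith('"'):
        # Quoted name: ends at the next double quote.
        # Raises ValueError if the closing quote is missing.
        end = line.index('"', 1)
        filename, rest = line[1:end], line[end + 1:]
    else:
        # Unquoted name: ends at the first space.
        filename, _, rest = line.partition(" ")
    return filename, rest.strip()


def split_newsbin_fields(remainder):
    """Split the remainder into (description, poster, timestamp).

    Assumes the NewsBin layout described above; returns
    (remainder, None, None) when there are fewer than two commas.
    """
    parts = [part.strip() for part in remainder.rsplit(",", 2)]
    if len(parts) != 3:
        return remainder, None, None
    return tuple(parts)
```

Splitting from the right means commas inside the description itself do not disturb the poster and timestamp fields.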
QUESTIONs:
Does anyone know of a bit more official "descript.ion" file spec/documentation?
(or, at least, have your own knowledge of those files and can verify my spec)
Does anyone know of any other programs that read or write this file?
Thanks!
The description files on my system are from Total Commander as well. They follow the basic spec mentioned in the other answers:
Filename Text I typed to describe the file
"Long filename" Some text
Each line ends in a normal Windows line break.
In addition, the program stores multi-line comments as follows:
Filename This is the first line\nSecond line\nLast line\x04\xc2
Here, I mean that the descript.ion file contains a backslash and a letter 'n' where I typed a line break, and two special characters 04 C2 at the end of the comment. In addition, the line is ended by a Windows line break 0D 0A.
Apparently, the two extra characters at the end of the line signal the end of a multiline comment. If I remove them, the comment is rendered as a single line in the GUI, and the '\n' sequences are displayed literally.
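A reader for this convention might look like the Python sketch below. It is based solely on the observation above (a literal backslash plus 'n' for each line break, and the bytes 04 C2 at the end); the function name is hypothetical, and a comment that legitimately contains a backslash followed by 'n' is not handled.

```python
MULTILINE_MARKER = b"\x04\xc2"  # trailing bytes observed on multi-line comments


def decode_comment(raw):
    """Expand literal backslash-n sequences only when the marker is present."""
    if raw.endswith(MULTILINE_MARKER):
        body = raw[:-len(MULTILINE_MARKER)]
        return body.replace(b"\\n", b"\n")
    # No marker: the comment is a single line and is returned unchanged.
    return raw
```

The function works on bytes because the text encoding of the file is not stated here.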
The original usage of DESCRIPT.ION was to provide longer more descriptive names to 8.3 filenames; all it had was the shortname and a longer description. As you've found, others have co-opted the name with varying formats and usages. Frankly speaking, I don't think you'll find any specific commonality among the various usages.
Format is simple: FileName remainder of the line is a description of the file
https://jpsoft.com/ascii/descfile.txt
(Wayback Machine)
The descript.ion file is used extensively in the file management utility Total Commander, shareware available at www.ghisler.com. From version 7.5 of TC, it can have a length of 4096 bytes. I have been using it extensively to annotate my files without any issues. You can read about other users' experiences at the Total Commander users forum.
The answer above looks correct to me; just one addition:
from http://filext.com/file-extension/ION
The ION file type is primarily associated with '4DOS'. Note: Norton Utilities also uses 4DOS.
http://www.optimasc.com/products/fileid/4dos-descext.pdf
Collected links to 4DOS description-aware programs of all kind and 4DOS tools.
http://www.4dos.info/4tools.htm
http://drupal.org/node/289988
