CMD.exe-compatible replacement supporting longer command lines (>8191 chars) - Windows

The documentation of cmd.exe tells us there is an 8191-character limit on a cmd.exe command line. PowerShell may have the same issue (but in any case I think it is not compatible with cmd syntax).
The Windows OS technical limit is "much" higher, at 32,767 characters or so (see the CreateProcessA documentation).
Are there compatible alternative shells to cmd.exe that raise the command-line length limit above 8191 characters?
Note 1: I am not asking about a terminal emulator (GUI) problem: this is a shell problem.
Note 2: I believe this question is not a duplicate because it is focused on a precise limitation of cmd.exe. Also, I could not post my Yori answer on this or this question because they are closed.

Have a look at Yori. There is no such limit in Yori. Yori is open-source.
Yori is a CMD replacement shell that supports backquotes, job control, and improves tab completion, file matching, aliases, command history, and more. It includes a handful of native Win32 tools that implement commonly needed tasks which can be used with any shell.

You might be interested in Take Command from JP Software, whose console shell is abbreviated TCC. There is no such limit in TCC.
There is no limit to the size of a TCC command line (other than that imposed by Windows or the amount of RAM in the system).

Related

How to increase the number of file descriptors on macOS Mojave?

I can't find a way to do this. Every version of macOS uses a different approach, and Mojave is still very recent, so I can't find anything.
Programmatically, you can use getrlimit() and setrlimit() to adjust the number of file descriptors the process can open. The relevant resource identifier is RLIMIT_NOFILE.
As noted in the man page, RLIMIT_NOFILE works somewhat differently than other resources. getrlimit() might indicate that the hard limit is RLIM_INFINITY (unlimited), but the kernel actually imposes a limit of OPEN_MAX (currently 10240). So, treat that as the maximum that you can set using setrlimit().
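For example, here is a minimal C sketch of that adjustment (assuming macOS, where OPEN_MAX is available via <limits.h>):

#include <stdio.h>
#include <limits.h>         /* OPEN_MAX on macOS */
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }

    /* The hard limit may read as RLIM_INFINITY, but the kernel
       still caps the process at OPEN_MAX, so clamp the request. */
    rl.rlim_cur = (rl.rlim_max == RLIM_INFINITY || rl.rlim_max > OPEN_MAX)
                      ? OPEN_MAX : rl.rlim_max;

    if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }

    printf("soft limit for open files is now %llu\n",
           (unsigned long long)rl.rlim_cur);
    return 0;
}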
To do this for a program whose code you don't control, you can adjust the limit in a shell before launching that program from that shell. In bash and other sh-derived shells, you can use the ulimit built-in command for that. For example, ulimit -Sn 10240.

How do line endings affect coding?

Why do line endings differ from platform to platform? Why is there even a term like "line ending" in programming?
I prefer saving my code in Unix/Linux format, even if I'm on Windows. Am I missing anything by not saving it in Windows or macOS format? How do line endings affect coding?
In the early days, when teletypewriters were nearly the only way of getting output from a computer, CR and LF did different things. Unix started the tradition of using a single character to mark the end of a line, probably because it made pipelining easier; the drivers could easily convert a single LF to CR/LF if need be. Linux is mostly a Unix clone, so it keeps that convention. The others held on to the CR/LF convention for historical reasons, even though it's not strictly necessary.
Some languages, such as C, C++, and Python, let you specify the type of file when you open it, either binary or text. For text files, a translation is performed so that a single LF is converted into the line-ending convention required by the OS.
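A small C illustration of the difference (standard fopen() modes; on Unix-like systems the "b" flag is a no-op, while on Windows text mode performs the translation):

#include <stdio.h>

int main(void)
{
    /* Text mode: on Windows each '\n' written here lands on disk
       as "\r\n"; on Unix-like systems no translation happens. */
    FILE *t = fopen("text.txt", "w");
    if (t == NULL) return 1;
    fputs("one line\n", t);
    fclose(t);

    /* Binary mode: bytes pass through untranslated everywhere,
       so '\n' stays a single LF byte even on Windows. */
    FILE *b = fopen("binary.txt", "wb");
    if (b == NULL) return 1;
    fputs("one line\n", b);
    fclose(b);

    return 0;
}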
Basically, everyone wanted to be different when creating operating systems: Unixes started with LF, then VMS and DOS wanted CR/LF (like a typewriter), and of course the Mac wanted to be different, so it went for CR only.
They just wanted to make it harder to transfer between OSes, so that you 'bought' into one.
Added because of comment
It's up to the programmer: if you need to support different line endings, then you must code for them. E.g., you could create a #define for the line ending and have it change depending on compile options, as in the sketch below.
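A minimal sketch of that idea (the USE_CRLF macro name is just an illustration):

#include <stdio.h>

/* Choose the line ending at compile time, e.g. build with
   cc -DUSE_CRLF file.c to get Windows-style endings. */
#ifdef USE_CRLF
#define LINE_ENDING "\r\n"
#else
#define LINE_ENDING "\n"
#endif

int main(void)
{
    /* Open in binary mode so the C runtime doesn't layer its own
       translation on top of our explicit choice. */
    FILE *f = fopen("out.txt", "wb");
    if (f == NULL) return 1;
    fprintf(f, "first line%s", LINE_ENDING);
    fclose(f);
    return 0;
}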

Is there a minimally POSIX.2 compliant shell?

Is there a minimally POSIX.2 compliant shell (let's call it mpcsh) in the following sense:
if mpcsh myscript.sh behaves correctly on my (compliant) system then xsh myscript.sh will behave identically for any POSIX.2 compliant shell xsh on any compliant system. ("Identically" up to less relevant things like the wording of error messages etc.)
Does dash qualify?
If not, is there any way to verify compliance of myscript.sh?
Edit (9 years later):
The accepted answer still stands, but have a look at this blog post and the checkbashisms command (source). Avoiding bashisms is not the same as writing a POSIX.2 compliant shell script, but it comes close.
The sad answer in advance
It won't help you (at least not as much or as reliably as you would expect and want it to).
Here is why.
One big problem that cannot be addressed by a hypothetical "POSIX shell" is behavior that is ambiguously worded or simply not addressed in the standard, so that shells may implement things in different ways while still adhering to it.
Take these two examples regarding pipelines, the first of which is well known:
Example 1 - scoping
$ ksh -c 'printf "foo" | read s; echo "[${s}]"'
[foo]
$ bash -c 'printf "foo" | read s; echo "[${s}]"'
[]
ksh executes the last command of a pipe in the current shell, whereas bash executes all - including the last command - in a subshell. bash 4 introduced the lastpipe option which makes it behave like ksh:
$ bash -c 'shopt -s lastpipe; printf "foo" | read s; echo "[${s}]"'
[foo]
All of this is (debatably) according to the standard:
Additionally, each command of a multi-command pipeline is in a subshell environment; as an extension, however, any or all commands in a pipeline may be executed in the current environment.
I am not 100% certain what they meant by "extension", but based on other examples in the document it does not mean that the shell has to provide a way to switch between behaviors, but simply that it may, if it wishes, implement things in this "extended" way. Other people read this differently and argue that the ksh behavior is non-standards-compliant, and I can see why. Not only is the wording unfortunate, it is not a good idea to allow this in the first place.
In practice it doesn't really matter which behavior is correct, since these are the two big shells, and people assume that if you avoid their extensions and write only supposedly POSIX-compliant code, it will work in either. But the truth is that if you rely on one or the other behavior above, your script can break in horrible ways.
Example 2 - redirection
This one I learnt about just a couple of days ago, see my answer here:
foo | bar 2>./qux | quux
Common sense and POLA (the principle of least astonishment) tell me that when the next line of code is hit, both quux and bar should have finished running, meaning that the file ./qux is fully populated. Right? No.
POSIX states that
If the pipeline is not in the background (see Asynchronous Lists), the shell shall wait for the last command specified in the pipeline to complete, and may also wait for all commands to complete.
May (!) wait for all commands to complete! WTH!
bash waits:
The shell waits for all commands in the pipeline to terminate before returning a value.
but ksh doesn't:
Each command, except possibly the last, is run as a separate process; the shell waits for the last command to terminate.
So if you use redirection in the middle of a pipeline, make sure you know what you are doing, since this is treated differently across shells and can horribly break in edge cases, depending on your code.
I could give another example not related to pipelines, but I hope these two suffice.
Conclusion
Having a standard is good, continuously revising it is even better, and adhering to it is great. But if the standard fails due to ambiguity or permissiveness, things can still unexpectedly break, practically rendering the usefulness of the standard void.
What this means in practice is that on top of writing "POSIX-compliant" code you still need to think and know what you are doing to prevent certain things from happening.
All that being said, one shell which has not yet been mentioned is posh, which is supposedly POSIX plus even fewer extensions than dash has (primarily echo -n and the local keyword), according to its manpage:
BUGS
Any bugs in posh should be reported via the Debian BTS.
Legitimate bugs are inconsistencies between manpage and behavior,
and inconsistencies between behavior and Debian policy
(currently SUSv3 compliance with the following exceptions:
echo -n, binary -a and -o to test, local scoping).
YMMV.
Probably the closest thing to a canonical shell is ash which is maintained by The NetBSD Foundation, among other organizations.
A downstream variant of this shell called dash is better known.
Currently, there is no single role model for the POSIX shell.
Since the original Bourne shell, the POSIX shell has adopted a number of additional features.
All of the shells that I know that implement those features also have extensions that go beyond the feature set of the POSIX shell.
For instance, POSIX allows for arithmetic expressions in the format:
var=$(( expression ))
but it does not allow the equivalent:
(( var = expression ))
supported by bash and ksh93.
I know that bash has a set -o posix option, but that will not disable any extensions.
$ set -o posix
$ (( a = 1 + 1 ))
$ echo $a
2
To the best of my knowledge, ksh93 tries to conform to POSIX out of the box, but still allows extensions.
The POSIX developers spent years (not an exaggeration) wrestling with the question: "What does it mean for an application program to conform to the standard?" While the POSIX developers were able to define a conformance test suite for an implementation of the standards (POSIX.1 and POSIX.2), and could define the notion of a "strictly conforming application" as one which used no interface beyond the mandatory elements of the standard, they were unable to define a testing regime that would confirm that a particular application program was "strictly conforming" to POSIX.1, or that a shell script was "strictly conforming" to POSIX.2.
The original question seeks just that; a conformance test that verifies a script uses only elements of the standard which are fully specified. Alas, the standard is full of "weasel words" that loosen definitions of behavior, making such a test effectively impossible for a script of any significant level of usefulness. (This is true even setting aside the fact that shell scripts can generate and execute shell scripts, thus rendering the question of "strictly conforming" equivalent to the Halting Problem.)
(Full disclosure: I was a working member and committee leader within IEEE-CS TCOS, the creators of the POSIX family of standards, from 1988-1999.)
If not, is there any way to verify compliance of myscript.sh?
This is basically a case of Quality Assurance. Start with:
code review
unit tests (yes, I've done this)
functional tests
perform the test suite with as many different shell programs as you can find (ash, bash, dash, ksh93, mksh, zsh).
Personally, I aim for the common set of extensions supported by both bash and ksh93. They're the oldest and most widely available interpreters of the shell language.
EDIT: Recently I happened upon rylnd/shpec, a testing framework for your shell code. You can describe features of your code in test cases and specify how they can be verified.
Disclosure: I helped make it work across bash, ksh, and dash.

Windows equivalent of ulimit -n

What is the Windows equivalent of the Unix command ulimit -n?
Basically, I want to set the maximum fd limit via the command prompt.
I don't believe that current Windows OSes have a limit on the total number of file descriptors, but the MS C runtime library (msvcrt.dll) has a per-process limit of 2048, albeit, as far as I know, one not enforced by the OS.
It can allegedly be increased only by building your own version of the MS runtime library from source.
Hmm... I may have been wrong before: _setmaxstdio (see here) can raise the limit, but it is per-process, not system-wide.
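A minimal C sketch of that call (Microsoft CRT only; the value 4096 is just an illustration):

/* _setmaxstdio() raises the per-process limit on simultaneously
   open stdio streams; it does not change any system-wide limit. */
#include <stdio.h>

int main(void)
{
    printf("current stdio stream limit: %d\n", _getmaxstdio());

    /* Returns the new maximum on success, or -1 on failure.
       2048 is the default mentioned above; newer CRTs accept
       requests up to 8192. */
    if (_setmaxstdio(4096) == -1) {
        fputs("_setmaxstdio failed\n", stderr);
        return 1;
    }

    printf("new stdio stream limit: %d\n", _getmaxstdio());
    return 0;
}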
I may be wrong, but I didn't think there was a limit to set in Windows... but unless you can say how this relates to programming, I expect this question will be closed soon.
If you are in the "IT Pro" area (rather than development), then there is a sister-site, serverfault.com - coming soon for this type of question.

Text editor to open big (giant, huge, large) text files [closed]

I mean 100+ MB big; such text files can push the envelope of editors.
I need to look through a large XML file, but cannot if the editor is buggy.
Any suggestions?
Free read-only viewers:
Large Text File Viewer (Windows) – Fully customizable theming (colors, fonts, word wrap, tab size). Supports horizontal and vertical split view. Also supports file following and regex search. Very fast and simple, with a small executable size.
klogg (Windows, macOS, Linux) – A maintained fork of glogg. Its main feature is regular expression search. It supports monitoring file changes (like tail), bookmarks, highlighting patterns using different colors, and has serious optimizations built in. But from a UI standpoint, it's rather minimal.
LogExpert (Windows) – "A GUI replacement for tail." It's really a log file analyzer, not a large file viewer, and in one test it required 10 seconds and 700 MB of RAM to load a 250 MB file. But its killer features are the columnizer (parse logs that are in CSV, JSONL, etc. and display in a spreadsheet format) and the highlighter (show lines with certain words in certain colors). Also supports file following, tabs, multifiles, bookmarks, search, plugins, and external tools.
Lister (Windows) – Very small and minimalist. It's one executable, barely 500 KB, but it still supports searching (with regexes), printing, a hex editor mode, and settings.
Free editors:
Your regular editor or IDE. Modern editors can handle surprisingly large files. In particular, Vim (Windows, macOS, Linux), Emacs (Windows, macOS, Linux), Notepad++ (Windows), Sublime Text (Windows, macOS, Linux), and VS Code (Windows, macOS, Linux) support large (~4 GB) files, assuming you have the RAM.
Large File Editor (Windows) – Opens and edits TB+ files, supports Unicode, uses little memory, has XML-specific features, and includes a binary mode.
GigaEdit (Windows) – Supports searching, character statistics, and font customization. But it's buggy – with large files, it only allows overwriting characters, not inserting them; it doesn't respect LF as a line terminator, only CRLF; and it's slow.
Builtin programs (no installation required):
less (macOS, Linux) – The traditional Unix command-line pager tool. Lets you view text files of practically any size. Can be installed on Windows, too.
Notepad (Windows) – Decent with large files, especially with word wrap turned off.
MORE (Windows) – This refers to the Windows MORE, not the Unix more. A console program that allows you to view a file, one screen at a time.
Web viewers:
readfileonline.com – An HTML5 large file viewer. Supports search.
Paid editors/viewers:
010 Editor (Windows, macOS, Linux) – Opens giant (as large as 50 GB) files.
SlickEdit (Windows, macOS, Linux) – Opens large files.
UltraEdit (Windows, macOS, Linux) – Opens files of more than 6 GB, but the configuration must be changed for this to be practical: Menu » Advanced » Configuration » File Handling » Temporary Files » Open file without temp file...
EmEditor (Windows) – Handles very large text files nicely (officially up to 248 GB, but as much as 900 GB according to one report).
BssEditor (Windows) – Handles large files and very long lines. Doesn't require installation. Free for non-commercial use.
loxx (Windows) – Supports file following, highlighting, line numbers, huge files, regex, multiple files and views, and much more. The free version cannot process regexes, filter files, synchronize timestamps, or save changed files.
Tips and tricks
less
Why are you using editors to just look at a (large) file?
Under *nix or Cygwin, just use less. (There is a famous saying – "less is more, more or less" – because "less" replaced the earlier Unix command "more", with the addition that you could scroll back up.) Searching and navigating under less is very similar to Vim, but there is no swap file and little RAM is used.
There is a Win32 port of GNU less. See the "less" section of the answer above.
Perl
Perl is good for quick scripts, and its .. (range flip-flop) operator makes for a nice selection mechanism to limit the crud you have to wade through.
For example:
$ perl -n -e 'print if ( 1000000 .. 2000000)' humongo.txt | less
This will extract everything from line 1 million to line 2 million, and allow you to sift the output manually in less.
Another example:
$ perl -n -e 'print if ( /regex one/ .. /regex two/)' humongo.txt | less
This starts printing when "regular expression one" finds something, and stops when "regular expression two" finds the end of an interesting block. It may find multiple blocks. Sift the output...
logparser
This is another useful tool you can use. To quote the Wikipedia article:
logparser is a flexible command line utility that was initially written by Gabriele Giuseppini, a Microsoft employee, to automate tests for IIS logging. It was intended for use with the Windows operating system, and was included with the IIS 6.0 Resource Kit Tools. The default behavior of logparser works like a "data processing pipeline", by taking an SQL expression on the command line, and outputting the lines containing matches for the SQL expression.
Microsoft describes Logparser as a powerful, versatile tool that provides universal query access to text-based data such as log files, XML files and CSV files, as well as key data sources on the Windows operating system such as the Event Log, the Registry, the file system, and Active Directory. The results of the input query can be custom-formatted in text based output, or they can be persisted to more specialty targets like SQL, SYSLOG, or a chart.
Example usage:
C:\>logparser.exe -i:textline -o:tsv "select Index, Text from 'c:\path\to\file.log' where Index > 1000 and Index < 2000"
C:\>logparser.exe -i:textline -o:tsv "select Index, Text from 'c:\path\to\file.log' where Text like '%pattern%'"
The relativity of sizes
100 MB isn't too big. 3 GB is getting kind of big. I used to work at a print & mail facility that created about 2% of U.S. first class mail. One of the systems for which I was the tech lead accounted for 15+% of the pieces of mail. We had some big files to debug here and there.
And more...
Feel free to add more tools and information here. This answer is community wiki for a reason! We all need more advice on dealing with large amounts of data...
