I'm making some command-line tools for some research I'm doing. I'd like these tools to follow commonly used conventions regarding command line programs in Unix.
Should I use flags or just list parameters?
program one two three
program -a one -b two -c three
Where in the list of commands does the input file normally go, or is it better to < it into the program?
What about the output filename?
Should I specify the file extension for the output format, or have my program automatically put the correct extension on?
When the user enters an invalid command, is there a prototypical "correct usage" message?
Is "--help" or "-h" required?
Also, is there some sort of header file I can include that would help with managing these?
If you're looking for a "standard", then you could do worse than look at GNU's Standards for Command Line Interfaces. Other standards are available.
As far as coding for this goes, take a look at boost::program_options. Not only will this save you rolling a lot of your own code, but it also does a good job of formatting the options for presentation to the user (the prototypical "correct usage" message you asked for).
In answer to your specific questions:
Where in the list of commands does the input file normally go, or is it better to < it into the program?
I would expect these to come at the end of a command line. Like in GNU grep. If you are only processing one file and would like to make stdin available as an input source, that would not surprise most users.
If your command processes lots of files, then it would be unusual to have to specify a switch before the filenames. Think cat.
What about the output filename?
A -o or --output option is fairly common. If your program takes exactly one input and one output, then program inputfile outputfile would not surprise many users. If no output file is specified, perhaps you'll output to stdout; that would not be unusual behaviour and would allow your users to pipe the output through other commands (such as grep or less). They could also redirect stdout to a file using >.
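As a rough illustration of the layout described above (options first, input files last, stdout when no output is named), here is a minimal sketch using Python's standard argparse module. The program name "program", the metavar FILE and the choice of "-" as the default are assumptions made for the example, not something the question specifies.

```python
import argparse


def build_parser():
    # "program" and the option names are illustrative placeholders.
    parser = argparse.ArgumentParser(
        prog="program", description="Process one or more input files."
    )
    # Output is an option; "-" stands for stdout when nothing is given.
    parser.add_argument(
        "-o", "--output", default="-", metavar="FILE",
        help="write results to FILE (default: '-', meaning stdout)",
    )
    # Input files are plain positional arguments at the end of the line.
    parser.add_argument(
        "files", nargs="*", default=["-"], metavar="FILE",
        help="input files (default: '-', meaning stdin)",
    )
    return parser
```

Calling build_parser().parse_args(["-o", "out.txt", "a.txt", "b.txt"]) gives output "out.txt" and files ["a.txt", "b.txt"]; with an empty argument list both fall back to "-".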
Should I specify the file extension for the output format, or have my program automatically put the correct extension on?
This is probably a matter for debate. If I specified an output filename, I would expect to find that file created (or replaced, after a prompt) without the program changing the name.
When the user enters an invalid command, is there a prototypical "correct usage" message?
Using GNU grep as an example again:
grep: unrecognized option '--incorrect'
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
This wouldn't surprise too many users and points them in the right direction if they've made a typo without swamping them with information.
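Most option-parsing libraries produce a message of this shape for you. The sketch below shows that with Python's argparse, which on an unknown option prints a usage line plus a short error to stderr and exits with status 2. The helper name try_parse and the program name "program" are made up for the example; the helper only captures what argparse would normally print so it can be inspected.

```python
import argparse
import contextlib
import io


def try_parse(argv):
    """Return (args, exit_status, stderr_text) for the given argument list."""
    parser = argparse.ArgumentParser(prog="program")
    parser.add_argument("-o", "--output", metavar="FILE")
    parser.add_argument("files", nargs="*", metavar="FILE")
    captured = io.StringIO()
    try:
        # argparse writes its usage/error text to sys.stderr, then exits.
        with contextlib.redirect_stderr(captured):
            return parser.parse_args(argv), 0, ""
    except SystemExit as exc:
        return None, exc.code, captured.getvalue()
```

For try_parse(["--incorrect"]) the captured text consists of a usage line followed by an "unrecognized arguments" error naming --incorrect, which is much the same idea as the grep output above.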
Is "--help" or "-h" required?
That depends on your customer! I find it frustrating when this option isn't available.
Generally speaking, flags are for providing options and parameters are for passing information. If you take the input and output files as command-line arguments, use flags like -i and -o so that their order does not matter. -h is required if you want to (and need to) provide documentation.
I'm trying to create an overly simplified version of bash. I've tried to split the program into "lexer + expander, parser, executor".
In the lexer I store my data (commands, flags, files) and create tokens out of them. My procedure is simply to loop through the given input char by char and use a state machine to handle states; a state is either a special character, an alphanumeric character or a space.
Now when I'm in an alphanumeric state I'm at a command. I know where the next flag is when I encounter an alphanumeric state again or when input[i] == '-'. The problem is with multi-flag commands.
For example:
$ ls -la | grep "*.c"
I successfully get the commands ls and grep, and the flags -la and *.c.
However, with multi-flag commands like:
$ sed -i "*.bak" "s/a/b/g" file1 file2
This seems very difficult to me, and I can't yet figure out how to know where the flags to a specific command end. So my question is: how does bash parse these multi-flag commands? Any suggestions regarding my problem would be appreciated!
The shell does not attempt to parse command arguments; that's the responsibility of the utility. The range of possible command argument syntaxes, both in use and potentially useful, is far too great to attempt that.
On Unix-like systems, the shell identifies individual arguments from the command line, mostly by splitting at whitespace but also taking into account the use of quotes and a variety of other transformations, such as "glob expansion". It then makes a vector of these arguments ("argv") and passes the vector to execve, which hands them to the newly created process.
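To make the splitting step concrete, here is a small sketch using Python's standard shlex module, which splits a string using POSIX-shell-like quoting rules. It covers only word splitting and quote removal for one simple command; it does not do glob expansion, pipes or redirection, and the helper name to_argv is made up for the example.

```python
import shlex


def to_argv(command_line):
    """Split one simple command into an argument vector, honouring quotes."""
    return shlex.split(command_line)


# The sed example from the question becomes six plain strings. Nothing in
# the result records which of them are "flags"; deciding that is left to
# the program that receives the vector.
argv = to_argv('sed -i "*.bak" "s/a/b/g" file1 file2')
print(argv)
```

The result is ['sed', '-i', '*.bak', 's/a/b/g', 'file1', 'file2']: the quotes are gone and every word is just an element of the vector.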
On Windows systems, the shell doesn't even do that. It just hands over the command-line as a string, and leaves it to the command-line tool to do everything. (In order to provide a modicum of compatibility, there's an intermediate layer which is called by the application initialization code, which eventually calls main(). This does some basic argument-splitting, although its quoting algorithm is quite a bit simplified from that used by a Unix shell.)
No command-line shell that I know of attempts to identify command-line flags. And neither should you.
For a bit of extracurricular reading, here's the description of shell parsing from the Posix standard: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html. Trying to implement all that goes far beyond the requirements given to you for this assignment, and I'm certainly not recommending that you do that. But it might still be interesting, and understanding it will help you immensely if you start using a shell.
Alternatively, you could try reading the Bash manual, which might be easier to understand. Note that Bash implements a lot of extensions to the Posix standard.
Examples:
Create an ISO image and burn it directly to a CD.
mkisofs -V Photos -r /home/vivek/photos | cdrecord -v dev=/dev/dvdrw -
Change to the previous directory.
cd -
Listen on port 12345 and untar data sent to it.
nc -l -p 12345 | tar xvzf -
What is the purpose of the dash and how do I use it?
If you mean the naked - at the end of the tar command, that's common on many commands that want to use a file.
It allows you to specify standard input or output rather than an actual file name.
That's the case for your first and third example. For example, the cdrecord command is taking standard input (the ISO image stream produced by mkisofs) and writing it directly to /dev/dvdrw.
With the cd command, every time you change directory, it stores the directory you came from. If you do cd with the special - "directory name", it uses that remembered directory instead of a real one. You can easily switch between two directories quite quickly by using that.
Other commands may treat - as a different special value.
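If you want your own tool to follow the same convention, the check is done by the program itself, not by the shell. A minimal sketch in Python, assuming a hypothetical helper called open_input that treats the file name "-" as standard input:

```python
import contextlib
import sys


@contextlib.contextmanager
def open_input(name):
    """Yield a readable text stream; the name '-' means standard input."""
    if name == "-":
        # Hand back the process-wide stdin and leave it open afterwards.
        yield sys.stdin
    else:
        with open(name, "r") as handle:
            yield handle
```

A tool would then write something like "with open_input(path) as stream:" for each file name it was given, and a user could pass - to make it read from a pipe instead.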
It's not magic. Some commands interpret - as the user wanting to read from stdin or write to stdout; there is nothing special about it to the shell.
- means exactly what each command wants it to mean. There are several common conventions, and you've seen examples of most of them in other answers, but none of them are 100% universal.
There is nothing magic about the - character as far as the shell is concerned (except that the shell itself, and some of its built-in commands like cd and echo, use it in conventional ways). Some characters, like \, ', and ", are "magical", having special meanings wherever they appear. These are "shell metacharacters". - is not like that.
To see how a given command uses -, read the documentation for that command.
It means to use the program's standard input stream.
In the case of cd, it means something different: change to the prior working directory.
The magic is in the convention. For decades, people have used '-' to distinguish options from arguments, and have used '-' in a filename to mean either stdin or stdout, as appropriate. Do not underestimate the power of convention!
I know what they are, but I dunno when I should use them. Are they useful? I think so, but I'd like you to tell me in which situations a file descriptor could be useful. Thanks :D
The most obvious case which springs to mind is:
myProgram >myProgram.output_and_error 2>&1
which sends both standard output and error to the same file.
I've also used:
myProgram 2>&1 | less
which will allow me to page through the output and error in sequence (rather than having the errors go to the terminal in "arbitrary" places in the output).
Basically, any time when you need to get at an already existing file descriptor, you'll find yourself using this.
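The same merging can be done when you start a process from a program rather than from the shell. As a sketch, Python's subprocess module accepts stderr=subprocess.STDOUT, which points the child's file descriptor 2 at wherever its descriptor 1 goes, much like 2>&1. The helper name run_merged is made up for the example.

```python
import subprocess
import sys


def run_merged(argv):
    """Run argv and return its stdout and stderr as one combined string."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same idea as the shell's 2>&1
        text=True,
        check=False,
    )
    return result.stdout


# A stand-in for myProgram: a child Python that writes to both streams.
child = [
    sys.executable, "-c",
    "import sys; print('out', flush=True); print('err', file=sys.stderr, flush=True)",
]
print(run_merged(child))
```

Both the 'out' and the 'err' line come back through the single pipe, which is what makes paging or saving them together possible.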