I have designed a few programs that have a CLI and want to document them in as standard a way as possible. Are there any agreed-upon conventions for the best way to do this?
An example:
Let's say the program is "sayHello" and it takes a few parameters: name and message. A standard call would look like this:
> sayHello "Bob" "You look great"
Okay, so my command usage would look something like this:
sayHello [name] [message]
That may already be a mistake if brackets have a specific meaning in usage strings. But let's go a step further and say "message" is optional:
sayHello [name] [message (optional)]
And then just one more wrinkle, what if there is a default we want to denote:
sayHello [name] [message (optional: default 'you look good')]
I realize this usage statement looks a little obtuse at this point. I'm really asking whether there are somewhat agreed-upon standards for how to write these. I have a sneaking suspicion that the parentheses and brackets all have specific meanings.
While I am unaware of any official standard, there are some efforts to provide conventions-by-framework. Docopt is one such framework, and may suit your needs here. In their own words:
docopt helps you:
define interface for your command-line app, and
automatically generate parser for it.
There are implementations for many programming languages, including shell.
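Under docopt's conventions, angle brackets mark required positional arguments and square brackets mark optional parts, so the sayHello example might be written as:

```text
Usage:
  sayHello <name> [<message>]
```

Defaults are typically documented in the description text rather than encoded in the usage line itself.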
You might want to look at the manuals for common Unix commands (e.g. man grep) or the help documentation for Windows commands (e.g. find /?) and use them as a general guide. If you picked either of those patterns (or used elements common to both), you'd at least surprise the fewest people.
Apache Commons also has some classes in the commons-cli package that will print usage information for your particular set of command-line options (note that newer versions of commons-cli deprecate OptionBuilder in favor of Option.builder):
Options options = new Options();
options.addOption(OptionBuilder.withLongOpt("file")
        .withDescription("The file to be processed")
        .hasArg()
        .withArgName("FILE")
        .isRequired()
        .create('f'));
options.addOption(OptionBuilder.withLongOpt("version")
        .withDescription("Print the version of the application")
        .create('v'));
options.addOption(OptionBuilder.withLongOpt("help").create('h'));
String header = "Do something useful with an input file\n\n";
String footer = "\nPlease report issues at http://example.com/issues";
HelpFormatter formatter = new HelpFormatter();
formatter.printHelp("myapp", header, options, footer, true);
Using the above will generate help output that looks like:
usage: myapp -f [-h] [-v]
Do something useful with an input file
-f,--file <FILE> The file to be processed
-h,--help
-v,--version Print the version of the application
Please report issues at http://example.com/issues
We have many PDF files. They are all unlocked and contain text, pictures, etc. Every time, we have to open each file in Adobe and do it manually. I was thinking maybe there is a better way to do this with PowerShell. If not, we have to do over 1,000 files by hand, and more are coming. Thank you for your answers.
Peggy
After looking into it a bit more, I discovered a command-line tool that you can use in tandem with PowerShell. It's called tesseract. For Windows and Linux, download the prebuilt binaries. For macOS, you need to use MacPorts or Homebrew.
You'll want to do something like this:
# Using Get-ChildItem's -Include parameter to filter file types
# requires the target path to end in an asterisk. Using just an
# asterisk as the path makes it target the current directory.
foreach ($pdf in (Get-ChildItem * -Include *.pdf))
{
    # An array isn't needed; it's just good for arranging arguments
    tesseract @(
        #INPUT:
        $pdf
        #OUTPUT:
        "$($pdf.Directory)\{OCR} $($pdf.Name)"
        #LANGUAGE:
        '-l','eng'
    )
    # The directory is included in the output path so that you can
    # change Get-ChildItem's target without adjusting the argument
}
Or, without the fluff:
foreach ($pdf in (Get-ChildItem * -Include *.pdf))
{
    tesseract $pdf "$($pdf.Directory)\{OCR} $($pdf.Name)" -l eng
}
Granted, I haven't actually tested tesseract out, but I did read other Q&A pages to derive the appropriate command. Let me know if there are any issues.
Your question is a bit unclear. There is a way to OCR images using PowerShell, such as with this function, and you can convert PDFs to images using this function (it does require ImageMagick, which is available here; there are portable options if you don't want to install anything). This would effectively allow you to search PDF files that haven't been OCR'd.
However, in terms of directly editing the PDF files with PowerShell to make them into OCR'd PDFs, while PowerShell functionality might help you automate the process, you would first need to find a program that can do that sort of thing from the command line. The PDFs would also have to all be unlocked so that editing them would even be possible (though there are ways to circumvent PDF locks to unlock them).
Unfortunately, I don't really know of any programs that can do that. Maybe it's possible with some advanced Ghostscript parameters, but I haven't looked into it. It is certainly not going to be easy!
I've got an idea, but its implications scare me. Perhaps you, dear reader, can help. :)
The Setup
I've created a Ruby-based CLI app that allows user configuration via a YAML file. In that file, there is a scenario where the user can define pre and post "actions" that display a message (with some arbitrary, non-relevant code in-between). For example:
actions:
  - action
    # ...other keys...
    pre:
      message: 'This is the pre message'
      action: puts 'PRE COMMAND'
    post:
      message: 'This is the post message'
      action: puts 'POST COMMAND'
In this case, my app would output This is the pre message, evaluate the "pre" action (thus outputting PRE COMMAND), do some irrelevant stuff, output This is the post message, and finally evaluate the "post" action (thus outputting POST COMMAND).
The Problem
You can already guess the problem; it appeared when I used the word "evaluate". That's a scary thing. Even though this is a locally-run, client-centric app, the idea of eval'ing random Ruby is terrifying.
Solution Idea #1
The first idea was just that: eval the actions. I quickly destroyed it (unless one of you knows-more-Ruby-than-me types can convince me otherwise).
Solution Idea #2
Do some "checking" (via Regexp, perhaps) to validate that the command is somehow "valid". That seems wildly large and difficult to contain.
Solution Idea #3
Another idea was to wrap acceptable commands in data structures of my own (thus limiting the possibilities that a user could define). For instance, I might create an open_url action that safely validates and opens a URL in the default browser.
I like this idea, but it seems rather limiting; I'd have to define a zillion wrappers over time, it seems like. But perhaps that's the price you pay for safety?
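A minimal sketch of that idea, assuming a hypothetical registry of safe actions (the `open_url` handler here just validates and returns the URL; actually launching a browser is left out):

```ruby
require 'uri'

# Registry of permitted actions. User config can only name one of
# these keys; nothing from the YAML file is ever eval'd.
ACTIONS = {
  'print'    => ->(arg) { puts arg },
  'open_url' => lambda { |arg|
    uri = URI.parse(arg)
    raise ArgumentError, 'refusing non-http(s) URL' unless uri.is_a?(URI::HTTP)
    # A real implementation would hand the URI to the OS browser here.
    uri.to_s
  }
}.freeze

def run_action(name, arg)
  handler = ACTIONS.fetch(name) { raise ArgumentError, "unknown action: #{name}" }
  handler.call(arg)
end
```

Anything not in the registry raises instead of being evaluated, which keeps the attack surface down to the handlers you explicitly wrote.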
Your Turn
I appreciate any additional thoughts you have!
You'd probably be a lot better off writing a simple framework that allows for Ruby plugins than to glue together something out of YAML and snippets of code.
You're right that "eval" is terrifying, and it should be, but sometimes it's the most elegant solution out of all possible inelegant solutions. I'd argue that this time is not one of those cases.
It's not at all hard to write a very simple DSL in Ruby where you can express your configuration in code:
action.pre.message = 'This is the pre message'
action.pre.command do
puts "PRE COMMAND"
end
All this depends on is having a number of pre-defined structures that have methods like message= taking a string as an argument or command taking a block. If you want to get fancy you can write some method_missing handlers and make up things as you go along, allowing for maximum flexibility.
You can see many examples of this, from your Rakefile to capistrano, and it usually works out a lot better than having a non-Ruby configuration file format with Ruby in it.
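To make the shape of those pre-defined structures concrete, here is a minimal sketch (class and method names are illustrative, not from any particular library):

```ruby
# A stage pairs a message with an optional command block.
class Stage
  attr_accessor :message

  # With a block: store it. Without: return the stored block.
  def command(&block)
    block ? (@command = block) : @command
  end

  def run
    puts message if message
    @command.call if @command
  end
end

# An action exposes .pre and .post stages, matching the DSL above.
class Action
  def pre
    @pre ||= Stage.new
  end

  def post
    @post ||= Stage.new
  end
end

action = Action.new
action.pre.message = 'This is the pre message'
action.pre.command { puts 'PRE COMMAND' }
```

The configuration file is then just Ruby loaded by your app, so there is no second language embedded in YAML to evaluate.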
I'm brand new to Ruby and programming. I'd like to create a little program to automate one of my more tedious work tasks that I'm currently doing by hand but I'm not sure where to start.
People register to take courses through an online form, and I receive their registration information at the end of each day as a CSV document. I go line by line through that document and generate a confirmation email to send to them based on their input on the online form: the course they'd like to take, their room preference, how much they chose to pay for the course (sliding scale), etc. The email ends up looking something like this:
Dear So and so, Thank you for signing up for "Such-and-such An Awesome Course," with Professor Superdude. The course starts on Monday, September 1, 2030 at 4pm and ends on Thursday at 1pm. You paid such-and-such an amount...
et cetera. So ideally the program would take in the CSV document with information like "student name," "course title," "fee paid," and generate emails based on blocks of text ("Dear , Thank you for signing up for _,") and variables (the dates of the course) that are stored externally so they are easy to edit without going into the source code (maybe as CSV and plain text files).
Additionally, I need the output to be in rich text, so I can bold and underline certain things. I'm familiar with Markdown so I could use that in the source code but it would be ideal if the output could be rich text.
I'm not expecting anyone to write a program for me, but if you could let me know what I should look into or even what I should Google, that would be very helpful.
I assume you're trying to put together an email. If so, I'd probably start with a simple ERB template. If you want to generate HTML, you can write one HTML template and one plain text template; variable substitution works the same way for both, with the exception that you'll need to HTML-escape anything that contains characters that HTML considers special (ampersands, greater than, less than, for example). See ERB Documentation here.
If you're trying to parse CSV, use FasterCSV or a similar library. FasterCSV is documented here.
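As a small sketch combining the two (note that FasterCSV was merged into Ruby's standard library as `csv` in 1.9, so `require 'csv'` gives you the same API; the field names here are made up for illustration):

```ruby
require 'csv'
require 'erb'

# Daily registration export; in practice this would be read from a file.
registrations = CSV.parse(<<~DATA, headers: true)
  name,course,fee
  Bob,Such-and-such An Awesome Course,150
DATA

# The template could live in an external text file so it is
# editable without touching the source code.
template = ERB.new(<<~TEXT)
  Dear <%= row['name'] %>,
  Thank you for signing up for "<%= row['course'] %>".
  You paid $<%= row['fee'] %>.
TEXT

emails = registrations.map { |row| template.result(binding) }
```

Each CSV row becomes one rendered email body, ready to hand off to whichever mail library you choose.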
If you want to send an email, you can use ActionMailer, the mail gem, or the pony gem. ActionMailer is part of rails, but can be used independently. Pony is a good facade for creating email, as well; both ActionMailer and Pony depend on the "mail" gem, so unless you want to spend more time thinking about how email formats work, use one of those.
If you're not trying to send an email, and instead are trying to create a formatted document, you can still use ERB, but use it to generate output in TeX or, if you're more adventurous than I am, a Word-compatible XML document. Alternatively, if you're wedded to Microsoft Word or RTF, you might try Ruby RTF (http://ruby-rtf.rubyforge.org/) or use COM/OLE interop to talk to Word, but I would only do that if I really had to; if I had to go that route, I'd probably suck it up and just use the built-in mail merge feature in Word, perhaps with a little VBA code.
I would like to understand the search term of a user. Think of someone searching for "staples in NY": I would like to understand that it's a location search where the keyword is "staples" and the location is New York. Similarly, if someone types "cat in hat", the parser should not flag that as a location search; here the entire keyword is "cat in hat".
Is there any algorithm or open-source library available to parse a search term and understand whether it's a comparison (like A vs B) or a location-based search (like A in X)?
The problem you describe is called information extraction. A host of algorithms exist, the simplest being regexp matching and the best being structured machine learning. Try regexps first, and look at something like NLTK if you know Python.
Distinguishing "staples in NY" from "cat in hat" is possible if your program knows that "NY" is a location. You can tell either by the capitals or because "NY" occurs in a list called a gazetteer.
The problem in general is AI-complete, so expect to put in lots of hard work if you want good results.
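The gazetteer idea can be sketched in a few lines (the location list here is a made-up stand-in for a real gazetteer such as GeoNames):

```ruby
# Tiny hand-made gazetteer; case-sensitive for simplicity.
GAZETTEER = ['NY', 'New York', 'London'].freeze

# Treat "X in Y" as a location search only when Y is a known place;
# otherwise the whole phrase is the keyword.
def parse_query(text)
  m = text.match(/\A(.+)\s+in\s+(.+)\z/)
  if m && GAZETTEER.include?(m[2])
    { type: :location, keyword: m[1], location: m[2] }
  else
    { type: :plain, keyword: text }
  end
end
```

This correctly splits "staples in NY" while leaving "cat in hat" intact, at the cost of only recognizing places that appear in the list.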
You could write such linguistic rules in grammar frameworks such as GATE or graph-expression (http://code.google.com/p/graph-expression/).
Examples:
Token+ in (LocationLookup).
Not too sure, but here are two approaches from my experience with parsing:
Define a grammar which can parse the expression and collect values/parameters. You might want to come up with a dictionary of keywords with which you can then deduce the type of search.
Be strict when defining your grammar so that the expression itself tells you the type of search,
e.g. LOC: A in B, VALUE: $ to Euro, etc.
For parsers, see ANTLR / JCup & JFlex.
First of all, great praise goes out to PowerGREP. It's a great program.
But it's not free. Some of the features I'm looking for:
Being able to use .NET regexps (or similar) to find things in a filtered list of files through subdirectories.
Replacing that stuff with other regexps.
Being able to jump to that part of the file in some sort of editor.
Non commandline.
Being able to copy the results / filename and occurrences of the text.
Low overhead would also be nice, so not too many dependencies, etc.
And I need it on Windows.
I would suggest trying the new dnGrep. It's a .NET application that provides grep-like functionality and has almost all the features you specified.
Here are its main features:
Shell integration (ability to search from Windows Explorer)
Plain text/regex/XPath search (including case-insensitive search)
Phonetic search (using Bitap and Needleman-Wunsch algorithms)
File move/copy/delete actions
Search inside archives (via plug-ins)
Search Microsoft Word documents (via plug-ins)
Search PDF documents (via plug-ins)
Undo functionality
Optional integration with a text editor (like Notepad++)
Bookmarks (ability to save regex searches for the future)
Pattern test form
Search result highlighting
Search result preview
Does not require installation (can be run from a USB drive)
Feature-wise nothing even comes close to PowerGREP, so the question is, how many compromises are you willing to make? I agree that PowerGREP's price tag is a bit steep (not that I have ever regretted a single penny I spent on it), so perhaps something cheaper might do?
UltraEdit is an excellent text editor with very good regex support. It supports Perl-style regular expressions, and you can do find/replace operations in multiple (optionally pre-filtered) files with it. I'd say it can do everything you want to do according to your question.
RegexBuddy, apart from being the best regex editor/debugger on the market, also has limited grep functionality, allowing search/replace in (pre-filtered) subdirectories. It's also not free, but considerably less expensive than PowerGREP, and its regex engine has all the features you could ask for (the current version even introduced recursive regexes, and the extremely useful ability to translate regexes between flavors). Big pluses here are the ability to do a non-destructive preview for all operations, and to have backups created automatically for all files that are modified during a grep.
I use GrepWin extensively during development and on production servers - it doesn't support all the features you specify, but it gets the job done (your mileage may vary).
For a fast-loading, fast-executing program that only finds (no search and replace), I've found Baregrep to be pretty good. It does subdirectories.
You might have a look at this:
Open Source PowerGREP Alternatives
Currently there are six alternatives to PowerGREP.
Get Cygwin for a bunch of free alternatives!
grep, sed, awk, perl, python... the list goes on.
But, oops! You want to stick to a GUI.
I always wonder at how people wrap a GUI around things like grep and get cash for that!
WinGrep seems to be free, though, and yet comes with quite a punch.
Windows Grep is designed for searching plain-ASCII text files, such as program source, HTML, RTF and batch files, but it can also search binary files such as word processor documents, databases, spreadsheets and executables.
I do not know PowerGREP, but grepWin lets you search regexes in directories.
You can get GNU grep or Gawk.