Easier way to find where in jq code an error occurred?

Is there a way for jq to give a reference to where in code an error occurred?
Too often I end up with errors that are unhelpful. Here is an example:
jq: error (at <stdin>:43913): Cannot index object with null
With or without the --slurp flag, the line indicator for stdin is almost always the last line of input. What would be really helpful is to note where in the code it failed at runtime.
Wrapping code snippets in try/catch blocks with $__loc__ has proven unhelpful too, as the reported line tends to be that of the catch block, not where the error occurred.
jq: error (at <stdin>:43913) (not a string): {"file":"<top-level>","line":68}
Is there some method to make debugging jq scripts easier?
What I've been doing instead is commenting out large chunks of code and performing a binary search for what code gets the error. It feels like there is a better way.

Sometimes, using gojq ("the Go implementation of jq") gives good insights about errors.
Otherwise, you will probably find some combination of debug statements, $__loc__, and debugging functions defined using debug to be most helpful.
Consider, for example:
def debug(msg): (msg|debug) as $debug | . ;
(The subtlety here is that you can use "\(.)" in msg.)
In a pinch, the command-line option --debug-trace can be useful.

Related

Dealing with code movement when comparing static analysis reports

When I run a static analysis tool over my codebase, and I get results like this:
...
arch/powerpc/kernel/time.c:102:5: warning: symbol 'decrementer_max' was not declared. Should it be static?
arch/powerpc/kernel/time.c:138:1: warning: symbol 'rtc_lock' was not declared. Should it be static?
arch/powerpc/kernel/time.c:361:37: warning: implicit cast to nocast type
...
I want to keep track of the number of warnings and where they are in the code as people make changes.
I could just diff the results of the static analysis runs, but then if someone inserts some code in time.c at line 50, the warnings above will move, and because the line numbers have changed, diff will tell me that they've changed.
How should I go about comparing these in a way that deals with movement of code within a file?
Googling for 'smart diff', etc. hasn't been productive: the results are mostly smart diffs of code rather than smart diffs of logs. Log analysis tools like Graylog or Kibana also seem like a poor fit, as they are designed for broader, more general analysis rather than for this quite specific task.
Is there something obvious that I'm missing? Or is this a problem where I should expect to be writing my own tooling?

You could maintain a merge of the code and the errors: insert each error message (minus its line number) after the corresponding line of code. Then if someone inserts code at line 50, the (updated) merge will not have diffs around the later error points. It'll have a diff at line 50, of course, which you may or may not be interested in. If you like, you can ignore diff-chunks that don't involve an error message (for which you'd need some distinctive marker at each inserted error message).
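The merge idea above can be sketched roughly as follows. This is only an illustration: the `#!WARN ` marker, the function names, and the sample source lines are made up, and the warnings are assumed to be already parsed into (line number, message) pairs.

```python
import difflib

MARK = "#!WARN "  # hypothetical marker; pick something that cannot occur in real code


def merge(source_lines, warnings):
    """Insert each warning (without its line number) after the line it refers to.

    `warnings` is a list of (line_number, message) pairs, 1-indexed.
    """
    by_line = {}
    for lineno, message in warnings:
        by_line.setdefault(lineno, []).append(message)
    merged = []
    for lineno, text in enumerate(source_lines, start=1):
        merged.append(text)
        for message in sorted(by_line.get(lineno, [])):
            merged.append(MARK + message)
    return merged


def changed_lines(old_merged, new_merged):
    """Return only the added/removed lines of a diff between two merged listings."""
    return [
        d for d in difflib.ndiff(old_merged, new_merged)
        if d.startswith(("+ ", "- "))
    ]


# Made-up example: a comment is inserted at the top, so the warning moves from line 2 to line 3.
message = "warning: symbol 'decrementer_max' was not declared. Should it be static?"
old_src = ["int a;", "int decrementer_max;", "int b;"]
new_src = ["/* new comment */"] + old_src
delta = changed_lines(merge(old_src, [(2, message)]), merge(new_src, [(3, message)]))
print(delta)
```

Only the inserted source line shows up in the diff; the warning itself does not, because it travels with the line it is attached to. Filtering the diff for lines containing the marker would then leave just the warnings that were added or removed.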

I had a go with a slightly simpler setup: as @ajd suggested, parsing the messages and doing line-number-insensitive matching.
The code is up at https://github.com/daxtens/smart-sparse-diff
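For a rough idea of what line-number-insensitive matching can look like, here is a minimal sketch. It is not the code from the repository above; the regular expression, the function names, and the extra 'example' warning in the sample log are assumptions made for illustration. Warnings are keyed by file and message only, and counted so that duplicates are handled.

```python
import re
from collections import Counter

# Matches lines such as "path/file.c:102:5: warning: message".
WARNING_RE = re.compile(r"^(?P<path>[^:]+):(?P<line>\d+):(?P<col>\d+): (?P<msg>.*)$")


def warning_keys(log_text):
    """Count warnings by (path, message), ignoring line and column numbers."""
    keys = Counter()
    for line in log_text.splitlines():
        m = WARNING_RE.match(line.strip())
        if m:
            keys[(m["path"], m["msg"])] += 1
    return keys


def compare(old_log, new_log):
    """Report warnings that disappeared and warnings that are new."""
    old, new = warning_keys(old_log), warning_keys(new_log)
    return {
        "fixed": sorted((old - new).elements()),
        "introduced": sorted((new - old).elements()),
    }


old_log = """\
arch/powerpc/kernel/time.c:102:5: warning: symbol 'decrementer_max' was not declared. Should it be static?
arch/powerpc/kernel/time.c:361:37: warning: implicit cast to nocast type
"""

# The same warnings after three lines were inserted near the top of the file,
# plus one made-up new warning.
new_log = """\
arch/powerpc/kernel/time.c:105:5: warning: symbol 'decrementer_max' was not declared. Should it be static?
arch/powerpc/kernel/time.c:364:37: warning: implicit cast to nocast type
arch/powerpc/kernel/time.c:400:1: warning: symbol 'example' was not declared. Should it be static?
"""

report = compare(old_log, new_log)
print(report)
```

The trade-off is that two identical messages in the same file become indistinguishable except by count, so this tells you that a warning of a given kind appeared or vanished, not which occurrence it was.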

How to create AST parser which allows syntax errors?

First, what to read about parsing and building AST?
How to create parser for a language (like SQL) that will build an AST and allow syntax errors?
For example, for "3+4*5":
  +
 / \
3   *
   / \
  4   5
And for "3+4*+" with syntax error, parser would guess that the user meant:
  +
 / \
3   *
   / \
  4   +
     / \
    ?   ?
Where to start?
SQL:
    SELECT____________________
   /            \             \
  .              FROM          JOIN
 / \              |           /    \
a   city_name   people   address    ON
                                    |
                                    =__________________
                                   /                   \
                                  .____                 .
                                 /     \               / \
                                p       address_id    a   id

The standard answer to the question of how to build parsers (that build ASTs), is to read the standard texts on compiling. Aho and Ullman's "Dragon" Compiler book is pretty classic. If you haven't got the patience to get the best reference materials, you're going to have more trouble, because they provide theory and investigate subtleties. But here is my answer for people in a hurry, building recursive descent parsers.
One can build parsers with built-in error recovery. There are many papers on this sort of thing, a hot topic in the 1980s. Check out Google Scholar, hunt for "syntax error repair". The basic idea is that the parser, on encountering a parsing error, skips to some well-known beacon (";" a statement delimiter is pretty popular for C-like languages, which is why you got asked in a comment if your language has statement terminators), or proposes various input stream deletions or insertions to climb over the point of the syntax error. The sheer variety of such schemes is surprising. The key idea is generally to take into account as much information around the point of error as possible. One of the most intriguing ideas I ever saw had two parsers, one running N tokens ahead of the other, looking for syntax-error land-mines, and the second parser being fed error repairs based on the N tokens available before it encounters the syntax error. This lets the second parser choose to act differently before arriving at the syntax error. If you don't have this, most parsers throw away left context and thus lose the ability to repair. (I never implemented such a scheme.)
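The simplest of these schemes, skipping to a beacon, might look something like the sketch below. The toy grammar (statements of the form NAME = NUMBER ;), the `ParseError` class, and the function names are all invented for the illustration; a real language would need a proper lexer and a richer choice of beacons.

```python
class ParseError(Exception):
    """Carries the token position at which parsing failed."""

    def __init__(self, pos, message):
        super().__init__(message)
        self.pos = pos


def expect(tokens, pos, predicate, description):
    """Return (token, next_pos) if the token at `pos` satisfies `predicate`."""
    if pos < len(tokens) and predicate(tokens[pos]):
        return tokens[pos], pos + 1
    found = tokens[pos] if pos < len(tokens) else "end of input"
    raise ParseError(pos, f"token {pos}: expected {description}, found {found!r}")


def parse_program(tokens):
    """Parse `NAME = NUMBER ;` statements, skipping to the next ';' on error.

    Returns (statements, errors).
    """
    statements, errors = [], []
    pos = 0
    while pos < len(tokens):
        try:
            name, pos = expect(tokens, pos, str.isidentifier, "a name")
            _, pos = expect(tokens, pos, lambda t: t == "=", "'='")
            number, pos = expect(tokens, pos, str.isdigit, "a number")
            _, pos = expect(tokens, pos, lambda t: t == ";", "';'")
            statements.append((name, int(number)))
        except ParseError as exc:
            errors.append(str(exc))
            # Panic mode: discard tokens up to and including the beacon ';'.
            pos = exc.pos
            while pos < len(tokens) and tokens[pos] != ";":
                pos += 1
            pos += 1
    return statements, errors


statements, errors = parse_program("x = 1 ; y = = 2 ; z = 3 ;".split())
print(statements)
print(errors)
```

The malformed middle statement is reported once and dropped, and parsing resumes cleanly at the statement after the next ';'.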
The choice of things to insert can often be derived from information used to build the parser (often First and Follow sets) in the first place. This is relatively easy to do with L(AL)R parsers, because the parse tables contain the necessary information and are available to the parser at the point where it encounters an error. If you want to understand how to do this, you need to understand the theory (oops, there's that compiler book again) of how the parsers are constructed. (I have implemented this scheme successfully several times).
Regardless of what you do, syntax error repair doesn't help much, because it is almost impossible to guess what the writer of the parsed document actually intended. This suggests fancy schemes won't be really helpful. I stick to simple ones; people are happy to get an error report and some semi-graceful continuation of parsing.
A real problem with rolling your own parser for a real language is that real languages are nasty messy things; people building real implementations get it wrong and frozen in stone because of existing code bases, or insist on bending/improving the language (standards are for wimps, goodies are for marketing) because it's cool. Expect to spend a lot of time re-calibrating what you think the grammar is, against the ground truth of real code. As a general rule, if you want a working parser, better to get one that has a track record rather than roll it yourself.
A lesson most people that are hell-bent to build a parser don't get, is that if they want to do anything useful with the parse result or tree, they'll need a lot more basic machinery than just the parser. Check my bio for "Life After Parsing".

There are two things the parser could do:
Report the error and have the user try again.
Repair the error and proceed.
Generally speaking the first one is easier (and safer). There may not always be enough information for the parser to infer the intent when the syntax is wrong. Depending on the circumstances, it may be dangerous to proceed with a repair that makes the input syntactically correct but semantically wrong.
I've written a few hand-rolled recursive descent parsers for little languages. When writing code to interpret the grammar rules explicitly (as opposed to using a parser-generator), it's easy to detect errors, because the next token doesn't fit the production rule. Generated parsers tend to spit out a simplistic "expected $(TOKEN_TYPE) here" message, which isn't always useful to the user. With a hand-written parser, it's often easy to give a more specific diagnostic message, but it can be time consuming to cover every case.
If your goal is to report the problem but keep parsing (so that you can see if there are additional problems), you can put a special AST node in the tree at the point of the error. This keeps the tree from falling apart.
You then have to resync to some point beyond the error in order to continue parsing. As Ira Baxter mentioned in his answer, you might look for a token, like ';', that separates statements. The correct token(s) to look for depends on the language you're parsing. Another possibility is to guess what the user meant (e.g., infer an extra token or a different token at the point the error was detected) and then continue. If you encounter another syntax error within the next few tokens, you could backtrack, make a different guess, and try again.
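Here is a minimal sketch of the special-error-node idea for the "3+4*+" example from the question. The grammar (numbers, + and * only), the tuple representation of the tree, and the `("error",)` placeholder are assumptions chosen to keep the example short.

```python
import re


def tokenize(text):
    """Split the input into number, '+' and '*' tokens (anything else is ignored)."""
    return re.findall(r"\d+|[+*]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.errors = []

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expr(self):
        # expr := term { '+' term }
        node = self.term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.term())
        return node

    def term(self):
        # term := factor { '*' factor }
        node = self.factor()
        while self.peek() == "*":
            self.pos += 1
            node = ("*", node, self.factor())
        return node

    def factor(self):
        # factor := NUMBER
        token = self.peek()
        if token is not None and token.isdigit():
            self.pos += 1
            return int(token)
        # Do not consume the offending token: record the problem and return
        # a placeholder node so the tree stays intact.
        self.errors.append(f"token {self.pos}: expected a number, found {token!r}")
        return ("error",)


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        parser.errors.append(f"token {parser.pos}: unexpected {parser.peek()!r}")
    return tree, parser.errors


good_tree, good_errors = parse("3+4*5")
bad_tree, bad_errors = parse("3+4*+")
print(good_tree)
print(bad_tree, bad_errors)
```

For "3+4*+" this parser guesses (3 + 4 * ?) + ? rather than the 3 + 4 * (? + ?) shown in the question. Both are defensible repairs, which illustrates how much guessing is involved; either way, later passes can walk the tree and treat the error nodes specially.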

How ought I run the annotate function in gui-debugger/annotator on a datum?

I am trying to learn how to use the DrRacket debugger's annotate function. My ultimate aim is to build a REPL that you can execute from within a closure and have access to everything that's in scope. (see my previous question on that topic and Greg Hendershott's well-researched answer) For the time being, I'm just trying to explore how the annotate function works. You can see my first exploratory attempt at using it, and the results, here.
The error, which is happening inside the annotator, seems to arise when it tries to match the application of string-append. The matcher is looking for a #%plain-app and the expanded syntax I'm presenting to it contains an #%app. I'm unsure if I should be expanding the syntax differently so it comes out as a #%plain-app or if there's something else I'm doing wrong to produce the syntax I'm feeding into the annotator. Does anybody see where my error is?

This revision to my previous pasterack post is swallowed without complaint. It seems that the syntax match must take place on a top-level syntax object (ruling out anything that could happen in an expansion phase, like a macro), and expansion must take place with a current namespace attached. There are some more subtleties in the syntax match, particularly around the fact that the syntax object needs to be free-identifier=? to #%plain-app. For all the gory details, refer to the mailing list thread I started.

How do I get a callstack in Haskell?

I am trying to track down a non-exhaustive pattern in a library's code, specifically HDBC's MySQL implementation. I believe it is trying to match over types in my program and map them to MySQL's types. I can't seem to get a callstack for this error, and since there are a number of parameters to the SQL query, it is difficult to track down exactly what is causing it.
Is it possible to get a callstack in Haskell so I would know which parameter was causing the error? Also, I would think that this should be caught by the compiler, since it should be able to look at my types and the patterns and make sure that there is a corresponding match.

You can use the GHCi debugger to identify where the exception is coming from.
I walk through a full example here.

You might also take a look at the Debug.Trace library.

Can you ask ruby to treat warnings as errors?

Does ruby allow you to treat warnings as errors?
One reason I'd like to do this is to ensure that if heckle removing a line of code means that a warning occurs, I have the option of ensuring that the mutant gets killed.

There is unfortunately no real way of doing this, at least not on most versions of Ruby out there (variations may exist), short of monitoring the program output and aborting it when a warning appears on standard error. Here's why:
Ruby defines Kernel.warn, which you can redefine to do whatever you wish (including exiting), and which you'd expect (hope) to be used consistently by Ruby to report warnings (including internal ones, e.g. parsing warnings), but
methods implemented natively (in C) inside Ruby will in turn directly invoke a native function called rb_warn from source/error.c, completely bypassing your redefinition of Kernel.warn (for example, the "string literal in condition" warning, issued when doing something like do_something if 'string', is printed via the native rb_warn called from source/parse.c)
to make things even worse, there is an additional native function, rb_warning, which Ruby can use to log warnings if -w or -v is specified.
So, if you need to take action solely on warnings generated by your application code's calling Kernel.warn then simply redefine Kernel.warn. Otherwise, you have exactly two options:
alter source/error.c to exit in rb_warn and rb_warning (and rb_warn_m?), and rebuild Ruby
monitor your program's standard error output for ': warning:', and abort it on match
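The second option, watching standard error from the outside, can be sketched with a small wrapper like the one below. The function name and the default pattern are assumptions, and the wrapper is written in Python only for illustration; any language that can spawn a process and read its stderr would do.

```python
import subprocess
import sys


def run_until_warning(command, needle=": warning:"):
    """Run `command`, echoing its stderr, and kill it at the first line containing `needle`.

    Returns the offending line, or None if the program finished without one.
    """
    process = subprocess.Popen(command, stderr=subprocess.PIPE, text=True)
    try:
        for line in process.stderr:
            sys.stderr.write(line)
            if needle in line:
                process.kill()
                return line.rstrip("\n")
        return None
    finally:
        process.stderr.close()
        process.wait()


# Hypothetical usage (script.rb is a placeholder for your own program):
#   offending = run_until_warning(["ruby", "-w", "script.rb"])
#   if offending is not None:
#       sys.exit(1)
```

Because this reads the stream the interpreter actually writes to, it catches warnings regardless of whether they came through Kernel.warn, rb_warn, or rb_warning. The cost is that the program is killed from outside rather than receiving an exception it could rescue.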

You can finally do that by overriding Warning.warn, like this:
module Warning
  def warn(msg)
    raise msg
  end
end
This will turn the warning into an exception. This solution has worked since at least the 2.4 branch.

You could also potentially use DTrace, and intercept the calls to rb_warn and rb_warning, though that's not going to produce exceptions you can rescue from somewhere. Rather, it'll just put them somewhere you can easily log them.
