Does highlight.js check syntax as well?

In addition to highlighting syntax (which I suppose means indentation, color, etc.), does highlight.js check syntax as well? For example, if I write the following code in JavaScript:
function {
}
Would highlight.js show an error saying that the function name is missing? I know that ace.js does this. I want to replace ace.js with highlight.js.

No, Highlight.js does not check syntax; it only highlights code (using pattern matching). Incorrect syntax is often simply ignored, or it may cause your code to be highlighted oddly.
It MIGHT be possible to write a third-party grammar that has some ability to detect (and highlight) syntax errors, but this is not something the core library is interested in doing.
[Disclaimer: I'm the current Highlight.js maintainer.]

Related

Adding keyword commands and functions to Textmate 2 bundle syntax

I want to add some additional syntax highlighting definitions to an existing bundle, but I need some general advice on how to do this. I'm not building a syntax from scratch, and I think my request is pretty simple, but it involves some subtleties for which I find the manual somewhat impenetrable.
Basically, I'm trying to fill out the syntax definitions for the Stata bundle. It's great, but there is no built-in support for automatically highlighting the base commands and the installed functions, only a handful of basic control statements. Stata is a language which is primarily used by calling lots of different high-level predefined commands, like command foo bar, options(). The convention is that these command calls be highlighted.
There are a ton of these commands, plus stubs (abbreviations) which are used for convenience. Just the base install has almost 3500. Even optimizing them using the bundle helper, which obviously gets rid of the stub issue, still yields a massive regex list. I can easily cut this down to fewer than 1000 important ones, but it's still a lot. There are also 350 "functions" which I would like to match with the syntax function().
I essentially have 3 questions:
Am I creating a serious problem by including a very comprehensive list of matching definitions?
How do I restrict a command to only highlight when it either begins a line or has only whitespace between the beginning of the line and the command?
What is the preferred way of restricting the list of functions() to only highlight when they have attached parentheses?
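For what it's worth, the second and third questions come down to two anchoring patterns: "start of line plus optional whitespace" and "name followed by a lookahead for an opening parenthesis". Below is a minimal sketch of both, written with Python's re module rather than as a TextMate grammar. The COMMANDS and FUNCTIONS lists are tiny hypothetical stand-ins for the real Stata lists, and the assumption that the same constructs (^, \b, lookahead) carry over to the regex engine used by TextMate is mine and should be checked against the manual.

```python
import re

# Hypothetical, tiny stand-ins for the real lists of Stata commands and functions.
COMMANDS = ["regress", "summarize", "generate"]
FUNCTIONS = ["log", "sqrt", "abs"]


def alternation(words):
    # Longest first, so a longer name is tried before any shorter prefix of it.
    return "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))


# A command counts only if nothing but spaces or tabs precedes it on its line.
COMMAND_RE = re.compile(r"^[ \t]*(" + alternation(COMMANDS) + r")\b", re.MULTILINE)

# A function counts only if an opening parenthesis follows it immediately.
FUNCTION_RE = re.compile(r"\b(" + alternation(FUNCTIONS) + r")(?=\()")


def find_commands(text):
    """Return the command names that start a line (ignoring leading whitespace)."""
    return [m.group(1) for m in COMMAND_RE.finditer(text)]


def find_functions(text):
    """Return the function names that are directly followed by '('."""
    return [m.group(1) for m in FUNCTION_RE.finditer(text)]
```

With this sketch, a line such as "local log = 1" produces no function match because no parenthesis follows log, and "display regress" produces no command match because regress does not begin the line.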

How ought I run the annotate function in gui-debugger/annotator on a datum?

I am trying to learn how to use the DrRacket debugger's annotate function. My ultimate aim is to build a REPL that you can execute from within a closure and have access to everything that's in scope. (see my previous question on that topic and Greg Hendershott's well-researched answer) For the time being, I'm just trying to explore how the annotate function works. You can see my first exploratory attempt at using it, and the results, here.
The error, which is happening inside the annotator, seems to arise when it tries to match the application of string-append. The matcher is looking for a #%plain-app and the expanded syntax I'm presenting to it contains an #%app. I'm unsure if I should be expanding the syntax differently so it comes out as a #%plain-app or if there's something else I'm doing wrong to produce the syntax I'm feeding into the annotator. Does anybody see where my error is?
This revision to my previous pasterack post is swallowed without complaint. It seems that the syntax match must take place on a top-level syntax object (ruling out anything that could happen in an expansion phase, like a macro), and expansion must take place with a current namespace attached. There are some more subtleties in the syntax match, particularly around the fact that the syntax object needs to be free-identifier=? to #%plain-app. For all the gory details, refer to the mailing list thread I started.

Building a syntax checker

I am building an app that works like a compiler for my own scripting language. The user will enter code and the output will be another app.
So I need to tell the user which line is wrong and why.
But I don't know how to start.
My thinking so far:
Every line starts with a keyword, except for lines that start with a variable. Anything else is wrong.
So I can work out which entries are valid next and check the input against them.
I also thought about checking each line on its own, but that gets complicated because I can have this
var varName { /* ... */ };
Or
var varName {
/* ... */
};
Or Even
var varName
{
/* ... */
};
So why not remove the line breaks and then check? Because I would lose the line number, which in this case is the most important thing.
Maybe I'm going to create a map between the code with and without line breaks.
But first I'd like to hear from you, if you already have experience with this or have any ideas.
Thanks
There are formal languages for describing the syntax and semantics of a language, and there are tools that will generate parsers from these descriptions. I suggest reading up on flex and bison for starters.
It'll be fairly complicated to write your own language. But totally doable.
To be able to recognize whether a line is wrong, in the syntactic sense, you'd need to build a parser.
The parser checks whether the sequence of tokens can be derived from the language's context-free grammar.
First you need to tokenize the file, then reconstruct the tokens into a parse tree (to check syntax).
I took a class on this, CS 241. There's a very nice set of course notes in which this is all explained in detail.
https://github.com/christhomson/lecture-notes/blob/master/cs241.pdf
You should check out tools like lex, bison and yacc.
lex is a lexical analyzer generator. It generates code that can be used to break the script into tokens (numbers, keywords and so on).
bison and yacc are both parser generators. Either can be used to generate code for parsing your language (combining tokens into statements).
Just google tutorials for those tools.
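To make the tokenize-then-parse idea concrete, here is a minimal hand-written sketch in Python of what such tools would otherwise generate. It assumes a toy grammar invented for illustration (statement := "var" IDENT "{" "}" ";", with /* ... */ comments ignored), which is only a guess at the question's language. The point it shows is that each token carries its own line number, so line breaks never need to be removed and the three layouts from the question all parse the same way.

```python
import re

# Toy token set for the hypothetical "var name { ... };" language.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>/\*.*?\*/)   |
    (?P<NEWLINE>\n)          |
    (?P<SKIP>[ \t\r]+)       |
    (?P<IDENT>[A-Za-z_]\w*)  |
    (?P<PUNCT>[{};])         |
    (?P<ERROR>.)
""", re.VERBOSE | re.DOTALL)

# statement := "var" IDENT "{" "}" ";"   (assumed grammar, not from the question)
STATEMENT = [("IDENT", "var"), ("IDENT", None),
             ("PUNCT", "{"), ("PUNCT", "}"), ("PUNCT", ";")]


def tokenize(source):
    """Yield (kind, text, line) tuples; whitespace and comments are dropped."""
    line = 1
    for m in TOKEN_RE.finditer(source):
        kind, text = m.lastgroup, m.group()
        if kind == "NEWLINE":
            line += 1
        elif kind == "COMMENT":
            line += text.count("\n")  # comments may span several lines
        elif kind != "SKIP":
            yield kind, text, line


def check(source):
    """Return a list of (line, message); an empty list means the input parsed."""
    tokens = list(tokenize(source))
    errors, pos = [], 0
    while pos < len(tokens) and not errors:  # stop at the first error
        for kind, text in STATEMENT:
            wanted = text or kind
            if pos >= len(tokens):
                errors.append((tokens[-1][2], f"expected {wanted}, found end of input"))
                break
            k, t, line = tokens[pos]
            if k != kind or (text is not None and t != text):
                errors.append((line, f"expected {wanted}, found {t!r}"))
                break
            pos += 1
    return errors
```

For example, check("var a {\n/* x */\n};\nvar {\n};\n") reports the missing variable name on line 4, even though the first statement is spread over three lines.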

Grammar for the Chef Language

I'm just starting to use antlr, with antlr for ruby. The version is 3.2.1
I'm trying to create a parser for the chef language, and the grammar is giving me a real headache :P I'm sure I'm missing some fundamental concept, but I just couldn't figure it out.
I created 3 grammars. The main one is the recipe parser, which (of course) parses the recipes. Once a recipe is parsed, I used the other 2 grammars, that parse ingredients and instructions (the method section).
My problem is with the last one, the one that parses the instructions, such as "put ... into the mixing bowl", "liquefy ...", etc. Everything works great except for a few rules. I've posted the Instructions.g source here, at paste.bin because of its length.
Here's what's happening:
When I uncomment the rules combine_ingredient_into_mixing_bowl or divide_ingredient_into_mixing_bowl, the parser stops recognizing almost all of the other rules (such as put_ingredient_into_mixing_bowl). This seems strange to me, because they don't seem to override each other (of course they are, somehow). I get the error: "line 0:-1 mismatched input "" expecting WS"
stir_mixing_bowl does not match anything, but it's really no different from the other rules that do work ok. I get the error: "line 0:-1 mismatched input "" expecting set nil"
Is it possible to include the rules verb_the_ingredient and liquefy_ingredient without making them conflict with the other rules? The former will actually conflict with everything else I guess, and the latter will conflict with liquefy_mixing_bowl. What would be the best way to deal with such a nasty grammar?
By the way, I haven't sent WS (whitespace such as space and tab) to the ignore channel, because an ingredient can consist of one or more words (such as dijon mustard or just zucchinis), and I found it easier to specify the grammar by using the WS token as a separator.
Also, running the antlr4ruby command to generate the parsers/lexers code shows no warnings at all.
Any tips, hints, or enlightenment would be really appreciated here :)
Thanks in advance.

How to match a function code block with regex

What I'd like to do is remove all functions whose names contain a specific word. For example, if the word is 'apple':
void eatapple()
{
	// blah
	// blah
}
I'd like to delete all code from 'void' to '}'.
What I tried is:
^void.*apple(.|\n)*}
But it took a very long time, so I think something is wrong here.
I'm using Visual Studio. Thank you.
To clarify jeong's eventual solution, which I think is pretty clever: it works, but it depends on the code being formatted in a very particular way. That's OK, because most IDEs can enforce a particular formatting standard anyway. If I may give a related example (the problem that brought me here), I was looking to find expressions of this form (in Java):
if (DEBUG) {
    // possibly arbitrary statements or blocks
}
which, yes, isn't technically regular, but I ran the Eclipse code formatter on the files to make sure they all looked like this (our company's usual preferred code style):
if (DEBUG) {
    statement();
    while (whatever) {
        blahblahblah(etc);
    }
    // ...
}
and then searching for this (this is Java regex syntax, of course)
^(\s*)if \(DEBUG.*(?:\n\1 .*)*\n\1\}
did the trick.
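The reason this works is that the backreference \1 captures the indentation of the if line, every line of the body must start with that indentation plus at least one more space, and the closing brace must sit at exactly the captured indentation. Here is a small sketch of the same pattern in Python's re module (the post used Java regex syntax; that this pattern behaves the same in Python is my assumption, and SAMPLE and find_debug_blocks are made up for illustration).

```python
import re

# The pattern from the post, unchanged. MULTILINE makes ^ match at each line start.
PATTERN = r"^(\s*)if \(DEBUG.*(?:\n\1 .*)*\n\1\}"
DEBUG_BLOCK = re.compile(PATTERN, re.MULTILINE)


def find_debug_blocks(source):
    """Return the text of every if (DEBUG) block in consistently indented source."""
    return [m.group(0) for m in DEBUG_BLOCK.finditer(source)]


# Hypothetical input, formatted the way a code formatter would leave it.
SAMPLE = """\
void f() {
    if (DEBUG) {
        log();
        while (x) {
            y();
        }
    }
    work();
}
"""
```

Note the limitation this relies on: a completely empty line inside the block has no indentation, so it would end the match early unless the formatter removes such lines or the pattern is extended to allow them.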
Finally did it.
^void.*(a|A)pple\(\)\n\{\n((\t.*\n)|(^$\n))*^\}
Function blocks aren't regular, so using a regular expression in this situation is a bad idea.
If you really have a huge number of functions that you need to delete (more than you can delete by hand, which suggests there's something wrong with your codebase, but I digress), then you should write a quick brace-counting parser instead of trying to use regular expressions.
It should be pretty easy, especially if you can assume the braces are already balanced. Read in tokens, find one that matches "apple", then keep going until you reach the brace that matches with the one immediately after the "apple" token. Delete everything between.
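A minimal sketch of that brace-counting approach in Python follows. The function name remove_functions and the header pattern are my own choices for illustration; the sketch assumes braces are balanced, that no braces appear inside strings or comments, and that the whole line holding the function name (including the return type) belongs to the function.

```python
import re


def remove_functions(source, word):
    """Delete every function whose name contains `word`, using brace counting.

    Assumes balanced braces and no braces inside strings or comments.
    """
    # A name containing the word, a parameter list, then the opening brace.
    header = re.compile(r"\w*" + re.escape(word) + r"\w*\s*\([^()]*\)\s*\{")
    out, pos = [], 0
    while True:
        m = header.search(source, pos)
        if m is None:
            break
        # Delete from the start of the line holding the header, so the
        # return type goes too, but never reach back into text already kept.
        start = max(source.rfind("\n", 0, m.start()) + 1, pos)
        depth, i = 1, m.end()  # the header's "{" is already consumed
        while i < len(source) and depth > 0:
            if source[i] == "{":
                depth += 1
            elif source[i] == "}":
                depth -= 1
            i += 1
        if source.startswith("\n", i):  # swallow the newline after the final "}"
            i += 1
        out.append(source[pos:start])
        pos = i
    out.append(source[pos:])
    return "".join(out)
```

Unlike the regex attempts, this does not depend on how the body is indented, and nested blocks inside the function are handled by the depth counter. The match is case-sensitive as written.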
In theory, a regular language cannot express everything a context-free grammar can describe, such as arbitrarily nested braces. If it is a one-time job, why not just do it manually?
Switch to vi. Then you can put the cursor on the opening brace and press d% to delete everything up to and including the matching brace.
