Building a syntax checker - syntax

I am building an app like a compiler, with my own scripting language. The user enters code and the output is another app.
So I need to tell the user which line is wrong and why.
But I don't know how to start.
This is what I have thought of so far:
Every line starts with a keyword, except for those that start with a variable. Anything else is wrong.
So I can work out the valid next entries and check against them.
I also thought I could check line by line, but that gets complicated because I can have this:
var varName { /* ... */ };
Or
var varName {
/* ... */
};
Or Even
var varName
{
/* ... */
};
So why not remove the line breaks and then check? Because I would lose the line number, which is the most important thing in this case.
Maybe I could build a map between the code with and without line breaks.
But first I want to hear from you, if you already have experience with this or have any ideas.
Thanks

There are formal languages for describing the syntax and semantics of a language, and there are tools that generate parsers from these descriptions. I suggest reading up on flex and bison for starters.

It'll be fairly complicated to write your own language, but it's totally doable.
To be able to recognize whether a line is wrong, in the syntactic sense, you'd need to build a parser.
The parser checks whether the token sequence has a valid derivation in the language's context-free grammar.
First you need to tokenize the file, then reconstruct it into a parse tree (to check syntax).
I took a class on this, CS 241. There's a very nice set of course notes in which this is all explained in detail.
https://github.com/christhomson/lecture-notes/blob/master/cs241.pdf
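As a rough illustration of the "tokenize first, then check structure" idea, here is a minimal Python sketch. It assumes a toy language in which the only statement is `var NAME { /* ... */ };`; the token names, the `tokenize` and `check` helpers, and the error wording are all made up for this example. Each token remembers the line it came from, so line breaks can be treated as plain whitespace without losing line numbers.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str
    line: int


# Hypothetical token set for the toy language in the question.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>/\*.*?\*/)
  | (?P<NEWLINE>\n)
  | (?P<SKIP>[ \t\r]+)
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<PUNCT>[{};])
  | (?P<ERROR>.)
""", re.VERBOSE | re.DOTALL)

KEYWORDS = {"var"}


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens tagged with their line; whitespace and comments are dropped."""
    line = 1
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "NEWLINE":
            line += 1
        elif kind == "COMMENT":
            line += text.count("\n")
        elif kind == "SKIP":
            continue
        elif kind == "ERROR":
            raise SyntaxError(f"line {line}: unexpected character {text!r}")
        else:
            if kind == "NAME" and text in KEYWORDS:
                kind = "KEYWORD"
            yield Token(kind, text, line)


# The only statement shape this sketch knows: var NAME { } ;
EXPECTED = [("KEYWORD", "var"), ("NAME", None), ("PUNCT", "{"),
            ("PUNCT", "}"), ("PUNCT", ";")]


def check(source: str) -> None:
    """Raise SyntaxError, with a line number, on the first malformed declaration."""
    tokens = list(tokenize(source))
    pos = 0
    while pos < len(tokens):
        for kind, text in EXPECTED:
            if pos >= len(tokens):
                raise SyntaxError(
                    f"line {tokens[-1].line}: unexpected end of input, "
                    f"expected {text or kind}")
            tok = tokens[pos]
            if tok.kind != kind or (text is not None and tok.text != text):
                raise SyntaxError(
                    f"line {tok.line}: expected {text or kind}, got {tok.text!r}")
            pos += 1
```

With this, all three layouts from the question pass `check`, while `check("var v {\n/* ... */\n;")` reports the problem on line 3. A real language would replace the fixed `EXPECTED` list with a grammar, whether hand-written or generated.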

You should check out tools like lex, bison and yacc.
lex is a lexical analyser generator. It generates code that breaks the script into tokens (numbers, keywords and so on).
bison and yacc are both parser generators. Either can be used to generate code for parsing your language (combining tokens into statements).
Just google tutorials for those tools.

Related

How does bash handle parsing errors?

For context, I'm trying to create an overly simplified version of bash: not a full bash script interpreter, just a small interpreter for a series of commands and operators (|, ||, &&, <, >, <<, >>, $, $?). The mental model I used, in a nutshell, is:
Lexer + Expander: in the first stage I use a simple state machine to lex the input into tokens and store data (commands, arguments, redirection files, etc.). I expand environment variables and handle lexical errors here too (as simple as checking finite states of valid characters).
Parser: in the second stage I intend to create an AST out of the tokens + data, and handle parsing errors.
Executor: finally I'll execute the AST.
Now I'm at the parser stage, and I'm trying to work out how to handle parsing errors. Given the range of possible valid statements, checking the validity of an input seems very difficult, because the range is too big (or at least that's what I think). I'm sure there's some generalized solution to the problem. Why am I sure? Because bash has done it.
For example this statement:
$ < $FILE || && > outfile
From the lexer's point of view it's all bright and shiny, but it's surely not valid input from the parser's perspective. One possible solution is to check whether there's a command token in the input and, if not, reject it. But what about this one:
$ || ls > $FILE && cat < $FILE
Again, all valid lexemes but an unparsable statement. Maybe that too could be caught with a rule such as "if the line starts with an OR or AND token, error".
Now the specific question is how exactly bash parses these combinations of commands and operators: is there some more generalized solution, or am I left with if/else error checks against inputs that I think are invalid? The latter honestly seems stupid and cumbersome.
Most of the complexity of shell parsing is in the tokenisation, although you certainly don't need to worry about all of the complications which have crept in over the years. The grammar itself is pretty simple; it's designed to be parsed by a parser generated with a tool like Bison (or some other yacc derivative), and that's precisely how Bash works.
The various syntactic rules recognised by Bash are scattered throughout the Bash manual, but the grammar is based on the standard shell grammar specified in the Posix standard, which is probably an easier starting point. In that document, the grammar is included as what is basically a Yacc input file (without any of the semantic actions necessary for an actual implementation); you can find it at the end of section 2.10. Make sure to read the initial part of that section, though, because it contains important information about how tokens are classified. Also, take note of section 2.3, token recognition.
Between these two sections you'll find a precise description of shell quoting rules and the various expansions which are done prior to parsing (or, better said, intermingled with parsing because command substitution makes the whole process recursive.) You might not want to absorb all of that on a first reading, although it will also help you be more effective in your use of the shell.
Bash implements a lot more features, but probably most or all of them go beyond your needs.
@choroba has the right idea - to understand exactly how Bash parses scripts you need to look at the source of Bash. There are basically fractal rules of thumb for how Bash works in increasingly complex cases, and any description short enough to fit in a SO response is probably not detailed enough to give you the full picture.
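To make the "let the grammar do the error checking" point concrete, here is a small hand-written recursive-descent sketch in Python. It is not how Bash is implemented (Bash uses a yacc-generated parser, as noted above); it only mirrors a tiny slice of the shell grammar: a list is pipelines joined by && or ||, a pipeline is commands joined by |, and a command is one or more words or redirections. The `Parser` class, the tuple-shaped AST and the error wording are assumptions made for illustration, and the input is assumed to be already split into tokens.

```python
from typing import List, Optional

CONTROL = {"&&", "||"}
REDIRECT = {"<", ">", "<<", ">>"}
OPERATORS = CONTROL | REDIRECT | {"|"}


class ParseError(Exception):
    pass


class Parser:
    """Recursive descent over an already-lexed list of token strings."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self) -> Optional[str]:
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_list(self):
        # list := pipeline (('&&' | '||') pipeline)*
        node = self.parse_pipeline()
        while self.peek() in CONTROL:
            op = self.advance()
            node = (op, node, self.parse_pipeline())
        if self.peek() is not None:
            raise ParseError(f"syntax error near unexpected token {self.peek()!r}")
        return node

    def parse_pipeline(self):
        # pipeline := command ('|' command)*
        commands = [self.parse_command()]
        while self.peek() == "|":
            self.advance()
            commands.append(self.parse_command())
        return ("pipeline", commands)

    def parse_command(self):
        # command := (WORD | redirect)+ ; redirect := ('<'|'>'|'<<'|'>>') WORD
        words, redirects = [], []
        while self.peek() is not None and self.peek() not in CONTROL | {"|"}:
            tok = self.advance()
            if tok in REDIRECT:
                target = self.peek()
                if target is None or target in OPERATORS:
                    raise ParseError(
                        f"syntax error near unexpected token {target or 'newline'!r}")
                redirects.append((tok, self.advance()))
            else:
                words.append(tok)
        if not words and not redirects:
            raise ParseError(
                f"syntax error near unexpected token {self.peek() or 'newline'!r}")
        return ("command", words, redirects)
```

Calling `Parser("< $FILE || && > outfile".split()).parse_list()` raises an error at `&&`, and the `|| ls > $FILE && cat < $FILE` example fails at the leading `||`. Neither case needs a dedicated if/else check: the errors fall out of the grammar because an operator appears where a command is required.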

How do I parse includes with bison/flex?

I would like to parse linker scripts like the following example:
example.ld
MEMORY
{
INCLUDE example_include.ld
}
example_include.ld
rom : ORIGIN = 0, LENGTH = 256K
I have found some code which could do this, but it is C flex/bison and I use C++ flex/bison.
I've figured out that I can use yyFlexLexer lexer;
which provides me with yy_create_buffer() and so on ...
This is the code I've found in binutils/ld/ldlex.l. Maybe it could help me.
void
lex_push_file (FILE *file, const char *name, unsigned int sysrooted)
{
if (include_stack_ptr >= MAX_INCLUDE_DEPTH)
{
einfo ("%F:includes nested too deeply\n");
}
file_name_stack[include_stack_ptr] = name;
lineno_stack[include_stack_ptr] = lineno;
sysrooted_stack[include_stack_ptr] = input_flags.sysrooted;
include_stack[include_stack_ptr] = YY_CURRENT_BUFFER;
include_stack_ptr++;
lineno = 1;
input_flags.sysrooted = sysrooted;
yyin = file;
yy_switch_to_buffer (yy_create_buffer (yyin, YY_BUF_SIZE));
}
My problem is that I cannot find a good example or documentation on how to use C++ bison/flex. For example, I cannot use yyin, because it is protected and not public.
The easiest solution is to just recursively call the parser, passing it the file to be parsed. The precise details about how you communicate the environmental information (that is, the state of the parse) from the outer parser to the inner parser will depend heavily on the nature of your internal data structures, so I'm not even going to venture a guess. If all you're doing is building an AST (which is almost always the best solution even though it never seems attractive at first sight), then you won't have to do anything other than have the parser return the AST to its caller when it successfully parses a file.
The parser (or its manager) will generally create a new Lexer object to scan the provided input file; since the C++ scanners are fully reentrant, the coexistence of the two lexers creates no difficulties. This avoids using the buffer stack, and is generally a much cleaner solution.
This avoids a classic problem with handling "includes" in bison/flex parsers, which is that a naive solution allows syntactic context to leak out of the included file back into the including file. If the included file contains an unterminated block (or unterminated comment), that syntactic context might continue to be active at the end of the include, leading to unintuitive and often misleading error messages. The recursive strategy will instead trigger a syntax error at the end of the included file, which will also make error recovery easier.
Disclaimer: I'm really not a fan of the C++ interfaces for the scanners and parsers generated by flex and bison. Maybe someday I'll change my mind; I freely admit that it might just be intellectual laziness. In any case, aside from a few toys, the only parsers I've built use the C APIs, even when I write the actions in C++ (which I often do). So I'm not providing any sample code here, but I don't think that it's particularly difficult.
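The recursive strategy described above is independent of flex and bison, so here is a language-neutral sketch of it in Python rather than in the C++ API. Everything in it is hypothetical: the whitespace tokenizer, the nested-list "AST", the `read` callback that maps a file name to its text, and the depth limit. The point it tries to show is that each file gets its own token stream and its own parse call, so an unterminated block is reported at the end of the included file instead of leaking into the including one.

```python
from typing import Callable, List, Tuple

MAX_INCLUDE_DEPTH = 10


class ParseError(Exception):
    pass


def tokenize(text: str) -> List[Tuple[str, int]]:
    """Split on whitespace and braces, remembering each token's line number."""
    return [(word, number)
            for number, line in enumerate(text.splitlines(), start=1)
            for word in line.replace("{", " { ").replace("}", " } ").split()]


def parse_file(name: str, read: Callable[[str], str], depth: int = 0) -> list:
    """Parse one file into a nested list; every call owns its own token stream."""
    if depth > MAX_INCLUDE_DEPTH:
        raise ParseError(f"{name}: includes nested too deeply")
    tokens = tokenize(read(name))
    pos = 0

    def parse_items(closing: bool) -> list:
        nonlocal pos
        items = []
        while pos < len(tokens):
            word, line = tokens[pos]
            pos += 1
            if word == "INCLUDE":
                if pos >= len(tokens):
                    raise ParseError(f"{name}:{line}: INCLUDE needs a file name")
                target, _ = tokens[pos]
                pos += 1
                # Recursive call: a fresh token stream for the included file.
                items.extend(parse_file(target, read, depth + 1))
            elif word == "{":
                items.append(parse_items(closing=True))
            elif word == "}":
                if not closing:
                    raise ParseError(f"{name}:{line}: unexpected '}}'")
                return items
            else:
                items.append(word)
        if closing:
            raise ParseError(f"{name}: unterminated block at end of file")
        return items

    return parse_items(closing=False)


# The two files from the question, held in memory for the example.
FILES = {
    "example.ld": "MEMORY\n{\nINCLUDE example_include.ld\n}\n",
    "example_include.ld": "rom : ORIGIN = 0, LENGTH = 256K\n",
}
tree = parse_file("example.ld", FILES.__getitem__)
```

Here the inner call simply returns its items to the caller, which splices them into the enclosing block. A real parser would return a proper AST node and would also need to pass down whatever context the included file is allowed to see.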

Grammar for the Chef Language

I'm just starting to use antlr, with antlr for ruby. The version is 3.2.1
I'm trying to create a parser for the chef language, and the grammar is giving me a real headache :P I'm sure I'm missing some fundamental concept, but I just couldn't figure it out.
I created 3 grammars. The main one is the recipe parser, which (of course) parses the recipes. Once a recipe is parsed, I used the other 2 grammars, that parse ingredients and instructions (the method section).
My problem is with the last one, the one that parses the instructions, such as "put ... into the mixing bowl", "liquefy ...", etc. Everything works great except for a few rules. I've posted the Instructions.g source here, at paste.bin because of its length.
Here's what's happening:
When I uncomment the rules combine_ingredient_into_mixing_bowl or divide_ingredient_into_mixing_bowl, the parser stops recognizing almost all of the other rules (such as put_ingredient_into_mixing_bowl). This seems strange to me, because they don't seem to override each other (of course they are, somehow). I get the error: "line 0:-1 mismatched input "" expecting WS"
stir_mixing_bowl does not match anything, but it's really no different from the other rules that do work ok. I get the error: "line 0:-1 mismatched input "" expecting set nil"
Is it possible to include the rules verb_the_ingredient and liquefy_ingredient without making them conflict with the other rules? The former will actually conflict with everything else I guess, and the latter will conflict with liquefy_mixing_bowl. What would be the best way to deal with such a nasty grammar?
By the way, I haven't sent WS (whitespace such as space and tab) to the ignore channel because an ingredient can consist of one or more words (such as dijon mustard or just zucchinis), so I found it easier to specify the grammar by using the WS token as a separator.
Also, running the antlr4ruby command to generate the parsers/lexers code shows no warnings at all.
Any tips, hints, or enlightening is really appreciated here :)
Thanks in advance.

how to match function code block with regex

What I'd like to do is remove all functions whose names contain a specific word. For example, if the word is 'apple':
void eatapple()
{
// blah
// blah
}
I'd like to delete all code from 'void' to '}'.
What I tried is:
^void.*apple(.|\n)*}
But it took a very long time, so I think something is wrong here.
I'm using Visual Studio. Thank you.
To clarify jeong's eventual solution, which I think is pretty clever: it works, but it depends on the code being formatted in a very particular way. But that's OK, because most IDEs can enforce a particular formatting standard anyway. If I may give a related example - the problem that brought me here - I was looking to find expressions of the form (in Java)
if (DEBUG) {
// possibly arbitrary statements or blocks
}
which, yes, isn't technically regular, but I ran the Eclipse code formatter on the files to make sure they all necessarily looked like this (our company's usual preferred code style):
if (DEBUG) {
statement();
while (whatever) {
blahblahblah(etc);
}
// ...
}
and then looking for this (this is Java regex syntax, of course)
^(\s*)if \(DEBUG.*(?:\n\1 .*)*\n\1\}
did the trick.
Finally did it.
^void.*(a|A)pple\(\)\n\{\n((\t.*\n)|(^$\n))*^\}
Function blocks aren't regular, so using a regular expression in this situation is a bad idea.
If you really have a huge number of functions that you need to delete (more than you can delete by hand (suggesting there's something wrong with your codebase — but I digress)) then you should write a quick brace-counting parser instead of trying to use regular expressions in this situation.
It should be pretty easy, especially if you can assume the braces are already balanced. Read in tokens, find one that matches "apple", then keep going until you reach the brace that matches with the one immediately after the "apple" token. Delete everything between.
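A minimal Python sketch of that brace-counting idea follows. It rests on several assumptions that are not in the question: braces are balanced, no braces appear inside strings or comments, and a function header looks like `type name(args)` followed by `{`. The `remove_functions` helper and its header regex are invented for this example; only the header is located with a regular expression, and the body is skipped by counting braces.

```python
import re


def remove_functions(source: str, word: str) -> str:
    """Delete every function whose name contains `word`, header through closing brace."""
    # Hypothetical header shape: "<anything on one line> name(args) {".
    header = re.compile(
        r"^[^\n;{}]*\b\w*" + re.escape(word) + r"\w*\s*\([^)]*\)\s*\{",
        re.MULTILINE)
    out, pos = [], 0
    while True:
        match = header.search(source, pos)
        if match is None:
            break
        # match.end() is just past the opening brace, so start at depth 1.
        depth, end = 1, match.end()
        while depth and end < len(source):
            if source[end] == "{":
                depth += 1
            elif source[end] == "}":
                depth -= 1
            end += 1
        out.append(source[pos:match.start()])  # keep everything before the header
        pos = end                              # resume after the matching brace
    out.append(source[pos:])
    return "".join(out)


code = (
    "int keep()\n{\n    return 1;\n}\n"
    "void eatapple()\n{\n    if (x) {\n        y();\n    }\n}\n"
    "int also_keep() { return 2; }\n"
)
cleaned = remove_functions(code, "apple")
```

Unlike the pure-regex attempts, the nested `if` block inside `eatapple` is no problem here, because the depth counter tracks it. Handling braces inside string literals or comments would need a real tokenizer.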
In theory, a regular language cannot express a sentence described by a context-free grammar. If it is a one-time job, why not just do it manually?
Switch to VI. Then you can select the opening brace and press d% to delete the section.

What are your language "hangups"? [closed]

Closed 11 years ago.
I've read some of the recent language vs. language questions with interest... Perl vs. Python, Python vs. Java, Can one language be better than another?
One thing I've noticed is that a lot of us have very superficial reasons for disliking languages. We notice these things at first glance and they turn us off. We shun what are probably perfectly good languages as a result of features that we'd probably learn to love or ignore in 2 seconds if we bothered.
Well, I'm as guilty as the next guy, if not more. Here goes:
Ruby: All the Ruby example code I see uses the puts command, and that's a sort of childish Yiddish anatomical term. So as a result, I can't take Ruby code seriously even though I should.
Python: The first time I saw it, I smirked at the whole significant whitespace thing. I avoided it for the next several years. Now I hardly use anything else.
Java: I don't like identifiersThatLookLikeThis. I'm not sure why exactly.
Lisp: I have trouble with all the parentheses. Things of different importance and purpose (function declarations, variable assignments, etc.) are not syntactically differentiated and I'm too lazy to learn what's what.
Fortran: uppercase everything hurts my eyes. I know modern code doesn't have to be written like that, but most example code is...
Visual Basic: it bugs me that Dim is used to declare variables, since I remember the good ol' days of GW-BASIC when it was only used to dimension arrays.
What languages did look right to me at first glance? Perl, C, QBasic, JavaScript, assembly language, BASH shell, FORTH.
Okay, now that I've aired my dirty laundry... I want to hear yours. What are your language hangups? What superficial features bother you? How have you gotten over them?
I hate Hate HATE "End Function" and "End IF" and "If... Then" parts of VB. I would much rather see a curly bracket instead.
PHP's function name inconsistencies.
// common parameters back-to-front
in_array(needle, haystack);
strpos(haystack, needle);
// _ to separate words, or not?
filesize();
file_exists();
// super globals prefix?
$GLOBALS;
$_POST;
I never really liked the keywords spelled backwards in some scripting shells
if-then-fi is bad enough, but case-in-esac is just getting silly
I just thought of another... I hate the mostly-meaningless URLs used in XML to define namespaces, e.g. xmlns="http://purl.org/rss/1.0/"
Pascal's Begin and End. Too verbose, not subject to bracket matching, and worse, there isn't a Begin for every End, eg.
Type foo = Record
// ...
end;
Although I'm mainly a PHP developer, I dislike languages that don't let me do enough things inline. E.g.:
$x = returnsArray();
$x[1];
instead of
returnsArray()[1];
or
function sort($a, $b) {
return $a < $b;
}
usort($array, 'sort');
instead of
usort($array, function($a, $b) { return $a < $b; });
I like object-oriented style. So it bugs me in Python to see len(str) to get the length of a string, or splitting strings like split(str, "|") in another language. That is fine in C; it doesn't have objects. But Python, D, etc. do have objects and use obj.method() other places. (I still think Python is a great language.)
Inconsistency is another big one for me. I do not like inconsistent naming in the same library: length(), size(), getLength(), getlength(), toUTFindex() (why not toUtfIndex?), Constant, CONSTANT, etc.
The long names in .NET bother me sometimes. Can't they shorten DataGridViewCellContextMenuStripNeededEventArgs somehow? What about ListViewVirtualItemsSelectionRangeChangedEventArgs?
And I hate deep directory trees. If a library/project has a 5 level deep directory tree, I'm going to have trouble with it.
C and C++'s syntax is a bit quirky. They reuse operators for different things. You're probably so used to it that you don't think about it (nor do I), but consider how many meanings parentheses have:
int main() // function declaration / definition
printf("hello") // function call
(int)x // type cast
2*(7+8) // override precedence
int (*)(int) // function pointer
int x(3) // initializer
if (condition) // special part of syntax of if, while, for, switch
And if in C++ you saw
foo<bar>(baz(),baaz)
you couldn't know the meaning without the definition of foo and bar.
the < and > might be a template instantiation, or might be less-than and greater-than (unusual but legal)
the () might be a function call, or might just be surrounding the comma operator (i.e. perform baz() for side effects, then return baaz).
The silly thing is that other languages have copied some of these characteristics!
Java, and its checked exceptions. I left Java for a while, dwelling in the .NET world, then recently came back.
It feels like, sometimes, my throws clause is more voluminous than my method content.
There's nothing in the world I hate more than php.
Variables with $, that's one extra odd character for every variable.
Members are accessed with -> for no apparent reason, one extra character for every member access.
A freakshow of language really.
No namespaces.
Strings are concatenated with a dot (.).
A freakshow of language.
All the []s and #s in Objective C. Their use is so different from the underlying C's native syntax that the first time I saw them it gave the impression that all the object-orientation had been clumsily bolted on as an afterthought.
I abhor the boiler plate verbosity of Java.
writing getters and setters for properties
checked exception handling and all the verbiage that implies
long lists of imports
Those, in connection with the Java convention of using veryLongVariableNames, sometimes have me thinking I'm back in the 80's, writing IDENTIFICATION DIVISION. at the top of my programs.
Hint: If you can automate the generation of part of your code in your IDE, that's a good hint that you're producing boilerplate code. With automated tools, it's not a problem to write, but it's a hindrance every time someone has to read that code - which is more often.
While I think it goes a bit overboard on type bureaucracy, Scala has successfully addressed some of these concerns.
Coding Style inconsistencies in team projects.
I'm working on a large team project where some contributors have used 4 spaces instead of the tab character.
Working with their code can be very annoying - I like to keep my code clean and with a consistent style.
It's bad enough when you use different standards for different languages, but in a web project with HTML, CSS, Javascript, PHP and MySQL, that's 5 languages, 5 different styles, and multiplied by the number of people working on the project.
I'd love to re-format my co-workers code when I need to fix something, but then the repository would think I changed every line of their code.
It irritates me sometimes how people expect there to be one language for all jobs. Depending on the task you are doing, each language has its advantages and disadvantages. I like the C-based syntax languages because it's what I'm most used to and I like the flexibility they tend to bestow on the developer. Of course, with great power comes great responsibility, and having the power to write 150 line LINQ statements doesn't mean you should.
I love the inline XML in the latest version of VB.NET although I don't like working with VB mainly because I find the IDE less helpful than the IDE for C#.
If Microsoft had to invent yet another C++-like language in C# why didn't they correct Java's mistake and implement support for RAII?
Case sensitivity.
What kinda hangover do you need to think that differentiating two identifiers solely by caSE is a great idea?
I hate semi-colons. I find they add a lot of noise and you rarely need to put two statements on a line. I prefer the style of Python and other languages... end of line is end of a statement.
Any language that can't fully decide if Arrays/Loop/string character indexes are zero based or one based.
I personally prefer zero based, but any language that mixes the two, or lets you "configure" which is used can drive you bonkers. (Apache Velocity - I'm looking in your direction!)
snip from the VTL reference (default is 1, but you can set it to 0):
# Default starting value of the loop
# counter variable reference.
directive.foreach.counter.initial.value = 1
(try merging 2 projects that used different counter schemes - ugh!)
In no particular order...
OCaml
Tuples definitions use * to separate items rather than ,. So, ("Juliet", 23, true) has the type (string * int * bool).
For being such an awesome language, the documentation has this haunting comment on threads: "The threads library is implemented by time-sharing on a single processor. It will not take advantage of multi-processor machines. Using this library will therefore never make programs run faster." JoCaml doesn't fix this problem.
^^^ I've heard the Jane Street guys were working to add concurrent GC and multi-core threads to OCaml, but I don't know how successful they've been. I can't imagine a language without multi-core threads and GC surviving very long.
No easy way to explore modules in the toplevel. Sure, you can write module q = List;; and the toplevel will happily print out the module definition, but that just seems hacky.
C#
Lousy type inference. Beyond the most trivial expressions, I have to give types to generic functions.
All the LINQ code I ever read uses method syntax, x.Where(item => ...).OrderBy(item => ...). No one ever uses expression syntax, from item in x where ... orderby ... select. Between you and me, I think expression syntax is silly, if for no other reason than that it looks "foreign" against the backdrop of all other C# and VB.NET code.
LINQ
Every other language uses the industry-standard names Map, Fold/Reduce/Inject, and Filter. LINQ has to be different and uses Select, Aggregate, and Where.
Functional Programming
Monads are mystifying. Having seen the Parser monad, Maybe monad, State, and List monads, I can understand perfectly how the code works; however, as a general design pattern, I can't seem to look at problems and say "hey, I bet a monad would fit perfect here".
Ruby
GRRRRAAAAAAAH!!!!! I mean... seriously.
VB
Module Hangups
Dim _juliet as String = "Too Wordy!"
Public Property Juliet() as String
Get
Return _juliet
End Get
Set (ByVal value as String)
_juliet = value
End Set
End Property
End Module
And setter declarations are the bane of my existence. Alright, so I change the data type of my property -- now I need to change the data type in my setter too? Why doesn't VB borrow from C# and simply incorporate an implicit variable called value?
.NET Framework
I personally like Java casing convention: classes are PascalCase, methods and properties are camelCase.
In C/C++, it annoys me how there are different ways of writing the same code.
e.g.
if (condition)
{
callSomeConditionalMethod();
}
callSomeOtherMethod();
vs.
if (condition)
callSomeConditionalMethod();
callSomeOtherMethod();
equate to the same thing, but different people have different styles. I wish the original standard was more strict about making a decision about this, so we wouldn't have this ambiguity. It leads to arguments and disagreements in code reviews!
I found Perl's use of "defined" and "undefined" values to be so useful that I have trouble using scripting languages without it.
Perl:
($lastname, $firstname, $rest) = split(' ', $fullname);
This statement performs well no matter how many words are in $fullname. Try it in Python, and it explodes if $fullname doesn't contain exactly three words.
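For what it's worth, the Python failure mentioned here is a ValueError from tuple unpacking, and it can be sidestepped by padding the list before unpacking. The `split_name` helper below is a made-up sketch, not a standard function. It also makes one design choice of its own: with more than three words, rest receives the remainder of the string.

```python
from typing import Optional, Tuple


def split_name(fullname: str) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Split into (lastname, firstname, rest), padding missing parts with None."""
    parts = fullname.split(None, 2)       # at most three pieces
    parts += [None] * (3 - len(parts))    # pad so unpacking cannot fail
    lastname, firstname, rest = parts
    return lastname, firstname, rest


# The naive form raises ValueError unless there are exactly three words:
#     lastname, firstname, rest = "Smith John".split()
```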
SQL: they say you should not use cursors, and when you do, you really understand why...
it's such heavy going!
DECLARE mycurse CURSOR LOCAL FAST_FORWARD READ_ONLY
FOR
SELECT field1, field2, fieldN FROM atable
OPEN mycurse
FETCH NEXT FROM mycurse INTO @Var1, @Var2, @VarN
WHILE @@fetch_status = 0
BEGIN
-- do something really clever...
FETCH NEXT FROM mycurse INTO @Var1, @Var2, @VarN
END
CLOSE mycurse
DEALLOCATE mycurse
Although I program primarily in Python, it irks me endlessly that lambda bodies must be expressions.
I'm still wrapping my brain around JavaScript, and as a whole, it's mostly acceptable. But why is it so hard to create a namespace? In TCL they're just ugly, but in JavaScript, it's actually a rigmarole AND completely unreadable.
In SQL, how come everything is just one huge freakin' SELECT statement?
In Ruby, I very strongly dislike how methods do not require self. to be called on current instance, but properties do (otherwise they will clash with locals); i.e.:
def foo()
123
end
def foo=(x)
end
def bar()
x = foo() # okay, same as self.foo()
x = foo # not okay, reads unassigned local variable foo
foo = 123 # not okay, assigns local variable foo
end
To my mind, it's very inconsistent. I'd rather prefer to either always require self. in all cases, or to have a sigil for locals.
Java's packages. I find them complex, more so because I am not a corporation.
I vastly prefer namespaces. I'll get over it, of course - I'm playing with the Android SDK, and Eclipse removes a lot of the pain. I've never had a machine that could run it interactively before, and now that I do, I'm very impressed.
Prolog's if-then-else syntax.
x -> y ; z
The problem is that ";" is the "or" operator, so the above looks like "x implies y or z".
Java
Generics (Java's version of templates) are limited. I cannot call methods of the type parameter's class and I cannot create instances of it. Generics are used by containers, but I can just as well use containers of Object instances.
No multiple inheritance. If a use of multiple inheritance does not lead to the diamond problem, it should be allowed. Interfaces should allow a default implementation of their methods. An example of the problem: the interface MouseListener has 5 methods, one for each event. If I want to handle just one of them, I have to implement the other 4 as empty methods.
It does not let me choose to manage the memory of some objects manually.
The Java API uses complex combinations of classes to do simple tasks. For example, if I want to read from a file, I have to use many classes (FileReader, FileInputStream).
Python
Indentation is part of the syntax. I would prefer the word "end" to mark the end of a block; then the word "pass" would not be needed.
In classes, the word "self" should not be needed as an argument of methods.
C++
Headers are the worst problem. I have to list the functions in a header file and implement them in a cpp file. A header cannot hide the dependencies of a class: if class A uses class B privately as a field and I include the header of A, the header of B will be included too.
Strings and arrays came from C, and they do not provide a length field. It is difficult to control whether std::string and std::vector will use the stack or the heap. I have to use pointers with std::string and std::vector if I want to assign them, pass them as arguments to a function or return them, because the "=" operator copies the entire structure.
I cannot control the constructor and destructor. It is difficult to create an array of objects without a default constructor, or to choose which constructor to use with if and switch statements.
In most languages, file access. VB.NET is the only language so far where file access makes any sense to me. I do not understand why if I want to check if a file exists, I should use File.exists("") or something similar instead of creating a file object (actually FileInfo in VB.NET) and asking if it exists. And then if I want to open it, I ask it to open: (assuming a FileInfo object called fi) fi.OpenRead, for example. Returns a stream. Nice. Exactly what I wanted. If I want to move a file, fi.MoveTo. I can also do fi.CopyTo. What is this nonsense about not making files full-fledged objects in most languages? Also, if I want to iterate through the files in a directory, I can just create the directory object and call .GetFiles. Or I can do .GetDirectories, and I get a whole new set of DirectoryInfo objects to play with.
Admittedly, Java has some of this file stuff, but this nonsense of having to have a whole object to tell it how to list files is just silly.
Also, I hate ::, ->, => and all other multi-character operators except for <= and >= (and maybe -- and ++).
[Disclaimer: I only have a passing familiarity with VB, so take my comments with a grain of salt]
I Hate How Every Keyword In VB Is Capitalized Like This. I saw a blog post the other week (month?) about someone who tried writing VB code without any capital letters (they did something to a compiler that would let them compile VB code like that), and the language looked much nicer!
My big hangup is MATLAB's syntax. I use it, and there are things I like about it, but it has so many annoying quirks. Let's see.
Matrices are indexed with parentheses. So if you see something like Image(350,260), you have no clue from that whether we're getting an element from the Image matrix, or if we're calling some function called Image and passing arguments to it.
Scope is insane. I seem to recall that for loop index variables stay in scope after the loop ends.
If you forget to stick a semicolon after an assignment, the value will be dumped to standard output.
You may have one function per file. This proves to be very annoying for organizing one's work.
I'm sure I could come up with more if I thought about it.

Resources