What is the difference between syntax and semantics in programming languages (like C, C++)?
TL; DR
In summary, syntax is the concept that concerns itself only whether or not the sentence is valid for the grammar of the language. Semantics is about whether or not the sentence has a valid meaning.
Long answer:
Syntax is about the structure or the grammar of the language. It answers the question: how do I construct a valid sentence? All languages, even English and other human (aka "natural") languages have grammars, that is, rules that define whether or not the sentence is properly constructed.
Here are some C language syntax rules:
separate statements with a semi-colon
enclose the conditional expression of an IF statement inside parentheses
group multiple statements into a single statement by enclosing in curly braces
data types and variables must be declared before the first executable statement (this feature has been dropped in C99. C99 and latter allow mixed type declarations.)
Semantics is about the meaning of the sentence. It answers the questions: is this sentence valid? If so, what does the sentence mean? For example:
x++; // increment
foo(xyz, --b, &qrs); // call foo
are syntactically valid C statements. But what do they mean? Is it even valid to attempt to transform these statements into an executable sequence of instructions? These questions are at the heart of semantics.
Consider the ++ operator in the first statement. First of all, is it even valid to attempt this?
If x is a float data type, this statement has no meaning (according to the C language rules) and thus it is an error even though the statement is syntactically correct.
If x is a pointer to some data type, the meaning of the statement is to "add sizeof(some data type) to the value at address x and store the result into the location at address x".
If x is a scalar, the meaning of the statement is "add one to the value at address x and store the result into the location at address x".
Finally, note that some semantics can not be determined at compile-time and therefore must be evaluated at run-time. In the ++ operator example, if x is already at the maximum value for its data type, what happens when you try to add 1 to it? Another example: what happens if your program attempts to dereference a pointer whose value is NULL?
Syntax refers to the structure of a language, tracing its etymology to how things are put together.
For example you might require the code to be put together by declaring a type then a name and then a semicolon, to be syntactically correct.
Type token;
On the other hand, the semantics is about meaning.
A compiler or interpreter could complain about syntax errors. Your co-workers will complain about semantics.
Semantics is what your code means--what you might describe in pseudo-code. Syntax is the actual structure--everything from variable names to semi-colons.
Wikipedia has the answer. Read syntax (programming languages) & semantics (computer science) wikipages.
Or think about the work of any compiler or interpreter. The first step is lexical analysis where tokens are generated by dividing string into lexemes then parsing, which build some abstract syntax tree (which is a representation of syntax). The next steps involves transforming or evaluating these AST (semantics).
Also, observe that if you defined a variant of C where every keyword was transformed into its French equivalent (so if becoming si, do becoming faire, else becoming sinon etc etc...) you would definitely change the syntax of your language, but you won't change much the semantics: programming in that French-C won't be easier!
You need correct syntax to compile.
You need correct semantics to make it work.
Late to the party - but to me, the answers here seem correct but incomplete.
Pragmatically, I would distinguish between three levels:
Syntax
Low level semantics
High level semantics
1. SYNTAX
Syntax is the formal grammar of the language, which specifies a well-formed statement the compiler will recognise.
So in C, the syntax of variable initialisation is:
data_type variable_name = value_expression;
Example:
int volume = 66 * 22 * 55;
While in Go, which offers type inference, one form of initialisation is:
variable_name := value_expression
Example:
volume := 66 * 22 * 55
Clearly, a Go compiler won't recognise the C syntax, and vice versa.
2. LOW LEVEL SEMANTICS
Where syntax is concerned with form, semantics is concerned with meaning.
In natural languages, a sentence can be syntactically correct but semantically meaningless. For example:
The man bought the infinity from the store.
The sentence is grammatically correct but doesn't make real-world sense.
At the low level, programming semantics is concerned with whether a statement with correct syntax is also consistent with the semantic rules as expressed by the developer using the type system of the language.
For example, this is a syntactically correct assignment statement in Java, but semantically it's an error as it tries to assign an int to a String
String firstName = 23;
So type systems are intended to protect the developer from unintended slips of meaning at the low level.
Loosely typed languages like JavaScript or Python provide very little semantic protection, while languages like Haskell or F# with expressive type systems provide the skilled developer with a much higher level of protection.
For example, in F# your ShoppingCart type can specify that the cart must be in one of three states:
type ShoppingCart =
| EmptyCart // no data
| ActiveCart of ActiveCartData
| PaidCart of PaidCartData
Now the compiler can check that your code hasn't tried to put the cart into an illegal state.
In Python, you would have to write your own code to check for valid state.
3. HIGH LEVEL SEMANTICS
Finally, at a higher level, semantics is concerned with what the code is intended to achieve - the reason that the program is being written.
This can be expressed as pseudo-code which could be implemented in any complete language. For example:
// Check for an open trade for EURUSD
// For any open trade, close if the profit target is reached
// If there is no open trade for EURUSD, check for an entry signal
// For an entry signal, use risk settings to calculate trade size
// Submit the order.
In this (heroically simplified) scenario, you are making a high-level semantic error if your system enters two trades at once for EURUSD, enters a trade in the wrong direction, miscalculates the trade size, and so on.
TL; DR
If you screw up your syntax or low-level semantics, your compiler will complain.
If you screw up your high-level semantics, your program isn't fit for purpose and your customer will complain.
Syntax is the structure or form of expressions, statements, and program units but Semantics is the meaning of those expressions, statements, and program units. Semantics follow directly from syntax.
Syntax refers to the structure/form of the code that a specific programming language specifies but Semantics deal with the meaning assigned to the symbols, characters and words.
Understanding how the compiler sees the code
Usually, syntax and semantics analysis of the code is done in the 'frontend' part of the compiler.
Syntax: Compiler generates tokens for each keyword and symbols: the token contains the information- type of keyword and its location in the code.
Using these tokens, an AST(short for Abstract Syntax Tree) is created and analysed.
What compiler actually checks here is whether the code is lexically meaningful i.e. does the 'sequence of keywords' comply with the language rules? As suggested in previous answers, you can see it as the grammar of the language(not the sense/meaning of the code).
Side note: Syntax errors are reported in this phase.(returns tokens with the error type to the system)
Semantics: Now, the compiler will check whether your code operations 'makes sense'.
e.g. If the language supports Type Inference, sematic error will be reported if you're trying to assign a string to a float. OR declaring the same variable twice.
These are errors that are 'grammatically'/ syntaxially correct, but makes no sense during the operation.
Side note: For checking whether the same variable is declared twice, compiler manages a symbol table
So, the output of these 2 frontend phases is an annotated AST(with data types) and symbol table.
Understanding it in a less technical way
Considering the normal language we use; here, English:
e.g. He go to the school. - Incorrect grammar/syntax, though he wanted to convey a correct sense/semantic.
e.g. He goes to the cold. - cold is an adjective. In English, we might say this doesn't comply with grammar, but it actually is the closest example to incorrect semantic with correct syntax I could think of.
He drinks rice (wrong semantic- meaningless, right syntax- grammar)
Hi drink water (right semantic- has meaning, wrong syntax- grammar)
Syntax: It is referring to grammatically structure of the language.. If you are writing the c language . You have to very care to use of data types, tokens [ it can be literal or symbol like "printf()". It has 3 tokes, "printf, (, )" ]. In the same way, you have to very careful, how you use function, function syntax, function declaration, definition, initialization and calling of it.
While semantics, It concern to logic or concept of sentence or statements. If you saying or writing something out of concept or logic, then you are semantically wrong.
Often enough, I find myself dealing with lists of function options (or more general replacement lists) of the form {foo->value,...}. This leads to bugs when foo already has a value in $Context. One obvious way to prevent this is using a string "foo" instead of the symbol: {"foo"->value,...}. This works, but seems to draw ire of some seasoned LISPers I know, who chastise me for conflating symbols and strings and tell me to use built-in quoting constructs.
While it is certainly possible to write code that avoids collisions without using strings, it often seems more trouble than it is worth. On the other hand, I haven't seen too many examples of {"string"->value} type replacement rules. So the question to you is -- is this an acceptable usage pattern?.. Are there cases where it is particularly appropriate?.. Where should it be avoided?..
In my opinion (disclaimer - it is only my opinion), it is best to avoid using strings as option names, at least for "main" options in your function. Strings OTOH are totally fine as settings (r.h.s. of options). This is not to say that you can not use strings, just as you noted. Perhaps, they could be more appropriate for sub-options, and they are used in this way by many system functions (usually "superfunctions" like NDSolve, that may have sub-options within options). The main problems I see with using strings is that they reduce the introspection capabilities, both for the system and for the user. In other words, it is harder to discover an option that has a string name than that with a symbol name - for the latter I can just inspect the names of the symbols in a package, and also symbolic option names have usage messages. You may also want to automate some things, such as writing a utility that finds all option names in the package etc. It is easier to do when option names are symbols, since they all belong to the same context. It is also easy to discover that some options do not have usage messages, one can do that automatically by writing a utility function.
Finally, you may have a better protection against accidental collisions of similar option names. It may be, that many option sequences are passed to your function, and occasionally they may contain options with the same name. If option names were symbols, full symbol names would be different. Then, you will both get a shadowing warning, and at the same time a protection - only the correct option (full) name will be used. For string, you don't get any warning, and may end up using incorrect option setting, if the duplicate string option name with a wrong setting (intended for a different function, say) happens to be first in the list. This scenario is more likely to occur in larger projects, but bugs like this are probably very hard to catch (this is a guess, I never had such situation).
As for possible collisions, if you follow some naming conventions such as option name always starting with a capital letter, plus put most of your code in packages, and do not start your variable or function names (for functions in the interactive session), with a capital letter, then you will greatly reduce the chance of such collisions. Additionally, you should Protect option names, when you define them, or at the end of the package. Then, the collisions will be detected as cases of shadowing. Avoiding shadowing, OTOH, is a general necessity, so the case of options is no more special in this respect than for function names etc.
I need a very specific tool for VB (or multi-language). I thought I would ask if one already exists, before I start making one myself (probably, in python).
What I need:
The tool must crawl, recursivelly or not, a path, searching for a list of extension, such as .bas, .frm, .xxx
Then, It has to parse that files, searching for functions, routines, etc.
And finally, it must output what it found.
I based this on the idea of, "reducing code redundance", in an scenario where, bad programmers make a lot of functions that do the same thing, sometimes with the same name, sometimes not. There are 4 cases:
Case 1: Same name, Same content.
Case 2: Same name, Diff content.
Case 3: Diff name, Same content.
Case 4: Diff name, Diff Content.
So, the output, should be something like this
===========================================================================
RESULT
===========================================================================
Errors:
---------------------------------------------------------------------------
==Name, ==Content --> 3: (Func(), Foo(), Bar()) In files (f,f2,f3)
!=Name, ==Content --> 2: (Func() + Func1(), Bar() + Bar1()) In Files (f4)
---------------------------------------------------------------------------
Warnings:
==Name, !=Content --> 1 (Foobar()) In Files (f19)
---------------------------------------------------------------------------
This is to give you an idea of what I need.
So, the question is: is there any tool that acomplish something similar to this???
P.S: Yes, we should write good code, in first instance, but, you know, stuff happens.
What you want is a "clone detector". These tools find copy-and-pasted code across a large set of designated files. Clones are not just of functions; they can be code blocks, data declarations, etc.
There are a variety of detectors out there, and you should know how they work before you attempt to build one of your own.
Some simply match lines for exact equivalence. While these demonstrate the basic idea, their detection is not good because they don't take into account the fact that cloned code often has variations; what people really do is clone-and-edit when making copies.
Some match sequences of langauge tokens, e.g., identifiers, keywords, literals, punctuation. These at least are relatively tolerant of whitespace changes. And they can find clones in which single tokens have been substituted for single tokens. However, because they don't understand language structure (blocks, statements, function bodies) they often match sequences that cross such structure boundaries (e.g., "} {" is often considered a clone by these tools), they produce rather high false-positive indications of (non)clones. Some of these attempt to limit the matches to key program structures, such as complete functions, as you have kind of suggested.
More sophisticated detectors match program structures.
Our CloneDR (I'm the original author) is a detector that
uses compiler-quality parsing to abstract syntax trees, which extracts the precise structure of the code. It does this for many languages (including VB6 and VBScript), locating clones as arbitrary functions, blocks, statements or declarations, with parameters shows how the clones vary. CloneDR can find clones in spite of formatting changes, changes in comment locations or content, and even variations where complex constructs (multiple statements or expressions) have been used as alternatives to simple ones (e.g., a single statment or a literal). While it tends to have a high detection rate(it usually finds 10-20% removable redundancy!), its false-positive rate tends to be considerably lower than the token based detectors. You can see sample reports for
a variety of different langauges at the link above.
See Comparison and Evaluation of Code Clone Detection Techniques and Tools: A Qualitative Approach which explicitly discusses different approaches and benefits, and compares a large number of detectors including CloneDR.
EDIT October 2010: ... When I first wrote this response, I assumed the OP was interested in VB.net, which CloneDR didn't do. We've since added VB.net, VB6 and VBScript capability to CloneDR. (Parsing VB.net in its modern form is a lot messier than one might imagine for "simple"(!) langauge like Visual Basic).
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I've read some of the recent language vs. language questions with interest... Perl vs. Python, Python vs. Java, Can one language be better than another?
One thing I've noticed is that a lot of us have very superficial reasons for disliking languages. We notice these things at first glance and they turn us off. We shun what are probably perfectly good languages as a result of features that we'd probably learn to love or ignore in 2 seconds if we bothered.
Well, I'm as guilty as the next guy, if not more. Here goes:
Ruby: All the Ruby example code I see uses the puts command, and that's a sort of childish Yiddish anatomical term. So as a result, I can't take Ruby code seriously even though I should.
Python: The first time I saw it, I smirked at the whole significant whitespace thing. I avoided it for the next several years. Now I hardly use anything else.
Java: I don't like identifiersThatLookLikeThis. I'm not sure why exactly.
Lisp: I have trouble with all the parentheses. Things of different importance and purpose (function declarations, variable assignments, etc.) are not syntactically differentiated and I'm too lazy to learn what's what.
Fortran: uppercase everything hurts my eyes. I know modern code doesn't have to be written like that, but most example code is...
Visual Basic: it bugs me that Dim is used to declare variables, since I remember the good ol' days of GW-BASIC when it was only used to dimension arrays.
What languages did look right to me at first glance? Perl, C, QBasic, JavaScript, assembly language, BASH shell, FORTH.
Okay, now that I've aired my dirty laundry... I want to hear yours. What are your language hangups? What superficial features bother you? How have you gotten over them?
I hate Hate HATE "End Function" and "End IF" and "If... Then" parts of VB. I would much rather see a curly bracket instead.
PHP's function name inconsistencies.
// common parameters back-to-front
in_array(needle, haystack);
strpos(haystack, needle);
// _ to separate words, or not?
filesize();
file_exists;
// super globals prefix?
$GLOBALS;
$_POST;
I never really liked the keywords spelled backwards in some scripting shells
if-then-fi is bad enough, but case-in-esac is just getting silly
I just thought of another... I hate the mostly-meaningless URLs used in XML to define namespaces, e.g. xmlns="http://purl.org/rss/1.0/"
Pascal's Begin and End. Too verbose, not subject to bracket matching, and worse, there isn't a Begin for every End, eg.
Type foo = Record
// ...
end;
Although I'm mainly a PHP developer, I dislike languages that don't let me do enough things inline. E.g.:
$x = returnsArray();
$x[1];
instead of
returnsArray()[1];
or
function sort($a, $b) {
return $a < $b;
}
usort($array, 'sort');
instead of
usort($array, function($a, $b) { return $a < $b; });
I like object-oriented style. So it bugs me in Python to see len(str) to get the length of a string, or splitting strings like split(str, "|") in another language. That is fine in C; it doesn't have objects. But Python, D, etc. do have objects and use obj.method() other places. (I still think Python is a great language.)
Inconsistency is another big one for me. I do not like inconsistent naming in the same library: length(), size(), getLength(), getlength(), toUTFindex() (why not toUtfIndex?), Constant, CONSTANT, etc.
The long names in .NET bother me sometimes. Can't they shorten DataGridViewCellContextMenuStripNeededEventArgs somehow? What about ListViewVirtualItemsSelectionRangeChangedEventArgs?
And I hate deep directory trees. If a library/project has a 5 level deep directory tree, I'm going to have trouble with it.
C and C++'s syntax is a bit quirky. They reuse operators for different things. You're probably so used to it that you don't think about it (nor do I), but consider how many meanings parentheses have:
int main() // function declaration / definition
printf("hello") // function call
(int)x // type cast
2*(7+8) // override precedence
int (*)(int) // function pointer
int x(3) // initializer
if (condition) // special part of syntax of if, while, for, switch
And if in C++ you saw
foo<bar>(baz(),baaz)
you couldn't know the meaning without the definition of foo and bar.
the < and > might be a template instantiation, or might be less-than and greater-than (unusual but legal)
the () might be a function call, or might be just surrounding the comma operator (ie. perform baz() for size-effects, then return baaz).
The silly thing is that other languages have copied some of these characteristics!
Java, and its checked exceptions. I left Java for a while, dwelling in the .NET world, then recently came back.
It feels like, sometimes, my throws clause is more voluminous than my method content.
There's nothing in the world I hate more than php.
Variables with $, that's one extra odd character for every variable.
Members are accessed with -> for no apparent reason, one extra character for every member access.
A freakshow of language really.
No namespaces.
Strings are concatenated with ..
A freakshow of language.
All the []s and #s in Objective C. Their use is so different from the underlying C's native syntax that the first time I saw them it gave the impression that all the object-orientation had been clumsily bolted on as an afterthought.
I abhor the boiler plate verbosity of Java.
writing getters and setters for properties
checked exception handling and all the verbiage that implies
long lists of imports
Those, in connection with the Java convention of using veryLongVariableNames, sometimes have me thinking I'm back in the 80's, writing IDENTIFICATION DIVISION. at the top of my programs.
Hint: If you can automate the generation of part of your code in your IDE, that's a good hint that you're producing boilerplate code. With automated tools, it's not a problem to write, but it's a hindrance every time someone has to read that code - which is more often.
While I think it goes a bit overboard on type bureaucracy, Scala has successfully addressed some of these concerns.
Coding Style inconsistencies in team projects.
I'm working on a large team project where some contributors have used 4 spaces instead of the tab character.
Working with their code can be very annoying - I like to keep my code clean and with a consistent style.
It's bad enough when you use different standards for different languages, but in a web project with HTML, CSS, Javascript, PHP and MySQL, that's 5 languages, 5 different styles, and multiplied by the number of people working on the project.
I'd love to re-format my co-workers code when I need to fix something, but then the repository would think I changed every line of their code.
It irritates me sometimes how people expect there to be one language for all jobs. Depending on the task you are doing, each language has its advantages and disadvantages. I like the C-based syntax languages because it's what I'm most used to and I like the flexibility they tend to bestow on the developer. Of course, with great power comes great responsibility, and having the power to write 150 line LINQ statements doesn't mean you should.
I love the inline XML in the latest version of VB.NET although I don't like working with VB mainly because I find the IDE less helpful than the IDE for C#.
If Microsoft had to invent yet another C++-like language in C# why didn't they correct Java's mistake and implement support for RAII?
Case sensitivity.
What kinda hangover do you need to think that differentiating two identifiers solely by caSE is a great idea?
I hate semi-colons. I find they add a lot of noise and you rarely need to put two statements on a line. I prefer the style of Python and other languages... end of line is end of a statement.
Any language that can't fully decide if Arrays/Loop/string character indexes are zero based or one based.
I personally prefer zero based, but any language that mixes the two, or lets you "configure" which is used can drive you bonkers. (Apache Velocity - I'm looking in your direction!)
snip from the VTL reference (default is 1, but you can set it to 0):
# Default starting value of the loop
# counter variable reference.
directive.foreach.counter.initial.value = 1
(try merging 2 projects that used different counter schemes - ugh!)
In no particular order...
OCaml
Tuples definitions use * to separate items rather than ,. So, ("Juliet", 23, true) has the type (string * int * bool).
For being such an awesome language, the documentation has this haunting comment on threads: "The threads library is implemented by time-sharing on a single processor. It will not take advantage of multi-processor machines. Using this library will therefore never make programs run faster." JoCaml doesn't fix this problem.
^^^ I've heard the Jane Street guys were working to add concurrent GC and multi-core threads to OCaml, but I don't know how successful they've been. I can't imagine a language without multi-core threads and GC surviving very long.
No easy way to explore modules in the toplevel. Sure, you can write module q = List;; and the toplevel will happily print out the module definition, but that just seems hacky.
C#
Lousy type inference. Beyond the most trivial expressions, I have to give types to generic functions.
All the LINQ code I ever read uses method syntax, x.Where(item => ...).OrderBy(item => ...). No one ever uses expression syntax, from item in x where ... orderby ... select. Between you and me, I think expression syntax is silly, if for no other reason than that it looks "foreign" against the backdrop of all other C# and VB.NET code.
LINQ
Every other language uses the industry standard names are Map, Fold/Reduce/Inject, and Filter. LINQ has to be different and uses Select, Aggregate, and Where.
Functional Programming
Monads are mystifying. Having seen the Parser monad, Maybe monad, State, and List monads, I can understand perfectly how the code works; however, as a general design pattern, I can't seem to look at problems and say "hey, I bet a monad would fit perfect here".
Ruby
GRRRRAAAAAAAH!!!!! I mean... seriously.
VB
Module Hangups
Dim _juliet as String = "Too Wordy!"
Public Property Juliet() as String
Get
Return _juliet
End Get
Set (ByVal value as String)
_juliet = value
End Set
End Property
End Module
And setter declarations are the bane of my existence. Alright, so I change the data type of my property -- now I need to change the data type in my setter too? Why doesn't VB borrow from C# and simply incorporate an implicit variable called value?
.NET Framework
I personally like Java casing convention: classes are PascalCase, methods and properties are camelCase.
In C/C++, it annoys me how there are different ways of writing the same code.
e.g.
if (condition)
{
callSomeConditionalMethod();
}
callSomeOtherMethod();
vs.
if (condition)
callSomeConditionalMethod();
callSomeOtherMethod();
equate to the same thing, but different people have different styles. I wish the original standard was more strict about making a decision about this, so we wouldn't have this ambiguity. It leads to arguments and disagreements in code reviews!
I found Perl's use of "defined" and "undefined" values to be so useful that I have trouble using scripting languages without it.
Perl:
($lastname, $firstname, $rest) = split(' ', $fullname);
This statement performs well no matter how many words are in $fullname. Try it in Python, and it explodes if $fullname doesn't contain exactly three words.
SQL, they say you should not use cursors and when you do, you really understand why...
its so heavy going!
DECLARE mycurse CURSOR LOCAL FAST_FORWARD READ_ONLY
FOR
SELECT field1, field2, fieldN FROM atable
OPEN mycurse
FETCH NEXT FROM mycurse INTO #Var1, #Var2, #VarN
WHILE ##fetch_status = 0
BEGIN
-- do something really clever...
FETCH NEXT FROM mycurse INTO #Var1, #Var2, #VarN
END
CLOSE mycurse
DEALLOCATE mycurse
Although I program primarily in python, It irks me endlessly that lambda body's must be expressions.
I'm still wrapping my brain around JavaScript, and as a whole, Its mostly acceptable. Why is it so hard to create a namespace. In TCL they're just ugly, but in JavaScript, it's actually a rigmarole AND completely unreadable.
In SQL how come everything is just one, huge freekin SELECT statement.
In Ruby, I very strongly dislike how methods do not require self. to be called on current instance, but properties do (otherwise they will clash with locals); i.e.:
def foo()
123
end
def foo=(x)
end
def bar()
x = foo() # okay, same as self.foo()
x = foo # not okay, reads unassigned local variable foo
foo = 123 # not okay, assigns local variable foo
end
To my mind, it's very inconsistent. I'd rather prefer to either always require self. in all cases, or to have a sigil for locals.
Java's packages. I find them complex, more so because I am not a corporation.
I vastly prefer namespaces. I'll get over it, of course - I'm playing with the Android SDK, and Eclipse removes a lot of the pain. I've never had a machine that could run it interactively before, and now I do I'm very impressed.
Prolog's if-then-else syntax.
x -> y ; z
The problem is that ";" is the "or" operator, so the above looks like "x implies y or z".
Java
Generics (Java version of templates) are limited. I can not call methods of the class and I can not create instances of the class. Generics are used by containers, but I can use containers of instances of Object.
No multiple inheritance. If a multiple inheritance use does not lead to diamond problem, it should be allowed. It should allow to write a default implementation of interface methods, a example of problem: the interface MouseListener has 5 methods, one for each event. If I want to handle just one of them, I have to implement the 4 other methods as an empty method.
It does not allow to choose to manually manage memory of some objects.
Java API uses complex combination of classes to do simple tasks. Example, if I want to read from a file, I have to use many classes (FileReader, FileInputStream).
Python
Indentation is part of syntax, I prefer to use the word "end" to indicate end of block and the word "pass" would not be needed.
In classes, the word "self" should not be needed as argument of functions.
C++
Headers are the worst problem. I have to list the functions in a header file and implement them in a cpp file. It can not hide dependencies of a class. If a class A uses the class B privately as a field, if I include the header of A, the header of B will be included too.
Strings and arrays came from C, they do not provide a length field. It is difficult to control if std::string and std::vector will use stack or heap. I have to use pointers with std::string and std::vector if I want to use assignment, pass as argument to a function or return it, because its "=" operator will copy entire structure.
I can not control the constructor and destructor. It is difficult to create an array of objects without a default constructor or choose what constructor to use with if and switch statements.
In most languages, file access. VB.NET is the only language so far where file access makes any sense to me. I do not understand why if I want to check if a file exists, I should use File.exists("") or something similar instead of creating a file object (actually FileInfo in VB.NET) and asking if it exists. And then if I want to open it, I ask it to open: (assuming a FileInfo object called fi) fi.OpenRead, for example. Returns a stream. Nice. Exactly what I wanted. If I want to move a file, fi.MoveTo. I can also do fi.CopyTo. What is this nonsense about not making files full-fledged objects in most languages? Also, if I want to iterate through the files in a directory, I can just create the directory object and call .GetFiles. Or I can do .GetDirectories, and I get a whole new set of DirectoryInfo objects to play with.
Admittedly, Java has some of this file stuff, but this nonsense of having to have a whole object to tell it how to list files is just silly.
Also, I hate ::, ->, => and all other multi-character operators except for <= and >= (and maybe -- and ++).
[Disclaimer: i only have a passing familiarity with VB, so take my comments with a grain of salt]
I Hate How Every Keyword In VB Is Capitalized Like This. I saw a blog post the other week (month?) about someone who tried writing VB code without any capital letters (they did something to a compiler that would let them compile VB code like that), and the language looked much nicer!
My big hangup is MATLAB's syntax. I use it, and there are things I like about it, but it has so many annoying quirks. Let's see.
Matrices are indexed with parentheses. So if you see something like Image(350,260), you have no clue from that whether we're getting an element from the Image matrix, or if we're calling some function called Image and passing arguments to it.
Scope is insane. I seem to recall that for loop index variables stay in scope after the loop ends.
If you forget to stick a semicolon after an assignment, the value will be dumped to standard output.
You may have one function per file. This proves to be very annoying for organizing one's work.
I'm sure I could come up with more if I thought about it.