What are your language "hangups"? [closed] - syntax

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I've read some of the recent language vs. language questions with interest... Perl vs. Python, Python vs. Java, Can one language be better than another?
One thing I've noticed is that a lot of us have very superficial reasons for disliking languages. We notice these things at first glance and they turn us off. We shun what are probably perfectly good languages as a result of features that we'd probably learn to love or ignore in 2 seconds if we bothered.
Well, I'm as guilty as the next guy, if not more. Here goes:
Ruby: All the Ruby example code I see uses the puts command, and that's a sort of childish Yiddish anatomical term. So as a result, I can't take Ruby code seriously even though I should.
Python: The first time I saw it, I smirked at the whole significant whitespace thing. I avoided it for the next several years. Now I hardly use anything else.
Java: I don't like identifiersThatLookLikeThis. I'm not sure why exactly.
Lisp: I have trouble with all the parentheses. Things of different importance and purpose (function declarations, variable assignments, etc.) are not syntactically differentiated and I'm too lazy to learn what's what.
Fortran: uppercase everything hurts my eyes. I know modern code doesn't have to be written like that, but most example code is...
Visual Basic: it bugs me that Dim is used to declare variables, since I remember the good ol' days of GW-BASIC when it was only used to dimension arrays.
What languages did look right to me at first glance? Perl, C, QBasic, JavaScript, assembly language, BASH shell, FORTH.
Okay, now that I've aired my dirty laundry... I want to hear yours. What are your language hangups? What superficial features bother you? How have you gotten over them?

I hate Hate HATE "End Function" and "End IF" and "If... Then" parts of VB. I would much rather see a curly bracket instead.

PHP's function name inconsistencies.
// common parameters back-to-front
in_array(needle, haystack);
strpos(haystack, needle);
// _ to separate words, or not?
filesize();
file_exists();
// super globals prefix?
$GLOBALS;
$_POST;

I never really liked the keywords spelled backwards in some scripting shells
if-then-fi is bad enough, but case-in-esac is just getting silly

I just thought of another... I hate the mostly-meaningless URLs used in XML to define namespaces, e.g. xmlns="http://purl.org/rss/1.0/"

Pascal's Begin and End. Too verbose, not subject to bracket matching, and worse, there isn't a Begin for every End, eg.
Type foo = Record
// ...
end;

Although I'm mainly a PHP developer, I dislike languages that don't let me do enough things inline. E.g.:
$x = returnsArray();
$x[1];
instead of
returnsArray()[1];
or
function sort($a, $b) {
return $a < $b;
}
usort($array, 'sort');
instead of
usort($array, function($a, $b) { return $a < $b; });

I like object-oriented style. So it bugs me in Python to see len(str) to get the length of a string, or splitting strings like split(str, "|") in another language. That is fine in C; it doesn't have objects. But Python, D, etc. do have objects and use obj.method() other places. (I still think Python is a great language.)
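For what it's worth, Python's len() is a thin wrapper over a method call: it dispatches to the object's __len__ method. A minimal sketch, where the Playlist class is a made-up example, shows both spellings side by side:

```python
class Playlist:
    """Hypothetical container used only for illustration."""

    def __init__(self, tracks):
        self.tracks = list(tracks)

    def __len__(self):
        # len(obj) dispatches to this method.
        return len(self.tracks)


playlist = Playlist(["intro", "verse", "outro"])

function_style = len(playlist)       # the built-in function form
method_style = playlist.__len__()    # the underlying method call
string_parts = "a|b|c".split("|")    # splitting, by contrast, is already a method
```

So the function form is mostly a spelling choice; the object still decides what its length is.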
Inconsistency is another big one for me. I do not like inconsistent naming in the same library: length(), size(), getLength(), getlength(), toUTFindex() (why not toUtfIndex?), Constant, CONSTANT, etc.
The long names in .NET bother me sometimes. Can't they shorten DataGridViewCellContextMenuStripNeededEventArgs somehow? What about ListViewVirtualItemsSelectionRangeChangedEventArgs?
And I hate deep directory trees. If a library/project has a 5 level deep directory tree, I'm going to have trouble with it.

C and C++'s syntax is a bit quirky. They reuse operators for different things. You're probably so used to it that you don't think about it (nor do I), but consider how many meanings parentheses have:
int main() // function declaration / definition
printf("hello") // function call
(int)x // type cast
2*(7+8) // override precedence
int (*)(int) // function pointer
int x(3) // initializer
if (condition) // special part of syntax of if, while, for, switch
And if in C++ you saw
foo<bar>(baz(),baaz)
you couldn't know the meaning without the definition of foo and bar.
The < and > might be a template instantiation, or might be less-than and greater-than (unusual but legal).
The () might be a function call, or might just be surrounding the comma operator (i.e. perform baz() for side effects, then return baaz).
The silly thing is that other languages have copied some of these characteristics!

Java, and its checked exceptions. I left Java for a while, dwelling in the .NET world, then recently came back.
It feels like, sometimes, my throws clause is more voluminous than my method content.

There's nothing in the world I hate more than php.
Variables with $, that's one extra odd character for every variable.
Members are accessed with -> for no apparent reason, one extra character for every member access.
A freakshow of language really.
No namespaces.
Strings are concatenated with a dot (.).
A freakshow of language.

All the []s and #s in Objective C. Their use is so different from the underlying C's native syntax that the first time I saw them it gave the impression that all the object-orientation had been clumsily bolted on as an afterthought.

I abhor the boiler plate verbosity of Java.
writing getters and setters for properties
checked exception handling and all the verbiage that implies
long lists of imports
Those, in connection with the Java convention of using veryLongVariableNames, sometimes have me thinking I'm back in the 80's, writing IDENTIFICATION DIVISION. at the top of my programs.
Hint: If you can automate the generation of part of your code in your IDE, that's a good hint that you're producing boilerplate code. With automated tools, it's not a problem to write, but it's a hindrance every time someone has to read that code - which is more often.
While I think it goes a bit overboard on type bureaucracy, Scala has successfully addressed some of these concerns.

Coding Style inconsistencies in team projects.
I'm working on a large team project where some contributors have used 4 spaces instead of the tab character.
Working with their code can be very annoying - I like to keep my code clean and with a consistent style.
It's bad enough when you use different standards for different languages, but in a web project with HTML, CSS, Javascript, PHP and MySQL, that's 5 languages, 5 different styles, and multiplied by the number of people working on the project.
I'd love to reformat my co-workers' code when I need to fix something, but then the repository would think I changed every line of their code.

It irritates me sometimes how people expect there to be one language for all jobs. Depending on the task you are doing, each language has its advantages and disadvantages. I like the C-based syntax languages because it's what I'm most used to and I like the flexibility they tend to bestow on the developer. Of course, with great power comes great responsibility, and having the power to write 150 line LINQ statements doesn't mean you should.
I love the inline XML in the latest version of VB.NET although I don't like working with VB mainly because I find the IDE less helpful than the IDE for C#.

If Microsoft had to invent yet another C++-like language in C# why didn't they correct Java's mistake and implement support for RAII?

Case sensitivity.
What kinda hangover do you need to think that differentiating two identifiers solely by caSE is a great idea?

I hate semi-colons. I find they add a lot of noise and you rarely need to put two statements on a line. I prefer the style of Python and other languages... end of line is end of a statement.

Any language that can't fully decide if Arrays/Loop/string character indexes are zero based or one based.
I personally prefer zero based, but any language that mixes the two, or lets you "configure" which is used can drive you bonkers. (Apache Velocity - I'm looking in your direction!)
snip from the VTL reference (default is 1, but you can set it to 0):
# Default starting value of the loop
# counter variable reference.
directive.foreach.counter.initial.value = 1
(try merging 2 projects that used different counter schemes - ugh!)

In no particular order...
OCaml
Tuples definitions use * to separate items rather than ,. So, ("Juliet", 23, true) has the type (string * int * bool).
For being such an awesome language, the documentation has this haunting comment on threads: "The threads library is implemented by time-sharing on a single processor. It will not take advantage of multi-processor machines. Using this library will therefore never make programs run faster." JoCaml doesn't fix this problem.
^^^ I've heard the Jane Street guys were working to add concurrent GC and multi-core threads to OCaml, but I don't know how successful they've been. I can't imagine a language without multi-core threads and GC surviving very long.
No easy way to explore modules in the toplevel. Sure, you can write module q = List;; and the toplevel will happily print out the module definition, but that just seems hacky.
C#
Lousy type inference. Beyond the most trivial expressions, I have to give types to generic functions.
All the LINQ code I ever read uses method syntax, x.Where(item => ...).OrderBy(item => ...). No one ever uses expression syntax, from item in x where ... orderby ... select. Between you and me, I think expression syntax is silly, if for no other reason than that it looks "foreign" against the backdrop of all other C# and VB.NET code.
LINQ
Every other language uses the industry-standard names Map, Fold/Reduce/Inject, and Filter. LINQ has to be different and uses Select, Aggregate, and Where.
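For comparison, here is a small sketch of the same three operations in Python, where they go by filter, map, and functools.reduce. The data is invented for the example, and the LINQ names in the comments are the ones listed above.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# Where ~ filter: keep the even numbers
evens = list(filter(lambda n: n % 2 == 0, numbers))

# Select ~ map: square each remaining number
squares = list(map(lambda n: n * n, evens))

# Aggregate ~ reduce/fold: sum the squares, starting from 0
total = reduce(lambda acc, n: acc + n, squares, 0)
```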
Functional Programming
Monads are mystifying. Having seen the Parser monad, Maybe monad, State, and List monads, I can understand perfectly how the code works; however, as a general design pattern, I can't seem to look at problems and say "hey, I bet a monad would fit perfect here".
Ruby
GRRRRAAAAAAAH!!!!! I mean... seriously.
VB
Module Hangups
Dim _juliet as String = "Too Wordy!"
Public Property Juliet() as String
Get
Return _juliet
End Get
Set (ByVal value as String)
_juliet = value
End Set
End Property
End Module
And setter declarations are the bane of my existence. Alright, so I change the data type of my property -- now I need to change the data type in my setter too? Why doesn't VB borrow from C# and simply incorporate an implicit variable called value?
.NET Framework
I personally like Java casing convention: classes are PascalCase, methods and properties are camelCase.

In C/C++, it annoys me how there are different ways of writing the same code.
e.g.
if (condition)
{
    callSomeConditionalMethod();
}
callSomeOtherMethod();
vs.
if (condition)
    callSomeConditionalMethod();
callSomeOtherMethod();
equate to the same thing, but different people have different styles. I wish the original standard was more strict about making a decision about this, so we wouldn't have this ambiguity. It leads to arguments and disagreements in code reviews!

I found Perl's use of "defined" and "undefined" values to be so useful that I have trouble using scripting languages without it.
Perl:
($lastname, $firstname, $rest) = split(' ', $fullname);
This statement performs well no matter how many words are in $fullname. Try it in Python, and it explodes if $fullname doesn't contain exactly three words.
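To make the difference concrete, here is a minimal Python sketch. The helper split_padded is a hypothetical name, not a standard function; it only roughly approximates the Perl behaviour by padding missing fields with None and dropping extras.

```python
def split_padded(text, count, fill=None):
    """Split on whitespace and return exactly `count` fields."""
    parts = text.split()
    # Pad short results with `fill`, then drop any extra fields.
    return (parts + [fill] * count)[:count]


full_name = "Smith John"

try:
    lastname, firstname, rest = full_name.split()
    unpack_failed = False
except ValueError:
    # Two values cannot be unpacked into three names.
    unpack_failed = True

# The padded version always yields three values.
lastname, firstname, rest = split_padded(full_name, 3)
```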

SQL, they say you should not use cursors and when you do, you really understand why...
It's such heavy going!
DECLARE mycurse CURSOR LOCAL FAST_FORWARD READ_ONLY
FOR
    SELECT field1, field2, fieldN FROM atable
OPEN mycurse
FETCH NEXT FROM mycurse INTO @Var1, @Var2, @VarN
WHILE @@FETCH_STATUS = 0
BEGIN
    -- do something really clever...
    FETCH NEXT FROM mycurse INTO @Var1, @Var2, @VarN
END
CLOSE mycurse
DEALLOCATE mycurse

Although I program primarily in Python, it irks me endlessly that lambda bodies must be expressions.
I'm still wrapping my brain around JavaScript, and as a whole it's mostly acceptable. But why is it so hard to create a namespace? In Tcl they're just ugly, but in JavaScript it's actually a rigmarole AND completely unreadable.
In SQL, how come everything is just one huge freakin' SELECT statement?
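The Python lambda restriction mentioned above can be checked directly. This is a small sketch in which is_valid_source and bump are made-up names; it asks the compiler whether a snippet is legal, then shows the usual workaround of a named function.

```python
def is_valid_source(source):
    """Return True if `source` compiles as Python code."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# An expression body is fine.
expression_ok = is_valid_source("double = lambda x: x * 2")

# A statement (here, return) inside the body is rejected at compile time.
statement_ok = is_valid_source("bump = lambda x: return x + 1")


# The usual workaround is a named function, which may contain statements.
def bump(x):
    y = x + 1
    return y
```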

In Ruby, I very strongly dislike how methods do not require self. to be called on current instance, but properties do (otherwise they will clash with locals); i.e.:
def foo()
  123
end
def foo=(x)
end
def bar()
  x = foo() # okay, same as self.foo()
  x = foo # not okay, reads unassigned local variable foo
  foo = 123 # not okay, assigns local variable foo
end
To my mind, it's very inconsistent. I'd prefer either to always require self. in all cases, or to have a sigil for locals.

Java's packages. I find them complex, more so because I am not a corporation.
I vastly prefer namespaces. I'll get over it, of course - I'm playing with the Android SDK, and Eclipse removes a lot of the pain. I've never had a machine that could run it interactively before, and now that I do, I'm very impressed.

Prolog's if-then-else syntax.
x -> y ; z
The problem is that ";" is the "or" operator, so the above looks like "x implies y or z".

Java
Generics (Java's version of templates) are limited. I cannot call methods of the class and I cannot create instances of the class. Generics are used by containers, but I can just as well use containers of Object instances.
No multiple inheritance. If a use of multiple inheritance does not lead to the diamond problem, it should be allowed. It should also be possible to write a default implementation of interface methods. An example of the problem: the MouseListener interface has 5 methods, one for each event. If I want to handle just one of them, I have to implement the other 4 as empty methods.
It does not let me choose to manage the memory of some objects manually.
The Java API uses complex combinations of classes to do simple tasks. For example, if I want to read from a file, I have to use many classes (FileReader, FileInputStream).
Python
Indentation is part of the syntax. I would prefer to use the word "end" to mark the end of a block; then the word "pass" would not be needed.
In classes, the word "self" should not be needed as an argument of functions.
C++
Headers are the worst problem. I have to list the functions in a header file and implement them in a cpp file. A header cannot hide the dependencies of a class: if class A uses class B privately as a field and I include the header of A, the header of B will be included too.
Strings and arrays came from C; they do not provide a length field. It is difficult to control whether std::string and std::vector use the stack or the heap. I have to use pointers with std::string and std::vector if I want to assign them, pass them as arguments to a function, or return them, because the "=" operator copies the entire structure.
I cannot control the constructor and destructor. It is difficult to create an array of objects without a default constructor, or to choose which constructor to use with if and switch statements.

In most languages, file access. VB.NET is the only language so far where file access makes any sense to me. I do not understand why if I want to check if a file exists, I should use File.exists("") or something similar instead of creating a file object (actually FileInfo in VB.NET) and asking if it exists. And then if I want to open it, I ask it to open: (assuming a FileInfo object called fi) fi.OpenRead, for example. Returns a stream. Nice. Exactly what I wanted. If I want to move a file, fi.MoveTo. I can also do fi.CopyTo. What is this nonsense about not making files full-fledged objects in most languages? Also, if I want to iterate through the files in a directory, I can just create the directory object and call .GetFiles. Or I can do .GetDirectories, and I get a whole new set of DirectoryInfo objects to play with.
Admittedly, Java has some of this file stuff, but this nonsense of having to have a whole object to tell it how to list files is just silly.
Also, I hate ::, ->, => and all other multi-character operators except for <= and >= (and maybe -- and ++).

[Disclaimer: I only have a passing familiarity with VB, so take my comments with a grain of salt]
I Hate How Every Keyword In VB Is Capitalized Like This. I saw a blog post the other week (month?) about someone who tried writing VB code without any capital letters (they did something to a compiler that would let them compile VB code like that), and the language looked much nicer!

My big hangup is MATLAB's syntax. I use it, and there are things I like about it, but it has so many annoying quirks. Let's see.
Matrices are indexed with parentheses. So if you see something like Image(350,260), you have no clue from that whether we're getting an element from the Image matrix, or if we're calling some function called Image and passing arguments to it.
Scope is insane. I seem to recall that for loop index variables stay in scope after the loop ends.
If you forget to stick a semicolon after an assignment, the value will be dumped to standard output.
You may have one function per file. This proves to be very annoying for organizing one's work.
I'm sure I could come up with more if I thought about it.

Related

How to Work with Ruby Duck Typing

I am learning Ruby and I'm having a major conceptual problem concerning typing. Allow me to detail why I don't understand this paradigm.
Say I am method chaining for concise code, as you do in Ruby. I have to know precisely what the return type of each method call in the chain is, otherwise I can't know what methods are available on the next link. Do I have to check the method documentation every time? I'm running into this constantly while working through tutorial exercises. It seems I'm stuck with a process of reference, infer, run, fail, fix, repeat to get code running, rather than knowing precisely what I'm working with while coding. This flies in the face of Ruby's promise of intuitiveness.
Say I am using a third-party library. Once again I need to know what types I'm allowed to pass as parameters, otherwise I get a failure. I can look at the code, but there may or may not be any comments or declaration of what type the method is expecting. I understand that you code based on which methods are available on an object, not on its type. But then I have to be sure whatever I pass as a parameter has all the methods the library expects, so I still have to do type checking. Do I have to hope and pray everything is documented properly on an interface so I know if I'm expected to give a string, a hash, a class, etc.?
If I look at the source of a method I can get a list of methods being called and infer the type expected, but I have to perform analysis.
Ruby and duck typing: design by contract impossible?
The discussions in the preceding Stack Overflow question don't really answer anything other than "there are processes you have to follow", and those processes don't seem to be standard: everyone has a different opinion on what process to follow, and the language has zero enforcement. Method Validation? Test-Driven Design? Documented API? Strict Method Naming Conventions? What's the standard and who dictates it? What do I follow? Would these guidelines solve this concern https://stackoverflow.com/questions/616037/ruby-coding-style-guidelines? Are there editors that help?
Conceptually I don't get the advantage either. You need to know what methods are needed for any method called, so regardless you are typing when you code anything. You just aren't informing the language or anyone else explicitly, unless you decide to document it. Then you are stuck doing all type checking at runtime instead of during coding. I've done PHP and Python programming and I don't understand it there either.
What am I missing or not understanding? Please help me understand this paradigm.
This is not a Ruby specific problem, it's the same for all dynamically typed languages.
Usually there are no guidelines for how to document this either (and most of the time not really possible). See for instance map in the ruby documentation
map { |item| block } → new_ary
map → Enumerator
What is item, block and new_ary here and how are they related? There's no way to tell unless you know the implementation or can infer it from the name of the function somehow. Specifying the type is also hard since new_ary depends on what block returns, which in turn depends on the type of item, which could be different for each element in the Array.
A lot of times you also stumble across documentation that says that an argument is of type Object, which again tells you nothing since everything is an Object.
OCaml has a solution for this, it supports structural typing so a function that needs an object with a property foo that's a String will be inferred to be { foo : String } instead of a concrete type. But OCaml is still statically typed.
Worth noting is that this can be a problem in statically typed languages too. Scala has very generic methods on collections which leads to type signatures like ++[B >: A, That](that: GenTraversableOnce[B])(implicit bf: CanBuildFrom[Array[T], B, That]): That for appending two collections.
So most of the time, you will just have to learn this by heart in dynamically typed languages, and perhaps help improve the documentation of libraries you are using.
And this is why I prefer static typing ;)
Edit One thing that might make sense is to do what Scala also does. It doesn't actually show you that type signature for ++ by default, instead it shows ++[B](that: GenTraversableOnce[B]): Array[B] which is not as generic, but probably covers most of the use cases. So for Ruby's map it could have a monomorphic type signature like Array<a> -> (a -> b) -> Array<b>. It's only correct for the cases where the list only contains values of one type and the block only returns elements of one other type, but it's much easier to understand and gives a good overview of what the function does.
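As an illustration of that simplified signature, here is a sketch using Python's optional type hints. The function name map_list and the type variables A and B are assumptions for the example; this is not how Ruby's map is documented or implemented.

```python
from typing import Callable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def map_list(items: List[A], block: Callable[[A], B]) -> List[B]:
    """Apply `block` to every element and collect the results."""
    return [block(item) for item in items]


lengths = map_list(["duck", "goose"], len)       # List[str] -> List[int]
labels = map_list([1, 2, 3], lambda n: f"#{n}")  # List[int] -> List[str]
```

The hints say nothing about mixed-type lists, but they give a quick overview of how the input, the block, and the result relate.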
Yes, you seem to misunderstand the concept. It's not a replacement for static type checking. It's just different. For example, if you convert objects to json (for rendering them to client), you don't care about actual type of the object, as long as it has #to_json method. In Java, you'd have to create IJsonable interface. In ruby no overhead is needed.
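The to_json point carries over to other dynamically typed languages. Below is a minimal Python sketch; the classes User and Point and the function render are invented for illustration. Neither class declares an interface or shares a base class, and render only relies on the method being there.

```python
import json


class User:
    """Hypothetical model class."""

    def __init__(self, name):
        self.name = name

    def to_json(self):
        return json.dumps({"name": self.name})


class Point:
    """Unrelated hypothetical class that happens to offer the same method."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def to_json(self):
        return json.dumps({"x": self.x, "y": self.y})


def render(obj):
    # No declared interface: any object with a to_json() method works;
    # anything else raises AttributeError at this call.
    return obj.to_json()


rendered = [render(User("juliet")), render(Point(1, 2))]
```

The flip side is the one raised in the question: passing an object without to_json is only detected when render actually runs.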
As for knowing what to pass where and what returns what: memorize this or consult docs each time. We all do that.
Just the other day, I saw a Rails programmer with 6+ years of experience complain on Twitter that he can't memorize the order of parameters to alias_method: does the new name go first or last?
This flies in the face of Ruby's promise of intuitiveness.
Not really. Maybe it's just a badly written library. In core Ruby everything is quite intuitive, I dare say.
Statically typed languages with their powerful IDEs have a small advantage here, because they can show you documentation right here, very quickly. This is still accessing documentation, though. Only quicker.
Consider that the design choices of strongly typed languages (C++, Java, C#, et al) enforce strict declarations of the types passed to methods and the types returned by methods. This is because these languages were designed to validate that arguments are correct (and since these languages are compiled, this work can be done at compile time). But some questions can only be answered at run time, and C++ for example has RTTI (Run-Time Type Information) to examine and enforce type guarantees. But as the developer, you are guided by syntax, semantics and the compiler to produce code that follows these type constraints.
Ruby gives you flexibility to take dynamic argument types, and return dynamic types. This freedom enables you to write more generic code (read Stepanov on the STL and generic programming), and gives you a rich set of introspection methods (is_a?, instance_of?, respond_to?, kind_of?, is_array?, et al) which you can use dynamically. Ruby enables you to write generic methods, but you can also explicitly enforce design by contract, and handle contract failures by whatever means you choose.
Yes, you will need to use care when chaining methods together, but learning Ruby is not just a few new keywords. Ruby supports multiple paradigms; you can write procedural, object-oriented, generic, and functional programs. The cycle you are in right now will improve quickly as you learn about Ruby.
Perhaps your concern stems from a bias towards strongly typed languages (C++, Java, C#, et al). Duck typing is a different approach. You think differently. Duck typing means that if an object looks like a duck and behaves like a duck, then it is a duck. Everything (almost) is an Object in Ruby, so everything is polymorphic.
Consider templates (C++ has them, C# has them, Java is getting them, C has macros). You build an algorithm, and then have the compiler generate instances for your chosen types. You aren't doing design by contract with generics, but when you recognize their power, you write less code, and produce more.
Some of your other concerns,
third party libraries (gems) are not as hard to use as you fear
Documented API? See Rdoc and http://www.ruby-doc.org/
Rdoc documentation is (usually) provided for libraries
coding guidelines - look at the source for a couple of simple gems for starters
naming conventions - snake case and camel case are both popular
Suggestion - approach an online tutorial with an open mind, do the tutorial (http://rubymonk.com/learning/books/ is good), and you will have more focused questions.

Why isn't DRY considered a good thing for type declarations?

It seems like people who would never dare cut and paste code have no problem specifying the type of something over and over and over. Why isn't it emphasized as a good practice that type information should be declared once and only once so as to cause as little ripple effect as possible throughout the source code if the type of something is modified? For example, using pseudocode that borrows from C# and D:
MyClass<MyGenericArg> foo = new MyClass<MyGenericArg>(ctorArg);
void fun(MyClass<MyGenericArg> arg) {
    gun(arg);
}
void gun(MyClass<MyGenericArg> arg) {
    // do stuff.
}
Vs.
var foo = new MyClass<MyGenericArg>(ctorArg);
void fun(T)(T arg) {
    gun(arg);
}
void gun(T)(T arg) {
    // do stuff.
}
It seems like the second one is a lot less brittle if you change the name of MyClass, or change the type of MyGenericArg, or otherwise decide to change the type of foo.
I don't think you're going to find a lot of disagreement with your argument that the latter example is "better" for the programmer. A lot of language design features are there because they're better for the compiler implementer!
See Scala for one reification of your idea.
Other languages (such as the ML family) take type inference much further, and create a whole style of programming where the type is enormously important, much more so than in the C-like languages. (See The Little MLer for a gentle introduction.)
It isn't considered a bad thing at all. In fact, C# maintainers are already moving a bit towards reducing the tiring boilerplate with the var keyword, where
MyContainer<MyType> cont = new MyContainer<MyType>();
is exactly equivalent to
var cont = new MyContainer<MyType>();
You will see many people argue against var usage, though, which kind of shows that many people are not familiar with strongly typed languages with type inference; type inference is mistaken for dynamic/soft typing.
Repetition may lead to more readable code, and sometimes may be required in the general case. I've always seen the focus of DRY being more about duplicating logic than repeating literal text. Technically, you can eliminate 'var' and 'void' from your bottom code as well. Not to mention you indicate scope with indentation, why repeat yourself with braces?
Repetition can also have practical benefits: parsing by a program is easier by keeping the 'void', for example.
(However, I still strongly agree with you on preferring "var name = new Type()" over "Type name = new Type()".)
It's a bad thing. This very topic was mentioned in Google's Go language Techtalk.
Albert Einstein said, "Everything should be made as simple as possible, but not one bit simpler."
Your complaint makes no sense in the case of a dynamically typed language, so you must intend this to refer to statically typed languages. In that case, your replacement example implicitly uses Generics (aka Template Classes), which means that any time fun or gun is used, a new definition is generated based upon the type of the argument. That could result in dozens of extra methods, regardless of the intent of the programmer. In particular, you're throwing away the benefit of compiler-checked type-safety for a runtime error.
If your goal was to simply pass through the argument without checking its type, then the correct type would be Object not T.
Type declarations are intended to make the programmer's life simpler, by catching errors at compile-time, instead of failing at runtime. If you have an overly complex type definition, then you probably don't understand your data. In your example, I would have suggested adding fun and gun to MyClass, instead of defining them separately. If fun and gun don't apply to all possible template types, then they should be defined in an explicit subclass, not as separate functions that take a templated class argument.
Generics exist as a way to wrap behavior around more specific objects. List, Queue, Stack, these are fine reasons for Generics, but at the end of the day, the only thing you should be doing with a bare Generic is creating an instance of it, and calling methods on it. If you really feel the need to do more than that with a Generic, then you probably need to embed your Generic class as an instance object in a wrapper class, one that defines the behaviors you need. You do this for the same reason that you embed primitives into a class: because by themselves, numbers and strings do not convey semantic information about their contents.
Example:
What semantic information does a List of integer triples convey? Just that you're working with multiple triples of integers. On the other hand, List<Color>, where a Color has 3 integers (red, blue, green) with bounded values (0-255), conveys the intent that you're working with multiple Colors, but provides no hint as to whether the List is ordered, allows duplicates, or any other information about the Colors. Finally, a Palette can add those semantics for you: a Palette has a name and contains multiple Colors, but no duplicates, and order isn't important.
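A rough sketch of that progression in Python follows. The class and field names simply mirror the prose above and are otherwise assumptions; the range check and the use of a frozenset are one possible way to encode "bounded values" and "no duplicates, order isn't important".

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Color:
    red: int
    blue: int
    green: int

    def __post_init__(self):
        # Enforce the bounded channel values (0-255).
        for channel in (self.red, self.blue, self.green):
            if not 0 <= channel <= 255:
                raise ValueError(f"channel out of range: {channel}")


@dataclass(frozen=True)
class Palette:
    name: str
    colors: FrozenSet[Color]  # a set: no duplicates, no ordering


sunset = Palette(
    "sunset",
    frozenset([Color(255, 0, 94), Color(255, 0, 94), Color(40, 80, 20)]),
)
```

The duplicate Color collapses inside the set, so the type itself carries the rule instead of a comment next to a bare list.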
This has gotten a bit far afield from the original question, but what it means to me is that DRY (Don't Repeat Yourself) means specifying information once, but that specification should be as precise as is necessary.

What is the ugliest API for a relatively well-known library that you have seen, and why and how could it be improved? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I have been looking at the changes in Lucene 2.9, in particular the redone TokenStream API, and it occurs to me that it's particularly ugly compared to the old approach: just return a new Token, or repopulate the given Token with values if you're reusing it.
I have not done any profiling but it seems using a MAP to store attributes is not that efficient and it would be easier to just create a new value type holding values etc. The TokenStream and Attribute stuff looks like object pooling which is pretty much never necessary these days for simple value types like a Token of text.
creat()
When Ken Thompson and Dennis Ritchie received the 1983 Turing Award, after their respective acceptance speeches, someone in the audience asked Ken what he would do differently with Unix if he were to do it all over again. He said, "I'd spell 'creat' with an 'e'."
Livelink (OpenText) API
Everything comes back as some bizarre form of a jagged array
The documentation provides absolutely no examples
[your favorite search engine] typically returns no results for a given API method
The support forums feel near abandoned
The only reliable way of understanding the resultant data is to run the data in the Livelink debugger
And finally... the system costs tens (hundreds) of thousands of dollars
The wall next to my desk has an imprint of my head...
A very simple example of getting a value out of an API method:
var workflow = new LAPI_Workflow(CurrentSession);
// every Livelink method uses an out variable
LLValue outValue;
// every method returns an integer that says if the call was
// a success or not, where 0 = success and any other integer
// is a failure... oh yeah, there is no reference to what any
// of the failure values mean, you have to create your own
// error dictionary.
int result = workflow.ListWorkTasks(workId, subWorkId, taskId, outValue);
if (result == 0)
{
    // and now let's traverse through at least 3 different arrays!
    string taskName = outValue.toValue(0).toValue("TASKS").toValue(0).toString("TaskName");
}
Aaack!!! :D
I've never been a fan of the java.sql package...
You have to catch the checked exception for everything, and there's only one exception, so it doesn't really give any indication of what went wrong without examining the SQL code String.
Add to that the fact that you have to use java.sql.Date instead of java.util.Date, so you always have to specify the full package for one or the other. Not to mention the conversion that has to take place between the two.
And then there's the parameter index, which is 1-based, unlike the rest of Java, which is 0-based.
All in all, a pretty annoying library. Thankfully, the Spring library does make it quite a bit easier to work with.
COM. Its biggest improvements ended up being .NET.
Certain java.io.File methods, critical to systems programming, return a boolean to indicate success or failure. If such a method (like, say, mkdir or delete) fails, you have no way at all to find out why.
This always leaves my jaw a-hangin' open.
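For comparison, Java 7 and later added java.nio.file.Files, whose methods throw an IOException instead of returning a boolean. The sketch below contrasts the two styles; the class and method names (MkdirDemo, oldStyle, nioStyle) are made up for illustration, and the exact exception type and message can vary by platform.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

final class MkdirDemo {
    // java.io.File style: a bare boolean, no reason for a failure.
    static boolean oldStyle(File dir) {
        return dir.mkdir();
    }

    // java.nio.file style: a failure arrives as an exception that says what went wrong.
    static String nioStyle(Path dir) {
        try {
            Files.createDirectory(dir);
            return "created";
        } catch (FileAlreadyExistsException e) {
            return "already exists: " + e.getFile();
        } catch (IOException e) {
            return "failed: " + e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("mkdir-demo");
        Path target = base.resolve("child");
        System.out.println(oldStyle(target.toFile())); // first call creates the directory
        System.out.println(oldStyle(target.toFile())); // second call fails, reason unknown
        System.out.println(nioStyle(target));          // failure now comes with an explanation
    }
}
```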
Java's date/time API is pretty horrible to work with. java.util.Date has several constructors to create an instance for a specific date, but all of them are deprecated. java.util.GregorianCalendar should be used instead, but that has an extremely annoying way of setting fields (think calendar.set(Calendar.MONTH, 7) instead of calendar.setMonth(7), which would be far better). The finishing touch is that most other classes and libraries still expect a Date instead of a Calendar, so you have to constantly convert back and forth.
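A short sketch of the complaint, with the java.time API (added in Java 8, after this answer was written) alongside for contrast. The class and method names here are hypothetical. In the Calendar API the generic setter is `set(field, value)` and months are zero-based, so 7 means August.

```java
import java.time.LocalDate;
import java.time.Month;
import java.time.ZoneId;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

final class DateDemo {
    // Legacy style: one generic setter for every field, zero-based months.
    static Date legacyAugustFirst(int year) {
        Calendar calendar = new GregorianCalendar();
        calendar.clear();                               // drop the current time of day
        calendar.set(Calendar.YEAR, year);
        calendar.set(Calendar.MONTH, Calendar.AUGUST);  // Calendar.AUGUST == 7
        calendar.set(Calendar.DAY_OF_MONTH, 1);
        return calendar.getTime();                      // convert for APIs that want a Date
    }

    // java.time style: named factory method, months are 1-based or an enum.
    static LocalDate modernAugustFirst(int year) {
        return LocalDate.of(year, Month.AUGUST, 1);
    }

    // The conversion back to Date that older libraries still require.
    static Date toDate(LocalDate date) {
        return Date.from(date.atStartOfDay(ZoneId.systemDefault()).toInstant());
    }

    public static void main(String[] args) {
        System.out.println(legacyAugustFirst(2009));
        System.out.println(modernAugustFirst(2009));
        System.out.println(toDate(modernAugustFirst(2009)));
    }
}
```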
Not a winner, but it deserves an honourable mention: Android. It uses the Java 5 programming language, but barely any of the Java 5 language features. Instead of enums you get integer constants with a prefix or suffix.
It cannot quite decide whether it should be object-oriented or procedural, and showing dialogs is a prime example. You get several callbacks with self-defined integer ids to display and call upon the dialog, which smells of an old C API. And then you get an inner builder class with chained methods, which smells of over-architected OOP of the worst kind.
The MotionEvent class returns X and Y coordinates as absolute and relative values from the same accessor method, but offers no way to check what kind of coordinates it currently holds.
Android sure is a mixed bag.
I'm going to turn this question on its head and name a beautiful API for a library whose standard API is mostly ugly: the Haskell bindings for OpenGL.
These are the reasons:
Instead of lumping everything into a small number of headers, the library is organized logically into discrete modules, whose contents parallel the structure of the OpenGL specification. This makes browsing the documentation a pleasant experience.
Pairs of "begin/end" functions are replaced by higher-order procedures. For example, instead of
pushMatrix();
doSomeStuff();
doSomeMoreStuff();
popMatrix();
you'd say
preservingMatrix $ do
    doSomeStuff
    doSomeMoreStuff
The syntax of the bindings enforces the conventions of the library, instead of making you do it by hand. This works for the drawing primitives of quads, triangles, lines, etc. as well. All of this is exception-safe, of course.
Getters and setters are replaced by idiomatic "StateVars", making reading and writing a more symmetric operation.
Multiple versions of functions are replaced by polymorphism and extra datatypes. Instead of calling, say, glVertex2f with two float values, you call vertex with a value of type Vertex2 GLfloat.
References:
API Reference
The HaskellWiki page on OpenGL
Beautiful Code, Compelling Evidence (pdf)
Praise from Scott Dillard, quoted in Beautiful Code, Compelling Evidence
Direct3D!
No doubt the old pre-Direct3D 5 interface was pretty darn fugly:
// GL code
glBegin (GL_TRIANGLES);
glVertex (0,0,0);
glVertex (1,1,0);
glVertex (2,0,0);
glEnd ();
// D3D code, tonnes of crap removed
v = &buffer.vertexes[0];
v->x = 0; v->y = 0; v->z = 0;
v++;
v->x = 1; v->y = 1; v->z = 0;
v++;
v->x = 2; v->y = 0; v->z = 0;
c = &buffer.commands;
c->operation = DRAW_TRIANGLE;
c->vertexes[0] = 0;
c->vertexes[1] = 1;
c->vertexes[2] = 2;
IssueExecuteBuffer (buffer);
It's not too bad nowadays - it only took Microsoft 10 versions to get it right...
I would say MFC, ATL and WTL. All 3 of these libraries use excessive Hungarian notation, redefine data types for no apparent reason (CString redefined over and over) and are notoriously changed with each version of Visual Studio.
I like COM. It provided a component-oriented architecture long before .NET was even developed. However, the expansion of COM into DCOM, its many wrappers like ATL and its general lack of comprehensive documentation make it the ugliest API I have to deal with at work.
Most certainly not the ugliest; there are probably so many. But Flex has a special place in hell, specifically UIComponent, which, compared to Sprite, feels like using a chainsaw to peel an apple. I believe Flex would have been much improved by using more lightweight objects and mixin-style features similar to how Dojo works on the JavaScript side.
The ECMAScript/ActionScript Date class is all but backwards and useless. It's been a constant pain any time I've needed to do something more complex than add timestamps to logs. It needs more parsing options (e.g., the ability to specify the input format) and better time management, like intelligent increments, convenience functions, etc...
C++ STL libraries (and templates in general), while obviously useful, have always felt plain ugly. No suggestions for improvements though. They work.
Oracle's Pro*C, Pro*Ada, Pro*this-that-the-other things. They were a preprocessor front end for C, Ada, and Fortran, I think, maybe some others, that let you jam SQL into your source code.
They did also have a library which worked much better, and was much more flexible.
(That was more than 10 years ago, I have no idea what they do now, though I wouldn't be surprised if it was still the same, just so as not to break people's code.)
well, it was a well-known library about 20 years ago, but i think the original btrieve data engine has the worst api ever written. almost everything goes through a single call, with each of its many parameters containing a different value depending on which call you're really doing (one parameter was a flag telling the system if you wanted to open a file, close a file, search, insert, etc). i liked btrieve way back then, but i spent a long time making a good abstraction layer.
it could have been easily improved by not forcing everything into one call. not only was the one call hideous, but the programmer was responsible for allocating, passing in, and freeing the position block ... some memory used by btrieve to track the open file handle, position, etc. another improvement would be to allow ascii text to be used when defining the indexing. indices had to be specified by a convoluted binary representation.
best regards,
don
A lot of the CRT library functions are poorly or vaguely named, possibly due to legacy coding restrictions back in the day, and thus require frequent use of the F1 key to find the right function and supply the right arguments.
I've been using CRT functions for a while and I still find myself hitting F1 a fair amount.

What is the most obfuscated code you've had to fix?

Most programmers will have had the experience of debugging/fixing someone else's code. Sometimes that "someone else's code" is so obfuscated it's bad enough trying to understand what it's doing.
What's the worst (most obfuscated) code you've had to debug/fix?
If you didn't throw it away and recode it from scratch, well why didn't you?
PHP osCommerce. Enough said: it is obfuscated code...
a Java class
only static methods that manipulate the DOM
8000 LOCs
long chain of methods that return null on "error": a.b().c().d().e()
very long methods (400/500 LOC each)
nested if, while, like:
if (...) {
    for (...) {
        if (...) {
            if (...) {
                while (...) {
                    if (...) {
cut-and-paste oriented programming
no exceptions: all exceptions are caught and "handled" using printStackTrace()
no unit tests
no documentation
I was tempted to throw away and recode... but, after 3 days of hard debugging,
I've added the magic if :-)
Spaghetti code PHP CMS system.
By default, programmers think someone else's code is obfuscated.
The worst I probably had to do was work out what the variables i1, i2, j, k and t were in a simple method, and they were not counters in 'for' loops.
In all other circumstances I guess the problem area was difficult which made the code look difficult.
I found this line in our codebase today and thought it was a nice example of sneaky obfuscation:
if (MULTICLICK_ENABLED.equals(propService.getProperty(PropertyNames.MULTICLICK_ENABLED))) {} else {
    return false;
}
Just making sure I read the whole line. NO SKIMREADING.
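To make the trick explicit, here is a small sketch of the same shape next to a guard-clause rewrite. Everything in it is an assumption for illustration: the constant's value, the method names and the trailing `return true` stand in for code the snippet above does not show.

```java
final class MulticlickCheck {
    // Hypothetical stand-in for the constant used in the original line.
    static final String MULTICLICK_ENABLED = "true";

    // Shape of the original: an empty "then" branch, with the logic hidden in the else.
    static boolean original(String propertyValue) {
        if (MULTICLICK_ENABLED.equals(propertyValue)) {} else {
            return false;
        }
        return true; // stand-in for whatever followed in the real method
    }

    // Same behaviour, with the condition negated into an ordinary guard clause.
    static boolean rewritten(String propertyValue) {
        if (!MULTICLICK_ENABLED.equals(propertyValue)) {
            return false;
        }
        return true; // same stand-in
    }

    public static void main(String[] args) {
        for (String value : new String[] {"true", "false", null}) {
            System.out.println(value + ": " + original(value) + " / " + rewritten(value));
        }
    }
}
```

Both versions return the same result for every input, including null; only the second can be read at a glance.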
When working on a GWT project, I would reach parts of GWT-compiled obfuscated JS code which wasn't mine.
Now good luck debugging real obfuscated code.
I can't remember the full code, but a single part of it remains burned into my memory as something I spent hours trying to understand:
do{
    $tmp = shift unless shift;
    $tmp;
}while($tmp);
I couldn't understand it at first; it looked so useless. Then I printed out @_ and got a list of arguments: a series of alternating booleans and function names. The code was used in conjunction with a library detection module that changed behaviour if a function was broken. But the code was so badly documented, and so full of things like this that made no sense without a complete understanding of the full code, that I gave up and rewrote the whole thing.
UPDATE from DVK:
And, lest someone claims this was because Perl is unreadable as opposed to coder being a golf master instead of good software developer, here's the same code in a slightly less obfuscated form (the really correct code wouldn't even HAVE alternating sub names and booleans in the first place :)
# This subroutine takes a list of alternating true/false flags
# and subroutine names, and executes the named subroutines for which the flag is true.
# I am also weird, otherwise I'd simply have passed a list of subroutines to execute :)
my @flags_and_sub_names_list = @_;
while ( @flags_and_sub_names_list ) {
    my $flag = shift @flags_and_sub_names_list;
    my $subName = shift @flags_and_sub_names_list;
    next unless $flag && $subName;
    &{ $subName }; # Call the named subroutine
}
I've had a case of a 300-line function performing input sanitization which missed a certain corner case. It was parsing certain situations manually using IndexOf and Substring plus a lot of inlined variables and constants (it looks like the original coder didn't know anything about good practices), and there were no comments. Throwing it away wasn't feasible due to time constraints, and since I didn't have the specification, rewriting it would have meant understanding the original anyway; once I understood it, fixing it was simply quicker. I also added lots of comments, so whoever comes after me won't feel the same pain when taking a look at it...
The Perl statement:
select((select(s),$|=1)[0])
which, at the suggestion of the original author (Randal Schwartz himself, who said he disliked it but nothing else was available at the time), was replaced with something a little more understandable:
IO::Handle->autoflush
Beyond that one-liner, some of the Java JDBC libraries from IBM are obfuscated and all variables and functions are either combinations of the letter 'l' and '1' or single/double characters - very hard to track anything down until you get them all renamed. Needed to do this to track down why they worked fine in IBM's JRE but not Sun's.
If you're talking about HLL code, I was once updating a project written by a Chinese developer in which all the comments were in Chinese (stored in ANSI), and it was a horror to understand some code fragments. If you're talking about low-level code, there were MANY such cases (obfuscated, mutated, VM-ed...).
I once had to reverse engineer a Java 1.1 framework that:
Extended event-driven SAX parser classes for every class, even those that didn't parse XML (the overridden methods were simply invoked ad hoc by other code)
Custom runtime exceptions were thrown in lieu of method invocations wherever possible. As a result, most of the business logic landed in a nested series of catch blocks.
If I had to guess, it was probably someone's "smart" idea that method invocations were expensive in Java 1.1, so throwing exceptions for non-exceptional flow control was somehow considered an optimization.
Went through about three bottles of eye drops.
I once found a time bomb that had been intentionally obfuscated.
When I had finally decoded what it was doing I mentioned it to the manager who said they knew about the time bomb but had left it in place because it was so ineffective and was interwoven with other code.
The time bomb was (presumably) supposed to go off after a certain date.
Instead, it had a bug in it so it only activated if someone was working after lunchtime on Dec 31st.
It had taken three years for that circumstance to occur since the guy who wrote the time bomb left the company.

Ruby Equivalent of C++ Const?

I'm learning Ruby in my spare time, and I have a question about language constructs for constants. Does Ruby have an equivalent of the C++ const keyword to keep variables from being modified? Here's some example code:
first_line = f.gets().chomp()
column_count = first_line.split( %r{\s+} ).size()
print column_count, "\n"
I'd like to declare column_count to be const, because I use it below in my program and I really don't want to modify it by mistake. Does Ruby provide a language construct for doing this, or should I just suck it up and realize that my variables are always mutable?
Response to comments:
'The most likely cause of "accidental" overwriting of variables is, I'd guess, long blocks of code.' I agree with the spirit of your point, but disagree with the letter. Your point about avoiding long blocks of code and unnecessary state is a good one, but constants can also be useful for describing the design of the code inside the implementation. A large part of the value of const in my code comes from annotating which variables I SHOULD change and which I shouldn't, so that I'm not tempted to change them if I come back to my code next year. This is the same sentiment that suggests that code that needs only short comments because of good variable names and clear indentation is better than awkwardly written code explained by detailed comments.
Another option appears to be Ruby's #freeze method, which I like the look of as well. Thanks for the responses everyone.
Ruby variables in general are, well, variable.
Beyond that, Jeremy's answer, while entirely accurate, doesn't lead you to a Ruby style that's very "mainstream" or idiomatically sound, and I wouldn't recommend it for adoption. Ruby doesn't work like C++ and generally isn't very appropriate for the things that C++ is best used for: operating systems, word processors, that kind of thing.
The most likely cause of "accidental" overwriting of variables is, I'd guess, long blocks of code. After all, if you change the value of a variable in a five-line method, it's going to be fairly apparent! If you're habitually writing blocks of code longer than, say, 10 lines, then those chunks are probably doing too many things and I strongly advise that you make efforts to break them up (increase cohesion). Localise variables as much as possible to minimise the chance of unexpected side-effects (reduce coupling).
By convention, constants in Ruby are generally written in all caps, such as COLUMN_COUNT. But as was pointed out, all variables that start with a capital letter are constants.
Variables that start with a capital letter are constants in Ruby. So you could change your code to this:
first_line = f.gets().chomp()
Column_count = first_line.split( %r{\s+} ).size()
print Column_count, "\n"
Now you'll get a warning (not an error) if you try to reassign Column_count.

Resources