How to achieve a recursive deftype - data-structures

I'm curious as to how to do a Clojure deftype that contains a reference to itself, e.g.
(deftype BinaryTree [^BinaryTree left ^BinaryTree right])
This doesn't work... however I see no intrinsic reason why it shouldn't be possible since the underlying Java class is perfectly capable of referring to itself.
What am I doing wrong here?
Mike.

Currently ^Class hints on fields (in opposition to ^primitive hints) are discarded, so there's no gain in trying to put them. This may change in the future.
However auto reference in a type definition (eg in method bodies, not in fields) somewhat works but the implementation is a bit of a hack. There's little incentive to fix auto-reference in the current java compiler given the promise of the rewrite of the compiler in Clojure.

Related

Is define implementation dependent in R7RS?

I've long since known that define is scary and should be used with caution unless you know for sure how your implementation handles it. Out of interest, I recently opened up R7RS and read all that I could find about define and nothing gave me the impression that any of it is implementation dependent. Have I missed something or is define no longer implementation-dependent in R7RS?
You seem to be reading something into the answer you linked which isn’t there.
define has always been well defined, just as well-defined as let is. Most people choose to use define only at the top level of modules to create top-level bindings, but that’s a stylistic choice — it’s also capable of creating local bindings, like let is, if you use it inside and at the top of an ‘internal’ body, such as inside a procedure or a let or similar. Multiple defines in such a context are technically equivalent to letrec*, as another answer noted.
The most common interpretation of define is to replace it with letrec*.
But this problem has indeed many possible interpretations and the language does not impose any. Any interpretation is valid from the viewpoint of the language.

Java Bytecode manipulation libraries

I am starting to work on a project and for one of the tasks I need to analyze the source code in order to gather information about the classes and their methods. More specifically, for each method I need to know which internal attributes and external objects (references) it uses throughout the entire method body.
I discussed it with my supervisors and they think that Bytecode manipulation libraries is the way to go. I already looked at BCEL, ASM and Javassist but I'm not sure which one I need to use. Do they all provide access to the method body where I can see all the instructions and get the information I need?
Any advice would be appreciate it. Thank you!
If you really “need to analyze the source code”, then libraries which allow to inspect the bytecode are not the way to go.
Otherwise, you really need to define your task precisely. Either, you are about to analyze classes, regardless of whether you will look at their source code or byte code, or you want to analyze source code and consider doing it by compiling first, followed by analyzing the compiled result. In the latter case, you have to compare the effort of both steps with alternative solution, which may, e.g. incorporate direct source code analysis.
Parsing byte code is rather easy, easier than analyzing source code, which is the reason why bytecode is produced prior to the execution of Java programs. To answer your concrete question, yes, all three libraries offer you a way to analyze the instructions and associated information. Which one is the best to fit your needs, is a question that is beyond the scope of Stackoverflow.
Whether analyzing the byte code helps, depends on your exact requirements. When it comes to field and method access, you may precisely get most of them using that approach. Only inlined compile-time constants lack their origins. When it comes to type use, you have to consider that not every source code artifact has an existing counterpart in the byte code, e.g. widening casts produce no actual code and and local variables usually don’t have a declared type (debugging information aside), but only an implied type which depends on how they are actually used. They also have no information about Generics, unless debugging information has been included.

How to Work with Ruby Duck Typing

I am learning Ruby and I'm having a major conceptual problem concerning typing. Allow me to detail why I don't understand with paradigm.
Say I am method chaining for concise code as you do in Ruby. I have to precisely know what the return type of each method call in the chain, otherwise I can't know what methods are available on the next link. Do I have to check the method documentation every time?? I'm running into this constantly running tutorial exercises. It seems I'm stuck with a process of reference, infer, run, fail, fix, repeat to get code running rather then knowing precisely what I'm working with during coding. This flies in the face of Ruby's promise of intuitiveness.
Say I am using a third party library, once again I need to know what types are allow to pass on the parameters otherwise I get a failure. I can look at the code but there may or may not be any comments or declaration of what type the method is expecting. I understand you code based on methods are available on an object, not the type. But then I have to be sure whatever I pass as a parameter has all the methods the library is expect, so I still have to do type checking. Do I have to hope and pray everything is documented properly on an interface so I know if I'm expected to give a string, a hash, a class, etc.
If I look at the source of a method I can get a list of methods being called and infer the type expected, but I have to perform analysis.
Ruby and duck typing: design by contract impossible?
The discussions in the preceding stackoverflow question don't really answer anything other than "there are processes you have to follow" and those processes don't seem to be standard, everyone has a different opinion on what process to follow, and the language has zero enforcement. Method Validation? Test-Driven Design? Documented API? Strict Method Naming Conventions? What's the standard and who dictates it? What do I follow? Would these guidelines solve this concern https://stackoverflow.com/questions/616037/ruby-coding-style-guidelines? Is there editors that help?
Conceptually I don't get the advantage either. You need to know what methods are needed for any method called, so regardless you are typing when you code anything. You just aren't informing the language or anyone else explicitly, unless you decide to document it. Then you are stuck doing all type checking at runtime instead of during coding. I've done PHP and Python programming and I don't understand it there either.
What am I missing or not understanding? Please help me understand this paradigm.
This is not a Ruby specific problem, it's the same for all dynamically typed languages.
Usually there are no guidelines for how to document this either (and most of the time not really possible). See for instance map in the ruby documentation
map { |item| block } → new_ary
map → Enumerator
What is item, block and new_ary here and how are they related? There's no way to tell unless you know the implementation or can infer it from the name of the function somehow. Specifying the type is also hard since new_ary depends on what block returns, which in turn depends on the type of item, which could be different for each element in the Array.
A lot of times you also stumble across documentation that says that an argument is of type Object, Which again tells you nothing since everything is an Object.
OCaml has a solution for this, it supports structural typing so a function that needs an object with a property foo that's a String will be inferred to be { foo : String } instead of a concrete type. But OCaml is still statically typed.
Worth noting is that this can be a problem in statically typed lanugages too. Scala has very generic methods on collections which leads to type signatures like ++[B >: A, That](that: GenTraversableOnce[B])(implicit bf: CanBuildFrom[Array[T], B, That]): That for appending two collections.
So most of the time, you will just have to learn this by heart in dynamically typed languages, and perhaps help improve the documentation of libraries you are using.
And this is why I prefer static typing ;)
Edit One thing that might make sense is to do what Scala also does. It doesn't actually show you that type signature for ++ by default, instead it shows ++[B](that: GenTraversableOnce[B]): Array[B] which is not as generic, but probably covers most of the use cases. So for Ruby's map it could have a monomorphic type signature like Array<a> -> (a -> b) -> Array<b>. It's only correct for the cases where the list only contains values of one type and the block only returns elements of one other type, but it's much easier to understand and gives a good overview of what the function does.
Yes, you seem to misunderstand the concept. It's not a replacement for static type checking. It's just different. For example, if you convert objects to json (for rendering them to client), you don't care about actual type of the object, as long as it has #to_json method. In Java, you'd have to create IJsonable interface. In ruby no overhead is needed.
As for knowing what to pass where and what returns what: memorize this or consult docs each time. We all do that.
Just another day, I've seen rails programmer with 6+ years of experience complain on twitter that he can't memorize order of parameters to alias_method: does new name go first or last?
This flies in the face of Ruby's promise of intuitiveness.
Not really. Maybe it's just badly written library. In core ruby everything is quite intuitive, I dare say.
Statically typed languages with their powerful IDEs have a small advantage here, because they can show you documentation right here, very quickly. This is still accessing documentation, though. Only quicker.
Consider that the design choices of strongly typed languages (C++,Java,C#,et al) enforce strict declarations of type passed to methods, and type returned by methods. This is because these languages were designed to validate that arguments are correct (and since these languages are compiled, this work can be done at compile time). But some questions can only be answered at run time, and C++ for example has the RTTI (Run Time Type Interpreter) to examine and enforce type guarantees. But as the developer, you are guided by syntax, semantics and the compiler to produce code that follows these type constraints.
Ruby gives you flexibility to take dynamic argument types, and return dynamic types. This freedom enables you to write more generic code (read Stepanov on the STL and generic programming), and gives you a rich set of introspection methods (is_a?, instance_of?, respond_to?, kind_of?, is_array?, et al) which you can use dynamically. Ruby enables you to write generic methods, but you can also explicity enforce design by contract, and process failure of contract by means chosen.
Yes, you will need to use care when chaining methods together, but learning Ruby is not just a few new keywords. Ruby supports multiple paradigms; you can write procedural, object oriend, generic, and functional programs. The cycle you are in right now will improve quickly as you learn about Ruby.
Perhaps your concern stems from a bias towards strongly typed languages (C++, Java, C#, et al). Duck typing is a different approach. You think differently. Duck typing means that if an object looks like a , behaves like a , then it is a . Everything (almost) is an Object in Ruby, so everything is polymorphic.
Consider templates (C++ has them, C# has them, Java is getting them, C has macros). You build an algorithm, and then have the compiler generate instances for your chosen types. You aren't doing design by contract with generics, but when you recognize their power, you write less code, and produce more.
Some of your other concerns,
third party libraries (gems) are not as hard to use as you fear
Documented API? See Rdoc and http://www.ruby-doc.org/
Rdoc documentation is (usually) provided for libraries
coding guidelines - look at the source for a couple of simple gems for starters
naming conventions - snake case and camel case are both popular
Suggestion - approach an online tutorial with an open mind, do the tutorial (http://rubymonk.com/learning/books/ is good), and you will have more focused questions.

What is the difference between Form5!ProgressBar.Max and Form5.ProgressBar.Max?

I'm looking at a piece of very old VB6, and have come across usages such as
Form5!ProgressBar.Max = time_max
and
Form5!ProgressBar.Value = current_time
Perusing the answer to this question here and reading this page here, I deduce that these things mean the same as
Form5.ProgressBar.Max = time_max
Form5.ProgressBar.Value = current_time
but it isn't at all clear that this is the case. Can anyone confirm or deny this, and/or point me at an explanation in words of one syllable?
Yes, Form5!ProgressBar is almost exactly equivalent to Form5.ProgressBar
As far as I can remember there is one difference: the behaviour if the Form5 object does not have a ProgressBar member (i.e. the form does not have a control called ProgressBar). The dot-notation is checked at compile time but the exclamation-mark notation is checked at run time.
Form5.ProgressBar will not compile.
Form5!ProgressBar will compile but will give an error at runtime.
IMHO the dot notation is preferred in VB6, especially when accessing controls. The exclamation mark is only supported for backward-compatibility with very old versions of VB.
The default member of a Form is (indirectly) the Controls collection.
The bang (!) syntax is used for collection access in VB, and in many cases the compiler makes use of it to early bind things that otherwise would be accessed more slowly through late binding.
Far from deprecated, it is often preferable.
However in this case since the default member of Form objects is [_Default] As Object containing a reference to a Controls As Object instance, there is no particular advantage or disadvantage to this syntax over:
Form5("ProgressBar").Value
I agree that in this case however it is better to more directly access the control as a member of the Form as in:
Form5.ProgressBar.Value
Knowing the difference between these is a matter of actually knowing VB. It isn't simply syntactic though, the two "paths" do different things that get to the same result.
Hopefully this answer offers an explanation rather merely invoking voodoo.

Is there any downside to redundant qualifiers? Any benefit?

For example, referencing something as System.Data.Datagrid as opposed to just Datagrid. Please provide examples and explanation. Thanks.
The benefit is that you don't need to add an import for everything you use, especially if it's the only thing you use from a particular namespace, it also prevents collisions.
The downside, of course, is that the code balloons out in size and gets harder to read the more you use specific qualifiers.
Personally I tend to use imports for most things unless I know for sure I will only be using something from a particular namespace once or twice, so it won't impact the readability of my code.
You're being very explicit about the type you're referencing, and that is a benefit. Although, in the very same process you're giving up code clarity, which clearly is a downside in my case, as I want code to be readable and understandable. I go for the short version unless I have a conflict in different namespaces which can only be solved with the explicit referencing to classes.. Unless I make an alias for it with the keyword using:
using Datagrid = System.Data.Datagrid;
Actually the full path is global::System.Data.DataGrid. The point of using a more qualified path is to avoid having to use additional using statements, especially if the introduction of another using will cause problems with type resolution. More fully qualified identifiers exist so that you can be explicit when you need to be explicit, but if the class's namespace is clear, then the DataGrid version is clearer to many.
I generally use the shortest form available in order to keep the code as clean and readable as possible. That's what using directives are for, after all, and tooltips in the VS editor give you instant detail on the provenance of a type.
I also tend to use a namespace tag for RCWs in a COM interop layer, to call out those variables explicitly in the code (they may need special attention on lifecycle and collection), eg
using _Interop = Some.Interop.Namespace;
In terms of performance there is no upside/downside. Everything is resolved at compile time and the generated MSIL is identical whether you use fully-qualified names or not.
The reason why its use is prevalent in the .NET world is because of auto-generated code, such as designer markup. In that case it would be better to fully-qualify names like class names because of possible conflicts with other classes you may have in your code.
If you have a tool like ReSharper, it will actually tell you what fully-qualified references you have are unnecessary (e.g. by graying them out) so you can lop them off. If you frequently cut-paste code across your various code bases, it would be a must to fully qualify them. (then again, why would you want to do cut-paste all the time; it's a bad form of code reuse!)
I don't think there is really a downside, just readability vs actual time spent coding. In general if you don't have namespaces with ambiguous object I don't think it's really needed. Another thing to consider is level of use. If you have one method that uses reflection and you are alright with typeing System.Reflection 10 times, then it's not a big deal but if you plan on using a namespace alot then I would recommend an include.
Depending on your situation, extra qualifiers will generate a warning (if this is what you mean by redundant). If you then treat warnings as errors, that's a pretty serious downside.
I've run into this with GCC for example.
struct A {
int A::b; // warning!
}

Resources