Mapping Untyped Lisp data into a typed binary format for use in compiled functions - data-structures

Background: I'm writing a toy Lisp (Scheme) interpreter in Haskell. I'm at the point where I would like to be able to compile code using LLVM. I've spent a couple days dreaming up various ways of feeding untyped Lisp values into compiled functions that expect to know the format of the data coming at them. It occurs to me that I am not the first person to need to solve this problem.
Question: What are some historically successful ways of mapping untyped data into an efficient binary format.
Addendum: In point of fact, I do know which of about a dozen different types the data is, I just don't know which one might be sent to the function at compile time. The function itself needs a way to determine what it got.

Do you mean, "I just don't know which [type] might be sent to the function at runtime"? It's not that the data isn't typed; certainly 1 and '() have different types. Rather, the data is not statically typed, i.e., it's not known at compile time what the type of a given variable will be. This is called dynamic typing.
You're right that you're not the first person to need to solve this problem. The canonical solution is to tag each runtime value with its type. For example, if you have a dozen types, number them like so:
0 = integer
1 = cons pair
2 = vector
etc.
Once you've done this, reserve the first four bits of each word for the tag. Then, every time two objects get passed in to +, first you perform a simple bit mask to verify that both objects' first four bits are 0b0000, i.e., that they are both integers. If they are not, you jump to an error message; otherwise, you proceed with the addition, and make sure that the result is also tagged accordingly.
This technique essentially makes each runtime value a manually-tagged union, which should be familiar to you if you've used C. In fact, it's also just like a Haskell data type, except that in Haskell the taggedness is much more abstract.
I'm guessing that you're familiar with pointers if you're trying to write a Scheme compiler. To avoid limiting your usable memory space, it may be more sensical to use the bottom (least significant) four bits, rather than the top ones. Better yet, because aligned dword pointers already have three meaningless bits at the bottom, you can simply co-opt those bits for your tag, as long as you dereference the actual address, rather than the tagged one.
Does that help?

Your default solution should be a simple tagged union. If you want to narrow your typing down to more specific types, you can do it - but it won't be that "toy" any more. A thing to look at is called abstract interpretation.
There are few successful implementations of such an optimisation, with V8 being probably the most widespread. In the Scheme world, the most aggressively optimising implementation is Stalin.

Related

Use big.Rat with Go to get Abs() value

I am a beginner with Go and a java developer.
I am currently working with big.Rat.
I need to get the Abs of a Rat n for which I have to write something like
n.Abs(n) or something like big.Rat{}.Abs(n)
Why didn't go provide something like just n.Abs()?
Or am I going wrong somewhere?
Go's big package is concerned with memory allocation when it comes to its function signatures. A big.Rat consists of two big.Ints which each contain an array of uints. Unlike an int (native 32 or 64 bit integer), a big.Int must thus be allocated dynamically, depending on its value. For large values this means more elements in the array.
Your proposed function signature n.Abs() would mean that a new array of the same size as n's would have to be allocated for this operation. In reality we often have the case that the original n is no longer needed, thus we can reuse its existing memory. To allow this, the Abs function takes a pointer to an existing big.Rat which might be n itself. The implementation can now reuse the memory. The caller is now in full control of what memory to use for these operations.
This might not make the nicest API for all use cases, in fact if you just want to do a quick calculation for a few large numbers, on a computer with Gigabytes of RAM, you might have preferred the n.Abs() version, but if you do numerically expensive computations with a lot of large numbers, you must be able to control your memory. Imagine doing some image manipulation on a Raspberry for example, where you are more constraint by the available memory. In this case the existing API allows you to be more efficient.

Does using enums instead of booleans really affect cache usage?

I saw a comment thread where it was suggested that enums should be used instead of booleans in general since it's clearer what the parameters do at the call site, and it's easier to refactor if you need to add a case.
Then someone else claimed that that was a terrible idea since it would often create an unnecessary variable for each call, which would be an unnecessary use of compute resources and cache space.
Does this second claim hold up? My understanding is that booleans are usually not stored as single bits, but as sets of bits with more than enough room for the amount of extra options in a typical enum. So the same amount of data would need to be moved around.
If I understand correctly, there would be an extra variable required if the enum has 3 or more options, (one to store the enum and one for a derived boolean on each check,) but in that case, you actually needed the three options so what can you do? In the case that you have exactly two enum options, then couldn't a compiler just transform it into a boolean in the same register, (assuming the enum value was specified as not having one 0 value and one non-zero value,) therefore not using any extra space?
One extra compare instruction I suppose, but cache usage seems to be the much bigger deal performance-wise these days. And if you have an enum that's isomorphic to a boolean, you'll often make the values automatic anyway, so the compiler should be free to fully optimize it.
Do I understand this correctly, or am I missing something?

Halide: Filter elements out of vector (Halide::Runtime::Buffer)

I have a Halide::Runtime::Buffer and would like to remove elements that match a criteria, ideally such that the operation occurs in-place and that the function can be defined in a Halide::Generator.
I have looked into using reductions, but it seems to me that I cannot output a vector of a different length -- I can only set certain elements to a value of my choice.
So far, the only way I got it to work was by using a extern "C" call and passing the Buffer I wanted to filter, along with a boolean Buffer (1's and 0's as ints). I read the Buffers into vectors of another library (Armadillo), conducted my desired filter, then read the filtered vector back into Halide.
This seems quite messy and also, with this code, I'm passing a Halide::Buffer object, and not a Halide::Runtime::Buffer object, so I don't know how to implement this within a Halide::Generator.
So my question is twofold:
Can this kind of filtering be achieved in pure Halide, preferably in-place?
Is there an example of using extern "C" functions within Generators?
The first part is effectively stream compaction. It can be done in Halide, though the output size will either need to be fixed or a function of the input size (e.g. the same size as the input). One can get the max index produced as output as well to indicate how many results were produced. I wrote up a bit of an answer on how to do a prefix sum based stream compaction here: Halide: Reduction over a domain for the specific values . It is an open question how to do this most efficiently in parallel across a variety of targets and we hope to do some work on exploring that space soon.
Whether this is in-place or not depends on whether one can put everything into a single series of update definitions for a Func. E.g. It cannot be done in-place on an input passed into a Halide filter because reductions always allocate a buffer to work on. It may be possible to do so if the input is produced inside the Generator.
Re: the second question, are you using define_extern? This is not super well integrated with Halide::Runtime::Buffer as the external function must be implemented with halide_buffer_t but it is fairly straight forward to access from within a Generator. We don't have a tutorial on this yet, but there are a number of examples in the tests. E.g.:
https://github.com/halide/Halide/blob/master/test/generator/define_extern_opencl_generator.cpp#L19
and the definition:
https://github.com/halide/Halide/blob/master/test/generator/define_extern_opencl_aottest.cpp#L119
(These do not need to be extern "C" as I implemented C++ name mangling a while back. Just set the name mangling parameter to define_extern to NameMangling::CPlusPlus and remove the extern "C" from the external function's declaration. This is very useful as it gets one link time type checking on the external function, which catches a moderately frequent class of errors.)

Why is useful to have a atom type (like in elixir, erlang)?

According to http://elixir-lang.org/getting-started/basic-types.html#atoms:
Atoms are constants where their name is their own value. Some other
languages call these symbols
I wonder what is the point of have a atom type. Probably to help build a parser or for macros? But in everyday use how it help the programmer?
BTW: Never use elixir or erlang, just note it exist (also in kdb)
They're basically strings that can easily be tested for equality.
Consider a string. Conceptually, we generally want to think of strings as being equal if they have the same contents. For example, "dog" == "dog" but "dog" != "cat". However, to check the equality of strings, we have to check to see if each letter in one string is equal to the letter in the same position in another string, which means that we have to walk through each element of the string and check each character for equality. This becomes a bit more cumbersome if dealing with Unicode strings and having to consider different ways of composing identical characters (for example, the character é has two representations in UTF-8).
It would be much simpler if we stored identical strings at the same location in memory. Then, checking equality would be a simple pointer or index comparison.
As a consequence of storing identical strings in the same location in memory, we can also store one copy of each unique kind of string regardless of how many times it is used in the program, thus saving some memory for commonly-used strings as well.
At a higher level, using atoms also lets us think of strings the same way we think of other primitive data types like integers.
I think that one of the most common usage in erlang is to tag variables and messages, with the benefit of fast comparison (pattern match) as mipadi says.
For example you write a function that may fail depending on parameters provided, the status of connection to a server, or any reason. A very frequent usage is to return a tuple {ok,Value} in case of success, {error,Reason} in case of error. The calling function will have the choice to manage only the success case coding {ok,Value} = yourModule:yourFunction(Param...). Doing this it is clear that you consider only the success case, you extract directly the Value from the function return, it is fast, and you don't have to share any header with yourModule to decode the ok atom.
In messages you will often see things like {add,Key,Value}, {delete,Key},{delete_all}, {replace,Key,Value}, {append,Key,Value}... These are explicit messages, with the same advantages as mentioned before: fast,sensible,no share of header...
Atoms are constants with itself as value.
This is a concept very usefull in distributed systems, where constants can be defined differently on each system, while atoms are self-containing with no need for definement.

What's the most efficient way of combining switch/if statements

This question doesn't address any programming language in particular but of course I'm happy to hear some examples.
Imagine a big number of files, let's say 5000, that have all kinds of letters and numbers in it. Then, there is a method that receives a user input that acts as an alias in order to display that file. Without having the files sorted in a folder, the method(s) need to return the file name that is associated to the alias the user provided.
So let's say user input "gd322" stands for the file named "k4e23", the method would look like
if(input.equals("gd322")){
return "k4e23";
}
Now, imagine having 4 values in that method:
switch(input){
case gd322: return fw332;
case g344d: return 5g4gh;
case s3red: return 536fg;
case h563d: return h425d;
} //switch on string, no break, no string indicators, ..., pls ignore the syntax, it's just pseudo
Keeping in mind we have 5000 entries, there are probably more than just 2 entries starting with g. Now, if the user input starts with 's', instead of wasting CPU cycles checking all the a's, b's, c's, ..., we could also make another switch for this, which then directs to the 'next' methods like this:
switch(input[0]){ //implying we could access strings like that
case a: switchA(input);
case b: switchB(input);
// [...]
case g: switchG(input);
case s: switchS(input);
}
So the CPU doesn't have to check on all of them, but rather calls a method like this:
switchG(String input){
switch(input){
case gd322: return fw332;
case g344d: return 5g4gh;
// [...]
}
Is there any field of computer science dealing with this? I don't know how to call it and therefore don't know how to search for it but I think my thoughts make sense on a large scale. Pls move the thread if it doesn't belong here but I really wanna see your thoughts on this.
EDIT: don't quote me on that "5000", I am not in the situation described above and I wanted to talk about this completely theoretical, it could also be 3 entries or 300'000, maybe even less or more
If you have 5000 options, you're probably better off hashing them than having hard-coded if / switch statements. In c++ you could also use std::map to pair a function pointer or other option handling information with each possible option.
Interesting, but I don't think you can give a generic answer. It all depends on how the code is executed. Many compilers will have all kinds of optimizations, in the if and switch, but also in the way strings are compared.
That said, if you have actual (disk) files with those lists, then reading the file will probably take much longer than processing it, since disk I/O is very slow compared to memory access and CPU processing.
And if you have a list like that, you may want to build a hash table, or simply a sorted list/array in which you can perform a binary search. Sorting it also takes time, but if you have to do many lookups in the same list, it may be well worth the time.
Is there any field of computer science dealing with this?
Yes, the science of efficient data structures. Well, isn't that what CS is all about? :-)
The algorithm you described resembles a trie. It wouldn't be statically encoded in the source code with switch statements, but would use dynamic lookups in a structure loaded from somewhere and stuff, but the idea is the same.
Yes the problem is known and solved since decades. Hash functions.
Basically you have a set of values (here strings like "gd322", "g344d") and you want to know if some other value v is among them.
The idea is to put the strings in a big array, at an index calculated from their values by some function. Given a value v, you'll compute an index the same way, and check whether the value v is here or not. Much faster than checking the whole array.
Of course there is a problem with different values falling at the same place : collisions. Some magic is needed then : perfect hash functions whose coefficients are tweaked so values from the initial set don't cause any collisions.

Resources