I tried to compile an ASN.1 file with Erlang's asn1ct:compile function. I ran the following code:
asn1ct:compile("PDU-definitions", [per, verbose]).
Then I got the following errors:
...
{error,{system_limit,[{erlang,list_to_atom,
["enc_InterRATHandoverInfo_v390NonCriticalExtensions_present_v3a0NonCriticalExtensions_laterNonCriticalExtensions_v3g0NonCriticalExtensions_v4b0NonCriticalExtensions_v4d0NonCriticalExtensions_v590NonCriticalExtensions_v690NonCriticalExtensions_nonCriticalExtensions"], []},
...
I googled and found that there is a 255-character limit on the length of an Erlang atom. Because the data structures in the ASN.1 file are so deeply nested, the length of the corresponding atom exceeds that limit.
My question is: can I raise the default length limit to a bigger value, or is there some workaround for this situation?
Thanks!
As of R17, there is still no way to raise the maximum atom length in Erlang short of modifying the source and recompiling. A quick look through asn1ct's documentation suggests no means of changing the atom-encoding behaviour, either.
The best bet I saw was the "n2n" compile option, which instructs the compiler to generate functions for name-to-enumeration conversion. I assume it will still construct atoms in that case, however, which would make it a moot point.
Nothing else in the documentation suggests a way to change name-construction behaviour, and as such severely nested data structures will cause problems.
First, let me couch this in the acknowledgement that yes, I am aware of protoc, but I've got a specific requirement to produce some specialized target-language artifacts based on the outcome of a .proto file parser.
That being established, I've already got the parser itself working. I am working on resolving imported .proto dependencies. Not a terribly difficult endeavor on the surface, in and of itself.
The next step after that, I think, is to perform a kind of "transitive linkage", as I've learned it is called, but I am curious what I should be aware of. Prima facie, I think I should be collating a set (most likely a map) of element paths to field numbers, along with the reserved ranges and the extensions, then perhaps verifying as I traverse the .proto dependency tree.
However, I'd like to get an idea of others' experience, guidance, and feedback along these lines.
For what I'm trying to accomplish, I do not think this verification step needs to be all that elaborate, only thorough enough to rule out invalid .proto files and the like.
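For concreteness, here is roughly the shape of the bookkeeping I have in mind. This is only a C++-flavoured sketch, and every name in it is a placeholder rather than code I actually have:

#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Per-message facts gathered while walking the resolved .proto dependency
// tree: field numbers already taken, reserved ranges/names, extension ranges.
struct MessageFacts {
    std::map<std::string, int> fieldNumbers;        // field name -> field number
    std::set<std::pair<int, int>> reservedRanges;   // inclusive [lo, hi]
    std::set<std::string> reservedNames;
    std::set<std::pair<int, int>> extensionRanges;  // inclusive [lo, hi]

    void addField(const std::string& name, int number) {
        if (reservedNames.count(name))
            throw std::runtime_error("field uses a reserved name: " + name);
        for (const auto& r : reservedRanges)
            if (number >= r.first && number <= r.second)
                throw std::runtime_error("field uses a reserved number: " + name);
        for (const auto& f : fieldNumbers)
            if (f.second == number)
                throw std::runtime_error("duplicate field number: " + name);
        if (!fieldNumbers.emplace(name, number).second)
            throw std::runtime_error("duplicate field name: " + name);
    }
};

// Keyed by the fully qualified element path, e.g. "pkg.Outer.Inner".
using SymbolTable = std::map<std::string, MessageFacts>;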
Oh, and last but not least: I need to handle this for the Protobuf v2 (proto2) language spec.
So we have PPMd decompression code that was cut and pasted from Dmitry Shkarin's original code from 1997 (judging by the few comments in it). The code itself is mostly uncommented, and I just can't figure out how it works.
The code uses a suballocator, but it doesn't just allocate and deallocate from it; instead it manipulates the free-block list directly from the calling code, in various ways I can't decipher yet.
We have found a fuzzed sample that causes the code to crash, and I was assigned to fix it.
But in order to tackle the problem I need to understand how the decompression works (I'm only interested in decompression).
Google wasn't very helpful either. The search results are dominated by a gamer with an identical nickname and by the feature lists of various archivers. I eventually found a Russian website with a specification of the algorithm, but only through the Wayback Machine and only in Russian, which I cannot read because of the language barrier.
And it looks like it's only a mathematical description. So far I have found nothing specifying how PPMd-compressed data is laid out in a compressed file or how it is consumed when decompressing.
Can anyone who understands the PPMd algorithm give me some pointers?
Ideally I'm looking for documents that explain the structure of PPMd-encoded data, something as detailed as RFC 1951 is for DEFLATE.
UPDATE:
Well, it turns out the code has quite a few fishy things in it.
For example, this one:
        MaxContext=FoundState->Successor; return;
    }
    *pText++ = FSymbol; Successor = (PPM_CONTEXT*) pText;
    if (pText >= UnitsStart) goto RESTART_MODEL;
    if ( FSuccessor ) {
        if ((BYTE*) FSuccessor < UnitsStart)
It writes stuff into a byte buffer, then casts that buffer into a struct that contains pointers.
Then in the CreateSuccessors function we have more sorcery:
ct.oneState().Successor=(PPM_CONTEXT*) (((BYTE*) UpBranch)+1);
Both UpBranch and ct.oneState().Successor are PPM_CONTEXT pointers. I can't imagine what the purpose of a statement like this would be. As I said, this structure contains pointers which can eventually be dereferenced (I tried setting these pointers to NULL to see whether they are used), and it turns out they are indeed dereferenced, at least in the second case.
I'm building a static binary out of several source files and libraries, and I want to control the order in which the functions are put into the resulting binary.
The background is that I have external code which is linked against offsets in this binary. Now if I change the source, all the offsets change, because gcc may decide to order the functions differently, so I want to put the referenced functions at the beginning, in a fixed order, so that their offsets stay unchanged...
I looked through ld's documentation but couldn't find anything about the order of functions.
The only thing I found was -fno-toplevel-reorder, which doesn't really help me.
There is really no clean and reliable way of forcing a function to a particular address (except for the entry function), or even of forcing functions into a particular order (and even if you could enforce the order, that still would not mean the addresses stay the same when the source is changed!).
The biggest problem I see is that even if it were possible to pin a function to some address, it would be all but impossible to pin all of them to exactly the addresses that the already existing external program expects (assuming you cannot modify that program). If that actually worked, it would be pure coincidence and sheer luck.
It might actually be easiest to provide trampolines at the addresses that the other program expects and have them jump to the real functions (wherever those may be). That would require your code to use a different base address, so the actual program code doesn't collide with the trampolines.
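A rough sketch of the trampoline idea, in case it helps; the section name and function names are made up, and you would still need a linker script or equivalent to pin the stub section at the addresses the external program expects:

/* Tiny forwarding stubs kept in their own section so they can be pinned at
   the expected addresses; the real functions may then move freely between
   builds. All names here are illustrative only. */
int real_frobnicate(int x) { return x + 1; }   /* real implementation, free to move */

__attribute__((section(".trampolines"), noinline))
int frobnicate(int x)
{
    return real_frobnicate(x);   /* usually compiles down to a single jump / tail call */
}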
There are three things that almost work for giving functions fixed addresses:
You can place each function that isn't allowed to move into its own section using __attribute__ ((section ("some name"))). Unluckily, .text always appears as the first section, so if anything in .text changes such that its size is bumped over the 512-byte boundary, your offsets will change. By default (but see below) you can't get a section to start before .text.
The -falign-functions=n command-line option lets you align functions to a boundary. Normally this is something around 16 bytes. Now, you could choose a large value, for example 1024. That will waste an immense amount of space, but it will also make sure that, as long as functions only change moderately, the addresses of the following functions remain the same. Obviously it still does not prevent the compiler/linker from reordering entire blocks when it feels like it (though -fno-toplevel-reorder will prevent this at least partially).
If you are willing to write a custom linker script, you can assign a start address for each section. These are virtual memory addresses, not positions in the executable, but I assume the hard linking works with VMAs (based on the default image base) too. So that could kind of work, although with much trouble and not in a pretty way.
When writing your own linker script, you could also consider putting the functions that must not move into their own sections and placing those sections at the beginning of the executable (in front of .text), so changes in .text won't move your functions around; a rough sketch of this follows below.
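To make the last two points a bit more concrete, here is roughly what that could look like; the section name, function, and address are purely illustrative:

/* Functions that must not move go into their own section: */
__attribute__((section(".fixedfuncs")))
int legacy_entry_point(int arg)
{
    return arg * 2;
}

/* A custom linker script can then place that section in front of .text and
   give it a fixed start address, along the lines of this fragment:

     SECTIONS
     {
       .fixedfuncs 0x00401000 : { *(.fixedfuncs) }
       .text       : { *(.text .text.*) }
     }
*/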
Update:
The "gcc" tag suggests that you probably target *NIX, so again this is probably not going to help you, but... if you have the option to use COFF, dollar-sign sections might work (the info might be interesting for others, in any case).
I just stumbled across this today (emphasis mine):
The "$" character (dollar sign) has a special interpretation in section names in object files. When determining the image section that will contain the contents of an object section, the linker discards the "$" and all characters that follow it. Thus, an object section named .text$X actually contributes to the .text section in the image. However, the characters following the "$" determine the ordering of the contributions to the image section. All contributions with the same object-section name are allocated contiguously in the image, and the blocks of contributions are sorted in lexical order by object-section name. Therefore, everything in object files with section name .text$X ends up together, after the .text$W contributions and before the .text$Y contributions.
If the documentation does not lie (and if I'm not reading it wrong), this means you should be able to pack all the functions that you want located at the front into one section .text$A, and everything else into .text$B, and it should do just that.
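If the documentation holds, using this from GCC on a PE/COFF target (e.g. MinGW) might look roughly like the following; I believe MSVC's #pragma code_seg(".text$A") is the equivalent there. The function names are made up:

/* The "$" suffix never shows up in the image; it only sorts the contributions
   within .text: everything placed in .text$A ends up ahead of .text$B. */
__attribute__((section(".text$A")))
int pinned_one(void) { return 1; }

__attribute__((section(".text$A")))
int pinned_two(void) { return 2; }

__attribute__((section(".text$B")))
int free_to_move(void) { return 3; }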
Build your code with -ffunction-sections -- this will place each function into its own section.
If you are using GNU-ld, the linker script gives you absolute control, but is a very platform-specific and somewhat painful solution.
A better solution might be to use the recent work on gold, which allows exactly the function ordering you are seeking.
A lot of it comes down to the order of the functions within each file and the order of the files on the command line when you link.
Embed something in the code that your external code can find: a const structure containing some ASCII marker and the addresses of the functions, perhaps. Then, no matter where the compiler puts the functions, you can find them.
That, or use the normal .dll or .so mechanisms, and not have to mess with it.
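A rough sketch of the marker idea; the magic string, names, and layout are placeholders:

/* A findable table with a fixed layout: the external code scans the binary for
   the magic string and reads the function addresses stored next to it, so it no
   longer matters where the compiler placed the functions themselves. */
int do_checksum(int x) { return x ^ 0x5a; }
int do_reset(int x)    { return -x; }

struct export_entry {
    const char *name;
    int       (*func)(int);
};

__attribute__((used)) const char export_magic[16] = "MYFUNCTABLE_V1";

__attribute__((used)) const struct export_entry export_table[] = {
    { "do_checksum", do_checksum },
    { "do_reset",    do_reset    },
    { 0, 0 }
};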
In my experience, gcc -O0 will fix the binary order of functions to match the order in the source code.
However as others have mentioned, even if the order is fixed, the offsets can change as you modify the source code or upgrade your toolchain.
I am trying to track down a non-exhaustive pattern in a library's code, specifically HDBC's MySQL implementation. It is trying to match over types in my program and map them to MySQL's types, I believe. I can't seem to get a call stack for this error, and since there are a number of parameters to the SQL query, it is difficult to track down exactly which one is causing it.
Is it possible to get a call stack in Haskell so I would know which parameter is causing the error? Also, I would think that this should be caught by the compiler, since it should be able to look at my types and the patterns and make sure that there is a corresponding match.
You can use the GHCi debugger to identify where the exception is coming from.
I walk through a full example here.
You might also take a look at the Debug.Trace library.
I'm considering how to do automatic bug tracking and as part of that I'm wondering what is available to match source code line numbers (or more accurate numbers mapped from instruction pointers via something like addr2line) in one version of a program to the same line in another. (Assume everything is in some kind of source control and is available to my code)
The simplest approach would be to use a diff tool/lib on the files and do some math on the line number spans, however this has some limitations:
It doesn't handle cross-file motion.
It might not play well with lines that get changed.
It doesn't look at the information available in the intermediate versions.
It provides no way to manually patch up lines when the diff tool gets things wrong.
It's kinda clunky.
Before I start diving into developing something better:
What already exists to do this?
What features do similar systems have that I've not thought of?
Why do you need to do this? If you use decent version control, you should have access to old versions of the code; you can simply provide a link to that so people can see the bug in its original place. In fact, the main problem I see with this system is that the bug may have already been fixed, but your automatic line-tracking code will still point to a line and say there's a bug there. It seems this system would be a pain to build and would not provide a whole lot of help in practice.
My suggestion is: instead of trying to track line numbers, which as you observed can quickly get out of sync as software changes, you should decorate each assertion (or other line of interest) with a unique identifier.
Assuming you're using C, in the case of assertions, this could be as simple as changing something like assert(x == 42); to assert(("check_x", x == 42)); -- this is functionally identical, due to the semantics of the comma operator in C and the fact that a string literal will always evaluate to true.
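For example (a minimal sketch; the exact wording of the failure message depends on the C library, the one shown is glibc-style):

#include <assert.h>

int main(void)
{
    int x = 41;
    /* The comma operator discards the string literal, so the condition is
       unchanged, but the stringified expression in the failure message now
       carries the stable tag, e.g. something like:
         Assertion `("check_x", x == 42)' failed.  */
    assert(("check_x", x == 42));
    return 0;
}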
Of course this means that you need to identify a priori those items that you wish to track. But given that there's no generally reliable way to match up source line numbers across versions (by which I mean that for any mechanism you could propose, I believe I could propose a situation in which that mechanism does the wrong thing) I would argue that this is the best you can do.
Another idea: If you're using C++, you can make use of RAII to track dynamic scopes very elegantly. Basically, you have a Track class whose constructor takes a string describing the scope and adds this to a global stack of currently active scopes. The Track destructor pops the top element off the stack. The final ingredient is a static function Track::getState(), which simply returns a list of all currently active scopes -- this can be called from an exception handler or other error-handling mechanism.
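A minimal sketch of that idea; the class shape is just illustrative, and a production version would want thread-local storage rather than a single global stack:

#include <iostream>
#include <string>
#include <utility>
#include <vector>

// RAII scope tracker: the constructor pushes a description onto a global stack
// of active scopes, the destructor pops it, and getState() returns a snapshot.
class Track {
public:
    explicit Track(std::string scope) { stack().push_back(std::move(scope)); }
    ~Track() { stack().pop_back(); }

    static std::vector<std::string> getState() { return stack(); }

    Track(const Track&) = delete;
    Track& operator=(const Track&) = delete;

private:
    static std::vector<std::string>& stack() {
        static std::vector<std::string> s;
        return s;
    }
};

void parseHeader() {
    Track t("parseHeader");
    // On an error path, dump the currently active scopes:
    for (const auto& scope : Track::getState())
        std::cerr << "in scope: " << scope << '\n';
}

int main() {
    Track t("main");
    parseHeader();
}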