Is there a word that means "trans-compilation result"?

A compiler takes source code and produces "binary" code.
A trans-compiler (or "transpiler") takes source code and produces ??? code?
I wouldn't like using the word source again because that is what humans write, and something isn't source if it is automatically produced by something from something else.
Is there an accepted term to mean trans-compilation output?

The phrase I think you should use is one of two:
translated code
generated code

Related

What is this apparently non-standard struct packing syntax fed into GCC?

I am a little bit dumbstruck by some code that is associated with a 3rd-party code base I'm working with. All code is written in C or assembler except for a number of files adhering to the syntax described below. I cannot find any documentation on this syntax, yet GCC swallows it without any problem. I work with GCC 8. The syntax must be some extension to GCC. It would be very nice if somebody could enlighten me as to exactly what extension it is and where it is documented.
The code obviously defines struct types with packing and uses syntax like this:
Comment lines begin with "--"
Keywords are "block", "padding", "field", and "field_high", possibly more. A typical piece of code looks like this:
block <BLOCK_NAME> {
field <FIELD_NAME_NO_1> 1
field <FIELD_NAME_NO_2> 1
padding 8
field_high <FIELD_NAME_NO_3> 6
}
A block can contain any number of fields and paddings. The numbers given always add up to a word length on the target architecture.
Files containing this kind of code most often have ".bf" as their extension, while ".c" can occur too. Some files have #includes referring to ordinary C headers, while some ordinary C files have #includes referring to ".bf" files.
A quick glance at the tools directory in the Git repository found me bitfield_gen.py, which claims to be a code generator for "bitfield structures". I presume that's what .bf stands for.
There are some CMake functions for building bitfield targets in tools/helpers.cmake. That will probably make sense to people more familiar with CMake than I am.
The Bit Field Generator is documented here http://research.davidcock.fastmail.fm/papers/Cock_08.pdf
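As a rough illustration of the observation that the widths in a block add up to a word length, the sketch below sums the widths per block. It is a hypothetical helper, not part of bitfield_gen.py: it assumes only the keywords quoted in the question ("block", "field", "field_high", "padding") and "--" comments, and the block and field names in the sample are made up.

```python
import re

# Hypothetical helper, not part of bitfield_gen.py: it only understands the
# keywords quoted in the question and silently skips any other line.
ENTRY = re.compile(r"^(field|field_high)\s+(\w+)\s+(\d+)$|^padding\s+(\d+)$")
HEADER = re.compile(r"^block\s+(\w+)\s*\{$")

def block_widths(text):
    """Return {block_name: total_bits} for each block in a .bf-style text."""
    totals = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("--", 1)[0].strip()   # "--" starts a comment
        if not line:
            continue
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            totals[current] = 0
        elif line == "}":
            current = None
        elif current is not None:
            entry = ENTRY.match(line)
            if entry:
                # group(3) is a field width, group(4) a padding width
                totals[current] += int(entry.group(3) or entry.group(4))
    return totals

# Made-up names; the widths are the ones from the question's example.
SAMPLE = "\n".join([
    "-- example block in the style shown above",
    "block demo {",
    "    field a 1",
    "    field b 1",
    "    padding 8",
    "    field_high c 6",
    "}",
])

print(block_widths(SAMPLE))
```

Comparing each total against the target's word length would flag blocks that do not fill a whole word.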

Find write statement in Fortran

I'm using Fortran for my research and sometimes, for debugging purposes, someone will insert in the code something like this:
write(*,*) 'Variable x:', varx
The problem is that sometimes we forget to remove that statement from the code, and it becomes difficult to find where the output is coming from. I can usually get a good idea of where it is from the label 'Variable x', but sometimes that information is not present and I just see random numbers showing up.
One can imagine that doing a grep for write(*,*) is basically useless so I was wondering if there is an efficient way of finding my culprit, like forcing every call of write(*,*) to print a file and line number, or tracking stdout.
Thank you.
Intel's Fortran preprocessor defines a number of macros, such as __FILE__ and __LINE__, which will be replaced by, respectively, the file name (as a string) and line number (as an integer) when the preprocessor runs. For more details consult the documentation.
GFortran offers similar facilities, consult the documentation.
Perhaps your compiler offers similar capabilities.
As has been previously implied, there's no Fortran way (although there may be a compiler-specific approach) to change the behaviour of the write statement as you want. However, as your problem is more to do with handling (unintentionally produced) bad code, there are options.
If you can't easily find an unwanted write(*,*) in your code that suggests that you have many legitimate such statements. One solution is to reduce the count:
use an explicit format, rather than list-directed output (* as the format);
instead of * as the output unit, use output_unit from the intrinsic module iso_fortran_env.
[Having an explicit format for "proper" output is a good idea, anyway.]
If that fails, use your version control system to compare an old "good" version against the new "bad" version. Perhaps even have your version control system flag/block commits with new write(*,*)s.
And if all that still doesn't help, then the pre-processor macros previously mentioned could be a final resort.
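The version-control suggestion can be sketched as a small script that scans a unified diff and reports only newly added write(*,*) lines, with file name and line number. This is a minimal sketch under assumptions: the function name new_star_writes and the sample diff are made up, the input is assumed to be ordinary `diff -u` or `git diff` output, and commented-out writes are matched as well.

```python
import re

# List-directed write to the default unit, with arbitrary spacing and case.
WRITE_STAR = re.compile(r"write\s*\(\s*\*\s*,\s*\*\s*\)", re.IGNORECASE)
# Hunk header: "@@ -old_start,old_len +new_start,new_len @@"
HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")

def new_star_writes(diff_text):
    """Return (file, line_number, text) for each added write(*,*) line."""
    hits = []
    path = None
    new_line = 0
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            if path.startswith("b/"):      # git prefixes the new file with b/
                path = path[2:]
        elif line.startswith("--- "):
            continue
        else:
            hunk = HUNK.match(line)
            if hunk:
                new_line = int(hunk.group(1))
            elif line.startswith("+"):
                if WRITE_STAR.search(line[1:]):
                    hits.append((path, new_line, line[1:].strip()))
                new_line += 1
            elif line.startswith(("-", "\\")):
                pass                       # removed lines do not advance the new file
            else:
                new_line += 1              # context line
    return hits

# Made-up diff for illustration.
SAMPLE_DIFF = "\n".join([
    "diff --git a/solver.f90 b/solver.f90",
    "--- a/solver.f90",
    "+++ b/solver.f90",
    "@@ -10,3 +10,4 @@",
    "   x = x + dx",
    "+  write(*,*) varx",
    "   call step(x)",
    "   end do",
])

for found in new_star_writes(SAMPLE_DIFF):
    print(found)
```

Run against the diff between the "good" and "bad" revisions, this narrows the search to statements introduced since the output was last clean; the same check could be wired into a pre-commit hook.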

Is there a way to comment out blocks of code?

I recently got an assignment where I volunteer to teach kids a basic programming language. I chose Small Basic, as it's relatively easy to learn and teaches the basics of programs (if, for and while).
I haven't used it much before (I learned how to do if/for/while loops but that's about it) and was wondering if there's a way to comment out lines of code at a time. For example, in C# you can do this:
//Comment
//Comment
Or
/*Comment
Comment
Comment*/
Is there a way to do the latter in small basic? I know you can do this:
'Comment
'Comment
Etc, but can you do a ton of lines at once?
Just like in Visual Basic, there is no way to handle multi-line commenting all at once. You have to manually input each apostrophe.
If you want to comment out a correctly-formatted piece of code, one way to "comment" it out would be to place it in an If (False) ... EndIf block, but it's generally not recommended in any language.
Alternatively, turn the section of code into a dummy subroutine that is never called:
Sub dummy ... EndSub
The advantage of not commenting-out code is that SB will check the syntax and variable declarations for you, so once the code is reinstated it will work straight away.
Then again if you put 'random' stuff into that section you may feel this is a dis-advantage! :-)
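Since every line needs its own apostrophe, one workaround outside the Small Basic editor is to let a small script add or remove them. The sketch below is in Python purely for illustration; the function names comment_out and uncomment are made up, and the sample is an ordinary Small Basic loop.

```python
def comment_out(code):
    """Prefix every line with an apostrophe, Small Basic's comment marker."""
    return "\n".join("'" + line for line in code.splitlines())

def uncomment(code):
    """Strip one leading apostrophe from each line that has one."""
    return "\n".join(
        line[1:] if line.startswith("'") else line
        for line in code.splitlines()
    )

SAMPLE = "\n".join([
    "For i = 1 To 3",
    "  TextWindow.WriteLine(i)",
    "EndFor",
])

print(comment_out(SAMPLE))
```

Unlike the If (False) or dummy-subroutine tricks, text disabled this way is not syntax-checked, so it can hold unfinished code.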

Building a syntax checker

I am building an app that works like a compiler for my own scripting language. The user enters code and the output is another app.
So I need to tell the user which line is wrong and why.
But I don't know how to start.
I thought this:
Every line will start with a keyword, except for those that start with a variable. Anything that starts differently is wrong.
So, I can calculate the next valid entries and check them.
Also, I thought that I can check each line, but it's complex because I can have this
var varName { /* ... */ };
Or
var varName {
/* ... */
};
Or Even
var varName
{
/* ... */
};
So why not remove the line breaks and then check? Because I would lose the line number, which in this case is the most important piece of information.
Maybe I could create a map between the code with and without line breaks.
But first I would like to hear from you, if you already have experience with this or have any ideas.
Thanks
There are formal languages to describe syntax and semantics of the language and there are tools that will generate parsers out of these descriptions. I suggest reading on flex and bison for starters.
It'll be fairly complicated to write your own language. But totally doable.
To be able to recognize whether a line is wrong, in the syntactical sense, you'd need to build a parser.
The parser checks whether the sequence of tokens can be derived from the language's context-free grammar.
First you need to tokenize the file, then reconstruct it into a parse tree (to check syntax).
I took a class on this, CS 241. There's a very nice set of course notes in which this is all explained in detail.
https://github.com/christhomson/lecture-notes/blob/master/cs241.pdf
You should check out tools like lex, bison and yacc.
lex is a lexical analyser generator. It generates code that can be used for breaking the script into tokens (like numbers, keywords and so on).
bison and yacc are both parser generators. Both can be used for generating code for parsing your language (combining tokens to statements).
Just google tutorials for those tools.

Is there any disassembler which generates compilable assembly source code?

I would like to know whether there is any disassembler for the Windows platform that can generate assembly source code which can in turn be assembled again.
Since a disassembler can generate assembly code from an EXE file, is it possible to use that assembly code directly as source code and build it with an assembler like NASM?
IDA can generate the source code. But in most cases you can't edit it. Assume the following code:
loc_401020:
ret
; ...
dd 0FFFFFFFFh, 0, 1, 401020h, 0
; ^^^^^^^ can you find it in big real program?
To insert any new bytes you must either be sure that every sub_XXXX and loc_XXXX stays at the same offset, or replace every reference to those labels, including raw ones like the 401020h above.
If you don't move any code, you don't need to recompile it - just patch and maybe extend the code section.
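The problem with the raw 401020h in the listing can be shown with a toy layout model. This is a sketch under assumptions: the load address, chunk sizes and the names layout and resolve are made up for illustration and have nothing to do with how IDA works internally.

```python
BASE = 0x401000  # assumed load address, chosen so the label lands on 0x401020

def layout(chunks):
    """Map each chunk name to its address, given (name, size_in_bytes) pairs."""
    addresses, cursor = {}, BASE
    for name, size in chunks:
        addresses[name] = cursor
        cursor += size
    return addresses

def resolve(table, addresses):
    """Replace symbolic entries (strings) by addresses; keep raw integers."""
    return [addresses[e] if isinstance(e, str) else e for e in table]

original = [("start", 0x20), ("loc_401020", 1)]
patched = [("start", 0x20), ("new_code", 4), ("loc_401020", 1)]  # 4 bytes inserted

raw_table = [0xFFFFFFFF, 0, 1, 0x401020, 0]            # as in the listing
symbolic_table = [0xFFFFFFFF, 0, 1, "loc_401020", 0]   # reference made explicit

print(hex(layout(patched)["loc_401020"]))
print(hex(resolve(raw_table, layout(patched))[3]))
print(hex(resolve(symbolic_table, layout(patched))[3]))
```

After the insertion the label has moved, the symbolic entry follows it, and the raw constant still points at the old address. Every such constant in a real program has to be recognised as an address and rewritten by hand.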
I think IDA is quite good at this.
Anyway, the main problem would be that the generated assembly code would be quite unreadable and very hard to maintain (no variable names, function names or signatures), so although technically it would be ASM code, it's still better to use IDA as a clever editor to deduce this information.
I'd patch the executable directly in a debugger and then save the modified executable.
Disassembling and then reassembling is a fragile process, since any change in the position of the reassembled code can break the program. And I think there are multiple binary encodings for certain asm instructions, complicating the matter even further.
You are losing some information when disassembling an executable so you are unlikely to get a fully working executable when assembling the disassembly. If you are clever, you can extract single functions from an executable, but not the whole program.
The objconv disassembler can produce assembly code in masm, nasm, yasm and gas syntax.
