How do compilers handle records and unions? [duplicate]


Closed 12 years ago.
Possible Duplicate:
How does a compiled C++ class look like?
Hi all,
bash$ cat struct.c
struct test
{
    int i;
    float f;
};
bash$ gcc -c struct.c
The object file struct.o is in ELF format. I am trying to understand what this object file contains. The source code is just a definition of a struct. There is nothing executable here, so there should be nothing in the text section, and there is no real data either.
So where does the definition of struct go really?
I tried using:
readelf -a struct.o
objdump -s struct.o
but I don't quite understand the output.
Thanks,
Jagrati

So where does the definition of struct go really?
Struct definition usually goes to /dev/null. C does not have any introspection features, so the struct definition is not needed at run time. During compilation, accesses to struct fields are converted to numeric offsets, e.g. x->f would be compiled to the equivalent of *(float *)((char *)x + offsetof(struct test, f)). That's why you need to include the headers every time you use the struct.

There is nothing. It does not exist. You have created nothing and used nothing.
The definition of the struct is used at compile time. That definition would normally be placed in a non-compiled header file. It is when a struct is used that some code is generated. The definition affects what the compiler produces at that point.
This, among other reasons, is why compiling against one version of a library and then using another version at runtime can crash programs.

structs are not compiled, they are declared. Functions get compiled though.

I'm not an expert and I can't actually answer the question... But I thought of this.
Memory is memory: if you use 1 byte as an integer or a char, it is still one byte. The result depends only on the compiler.
So, why can't it be the same for structs? I mean, the compiler will probably calculate the memory to allocate (your computer will probably allocate WORDS of memory, not bytes, so if your struct is 1 byte long, probably 3 bytes will be added to allow the allocation of a 4-byte word), and then the struct will just be a "reference" for you when accessing data.
I think that there is no need to actually HAVE something underneath: it's sufficient for the compiler to know, at compile time, that if you refer to the field "name" of your struct, it shall treat it as an array of chars of length X.
As I said, I'm no expert in such internals, but as I see it, there is no need for a struct to be converted into "real code"... It's just an annotation for the compiler, which can be discarded after the compilation is done.

Related

Can register name be passed into assembly template in GCC inline assembly [duplicate]

I have recently started learning how to use the inline assembly in C Code and came across an interesting feature where you can specify registers for local variables (https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables).
The usage of this feature is as follows:
register int *foo asm ("r12");
Then I started to wonder whether it was possible to insert a char pointer such as
const char d[4] = "r12";
register int *foo asm (d);
but got the error: expected string literal before ‘d’ (as expected)
I can understand why this would be a bad practice, but is there any possible way to achieve a similar effect where I can use a char pointer to access the register? If not, is there any particular reason why this is not allowed besides the potential security issues?
Additionally, I read this StackOverflow question: String literals: pointer vs. char array
Thank you.

The syntax to initialize the variable would be register char *foo asm ("r12") = d; to point an asm-register variable at a string. You can't use a runtime-variable string as the register name; register choices have to get assembled into machine code at compile time.
If that's what you're trying to do, you're misunderstanding something fundamental about assembly language and/or how ahead-of-time compiled languages compile into machine code. GCC won't make self-modifying code (and even if it wanted to, doing that safely would require redoing register allocation done by the ahead-of-time optimizer), or code that re-JITs itself based on a string.
(The first time I looked at your question, I didn't understand what you were even trying to do, because I was only considering things that are possible. @FelixG's comment was the clue I needed to make sense of the question.)
(Also note that registers aren't indexable; even in asm you can't use a single instruction to read a register number selected by an integer in another register. You could branch on it, or store all the registers in memory and index that like variadic functions do for their incoming register args.)
And if you do want a compile-time constant string literal, just use it with the normal syntax. Use a CPP macro if you want the same string to initialize a char array.

Purpose of using Windows Data Types in a program

I am trying to understand the purpose of using Windows Data Types when defining parameters of a function/structure fields in a particular language. I've read explanations detailing how this prevents code from "breaking" if "underlying types" are changed. Can some one present a concise explanation and example to clarify? Thanks.

Found answer in a similar post (Why are the standard datatypes not used in Win32 API?):
And the reason that these types are defined the way they are, rather than using int, char and so on is that it removes the "whatever the compiler thinks an int should be sized as" from the interface of the OS. Which is a very good thing, because if you use compiler A, or compiler B, or compiler C, they will all use the same types - only the library interface header file needs to do the right thing defining the types.
By defining types that are not standard types, it's easy to change int from 16 to 32 bit, for example. The first C/C++ compilers for Windows were using 16-bit integers. It was only in the mid to late 1990's that Windows got a 32-bit API, and up until that point, you were using int that was 16-bit. Imagine that you have a well-working program that uses several hundred int variables, and all of a sudden, you have to change ALL of those variables to something else... Wouldn't be very nice, right - especially as SOME of those variables DON'T need changing, because moving to a 32-bit int for some of your code won't make any difference, so no point in changing those bits.
It should be noted that WCHAR is NOT the same as const char - WCHAR is a "wide char" so wchar_t is the comparable type.
So, basically, "define our own type" is a way to guarantee that it's possible to change the underlying compiler architecture without having to change (much of) the source code. All larger projects that do machine-dependent coding do this sort of thing.

Can I define C functions that accept native Go types through CGo?

For the work I'm doing to integrate with an existing library, I ended up needing to write some additional C code to provide an interface that was usable through CGo.
In order to avoid redundant data copies, I would like to be able to pass some standard Go types (e.g. Go strings) to these C adapter functions.
I can see that there are GoString and GoInterface types defined in the header CGo generates for use by exported Go functions, but is there any way to use these types in my own function prototypes that CGo will recognise?
At the moment, I've ended up using void * in the C prototypes and passing unsafe.Pointer(&value) on the Go side. This is less clean than I'd like though (for one thing, it gives the C code the ability to write to the value).
Update:
Just to be clear, I do know the difference between Go's native string type and C char *. My point is that since I will be copying the string data passed into my C function anyway, it doesn't make sense to have the code on the Go side make its own copy.
I also understand that the string layout could change in a future version of Go, and its size may differ by platform. But CGo is already exposing type definitions that match the current platform to me via the documented _cgo_export.h header it generates for me, so it seems a bit odd to talk of it being unspecified:
typedef struct { char *p; int n; } GoString;
But there doesn't seem to be a way to use this definition in prototypes visible to CGo. I'm not overly worried about binary compatibility, since the code making use of this definition would be part of my Go package, so source level compatibility would be enough (and it wouldn't be that big a deal to update the package if that wasn't the case).

Not really. You cannot safely mix, for example, Go strings (string) and C "strings" (char *) without using the provided helpers for that, i.e. GoString and CString. The reason is that to conform to the language specs a full copy of the string's content between the Go and C worlds must be made. Not only that, the garbage collector must know what to consider (Go strings) and what to ignore (C strings). And there are even more things to do about this, but let me keep it simple here.
Similar and/or other restrictions/problems apply to other Go "magical" types, like map or interface{} types. In the interface types case (but not only it), it's important to realize that the inner implementation of an interface{} (again not only this type), is not specified and is implementation specific.
That's not only about the possible differences between, say gc and gccgo. It also means that your code will break at any time the compiler developers decide to change some detail of the (unspecified and thus non guaranteed) implementation.
Additionally, even though Go doesn't (now) use a compacting garbage collector, it may change and without some pinning mechanism, any code accessing Go run time stuff directly will be again doomed.
Conclusion: Pass only simple entities as arguments to C functions. POD structs with simple fields are safe as well (pointer fields generally are not). For the complex Go types, use the provided helpers for Go strings; they exist for a (very good) reason.

Passing a Go string to C is harder than it should be. There is no really good way to do it today. See https://golang.org/issue/6907.
The best approach I know of today is
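// #include <stddef.h>
// typedef struct { const char *p; ptrdiff_t n; } gostring;
// extern void CFunc(gostring s);
import "C"
import "unsafe"
func GoFunc(s string) {
	// Reinterpret the Go string header as the C struct (layout not guaranteed).
	C.CFunc(*(*C.gostring)(unsafe.Pointer(&s)))
}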
This of course assumes that Go representation of a string value will not change, which is not guaranteed.

Where Is gcvt or gcvtf Defined in gcc Source Code?

I'm working on some old source code for an embedded system on an m68k target, and I'm seeing massive memory allocation requests sometimes when calling gcvtf to format a floating point number for display. I can probably work around this by writing my own substitute routine, but the nature of the error has me very curious, because it only occurs when the heap starts at or above a certain address, and it goes away if I hack the .ld linker script or remove any set of global variables (which are placed before the heap in my memory map) that add up to enough byte size so that the heap starts below the mysterious critical address.
So, I thought I'd look in the gcc source code for the compiler version I'm using (m68k-elf-gcc 3.3.2). I downloaded what appears to be the source for this version at http://gcc.petsads.us/releases/gcc-3.3.2/, but I can't find the definition for gcvt or gcvtf anywhere in there. When I search for it, grep only finds some documentation and .h references, but not the definition:
$ find | xargs grep gcvt
./gcc/doc/gcc.info: C library functions `ecvt', `fcvt' and `gcvt'. Given valid
./gcc/doc/trouble.texi:library functions @code{ecvt}, @code{fcvt} and @code{gcvt}. Given valid
./gcc/sys-protos.h:extern char * gcvt(double, int, char *);
So, where is this function actually defined in the source code? Or did I download the entirely wrong thing?
I don't want to change this project to use the most recent gcc, due to project stability and testing considerations, and like I said, I can work around this by writing my own formatting routine, but this behavior is very confusing to me, and it will grind my brain if I don't find out why it's acting so weird.

Wallyk is correct that this is defined in the C library rather than the compiler. However, the GNU C library is (nearly always) only used with Linux compilers and distributions. Your compiler, being a "bare-metal" compiler, almost certainly uses the Newlib C library instead.
The main website for Newlib is here: http://sourceware.org/newlib/, and this particular function is defined in the newlib/libc/stdlib/efgcvt.c file. The sources have been quite stable for a long time, so (unless this is a result of a bug) chances are pretty good that the current sources are not too different from what your compiler is using.
As with the GNU C source, I don't see anything in there that would obviously cause this weirdness that you're seeing, but it's all eventually a bunch of wrappers around the basic sprintf routines.

It is in the GNU C library as glibc/misc/efgcvt.c. To save you some trouble, the code for the function is:
char *
__APPEND (FUNC_PREFIX, gcvt) (value, ndigit, buf)
FLOAT_TYPE value;
int ndigit;
char *buf;
{
sprintf (buf, "%.*" FLOAT_FMT_FLAG "g", MIN (ndigit, NDIGIT_MAX), value);
return buf;
}
The directions for obtaining glibc are here.

Looking for C source code for snprintf()

I need to port snprintf() to another platform that does not fully support glibc.
I am looking for the underlying definition in the glibc 2.14 source code. I followed many function calls, but got stuck on vfprintf(). It then seems to call _IO_vfprintf(), but I cannot find the definition. Probably a macro is obfuscating things.
I need to see the real C code that scans the format string and calculates the number of bytes it would write if the input buffer were large enough.
I also tried looking in newlib 1.19.0, but I got stuck on _svfprintf_r(). I cannot find the definition anywhere.
Can someone point me to either definition or another one for snprintf()?

I've spent quite a while digging through the sources to find the _svfprintf_r() (and friends) definitions in Newlib. Since the OP asked about it, I'll post my findings for the poor souls who need those as well. The following holds true for Newlib 1.20.0, but I guess it is more or less the same across different versions.
The actual sources are located in the vfprintf.c file. There is a macro _VFPRINTF_R set to one of _svfiprintf_r, _vfiprintf_r, _svfprintf_r, or _vfprintf_r (depending on the build options), and then the actual implementation function is defined accordingly:
int
_DEFUN(_VFPRINTF_R, (data, fp, fmt0, ap),
struct _reent *data _AND
FILE * fp _AND
_CONST char *fmt0 _AND
va_list ap)
{
...

http://www.ijs.si/software/snprintf/ has what they claim is a portable implementation of snprintf, including vsnprintf.c, asnprintf, vasnprintf, asprintf, vasprintf. Perhaps it can help.

The source code of the GNU C library (glibc) is hosted on sourceware.org.
Here is a link to the implementation of vfprintf(), which is called by snprintf():
https://sourceware.org/git/?p=glibc.git;a=blob;f=stdio-common/vfprintf.c
