What is FILE_H called in the include guard?

In the file file.h, the following code is seen.
#ifndef FILE_H
#define FILE_H
...
...
#endif
QUESTION: Who generated FILE_H (is FILE_H called an identifier)? What is this naming convention called? What should I read to understand more about this?
At the moment, I know this is called an include guard and that it has to do with the preprocessor, but I can't seem to google further. Any links would be highly appreciated.

Include guards are not actually a feature of the language itself, they're just a way to ensure the same header file isn't included twice into the same translation unit, and they're built from lower-level language features.
The actual features that make this possible are macro replacement (specifically object-like macros) and conditional inclusion.
Hence, to find out where FILE_H comes from, you need to examine two things.
The first is the limitations imposed by the standard (C11 §6.4.2, for example). In macro replacement, the macro name must be drawn from a limited character set; the mandated minimum includes the upper- and lower-case letters, the underscore, and the digits (all sorts of extras may be allowed, such as universal character names or other implementation-defined characters, but that is the baseline).
The second is the mind of the developer. Beyond the constraints of the standard, the developer must provide a unique identifier for the include guard, and the easiest way to do this is to derive it somehow from the file name itself. Hence, one common practice is to use the upper-cased file name with . replaced by an underscore.
That's why you'll end up with an include guard for btree.h being of the form:
#ifndef BTREE_H
#define BTREE_H
// weave your magic here
#endif
You should keep in mind, however, that this doesn't always work out well. Sometimes you may end up with two similarly named header files that use the same include-guard name, resulting in one of the header files not being included at all. This happens infrequently enough that it's usually not worth worrying about.
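If collisions do concern you, one common mitigation is to fold a project or path prefix into the guard name. A minimal sketch (the MYPROJ_UTIL_ prefix and the path here are hypothetical):
/* myproj/util/btree.h -- the guard name is derived from project,
 * path, and file name, so another btree.h elsewhere is unlikely
 * to collide with it */
#ifndef MYPROJ_UTIL_BTREE_H
#define MYPROJ_UTIL_BTREE_H

/* declarations go here */

#endif /* MYPROJ_UTIL_BTREE_H */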


Why do Julia programmers need to prefix macros with the at-sign?

Whenever I see a Julia macro in use like @assert or @time I'm always wondering about the need to distinguish a macro syntactically with the @ prefix. What should I be thinking of when using @ for a macro? For me it adds noise and distraction to an otherwise very nice language (syntactically speaking).
I mean, for me '@' has a meaning of reference, i.e. a location like a domain or address. In the location sense @ does not have a meaning for macros other than that it marks a different compilation step.
The @ should be seen as a warning sign which indicates that the normal rules of the language might not apply. E.g., a function call
f(x)
will never modify the value of the variable x in the calling context, but a macro invocation
@mymacro x
(or @mymacro f(x) for that matter) very well might.
Another reason is that macros in Julia are not based on textual substitution as in C, but substitution in the abstract syntax tree (which is much more powerful and avoids the unexpected consequences that textual substitution macros are notorious for).
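To make the contrast concrete, here is the classic C pitfall (a hedged illustration; SQUARE is a made-up macro): because the substitution is purely textual, the argument is pasted in verbatim and operator precedence silently changes the meaning. A Julia macro would instead receive the whole argument as a single expression node.
#include <stdio.h>

/* Textual substitution: the body is pasted in verbatim */
#define SQUARE(x) x * x

int main(void)
{
    /* Expands to 2 + 1 * 2 + 1, i.e. 5 -- not (2 + 1) * (2 + 1) == 9 */
    printf("%d\n", SQUARE(2 + 1));
    return 0;
}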
Macros have special syntax in Julia, and since they are expanded after parse time, the parser also needs an unambiguous way to recognise them
(without knowing which macros have been defined in the current scope).
ASCII characters are a precious resource in the design of most programming languages, Julia very much included. I would guess that the choice of @ mostly comes down to the fact that it was not needed for something more important, and that it stands out pretty well.
Symbols always need to be interpreted within the context they are used. Having multiple meanings for symbols, across contexts, is not new and will probably never go away. For example, no one should expect #include in a C program to go viral on Twitter.
Julia's Documentation entry Hold up: why macros? explains pretty well some of the things you might keep in mind while writing and/or using macros.
Here are a few snippets:
Macros are necessary because they execute when code is parsed,
therefore, macros allow the programmer to generate and include
fragments of customized code before the full program is run.
...
It is important to emphasize that macros receive their arguments as
expressions, literals, or symbols.
So, if a macro is called with an expression, it gets the whole expression, not just the result.
...
In place of the written syntax, the macro call is expanded at parse
time to its returned result.
It actually fits quite nicely with the semantics of the @ symbol on its own.
If we look up the Wikipedia entry for 'At symbol' we find that it is often used as a replacement for the preposition 'at' (yes, it even reads 'at'). And the preposition 'at' is used to express a spatial or temporal relation.
Because of that we can use the @ symbol as an abbreviation for the preposition 'at' to refer to a spatial relation, i.e. a location like @tony's bar, @france, etc., or to some memory location @0x50FA2C (e.g. for pointers/addresses), or to the receiver of a message (@user0851, which Twitter and other forums use), but also for a temporal relation, i.e. @05:00am, @midnight, @compile_time or @parse_time.
And macros are processed at parse time (there you have it), which is totally distinct from the rest of the code, which is evaluated at run time (yes, there are many phases in between, but that's not the point here).
So, to explicitly direct the programmer's attention to the fact that the following code fragment is processed at parse time, as opposed to run time, we use @.
For me this explanation fits nicely in the language.
thanks@all ;)

Prefixing printk / pr_* calls

I would like to prefix my driver's (debug) output with its name, i.e. [myDriver] Actual message. Since it is tiresome to write printk(level NAMEMACRO "Actual message\n") every time, I was thinking of overriding printk/pr_* to include the [myDriver] part automatically. However, I cannot think of a way to do this. In the best case the solution would not force me to change the printk/pr_* calls in the code (with changed calls this becomes trivial).
Is this possible? (Since I include other headers which in turn include the printk header, it will always be defined; this rules out not linking to the original, as suggested in a different SO answer.)
Are there any reasons why current drivers do not add this to the text? (Is there another way to filter dmesg by driver?)
I am somewhat aware of dev_dbg, but I have not found anything dev-specific for warnings in general, so I will use printk/pr_err for that.
It's standard to use pr_{debug,warn,err}() with [drivername] prefixed.
ex:
pr_debug("kvm: matched tsc offset for %llu\n", data);
Alternatively you can use dev_warn()
ex:
dev_warn(&adap->dev, "Bus may be unreliable\n");
Is there another way to filter dmesg by driver?
Not unless you want to run dmesg -c to clear the logs before getting your driver loaded. It's always recommended to prefix the driver name in your debug/print messages: when you receive logs from customers, you don't want to waste time reading through each line manually.
The relevant answer (found in the duplicate) is to #define pr_fmt (code from the duplicate question linked above):
/* At the top of the file, before any includes */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/printk.h>
As an additional note, if I include variables, sometimes pr_fmt is not automatically applied for me. Manual use, as in printk(pr_fmt("message %p"), (void *)ptr), fixes those occasions and still adheres to the convention of defining pr_fmt.
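Putting the pieces together, a minimal sketch of a module using this convention (the module name mydriver is hypothetical; KBUILD_MODNAME is supplied by the kernel build system):
/* Must come before any includes so it applies to the pr_* macros */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/module.h>
#include <linux/printk.h>

static int __init mydriver_init(void)
{
	pr_info("loaded\n");           /* logs "mydriver: loaded" */
	pr_warn("bus may be flaky\n"); /* logs "mydriver: bus may be flaky" */
	return 0;
}
module_init(mydriver_init);

MODULE_LICENSE("GPL");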
Since I did not find the duplicate question myself, I will leave this question up for other googlers like me.

Using precompiled headers in gcc

I am investigating using precompiled headers to reduce our compile times.
I have read the documentation on the subject here: https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html, where I read the following:
Only one precompiled header can be used in a particular compilation.
On the project whose build time I would like to improve, there are often very long lists of includes. The above leads me to think that to get the most performance improvement, I would have to make a collection of common includes, put them into a single header file, compile it, and include that header file.
On the other hand, I prefer to list my dependencies in a particular file explicitly, so I would be inclined to include first the precompiled header, followed by the manual list of actual header files.
I have two questions related to this:
Is my analysis and approach correct? Have I interpreted the statement correctly?
Doing this, I will use this file (say stdafx.h) in many places, thereby including files I don't need. However, I would like to explicitly list my dependencies for code documentation purposes.
Were I to do something like the following:
#ifdef USE_PRECOMPILED_HEADERS
#include "stdafx.h"
#else
#include "dep1.h"
#include "dep2.h"
#endif
I could periodically run a build without precompiled headers to check that all my dependencies are listed. This is a bit clunky, however. Does anyone have a better solution?
If anyone has information to help us obtain better results in our investigation, I am happy to hear it.
Yes, your observation is absolutely fine!
You "would have to make a collection of common includes, put them into a single Header file, compile and include that Header file". This common header file is generally named as stdafx.h (although you can name it anything you want!)
I am afraid I don't really understand this part of the question.
EDIT:
Do you also want the standard headers (like iostream, map, vector, etc.) to be included as dependencies in the code documentation?
Generally this should be a NO. Hence, you should put into stdafx.h only those header files which are not under your control (i.e., [1] standard language includes, [2] includes from dependent modules (mostly exposed interface headers)). All other includes (whose source is in the current project/module) should be explicitly included in each header file wherever required, and not put in the precompiled stdafx.h.
The above leads me to think that to get the most performance
improvements, I would have to make a collection of common
includes, put them into a single header file, compile and
include that header file.
Yes, this observation is correct: You put most (all?) includes in one single header file, which is then precompiled.
Which, in turn, means that...
any compilation without the aid of that header being precompiled will take ages;
you are relying on naming conventions or other means (documentation?) to make the connection between things referenced in an individual translation unit and their declarations.
I don't much like precompiled headers for those reasons...

How can I force the order of functions in a binary with the gcc toolchain?

I'm building a static binary out of several source files and libraries, and I want to control the order in which the functions are put into the resulting binary.
The background is, I have external code which is linked against offsets in this binary. Now if I change the source, all the offsets change because gcc may decide to order the functions differently, so I want to put the referenced functions at the beginning in a fixed order so their offsets stay unchanged...
I looked through ld's documentation but couldn't find anything about order of functions.
The only thing I found was -fno-toplevel-reorder, which doesn't really help me.
There is really no clean and reliable way of forcing a function to a particular address (except for the entry function), or even of forcing functions into a particular order (and even if you could enforce the order, that would still not mean the addresses stay the same when the source is changed!).
The biggest problem I see is that even if it may be possible to fix a function to some address, it will be all but impossible to fix all of them to exactly the addresses that the already existing external program expects (assuming you cannot modify this program). If that actually worked, it would be total coincidence and sheer luck.
It might be easiest to provide trampolines at the addresses that the other program expects, and have the real functions (wherever they may be) pointed to by these. That would require your code to use a different base address, so the actual program code doesn't collide with the trampolines.
There are three things that almost work for giving functions fixed addresses:
You can place each function that isn't allowed to move in its proper section using __attribute__ ((section ("some name"))) (see the sketch after this list). Unfortunately, .text always appears as the first section, so if anything in .text changes such that its size crosses a 512-byte boundary, your offsets will change. By default (but see below) you can't get a section to start before .text.
The -falign-functions=n commandline option lets you align functions to a boundary. Normally this is something around 16 bytes. Now, you could choose a large value like for example 1024. That will waste an immense amount of space, but it will also make sure that as long as functions only change moderately, the addresses of the following functions will remain the same. Obviously it still does not prevent the compiler/linker from reordering entire blocks when it feels like it (though -fno-toplevel-reorder will prevent this at least partially).
If you are willing to write a custom linker script, you can assign a start address for each section. These are virtual memory addresses, not positions in the executable, but I assume the hard linking works with VMAs (based on the default image base) too. So that could kind of work, although with much trouble and not in a pretty way.
When writing your own linker script, you could also consider putting the functions that must not move into their own sections and moving these sections at the beginning of the executable (in front of .text), so changes in .text won't move your functions around.
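A hedged sketch of the section-placement idea from the first and last items above (the section name .fixedfuncs and the address are invented; the linker-script fragment follows GNU ld syntax and would need adapting to your target):
/* Pin these functions into a dedicated input section */
__attribute__((section(".fixedfuncs")))
int exported_entry_a(int x) { return x + 1; }

__attribute__((section(".fixedfuncs")))
int exported_entry_b(int x) { return x * 2; }

/* Then, in a custom GNU ld script, place that section at a fixed
 * VMA ahead of .text, for example:
 *
 *     SECTIONS
 *     {
 *         . = 0x08048100;
 *         .fixedfuncs : { *(.fixedfuncs) }
 *         .text       : { *(.text) }
 *     }
 *
 * Within .fixedfuncs the functions keep their link-time input order,
 * so edits elsewhere in .text no longer move them. */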
Update:
The "gcc" tag suggests that you probably target *NIX, so again this is probably not going to help you, but... if you have the option to use COFF, dollar-sign sections might work (the info might be interesting for others, in any case).
I just stumbled across this today (emphasis mine):
The "$" character (dollar sign) has a special interpretation in section names in object files. When determining the image section that will contain the contents of an object section, the linker discards the "$" and all characters that follow it. Thus, an object section named .text$X actually contributes to the .text section in the image. However, the characters following the "$" determine the ordering of the contributions to the image section. All contributions with the same object-section name are allocated contiguously in the image, and the blocks of contributions are sorted in lexical order by object-section name. Therefore, everything in object files with section name .text$X ends up together, after the .text$W contributions and before the .text$Y contributions.
If the documentation does not lie (and if I'm not reading it wrong), this means you should be able to put all the functions that you want located at the front into one section .text$A, and everything else into .text$B, and it should do just that.
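If COFF is indeed an option, a minimal sketch of that (the function names are made up, and whether your toolchain honors the $ convention should be verified):
/* Contributions to .text$A are laid out before .text$B in the
 * image's .text section, per the quoted sorting rule */
__attribute__((section(".text$A")))
int stable_entry(int x) { return x; }

__attribute__((section(".text$B")))
int ordinary_helper(int x) { return x - 1; }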
Build your code with -ffunction-sections -- this will place each function into its own section.
If you are using GNU-ld, the linker script gives you absolute control, but is a very platform-specific and somewhat painful solution.
A better solution might be to use the recent work on gold, which allows exactly the function ordering you are seeking.
A lot of it comes down to the order the functions appear in the file and the order the files appear on the command line when you link.
Embed something in the code that your external code can find: a const structure with some ASCII magic and the addresses of the functions, perhaps. Then no matter where the compiler puts the functions, you can find them (a sketch follows below).
That, or use the normal .dll or .so mechanisms, and avoid having to mess with it at all.
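A minimal sketch of that marker idea (the magic string, struct layout, and function names are all invented; the external tool would scan the binary for the magic bytes and read the addresses stored after them):
int func_a(int x) { return x + 1; }
int func_b(int x) { return x * 2; }

/* Scanning the binary for "FNTBL01" locates this table, and with it
 * the current addresses of the functions, wherever they ended up */
struct func_table {
    char magic[8];
    void (*funcs[2])(void);
};

const struct func_table exported_table = {
    "FNTBL01",
    { (void (*)(void))func_a, (void (*)(void))func_b }
};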
In my experience, gcc -O0 will fix the binary order of functions to match the order in the source code.
However as others have mentioned, even if the order is fixed, the offsets can change as you modify the source code or upgrade your toolchain.

Using strings instead of symbols: good or evil?

Often enough, I find myself dealing with lists of function options (or more general replacement lists) of the form {foo->value,...}. This leads to bugs when foo already has a value in $Context. One obvious way to prevent this is to use a string "foo" instead of the symbol: {"foo"->value,...}. This works, but seems to draw the ire of some seasoned LISPers I know, who chastise me for conflating symbols and strings and tell me to use built-in quoting constructs.
While it is certainly possible to write code that avoids collisions without using strings, it often seems more trouble than it is worth. On the other hand, I haven't seen too many examples of {"string"->value} type replacement rules. So the question to you is -- is this an acceptable usage pattern?.. Are there cases where it is particularly appropriate?.. Where should it be avoided?..
In my opinion (disclaimer: it is only my opinion), it is best to avoid using strings as option names, at least for "main" options in your function. Strings OTOH are totally fine as settings (r.h.s. of options). This is not to say that you cannot use strings, just as you noted. Perhaps they are more appropriate for sub-options, and they are used in this way by many system functions (usually "superfunctions" like NDSolve, which may have sub-options within options).
The main problem I see with using strings is that they reduce the introspection capabilities, both for the system and for the user. In other words, it is harder to discover an option that has a string name than one with a symbol name: for the latter, I can just inspect the names of the symbols in a package, and symbolic option names also have usage messages. You may also want to automate some things, such as writing a utility that finds all option names in the package, etc. This is easier to do when option names are symbols, since they all belong to the same context. It is also easy to discover which options do not have usage messages; one can do that automatically with a utility function.
Finally, you may have better protection against accidental collisions of similar option names. Many option sequences may be passed to your function, and occasionally they may contain options with the same name. If the option names are symbols, their full names will be different; you will get a shadowing warning and, at the same time, protection, since only the correct (full) option name will be used. For strings, you don't get any warning, and you may end up using an incorrect option setting if a duplicate string option name with a wrong setting (intended for a different function, say) happens to come first in the list. This scenario is more likely in larger projects, and bugs like this are probably very hard to catch (this is a guess; I never had such a situation).
As for possible collisions: if you follow some naming conventions, such as option names always starting with a capital letter, put most of your code in packages, and do not start your variable or function names (for functions in the interactive session) with a capital letter, then you will greatly reduce the chance of such collisions. Additionally, you should Protect option names when you define them, or at the end of the package. Then collisions will be detected as cases of shadowing. Avoiding shadowing, OTOH, is a general necessity, so the case of options is no more special in this respect than that of function names, etc.
