C++ istream conditional direct initialization w/ ternary operator - c++11

Given this issue-exemplifing minimal code:
#include <fstream>
#include <string>
using std::ifstream;
using std::string;
string TEST_PATH;
string TRAIN_PATH;
enum DataContext
{
TRAINING, TESTING
};
void StreamData(DataContext context)
{
ifstream dataFile = (context == TRAINING) ? ifstream(TRAIN_PATH) : ifstream(TEST_PATH);
dataFile.close();
}
void StreamData2(DataContext context)
{
ifstream datafile;
datafile.open((context == TRAINING) ? TRAIN_PATH: TEST_PATH);
datafile.close();
}
int main()
{
StreamData(TRAINING);
StreamData2(TRAINING);
return 0;
}
I am implementing an AI Iterative Dichotomizer, but that isn't relevant to this issue. What is relevant are two external testfiles, one with training data and one with testing data. Certain control flow is governed by what type of data is being dealt with at the time, a simple enum is used to simplify such association.
StreamData aims to conditionally direct initialize an ifstream based on a check of the current context. It does so with the ternary operator. In Visual Studio(which means nothing), but also in cygwin g++ v5.1.0, the use of the StreamData function as written compiles successfully, and executes per my expectations.(in my real program, something is actually returned). However, in Debian, g++ v4.9.2, the use of that function produces an essay of compiler-ese. I won't copy the whole output, unless it becomes necessary, but the highlights include a warning that g++ is deleting the default ifstream constuctor because it would be ill-formed. Then a forceful criticism of attempting to call a deleted function.
"g++ -o TT -std=c++11 ./ThisCode.cpp" is the g++ command executed in both environments.
StreamData2 is a natural alternative to StreamData, and it compiles fine everywhere.
My problem is solved with StreamData2. I'm left with a question. Is StreamData valid C++11/use of the ternary operator? Is compilation failure with the lesser revision of g++ a bug/mal-implementation?
Thanks for any insight.

Related

How to use a static map of function pointer field in a template with using or typedef [duplicate]

Quote from The C++ standard library: a tutorial and handbook:
The only portable way of using templates at the moment is to implement them in header files by using inline functions.
Why is this?
(Clarification: header files are not the only portable solution. But they are the most convenient portable solution.)
Caveat: It is not necessary to put the implementation in the header file, see the alternative solution at the end of this answer.
Anyway, the reason your code is failing is that, when instantiating a template, the compiler creates a new class with the given template argument. For example:
template<typename T>
struct Foo
{
T bar;
void doSomething(T param) {/* do stuff using T */}
};
// somewhere in a .cpp
Foo<int> f;
When reading this line, the compiler will create a new class (let's call it FooInt), which is equivalent to the following:
struct FooInt
{
int bar;
void doSomething(int param) {/* do stuff using int */}
}
Consequently, the compiler needs to have access to the implementation of the methods, to instantiate them with the template argument (in this case int). If these implementations were not in the header, they wouldn't be accessible, and therefore the compiler wouldn't be able to instantiate the template.
A common solution to this is to write the template declaration in a header file, then implement the class in an implementation file (for example .tpp), and include this implementation file at the end of the header.
Foo.h
template <typename T>
struct Foo
{
void doSomething(T param);
};
#include "Foo.tpp"
Foo.tpp
template <typename T>
void Foo<T>::doSomething(T param)
{
//implementation
}
This way, implementation is still separated from declaration, but is accessible to the compiler.
Alternative solution
Another solution is to keep the implementation separated, and explicitly instantiate all the template instances you'll need:
Foo.h
// no implementation
template <typename T> struct Foo { ... };
Foo.cpp
// implementation of Foo's methods
// explicit instantiations
template class Foo<int>;
template class Foo<float>;
// You will only be able to use Foo with int or float
If my explanation isn't clear enough, you can have a look at the C++ Super-FAQ on this subject.
It's because of the requirement for separate compilation and because templates are instantiation-style polymorphism.
Lets get a little closer to concrete for an explanation. Say I've got the following files:
foo.h
declares the interface of class MyClass<T>
foo.cpp
defines the implementation of class MyClass<T>
bar.cpp
uses MyClass<int>
Separate compilation means I should be able to compile foo.cpp independently from bar.cpp. The compiler does all the hard work of analysis, optimization, and code generation on each compilation unit completely independently; we don't need to do whole-program analysis. It's only the linker that needs to handle the entire program at once, and the linker's job is substantially easier.
bar.cpp doesn't even need to exist when I compile foo.cpp, but I should still be able to link the foo.o I already had together with the bar.o I've only just produced, without needing to recompile foo.cpp. foo.cpp could even be compiled into a dynamic library, distributed somewhere else without foo.cpp, and linked with code they write years after I wrote foo.cpp.
"Instantiation-style polymorphism" means that the template MyClass<T> isn't really a generic class that can be compiled to code that can work for any value of T. That would add overhead such as boxing, needing to pass function pointers to allocators and constructors, etc. The intention of C++ templates is to avoid having to write nearly identical class MyClass_int, class MyClass_float, etc, but to still be able to end up with compiled code that is mostly as if we had written each version separately. So a template is literally a template; a class template is not a class, it's a recipe for creating a new class for each T we encounter. A template cannot be compiled into code, only the result of instantiating the template can be compiled.
So when foo.cpp is compiled, the compiler can't see bar.cpp to know that MyClass<int> is needed. It can see the template MyClass<T>, but it can't emit code for that (it's a template, not a class). And when bar.cpp is compiled, the compiler can see that it needs to create a MyClass<int>, but it can't see the template MyClass<T> (only its interface in foo.h) so it can't create it.
If foo.cpp itself uses MyClass<int>, then code for that will be generated while compiling foo.cpp, so when bar.o is linked to foo.o they can be hooked up and will work. We can use that fact to allow a finite set of template instantiations to be implemented in a .cpp file by writing a single template. But there's no way for bar.cpp to use the template as a template and instantiate it on whatever types it likes; it can only use pre-existing versions of the templated class that the author of foo.cpp thought to provide.
You might think that when compiling a template the compiler should "generate all versions", with the ones that are never used being filtered out during linking. Aside from the huge overhead and the extreme difficulties such an approach would face because "type modifier" features like pointers and arrays allow even just the built-in types to give rise to an infinite number of types, what happens when I now extend my program by adding:
baz.cpp
declares and implements class BazPrivate, and uses MyClass<BazPrivate>
There is no possible way that this could work unless we either
Have to recompile foo.cpp every time we change any other file in the program, in case it added a new novel instantiation of MyClass<T>
Require that baz.cpp contains (possibly via header includes) the full template of MyClass<T>, so that the compiler can generate MyClass<BazPrivate> during compilation of baz.cpp.
Nobody likes (1), because whole-program-analysis compilation systems take forever to compile , and because it makes it impossible to distribute compiled libraries without the source code. So we have (2) instead.
Plenty correct answers here, but I wanted to add this (for completeness):
If you, at the bottom of the implementation cpp file, do explicit instantiation of all the types the template will be used with, the linker will be able to find them as usual.
Edit: Adding example of explicit template instantiation. Used after the template has been defined, and all member functions has been defined.
template class vector<int>;
This will instantiate (and thus make available to the linker) the class and all its member functions (only). Similar syntax works for function templates, so if you have non-member operator overloads you may need to do the same for those.
The above example is fairly useless since vector is fully defined in headers, except when a common include file (precompiled header?) uses extern template class vector<int> so as to keep it from instantiating it in all the other (1000?) files that use vector.
Templates need to be instantiated by the compiler before actually compiling them into object code. This instantiation can only be achieved if the template arguments are known. Now imagine a scenario where a template function is declared in a.h, defined in a.cpp and used in b.cpp. When a.cpp is compiled, it is not necessarily known that the upcoming compilation b.cpp will require an instance of the template, let alone which specific instance would that be. For more header and source files, the situation can quickly get more complicated.
One can argue that compilers can be made smarter to "look ahead" for all uses of the template, but I'm sure that it wouldn't be difficult to create recursive or otherwise complicated scenarios. AFAIK, compilers don't do such look aheads. As Anton pointed out, some compilers support explicit export declarations of template instantiations, but not all compilers support it (yet?).
Actually, prior to C++11 the standard defined the export keyword that would make it possible to declare templates in a header file and implement them elsewhere. In a manner of speaking. Not really, as the only ones who ever implemented that feature pointed out:
Phantom advantage #1: Hiding source code. Many users, have said that they expect that by using export they will
no longer have to ship definitions for member/nonmember function templates and member functions of class
templates. This is not true. With export, library writers still have to ship full template source code or its direct
equivalent (e.g., a system-specific parse tree) because the full information is required for instantiation. [...]
Phantom advantage #2: Fast builds, reduced dependencies. Many users expect that export will allow true separate
compilation of templates to object code which they expect would allow faster builds. It doesn’t because the
compilation of exported templates is indeed separate but not to object code. Instead, export almost always makes
builds slower, because at least the same amount of compilation work must still be done at prelink time. Export
does not even reduce dependencies between template definitions because the dependencies are intrinsic,
independent of file organization.
None of the popular compilers implemented this keyword. The only implementation of the feature was in the frontend written by the Edison Design Group, which is used by the Comeau C++ compiler. All others required you to write templates in header files, because the compiler needs the template definition for proper instantiation (as others pointed out already).
As a result, the ISO C++ standard committee decided to remove the export feature of templates with C++11.
Although standard C++ has no such requirement, some compilers require that all function and class templates need to be made available in every translation unit they are used. In effect, for those compilers, the bodies of template functions must be made available in a header file. To repeat: that means those compilers won't allow them to be defined in non-header files such as .cpp files
There is an export keyword which is supposed to mitigate this problem, but it's nowhere close to being portable.
Templates are often used in headers because the compiler needs to instantiate different versions of the code, depending on the parameters given/deduced for template parameters, and it's easier (as a programmer) to let the compiler recompile the same code multiple times and deduplicate later.
Remember that a template doesn't represent code directly, but a template for several versions of that code.
When you compile a non-template function in a .cpp file, you are compiling a concrete function/class.
This is not the case for templates, which can be instantiated with different types, namely, concrete code must be emitted when replacing template parameters with concrete types.
There was a feature with the export keyword that was meant to be used for separate compilation.
The export feature is deprecated in C++11 and, AFAIK, only one compiler implemented it.
You shouldn't make use of export.
Separate compilation is not possible in C++ or C++11 but maybe in C++17, if concepts make it in, we could have some way of separate compilation.
For separate compilation to be achieved, separate template body checking must be possible.
It seems that a solution is possible with concepts.
Take a look at this paper recently presented at the
standards committee meeting.
I think this is not the only requirement, since you still need to instantiate code for the template code in user code.
The separate compilation problem for templates I guess it's also a problem that is arising with the migration to modules, which is currently being worked.
EDIT: As of August 2020 Modules are already a reality for C++: https://en.cppreference.com/w/cpp/language/modules
Even though there are plenty of good explanations above, I'm missing a practical way to separate templates into header and body.
My main concern is avoiding recompilation of all template users, when I change its definition.
Having all template instantiations in the template body is not a viable solution for me, since the template author may not know all if its usage and the template user may not have the right to modify it.
I took the following approach, which works also for older compilers (gcc 4.3.4, aCC A.03.13).
For each template usage there's a typedef in its own header file (generated from the UML model). Its body contains the instantiation (which ends up in a library which is linked in at the end).
Each user of the template includes that header file and uses the typedef.
A schematic example:
MyTemplate.h:
#ifndef MyTemplate_h
#define MyTemplate_h 1
template <class T>
class MyTemplate
{
public:
MyTemplate(const T& rt);
void dump();
T t;
};
#endif
MyTemplate.cpp:
#include "MyTemplate.h"
#include <iostream>
template <class T>
MyTemplate<T>::MyTemplate(const T& rt)
: t(rt)
{
}
template <class T>
void MyTemplate<T>::dump()
{
cerr << t << endl;
}
MyInstantiatedTemplate.h:
#ifndef MyInstantiatedTemplate_h
#define MyInstantiatedTemplate_h 1
#include "MyTemplate.h"
typedef MyTemplate< int > MyInstantiatedTemplate;
#endif
MyInstantiatedTemplate.cpp:
#include "MyTemplate.cpp"
template class MyTemplate< int >;
main.cpp:
#include "MyInstantiatedTemplate.h"
int main()
{
MyInstantiatedTemplate m(100);
m.dump();
return 0;
}
This way only the template instantiations will need to be recompiled, not all template users (and dependencies).
It means that the most portable way to define method implementations of template classes is to define them inside the template class definition.
template < typename ... >
class MyClass
{
int myMethod()
{
// Not just declaration. Add method implementation here
}
};
Just to add something noteworthy here. One can define methods of a templated class just fine in the implementation file when they are not function templates.
myQueue.hpp:
template <class T>
class QueueA {
int size;
...
public:
template <class T> T dequeue() {
// implementation here
}
bool isEmpty();
...
}
myQueue.cpp:
// implementation of regular methods goes like this:
template <class T> bool QueueA<T>::isEmpty() {
return this->size == 0;
}
main()
{
QueueA<char> Q;
...
}
The compiler will generate code for each template instantiation when you use a template during the compilation step.
In the compilation and linking process .cpp files are converted to pure object or machine code which in them contains references or undefined symbols because the .h files that are included in your main.cpp have no implementation YET. These are ready to be linked with another object file that defines an implementation for your template and thus you have a full a.out executable.
However since templates need to be processed in the compilation step in order to generate code for each template instantiation that you define, so simply compiling a template separate from it's header file won't work because they always go hand and hand, for the very reason that each template instantiation is a whole new class literally. In a regular class you can separate .h and .cpp because .h is a blueprint of that class and the .cpp is the raw implementation so any implementation files can be compiled and linked regularly, however using templates .h is a blueprint of how the class should look not how the object should look meaning a template .cpp file isn't a raw regular implementation of a class, it's simply a blueprint for a class, so any implementation of a .h template file can't be compiled because you need something concrete to compile, templates are abstract in that sense.
Therefore templates are never separately compiled and are only compiled wherever you have a concrete instantiation in some other source file. However, the concrete instantiation needs to know the implementation of the template file, because simply modifying the typename T using a concrete type in the .h file is not going to do the job because what .cpp is there to link, I can't find it later on because remember templates are abstract and can't be compiled, so I'm forced to give the implementation right now so I know what to compile and link, and now that I have the implementation it gets linked into the enclosing source file. Basically, the moment I instantiate a template I need to create a whole new class, and I can't do that if I don't know how that class should look like when using the type I provide unless I make notice to the compiler of the template implementation, so now the compiler can replace T with my type and create a concrete class that's ready to be compiled and linked.
To sum up, templates are blueprints for how classes should look, classes are blueprints for how an object should look.
I can't compile templates separate from their concrete instantiation because the compiler only compiles concrete types, in other words, templates at least in C++, is pure language abstraction. We have to de-abstract templates so to speak, and we do so by giving them a concrete type to deal with so that our template abstraction can transform into a regular class file and in turn, it can be compiled normally. Separating the template .h file and the template .cpp file is meaningless. It is nonsensical because the separation of .cpp and .h only is only where the .cpp can be compiled individually and linked individually, with templates since we can't compile them separately, because templates are an abstraction, therefore we are always forced to put the abstraction always together with the concrete instantiation where the concrete instantiation always has to know about the type being used.
Meaning typename T get's replaced during the compilation step not the linking step so if I try to compile a template without T being replaced as a concrete value type that is completely meaningless to the compiler and as a result object code can't be created because it doesn't know what T is.
It is technically possible to create some sort of functionality that will save the template.cpp file and switch out the types when it finds them in other sources, I think that the standard does have a keyword export that will allow you to put templates in a separate cpp file but not that many compilers actually implement this.
Just a side note, when making specializations for a template class, you can separate the header from the implementation because a specialization by definition means that I am specializing for a concrete type that can be compiled and linked individually.
If the concern is the extra compilation time and binary size bloat produced by compiling the .h as part of all the .cpp modules using it, in many cases what you can do is make the template class descend from a non-templatized base class for non type-dependent parts of the interface, and that base class can have its implementation in the .cpp file.
A way to have separate implementation is as follows.
inner_foo.h
template <typename T>
struct Foo
{
void doSomething(T param);
};
foo.tpp
#include "inner_foo.h"
template <typename T>
void Foo<T>::doSomething(T param)
{
//implementation
}
foo.h
#include <foo.tpp>
main.cpp
#include <foo.h>
inner_foo.h has the forward declarations. foo.tpp has the implementation and includes inner_foo.h; and foo.h will have just one line, to include foo.tpp.
On compile time, contents of foo.h are copied to foo.tpp and then the whole file is copied to foo.h after which it compiles. This way, there is no limitations, and the naming is consistent, in exchange for one extra file.
I do this because static analyzers for the code break when it does not see the forward declarations of class in *.tpp. This is annoying when writing code in any IDE or using YouCompleteMe or others.
That is exactly correct because the compiler has to know what type it is for allocation. So template classes, functions, enums,etc.. must be implemented as well in the header file if it is to be made public or part of a library (static or dynamic) because header files are NOT compiled unlike the c/cpp files which are. If the compiler doesn't know the type is can't compile it. In .Net it can because all objects derive from the Object class. This is not .Net.
I suggest looking at this gcc page which discusses the tradeoffs between the "cfront" and "borland" model for template instantiations.
https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/Template-Instantiation.html
The "borland" model corresponds to what the author suggests, providing the full template definition, and having things compiled multiple times.
It contains explicit recommendations concerning using manual and automatic template instantiation. For example, the "-repo" option can be used to collect templates which need to be instantiated. Or another option is to disable automatic template instantiations using "-fno-implicit-templates" to force manual template instantiation.
In my experience, I rely on the C++ Standard Library and Boost templates being instantiated for each compilation unit (using a template library). For my large template classes, I do manual template instantiation, once, for the types I need.
This is my approach because I am providing a working program, not a template library for use in other programs. The author of the book, Josuttis, works a lot on template libraries.
If I was really worried about speed, I suppose I would explore using Precompiled Headers
https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html
which is gaining support in many compilers. However, I think precompiled headers would be difficult with template header files.
Another reason that it's a good idea to write both declarations and definitions in header files is for readability. Suppose there's such a template function in Utility.h:
template <class T>
T min(T const& one, T const& theOther);
And in the Utility.cpp:
#include "Utility.h"
template <class T>
T min(T const& one, T const& other)
{
return one < other ? one : other;
}
This requires every T class here to implement the less than operator (<). It will throw a compiler error when you compare two class instances that haven't implemented the "<".
Therefore if you separate the template declaration and definition, you won't be able to only read the header file to see the ins and outs of this template in order to use this API on your own classes, though the compiler will tell you in this case about which operator needs to be overridden.
I had to write a template class an d this example worked for me
Here is an example of this for a dynamic array class.
#ifndef dynarray_h
#define dynarray_h
#include <iostream>
template <class T>
class DynArray{
int capacity_;
int size_;
T* data;
public:
explicit DynArray(int size = 0, int capacity=2);
DynArray(const DynArray& d1);
~DynArray();
T& operator[]( const int index);
void operator=(const DynArray<T>& d1);
int size();
int capacity();
void clear();
void push_back(int n);
void pop_back();
T& at(const int n);
T& back();
T& front();
};
#include "dynarray.template" // this is how you get the header file
#endif
Now inside you .template file you define your functions just how you normally would.
template <class T>
DynArray<T>::DynArray(int size, int capacity){
if (capacity >= size){
this->size_ = size;
this->capacity_ = capacity;
data = new T[capacity];
}
// for (int i = 0; i < size; ++i) {
// data[i] = 0;
// }
}
template <class T>
DynArray<T>::DynArray(const DynArray& d1){
//clear();
//delete [] data;
std::cout << "copy" << std::endl;
this->size_ = d1.size_;
this->capacity_ = d1.capacity_;
data = new T[capacity()];
for(int i = 0; i < size(); ++i){
data[i] = d1.data[i];
}
}
template <class T>
DynArray<T>::~DynArray(){
delete [] data;
}
template <class T>
T& DynArray<T>::operator[]( const int index){
return at(index);
}
template <class T>
void DynArray<T>::operator=(const DynArray<T>& d1){
if (this->size() > 0) {
clear();
}
std::cout << "assign" << std::endl;
this->size_ = d1.size_;
this->capacity_ = d1.capacity_;
data = new T[capacity()];
for(int i = 0; i < size(); ++i){
data[i] = d1.data[i];
}
//delete [] d1.data;
}
template <class T>
int DynArray<T>::size(){
return size_;
}
template <class T>
int DynArray<T>::capacity(){
return capacity_;
}
template <class T>
void DynArray<T>::clear(){
for( int i = 0; i < size(); ++i){
data[i] = 0;
}
size_ = 0;
capacity_ = 2;
}
template <class T>
void DynArray<T>::push_back(int n){
if (size() >= capacity()) {
std::cout << "grow" << std::endl;
//redo the array
T* copy = new T[capacity_ + 40];
for (int i = 0; i < size(); ++i) {
copy[i] = data[i];
}
delete [] data;
data = new T[ capacity_ * 2];
for (int i = 0; i < capacity() * 2; ++i) {
data[i] = copy[i];
}
delete [] copy;
capacity_ *= 2;
}
data[size()] = n;
++size_;
}
template <class T>
void DynArray<T>::pop_back(){
data[size()-1] = 0;
--size_;
}
template <class T>
T& DynArray<T>::at(const int n){
if (n >= size()) {
throw std::runtime_error("invalid index");
}
return data[n];
}
template <class T>
T& DynArray<T>::back(){
if (size() == 0) {
throw std::runtime_error("vector is empty");
}
return data[size()-1];
}
template <class T>
T& DynArray<T>::front(){
if (size() == 0) {
throw std::runtime_error("vector is empty");
}
return data[0];
}

Is there a way to make a moved object "invalid"?

I've some code that moves an object into another object. I won't need the original, moved object anymore in the upper level. Thus move is the right choice I think.
However, thinking about safety I wonder if there is a way to invalidate the moved object and thus preventing undefined behaviour if someone accesses it.
Here is a nice example:
// move example
#include <utility> // std::move
#include <vector> // std::vector
#include <string> // std::string
int main () {
std::string foo = "foo-string";
std::string bar = "bar-string";
std::vector<std::string> myvector;
myvector.push_back (foo); // copies
myvector.push_back (std::move(bar)); // moves
return 0;
}
The description says:
The first call to myvector.push_back copies the value of foo into the
vector (foo keeps the value it had before the call). The second call
moves the value of bar into the vector. This transfers its content
into the vector (while bar loses its value, and now is in a valid but
unspecified state).
Is there a way to invalidate bar, such that access to it will cause a compiler error? Something like:
myvector.push_back (std::move(bar)); // moves
invalidate(bar); //something like bar.end() will then result in a compiler error
Edit: And if there is no such thing, why?
Accessing the moved object is not undefined behavior. The moved object is still a valid object, and the program may very well want to continue using said object. For example,
template< typename T >
void swap_by_move(T &a, T &b)
{
using std::move;
T c = move(b);
b = move(a);
a = move(c);
}
The bigger picture answer is because moving or not moving is a decision made at runtime, and giving a compile-time error is a decision made at compile time.
foo(bar); // foo might move or not
bar.baz(); // compile time error or not?
It's not going to work.. you can approximate in compile time analysis, but then it's going to be really difficult for developers to either not get an error or making anything useful in order to keep a valid program or the developer has to make annoying and fragile annotations on functions called to promise not to move the argument.
To put it a different way, you are asking about having a compile time error if you use an integer variable that contains the value 42. Or if you use a pointer that contains a null pointer value. You might be succcessful in implementing an approximate build-time code convention checker using clang the analysis API, however, working on the CFG of the C++ AST and erroring out if you can't prove that std::move has not been called till a given use of a variable.
Move semantics works like that so you get an object in any it's correct state. Correct state means that all fields have correct value, and all internal invariants are still good. That was done because after move you don't actually care about contents of moved object, but stuff like resource management, assignments and destructors should work OK.
All STL classes (and all classed with default move constructor/assignment) just swap it's content with new one, so both states are correct, and it's very easy to implement, fast, and convinient enough.
You can define your class that has isValid field that's generally true and on move (i. e. in move constructor / move assignment) sets that to false. Then your object will have correct state I am invalid. Just don't forget to check it where needed (destructor, assignment etc).
That isValid field can be either one pointer having null value. The point is: you know, that object is in predictable state after move, not just random bytes in memory.
Edit: example of String:
class String {
public:
string data;
private:
bool m_isValid;
public:
String(string const& b): data(b.data), isValid(true) {}
String(String &&b): data(move(b.data)) {
b.m_isValid = false;
}
String const& operator =(String &&b) {
data = move(b.data);
b.m_isValid = false;
return &this;
}
bool isValid() {
return m_isValid;
}
}

Undefined reference to xxx only in one portion of the code [duplicate]

Quote from The C++ standard library: a tutorial and handbook:
The only portable way of using templates at the moment is to implement them in header files by using inline functions.
Why is this?
(Clarification: header files are not the only portable solution. But they are the most convenient portable solution.)
Caveat: It is not necessary to put the implementation in the header file, see the alternative solution at the end of this answer.
Anyway, the reason your code is failing is that, when instantiating a template, the compiler creates a new class with the given template argument. For example:
template<typename T>
struct Foo
{
T bar;
void doSomething(T param) {/* do stuff using T */}
};
// somewhere in a .cpp
Foo<int> f;
When reading this line, the compiler will create a new class (let's call it FooInt), which is equivalent to the following:
struct FooInt
{
int bar;
void doSomething(int param) {/* do stuff using int */}
}
Consequently, the compiler needs to have access to the implementation of the methods, to instantiate them with the template argument (in this case int). If these implementations were not in the header, they wouldn't be accessible, and therefore the compiler wouldn't be able to instantiate the template.
A common solution to this is to write the template declaration in a header file, then implement the class in an implementation file (for example .tpp), and include this implementation file at the end of the header.
Foo.h
template <typename T>
struct Foo
{
void doSomething(T param);
};
#include "Foo.tpp"
Foo.tpp
template <typename T>
void Foo<T>::doSomething(T param)
{
//implementation
}
This way, implementation is still separated from declaration, but is accessible to the compiler.
Alternative solution
Another solution is to keep the implementation separated, and explicitly instantiate all the template instances you'll need:
Foo.h
// no implementation
template <typename T> struct Foo { ... };
Foo.cpp
// implementation of Foo's methods
// explicit instantiations
template class Foo<int>;
template class Foo<float>;
// You will only be able to use Foo with int or float
If my explanation isn't clear enough, you can have a look at the C++ Super-FAQ on this subject.
It's because of the requirement for separate compilation and because templates are instantiation-style polymorphism.
Lets get a little closer to concrete for an explanation. Say I've got the following files:
foo.h
declares the interface of class MyClass<T>
foo.cpp
defines the implementation of class MyClass<T>
bar.cpp
uses MyClass<int>
Separate compilation means I should be able to compile foo.cpp independently from bar.cpp. The compiler does all the hard work of analysis, optimization, and code generation on each compilation unit completely independently; we don't need to do whole-program analysis. It's only the linker that needs to handle the entire program at once, and the linker's job is substantially easier.
bar.cpp doesn't even need to exist when I compile foo.cpp, but I should still be able to link the foo.o I already had together with the bar.o I've only just produced, without needing to recompile foo.cpp. foo.cpp could even be compiled into a dynamic library, distributed somewhere else without foo.cpp, and linked with code they write years after I wrote foo.cpp.
"Instantiation-style polymorphism" means that the template MyClass<T> isn't really a generic class that can be compiled to code that can work for any value of T. That would add overhead such as boxing, needing to pass function pointers to allocators and constructors, etc. The intention of C++ templates is to avoid having to write nearly identical class MyClass_int, class MyClass_float, etc, but to still be able to end up with compiled code that is mostly as if we had written each version separately. So a template is literally a template; a class template is not a class, it's a recipe for creating a new class for each T we encounter. A template cannot be compiled into code, only the result of instantiating the template can be compiled.
So when foo.cpp is compiled, the compiler can't see bar.cpp to know that MyClass<int> is needed. It can see the template MyClass<T>, but it can't emit code for that (it's a template, not a class). And when bar.cpp is compiled, the compiler can see that it needs to create a MyClass<int>, but it can't see the template MyClass<T> (only its interface in foo.h) so it can't create it.
If foo.cpp itself uses MyClass<int>, then code for that will be generated while compiling foo.cpp, so when bar.o is linked to foo.o they can be hooked up and will work. We can use that fact to allow a finite set of template instantiations to be implemented in a .cpp file by writing a single template. But there's no way for bar.cpp to use the template as a template and instantiate it on whatever types it likes; it can only use pre-existing versions of the templated class that the author of foo.cpp thought to provide.
You might think that when compiling a template the compiler should "generate all versions", with the ones that are never used being filtered out during linking. Aside from the huge overhead and the extreme difficulties such an approach would face because "type modifier" features like pointers and arrays allow even just the built-in types to give rise to an infinite number of types, what happens when I now extend my program by adding:
baz.cpp
declares and implements class BazPrivate, and uses MyClass<BazPrivate>
There is no possible way that this could work unless we either
Have to recompile foo.cpp every time we change any other file in the program, in case it added a new novel instantiation of MyClass<T>
Require that baz.cpp contains (possibly via header includes) the full template of MyClass<T>, so that the compiler can generate MyClass<BazPrivate> during compilation of baz.cpp.
Nobody likes (1), because whole-program-analysis compilation systems take forever to compile , and because it makes it impossible to distribute compiled libraries without the source code. So we have (2) instead.
Plenty correct answers here, but I wanted to add this (for completeness):
If you, at the bottom of the implementation cpp file, do explicit instantiation of all the types the template will be used with, the linker will be able to find them as usual.
Edit: Adding example of explicit template instantiation. Used after the template has been defined, and all member functions has been defined.
template class vector<int>;
This will instantiate (and thus make available to the linker) the class and all its member functions (only). Similar syntax works for function templates, so if you have non-member operator overloads you may need to do the same for those.
The above example is fairly useless since vector is fully defined in headers, except when a common include file (precompiled header?) uses extern template class vector<int> so as to keep it from instantiating it in all the other (1000?) files that use vector.
Templates need to be instantiated by the compiler before actually compiling them into object code. This instantiation can only be achieved if the template arguments are known. Now imagine a scenario where a template function is declared in a.h, defined in a.cpp and used in b.cpp. When a.cpp is compiled, it is not necessarily known that the upcoming compilation b.cpp will require an instance of the template, let alone which specific instance would that be. For more header and source files, the situation can quickly get more complicated.
One can argue that compilers can be made smarter to "look ahead" for all uses of the template, but I'm sure that it wouldn't be difficult to create recursive or otherwise complicated scenarios. AFAIK, compilers don't do such look aheads. As Anton pointed out, some compilers support explicit export declarations of template instantiations, but not all compilers support it (yet?).
Actually, prior to C++11 the standard defined the export keyword that would make it possible to declare templates in a header file and implement them elsewhere. In a manner of speaking. Not really, as the only ones who ever implemented that feature pointed out:
Phantom advantage #1: Hiding source code. Many users, have said that they expect that by using export they will
no longer have to ship definitions for member/nonmember function templates and member functions of class
templates. This is not true. With export, library writers still have to ship full template source code or its direct
equivalent (e.g., a system-specific parse tree) because the full information is required for instantiation. [...]
Phantom advantage #2: Fast builds, reduced dependencies. Many users expect that export will allow true separate
compilation of templates to object code which they expect would allow faster builds. It doesn’t because the
compilation of exported templates is indeed separate but not to object code. Instead, export almost always makes
builds slower, because at least the same amount of compilation work must still be done at prelink time. Export
does not even reduce dependencies between template definitions because the dependencies are intrinsic,
independent of file organization.
None of the popular compilers implemented this keyword. The only implementation of the feature was in the frontend written by the Edison Design Group, which is used by the Comeau C++ compiler. All others required you to write templates in header files, because the compiler needs the template definition for proper instantiation (as others pointed out already).
As a result, the ISO C++ standard committee decided to remove the export feature of templates with C++11.
Although standard C++ has no such requirement, some compilers require that all function and class templates need to be made available in every translation unit they are used. In effect, for those compilers, the bodies of template functions must be made available in a header file. To repeat: that means those compilers won't allow them to be defined in non-header files such as .cpp files
There is an export keyword which is supposed to mitigate this problem, but it's nowhere close to being portable.
Templates are often used in headers because the compiler needs to instantiate different versions of the code, depending on the parameters given/deduced for template parameters, and it's easier (as a programmer) to let the compiler recompile the same code multiple times and deduplicate later.
Remember that a template doesn't represent code directly, but a template for several versions of that code.
When you compile a non-template function in a .cpp file, you are compiling a concrete function/class.
This is not the case for templates, which can be instantiated with different types, namely, concrete code must be emitted when replacing template parameters with concrete types.
There was a feature with the export keyword that was meant to be used for separate compilation.
The export feature is deprecated in C++11 and, AFAIK, only one compiler implemented it.
You shouldn't make use of export.
Separate compilation is not possible in C++ or C++11 but maybe in C++17, if concepts make it in, we could have some way of separate compilation.
For separate compilation to be achieved, separate template body checking must be possible.
It seems that a solution is possible with concepts.
Take a look at this paper recently presented at the
standards committee meeting.
I think this is not the only requirement, since you still need to instantiate code for the template code in user code.
The separate compilation problem for templates I guess it's also a problem that is arising with the migration to modules, which is currently being worked.
EDIT: As of August 2020 Modules are already a reality for C++: https://en.cppreference.com/w/cpp/language/modules
Even though there are plenty of good explanations above, I'm missing a practical way to separate templates into header and body.
My main concern is avoiding recompilation of all template users, when I change its definition.
Having all template instantiations in the template body is not a viable solution for me, since the template author may not know all if its usage and the template user may not have the right to modify it.
I took the following approach, which works also for older compilers (gcc 4.3.4, aCC A.03.13).
For each template usage there's a typedef in its own header file (generated from the UML model). Its body contains the instantiation (which ends up in a library which is linked in at the end).
Each user of the template includes that header file and uses the typedef.
A schematic example:
MyTemplate.h:
#ifndef MyTemplate_h
#define MyTemplate_h 1
template <class T>
class MyTemplate
{
public:
MyTemplate(const T& rt);
void dump();
T t;
};
#endif
MyTemplate.cpp:
#include "MyTemplate.h"
#include <iostream>
template <class T>
MyTemplate<T>::MyTemplate(const T& rt)
: t(rt)
{
}
template <class T>
void MyTemplate<T>::dump()
{
cerr << t << endl;
}
MyInstantiatedTemplate.h:
#ifndef MyInstantiatedTemplate_h
#define MyInstantiatedTemplate_h 1
#include "MyTemplate.h"
typedef MyTemplate< int > MyInstantiatedTemplate;
#endif
MyInstantiatedTemplate.cpp:
#include "MyTemplate.cpp"
template class MyTemplate< int >;
main.cpp:
#include "MyInstantiatedTemplate.h"
int main()
{
MyInstantiatedTemplate m(100);
m.dump();
return 0;
}
This way only the template instantiations will need to be recompiled, not all template users (and dependencies).
It means that the most portable way to define method implementations of template classes is to define them inside the template class definition.
template < typename ... >
class MyClass
{
int myMethod()
{
// Not just declaration. Add method implementation here
}
};
Just to add something noteworthy here. One can define methods of a templated class just fine in the implementation file when they are not function templates.
myQueue.hpp:
template <class T>
class QueueA {
int size;
...
public:
template <class T> T dequeue() {
// implementation here
}
bool isEmpty();
...
}
myQueue.cpp:
// implementation of regular methods goes like this:
template <class T> bool QueueA<T>::isEmpty() {
return this->size == 0;
}
main()
{
QueueA<char> Q;
...
}
The compiler will generate code for each template instantiation when you use a template during the compilation step.
In the compilation and linking process .cpp files are converted to pure object or machine code which in them contains references or undefined symbols because the .h files that are included in your main.cpp have no implementation YET. These are ready to be linked with another object file that defines an implementation for your template and thus you have a full a.out executable.
However since templates need to be processed in the compilation step in order to generate code for each template instantiation that you define, so simply compiling a template separate from it's header file won't work because they always go hand and hand, for the very reason that each template instantiation is a whole new class literally. In a regular class you can separate .h and .cpp because .h is a blueprint of that class and the .cpp is the raw implementation so any implementation files can be compiled and linked regularly, however using templates .h is a blueprint of how the class should look not how the object should look meaning a template .cpp file isn't a raw regular implementation of a class, it's simply a blueprint for a class, so any implementation of a .h template file can't be compiled because you need something concrete to compile, templates are abstract in that sense.
Therefore templates are never separately compiled and are only compiled wherever you have a concrete instantiation in some other source file. However, the concrete instantiation needs to know the implementation of the template file, because simply modifying the typename T using a concrete type in the .h file is not going to do the job because what .cpp is there to link, I can't find it later on because remember templates are abstract and can't be compiled, so I'm forced to give the implementation right now so I know what to compile and link, and now that I have the implementation it gets linked into the enclosing source file. Basically, the moment I instantiate a template I need to create a whole new class, and I can't do that if I don't know how that class should look like when using the type I provide unless I make notice to the compiler of the template implementation, so now the compiler can replace T with my type and create a concrete class that's ready to be compiled and linked.
To sum up, templates are blueprints for how classes should look, classes are blueprints for how an object should look.
I can't compile templates separate from their concrete instantiation because the compiler only compiles concrete types, in other words, templates at least in C++, is pure language abstraction. We have to de-abstract templates so to speak, and we do so by giving them a concrete type to deal with so that our template abstraction can transform into a regular class file and in turn, it can be compiled normally. Separating the template .h file and the template .cpp file is meaningless. It is nonsensical because the separation of .cpp and .h only is only where the .cpp can be compiled individually and linked individually, with templates since we can't compile them separately, because templates are an abstraction, therefore we are always forced to put the abstraction always together with the concrete instantiation where the concrete instantiation always has to know about the type being used.
Meaning typename T get's replaced during the compilation step not the linking step so if I try to compile a template without T being replaced as a concrete value type that is completely meaningless to the compiler and as a result object code can't be created because it doesn't know what T is.
It is technically possible to create some sort of functionality that will save the template.cpp file and switch out the types when it finds them in other sources, I think that the standard does have a keyword export that will allow you to put templates in a separate cpp file but not that many compilers actually implement this.
Just a side note, when making specializations for a template class, you can separate the header from the implementation because a specialization by definition means that I am specializing for a concrete type that can be compiled and linked individually.
If the concern is the extra compilation time and binary size bloat produced by compiling the .h as part of all the .cpp modules using it, in many cases what you can do is make the template class descend from a non-templatized base class for non type-dependent parts of the interface, and that base class can have its implementation in the .cpp file.
A way to have separate implementation is as follows.
inner_foo.h
template <typename T>
struct Foo
{
void doSomething(T param);
};
foo.tpp
#include "inner_foo.h"
template <typename T>
void Foo<T>::doSomething(T param)
{
//implementation
}
foo.h
#include <foo.tpp>
main.cpp
#include <foo.h>
inner_foo.h has the forward declarations. foo.tpp has the implementation and includes inner_foo.h; and foo.h will have just one line, to include foo.tpp.
On compile time, contents of foo.h are copied to foo.tpp and then the whole file is copied to foo.h after which it compiles. This way, there is no limitations, and the naming is consistent, in exchange for one extra file.
I do this because static analyzers for the code break when it does not see the forward declarations of class in *.tpp. This is annoying when writing code in any IDE or using YouCompleteMe or others.
That is exactly correct because the compiler has to know what type it is for allocation. So template classes, functions, enums,etc.. must be implemented as well in the header file if it is to be made public or part of a library (static or dynamic) because header files are NOT compiled unlike the c/cpp files which are. If the compiler doesn't know the type is can't compile it. In .Net it can because all objects derive from the Object class. This is not .Net.
I suggest looking at this gcc page which discusses the tradeoffs between the "cfront" and "borland" model for template instantiations.
https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/Template-Instantiation.html
The "borland" model corresponds to what the author suggests, providing the full template definition, and having things compiled multiple times.
It contains explicit recommendations concerning using manual and automatic template instantiation. For example, the "-repo" option can be used to collect templates which need to be instantiated. Or another option is to disable automatic template instantiations using "-fno-implicit-templates" to force manual template instantiation.
In my experience, I rely on the C++ Standard Library and Boost templates being instantiated for each compilation unit (using a template library). For my large template classes, I do manual template instantiation, once, for the types I need.
This is my approach because I am providing a working program, not a template library for use in other programs. The author of the book, Josuttis, works a lot on template libraries.
If I was really worried about speed, I suppose I would explore using Precompiled Headers
https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html
which is gaining support in many compilers. However, I think precompiled headers would be difficult with template header files.
Another reason that it's a good idea to write both declarations and definitions in header files is for readability. Suppose there's such a template function in Utility.h:
template <class T>
T min(T const& one, T const& theOther);
And in the Utility.cpp:
#include "Utility.h"
template <class T>
T min(T const& one, T const& other)
{
return one < other ? one : other;
}
This requires every T class here to implement the less than operator (<). It will throw a compiler error when you compare two class instances that haven't implemented the "<".
Therefore if you separate the template declaration and definition, you won't be able to only read the header file to see the ins and outs of this template in order to use this API on your own classes, though the compiler will tell you in this case about which operator needs to be overridden.
I had to write a template class an d this example worked for me
Here is an example of this for a dynamic array class.
#ifndef dynarray_h
#define dynarray_h
#include <iostream>
template <class T>
class DynArray{
int capacity_;
int size_;
T* data;
public:
explicit DynArray(int size = 0, int capacity=2);
DynArray(const DynArray& d1);
~DynArray();
T& operator[]( const int index);
void operator=(const DynArray<T>& d1);
int size();
int capacity();
void clear();
void push_back(int n);
void pop_back();
T& at(const int n);
T& back();
T& front();
};
#include "dynarray.template" // this is how you get the header file
#endif
Now inside you .template file you define your functions just how you normally would.
template <class T>
DynArray<T>::DynArray(int size, int capacity){
if (capacity >= size){
this->size_ = size;
this->capacity_ = capacity;
data = new T[capacity];
}
// for (int i = 0; i < size; ++i) {
// data[i] = 0;
// }
}
template <class T>
DynArray<T>::DynArray(const DynArray& d1){
//clear();
//delete [] data;
std::cout << "copy" << std::endl;
this->size_ = d1.size_;
this->capacity_ = d1.capacity_;
data = new T[capacity()];
for(int i = 0; i < size(); ++i){
data[i] = d1.data[i];
}
}
template <class T>
DynArray<T>::~DynArray(){
delete [] data;
}
template <class T>
T& DynArray<T>::operator[]( const int index){
return at(index);
}
template <class T>
void DynArray<T>::operator=(const DynArray<T>& d1){
if (this->size() > 0) {
clear();
}
std::cout << "assign" << std::endl;
this->size_ = d1.size_;
this->capacity_ = d1.capacity_;
data = new T[capacity()];
for(int i = 0; i < size(); ++i){
data[i] = d1.data[i];
}
//delete [] d1.data;
}
template <class T>
int DynArray<T>::size(){
return size_;
}
template <class T>
int DynArray<T>::capacity(){
return capacity_;
}
template <class T>
void DynArray<T>::clear(){
for( int i = 0; i < size(); ++i){
data[i] = 0;
}
size_ = 0;
capacity_ = 2;
}
template <class T>
void DynArray<T>::push_back(int n){
if (size() >= capacity()) {
std::cout << "grow" << std::endl;
//redo the array
T* copy = new T[capacity_ + 40];
for (int i = 0; i < size(); ++i) {
copy[i] = data[i];
}
delete [] data;
data = new T[ capacity_ * 2];
for (int i = 0; i < capacity() * 2; ++i) {
data[i] = copy[i];
}
delete [] copy;
capacity_ *= 2;
}
data[size()] = n;
++size_;
}
template <class T>
void DynArray<T>::pop_back(){
data[size()-1] = 0;
--size_;
}
template <class T>
T& DynArray<T>::at(const int n){
if (n >= size()) {
throw std::runtime_error("invalid index");
}
return data[n];
}
template <class T>
T& DynArray<T>::back(){
if (size() == 0) {
throw std::runtime_error("vector is empty");
}
return data[size()-1];
}
template <class T>
T& DynArray<T>::front(){
if (size() == 0) {
throw std::runtime_error("vector is empty");
}
return data[0];
}

Conditional compiling/linking with gcc

I have scientific simulation code written in C that runs from the command line. The user provides an input model as a set of C subroutines in a model.c file that are then compiled into code at runtime.
Certain model properties are not always relevant for a specific problem but currently the user still needs to provide an empty dummy function for that property in order for the code to compile.
Is it possible to have dummy subroutines for the model properties embedded in the source code that are linked only if the user-provided model.c does not contain a subroutine for that property?
As an example, if model.c contains a subroutine called temperature(), the code should link to that one, rather than the subroutine called temperature() found in src/dummy_function.c. If model.c does not have temperature() the compiler should use the dummy function in src/dummy_function.c.
If possible, I would prefer a solution that does not require preprocessor directives in the model.c file.
Yes you can. Suppose you have simple code in file, say undesym.c:
int
main(void)
{
user_routine();
return 0;
}
Create weak stub in say file fakeone.c
#include "assert.h"
int __attribute__((weak))
user_routine(void)
{
assert(0 == "stub user_routine is not for call");
return 0;
}
Now create "user" function in, say goodone.c
#include "stdio.h"
int
user_routine(void)
{
printf("user_routine Ok\n");
return 0;
}
Now if you will link together gcc undesym.c fakeone.c then a.out will run to assert, but if you will add goodone.c to compilation, like gcc undesym.c fakeone.c goodone.c, then it will prefer strong definition to weak, and will run to message.
You may adopt the same mechanism, defining default functions weak.
Since you say that the user's subroutines are "compiled into the code at runtime", you could use dynamic linking to load a user-provided binary and look for their entry points at runtime. In linux (or any POSIX system), this would be based on dlopen()/dlsym() and look more or less like this:
#include <dlfcn.h>
/* ... */
/* Link the UserModule.so into the executable */
void *user_module = dlopen("UserModule.so", RTLD_NOW);
if (!user_module) {
/* Module could not be loaded, handle error */
}
/* Locate the "temperature" function */
void *temperature_ptr = dlsym(user_module, "temperature"):
if (!temperature_ptr) {
/* Module does not define the "temperature" function */
}
void (*user_temperature)() = (void(*)())temperature_ptr;
/* Call the user function */
user_temperature();
See the dlopen documentation for details. A similar facility is very likely available in whatever OS you're using.

What is the proper signature for operator=?

I've updated my development machine to the just-released 12.04 version of Ubuntu, which apparently gave me a new version of GCC (4.6.3). Now source code that used to compile fine is giving me errors about the compiler-generated assignment operator:
source/local.cpp:1185:59: error: no match for ‘operator=’ in ‘olddata = stype::block_t((*(const oc::json_t*)(& local::database_t::getData(const sdata::uuid_t&, size_t*, sdata::override::type_t)((* & uuid), (& otype), (sdata::override::type_t)1u))), 0)’
source/local.cpp:1185:59: note: candidate is:
source/sdata/block.hpp:13:7: note: sdata::block_t& sdata::block_t::operator=(sdata::block_t&)
source/sdata/block.hpp:13:7: note: no known conversion for argument 1 from ‘sdata::block_t’ to ‘sdata::block_t&’
So far as I can tell, the compiler-generated operator= should have the signature foo& operator=(const foo&), not foo& operator=(foo&). Is this a bug in this version of GCC, or is my understanding wrong? I can't find any references to this on the GCC bug-tracker, and I can't believe that no one else has run into it.
(Google doesn't make searching for terms that contain an equals sign easy, but with patience I found a single reference to this problem here, discussing GCC 4.6.2.)
An example of a string class i did once:
String& String::operator=(const String& other) {
// checking for self-assignment (pointer-compare)
if(this != &other) {
delete[] s_buffer;
s_length = other.Length();
s_buffer = new char[s_length+1];
memcpy(s_buffer,other.c_str(),s_length+1);
}
return *this;
}
I don't know what's correct, but GCC's behavior definitely causes problems. I'm calling it a bug.
Oddly enough, it doesn't seem to happen on every class, only a few, so until I can come up with a simple example, I can't really file a bug-report on it.

Resources