The following C++11 code compiles correctly with g++ 6.3.0 and results in the behavior that I consider correct (namely, the first function is picked). However, with Intel's C++ compiler (icc 17.0.4) it fails to compile; the compiler reports that multiple matching function overloads exist.
#include <iostream>
template<typename R, typename ... Args>
static void f(R(func)(const int&, Args...)) {
std::cout << "In first version of f" << std::endl;
}
template<typename R, typename ... Args, typename X = typename std::is_void<R>::type>
static void f(R(func)(Args...), X x = X()) {
std::cout << "In second version of f" << std::endl;
}
double h(const int& x, double y) {
return 0;
}
int main(int argc, char** argv) {
f(h);
return 0;
}
Here is the error reported by icc:
test.cpp(18): error: more than one instance of overloaded function "f" matches the argument list:
function template "void f(R (*)(const int &, Args...))"
function template "void f(R (*)(Args...), X)"
argument types are: (double (const int &, double))
f(h);
^
So my two questions are: which compiler is correct with respect to the standard, and how would you modify this code so that it compiles? (Note that f is a user-facing API and I would like to avoid modifying its prototype.)
Note that if I remove typename X = typename std::is_void<R>::type and the X x = X() argument in the second version of f, icc compiles it fine.
This is a bug in Intel Compiler 17.0 Update 4. The behavior of GCC is right in this case. This issue is resolved in Intel Compiler 18.0 Update 4 and above.
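For readers who are stuck on an affected compiler, one possible workaround is to avoid relying on partial ordering of the two overloads and to dispatch on a tag instead. The following is only a sketch under stated assumptions: the trait first_param_is_const_int_ref, the helper f_impl, the extra function g and the int return value are hypothetical names introduced for illustration, the optional X parameter of the second overload is dropped, and it has not been tested with icc 17.0.4.

```cpp
#include <cassert>
#include <iostream>
#include <type_traits>

// Hypothetical trait (not in the original post): true when the first
// parameter of the function type is const int&.
template<typename F>
struct first_param_is_const_int_ref : std::false_type {};

template<typename R, typename... Args>
struct first_param_is_const_int_ref<R(const int&, Args...)> : std::true_type {};

// Hypothetical implementation functions, selected by the tag argument.
// They return 1 or 2 only so that the choice can be checked.
template<typename F>
static int f_impl(F*, std::true_type) {
    std::cout << "In first version of f" << std::endl;
    return 1;
}

template<typename F>
static int f_impl(F*, std::false_type) {
    std::cout << "In second version of f" << std::endl;
    return 2;
}

// Single entry point: callers still write f(h).
template<typename R, typename... Args>
static int f(R(func)(Args...)) {
    return f_impl(func, first_param_is_const_int_ref<R(Args...)>{});
}

double h(const int& x, double y) { return 0; }
double g(double y) { return 0; }  // hypothetical second test function

inline void demo_dispatch() {
    assert(f(h) == 1);  // first parameter is const int&
    assert(f(g) == 2);  // anything else
}
```

Because there is only one function template named f, overload resolution has nothing to disambiguate; the choice between the two behaviors is made by the trait.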
I am depending on a library, the authors of which extensively used the brace-notation for invoking all constructors, as has been fondly advertised and recommended by a number of parties in recent years.
The library is mostly developed on Linux using gcc but aims at being cross-platform compatible and in my case is used on Windows using Visual Studio 2015.
If I try to build the library, I get a C2447 compiler error which arises when templates of templates use this brace-notation.
I tried to illustrate my case with the following MWE.
#include <iostream>
template <typename T>
class A
{
public:
A(T x);
virtual ~A() = default;
T getX();
private:
T x;
};
template <typename T>
class B : public A<T>
{
public:
B(T x);
};
template <typename T>
class C : public A<T>
{
public:
C(T x);
};
template <typename T>
class D : public A<T>
{
public:
D(T x);
};
int main(int argn, char** argc)
{
A<int> a(42);
B<int> b(42);
C<int> c(42);
D<int> d(42);
std::cout << "A: " << a.getX() << std::endl
<< "B: " << b.getX() << std::endl
<< "C: " << c.getX() << std::endl
<< "D: " << d.getX() << std::endl;
return 0;
}
template<typename T>
A<T>::A(T x) : x(x) {}
template<typename T>
T A<T>::getX() { return x; }
template<typename T>
B<T>::B(T x) : A{ x / 2 } {} // does not compile in gcc [1]
template<typename T>
C<T>::C(T x) : A<T>(x * 2) {} // compiles fine in both
template<typename T>
D<T>::D(T x) : A<T>{ x*x } {} // does not compile in MSVC 2015 [2]
/*
[1]: error: class 'B<T>' does not have any field named 'A'
B<T>::B(T x) : A{ x / 2 } {}
[2]: error C2447: '{': missing function header (old-style formal list?)
*/
My online search to figure out whether this is a compiler bug or invalid notation according to the standard remained fruitless. Can anybody please elucidate which of the notations used in B, C, and D are to be considered correct? Obviously both compilers agree on C, but naively I would consider the notations used in B and D valid as well.
In notation B (below), the base class template A is named without its template argument (the constructor being called belongs to A<T>, not to the bare template name A), so GCC (and Clang as well) fail to compile it.
template<typename T>
B<T>::B(T x) : A{ x / 2 } {} // does not compile in gcc [1]
Changing it like this works.
template<typename T>
B<T>::B(T x) : A<T>{ x / 2 } {} // compiles in GCC and Clang
In case of notation D, GCC (and Clang as well) are correct because there is a matching constructor for initialization of A<int>. MSVC is wrong to reject it.
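As a compact illustration of the point above, the following sketch names the base class with its full template argument list in every mem-initializer. The member alias Base is a hypothetical addition, not part of the original code, and whether MSVC 2015 accepts the brace form is not verified here; the parenthesized form is the one the question reports as working on both compilers.

```cpp
#include <cassert>

template <typename T>
class A {
public:
    explicit A(T x) : x_(x) {}
    virtual ~A() = default;
    T getX() const { return x_; }
private:
    T x_;
};

template <typename T>
class B : public A<T> {
public:
    explicit B(T x);
};

template <typename T>
class D : public A<T> {
public:
    using Base = A<T>;  // hypothetical alias for the dependent base class
    explicit D(T x);
};

// Brace form with the template argument spelled out.
template <typename T>
B<T>::B(T x) : A<T>{x / 2} {}

// Parenthesized form through the alias, avoiding braces entirely.
template <typename T>
D<T>::D(T x) : Base(x * x) {}

inline void demo_base_init() {
    assert(B<int>(42).getX() == 21);
    assert(D<int>(3).getX() == 9);
}
```

The alias keeps the base class name in one place, which can be convenient when the same code has to build with a compiler that only accepts the parenthesized form.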
I am trying to use recursion to solve this problem, where if I call
decimal<0,0,1>();
I should get the decimal number (4 in this case).
I am trying to use recursion with variadic templates but cannot get it to work.
Here's my code:
template<>
int decimal(){
return 0;
}
template<bool a,bool...pack>
int decimal(){
cout<<a<<"called"<<endl;
return a*2 + decimal<pack...>();
};
int main(int argc, char *argv[]){
cout<<decimal<0,0,1>()<<endl;
return 0;
}
What would be the best way to solve this?
template<typename = void>
int decimal(){
return 0;
}
template<bool a,bool...pack>
int decimal(){
cout<<a<<"called"<<endl;
return a + 2*decimal<pack...>();
};
The problem was with the recursive case, where it expects to be able to call decimal<>(). That is what I have defined in the first overload above. You can essentially ignore the typename = void; it is just necessary to allow the first one to compile.
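To make the recursion above checkable at compile time, here is a minimal sketch of the same two overloads marked constexpr, with the printing removed. Making them constexpr is an assumption added for illustration; the overload structure is unchanged.

```cpp
#include <cassert>

// Base case: selected for decimal<>(), because a non-type argument cannot
// bind to the type parameter and the other overload needs at least one bool.
template <typename = void>
constexpr int decimal() { return 0; }

// Recursive case: the first bool is the least significant bit.
template <bool a, bool... pack>
constexpr int decimal() { return a + 2 * decimal<pack...>(); }

static_assert(decimal<0, 0, 1>() == 4, "0*1 + 0*2 + 1*4");
static_assert(decimal<1, 0, 1>() == 5, "1*1 + 0*2 + 1*4");

inline void demo_decimal() {
    // Extra parentheses keep the commas away from the assert macro.
    assert((decimal<1, 1>() == 3));
}
```

The static_assert lines document the bit order: the first template argument carries weight 1, the second weight 2, and so on.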
A possible solution is to use a constexpr function (so you can use its value at compile time or at run time, as appropriate) where the values are arguments of the function.
Something like
#include <iostream>
constexpr int decimal ()
{ return 0; }
template <typename T, typename ... packT>
constexpr int decimal (T const & a, packT ... pack)
{ return a*2 + decimal(pack...); }
int main(int argc, char *argv[])
{
constexpr int val { decimal(0, 0, 1) };
static_assert( val == 2, "!");
std::cout << val << std::endl;
return 0;
}
But I obtain 2, not 4.
Are you sure that your code should return 4?
-- EDIT --
As pointed out by aschepler, my example decimal() template function "returns twice the sum of its arguments, which is not" what you want.
Well, with 0, 1, true and false you obtain the same result; with other numbers, you obtain different results.
But you can modify decimal() as follows
template <typename ... packT>
constexpr int decimal (bool a, packT ... pack)
{ return a*2 + decimal(pack...); }
to avoid this problem.
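If the goal is the value 4 for the inputs 0, 0, 1, each argument has to be weighted by its position rather than summed. A minimal sketch of that variant follows, assuming the first argument is the least significant bit; the name decimal_lsb_first is hypothetical.

```cpp
#include <cassert>

// Base case: no bits left.
constexpr int decimal_lsb_first() { return 0; }

// Recursive case: the first argument has weight 1, the rest are shifted
// up by one bit position.
template <typename... Rest>
constexpr int decimal_lsb_first(bool a, Rest... rest) {
    return a + 2 * decimal_lsb_first(rest...);
}

static_assert(decimal_lsb_first(0, 0, 1) == 4, "0*1 + 0*2 + 1*4");
static_assert(decimal_lsb_first(1, 1, 0) == 3, "1*1 + 1*2 + 0*4");

inline void demo_decimal_lsb_first() {
    assert(decimal_lsb_first(true, false, true) == 5);
}
```

Taking the first parameter as bool means that any non-zero argument counts as a set bit, in the same way as the modified version above.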
This is a C++14 solution. It is mostly C++11, except for std::integer_sequence and std::index_sequence, both of which are relatively easy to implement in C++11.
template<bool...bs>
using bools = std::integer_sequence<bool, bs...>;
template<std::uint64_t x>
using uint64 = std::integral_constant< std::uint64_t, x >;
template<std::size_t N>
constexpr uint64< ((std::uint64_t)1) << (std::uint64_t)N > bit{};
template<std::uint64_t... xs>
struct or_bits : uint64<0> {};
template<std::uint64_t x0, std::uint64_t... xs>
struct or_bits<x0, xs...> : uint64<x0 | or_bits<xs...>{} > {};
template<bool...bs, std::size_t...Is>
constexpr
uint64<
or_bits<
uint64<
bs?bit<Is>:std::uint64_t(0)
>{}...
>{}
>
from_binary( bools<bs...> bits, std::index_sequence<Is...> ) {
(void)bits; // suppress warning
return {};
}
template<bool...bs>
constexpr
auto from_binary( bools<bs...> bits={} )
-> decltype( from_binary( bits, std::make_index_sequence<sizeof...(bs)>{} ) )
{ return {}; }
It generates the resulting value as a type with a constexpr conversion to scalar. This is slightly more powerful than a constexpr function in its "compile-time-ness".
As written, the first bit in the list is paired with index 0, so it is treated as the least significant bit.
You can use from_binary<1,0,1>() or from_binary( bools<1,0,1>{} ).
This particular style of type-based programming results in code that does all of its work in its signature. The bodies consist of return {};.
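The answer above notes that std::integer_sequence and std::index_sequence are relatively easy to implement in C++11. A minimal sketch of such a replacement is shown below; the namespace sketch and the helper make_index_sequence_impl are hypothetical names, and the linear recursion is chosen for brevity rather than compile-time efficiency.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

namespace sketch {

// Holds a compile-time list of integers of type T.
template <typename T, T... Is>
struct integer_sequence {
    using value_type = T;
    static constexpr std::size_t size() { return sizeof...(Is); }
};

template <std::size_t... Is>
using index_sequence = integer_sequence<std::size_t, Is...>;

// Builds 0, 1, ..., N-1 by prepending N-1 and recursing on N-1.
template <std::size_t N, std::size_t... Is>
struct make_index_sequence_impl : make_index_sequence_impl<N - 1, N - 1, Is...> {};

template <std::size_t... Is>
struct make_index_sequence_impl<0, Is...> {
    using type = index_sequence<Is...>;
};

template <std::size_t N>
using make_index_sequence = typename make_index_sequence_impl<N>::type;

}  // namespace sketch

static_assert(std::is_same<sketch::make_index_sequence<3>,
                           sketch::index_sequence<0, 1, 2>>::value,
              "make_index_sequence<3> is 0, 1, 2");
static_assert(sketch::make_index_sequence<4>::size() == 4, "four indices");
```

With these definitions, the std:: names in the answer could be replaced by the sketch:: ones; the variable template used for the bit masks would still need C++14.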
Let's consider the following code:
class A {
public:
constexpr A(int value) : m_value(value) {}
private:
const int m_value;
};
void f(const A& a);
f(42); // OK
f(std::rand()); // KO - How to detect this case?
Is there a way to determine if A is built at compile time or run-time?
One way to verify that an expression is indeed a constant expression is to use it to initialize a constexpr variable:
int main()
{
constexpr int a = f(42);
constexpr int b = f(std::rand());
return(0);
}
Since the value of a constexpr variable must be computable by the compiler, the initialization of b will produce an error. GCC says the following:
foo.cpp:25:35: error: call to non-constexpr function ‘int rand()’
constexpr int b = f(std::rand());
I'm not sure if it can be done in a way to merely produce a warning, however.
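A self-contained sketch of the check described above follows. It assumes a constexpr variant of f that returns the stored value and a constexpr accessor value(); both are hypothetical, since the f in the question returns void.

```cpp
#include <cassert>
#include <cstdlib>

class A {
public:
    constexpr A(int value) : m_value(value) {}
    constexpr int value() const { return m_value; }  // hypothetical accessor
private:
    const int m_value;
};

// Hypothetical constexpr variant of f, so that its result can be stored.
constexpr int f(const A& a) { return a.value(); }

// Accepted: 42 is a constant expression, so A is built at compile time.
constexpr int from_literal = f(42);

// Rejected if uncommented: std::rand() is not a constexpr function.
// constexpr int from_rand = f(std::rand());

// The same call is still valid as an ordinary run-time call.
inline int from_rand_at_runtime() { return f(std::rand()); }

inline void demo_constexpr_check() {
    assert(from_literal == 42);
    assert(from_rand_at_runtime() >= 0);  // std::rand() never returns a negative value
}
```

The constexpr variable acts as the detector: it forces the compiler to evaluate the call at compile time and turns a run-time argument into a hard error rather than a warning.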
I am currently experimenting with the GCC vector extensions. However, I am wondering how to go about getting sqrt(vec) to work as expected.
As in:
typedef double v4d __attribute__ ((vector_size (16)));
v4d myfunc(v4d in)
{
return some_sqrt(in);
}
and at least on a recent x86 system have it emit a call to the relevant intrinsic sqrtpd. Is there a GCC builtin for sqrt that works on vector types or does one need to drop down to the intrinsic level to accomplish this?
Looks like it's a bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54408 I don't know of any workaround other than doing it component-wise. The vector extensions were never meant to replace platform-specific intrinsics anyway.
Some funky code to this effect:
#include <cmath>
#include <cstddef>
#include <initializer_list>
#include <utility>
template <::std::size_t...> struct indices { };
template <::std::size_t M, ::std::size_t... Is>
struct make_indices : make_indices<M - 1, M - 1, Is...> {};
template <::std::size_t... Is>
struct make_indices<0, Is...> : indices<Is...> {};
typedef float vec_type __attribute__ ((vector_size(4 * sizeof(float))));
template <::std::size_t ...Is>
vec_type sqrt_(vec_type const& v, indices<Is...> const)
{
vec_type r;
::std::initializer_list<int>{(r[Is] = ::std::sqrt(v[Is]), 0)...};
return r;
}
vec_type sqrt(vec_type const& v)
{
return sqrt_(v, make_indices<4>());
}
int main()
{
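vec_type v = {1.0f, 4.0f, 9.0f, 16.0f}; // initialize before use; reading an uninitialized vector is undefined behavior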
return sqrt(v)[0];
}
You could also try your luck with auto-vectorization, which is separate from the vector extension.
You can loop over the vectors directly
#include <math.h>
typedef double v2d __attribute__ ((vector_size (16)));
v2d myfunc(v2d in) {
v2d out;
for(int i=0; i<2; i++) out[i] = sqrt(in[i]);
return out;
}
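By default the sqrt function has to handle special cases such as negative inputs and NaN (including setting errno), so the compiler keeps a call to the library function as a fallback; if you rule these out with -Ofast, both Clang and GCC produce simply sqrtpd.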
https://godbolt.org/g/aCuovX
GCC might have a bug because I had to loop to 4 even though there are only 2 elements to get optimal code.
But with AVX and AVX512 GCC and Clang are ideal
AVX
https://godbolt.org/g/qdTxyp
AVX512
https://godbolt.org/g/MJP1n7
My reading of the question is that you want the square root of 4 packed double precision values... that's 32 bytes. Use the appropriate AVX intrinsic:
#include <x86intrin.h>
typedef double v4d __attribute__ ((vector_size (32)));
v4d myfunc (v4d v) {
return _mm256_sqrt_pd(v);
}
x86-64 gcc 10.2 and x86-64 clang 10.0.1
using -O3 -march=skylake :
myfunc:
vsqrtpd %ymm0, %ymm0 # (or just `ymm0` for Intel syntax)
ret
ymm0 is the return value register.
That said, it just so happens there is a builtin: __builtin_ia32_sqrtpd256, which doesn't require the intrinsics header. I would definitely discourage its use however.
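For comparison with the intrinsic version, here is a sketch of the same four-double square root written as a plain element-wise loop over a 32-byte vector, with the element count derived from sizeof instead of being hard-coded. It relies on the GCC/Clang vector_size attribute; the function name sqrt_elementwise is hypothetical, and no claim is made here about the exact instructions a given compiler emits for it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Four doubles need 32 bytes; vector_size(16) would hold only two.
typedef double v4d __attribute__((vector_size(32)));

inline v4d sqrt_elementwise(v4d in) {
    // Number of elements, computed from the vector and element sizes.
    const std::size_t count = sizeof(v4d) / sizeof(double);
    v4d out;
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = std::sqrt(in[i]);
    }
    return out;
}

inline void demo_sqrt_elementwise() {
    v4d v = {1.0, 4.0, 9.0, 16.0};
    v4d r = sqrt_elementwise(v);
    assert(r[0] == 1.0 && r[1] == 2.0 && r[2] == 3.0 && r[3] == 4.0);
}
```

Unlike the intrinsic, this form does not need the x86 intrinsics header, at the cost of depending on the optimizer to turn the loop into a single vector instruction.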
Both clang and gcc fail to compile the code below when ArrayCount is a template. This seems wrong, especially in light of the fact that the sizeof-based ArrayCount solution works. The template version of ArrayCount is normally a better solution, but it's getting in the way here, and constexpr is seemingly not living up to the spirit of its promise.
#if 1
template<typename T, size_t N>
constexpr size_t ArrayCount(T (&)[N])
{
return N;
}
// Results in this (clang): error : static_assert expression is not an integral constant expression
// Results in this (gcc): error: non-constant condition for static assertion, 'this' is not a constant expression
#else
#define ArrayCount(t) (sizeof(t) / sizeof(t[0]))
// Succeeds
#endif
struct X
{
int x[4];
X() { static_assert(ArrayCount(x) == 4, "should never fail"); }
};
The right solution doesn't use homebrew code, but a simple type trait:
int a[] = {1, 2, 3};
#include <type_traits>
static_assert(std::extent<decltype(a)>::value == 3, "You won't see this");
It makes sense to me that this code would fail to compile, since ArrayCount is a function taking a non-constexpr argument. According to the standard, I believe this means that ArrayCount must be instantiated as a non-constexpr function.
There are workarounds, of course. I can think of two off the top of my head (one implemented in terms of the other):
template<typename T> struct ArrayCount;
template<typename T, size_t N>
struct ArrayCount<T[N]> {
static size_t const size = N;
};
template<typename T>
constexpr size_t ArrayCount2() {
return ArrayCount<T>::size;
}
struct X {
int x[4];
X() {
static_assert(ArrayCount<decltype(x)>::size == 4, "should never fail");
static_assert(ArrayCount2<decltype(x)>() == 4, "should never fail");
}
};
It does mean having to use decltype() when you might not wish to, but it does break the pro-forma constraint on taking a non-constexpr parameter.
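GCC's diagnostic in the question points at 'this': inside the constructor, x means this->x, and this is not a constant expression there. The sketch below illustrates that reading under stated assumptions: the same constexpr ArrayCount is applied to a namespace-scope array (global_array is a hypothetical name), where no this is involved, and the member case uses std::extent with decltype so that the array is only named, never evaluated.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

template <typename T, std::size_t N>
constexpr std::size_t ArrayCount(T (&)[N]) {
    return N;
}

// No 'this' involved: the argument names an object with static storage.
int global_array[4];
static_assert(ArrayCount(global_array) == 4, "namespace-scope array");

struct X {
    int x[4];
    X() {
        // decltype(x) is an unevaluated operand, so 'this' is never read.
        static_assert(std::extent<decltype(x)>::value == 4, "should never fail");
    }
};

inline void demo_array_count() {
    X instance;
    assert(ArrayCount(instance.x) == 4);  // fine as an ordinary run-time call
    assert(ArrayCount(global_array) == 4);
}
```

The function template itself stays constexpr; what changes between the two cases is only whether its argument can appear in a constant expression.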