I am dynamically loading the cudart (Cuda Run Time Library) to access just the cudaGetDeviceProperties function. This one requires two arguments:
A cudaDeviceProp structure which is defined in a header of the run time library;
An integer which represents the device ID.
I am not including the cuda_runtime.h header in order to not get extra constants, macros, enum, class... that I do not want to use.
However, I need the cudaDeviceProp structure. Is there a way to get it without redefining it? I wrote the following code:
struct cudaDeviceProp;
class CudaRTGPUInfoDL
{
typedef int(*CudaDriverVersion)(int*);
typedef int(*CudaRunTimeVersion)(int*);
typedef int(*CudaDeviceProperties)(cudaDeviceProp*,int);
public:
struct Properties
{
char name[256]; /**< ASCII string identifying device */
size_t totalGlobalMem; /**< Global memory available on device in bytes */
size_t sharedMemPerBlock; /**< Shared memory available per block in bytes */
int regsPerBlock; /**< 32-bit registers available per block */
int warpSize; /**< Warp size in threads */
size_t memPitch; /**< Maximum pitch in bytes allowed by memory copies */
/*... Tons of members follow..*/
};
public:
CudaRTGPUInfoDL();
~CudaRTGPUInfoDL();
int getCudaDriverVersion();
int getCudaRunTimeVersion();
const Properties& getCudaDeviceProperties();
private:
QLibrary library;
private:
CudaDriverVersion cuDriverVer;
CudaRunTimeVersion cuRTVer;
CudaDeviceProperties cuDeviceProp;
Properties properties;
};
As everybody can see, I simply "copy-pasted" the declaration of the structure.
In order to get the GPU properties, I simply use this method:
const CudaRTGPUInfoDL::Properties& CudaRTGPUInfoDL::getCudaDeviceProperties()
{
// Unsafe but needed.
cuDeviceProp(reinterpret_cast<cudaDeviceProp*>(&properties), 0);
return properties;
}
Thanks for your answers.
If you need the structure to be complete, you should define it (probably by including the appropriate header).
If you're just going to be passing around references or pointers, such as in the method you show, then it doesn't need to be complete and can just be forward declared:
class cudaDeviceProp;
Related
I am trying to port a code base from iar to avr-gcc. Amongst other things that have to replaced, the iar eeprom memory attribute __eeprom has to replaced with a avr-gcc friendly attribute. AFAIK the replacement for that is EEMEM, but the usage differs and I am not able to figure out how to replace __eeprom in the cleanest manner.
../src/myfunc.h:35:46: error: section attribute not allowed for 'src'
UBYTE *strcpye(UBYTE *dest, UBYTE EEMEM *src);
This error is not limited to pointers, but to all variables in general. IMO the usage of EEMEM is correct, where am I going wrong?
In the avr-gcc toolchain, avr-libc defines macro EEMEM in avr/eeprom.h:
#define EEMEM __attribute__((section(".eeprom")))
This means it's just an attribute that determines the section in which an object with this attribute will be located. In particular, EEMEM only makes sense for variables in static storage. Moreover, accesses to objects located in EEMEM have to be done by hand using functions / macros supplied by avr/eeprom.h like
void eeprom_read_block (void *dst, const void *src, size_t n);
void eeprom_write_byte (uint8_t *p, uint8_t value);
void eeprom_update_word (uint16_t *p, uint16_t value);
etc. Also notice that EEMEM is just an attribute and not a qualifier (like __flash for example). This means that even though you can tag a pointer (target) using attributes, that won't change the access in any way. To be more specific, any access through a pointer that's attributed EEMEM will be to RAM and not to eeprom.
In your case, the prototype of strcpye would read
char* strcpye (char *dest, const char *src);
and the implementation of that function would apply eeprom_read_byte on src++ and write to dest++ until it reads a terminal \0. Notice that you might need an explicit pointer cast as eeprom_read_byte expects [const] uint8_t*, and that char, signed char and unsigned char are 3 distinct types in C.
when I read pool_allocator in STL, I found that many functions of the __pool_alloc_base class are implemented in gcc/libstdc++-v3/src/c++98/pool_allocator.cc, why is that designed? Why not put the implementation of these functions in the pool_allocator.h?
pool_allocator.h
class __pool_alloc_base {
protected:
...
_GLIBCXX_CONST _Obj *volatile *_M_get_free_list(size_t __bytes) throw();
__mutex & _M_get_mutex() throw();
// Returns an object of size __n, and optionally adds to size __n
// free list.
void *_M_refill(size_t __n);
// Allocates a chunk for nobjs of size size. nobjs may be reduced
// if it is inconvenient to allocate the requested number.
char *_M_allocate_chunk(size_t __n, int &__nobjs);
};
I'd started a bare-metal (Cortex-M) project some years ago. At project setup we decided to use gcc toolchain with C++11 / C++14 etc. enabled and even for using C++ exceptions and rtti.
We are currently using gcc 4.9 from launchpad.net/gcc-arm-embedded (having some issue which prevent us currently to update to a more recent gcc version).
For example, I'd wrote a base class and a derived class like this (see also running example here):
class OutStream {
public:
explicit OutStream() {}
virtual ~OutStream() {}
OutStream& operator << (const char* s) {
write(s, strlen(s));
return *this;
}
virtual void write(const void* buffer, size_t size) = 0;
};
class FixedMemoryStream: public OutStream {
public:
explicit FixedMemoryStream(void* memBuffer, size_t memBufferSize): memBuffer(memBuffer), memBufferSize(memBufferSize) {}
virtual ~FixedMemoryStream() {}
const void* getBuffer() const { return memBuffer; }
size_t getBufferSize() const { return memBufferSize; }
const char* getText() const { return reinterpret_cast<const char*>(memBuffer); } ///< returns content as zero terminated C-string
size_t getSize() const { return index; } ///< number of bytes really written to the buffer (max = buffersize-1)
bool isOverflow() const { return overflow; }
virtual void write(const void* buffer, size_t size) override { /* ... */ }
private:
void* memBuffer = nullptr; ///< buffer
size_t memBufferSize = 0; ///< buffer size
size_t index = 0; ///< current write index
bool overflow = false; ///< flag if we are overflown
};
So that the customers of my class are now able to use e.g.:
char buffer[10];
FixedMemoryStream ms1(buffer, sizeof(buffer));
ms1 << "Hello World";
Now I'd want to make the usage of the class a bit more comfortable and introduced the following template:
template<size_t bufferSize> class FixedMemoryStreamWithBuffer: public FixedMemoryStream {
public:
explicit FixedMemoryStreamWithBuffer(): FixedMemoryStream(buffer, bufferSize) {}
private:
uint8_t buffer[bufferSize];
};
And from now, my customers can write:
FixedMemoryStreamWithBuffer<10> ms2;
ms2 << "Hello World";
But from now, I'd observed increasing size of my executable binary. It seems that gcc added symbol information for each different template instantiation of FixedMemoryStreamWithBuffer (because we are using rtti for some reason).
Might there be a way to get rid of symbol information only for some specific classes / templates / template instantiations?
It's ok to get a non portable gcc only solution for this.
For some reason we decided to prefer templates instead of preprocessor macros, I want to avoid a preprocessor solution.
First of all, keep in mind that compiler also generates separate v-table (as well as RTTI information) for every FixedMemoryStreamWithBuffer<> type instance, as well as every class in the inheritance chain.
In order to resolve the problem I'd recommend using containment instead of inheritance with some conversion function and/or operator inside:
template<size_t bufferSize>
class FixedMemoryStreamWithBuffer
{
uint8_t buffer[bufferSize];
FixedMemoryStream m_stream;
public:
explicit FixedMemoryStreamWithBuffer() : m_stream(m_buffer, bufferSize) {}
operator FixedMemoryStream&() { return m_stream; }
FixedMemoryStream& toStream() { return m_stream; }
};
Yes, there's a way to bring the necessary symbols almost down to 0: using the standard library. Your OutStream class is a simplified version of std::basic_ostream. Your OutStream::write is really just std::basic_ostream::write and so on. Take a look at it here. Overflow is handled really closely, though, for completeness' sake, it also deals with underflow i.e. the need for data retrieval; you may leave it as undefined (it's virtual too).
Similarly, your FixedMemoryStream is std::basic_streambuf<T> with a fixed-size (a std::array<T>) get/put area.
So, just make your classes inherit from the standard ones and you'll cut off on binary size since you're reusing already declared symbols.
Now, regarding template<size_t bufferSize> class FixedMemoryStreamWithBuffer. This class is very similar to std::array<std::uint8_t, bufferSize> as for the way memory is specified and acquired. You can't optimize much about that: each instantiation is a different type with all what that implies. The compiler cannot "merge" or do anything magic about them: each instantiation must have its own type.
So either fall back on std::vector or have some fixed-size specialized chunks, like 32, 128 etc. and for any values in between would choose the right one; this can be achieved entirely at compile-time, so no runtime cost.
I use the following C structs in my C++11 code (the code comes from liblwgeom of PostGis, but this is not the core of the question). The code is compiled with the following options using g++-4.8:
-std=c++11 -Wall -Wextra -pedantic-errors -pedantic -Werror
and I don't get any errors during compilation (or warnings) (should I get any?)
Question
Is safe to use LWPOLY (actually pointed by LWGEOM*) in functions that accept LWGEOM and don't modify the void *data; member. I understand that this is poor man's inheritance but this is what I need to work with.
Details
POLYGON:
typedef struct
{
uint8_t type; /* POLYGONTYPE */
uint8_t flags;
GBOX *bbox;
int32_t srid;
int nrings; /* how many rings we are currently storing */
int maxrings; /* how many rings we have space for in **rings */
POINTARRAY **rings; /* list of rings (list of points) */
}
LWPOLY; /* "light-weight polygon" */
LWGEOM:
typedef struct
{
uint8_t type;
uint8_t flags;
GBOX *bbox;
int32_t srid;
void *data;
}
LWGEOM;
POINTARRAY:
typedef struct
{
/* Array of POINT 2D, 3D or 4D, possibly missaligned. */
uint8_t *serialized_pointlist;
/* Use FLAGS_* macros to handle */
uint8_t flags;
int npoints; /* how many points we are currently storing */
int maxpoints; /* how many points we have space for in serialized_pointlist */
}
POINTARRAY;
GBOX:
typedef struct
{
uint8_t flags;
double xmin;
double xmax;
double ymin;
double ymax;
double zmin;
double zmax;
double mmin;
double mmax;
} GBOX;
Am I violating strict aliasing rule when I do something like?
const LWGEOM* lwgeom;
...
const LWPOLY* lwpoly = reinterpret_cast<const LWPOLY*>(lwgeom);
I know that in PostGis types are specifically designed to be "compatible" however I'd like to know if I am violating the standard by doing so.
Also, I noticed that PostGis is not compiled with strict aliasing disabled by default (at least version 2.1.5).
Solution
My colleague helped me to investigate it and it seems the answer is No it doesn't violate strict aliasing, but only in case we access LWGEOMS members that are of the same type as of LWPOLY's and are laid out in the beginning of the struct contiguously. Here is why (quoting standard):
3.10.10 says that you can access a member through a pointer to "aggregate or union".
8.5.1 defines aggregates (C structs are aggregates):
An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1), no private or
protected non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3).
9.2.19 says that pointer to the struct is the same as pointer to the fist member for standard layout classes (C structs are standard layout).
Whether this is a safe way to code is a different question.
Yes, it violates the strict aliasing rule. LWGEOM and LWPOLY are unrelated types, and so are int and void*. So, for example, modification to lwgeom->data may not be read through lwpoly->nrings and vice versa.
I validated this with GCC4.9. My code is as follows:
#include <cinttypes>
#include <iostream>
using namespace std;
typedef struct {
uint8_t type; /* POLYGONTYPE */
uint8_t flags;
int32_t srid;
int nrings; /* how many rings we are currently storing */
} LWPOLY; /* "light-weight polygon" */
typedef struct {
uint8_t type;
uint8_t flags;
int32_t srid;
void *data;
} LWGEOM;
void f(LWGEOM* pgeom, LWPOLY* ppoly) {
ppoly->nrings = 7;
pgeom->data = 0;
std::cout << ppoly->nrings << '\n';
}
int main() {
LWGEOM geom = {};
LWGEOM* pgeom = &geom;
LWPOLY* ppoly = (LWPOLY*)pgeom;
f(pgeom, ppoly);
}
Guess what, the output is 7.
I have been studying I2C driver (client) code for a while.
I have seen this function "i2c_get_clientdata" and "i2c_set_clientdata" every where.
I have seen the this question here .
Use of pointer to structure instead of creating static local copy
Some times i think like it is like "container_of" macro to get a pointer to the structure.
But still i didn't understood properly why to use it and when to use it.
Below i am posting a sample code where I see its usage.
If any one could help me understand why it is used there and when we shall use it when we write our own drivers.
struct max6875_data {
struct i2c_client *fake_client;
struct mutex update_lock;
u32 valid;
u8 data[USER_EEPROM_SIZE];
unsigned long last_updated[USER_EEPROM_SLICES];
};
static ssize_t max6875_read(struct file *filp, struct kobject *kobj,
struct bin_attribute *bin_attr,
char *buf, loff_t off, size_t count)
{
struct i2c_client *client = kobj_to_i2c_client(kobj);
struct max6875_data *data = i2c_get_clientdata(client);
int slice, max_slice;
if (off > USER_EEPROM_SIZE)
return 0;
if (off + count > USER_EEPROM_SIZE)
count = USER_EEPROM_SIZE - off;
/* refresh slices which contain requested bytes */
max_slice = (off + count - 1) >> SLICE_BITS;
for (slice = (off >> SLICE_BITS); slice <= max_slice; slice++)
max6875_update_slice(client, slice);
memcpy(buf, &data->data[off], count);
return count;
}
Those functions are used to get/set the void *driver_data pointer that is part of the struct device, itself part of struct i2c_client.
This is a void pointer that is for the driver to use. One would use this pointer mainly to pass driver related data around.
That is what is happening in your example. The max6875_read is a callback getting a structu kobject. That kobject is an i2c_client which is enough to communicate with the underlying device using the driver_data pointer here allows to get back the driver related data (instead of using global variables for example).