I was wondering if there is a way to access a data member within a struct that is being pointed to by a void*? What I'm trying to explain will hopefully be more apparent in my example code:
int main()
{
struct S
{
int val;
};
S s;
s.val = 5;
void* p;
p = malloc(sizeof(S));
*(struct S*) p = s;
std::cout<< *(struct S*)p.val << std::endl;
}
I have run this exact code casting p as *(int*)p and it printed fine; however, using the exact code above results in a compilation error. I haven't been able to find an example that quite accomplishes this task. Is it possible to access the data members of the struct after it is cast? Why or why not? If so, how?
The . operator has higher precedence than a C-style cast. So *(struct S*)p.val is treated as *((struct S*)(p.val)), which doesn't make sense since p is a pointer and does not have members.
So you need parentheses to specify what you intended:
std::cout<< (*(struct S*)p).val << std::endl;
Or equivalently,
std::cout<< static_cast<S*>(p)->val << std::endl;
[But also: the statement *(struct S*) p = s; technically has undefined behavior before C++20, even though almost all implementations will allow it. This is because C++ has rules about when an object is created, and there was no object of type S previously at that address, and assignment does not create an object except for some cases involving union members. (Since C++20, malloc implicitly creates objects of implicit-lifetime types such as S.) A similar statement that does not have this problem in any standard version would be new(p) S{s};.
Also also: use of malloc or void* is usually not a good idea in C++ in the first place. malloc should only be used when interfacing with a C library that requires it. Anything for which void* seems useful can probably be done more safely using templates. In a few cases a void* might be the only way to do something or "cleverly" avoid code duplication or something, but still use it sparingly and always with extreme caution.]
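To tie the points above together, here is a minimal sketch that creates the object with placement new and then reads the member through the void* in both spellings. The helper name read_val_through_void_ptr and the -1 failure value are assumptions made for illustration; they are not part of the question.

```cpp
#include <cstdlib>   // std::malloc, std::free
#include <new>       // placement new

struct S
{
    int val;
};

// Hypothetical helper: builds an S in malloc'ed storage and reads val back
// through the void*. Returns -1 if the allocation fails (an assumption made
// for this sketch only).
int read_val_through_void_ptr(int value)
{
    S s;
    s.val = value;

    void* p = std::malloc(sizeof(S));
    if (p == nullptr)
        return -1;

    new (p) S(s);                       // placement new starts the lifetime of an S at p

    int a = (*(S*)p).val;               // parentheses make the cast apply before '.'
    int b = static_cast<S*>(p)->val;    // same access, spelled with static_cast and ->

    static_cast<S*>(p)->~S();           // trivial for S, shown for completeness
    std::free(p);
    return (a == b) ? a : -1;
}
```

Calling read_val_through_void_ptr(5) should give back 5 through either access path.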
OK, I have been muddling through Stack on the particulars of void*, along with books like The C Programming Language (K&R) and The C++ Programming Language (Stroustrup). What have I learned? That void* is a generic pointer with no type attached. It must be cast to a concrete type before use, and printing a void* just yields the address.
What else do I know? void* can't be dereferenced, and so far it remains the one item in C/C++ about which I have found much written but little understanding imparted.
I understand that it must be cast such as *(char*)void* but what makes no sense to me for a generic pointer is that I must somehow already know what type I need in order to grab a value. I'm a Java programmer; I understand generic types but this is something I struggle with.
So I wrote some code
typedef struct node
{
void* data;
node* link;
}Node;
typedef struct list
{
Node* head;
}List;
Node* add_new(void* data, Node* link);
void show(Node* head);
Node* add_new(void* data, Node* link)
{
Node* newNode = new Node();
newNode->data = data;
newNode->link = link;
return newNode;
}
void show(Node* head)
{
while (head != nullptr)
{
std::cout << head->data;
head = head->link;
}
}
int main()
{
List list;
list.head = nullptr;
list.head = add_new("My Name", list.head);
list.head = add_new("Your Name", list.head);
list.head = add_new("Our Name", list.head);
show(list.head);
fgetc(stdin);
return 0;
}
I'll handle the memory deallocation later. Assuming I have no understanding of the type stored in void*, how do I get the value out? This implies I already need to know the type, which reveals nothing about the generic nature of void*. I can follow what is written here, but I still don't understand it.
Why am I expecting void* to cooperate and the compiler to automatically cast out the type that is hidden internally in some register on the heap or stack?
I'll handle the memory deallocation later. Assuming I have no understanding of the type stored in void*, how do I get the value out?
You can't. You must know the valid types that the pointer can be cast to before you can dereference it.
Here are a couple of options for using a generic type:
If you are able to use a C++17 compiler, you may use std::any.
If you are able to use the boost libraries, you may use boost::any.
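As a rough illustration of the first option, the sketch below stores values in a std::any and gets them back out by asking for a specific type. The function name describe is hypothetical. Note that you still have to name the candidate types yourself; std::any only remembers which one is actually stored.

```cpp
#include <any>
#include <string>

// Hypothetical helper: reports what is stored in an std::any, trying a fixed
// set of candidate types. any_cast on a pointer returns nullptr on a mismatch.
std::string describe(const std::any& a)
{
    if (const int* i = std::any_cast<int>(&a))
        return "int: " + std::to_string(*i);
    if (const std::string* s = std::any_cast<std::string>(&a))
        return "string: " + *s;
    return "unknown type";
}
```

For example, describe(std::any(5)) takes the int branch, while a stored double falls through to "unknown type".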
Unlike Java, C and C++ work directly with memory pointers, with no encapsulation whatsoever. The void * type means the variable is an address in memory, and anything can be stored there. With a type like int * you tell the compiler what you are referring to. The compiler then also knows the size of the type (say 4 bytes for int) and its alignment, so the address will normally be a multiple of 4 in that case. On top of that, if you give the compiler the type, it performs consistency checks at compile time, not afterwards. None of this happens with void *.
In a nutshell, you are working on bare metal. The types are compiler directives and do not carry runtime information. Nor does the compiler track the objects you create dynamically. An allocation is merely a segment of memory where you can store anything.
The main reason to use void* is that different things may be pointed at. Thus, I may pass in an int* or Node* or anything else. But unless you know either the type or the length, you can't do anything with it.
But if you know the length, you can handle the memory pointed at without knowing the type. Casting it as a char* is used because it is a single byte, so if I have a void* and a number of bytes, I can copy the memory somewhere else, or zero it out.
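A small sketch of that idea, assuming only a pointer and a byte count are known. The names copy_bytes and zero_bytes are made up for this example; in real code std::memcpy and std::memset already do this work.

```cpp
#include <cstddef>   // std::size_t
#include <cstring>   // std::memset

// Hypothetical helper: copies n bytes without knowing the pointed-to type.
// The void* is cast to a byte pointer so it can be indexed one byte at a time.
void copy_bytes(void* dst, const void* src, std::size_t n)
{
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i)
        d[i] = s[i];
}

// Hypothetical helper: zeroes n bytes starting at p.
void zero_bytes(void* p, std::size_t n)
{
    std::memset(p, 0, n);
}
```

Both helpers work for an int, a struct of plain data, or a raw buffer, because they only ever touch bytes. They are only appropriate for trivially copyable types.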
Additionally, if it is a pointer to a class, but you don't know if it is a parent or inherited class, you may be able to assume one and find out a flag inside the data which tells you which one. But no matter what, when you want to do much beyond passing it to another function, you need to cast it as something. char* is just the easiest single byte value to use.
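The flag idea might look like the sketch below. It uses plain structs rather than real inheritance, and every name in it (Kind, BaseRecord, DerivedRecord, extra_or_zero) is hypothetical. It relies on the flag being the first thing in every record, so that the start of any record can be read as a BaseRecord.

```cpp
enum class Kind { Base, Derived };

struct BaseRecord
{
    Kind kind;   // flag stored at the start of every record
    int id;
};

struct DerivedRecord
{
    BaseRecord base;   // first member, so the record starts with a BaseRecord
    double extra;
};

// Hypothetical consumer: assumes p points at one of the two record types,
// reads the flag, and only then casts to the more specific type.
double extra_or_zero(void* p)
{
    BaseRecord* b = static_cast<BaseRecord*>(p);
    if (b->kind == Kind::Derived)
        return static_cast<DerivedRecord*>(p)->extra;
    return 0.0;
}
```

The compiler cannot check that the flag is set correctly; that remains the responsibility of whoever creates the records.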
Your confusion comes from the habit of working with Java programs. Java code is a set of instructions for a virtual machine, where the role of RAM is played by a sort of database that stores the name, type, size and data of each object. The language you are learning now is meant to be compiled into instructions for the CPU, with the same organization of memory as the underlying OS. The memory model used by C and C++ is an abstraction built on top of the most popular OSes, chosen so that code works efficiently once compiled for that platform and OS. Naturally, that organization doesn't include stored data about types, except for the well-known RTTI in C++.
In your case RTTI cannot be used directly, unless you create a wrapper around your naked pointer that stores the data.
In fact, the C++ standard library contains a vast collection of container class templates that are usable and portable, since they are defined by the ISO standard. A large part of the standard is a description of this library, often referred to as the STL. Using these containers is preferable to working with naked pointers, unless you intend to create your own container for some reason. For this particular task, the std::any class was only added in C++17; it was previously available in the Boost library. Naturally, it is possible to reimplement it or, in some cases, to replace it with std::variant.
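When the set of possible types is known up front, the std::variant route could be sketched as follows. The alias Value and the function to_text are assumptions for illustration only. Unlike void*, the variant records which alternative it holds, and std::visit dispatches on it.

```cpp
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical closed set of types a list node might hold.
using Value = std::variant<int, double, std::string>;

// Hypothetical helper: converts whichever alternative is stored into text.
std::string to_text(const Value& v)
{
    return std::visit([](const auto& x) -> std::string {
        using T = std::decay_t<decltype(x)>;
        if constexpr (std::is_same_v<T, std::string>)
            return x;                    // already text
        else
            return std::to_string(x);    // int or double
    }, v);
}
```

A list node could then store a Value instead of a void* data member, and a show function would print to_text(node->data) without any casts.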
Assuming I have no understanding of the type stored in void*, how do I get the value out
You don't.
What you can do is record the type stored in the void*.
In C, void* is used to pass around a binary chunk of data that points at something through one layer of abstraction, and receive it at the other end, casting it back to the type that the code knows it will be passed.
void do_callback( void(*pfun)(void*), void* pdata ) {
pfun(pdata);
}
void print_int( void* pint ) {
printf( "%d", *(int*)pint );
}
int main() {
int x = 7;
do_callback( print_int, &x );
}
Here, we forget the type of &x and pass it through do_callback.
It is later passed to code inside do_callback or elsewhere that knows that the void* is actually an int*. So it casts it back and uses it as an int.
The void* and the consumer void(*)(void*) are coupled. The above code is "provably correct", but the proof does not lie in the type system; instead, it depends on the fact we only use that void* in a context that knows it is an int*.
In C++ you can use void* similarly. But you can also get fancy.
Suppose you want a pointer to anything printable. Something is printable if it can be << to a std::ostream.
struct printable {
void const* ptr = 0;
void(*print_f)(std::ostream&, void const*) = 0;
printable() {}
printable(printable&&)=default;
printable(printable const&)=default;
printable& operator=(printable&&)=default;
printable& operator=(printable const&)=default;
template<class T,std::size_t N>
printable( T(&t)[N] ):
ptr( t ),
print_f( []( std::ostream& os, void const* pt) {
T* ptr = (T*)pt;
for (std::size_t i = 0; i < N; ++i)
os << ptr[i];
})
{}
template<std::size_t N>
printable( char(&t)[N] ):
ptr( t ),
print_f( []( std::ostream& os, void const* pt) {
os << (char const*)pt;
})
{}
template<class T,
std::enable_if_t<!std::is_same<std::decay_t<T>, printable>{}, int> =0
>
printable( T&& t ):
ptr( std::addressof(t) ),
print_f( []( std::ostream& os, void const* pt) {
os << *(std::remove_reference_t<T>*)pt;
})
{}
friend
std::ostream& operator<<( std::ostream& os, printable self ) {
self.print_f( os, self.ptr );
return os;
}
explicit operator bool()const{ return print_f; }
};
What I just did is a technique called "type erasure" in C++ (vaguely similar to Java type erasure).
void send_to_log( printable p ) {
std::cerr << p;
}
Here we created an ad-hoc "virtual" interface to the concept of printing on a type.
The type need not support any actual interface (no binary layout requirements), it just has to support a certain syntax.
We create our own virtual dispatch table system for an arbitrary type.
This is used in the C++ standard library. In C++11 there is std::function<Signature>, and in C++17 there is std::any.
std::any is a void* that knows how to destroy and copy its contents, and if you know the type you can cast it back to the original type. You can also query it and ask whether it holds a specific type.
Mixing std::any with the above type-erasure technique lets you create regular types (that behave like values, not references) with arbitrary duck-typed interfaces.
I noticed that std::for_each requires its iterators to meet the InputIterator requirement, which in turn requires Iterator and then Copy{Constructible,Assignable}.
That's not the only thing, std::for_each actually uses the copy constructor (cc) (not assignment as far as my configuration goes). That is, deleting the cc from the iterator will result in:
error: use of deleted function ‘some_iterator::some_iterator(const some_iterator&)’
Why does std::for_each need a cc? I found this particularly inconvenient, since I created an iterator which recursively iterates through files in a folder, keeping track of the files and folders on a queue. This means that the iterator has a queue data member, which would also have to be copied if the cc is used: that is unnecessarily inefficient.
The strange thing is that the cc is not called in this simple example:
#include <iostream>
#include <iterator>
#include <algorithm>
class infinite_5_iterator
:
public std::iterator<std::input_iterator_tag, int>
{
public:
infinite_5_iterator() = default;
infinite_5_iterator(infinite_5_iterator const &) {std::cout << "copy constr "; }
infinite_5_iterator &operator=(infinite_5_iterator const &) = delete;
int operator*() { return 5; }
infinite_5_iterator &operator++() { return *this; }
bool operator==(infinite_5_iterator const &) const { return false; }
bool operator!=(infinite_5_iterator const &) const { return true; }
};
int main() {
std::for_each(infinite_5_iterator(), infinite_5_iterator(),
[](int v) {
std::cout << v << ' ';
}
);
}
source: http://ideone.com/YVHph8
It is, however, needed at compile time. Why does std::for_each need to copy construct the iterator, and when is this done? Isn't this extremely inefficient?
NOTE: I'm talking about the cc of the iterator, not of its elements, as is done here: unexpected copies with foreach over a map
EDIT: Note that the standard does not state that the copy constructor is called at all; it just specifies the number of times f is called. May I then assume that the cc is not called at all? Why is the use of operator++ and operator* and cc not specified, but the use of f is?
You have simply fallen victim to a specification that has evolved in bits and pieces over decades. The concept of InputIterator was invented a long time before the notion of move-only types, or movable types was conceived.
In hindsight I would love to declare that InputIterator need not be copyable. This would mesh perfectly with its single-pass behavior. But I also fear that such a change would have overwhelming backwards compatibility problems.
In addition to the flawed iterator concepts as specified in the standard, about a decade ago, in an attempt to be helpful, the gcc std::lib (libstdc++) started imposing "concepts" on things like InputIterator in the std-algorithms. I.e. because the standard says:
Requires: InputIterator shall satisfy the requirements of an input iterator (24.2.3).
then "concept checks" were inserted into the std-algorithms that require InputIterator to meet all of the requirements of input iterator whether or not the algorithm actually used all of those requirements. And in this case, it is the concept check, not the actual algorithm, that is requiring your iterator to be CopyConstructible.
<sigh>
If you write your own for_each algorithm, it is trivial to do so without requiring your iterators to be CopyConstructible or CopyAssignable (if supplied with rvalue iterator arguments):
template <class InputIterator, class Function>
inline
Function
for_each(InputIterator first, InputIterator last, Function f)
{
for (; first != last; ++first)
f(*first);
return f;
}
And for your use case I recommend either doing that, or simply writing your own loop.
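To illustrate, here is a sketch of such a hand-written loop driven by an iterator whose copy operations are deleted. The names counting_iterator, my_for_each and sum_range are hypothetical. The iterators are passed as rvalues, so only the move constructor is ever needed.

```cpp
// Hypothetical move-only input iterator that counts upward from a start value.
class counting_iterator
{
    int value_;

public:
    explicit counting_iterator(int v) : value_(v) {}

    counting_iterator(counting_iterator const&) = delete;
    counting_iterator& operator=(counting_iterator const&) = delete;
    counting_iterator(counting_iterator&&) = default;
    counting_iterator& operator=(counting_iterator&&) = default;

    int operator*() const { return value_; }
    counting_iterator& operator++() { ++value_; return *this; }
    bool operator!=(counting_iterator const& o) const { return value_ != o.value_; }
};

// Same shape as the algorithm above, renamed to avoid clashing with std::for_each.
template <class InputIterator, class Function>
Function my_for_each(InputIterator first, InputIterator last, Function f)
{
    for (; first != last; ++first)
        f(*first);
    return f;
}

// Hypothetical usage: sums the half-open range [from, to).
int sum_range(int from, int to)
{
    int sum = 0;
    my_for_each(counting_iterator(from), counting_iterator(to),
                [&sum](int v) { sum += v; });
    return sum;
}
```

Passing named iterator objects instead of temporaries would require std::move at the call site, since the parameters are taken by value.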
I am trying to learn rvalue references; as an exercise I tried to answer the following.
Is it possible to write a function that can tell (at least at runtime, better if at compile time) if the passed value is a value (non reference), a rvalue or an lvalue? for a generic type? I want to extract as much information about the type as possible.
An alternative statement of the problem could be:
Can I have a typeid-like function that can tell as much as possible about the calling expression?, for example (and ideally) if the type is T, T&, T const&, or T&&.
Currently, for example, typeid drops some information about the type, and one can do better (in the example, the const and non-const references are distinguished). But how much better than typeid can one possibly do?
This is my best attempt so far. It can't distinguish between an rvalue and a "constant" (the first and second cases in the example).
Maybe distinguishing cases 1 and 2 is not possible under any circumstances, since both are ultimately rvalues? Then the question is: even if both are rvalues, can the two cases trigger different behavior?
In any case, it seems I overcomplicated the solution, as I needed to resort to conditional rvalue casts and ended up with this nasty code that is not even 100% there.
#include<iostream>
#include<typeinfo>
#include<type_traits>
template<class T>
void qualified_generic(T&& t){
std::clog << __PRETTY_FUNCTION__ << std::endl;
std::clog
<< typeid(t).name() // ok, it drops any qualification
<< (std::is_const<typename std::remove_reference<decltype(std::forward<T>(t))>::type>::value?" const":"") // seems to detect constness right
<< (std::is_lvalue_reference<decltype(std::forward<T>(t))>::value?"&":"")
<< (std::is_rvalue_reference<decltype(std::forward<T>(t))>::value?"&&":"") // cannot distinguish between passing a constant and an rvalue expression
<< std::endl
;
}
using namespace std;
int main(){
int a = 5;
int const b = 5;
qualified_generic(5); // prints "int&&", would plain "int" be more appropriate?
qualified_generic(a+1); // prints "int&&" ok
qualified_generic(a); // print "int&", ok
qualified_generic(b); // print "int const&", ok
}
Maybe the ultimate solution to distinguish between the cases will involve detecting a constexpr.
UPDATE: I found this talk by Scott Meyers where he claims that "The Standard sometimes requires typeid to give the 'wrong' answer". http://vimeo.com/97344493 about minute 44. I wonder if this is one of the cases.
UPDATE 2015: I revisited the problem using Boost TypeIndex and the result is still the same. For example using:
template<class T>
std::string qualified_generic(T&& t){
return boost::typeindex::type_id_with_cvr<decltype(t)>().pretty_name();
// or return boost::typeindex::type_id_with_cvr<T>().pretty_name();
// or return boost::typeindex::type_id_with_cvr<T&&>().pretty_name();
// or return boost::typeindex::type_id_with_cvr<T&>().pretty_name();
}
Still it is not possible to distinguish the type of 5 and a+1 in the above example.
I have been using BDS 2006 Turbo C++ for a long time now, and some of my bigger projects (CAD/CAM, 3D gfx engines and astronomic computations) occasionally throw an exception (for example once in 3-12 months of 24/7 heavy duty usage). After extensive debugging I found this:
//code1:
struct _s { int i; }; // any struct
_s *s=new _s[1024]; // dynamic allocation
delete[] s; // free up memory
This code is usually inside a template where _s can also be a class, so delete[] should work properly. However, delete[] does not work properly for structs (classes look OK). No exception is thrown and the memory is freed, but it somehow damages the memory manager's allocation tables, and after this any new allocation can be wrong (new can create allocations that overlap already allocated space or even unallocated space, hence the occasional exceptions).
I have found that if I add an empty destructor to _s, then suddenly everything seems OK:
struct _s { int i; ~_s(){}; };
Well, now comes the weird part. After I applied this change to my projects, I found that the AnsiString class also has bad reallocations. For example:
//code2:
int i;
_s *dat=new _s[1024];
AnsiString txt="";
// setting of dat
for (i=0;i<1024;i++) txt+="bla bla bla\r\n";
// usage of dat
delete[] dat;
In this code dat contains some useful data, and later a txt string is built by appending lines, so txt must be reallocated a few times. Sometimes the dat data is overwritten by txt (even though they do not overlap; I think the temporary AnsiString needed to reallocate txt overlaps with dat).
So my questions are:
Am I doing something wrong in code1, code2 ?
Is there any way to avoid AnsiString (re)allocation errors ? (but still using it)
After extensive debugging (after posting question 2) I found that AnsiString does not cause the problems; they only occur while it is in use. The real problem is probably in switching between OpenGL clients. I have Open/Save dialogs with a preview for vector graphics. If I disable OpenGL usage for these VCL sub-windows, the AnsiString memory management errors disappear completely. I am not sure what the problem is (an incompatibility between MFC/VCL windows or, more likely, a mistake of mine in switching contexts; I will investigate further). The OpenGL windows concerned are:
main VCL Form + OpenGL inside Canvas client area
child of main MFC Open/Save dialog + docked preview VCL Form + OpenGL inside Canvas client area
P.S.
these errors depend on number of new/delete/delete[] usages not on the allocated sizes
both code1 and code2 errors are repetitive (for example have a parser to load complex ini file and the error occurs on the same line if the ini is not changed)
I detect these errors only in big projects (plain source code > 1MB) with combined usage of AnsiString and templates with internal dynamic allocations, but it is possible that they also exist in simpler projects and occur so rarely that I miss them.
Infected projects specs:
win32 noinstall standalone (using Win7sp1 x64 but on XPsp3 x32 behaves the same)
does not matter whether GDI or OpenGL/GLSL is used
does not matter whether device driver DLLs are used or not
no OCX,or nonstandard VCL component
no DirectX
1 Byte aligned compilation/link
do not use RTL,packages or frameworks (standalone)
Sorry for bad English/grammar ...
any help / conclusion / suggestion appreciated.
After extensive debugging I finally isolated the problem.
The memory management of BDS2006 Turbo C++ becomes corrupted after you call delete on an already deleted pointer. For example:
BYTE *dat=new BYTE[10],*tmp=dat;
delete[] dat;
delete[] tmp;
After this, memory management is not reliable ('new' can allocate already allocated space).
Of course, deleting the same pointer twice is a bug on the programmer's side, but I have found the real cause of all my problems, which triggers this double delete without any obvious bug in the source code. See this code:
//---------------------------------------------------------------------------
class test
{
public:
int siz;
BYTE *dat;
test()
{
siz=10;
dat=new BYTE[siz];
}
~test()
{
delete[] dat; // <- add breakpoint here
siz=0;
dat=NULL;
}
test& operator = (const test& x)
{
int i;
for (i=0;i<siz;i++) if (i<x.siz) dat[i]=x.dat[i];
for ( ;i<siz;i++) dat[i]=0;
return *this;
}
};
//---------------------------------------------------------------------------
test get()
{
test a;
return a; // here call a.~test();
} // here second call a.~test();
//---------------------------------------------------------------------------
void main()
{
get();
}
//---------------------------------------------------------------------------
In function get() the destructor is called twice: once for the real a and once for its copy, because I forgot to write the copy constructor. Both objects hold the same dat pointer, so it is deleted twice.
test::test(test &x);
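A sketch of the class with the missing copy constructor added could look like this. The BYTE typedef is an assumption standing in for the Windows one, and the assignment operator here reallocates to match the source size, which differs from the fixed-size copy in the original operator; treat it as one possible design rather than the only fix.

```cpp
#include <algorithm>   // std::copy

typedef unsigned char BYTE;   // assumption: stands in for the Windows BYTE typedef

class test
{
public:
    int siz;
    BYTE* dat;

    test() : siz(10), dat(new BYTE[siz]()) {}   // zero-initialized buffer

    // Copy constructor: the copy gets its own buffer instead of sharing dat.
    test(const test& x) : siz(x.siz), dat(new BYTE[x.siz])
    {
        std::copy(x.dat, x.dat + x.siz, dat);
    }

    // Copy assignment: allocate first, then release the old buffer.
    test& operator=(const test& x)
    {
        if (this != &x)
        {
            BYTE* fresh = new BYTE[x.siz];
            std::copy(x.dat, x.dat + x.siz, fresh);
            delete[] dat;
            dat = fresh;
            siz = x.siz;
        }
        return *this;
    }

    ~test() { delete[] dat; }   // each object deletes only its own buffer
};

test get()
{
    test a;
    a.dat[0] = 7;
    return a;   // copy (or move/elision) no longer ends in a double delete
}
```

With all three special members defined together, every test object owns exactly one buffer, so each destructor call frees a different allocation.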
[Edit1] further upgrades of code
OK, I have refined the initialization code for classes, structs, and even templates to fix even more bug cases. Add this code to any struct/class/template and, if needed, add functionality:
T() {}
T(const T& a) { *this=a; }
~T() {}
T* operator = (const T *a) { *this=*a; return this; }
//T* operator = (const T &a) { ...copy... return this; }
T is the struct/class name
the last operator is needed only if T uses dynamic allocations inside it; if no allocations are used, you can leave it as is
This also resolves other compiler issues like this:
Too many initializers error for a simple array in bcc32
If anyone has similar problems, I hope this helps.
Also look at traceback a pointer in c++ code mmap if you need to debug your memory allocations...