How to create a string on the heap in D? - memory-management

I'm writing a trie in D and I want each trie object have a pointer to some data, which has a non-NULL value if the node is a terminal node in the trie, and NULL otherwise. The type of the data is undetermined until the trie is created (in C this would be done with a void *, but I plan to do it with a template), which is one of the reasons why pointers to heap objects are desirable.
This requires me to eventually create my data on the heap, at which point it can be pointed to by the trie node. Experimenting, it seems like new performs this task, much as it does in C++. However for some reason, this fails with strings. The following code works:
import std.stdio;
void main() {
string *a;
string b = "hello";
a = &b;
writefln("b = %s, a = %s, *a = %s", b, a, *a);
}
/* OUTPUT:
b = hello, a = 7FFF5C60D8B0, *a = hello
*/
However, this fails:
import std.stdio;
void main() {
string *a;
a = new string();
writefln("a = %s, *a = %s", a, *a);
}
/* COMPILER FAILS WITH:
test.d(5): Error: new can only create structs, dynamic arrays or class objects, not string's
*/
What gives? How can I create strings on the heap?
P.S. If anyone writing the D compiler is reading this, the apostrophe in "string's" is a grammatical error.

Strings are always allocated on the heap. This is the same for any other dynamic array (T[], string is only an alias to type immutable(char)[]).
If you need only one pointer there are two ways to do it:
auto str = "some immutable(char) array";
auto ptr1 = &str; // return pointer to reference to string (immutable(char)[]*)
auto ptr2 = str.ptr; // return pointer to first element in string (char*)
If you need pointer to empty string, use this:
auto ptr = &"";
Remember that you can't change value of any single character in string (because they are immutable). If you want to operate on characters in string use this:
auto mutableString1 = cast(char[])"Convert to mutable."; // shouldn't be used
// or
auto mutableString2 = "Convert to mutable.".dup; // T[].dup returns mutable duplicate of array
Generally you should avoid pointers unless you absolutely know what are you doing.
From memory point of view any pointer take 4B (8B for x64 machines) of memory, but if you are using pointers to arrays then, if pointer is not null, there are 12B (+ data in array) of memory in use. 4B if from pointer and 8B are from reference to array, because array references are set of two pointers. One to first and one to last element in array.

Remember that string is just immutable(char)[]. So you don't need pointers since string is already a dynamic array.
As for creating them, you just do new char[X], not new string.

The string contents are on the heap already because strings are dynamic arrays. However, in your case, it is better to use a char dynamic array instead as you require mutability.
import std.stdio;
void main() {
char[] a = null; // redundant as dynamic arrays are initialized to null
writefln("a = \"%s\", a.ptr = %s", a, a.ptr); // prints: a = "", a.ptr = null
a = "hello".dup; // dup is required because a is mutable
writefln("a = \"%s\", a.ptr = %s", a, a.ptr); // prints: a = "hello", a.ptr = 7F3146469FF0
}
Note that you don't actually hold the array's contents, but a slice of it. The array is handled by the runtime and it is allocated on the heap.
A good reading on the subject is this article http://dlang.org/d-array-article.html

If you can only use exactly one pointer and you don't want to use the suggestions in Marmyst's answer (&str in his example creates a reference to the stack which you might not want, str.ptr loses information about the strings length as D strings are not always zero terminated) you can do this:
Remeber that you can think of D arrays (and therefore strings) as a struct with a data pointer and length member:
struct ArraySlice(T)
{
T* ptr;
size_t length;
}
So when dealing with an array the array's content is always on the heap, but the ptr/length combined type is a value type and therefore usually kept on the stack. I don't know why the compiler doesn't allow you to create that value type on the heap using new, but you can always do it manually:
import core.memory;
import std.stdio;
string* ptr;
void alloc()
{
ptr = cast(string*)GC.malloc(string.sizeof);
*ptr = "Hello World!";
}
void main()
{
alloc();
writefln("ptr=%s, ptr.ptr=%s, ptr.length=%s, *ptr=%s", ptr, ptr.ptr, ptr.length, *ptr);
}

Related

In the following example, from where does the pointer p gets the information?

vector& vector::operator = (const vector& a)
//make this vector a copy of a
{
double* p = new double [ a.sz ]; // allocate new space
copy(a.elem, a.elem+a.sz, elem); // copy elements
delete[] elem; // deallocate old space
elem = p; // now we can reset elem
sz = a.sz;
return *this; // return a self-reference
}
I thought that the third argument of std::copy() should be the pointer p, but the book (Programming principles and practice using C++ - 2nd edition) says:
"When implementing the assignment, you could consider simplifying the code by freeing the memory for the old elements before creating the copy, but it is usually a very good idea not to throw away information before you know that you can replace it. Also, if you did that, strange things would happen if you assigned a vector to itself" - Page 635 and 636.
So, the pointer elem must be third argument of std::copy() to not let the pointer be invalid for a moment. I think...
But from where does p gets the information to be put in the array it points to, to be able to do: elem = p ?
I already know copy and swap strategy exist, you don't have to explain that.
I want to comprehend what is above.
No, that is a typo.
std::copy(a.elem, a.elem+a.sz, p);
is what the code should read.

Pointer not printing char[] array

I'm writing some code to take in a string, turn it into a char array and then print back to the user (before passing to another function).
Currently the code works up to dat.toCharArray(DatTim,datsize); however, the pointer does not seem to be working as the wile loop never fires
String input = "Test String for Foo";
InputParse(input);
void InputParse (String dat)
//Write Data
datsize = dat.length()+1;
const char DatTim[datsize];
dat.toCharArray(DatTim,datsize);
//Debug print back
for(int i=0;i<datsize;i++)
{
Serial.write(DatTim[i]);
}
Serial.println();
//Debug pointer print back
const char *b;
b=*DatTim;
while (*b)
{
Serial.print(*b);
b++;
}
Foo(*DatTim);
I can't figure out the difference between what I have above vs the template code provided by Majenko
void PrintString(const char *str)
{
const char *p;
p = str;
while (*p)
{
Serial.print(*p);
p++;
}
}
The expression *DatTim is the same as DatTim[0], i.e. it gets the first character in the array and then assigns it to the pointer b (something the compiler should have warned you about).
Arrays naturally decays to pointers to their first element, that is DatTim is equal to &DatTim[0].
The simple solution is to simply do
const char *b = DatTim;

how do I allocate memory for some of the structure elements

I want to allocate memory for some elements of a structure, which are pointers to other small structs.How do I allocate and de-allocate memory in best way?
Ex:
typedef struct _SOME_STRUCT {
PDATATYPE1 PDatatype1;
PDATATYPE2 PDatatype2;
PDATATYPE3 PDatatype3;
.......
PDATATYPE12 PDatatype12;
} SOME_STRUCT, *PSOME_STRUCT;
I want to allocate memory for PDatatype1,3,4,6,7,9,11.Can I allocate memory with single malloc? or what is the best way to allocate memory for only these elements and how to free the whole memory allocated?
There is a trick that allows a single malloc, but that also has to weighed against doing a more standard multiple malloc approach.
If [and only if], once the DatatypeN elements of SOME_STRUCT are allocated, they do not need to be reallocated in any way, nor does any other code do a free on any of them, you can do the following [the assumption that PDATATYPEn points to DATATYPEn]:
PSOME_STRUCT
alloc_some_struct(void)
{
size_t siz;
void *vptr;
PSOME_STRUCT sptr;
// NOTE: this optimizes down to a single assignment
siz = 0;
siz += sizeof(DATATYPE1);
siz += sizeof(DATATYPE2);
siz += sizeof(DATATYPE3);
...
siz += sizeof(DATATYPE12);
sptr = malloc(sizeof(SOME_STRUCT) + siz);
vptr = sptr;
vptr += sizeof(SOME_STRUCT);
sptr->Pdatatype1 = vptr;
// either initialize the struct pointed to by sptr->Pdatatype1 here or
// caller should do it -- likewise for the others ...
vptr += sizeof(DATATYPE1);
sptr->Pdatatype2 = vptr;
vptr += sizeof(DATATYPE2);
sptr->Pdatatype3 = vptr;
vptr += sizeof(DATATYPE3);
...
sptr->Pdatatype12 = vptr;
vptr += sizeof(DATATYPE12);
return sptr;
}
Then, the when you're done, just do free(sptr).
The sizeof above should be sufficient to provide proper alignment for the sub-structs. If not, you'll have to replace them with a macro (e.g. SIZEOF) that provides the necessary alignment. (e.g.) for 8 byte alignment, something like:
#define SIZEOF(_siz) (((_siz) + 7) & ~0x07)
Note: While it is possible to do all this, and it is more common for things like variable length string structs like:
struct mystring {
int my_strlen;
char my_strbuf[0];
};
struct mystring {
int my_strlen;
char *my_strbuf;
};
It is debatable whether it's worth the [potential] fragility (i.e. somebody forgets and does the realloc/free on the individual elements). The cleaner way would be to embed the actual structs rather than the pointers to them if the single malloc is a high priority for you.
Otherwise, just do the the [more] standard way and do the 12 individual malloc calls and, later, the 12 free calls.
Still, it is a viable technique, particularly on small memory constrained systems.
Here is the [more] usual way involving per-element allocations:
PSOME_STRUCT
alloc_some_struct(void)
{
void *vptr;
PSOME_STRUCT sptr;
sptr = malloc(sizeof(SOME_STRUCT));
// either initialize the struct pointed to by sptr->Pdatatype1 here or
// caller should do it -- likewise for the others ...
sptr->Pdatatype1 = malloc(sizeof(DATATYPE1));
sptr->Pdatatype2 = malloc(sizeof(DATATYPE2));
sptr->Pdatatype3 = malloc(sizeof(DATATYPE3));
...
sptr->Pdatatype12 = malloc(sizeof(DATATYPE12));
return sptr;
}
void
free_some_struct(PSOME_STRUCT sptr)
{
free(sptr->Pdatatype1);
free(sptr->Pdatatype2);
free(sptr->Pdatatype3);
...
free(sptr->Pdatatype12);
free(sptr);
}
If your structure contains the others structures as elements instead of pointers, you can allocate memory for the combined structure in one shot:
typedef struct _SOME_STRUCT {
DATATYPE1 Datatype1;
DATATYPE2 Datatype2;
DATATYPE3 Datatype3;
.......
DATATYPE12 Datatype12;
} SOME_STRUCT, *PSOME_STRUCT;
PSOME_STRUCT p = (PSOME_STRUCT)malloc(sizeof(SOME_STRUCT));
// Or without malloc:
PSOME_STRUCT p = new SOME_STRUCT();

ruby c extension how to manage garbage collection between 2 objects

I have a C extension in which I have a main class (class A for example) created with the classical:
Data_Wrap_Struct
rb_define_alloc_func
rb_define_private_method(mymodule, "initialize" ...)
This A class have an instance method that generate B object. Those B objects can only be generated from A objects and have C data wrapped that depends on the data wrapped in the A instance.
I the A object are collected by the garbage collector before a B object, this could result in a Seg Fault.
How can I tell the GC to not collect a A instance while some of his B objects are still remaining. I guess I have to use rb_gc_mark or something like that. Should I have to mark the A instance each time a B object is created ??
Edit : More specifics Informations
I am trying to write a Clang extension. With clang, you first create a CXIndex, from which you can get a CXTranslationUnit, from which you can get a CXDiagnostic and or a CXCursor and so on. here is a simple illustration:
Clangc::Index#new => Clangc::Index
Clangc::Index#create_translation_unit => Clangc::TranslationUnit
Clangc::TranslationUnit#diagnostic(index) => Clangc::Diagnostic
You can see some code here : https://github.com/cedlemo/ruby-clangc
Edit 2 : A solution
The stuff to build the "b" objects with a reference to the "a" object:
typedef struct B_t {
void * data;
VALUE instance_of_a;
} B_t;
static void
c_b_struct_free(B_t *s)
{
if(s)
{
if(s->data)
a_function_to_free_the_data(s->data);
ruby_xfree(s);
}
}
static void
c_b_mark(void *s)
{
B_t *b =(B_t *)s;
rb_gc_mark(b->an_instance_of_a);
}
VALUE
c_b_struct_alloc( VALUE klass)
{
B_t * ptr;
ptr = (B_t *) ruby_xmalloc(sizeof(B_t));
ptr->data = NULL;
ptr->an_instance_of_a = Qnil;
return Data_Wrap_Struct(klass, c_b_mark, c_b_struct_free, (void *) ptr);
}
The c function that is used to build a "b" object from an "a" object:
VALUE c_A_get_b_object( VALUE self, VALUE arg)
{
VALUE mModule = rb_const_get(rb_cObject, rb_intern("MainModule"));\
VALUE cKlass = rb_const_get(mModule, rb_intern("B"));
VALUE b_instance = rb_class_new_instance(0, NULL, cKlass);
B_t *b;
Data_Get_Struct(b_instance, B_t, b);
/*
transform ruby value arg to C value c_arg
*/
b->data = function_to_fill_the_data(c_arg);
b->instance_of_a = self;
return b_instance;
}
In the Init_mainModule function:
void Init_mainModule(void)
{
VALUE mModule = rb_define_module("MainModule");
/*some code ....*/
VALUE cKlass = rb_define_class_under(mModule, "B", rb_cObject);
rb_define_alloc_func(cKlass, c_b_struct_alloc);
}
Same usage of the rb_gc_mark can be found in mysql2/ext/mysql2/client.c ( rb_mysql_client_mark function) in the project https://github.com/brianmario/mysql2
In the mark function for your B class, you should mark the A Ruby object, telling the garbage collector not to garbage collect it.
The mark function can be specified as the second argument to Data_Wrap_Struct. You might need to modify your design somehow to expose a pointer to the A objects.
Another option is to let the A object be an instance variable of the B object. You should probably do this anyway so that Ruby code can obtain the A object from the B object. Doing this would have the side effect of making the garbage collector not collect the A before the B, but you should not be relying on this side effect because it would be possible for your Ruby code to accidentally mess up the instance variable and then cause a segmentation fault.
Edit: Another option is to use reference counting of the shared C data. Then when the last Ruby object that is using that shared data gets garbage collected, you would delete the shared data. This would involve finding a good, cross-platform, thread-safe way to do reference counting so it might not be trivial.

scanf_s throws exception

Why does the following code throw an exception when getting to the second scanf_s after entering an number to put into the struct.
This by no means represents a complete linked list implementation.
Not sure how to get onto the next scanf_s when having entered the value? Any ideas?
EDIT: Updated code with suggested solution, but still get an AccessViolationException after first scanf_s
Code:
struct node
{
char name[20];
int age;
float height;
node *nxt;
};
int FillInLinkedList(node* temp)
{
int result;
temp = new node;
printf("Please enter name of the person");
result = scanf_s("%s", temp->name);
printf("Please enter persons age");
result = scanf_s("%d", &temp->age); // Exception here...
printf("Please enter persons height");
result = scanf_s("%f", &temp->height);
temp->nxt = NULL;
if (result >0)
return 1;
else return 0;
}
// calling code
int main(array<System::String ^> ^args)
{
node temp;
FillInLinkedList(&temp);
...
You are using scanf_s with incorrect parameters. Take a look at the examples in the MSDN documentation for the function. It requires that you pass in the size of the buffer after the buffer for all string or character parameters. So
result = scanf_s("%s", temp->name);
should be:
result = scanf_s("%s", temp->name, 20);
The first call to scanf_s is reading garbage off the stack because it is looking for another parameter and possibly corrupting memory.
There is no compiler error because scanf_s uses a variable argument list - the function doesn't have a fixed number of parameters so the compiler has no idea what scanf_s is expecting.
You need
result = scanf_s("%d", &temp->age);
and
result = scanf_s("%f", &temp->height);
Reason is that sscanf (and friends) requires a pointer to the output variable so it can store the result there.
BTW, you have a similar problem with the parameter temp of your function. Since you're changing the pointer (and not just the contents of what it points to), you need to pass a double pointer so that the changes will be visible outside your function:
int FillInLinkedList(node** temp)
And then of course you'll have to make the necessary changes inside the function.
scanf() stores data into variables, so you need to pass the address of the variable (or its pointer)Example:
char string[10];
int n;
scanf("%s", string); //string actually points to address of
//first element of string array
scanf("%d", &n); // &n is the address of the variable 'n'
%19c should be %s
temp->age should be &temp-age
temp->height should be &temp->height
Your compiler should be warning you
about these errors
I believe you need to pass parameters to scanf() functions by address. i.e. &temp->age
otherwise temp-age will be interpreted as a pointer, which will most likely crash your program.

Resources