Generic Memory Allocator - memory-management

I would like to create a memory manager/allocator which allocates memory with any specified size.
I have tried using Boost and ACE memory allocator but they allocates only specif type of data
In ACE_cached_allocator if allocator is int type we can only allocate memory for int, cant get for char.
example:- i can ask any size of memory like int, char or any class/structure
please suggest any lib for such application

I think you just want malloc. You can allocate memory for an array of 20 int like this:
int *a = (int *)malloc(sizeof *a * 20);
a[0] = 1234;
a[1] = 4567;
free(a);
If you have a structure or class with type Foo it's just:
Foo *fp = (Foo *)malloc(sizeof *fp);
The answer seems simple, so did I misunderstand your question?
(Note: the casts aren't necessary in C.)

Related

vmalloc() allocates from vm_struct list

Kernel document https://www.kernel.org/doc/gorman/html/understand/understand010.html says, that for vmalloc-ing
It searches through a linear linked list of vm_structs and returns a new struct describing the allocated region.
Does that mean vm_struct list is already created while booting up, just like kmem_cache_create and vmalloc() just adjusts the page entries? If that is the case, say if I have a 16GB RAM in x86_64 machine, the whole ZONE_NORMAL i.e
16GB - ZONE_DMA - ZONE_DMA32 - slab-memory(cache/kmalloc)
is used to create vm_struct list?
That document is fairly old. It's talking about Linux 2.5-2.6. Things have changed quite a bit with those functions from what I can tell. I'll start by talking about code from kernel 2.6.12 since that matches Gorman's explanation and is the oldest non-rc tag in the Linux kernel Github repo.
The vm_struct list that the document is referring to is called vmlist. It is created here as a struct pointer:
struct vm_struct *vmlist;
Trying to figure out if it is initialized with any structs during bootup took some deduction. The easiest way to figure it out was by looking at the function get_vmalloc_info() (edited for brevity):
if (!vmlist) {
vmi->largest_chunk = VMALLOC_TOTAL;
}
else {
vmi->largest_chunk = 0;
prev_end = VMALLOC_START;
for (vma = vmlist; vma; vma = vma->next) {
unsigned long addr = (unsigned long) vma->addr;
if (addr >= VMALLOC_END)
break;
vmi->used += vma->size;
free_area_size = addr - prev_end;
if (vmi->largest_chunk < free_area_size)
vmi->largest_chunk = free_area_size;
prev_end = vma->size + addr;
}
if (VMALLOC_END - prev_end > vmi->largest_chunk)
vmi->largest_chunk = VMALLOC_END - prev_end;
}
The logic says that if the vmlist pointer is equal to NULL (!NULL), then there are no vm_structs on the list and the largest_chunk of free memory in this VMALLOC area is the entire space, hence VMALLOC_TOTAL. However, if there is something on the vmlist, then figure out the largest chunk based on the difference between the address of the current vm_struct and the end of the previous vm_struct (i.e. free_area_size = addr - prev_end).
What this tells us is that when we vmalloc, we look through the vmlist to find the absence of a vm_struct in a virtual memory area big enough to accomodate our request. Only then can it create this new vm_struct, which will now be part of the vmlist.
vmalloc will eventually call __get_vm_area(), which is where the action happens:
for (p = &vmlist; (tmp = *p) != NULL ;p = &tmp->next) {
if ((unsigned long)tmp->addr < addr) {
if((unsigned long)tmp->addr + tmp->size >= addr)
addr = ALIGN(tmp->size +
(unsigned long)tmp->addr, align);
continue;
}
if ((size + addr) < addr)
goto out;
if (size + addr <= (unsigned long)tmp->addr)
goto found;
addr = ALIGN(tmp->size + (unsigned long)tmp->addr, align);
if (addr > end - size)
goto out;
}
found:
area->next = *p;
*p = area;
By this point in the function we have already created a new vm_struct named area. This for loop just needs to find where to put the struct in the list. If the vmlist is empty, we skip the loop and immediately execute the "found" lines, making *p (the vmlist) point to our struct. Otherwise, we need to find the struct that will go after ours.
So in summary, this means that even though the vmlist pointer might be created at boot time, the list isn't necessarily populated at boot time. That is, unless there are vmalloc calls during boot or functions that explicitly add vm_structs to the list during boot as in future kernel versions (see below for kernel 6.0.9).
One further clarification for you. You asked if ZONE_NORMAL is used for the vmlist, but those are two separate memory address spaces. ZONE_NORMAL is describing physical memory whereas vm is virtual memory. There are lots of resources for explaining the difference between the two (e.g. this Stack Overflow question). The specific virtual memory address range for vmlist goes from VMALLOC_START to VMALLOC_END. In x86, those were defined as:
#define VMALLOC_START 0xffffc20000000000UL
#define VMALLOC_END 0xffffe1ffffffffffUL
For kernel version 6.0.9:
The creation of the vm_struct list is here:
static struct vm_struct *vmlist __initdata;
At this point, there is nothing on the list. But in this kernel version there are a few boot functions that may add structs to the list:
void __init vm_area_add_early(struct vm_struct *vm)
void __init vm_area_register_early(struct vm_struct *vm, size_t align)
As for vmalloc in this version, the vmlist is now only a list used during initialization. get_vm_area() now calls get_vm_area_node(), which is a NUMA ready function. From there, the logic goes deeper and is much more complicated than the linear search described above.

How Memory is allocated for member in the example

I was looking at Microsoft site about single inheritance. In the example given (code is copied at the end), I am not sure how memory is allocated to Name. Memory is allocated for 10 objects. But Name is a pointer member of the class. I guess I can assign constant string something like
DocLib[i]->Name = "Hello";
But we cannot change this string. In such situation, do I need allocate memory to even Name using new operator in the same for loop something like
DocLib[i]->Name = new char[50];
The code from Microsoft site is here:
// deriv_SingleInheritance4.cpp
// compile with: /W3
struct Document {
char *Name;
void PrintNameOf() {}
};
class PaperbackBook : public Document {};
int main() {
Document * DocLib[10]; // Library of ten documents.
for (int i = 0 ; i < 10 ; i++)
DocLib[i] = new Document;
}
Yes in short. Name is just a pointer to a char (or char array). The structure instantiation does not allocate space for this char (or array). You have to allocate space, and make the pointer(Name) point to that space. In the following case
DocLib[i]->Name = "Hello";
the memory (for "Hello") is allocated in the read only data section of the executable(on load) and your pointer just points to this location. Thats why its not modifiable.
Alternatively you could use string objects instead of char pointers.

Allocating a buffer on the heap at runtime

I am learning Rust by writing simple binary decoder.
I'm using a BufferedReader with the byteorder crate to read numbers, but I'm having problems with reading byte buffers.
I want to read byte data into buffer allocated at runtime.
Then I want to pass ownership of this buffer to a struct. When struct is no longer in use, the buffer should be deallocated.
There seems to be no way to allocate array with size determined at runtime on heap except some Vec::with_capacity() hacks. Any ideas how to implement this with proper Rust semantics?
This will create a pre-allocated mutable 500MB byte buffer of zeros stored on the heap with no need for unsafe rust:
// Correct
let mut buffer = vec![0_u8; 536870912];
Note that the following code below is not a good idea and will most likely result in a stack overflow because the buffer is created on the stack before being boxed and moved to the heap.
// Incorrect - stack used
let mut bytes: Box<[u8]> = Box::new([0_u8; 536870912])
// Incorrect - slow
let mut bytes = Vec::with_capacity(536870912);
for _ in 0..bytes.capacity() {
bytes.push(0_u8);
}
Rust is a low-level language; thus you can allocate raw memory and then fill it with objects yourself. Of course, it will require unsafe code, as all fiddling with raw memory does.
Here is a complete example:
use std::{
alloc::{self, Layout},
mem, ptr,
};
fn main() {
unsafe {
let layout = Layout::from_size_align(512 * 1024, 4 * 1024).expect("Invalid layout");
let mut raw: *mut i32 = mem::transmute(alloc::alloc(layout));
for i in 0..(512 * 1024 / 4) {
ptr::write(raw, i as i32);
raw = raw.offset(1)
}
}
}
Of course, in real code, I would just use Vec to safely manage the memory for me. It's just simpler!
I tried using box but it seems that it is experimental and I can't use it with release branch. Any ideas how to implement this with proper Rust semantics?
This is covered in The Rust Programming Language, specifically the section "Using Box<T> to Point to Data on the Heap".
Use Box::new:
fn main() {
let answer: Box<u8> = Box::new(42);
}
See also:
Allocate array onto heap with size known at runtime
Is there any way to allocate a standard Rust array directly on the heap, skipping the stack entirely?
How to allocate arrays on the heap in Rust 1.0?
Creating a fixed-size array on heap in Rust
How do I allocate an array at runtime in Rust?
Thread '<main>' has overflowed its stack when allocating a large array using Box

Does kmalloc call type constructor?

It is known that memory allocation with new calls respective type constructor and memory allocation with malloc does not. But what about kmalloc?
I am trying to develop some system calls and I need to assign memory to a structure below.
struct mailbox{
unsigned long existing_messages;
unsigned long mxid;
struct message *msg;
struct message *last_node;
mailbox(){
existing_messages = 0;
mxid = 0;
msg = NULL;
last_node = NULL;
}
};
If I allocate memory with kmalloc will it call constructor for struct mailbox at allocation time? if not what are the reasonable possible ways to get the constructor called except calling constructor explicitly. Is there any equivalent function as new for memory allocation in kernel?
kmalloc doesn't call constructor.
one way in C++ is to call "placement new".
example:
void* ptr = malloc( sizeof(T) );
T* p = new (ptr) T(); //construct object in memory
note:
you need to call destructor explicitly to avoid memory leaks in object itself,
and then call corresponding de-allocation routine for this memory.
p->~T(); //call destructor
free(ptr); //free memory

How to use arrays in program (global) scope in OpenCL

AMD OpenCL Programming Guide, Section 6.3 Constant Memory Optimization:
Globally scoped constant arrays. These arrays are initialized,
globally scoped, and in the constant address space (as specified in
section 6.5.3 of the OpenCL specification). If the size of an array is
below 64 kB, it is placed in hardware constant buffers; otherwise, it
uses global memory. An example of this is a lookup table for math
functions.
I want to use this "globally scoped constant array". I have such code in pure C
#define SIZE 101
int *reciprocal_table;
int reciprocal(int number){
return reciprocal_table[number];
}
void kernel(int *output)
{
for(int i=0; i < SIZE; i+)
output[i] = reciprocal(i);
}
I want to port it into OpenCL
__kernel void kernel(__global int *output){
int gid = get_global_id(0);
output[gid] = reciprocal(gid);
}
int reciprocal(int number){
return reciprocal_table[number];
}
What should I do with global variable reciprocal_table? If I try to add __global or __constant to it I get an error:
global variable must be declared in addrSpace constant
I don't want to pass __constant int *reciprocal_table from kernel to reciprocal. Is it possible to initialize global variable somehow? I know that I can write it down into code, but does other way exist?
P.S. I'm using AMD OpenCL
UPD Above code is just an example. I have real much more complex code with a lot of functions. So I want to make array in program scope to use it in all functions.
UPD2 Changed example code and added citation from Programming Guide
#define SIZE 2
int constant array[SIZE] = {0, 1};
kernel void
foo (global int* input,
global int* output)
{
const uint id = get_global_id (0);
output[id] = input[id] + array[id];
}
I can get the above to compile with Intel as well as AMD. It also works without the initialization of the array but then you would not know what's in the array and since it's in the constant address space, you could not assign any values.
Program global variables have to be in the __constant address space, as stated by section 6.5.3 in the standard.
UPDATE Now, that I fully understood the question:
One thing that worked for me is to define the array in the constant space and then overwrite it by passing a kernel parameter constant int* array which overwrites the array.
That produced correct results only on the GPU Device. The AMD CPU Device and the Intel CPU Device did not overwrite the arrays address. It also is probably not compliant to the standard.
Here's how it looks:
#define SIZE 2
int constant foo[SIZE] = {100, 100};
int
baz (int i)
{
return foo[i];
}
kernel void
bar (global int* input,
global int* output,
constant int* foo)
{
const uint id = get_global_id (0);
output[id] = input[id] + baz (id);
}
For input = {2, 3} and foo = {0, 1} this produces {2, 4} on my HD 7850 Device (Ubuntu 12.10, Catalyst 9.0.2). But on the CPU I get {102, 103} with either OCL Implementation (AMD, Intel). So I can not stress, how much I personally would NOT do this, because it's only a matter of time, before this breaks.
Another way to achieve this is would be to compute .h files with the host during runtime with the definition of the array (or predefine them) and pass them to the kernel upon compilation via a compiler option. This, of course, requires recompilation of the clProgram/clKernel for every different LUT.
I struggled to get this work in my own program some time ago.
I did not find any way to initialize a constant or global scope array from the host via some clEnqueueWriteBuffer or so. The only way is to write it explicitely in your .cl source file.
So here my trick to initialize it from the host is to use the fact that you are actually compiling your source from the host, which also means you can alter your src.cl file before compiling it.
First my src.cl file reads:
__constant double lookup[SIZE] = { LOOKUP }; // precomputed table (in constant memory).
double func(int idx) {
return(lookup[idx])
}
__kernel void ker1(__global double *in, __global double *out)
{
... do something ...
double t = func(i)
...
}
notice the lookup table is initialized with LOOKUP.
Then, in the host program, before compiling your OpenCL code:
compute the values of my lookup table in host_values[]
on your host, run something like:
char *buf = (char*) malloc( 10000 );
int count = sprintf(buf, "#define LOOKUP "); // actual source generation !
for (int i=0;i<SIZE;i++) count += sprintf(buf+count, "%g, ",host_values[i]);
count += sprintf(buf+count,"\n");
then read the content of your source file src.cl and place it right at buf+count.
you now have a source file with an explicitely defined lookup table that you just computed from the host.
compile your buffer with something like clCreateProgramWithSource(context, 1, (const char **) &buf, &src_sz, err);
voilĂ  !
It looks like "array" is a look-up table of sorts. You'll need to clCreateBuffer and clEnqueueWriteBuffer so the GPU has a copy of it to use.

Resources