How to free memory in the kernel - linux-kernel

I have the following lines of code in a kernel module. The KFREE line causes the OS to hang; without it, KEDR reports a memory leak. So how do I free p?
char * p = KMALLOC(100, GFP_ATOMIC);
p = dentry_path_raw(dentry, p, 100) ;
printk(KERN_EMERG " %s" , p);
KFREE(p);
KFREE is defined as follows:
# define KFREE(a) kfree((a))
KMALLOC is defined as follows:
# define KMALLOC(a,b) kmalloc((a),(b))
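For what it's worth, the likely culprit here is not kfree() itself: dentry_path_raw() builds the path string at the end of the supplied buffer and returns a pointer into that buffer, so after the reassignment p no longer holds the address that kmalloc() returned, and kfree(p) is handed an address that was never allocated. A minimal sketch of the usual fix, keeping the original allocation pointer (same names and macros as above):

#include <linux/err.h>

char *buf = KMALLOC(100, GFP_ATOMIC);
char *path;

if (!buf)
    return -ENOMEM;

/* dentry_path_raw() writes the path at the *end* of buf and returns a
 * pointer into buf, so keep buf itself around for freeing. */
path = dentry_path_raw(dentry, buf, 100);
if (!IS_ERR(path))
    printk(KERN_EMERG " %s", path);

KFREE(buf); /* free what KMALLOC() returned, not path */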

Related

for_each_possible_cpu macro in the vmalloc_init() function: does the code run on only one CPU, or on every CPU?

This is not strictly about programming, but I'll ask it here.
In the Linux start_kernel() function, inside mm_init(), I see the vmalloc_init() function. Inside that function I see code like this:
void __init vmalloc_init(void)
{
    struct vmap_area *va;
    struct vm_struct *tmp;
    int i;

    /*
     * Create the cache for vmap_area objects.
     */
    vmap_area_cachep = KMEM_CACHE(vmap_area, SLAB_PANIC);

    for_each_possible_cpu(i) {
        struct vmap_block_queue *vbq;
        struct vfree_deferred *p;

        vbq = &per_cpu(vmap_block_queue, i);
        spin_lock_init(&vbq->lock);
        INIT_LIST_HEAD(&vbq->free);
        p = &per_cpu(vfree_deferred, i);
        init_llist_head(&p->list);
        INIT_WORK(&p->wq, free_work);
    }

    /* Import existing vmlist entries. */
    for (tmp = vmlist; tmp; tmp = tmp->next) {
        va = kmem_cache_zalloc(vmap_area_cachep, GFP_NOWAIT);
        if (WARN_ON_ONCE(!va))
            continue;

        va->va_start = (unsigned long)tmp->addr;
        va->va_end = va->va_start + tmp->size;
        va->vm = tmp;
        insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
    }

    /*
     * Now we can initialize a free vmap space.
     */
    vmap_init_free_space();
    vmap_initialized = true;
}
I'm not sure whether this code runs on every CPU (core) or just on the first CPU.
If this code runs on every SMP core, how is the code inside the for_each_possible_cpu loop executed?
The SMP setup seems to be done before this function.
start_kernel() calls mm_init() which calls vmalloc_init(). Only the first (boot) CPU is active at that point. Later, start_kernel() calls arch_call_rest_init() which calls rest_init().
rest_init() creates a kernel thread for the init task with entry point kernel_init(). kernel_init() calls kernel_init_freeable(). kernel_init_freeable() eventually calls smp_init() to activate the remaining CPUs.
Every macro in the for_each_cpu family is just a wrapper around a for() loop whose iterator is a CPU index.
E.g., the core macro of the family is defined as
#define for_each_cpu(cpu, mask)                 \
    for ((cpu) = -1;                            \
        (cpu) = cpumask_next((cpu), (mask)),    \
        (cpu) < nr_cpu_ids;)
Each macro in the for_each_cpu family uses its own CPU mask, which is just a set of bits corresponding to CPU indices. E.g., the mask for the for_each_possible_cpu macro has a bit set for every CPU index that could ever be enabled in the current machine session.
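For illustration, a minimal sketch (not from the kernel source) of how such a loop is typically used: the function runs on a single CPU, and the loop just walks the indices set in the possible-CPU mask, selecting each CPU's per-CPU data by index:

#include <linux/percpu.h>
#include <linux/printk.h>

/* hypothetical per-CPU counter, for demonstration only */
static DEFINE_PER_CPU(unsigned long, hit_count);

static void dump_hit_counts(void)
{
    int cpu;

    /* Executes on whichever single CPU calls it; per_cpu(..., cpu)
     * picks out each CPU's copy of the variable by index. */
    for_each_possible_cpu(cpu)
        pr_info("cpu%d: %lu\n", cpu, per_cpu(hit_count, cpu));
}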

JCuda's JCublas2.cublasSdot: failed to use a device Pointer for the result Pointer parameter

In the source comments of JCublas2.cublasSdot, it is stated that the 'result' parameter can be a 'host or device pointer'.
public static int cublasSdot(
    cublasHandle handle,
    int n,
    Pointer x,
    int incx,
    Pointer y,
    int incy,
    Pointer result)/** host or device pointer */
{
    return checkResult(cublasSdotNative(handle, n, x, incx, y, incy, result));
}
However, I can only use a host pointer like Pointer.to(fs) with float[] fs = {0}. If I use a device pointer like 'CUdeviceptr devicePtr = new CUdeviceptr(); JCudaDriver.cuMemAlloc(devicePtr, 100 * Sizeof.FLOAT);', the program crashes with console messages like:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000007fed93af2a3, pid=9376, tid=0x0000000000003a7c
# .....
Minimizing data transfer between host and device saves time. How can I use a device pointer as the 'result' argument for this method, as well as for other JCuda methods whose result Pointer is commented with /** host or device pointer */?
CUBLAS can write the results of certain computations (like the dot product) either to host or to device memory. The target memory type has to be set explicitly, using cublasSetPointerMode.
An example of how this can be used is shown in the JCublas2PointerModes sample.
It first writes the result of the dot product computation to host memory (which is also the default when no pointer mode is set explicitly):
// Set the pointer mode to HOST
cublasSetPointerMode(handle, CUBLAS_POINTER_MODE_HOST);
// Prepare the pointer for the result in HOST memory
float hostResult[] = { -1.0f };
Pointer hostResultPointer = Pointer.to(hostResult);
// Execute the 'dot' function
cublasSdot(handle, n, deviceData, 1, deviceData, 1, hostResultPointer);
And then changes the pointer mode and calls the function again, this time writing the result to device memory:
cublasSetPointerMode(handle, CUBLAS_POINTER_MODE_DEVICE);
// Prepare the pointer for the result in DEVICE memory
Pointer deviceResultPointer = new Pointer();
cudaMalloc(deviceResultPointer, Sizeof.FLOAT);
// Execute the 'dot' function
cublasSdot(handle, n, deviceData, 1, deviceData, 1, deviceResultPointer);

How to replace a procfs-entry read function that returned both EOF and the count of bytes read?

I am working on updating our kernel drivers to work with Linux kernel 4.4.0 on Ubuntu 16.04. The drivers last worked with Linux kernel 3.9.2.
In one of the modules, we have procfs entries created to read/write the on-board fan monitoring values. Fan monitoring is used to read/write the CPU or GPU temperature/modulation, etc. values.
The module uses the following API to create procfs entries:
struct proc_dir_entry *create_proc_entry(const char *name, umode_t mode,
                                         struct proc_dir_entry *parent);
Something like:
struct proc_dir_entry *proc_entry =
    create_proc_entry("fmon_gpu_temp", 0644, proc_dir);
proc_entry->read_proc = read_proc;
proc_entry->write_proc = write_proc;
Now, read_proc is implemented something like this:
static int read_value(char *buf, char **start, off_t offset, int count,
                      int *eof, void *data)
{
    int len = 0;
    int idx = (int)data;

    if (idx == TEMP_FANCTL)
        len = sprintf(buf, "%d.%02d\n", fmon_readings[idx] / TEMP_SAMPLES,
                      fmon_readings[idx] % TEMP_SAMPLES * 100 / TEMP_SAMPLES);
    else if (idx == TEMP_CPU) {
        int i;
        len = sprintf(buf, "%d", fmon_readings[idx]);
        for (i = 0; i < FCTL_MAX_CPUS && fmon_cpu_temps[i]; i++) {
            len += sprintf(buf + len, " CPU%d=%d", i, fmon_cpu_temps[i]);
        }
        len += sprintf(buf + len, "\n");
    }
    else if (idx >= 0 && idx < READINGS_MAX)
        len = sprintf(buf, "%d\n", fmon_readings[idx]);

    *eof = 1;
    return len;
}
This read function definitely assumes that the caller has provided enough buffer space to store the temperature value; this is handled correctly in the userspace program. Also, every call to this function returns the value in its entirety, so there is no support for (or need of) subsequent reads of the same temperature value.
Plus, if I use the 'cat' program on this procfs entry from the shell, 'cat' correctly displays the value. This is supported, I think, by setting EOF to true and returning the count of bytes read.
Newer Linux kernels no longer support this API.
My question is:
How can I port this to the new procfs API structure while keeping the functionality the same: every read should return the whole value, and the 'cat' program should work fine and not go into an infinite loop?
The primary user-space interface for reading files on Linux is read(2). Its counterpart in kernel space is the .read function in struct file_operations.
Every other mechanism for reading a file in kernel space (read_proc, seq_file, etc.) is actually a (parameterized) implementation of the .read function.
The only way for the kernel to return an EOF indicator to user space is to return 0 as the number of bytes read.
Even the read_proc implementation you have for the 3.9 kernel actually implements the eof flag by returning 0 on the next invocation, and cat actually performs a second invocation of read to find that the file has ended.
(Moreover, cat performs more than 2 invocations of read: the first with 1 as the count, the second with a count equal to the page size minus 1, and the last with the remaining count.)
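For illustration, the user-space side of this contract is just a read() loop, where EOF is nothing more than read() returning 0 (a minimal cat-style sketch; error handling omitted):

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    ssize_t n;
    int fd = open("/proc/fmon_gpu_temp", O_RDONLY);

    /* read() returning 0 is the EOF indicator */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        write(STDOUT_FILENO, buf, n);

    close(fd);
    return 0;
}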
The simplest way to get a "one-shot" read implementation is to use seq_file in single_open() mode.
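A minimal sketch of what that could look like for the entry above, assuming a 4.4-era kernel (which still registers proc entries through struct file_operations; fmon_readings, proc_dir, and the index constants are from the question):

#include <linux/module.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>

/* The show callback emits the complete value once; on the next read()
 * seq_file returns 0 (EOF) by itself, so cat terminates cleanly. */
static int fmon_show(struct seq_file *m, void *v)
{
    int idx = (long)m->private;

    seq_printf(m, "%d\n", fmon_readings[idx]);
    return 0;
}

static int fmon_open(struct inode *inode, struct file *file)
{
    return single_open(file, fmon_show, PDE_DATA(inode));
}

static const struct file_operations fmon_fops = {
    .owner   = THIS_MODULE,
    .open    = fmon_open,
    .read    = seq_read,
    .llseek  = seq_lseek,
    .release = single_release,
    /* a .write callback can replace the old write_proc separately */
};

/* proc_create_data() replaces create_proc_entry(); its private-data
 * pointer takes over the old read_proc data argument. */
static int __init fmon_init(void)
{
    proc_create_data("fmon_gpu_temp", 0644, proc_dir, &fmon_fops,
                     (void *)(long)TEMP_CPU);
    return 0;
}
module_init(fmon_init);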

Declare an lldb summary-string for a sized string type

I would like to have a formatter for the built-in string type of the Nim language, but somehow I fail at providing it. Nim compiles to C, and the C representation of the string type is this:
#if defined(__GNUC__) || defined(__clang__) || defined(_MSC_VER)
# define SEQ_DECL_SIZE /* empty is correct! */
#else
# define SEQ_DECL_SIZE 1000000
#endif
typedef char NIM_CHAR;
typedef long long int NI64;
typedef NI64 NI;
struct TGenericSeq {NI len; NI reserved; };
struct NimStringDesc {TGenericSeq Sup; NIM_CHAR data[SEQ_DECL_SIZE]; };
and here is the output of what I have tried in the lldb session:
(lldb) frame variable *longstring
(NimStringDesc) *longstring = {
  Sup = (len = 9, reserved = 15)
  data = {}
}
(lldb) frame variable longstring->data
(NIM_CHAR []) longstring->data = {}
(lldb) type summary add --summary-string "${&var[0]%s}" "NIM_CHAR []"
(lldb) frame variable longstring->data
(NIM_CHAR []) longstring->data = {}
(lldb) type summary add --summary-string "${var%s}" "NIM_CHAR *"
(lldb) frame variable longstring->data
(NIM_CHAR []) longstring->data = {}
(lldb) frame variable &longstring->data[0]
(NIM_CHAR *) &[0] = 0x00007ffff7f3a060 "9 - 3 - 2"
(lldb) frame variable *longstring
(lldb) type summary add --summary-string "${var.data%s}" "NimStringDesc"
(lldb) frame variable *longstring
(NimStringDesc) *longstring = NIM_CHAR [] # 0x7ffff7f3a060
(lldb) type summary add --summary-string "${&var.data[0]%s}" "NimStringDesc"
(lldb) frame variable *longstring
(NimStringDesc) *longstring = {
  Sup = (len = 9, reserved = 15)
  data = {}
}
(lldb)
I simply can't manage to get the output to be just data interpreted as a '\0'-terminated C string.
The summary string syntax you've tried is (by design) not as syntax-rich as C.
And since you're using a zero-sized array, I don't think we have any magic provision to treat that as a pointer-to-string. You might want to file a bug about it, but in this case, it's arguable whether it would help you. Since your string is length-encoded, it doesn't really need to be zero-terminated, and that is the only hint LLDB would understand out of the box to know when to stop reading out of a pointer to characters.
In your case, you're going to have to resort to Python formatters.
The things you need are:
the memory location of the string buffer
the length of the string buffer
a process to read memory out of
This is a very small Python snippet that does it - I'll give you enhancement suggestions as well, but let's start with the basics:
def NimStringSummary(valobj, stuff):
    l = valobj.GetChildMemberWithName('Sup').GetChildMemberWithName('len').GetValueAsUnsigned(0)
    s = valobj.GetChildMemberWithName('data').AddressOf()
    return '"%s"' % valobj.process.ReadMemory(s.GetValueAsUnsigned(0), l, lldb.SBError())
As you can see, it first reads the value of the length field;
then it takes the address of the data buffer; then it uses the process that the value comes from to read the string content, and returns it in quotes.
Now, this is a proof of concept. If you used it in production, you'd quickly run into a few issues:
What if your string buffer hasn't been initialized yet, and it claims the size of the buffer is 20 gigabytes? You're going to have to limit the size of the data you're willing to read. For string-like types it has built-in knowledge of (char*, std::string, Swift.String, ...), LLDB prints out the truncated buffer followed by ..., e.g.
(const char*) buffer = "myBufferIsVeryLong"...
What if the pointer to the data is invalid? You should check that s.GetValueAsUnsigned(0) isn't actually zero - if it is you might want to print an error message like "null buffer".
Also, here I just passed an SBError that I then ignore; it would be better to pass one and then check it.
All in all, you'd end up with something like:
import lldb
import os

def NimStringSummary(valobj, stuff):
    l = valobj.GetChildMemberWithName('Sup').GetChildMemberWithName('len').GetValueAsUnsigned(0)
    if l == 0:
        return '""'
    if l > 1024:
        l = 1024
    s = valobj.GetChildMemberWithName('data').AddressOf()
    addr = s.GetValueAsUnsigned(0)
    if addr == 0:
        return '<null buffer>'
    err = lldb.SBError()
    buf = valobj.process.ReadMemory(addr, l, err)
    if err.Fail():
        return '<error: %s>' % str(err)
    return '"%s"' % buf

def __lldb_init_module(debugger, internal_dict):
    debugger.HandleCommand("type summary add NimStringDesc -F %s.NimStringSummary" % os.path.splitext(os.path.basename(__file__))[0])
The one extra trick is the __lldb_init_module function: it is automatically called by LLDB whenever you 'command script import' a Python file. This allows you to add the 'command script import' line to your ~/.lldbinit file and automatically have the formatter picked up in all debug sessions.
Hope this helps!

View array in LLDB: equivalent of GDB's '@' operator in Xcode 4.1

I would like to view an array of elements pointed to by a pointer. In GDB this can be done by treating the pointed-to memory as an artificial array of a given length, using the operator '@' as
*pointer @ length
where length is the number of elements I want to view.
The above syntax does not work in LLDB supplied with Xcode 4.1.
Is there any way how to accomplish the above in LLDB?
There are two ways to do this in lldb.
Most commonly, you use the parray lldb command which takes a COUNT and an EXPRESSION; EXPRESSION is evaluated and should result in a pointer to memory. lldb will then print COUNT items of that type at that address. e.g.
parray 10 ptr
where ptr is of type int *.
Alternatively, it can be done by casting the pointer to a pointer-to-array.
For example, if you have a int* ptr, and you want to view it as an array of ten integers, you can do
p *(int(*)[10])ptr
Because it relies only on standard C features, this method works without any plugins or special settings. It likewise works with other debuggers like GDB or CDB, even though they also have specialized syntaxes for printing arrays.
Starting with the lldb in Xcode 8.0, there is a new built-in parray command. So you can say:
(lldb) parray <COUNT> <EXPRESSION>
to print the memory pointed to by the result of the EXPRESSION as an array of COUNT elements of the type pointed to by the expression.
If the count is stored in a variable available in the current frame, then remember you can do:
(lldb) parray `count_variable` pointer_to_malloced_array
That's a general lldb feature: any command-line argument in lldb surrounded in backticks gets evaluated as an expression that returns an integer, and the integer is then substituted for the argument before command execution.
The only way I found was via a Python scripting module:
""" File: parray.py """
import lldb
import shlex
def parray(debugger, command, result, dict):
args = shlex.split(command)
va = lldb.frame.FindVariable(args[0])
for i in range(0, int(args[1])):
print va.GetChildAtIndex(i, 0, 1)
Define a command "parray" in lldb:
(lldb) command script import /path/to/parray.py
(lldb) command script add --function parray.parray parray
Now you can use "parray variable length":
(lldb) parray a 5
(double) *a = 0
(double) [1] = 0
(double) [2] = 1.14468
(double) [3] = 2.28936
(double) [4] = 3.43404
With Xcode 4.5.1 (which may or may not help you now), you can do this in the lldb console:
(lldb) type summary add -s "${var[0-63]}" "float *"
(lldb) frame variable pointer
(float *) pointer = 0x000000010ba92950 [0.0,1.0,2.0,3.0, ... ,63.0]
This example assumes that 'pointer' is an array of 64 floats: float pointer[64];
It doesn't seem to be supported yet.
You could use the memory read function (memory read, or x for short), like
(lldb) memory read -ff -c10 `test`
to print a float ten times from that pointer. This should be the same functionality as GDB's @.
Starting from Martin R's answer, I improved it as follows:
If the pointer is not a simple variable, e.g.:
struct {
    int* at;
    size_t size;
} a;
Then "parray a.at 5" fails.
I fixed this by replacing "FindVariable" with "GetValueForVariablePath".
Now what if the elements in your array are aggregates, e.g.:
struct {
    struct { float x; float y; }* at;
    size_t size;
} a;
Then "parray a.at 5" prints: a.at->x, a.at->y, a.at[2], a.at[3], a.at[4] because GetChildAtIndex() returns members of aggregates.
I fixed this by resolving "a.at" + "[" + str(i) + "]" inside the loop instead of resolving "a.at" and then retrieving its children.
Added an optional "first" argument (usage: parray ARRAY [FIRST] COUNT), which is useful when you have a huge number of elements.
Made it run the "command script add -f parray.parray parray" at init.
Here is my modified version:
import lldb
import shlex

def parray(debugger, command, result, dict):
    args = shlex.split(command)
    if len(args) == 2:
        count = int(args[1])
        indices = range(count)
    elif len(args) == 3:
        first = int(args[1])
        count = int(args[2])
        indices = range(first, first + count)
    else:
        print('Usage: parray ARRAY [FIRST] COUNT')
        return
    for i in indices:
        print(lldb.frame.GetValueForVariablePath(args[0] + "[" + str(i) + "]"))

def __lldb_init_module(debugger, internal_dict):
    debugger.HandleCommand('command script add -f parray.parray parray')
I tried to add a comment, but that wasn't great for posting a full answer, so I made my own answer. This solves the problem of getting "No Value". You need to get the current frame: I believe lldb.frame is set at module import time, so it doesn't hold the current frame when you stop at a breakpoint if you load the module from .lldbinit. The other version would work if you imported or reloaded the script while stopped at the breakpoint. The version below should always work.
import lldb
import shlex

#lldb.command('parray', 'command script add -f parray.parray parray')
def parray(debugger, command, result, dict):
    target = debugger.GetSelectedTarget()
    process = target.GetProcess()
    thread = process.GetSelectedThread()
    frame = thread.GetSelectedFrame()
    args = shlex.split(command)
    if len(args) == 2:
        count = int(args[1])
        indices = range(count)
    elif len(args) == 3:
        first = int(args[1])
        count = int(args[2])
        indices = range(first, first + count)
    else:
        print('Usage: parray ARRAY [FIRST] COUNT')
        return
    for i in indices:
        print(frame.GetValueForVariablePath(args[0] + "[" + str(i) + "]"))
To inspect variables you can use the frame variable command (fr v is the shortest unique prefix), which has a -Z flag that does exactly what you want:
(lldb) fr v buffer -Z5
(int64_t *) buffer = 0x000000010950c000 {
  (int64_t) [0] = 0
  (int64_t) [1] = 0
  (int64_t) [2] = 0
  (int64_t) [3] = 0
  (int64_t) [4] = 0
}
Unfortunately, expression does not support that flag.
Well at that point, you may as well write your own custom C function and invoke it with:
call (int)myprint(args)
