How can I know the minor number at Linux module initialisation - linux-kernel

I am writing a Linux kernel module.
Here is what I've done in the module's init function:
register_chrdev(300 /* major */, "mydev", &fops);
It works fine, but I need to know the minor number.
I have read that we cannot set this minor number ourselves; it is the kernel that assigns it. If so, how can I find it out in the module's init function?
Thanks

register_chrdev calls __register_chrdev internally.
static inline int register_chrdev(unsigned int major, const char *name,
const struct file_operations *fops)
{
return __register_chrdev(major, 0, 256, name, fops);
}
If you look at the signature of __register_chrdev, it is
int __register_chrdev(unsigned int major, unsigned int baseminor,
unsigned int count, const char *name,
const struct file_operations *fops)
register_chrdev passes your major number (300), a base minor number of 0, and a count of 256. So it reserves the minor number range 0-255 for your device.
Also, in the definition of __register_chrdev, a dev_t value (which encodes both the major and the minor number) is created for your device:
err = cdev_add(cdev, MKDEV(cd->major, baseminor), count);
MKDEV(cd->major, baseminor) creates it. So the first device number (dev_t) has 0 as its minor number, and count (256) is the number of consecutive minor numbers reserved starting from there.
You can also have the major number assigned dynamically if you use alloc_chrdev_region. All you have to do is pass a pointer to a dev_t variable to alloc_chrdev_region, together with the base minor number and the count you want. It allocates a major number and stores the first device number of the range in that variable. To get the major and minor numbers in your module, you can use
major = MAJOR(dev);
minor = MINOR(dev);
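To make the encoding concrete, here is a minimal user-space sketch of how a device number packs the major and minor parts. The macros mirror the ones in the kernel's kdev_t.h (20 minor bits); kdev_t, devnum_at and print_range are hypothetical names used only for this illustration, and the code is a model of the arithmetic, not something to put in a module.

```c
#include <stdio.h>

/* User-space copies of the kernel's kdev_t.h macros (20 bits for the minor). */
typedef unsigned int kdev_t;   /* hypothetical stand-in for the kernel's dev_t */
#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)
#define MAJOR(dev) ((unsigned int)((dev) >> MINORBITS))
#define MINOR(dev) ((unsigned int)((dev) & MINORMASK))
#define MKDEV(ma, mi) (((ma) << MINORBITS) | (mi))

/* Device number of the i-th minor in a range that starts at baseminor. */
static kdev_t devnum_at(unsigned int major, unsigned int baseminor, unsigned int i)
{
    return MKDEV(major, baseminor + i);
}

/* Print the major and the first and last minor of a reserved range. */
static void print_range(unsigned int major, unsigned int baseminor, unsigned int count)
{
    kdev_t first = devnum_at(major, baseminor, 0);
    kdev_t last = devnum_at(major, baseminor, count - 1);

    printf("major %u, minors %u-%u\n", MAJOR(first), MINOR(first), MINOR(last));
}
```

Calling print_range(300, 0, 256), which are the values register_chrdev uses in the question, prints "major 300, minors 0-255".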

Related

Run a process at the same physical memory location

For a research project, I have a long-running process that uses various buffers and stack variables. I'd like to be able to launch this process multiple times such that the physical addresses backing its heap, stack, code, and static variables are equal each time. I know the exact size of all of these variables, and the size of the heap and stack stay constant during execution. To help with this, I use some helper code to translate arbitrary virtual addresses in my program to their corresponding physical addresses (sourced from here):
struct pagemap
{
union status
{
struct present
{
unsigned long long pfn : 54;
unsigned char soft_dirty : 1;
unsigned char exclusive : 1;
unsigned char zeroes : 4;
unsigned char type : 1;
unsigned char swapped : 1;
unsigned char present : 1;
} present;
struct swapped
{
unsigned char swaptype : 4;
unsigned long long offset : 50;
unsigned char soft_dirty : 1;
unsigned char exclusive : 1;
unsigned char zeroes : 4;
unsigned char type : 1;
unsigned char swapped : 1;
unsigned char present : 1;
} swapped;
} status;
} __attribute__ ((packed));
unsigned long get_pfn_for_addr(void *addr)
{
unsigned long offset;
struct pagemap pagemap;
FILE *pagemap_file = fopen("/proc/self/pagemap", "rb");
offset = (unsigned long) addr / getpagesize() * 8;
if(fseek(pagemap_file, offset, SEEK_SET) != 0)
{
fprintf(stderr, "failed to seek pagemap to offset\n");
exit(1);
}
fread(&pagemap, 1, sizeof(struct pagemap), pagemap_file);
fclose(pagemap_file);
return pagemap.status.present.pfn;
}
unsigned long virt_to_phys(void *addr)
{
unsigned long pfn, page_offset, phys_addr;
pfn = get_pfn_for_addr(addr);
page_offset = (unsigned long) addr % getpagesize();
phys_addr = (pfn << PAGE_SHIFT) + page_offset;
return phys_addr;
}
So far, my methodology has only required that a specific buffer in my program is located at the same physical address for each run. For this, I was just able to exit and relaunch the process whenever the physical address for that buffer was wrong, and I would end up with the correct location relatively quickly each time. However, I'd like to extend my experiment to ensure that my process is loaded identically in physical memory between runs, and this try-and-restart method does not seem to work well for this. Ideally, I would like to be able to set apart some small number of physical page frames that can't be allocated to another process, or to the kernel itself. Then, I would pass a flag down to do_fork that notifies the kernel that this is my special process and to allocate specific page frames to it.
My questions are:
Is there any sort of isolation mechanism already built into the kernel that would let me set aside an exclusive physical memory space that I could launch my process in?
If not, what would be a starting point for modifying the kernel to support behavior like this?
Is there any other solution (not involving either of the two above) that I could use for my desired behavior?
This is something that the kernel, using virtual memory, is tasked to abstract from you, so I'm not sure it is even possible to do (without insane amounts of work).
May I ask what experiment requires this? Perhaps if you describe what you want to achieve, it is easier to offer advice.

How could sscanf produce such strange results?

I have been fighting with this code for four days:
unsigned long baudrate = 0;
unsigned char databits = 0;
unsigned char stop_bits = 0;
char parity_text[10];
char flowctrl_text[4];
const char xformat[] = "%lu,%hhu,%hhu,%[^,],%[^,]\n";
const char xtext[] = "115200,8,1,EVEN,NFC\n";
int res = sscanf(xtext, xformat, &baudrate, &databits, &stop_bits, (char*) &parity_text, (char*) &flowctrl_text);
printf("Res: %d\r\n", res);
printf("baudrate: %lu, databits: %hhu, stop: %hhu, \r\n", baudrate, databits, stop_bits);
printf("parity: %s \r\n", parity_text);
printf("flowctrl: %s \r\n", flowctrl_text);
It returns:
Res: 5
baudrate: 115200, databits: 0, stop: 1,
parity:
flowctrl: NFC
Databits and parity are missing!
Actually, the memory under the parity variable is '\0'VEN'\0',
so it looks like the first character was somehow overwritten by sscanf.
The return value of sscanf is 5, which suggests that it was able to parse the input.
My configuration:
gccarmnoneeabi 7.2.1
Visual Studio Code 1.43.2
PlatformIO Core 4.3.1
PlatformIO Home 3.1.1
Lib ST-STM 6.0.0 (Mbed 5.14.1)
STM32F446RE (Nucleo-F446RE)
I have tried (without success):
compiling with mbed RTOS and without
variable types uint8_t, uint32_t
gccarm versions: 6.3.1, 8.3.1, 9.2.1
using another IDE (CLion+PlatformIO)
compiling on another computer (same config)
What actually helps:
making the variables static
compiling in Mbed online compiler
The behavior of sscanf as a whole is very unpredictable: changing the order or data type of the variables sometimes helps, but most often it just produces other flaws in the output.
This took me longer than I care to admit. But like most issues it ended up being very simple.
char parity_text[10];
char flowctrl_text[4];
Needs to be changed to:
char parity_text[10] = {0};
char flowctrl_text[5] = {0};
The flowctrl_text array is not large enough at size four. The final %[^,] does not stop at the newline, so it stores "NFC\n" (four characters) plus the null terminator, which needs five bytes. If you bump it to a size of 5 you should have no problem. Just to be safe I would also initialize the arrays to 0.
Once I increased the size I had 0 issues with your existing code. Let me know if this helps.
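As a further safeguard, the scansets can carry an explicit field width so that sscanf can never write past the arrays. The following is only a sketch under a few assumptions: struct serial_cfg and parse_serial_cfg are hypothetical names, the last scanset is changed to exclude the newline, and the C library in use is assumed to support %hhu.

```c
#include <stdio.h>

struct serial_cfg {            /* hypothetical container for the parsed fields */
    unsigned long baudrate;
    unsigned char databits;
    unsigned char stop_bits;
    char parity_text[10];
    char flowctrl_text[5];
};

/* Parse "baud,databits,stopbits,parity,flowctrl".
   Returns 0 if all five fields were converted, -1 otherwise. */
static int parse_serial_cfg(const char *text, struct serial_cfg *cfg)
{
    /* %9[^,] stores at most 9 characters plus '\0';
       %4[^,\n] stores at most 4 plus '\0' and stops before a newline. */
    int res = sscanf(text, "%lu,%hhu,%hhu,%9[^,],%4[^,\n]",
                     &cfg->baudrate, &cfg->databits, &cfg->stop_bits,
                     cfg->parity_text, cfg->flowctrl_text);

    return res == 5 ? 0 : -1;
}
```

The widths are always one less than the array sizes, because sscanf appends the terminating null on top of the characters it stores.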

Why can't I get process IDs greater than 65535 with 'NtQuerySystemInformation' on Win7 64-bit?

I used 'NtQuerySystemInformation' to get all the handle information, like this:
NtQuerySystemInformation(SystemHandleInformation, pHandleInfor, ulSize,NULL);//SystemHandleInformation = 16
The struct that pHandleInfor points to is:
typedef struct _SYSTEM_HANDLE_INFORMATION
{
ULONG ProcessId;
UCHAR ObjectTypeNumber;
UCHAR Flags;
USHORT Handle;
PVOID Object;
ACCESS_MASK GrantedAccess;
} SYSTEM_HANDLE_INFORMATION, *PSYSTEM_HANDLE_INFORMATION;
It works well on XP 32-bit, but on Win7 64-bit I only get the right PID when it is less than 65535. The type of ProcessId in this struct is ULONG, so I would expect it to hold values above 65535. What's wrong with it? Is there another API I could use instead?
There are two enum values for NtQuerySystemInformation to get handle info:
CNST_SYSTEM_HANDLE_INFORMATION = 16
CNST_SYSTEM_EXTENDED_HANDLE_INFORMATION = 64
And correspondingly two structs: SYSTEM_HANDLE_INFORMATION and SYSTEM_HANDLE_INFORMATION_EX.
The definitions for these structs are:
struct SYSTEM_HANDLE_INFORMATION
{
short UniqueProcessId;
short CreatorBackTraceIndex;
char ObjectTypeIndex;
char HandleAttributes; // 0x01 = PROTECT_FROM_CLOSE, 0x02 = INHERIT
short HandleValue;
size_t Object;
int GrantedAccess;
}
struct SYSTEM_HANDLE_INFORMATION_EX
{
size_t Object;
size_t UniqueProcessId;
size_t HandleValue;
int GrantedAccess;
short CreatorBackTraceIndex;
short ObjectTypeIndex;
int HandleAttributes;
int Reserved;
}
As you can see, the first struct really can only hold 16-bit process IDs.
See for example ProcessExplorer project's source file ntexapi.h for more information.
Note also that the field widths for SYSTEM_HANDLE_INFORMATION_EX in my struct definitions might be different from theirs (that is, in my definition some field widths vary depending on the bitness), but I think I tested the code both under 32-bit and 64-bit and found it to be correct.
Please recheck if necessary and let us know if you have additional info.
From Raymond Chen's article Processes, commit, RAM, threads, and how high can you go?:
I later learned that the Windows NT folks do try to keep the numerical values of process ID from getting too big. Earlier this century, the kernel team experimented with letting the numbers get really huge, in order to reduce the rate at which process IDs get reused, but they had to go back to small numbers, not for any technical reasons, but because people complained that the large process IDs looked ugly in Task Manager. (One customer even asked if something was wrong with his computer.)

What is the use of "i2c_get_clientdata" and "i2c_set_clientdata"?

I have been studying I2C driver (client) code for a while.
I see the functions "i2c_get_clientdata" and "i2c_set_clientdata" everywhere.
I have seen this related question:
Use of pointer to structure instead of creating static local copy
Sometimes I think it works like the "container_of" macro, to get a pointer to the structure.
But I still don't properly understand why and when to use it.
Below I am posting sample code where I see it used.
Could anyone help me understand why it is used there, and when we should use it in our own drivers?
struct max6875_data {
struct i2c_client *fake_client;
struct mutex update_lock;
u32 valid;
u8 data[USER_EEPROM_SIZE];
unsigned long last_updated[USER_EEPROM_SLICES];
};
static ssize_t max6875_read(struct file *filp, struct kobject *kobj,
struct bin_attribute *bin_attr,
char *buf, loff_t off, size_t count)
{
struct i2c_client *client = kobj_to_i2c_client(kobj);
struct max6875_data *data = i2c_get_clientdata(client);
int slice, max_slice;
if (off > USER_EEPROM_SIZE)
return 0;
if (off + count > USER_EEPROM_SIZE)
count = USER_EEPROM_SIZE - off;
/* refresh slices which contain requested bytes */
max_slice = (off + count - 1) >> SLICE_BITS;
for (slice = (off >> SLICE_BITS); slice <= max_slice; slice++)
max6875_update_slice(client, slice);
memcpy(buf, &data->data[off], count);
return count;
}
Those functions are used to get/set the void *driver_data pointer that is part of struct device, which is itself part of struct i2c_client.
This is a void pointer that is there for the driver to use. One would use this pointer mainly to pass driver-related data around.
That is what is happening in your example. max6875_read is a callback that receives a struct kobject. That kobject belongs to an i2c_client, which is enough to communicate with the underlying device. Using the driver_data pointer here lets you get back the driver-related data (instead of using global variables, for example).
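The pattern can be modelled in user space as follows. This is a simplified sketch, not kernel code: struct device and struct i2c_client are cut down to the one field that matters here, and my_chip_data, my_probe and my_read_valid are hypothetical names standing in for a real driver's state and callbacks.

```c
#include <stddef.h>

/* Simplified user-space stand-ins for the kernel structures. */
struct device { void *driver_data; };
struct i2c_client { struct device dev; };

/* Same idea as the kernel helpers: store and fetch an opaque per-device pointer. */
static void i2c_set_clientdata(struct i2c_client *client, void *data)
{
    client->dev.driver_data = data;
}

static void *i2c_get_clientdata(const struct i2c_client *client)
{
    return client->dev.driver_data;
}

/* Hypothetical driver state, one instance per device. */
struct my_chip_data {
    int valid;
    unsigned char cache[16];
};

/* "probe": attach the state to the client once, when the device is bound. */
static void my_probe(struct i2c_client *client, struct my_chip_data *data)
{
    data->valid = 0;
    i2c_set_clientdata(client, data);
}

/* A later callback only receives the client and recovers its state from it. */
static int my_read_valid(struct i2c_client *client)
{
    struct my_chip_data *data = i2c_get_clientdata(client);

    return data->valid;
}
```

Because each client carries its own pointer, the same driver code can serve several devices at once without any global state, which is the point the answer makes about avoiding global variables.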

How to use arrays in program (global) scope in OpenCL

AMD OpenCL Programming Guide, Section 6.3 Constant Memory Optimization:
Globally scoped constant arrays. These arrays are initialized,
globally scoped, and in the constant address space (as specified in
section 6.5.3 of the OpenCL specification). If the size of an array is
below 64 kB, it is placed in hardware constant buffers; otherwise, it
uses global memory. An example of this is a lookup table for math
functions.
I want to use this "globally scoped constant array". I have code like this in plain C:
#define SIZE 101
int *reciprocal_table;
int reciprocal(int number){
return reciprocal_table[number];
}
void kernel(int *output)
{
for(int i=0; i < SIZE; i++)
output[i] = reciprocal(i);
}
I want to port it to OpenCL:
__kernel void kernel(__global int *output){
int gid = get_global_id(0);
output[gid] = reciprocal(gid);
}
int reciprocal(int number){
return reciprocal_table[number];
}
What should I do with the global variable reciprocal_table? If I try to add __global or __constant to it, I get an error:
global variable must be declared in addrSpace constant
I don't want to pass __constant int *reciprocal_table from the kernel to reciprocal. Is it possible to initialize the global variable somehow? I know that I can write the values out in the code, but is there another way?
P.S. I'm using AMD OpenCL
UPD The code above is just an example. My real code is much more complex and has a lot of functions, so I want to put the array in program scope to use it in all functions.
UPD2 Changed example code and added citation from Programming Guide
#define SIZE 2
int constant array[SIZE] = {0, 1};
kernel void
foo (global int* input,
global int* output)
{
const uint id = get_global_id (0);
output[id] = input[id] + array[id];
}
I can get the above to compile with Intel as well as AMD. It also works without initializing the array, but then you would not know what is in the array and, since it is in the constant address space, you could not assign any values to it.
Program-scope variables have to be in the __constant address space, as stated in section 6.5.3 of the standard.
UPDATE Now that I fully understand the question:
One thing that worked for me is to define the array in the constant address space and then override it by passing a kernel parameter constant int* array with the same name.
That produced correct results only on the GPU device. The AMD CPU device and the Intel CPU device did not override the array's address. It is also probably not compliant with the standard.
Here's how it looks:
#define SIZE 2
int constant foo[SIZE] = {100, 100};
int
baz (int i)
{
return foo[i];
}
kernel void
bar (global int* input,
global int* output,
constant int* foo)
{
const uint id = get_global_id (0);
output[id] = input[id] + baz (id);
}
For input = {2, 3} and foo = {0, 1} this produces {2, 4} on my HD 7850 Device (Ubuntu 12.10, Catalyst 9.0.2). But on the CPU I get {102, 103} with either OCL implementation (AMD, Intel). So I cannot stress enough that I personally would NOT do this, because it's only a matter of time before it breaks.
Another way to achieve this would be to generate .h files on the host at runtime containing the definition of the array (or predefine them) and pass them to the kernel at compilation time via a compiler option. This, of course, requires recompiling the clProgram/clKernel for every different LUT.
I struggled to get this to work in my own program some time ago.
I did not find any way to initialize a constant or global scope array from the host via clEnqueueWriteBuffer or similar. The only way is to write it explicitly in your .cl source file.
So my trick for initializing it from the host relies on the fact that you are actually compiling your source from the host, which means you can alter your src.cl file before compiling it.
First my src.cl file reads:
__constant double lookup[SIZE] = { LOOKUP }; // precomputed table (in constant memory).
double func(int idx) {
return lookup[idx];
}
__kernel void ker1(__global double *in, __global double *out)
{
... do something ...
double t = func(i);
...
}
Notice that the lookup table is initialized with LOOKUP.
Then, in the host program, before compiling your OpenCL code:
compute the values of the lookup table in host_values[]
on your host, run something like:
char *buf = (char*) malloc( 10000 );
int count = sprintf(buf, "#define LOOKUP "); // actual source generation !
for (int i=0;i<SIZE;i++) count += sprintf(buf+count, "%g, ",host_values[i]);
count += sprintf(buf+count,"\n");
then read the content of your source file src.cl and place it right at buf+count.
you now have a source file with an explicitly defined lookup table that you just computed on the host.
compile your buffer with something like clCreateProgramWithSource(context, 1, (const char **) &buf, &src_sz, err);
voilà!
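The steps above can be sketched with bounded writes instead of a fixed 10000-byte buffer. This is a minimal sketch, not the answerer's exact code: build_source is a hypothetical helper, the kernel source is assumed to be already loaded into a string, and SIZE still has to be defined for the kernel separately.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "#define LOOKUP v0, v1, ...\n" followed by the kernel source.
   Returns a malloc'ed string the caller must free, or NULL on failure. */
static char *build_source(const double *values, size_t n, const char *kernel_src)
{
    /* Generous bound: up to 32 characters per value, plus the fixed text. */
    size_t cap = 32 * n + strlen(kernel_src) + 64;
    char *buf = malloc(cap);
    size_t len;

    if (buf == NULL)
        return NULL;

    len = (size_t)snprintf(buf, cap, "#define LOOKUP ");
    for (size_t i = 0; i < n; i++) {
        /* %.17g keeps full double precision; no comma after the last value. */
        len += (size_t)snprintf(buf + len, cap - len, "%.17g%s",
                                values[i], i + 1 < n ? ", " : "");
    }
    /* End the #define line, then append the unmodified kernel source. */
    snprintf(buf + len, cap - len, "\n%s", kernel_src);
    return buf;
}
```

The returned string can then be handed to clCreateProgramWithSource in place of the raw file contents and released with free once the program object has been created.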
It looks like "array" is a look-up table of sorts. You'll need to clCreateBuffer and clEnqueueWriteBuffer so the GPU has a copy of it to use.
