Difference between mutual exclusion and blocked-IO in kernel programming?

I am unable to understand the difference between the following two code samples. Can anybody explain the difference between them, and also explain the difference between a semaphore and a mutex, with an example?
Mutual exclusion:
DEFINE_SEMAPHORE(mysem);

static ssize_t dev_read(struct file *file, char *buf, size_t lbuf, loff_t *ppos)
{
    int maxbytes, bytes_to_do, nbytes;

    maxbytes = SIZE - *ppos;
    if (maxbytes < lbuf)
        bytes_to_do = maxbytes;
    else
        bytes_to_do = lbuf;
    if (bytes_to_do == 0) {
        printk("reached end of device\n");
        return -ENOSPC;
    }
    if (down_interruptible(&mysem))
        return -ERESTARTSYS;
    nbytes = bytes_to_do - copy_to_user(buf, dev_buf + *ppos, bytes_to_do);
    up(&mysem);
    *ppos += nbytes;
    return nbytes;
}

static ssize_t dev_write(struct file *file, const char *buf, size_t lbuf,
                         loff_t *ppos)
{
    int maxbytes, bytes_to_do, nbytes;

    maxbytes = SIZE - *ppos;
    if (maxbytes < lbuf)
        bytes_to_do = maxbytes;
    else
        bytes_to_do = lbuf;
    if (bytes_to_do == 0) {
        printk("reached end of device\n");
        return -ENOSPC;
    }
    if (down_interruptible(&mysem))
        return -ERESTARTSYS;
    nbytes = bytes_to_do - copy_from_user(dev_buf + *ppos, buf, bytes_to_do);
    ssleep(10);
    up(&mysem);
    *ppos += nbytes;
    return nbytes;
}
Blocked IO:
init_MUTEX_LOCKED(&mysem);

static ssize_t dev_read(struct file *file, char *buf, size_t lbuf, loff_t *ppos)
{
    int maxbytes, bytes_to_do, nbytes;

    maxbytes = SIZE - *ppos;
    if (maxbytes < lbuf)
        bytes_to_do = maxbytes;
    else
        bytes_to_do = lbuf;
    if (bytes_to_do == 0) {
        printk("reached end of device\n");
        return -ENOSPC;
    }
    if (down_interruptible(&mysem))
        return -ERESTARTSYS;
    nbytes = bytes_to_do - copy_to_user(buf, dev_buf + *ppos, bytes_to_do);
    *ppos += nbytes;
    return nbytes;
}

static ssize_t dev_write(struct file *file, const char *buf, size_t lbuf,
                         loff_t *ppos)
{
    int maxbytes, bytes_to_do, nbytes;

    maxbytes = SIZE - *ppos;
    if (maxbytes < lbuf)
        bytes_to_do = maxbytes;
    else
        bytes_to_do = lbuf;
    if (bytes_to_do == 0) {
        printk("reached end of device\n");
        return -ENOSPC;
    }
    nbytes = bytes_to_do - copy_from_user(dev_buf + *ppos, buf, bytes_to_do);
    ssleep(10);
    up(&mysem);
    *ppos += nbytes;
    return nbytes;
}

A mutex behaves like a binary semaphore: it has only two states, locked and unlocked. A counting semaphore can be initialized to a count greater than one, and the number of processes that can hold it at the same time equals that initial count. (In the Linux kernel, a struct mutex must also be unlocked by the task that locked it, which is why the second snippet needs a semaphore rather than a real mutex.)
In your first snippet, whichever of read or write acquires the semaphore also releases it once its own copy is complete. The semaphore acts as a mutex, so a read and a write cannot run at the same time.
The second snippet implements blocking I/O, which solves a problem described in the book Linux Device Drivers (LDD): "what to do when there's no data yet to read, but we're not at end-of-file. The default answer is go to sleep waiting for data". Here the semaphore is initialized in the locked state. If a read arrives when there is no data, it cannot acquire the semaphore, so it goes to sleep (in short, the read is blocked). When a write arrives, it first writes to the device and then releases the semaphore. The blocked read can now acquire it and complete. Reads and writes still cannot run at the same time, but here the acquire and the release are split between the two calls, so a read cannot make progress until a write has put data into the device.

Related

sendmsg() with Unix domain socket blocks forever on Mac with specific sizes

I'm sending messages on Unix domain sockets on Mac with sendmsg(). However, it sometimes hangs forever.
I've called getsockopt(socket, SOL_SOCKET, SO_SNDBUF, ...) to get the size of the send buffer. (The default is 2048).
If I try sending a message larger than 2048 bytes, I correctly get EMSGSIZE and know I need to send a smaller message.
If I try sending a message less than 2036 bytes, the message is sent fine.
If I try sending a message between 2036 and 2048 bytes, the sendmsg call hangs forever.
What's going on here? What's the correct way to deal with this? Is it safe to just subtract 13 bytes from the maximum size I try sending, or could I run into issues if, e.g., there are other messages in the buffer already?
Here's the (simplified) code I'm using:
// Get the maximum message size
int MaxMessageSize(int socket) {
int sndbuf = 0;
socklen_t optlen = sizeof(sndbuf);
if (getsockopt(socket, SOL_SOCKET, SO_SNDBUF, &sndbuf, &optlen) < 0) {
return -1;
}
return sndbuf;
}
// Send a message
static int send_chunk(int socket, const char *data, size_t size) {
struct msghdr msg = {0};
char buf[CMSG_SPACE(0)];
memset(buf, '\0', sizeof(buf));
int iov_len = size;
if (iov_len > 512) {
int stat = send_size(socket, iov_len);
if (stat < 0) return stat;
}
char iov_buf[iov_len];
memcpy(iov_buf, data, size);
struct iovec io = {.iov_base = (void *)iov_buf, .iov_len = iov_len};
msg.msg_iov = &io;
msg.msg_iovlen = 1;
msg.msg_control = buf;
msg.msg_controllen = sizeof(buf);
struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SCM_RIGHTS;
cmsg->cmsg_len = CMSG_LEN(0);
msg.msg_controllen = CMSG_SPACE(0);
std::cerr << "Attempting to send message of size " << iov_len << std::endl;
ssize_t ret = sendmsg(socket, &msg, 0);
std::cerr << "sendmsg returned: " << ret << std::endl;
return ret;
}

Measure the time of a child process without blocking the parent

I want to measure the execution time of a child process. I can use the times function (https://linux.die.net/man/3/times), but it requires blocking the parent process with wait(). I need to keep the parent process running in parallel to store the data being generated by the child. Once the child has exited, I want to know how much CPU time was spent executing it. Something like the following:
int done = FALSE;
struct timespec ts;
struct rusage ru;
struct timeval utime;
struct timeval stime;
int main(int argc, char* argv[]){
pid_t pid1, pid2; int status;
char *args[] = {"./wp"};
struct rusage usage;
int i, j,n;
signal(SIGCHLD,reaper);
pid1 = fork();
if (pid1 == 0) { // child
printf("Its child %d\n",getpid());
printf("child %d: executing target program\n", getpid());
execv(argv[1], argv+1);
}
n = 0;
while (n<20) { //parent has to do some stuff in parallel
printf("doing something...\n"); sleep(1);
if (done){ // child exited
done = FALSE;
pid_t pid2 = wait3(&status, 0, &usage);
printf("...child of %d done executing.\n",getpid());
printf("exit code for %d is %d\n", pid2, status);
if (WIFEXITED(status))
printf("The exit status is %d\n", WEXITSTATUS(status));
utime = usage.ru_utime;
stime = usage.ru_stime;
printf("RUSAGE :ru_utime => %lld [sec] : %lld [usec], :ru_stime => %lld [sec] : %lld [usec] \n",
(int64_t)utime.tv_sec, (int64_t)utime.tv_usec,
(int64_t)stime.tv_sec, (int64_t)stime.tv_usec);
}
n++;
}
printf("got my SIGCHLD, cleaning up!\n");
signal(SIGCHLD,SIG_DFL);
return 0;
}
void reaper(int sig) {
done=TRUE; //
}

Why pam_loginuid module fails on writing to /proc/self/loginuid with -EPERM?

I found that an application using the PAM library to authenticate fails with this error:
Error writing /proc/self/loginuid: Operation not permitted
Using strace, I found that the failure is on the write to the /proc/self/loginuid file.
I inspected this further by adding some debug code to the kernel:
static ssize_t proc_loginuid_write(struct file * file, const char __user * buf,
size_t count, loff_t *ppos)
{
struct inode * inode = file_inode(file);
uid_t loginuid;
kuid_t kloginuid;
int rv;
printk(KERN_DEBUG "proc_loginuid_write\n");
printk(KERN_DEBUG "a+++ %s\n", current->comm);
printk(KERN_DEBUG "b+++ %s\n", pid_task(proc_pid(inode), PIDTYPE_PID)->comm);
printk(KERN_DEBUG "+++2++ pid = %d\n", current->pid);
printk(KERN_DEBUG "+++3++ pid = %d\n", pid_task(proc_pid(inode), PIDTYPE_PID)->pid);
rcu_read_lock();
if (current != pid_task(proc_pid(inode), PIDTYPE_PID)) {
rcu_read_unlock();
printk(KERN_ERR "proc_loginuid_write failed by permission!\n");
return -EPERM;
}
rcu_read_unlock();
if (*ppos != 0) {
/* No partial writes. */
return -EINVAL;
}
rv = kstrtou32_from_user(buf, count, 10, &loginuid);
if (rv < 0)
return rv;
/* is userspace tring to explicitly UNSET the loginuid? */
if (loginuid == AUDIT_UID_UNSET) {
kloginuid = INVALID_UID;
} else {
kloginuid = make_kuid(file->f_cred->user_ns, loginuid);
if (!uid_valid(kloginuid))
return -EINVAL;
}
rv = audit_set_loginuid(kloginuid);
if (rv < 0)
return rv;
return count;
}
With that code in place, dmesg showed:
[ 30.672242] proc_loginuid_write
[ 30.672249] a+++ testapp
[ 30.672251] b+++ testapp
[ 30.672254] +++2++ pid = 2920
[ 30.672257] +++3++ pid = 2451
[ 30.672259] proc_loginuid_write failed by permission!
The name testapp is a deliberately changed name. So it looks like the file /proc/self/loginuid was opened by the parent and is being accessed from a child thread.
I tested the same code on kernels 3.14 and 4.9: it works on 3.14 but not on 4.9. Why?

I found the solution to the problem.
The old 3.14 kernel had the CONFIG_AUDITSYSCALL option turned off in its config, so there was no /proc/self/loginuid file, and the PAM module simply does not care when that file is missing.
On the newer 4.9 kernel the option is selected automatically by CONFIG_AUDIT=y.
So the simplest solution is to turn off the CONFIG_AUDIT option. Why CONFIG_AUDITSYSCALL became a non-configurable option as the kernel evolved is a matter for another question.
Thanks!

Task switching using a queue

I'm developing my own hobby OS, and I'm stuck on a problem with the scheduler/task switching.
I planned to use a FIFO queue as the structure that holds processes, and I implemented it as a linked list.
I also decided to use the iret method to switch from one task to another (so when the OS is serving an interrupt request, just before the iret I change the ESP register in order to move to the new task).
But I have a problem.
When the OS starts, it launches two tasks:
idle
shell
With these two I have no problem.
But if I try to launch two more tasks (each with a simple printf inside), the task queue gets corrupted.
If I print the queue after that, it shows only two tasks, the two just created; idle and shell have disappeared, but the OS continues to work (I think that at some point the esp field of the new tasks was overwritten with the esp content of the shell).
The task data structure is:
typedef struct task_t{
pid_t pid;
char name[NAME_LENGTH];
void (*start_function)();
task_state status;
task_register_t *registers;
unsigned int cur_quants;
unsigned int eip;
long int esp;
unsigned int pdir;
unsigned int ptable;
struct task_t *next;
}task_t;
and the tss is:
typedef struct {
unsigned int edi; //+0
unsigned int esi; //+1
unsigned int ebp; //+2
unsigned int esp; //+3 (can be null)
unsigned int ebx; //+4
unsigned int edx; //+5
unsigned int ecx; //+6
unsigned int eax; //+7
unsigned int eip; //+8
unsigned int cs; //+9
unsigned int eflags; //+10
unsigned int end;
} task_register_t;
The scheduler function is the following:
void schedule(unsigned int *stack){
asm("cli");
if(active == TRUE){
task_t* cur_task = dequeue_task();
if(cur_task != NULL){
cur_pid = cur_task->pid;
dbg_bochs_print("#######");
dbg_bochs_print(cur_task->name);
if(cur_task->status!=NEW){
cur_task->esp=*stack;
} else {
cur_task->status=READY;
((task_register_t *)(cur_task->esp))->eip = cur_task->eip;
}
enqueue_task(cur_task->pid, cur_task);
cur_task=get_task();
if(cur_task->status==NEW){
cur_task->status=READY;
}
dbg_bochs_print(" -- ");
dbg_bochs_print(cur_task->name);
dbg_bochs_print("\n");
//load_pdbr(cur_taskp->pdir);
*stack = cur_task->esp;
} else {
enqueue_task(cur_task->pid, cur_task);
}
}
active = FALSE;
return;
asm("sti");
}
The tss is initialized with the following values:
void new_tss(task_register_t* tss, void (*func)()){
tss->eax=0;
tss->ebx=0;
tss->ecx=0;
tss->edx=0;
tss->edi =0;
tss->esi =0;
tss->cs = 8;
tss->eip = (unsigned)func;
tss->eflags = 0x202;
tss->end = (unsigned) suicide;
//tss->fine = (unsigned)end; // to put the suicide function there
return;
}
And the function that creates a new task is the following:
pid_t new_task(char *task_name, void (*start_function)()){
asm("cli");
task_t *new_task;
table_address_t local_table;
unsigned int new_pid = request_pid();
new_task = (task_t*)kmalloc(sizeof(task_t));
strcpy(new_task->name, task_name);
new_task->next = NULL;
new_task->start_function = start_function;
new_task->cur_quants=0;
new_task->pid = new_pid;
new_task->eip = (unsigned int)start_function;
new_task->esp = (unsigned int)kmalloc(STACK_SIZE) + STACK_SIZE-100;
new_task->status = NEW;
new_task->registers = (task_register_t*)new_task->esp;
new_tss(new_task->registers, start_function);
local_table = map_kernel();
new_task->pdir = local_table.page_dir;
new_task->ptable = local_table.page_table;
//new_task->pdir = 0;
//new_task->ptable = 0;
enqueue_task(new_task->pid, new_task);
//(task_list.current)->cur_quants = MAX_TICKS;
asm("sti");
return new_pid;
}
I'm sure that I just forgot something or missed some consideration, but I cannot figure out what.
Currently I'm working only in kernel mode and inside a single address space (paging is enabled, but for now I use the same page directory for all tasks).
The ISR macros are defined here:
https://github.com/inuyasha82/DreamOs/blob/master/include/processore/handlers.h
I declared four kinds of function in order to handle ISRs:
EXCEPTION
EXCEPTION_EC (an exception with an error code)
IRQ
SYSCALL
The scheduler is called by an IRQ routine, so the macro looks like this:
__asm__("INT_"#n":"\
"pushad;" \
"movl %esp, %eax;"\
"pushl %eax;"\
"call _irqinterrupt;"\
"popl %eax;"\
"movl %eax, %esp;"\
"popad;"\
"iret;")
the irq handler function is:
void _irqinterrupt(unsigned int esp){
asm("cli;");
int irqn;
irqn = get_current_irq();
IRQ_s* tmpHandler;
if(irqn>=0) {
tmpHandler = shareHandler[irqn];
if(tmpHandler!=0) {
tmpHandler->IRQ_func();
#ifdef DEBUG
printf("2 - IRQ_func: %d, %d\n", tmpHandler->IRQ_func, tmpHandler);
#endif
while(tmpHandler->next!=NULL) {
tmpHandler = tmpHandler->next;
#ifdef DEBUG
printf("1 - IRQ_func (_prova): %d, %d\n", tmpHandler->IRQ_func, tmpHandler);
#endif
if(tmpHandler!=0) tmpHandler->IRQ_func();
}
} else printf("irqn: %d\n", irqn);
}
else printf("IRQ N: %d Something arrived that I cannot handle ", irqn);
if(irqn<=8 && irqn!=2) outportb(MASTER_PORT, EOI);
else if(irqn<=16 || irqn==2){
outportb(SLAVE_PORT, EOI);
outportb(MASTER_PORT, EOI);
}
schedule(&esp);
asm("sti;");
return;
}
And these are the enqueue_task and dequeue_task functions:
void enqueue_task(pid_t pid, task_t* n_task){
n_task->next=NULL;
if(task_list.tail == NULL){
task_list.head = n_task;
task_list.tail = task_list.head;
} else {
task_list.head->next=n_task;
task_list.head = n_task;
}
}
task_t* dequeue_task(){
if(task_list.head==NULL){
return NULL;
} else {
task_t* _task;
_task = task_list.tail;
task_list.tail=_task->next;
return _task;
}
return;
}
Thanks in advance,
and let me know if you need more details!

It is hard to tell. What does the assembly part of your ISR look like? Since you can save and restore two tasks but not more, my guess is that you don't push and pop all registers properly. You do use pusha and popa in the ISR, right?
I also want to add that using cli and sti the way you have done there can be dangerous. In your ISRs, make cli the first instruction. Then you won't need sti at all, because iret restores the interrupt flag for you (it is a bit in the EFLAGS register that iret pops from the stack).
Good luck!

Printing out the name of the first entry in the imports table of a PE file

I am trying to print the name of the first entry (which I suppose is user32.dll) in the imports table of a PE file, but the program terminates unexpectedly with "cannot read memory". Can someone please explain why?
#include<iostream>
#include<Windows.h>
#include<stdio.h>
#include<WinNT.h>
int main()
{
HANDLE hFile,hFileMapping;
LPVOID lpFileBase;
LPVOID lp;
if((hFile = CreateFile(TEXT("c:\\linked list.exe"),GENERIC_READ,FILE_SHARE_READ,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,0)) == INVALID_HANDLE_VALUE)
std::cout<<"unable to open";
if((hFileMapping = CreateFileMapping(hFile,NULL,PAGE_READONLY,0,0,NULL)) == 0)
{
CloseHandle(hFile);
std::cout<<"unable to open for mapping";
}
if((lpFileBase = MapViewOfFile(hFileMapping,FILE_MAP_READ,0,0,0))== 0)
{
CloseHandle(hFile);
CloseHandle(hFileMapping);
std::cout<<"couldn't map view of file";
}
PIMAGE_DOS_HEADER pimdh;
pimdh = (PIMAGE_DOS_HEADER)lpFileBase;
PIMAGE_NT_HEADERS pimnth;
pimnth = (PIMAGE_NT_HEADERS)((char *)pimdh + pimdh->e_lfanew);
PIMAGE_SECTION_HEADER pimsh;
pimsh = (PIMAGE_SECTION_HEADER)(pimnth + 1);
int i;
for(i = 0; i<pimnth->FileHeader.NumberOfSections; i++)
{
if(!strcmp((char *)pimsh->Name,".idata"))
{
char *p;
PIMAGE_IMPORT_DESCRIPTOR pimid;
pimid = (PIMAGE_IMPORT_DESCRIPTOR)(pimnth->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress + (char *)lpFileBase);
p = (char *)((char *)lpFileBase + pimid->Name);
printf("%s",p);
};
pimsh++;
}
}

You asked a similar question a couple of days ago and looking at your code you've read two-thirds of my answer.
The other third says that pimid->Name is not a file offset, it's a Relative Virtual Address (or RVA), which you need to convert to a file offset. That's why you're getting an error. To understand RVAs read the MSDN article. For sample code to do the conversion have a look at pedump, which is referenced in the article.
