My kernel module needs to send a signal to a user-space program so that execution transfers to that program's registered signal handler.
I know how to send signals between two user-space processes, but I cannot find any example online for doing this from the kernel.
To be specific, my intended task might require an interface like the one below (once error is non-zero, the line int a = 10 should not be executed):
static int __init m_start(void){
    ...
    if(error){
        send_signal_to_userland_process(SIGILL);
    }
    int a = 10;
    ...
    return 0;
}
module_init(m_start);
Here is an example I used in the past to send a signal to user space from a hardware interrupt in kernel space:
KERNEL SPACE
#include <asm/siginfo.h> //siginfo
#include <linux/rcupdate.h> //rcu_read_lock
#include <linux/sched.h> //find_task_by_pid_type
static int pid; // Stores application PID in user space
#define SIG_TEST 44
Some "includes" and definitions are needed. Basically, you need the PID of the application in user space.
struct siginfo info;
struct task_struct *t;
memset(&info, 0, sizeof(struct siginfo));
info.si_signo = SIG_TEST;
// This is a bit of trickery: SI_QUEUE is normally used by sigqueue() from user space, and kernel space should use SI_KERNEL.
// But if SI_KERNEL is used, the real-time data is not delivered to the user-space signal handler function.
info.si_code = SI_QUEUE;
// real time signals may have 32 bits of data.
info.si_int = 1234; // Any value you want to send
rcu_read_lock();
// find the task with that pid; the pointer is only guaranteed to stay valid inside the RCU read-side section
t = pid_task(find_pid_ns(pid, &init_pid_ns), PIDTYPE_PID);
if (t != NULL) {
    if (send_sig_info(SIG_TEST, &info, t) < 0) // send signal
        printk(KERN_ERR "send_sig_info error\n");
} else {
    printk(KERN_ERR "pid_task error\n");
    //return -ENODEV;
}
rcu_read_unlock();
The previous code prepares the signal structure and sends it. Bear in mind that you need the application's PID. In my case the user-space application sends its PID through the driver's ioctl procedure:
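static long dev_ioctl(struct file *file, unsigned int cmd, unsigned long arg) {
    ioctl_arg_t args;
    switch (cmd) {
    case IOCTL_SET_VARIABLES:
        // copy_from_user() fails on a bad user pointer, so -EFAULT is the conventional error
        if (copy_from_user(&args, (ioctl_arg_t __user *)arg, sizeof(ioctl_arg_t))) return -EFAULT;
        pid = args.pid;
        break;
    default:
        return -ENOTTY;
    }
    return 0;
}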
USER SPACE
Define and implement the callback function:
#define SIG_TEST 44
void signalFunction(int n, siginfo_t *info, void *unused) {
printf("received value %d\n", info->si_int);
}
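In the main procedure (install the handler before handing the PID to the driver, so that no signal can arrive while the default action is still in place):
struct sigaction sig = { 0 }; // zero sa_mask and the remaining fields
sig.sa_sigaction = signalFunction; // Callback function
sig.sa_flags = SA_SIGINFO;
sigaction(SIG_TEST, &sig, NULL);
int fd = open("/dev/YourModule", O_RDWR);
if (fd < 0) return -1;
args.pid = getpid();
ioctl(fd, IOCTL_SET_VARIABLES, &args); // send our PID as argument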
I hope it helps, despite the fact the answer is a bit long, but it is easy to understand.
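If you want to check the user-space half before the driver exists, the handler can be exercised without any kernel code. The sketch below is my own addition, not part of the driver: it assumes Linux with glibc (where 44 is a valid real-time signal number), and roundtrip_sig_test() is a hypothetical helper name. It queues SIG_TEST to the calling process with sigqueue(), which fills si_code with SI_QUEUE just like the kernel snippet above, and returns the payload the handler received.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define SIG_TEST 44 /* same real-time signal number as in the answer */

static volatile sig_atomic_t last_value = -1;

/* Same shape as signalFunction(), but it only records the payload,
   because printf() is not async-signal-safe. */
static void on_sig_test(int signo, siginfo_t *info, void *unused)
{
    (void)signo;
    (void)unused;
    last_value = info->si_value.sival_int; /* si_int is shorthand for this field */
}

/* Hypothetical helper: queue SIG_TEST to ourselves and return the value
   the handler saw, or -1 on error. */
int roundtrip_sig_test(int value)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    union sigval sv;

    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_sig_test;
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIG_TEST, &sa, NULL) < 0)
        return -1;

    /* Block the signal so it stays pending until sigsuspend() below. */
    sigemptyset(&block);
    sigaddset(&block, SIG_TEST);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;

    sv.sival_int = value;
    if (sigqueue(getpid(), SIG_TEST, sv) < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    /* Atomically unblock SIG_TEST and wait; returns after the handler ran. */
    wait_mask = old;
    sigdelset(&wait_mask, SIG_TEST);
    sigsuspend(&wait_mask);

    sigprocmask(SIG_SETMASK, &old, NULL);
    return last_value;
}
```

Calling roundtrip_sig_test(1234) from main() should hand back the same number if the handler is wired up correctly; once that works, the only thing left to debug is the kernel side.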
You can use, e.g., kill_pid() (declared in <linux/sched.h>; newer kernels moved the declaration to <linux/sched/signal.h>) to send a signal to the specified process. To see how to form its parameters, look at the implementation of sys_kill (defined as SYSCALL_DEFINE2(kill) in kernel/signal.c).
Note that it is almost useless to send a signal from the kernel to the current process: the kernel code has to return to user space before the program ever sees the signal.
Your interface violates the spirit of Linux. Don't do that. A system call (in particular those related to your driver) should only fail with errno (see syscalls(2)); consider eventfd(2) or netlink(7) for such asynchronous kernel <-> userland communication (and expect user code to be able to poll(2) them).
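To make the eventfd(2) plus poll(2) suggestion concrete, here is a minimal user-space sketch. It assumes Linux; wait_for_event() and post_event() are hypothetical helper names of mine, and post_event() merely stands in for whatever the kernel side would do to signal the event.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for the eventfd to become
   readable, then fetch and reset its counter.
   Returns 1 on an event, 0 on timeout, -1 on error. */
int wait_for_event(int efd, int timeout_ms, uint64_t *count)
{
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc <= 0)
        return rc; /* 0 = timeout, -1 = error */
    if (!(pfd.revents & POLLIN))
        return -1;
    /* An eventfd read always transfers exactly 8 bytes. */
    if (read(efd, count, sizeof(*count)) != (ssize_t)sizeof(*count))
        return -1;
    return 1;
}

/* Stand-in for the notifying side: adds n to the eventfd counter. */
int post_event(int efd, uint64_t n)
{
    return write(efd, &n, sizeof(n)) == (ssize_t)sizeof(n) ? 0 : -1;
}
```

The user program creates the descriptor with eventfd(0, 0), hands it to the driver (for example through an ioctl), and then sits in wait_for_event(). Nothing is delivered asynchronously, so the program decides when it is ready to look at the notification.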
A kernel module could fail to be loaded. I'm not familiar with the details (I have never coded any kernel modules), but this hello2.c example suggests that the module init function can return a non-zero error code on failure.
People really expect signals (a difficult and painful concept) to behave as documented in signal(7), and what you want to do does not fit in that picture. So a well-behaved kernel module should never asynchronously send any signal to processes.
If your kernel module is not behaving nicely your users would be pissed off and won't use it.
If you want to fork your experimental kernel (e.g. for research purposes), don't expect it to be used a lot; only then could you realistically break signal behavior like you intend to do, and you could code things which don't fit into the kernel module picture (e.g. add a new syscall). See also kernelnewbies.
So my problem sounds like this.
I have some platform dependent code (embedded system) which writes to some MMIO locations that are hardcoded at specific addresses.
I compile this code with some management code inside a standard executable (mainly for testing) but also for simulation (because it takes longer to find basic bugs inside the actual HW platform).
To avoid the hardcoded pointers, I just redefine them to variables inside a memory pool, and this works really well.
The problem is that some of the MMIO locations have specific hardware behavior (W1C, for example), which makes "correct" testing hard or impossible.
These are the solutions I thought of:
1 - Somehow redefine the accesses to those registers and try to insert some immediate function to simulate the dynamic behavior. This is not really usable since there are various ways to write to the MMIO locations (pointers and so on).
2 - Somehow leave the addresses hardcoded and trap the illegal access through a seg fault, find the location that triggered it, extract exactly where the access was made, handle it and return. I am not really sure how this would work (or even whether it's possible).
3 - Use some sort of emulation. This will surely work, but it will defeat the whole purpose of running fast and native on a standard computer.
4 - Virtualization? It will probably take a lot of time to implement. I am not really sure the gain is justifiable.
Does anyone have any idea whether this can be accomplished without going too deep? Is there perhaps a way to make the compiler define a memory area for which every access generates a callback? I am not really an expert in x86/gcc stuff.
Edit: It seems that it's not really possible to do this in a platform-independent way, and since it will be Windows only, I will use the available API (which seems to work as expected). I found this question here:
Is set single step trap available on win 7?
I will put the whole "simulated" register file inside a number of pages, guard them, and trigger a callback from which I will extract all the necessary info, do my stuff, and then continue execution.
Thanks all for responding.
I think #2 is the best approach. I routinely use approach #4, but I use it to test code that is running in the kernel, so I need a layer below the kernel to trap and emulate the accesses. Since you have already put your code into a user-mode application, #2 should be simpler.
The answers to this question may provide help in implementing #2. How to write a signal handler to catch SIGSEGV?
What you really want to do, though, is to emulate the memory access and then have the segv handler return to the instruction after the access. This sample code works on Linux. I'm not sure if the behavior it is taking advantage of is undefined, though.
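// Built as C++ with g++, which defines _GNU_SOURCE by default; REG_RAX and REG_RIP need it.
#include <stdint.h>
#include <stdio.h>
#include <signal.h>
#include <ucontext.h>
#define REG_ADDR ((volatile uint32_t *)0x12340000f000ULL)
static uint32_t read_reg(volatile uint32_t *reg_addr)
{
    uint32_t r;
    // Pin the result to eax ("=a") and the address to rdi ("D") so that the load
    // is always the 2-byte encoding 8b 07, i.e. mov (%rdi), %eax.
    asm volatile("mov (%1), %0" : "=a"(r) : "D"(reg_addr) : "memory");
    return r;
}
static void segv_handler(int, siginfo_t *, void *);
int main()
{
    struct sigaction action = { 0, };
    action.sa_sigaction = segv_handler;
    action.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &action, NULL);
    // force sigsegv
    uint32_t a = read_reg(REG_ADDR);
    printf("after segv, a = %u\n", a);
    return 0;
}
static void segv_handler(int, siginfo_t *info, void *ucontext_arg)
{
    ucontext_t *ucontext = static_cast<ucontext_t *>(ucontext_arg);
    ucontext->uc_mcontext.gregs[REG_RAX] = 1234; // emulated register value
    ucontext->uc_mcontext.gregs[REG_RIP] += 2;   // skip the faulting 2-byte mov
}
The code to read the register is written in assembly, with both operands pinned to fixed registers, to ensure that the destination register and the length of the instruction are known. Letting the compiler pick the address register would not guarantee a 2-byte instruction, because some registers need an extra prefix or addressing byte.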
This is how the Windows version of prl's answer could look:
#include <stdint.h>
#include <stdio.h>
#include <windows.h>
#define REG_ADDR ((volatile uint32_t *)0x12340000f000ULL)
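static uint32_t read_reg(volatile uint32_t *reg_addr)
{
    uint32_t r;
    // Pin the address to rdi ("D") so the instruction is always 2 bytes long (8b 07).
    asm volatile("mov (%1), %0" : "=a"(r) : "D"(reg_addr) : "memory");
    return r;
}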
static LONG WINAPI segv_handler(EXCEPTION_POINTERS *);
int main()
{
SetUnhandledExceptionFilter(segv_handler);
// force an access violation
uint32_t a = read_reg(REG_ADDR);
printf("after access violation, a = %u\n", a);
return 0;
}
static LONG WINAPI segv_handler(EXCEPTION_POINTERS *ep)
{
// only handle read access violation of REG_ADDR
if (ep->ExceptionRecord->ExceptionCode != EXCEPTION_ACCESS_VIOLATION ||
ep->ExceptionRecord->ExceptionInformation[0] != 0 ||
ep->ExceptionRecord->ExceptionInformation[1] != (ULONG_PTR)REG_ADDR)
return EXCEPTION_CONTINUE_SEARCH;
ep->ContextRecord->Rax = 1234;
ep->ContextRecord->Rip += 2;
return EXCEPTION_CONTINUE_EXECUTION;
}
So, the solution (code snippet) is as follows:
First of all, I have a variable:
__attribute__ ((aligned (4096))) int g_test;
Second, inside my main function, I do the following:
AddVectoredExceptionHandler(1, VectoredHandler);
DWORD old;
VirtualProtect(&g_test, 4096, PAGE_READWRITE | PAGE_GUARD, &old);
The handler looks like this:
LONG WINAPI VectoredHandler(struct _EXCEPTION_POINTERS *ExceptionInfo)
{
    /* ULONG_PTR rather than DWORD, so the address is not truncated in a 64-bit build */
    static ULONG_PTR last_addr;
    if (ExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_GUARD_PAGE_VIOLATION) {
        last_addr = ExceptionInfo->ExceptionRecord->ExceptionInformation[1];
        ExceptionInfo->ContextRecord->EFlags |= 0x100; /* Single step to trigger the next one */
        return EXCEPTION_CONTINUE_EXECUTION;
    }
    if (ExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP) {
        DWORD old;
        /* PAGE_MASK is assumed to be defined by the project as 0xFFF (page size minus one) */
        VirtualProtect((PVOID)(last_addr & ~PAGE_MASK), 4096, PAGE_READWRITE | PAGE_GUARD, &old);
        return EXCEPTION_CONTINUE_EXECUTION;
    }
    return EXCEPTION_CONTINUE_SEARCH;
}
This is only a basic skeleton for the functionality. Basically I guard the page on which the variable resides, and I have some linked lists in which I hold pointers to the callback functions and values for the addresses in question. I check that the fault-generating address is inside my list, then I trigger the callback.
On the first guard hit, the system removes the page protection, but I can call my PRE_WRITE callback, where I can save the variable state. Because a single step is requested through EFlags, it will be followed immediately by a single-step exception (which means that the variable has been written), and I can trigger a WRITE callback. All the data required for the operation is contained inside the ExceptionInformation array.
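The lookup step ("check that the fault-generating address is inside my list") can be sketched in portable C as follows. This is only an illustration: it uses a fixed array instead of the linked lists mentioned above, and struct reg_monitor, monitor_add() and monitor_find() are hypothetical names. The reg_cbk_t values are the ones the hook in this answer switches on; the typedef itself is my assumption about how they are declared.

```c
#include <stddef.h>
#include <stdint.h>

/* Callback kinds used by the hooks in this answer (declaration assumed). */
typedef enum { REG_READ, REG_PRE_WRITE, REG_WRITE } reg_cbk_t;
typedef void (*reg_hook_t)(reg_cbk_t type);

struct reg_monitor {
    uintptr_t addr;  /* first byte of the simulated register */
    size_t size;     /* register width in bytes */
    reg_hook_t hook; /* callback such as MYREG_hook */
};

#define MAX_MONITORS 32
static struct reg_monitor monitors[MAX_MONITORS];
static size_t monitor_count;

/* Register a monitored location; returns 0 on success, -1 if the table is full. */
int monitor_add(void *reg, size_t size, reg_hook_t hook)
{
    if (monitor_count == MAX_MONITORS)
        return -1;
    monitors[monitor_count].addr = (uintptr_t)reg;
    monitors[monitor_count].size = size;
    monitors[monitor_count].hook = hook;
    monitor_count++;
    return 0;
}

/* Map a faulting address (ExceptionInformation[1]) to its monitor, or NULL. */
const struct reg_monitor *monitor_find(uintptr_t fault_addr)
{
    for (size_t i = 0; i < monitor_count; i++) {
        const struct reg_monitor *m = &monitors[i];
        if (fault_addr >= m->addr && fault_addr - m->addr < m->size)
            return m;
    }
    return NULL;
}
```

In the exception handler, a NULL result means the access touched some other variable that merely shares the guarded page, so no callback is fired and only the guard is re-armed.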
When someone tries to write to that variable:
*(int *)&g_test = 1;
A PRE_WRITE followed by a WRITE will be triggered.
When I do:
int x = *(int *)&g_test;
A READ will be issued.
In this way I can manipulate the data flow without modifying the original source code.
Note: This is intended to be used as part of a test framework and any penalty hit is deemed acceptable.
For example, W1C (Write 1 to clear) operation can be accomplished:
void MYREG_hook(reg_cbk_t type)
{
/** We need to save the pre-write state
* This is safe since we are assured to be called with
* both PRE_WRITE and WRITE in the correct order
*/
static int pre;
switch (type) {
case REG_READ: /* Called pre-read */
break;
case REG_PRE_WRITE: /* Called pre-write */
pre = g_test;
break;
case REG_WRITE: /* Called after write */
g_test = pre & ~g_test; /* W1C */
break;
default:
break;
}
}
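The W1C rule applied in the REG_WRITE case can also be written as a small pure function, which is handy when only some bits of a register are write-1-to-clear and the rest are ordinary read/write bits. This is a sketch; reg_apply_write() and the w1c_mask parameter are hypothetical and not part of my framework.

```c
#include <stdint.h>

/* before:   register contents saved in the PRE_WRITE step
   written:  value the code under test stored
   w1c_mask: bits that have write-1-to-clear semantics
   returns:  contents the simulated register should hold afterwards */
uint32_t reg_apply_write(uint32_t before, uint32_t written, uint32_t w1c_mask)
{
    uint32_t w1c_bits = before & ~written & w1c_mask; /* writing 1 clears, 0 keeps */
    uint32_t rw_bits = written & ~w1c_mask;           /* ordinary bits take the new value */
    return w1c_bits | rw_bits;
}
```

With w1c_mask set to all ones this reduces to pre & ~g_test, which is exactly what the hook above computes.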
This was also possible with seg faults on illegal addresses, but I had to issue one for each read or write and keep track of a "virtual register file", which means a bigger penalty. With guard pages I can guard only specific areas of memory, or none at all, depending on the registered monitors.
I am trying to correctly register interrupt in kernel for user interface.
Surprisingly, I did not find many examples in kernel for that.
irq handler
static irqreturn_t irq_handler(int irq, void *dev_id)
{
    struct el_irq_dev *el_irq = &el_irq_devices[0];
    printk("irq in\n");
    spin_lock(&el_irq->my_lock);
    clear_interrupt();
    some_buffer[el_irq->buf_wr] = ch;
    el_irq->buf_wr++;
    if (el_irq->buf_wr >= 16)
        el_irq->buf_wr = 0;
    spin_unlock(&el_irq->my_lock);
    wake_up_interruptible(&el_irq->pollw);
    return IRQ_HANDLED;
}
ioctl for waiting on interrupts
static long el_device_ioctl( struct file *filp,
                             unsigned int ioctl_num,
                             unsigned long ioctl_param)
{
    struct el_irq_dev *el_irq = &el_irq_devices[0];
    switch (ioctl_num) {
    case IOCTL_WAIT_IRQ: /* using ioctl (no poll) to wait on interrupt */
        /* compare the index values, not their addresses */
        if (wait_event_interruptible(el_irq->pollw, el_irq->buf_wr != el_irq->buf_rd))
            return -ERESTARTSYS; /* interrupted by a signal */
        spin_lock(&el_irq->my_lock);
        if (el_irq->buf_wr != el_irq->buf_rd)
        {
            my_value=some_buffer[el_irq->buf_rd];
            el_irq->buf_rd++;
            if (el_irq->buf_rd >= 16)
                el_irq->buf_rd = 0;
        }
        spin_unlock(&el_irq->my_lock);
        if (copy_to_user((void __user *)ioctl_param,&my_value,sizeof(my_value)))
            return -EFAULT;
        break;
    default:
        break;
    }
    return 0;
}
My question is:
Should the clearing of the interrupt in the FPGA (clear_interrupt()) come before or after the wake_up in the interrupt handler? Can we even move the clearing into the user-space handler (IOCTL_WAIT_IRQ) instead of doing it in the interrupt handler?
As you can see in the code, I am using a cyclic buffer in order to handle cases where the user-space handler misses interrupts. Is that really required, or can we assume that there are no misses?
In other words, is it reasonable to assume that there should never be missed interrupts, so that the ioctl call never sees more than one waiting interrupt? If yes, maybe I don't need a buffer mechanism between the interrupt handler and the ioctl handler.
Thank you,
Ran
Short answer.
It seems reasonable to me to clear interrupts in the user-space handler. It makes some sense to do it as late as possible, after all work has been done, as long as you check again after clearing that there is really no work left to do (some more work might have arrived just before clearing).
The user-space handler might indeed miss interrupts, e.g. if several arrive between calls to IOCTL_WAIT_IRQ. Interrupts might also get "missed" in some sense though if several pieces of work arrive before interrupts are cleared. The stack (hardware and software) should be designed so that this is not a problem. An interrupt should just signal that there is work to be done, and the user-space handler should be able to just do all outstanding work before returning.
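As a rough illustration of "do all outstanding work before returning", here is a plain C ring buffer with a drain loop. It is a user-space sketch with the locking left out (in the driver both sides would run under the spinlock), and struct ring, ring_put() and ring_drain() are hypothetical names. The depth of 16 matches the question's buffer.

```c
#include <stddef.h>

#define RING_SIZE 16 /* same depth as the buffer in the question */

struct ring {
    int data[RING_SIZE];
    unsigned wr;      /* next slot the producer (IRQ handler) fills */
    unsigned rd;      /* next slot the consumer (ioctl) reads */
    unsigned dropped; /* events lost because the ring was full */
};

/* Producer side. One slot stays unused so that wr == rd always means "empty". */
int ring_put(struct ring *r, int value)
{
    unsigned next = (r->wr + 1) % RING_SIZE;

    if (next == r->rd) {
        r->dropped++; /* make the loss visible instead of overwriting silently */
        return -1;
    }
    r->data[r->wr] = value;
    r->wr = next;
    return 0;
}

/* Consumer side: take everything that is pending, not just one entry. */
size_t ring_drain(struct ring *r, int *out, size_t max)
{
    size_t n = 0;

    while (r->rd != r->wr && n < max) {
        out[n++] = r->data[r->rd];
        r->rd = (r->rd + 1) % RING_SIZE;
    }
    return n;
}
```

One wake-up may then correspond to several entries, and the dropped counter tells you whether the consumer is falling behind, which answers the "can we assume no misses" question with data from the running system.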
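You should probably be using spin_lock_irqsave() in your ioctl code[1]. Otherwise the interrupt can fire on the same CPU while the ioctl path holds the lock, and the handler will then spin forever on a lock that cannot be released.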
[1] http://www.makelinux.net/ldd3/chp-5-sect-5
I'm writing code to transfer a string from the kernel to user mode using a system call and copy_to_user.
Here is my code.
Kernel:
#include<linux/kernel.h>
#include<linux/syscalls.h>
#include<linux/sched.h>
#include<linux/slab.h>
#include<linux/errno.h>
asmlinkage int sys_getProcTagSysCall(pid_t pid, char **tag){
printk("getProcTag system call \n\n");
struct task_struct *task= (struct task_struct*) kmalloc(sizeof(struct task_struct),GFP_KERNEL);
read_lock(&tasklist_lock);
task = find_task_by_vpid(pid);
if(task == NULL )
{
printk("corresponding pid task does not exist\n");
read_unlock(&tasklist_lock);
return -EFAULT;
}
read_unlock(&tasklist_lock);
printk("Corresponding pid task exist \n");
printk("tag is %s\n" , task->tag);
/*
task -> tag : string is stored in task->tag (ex : "abcde")
this part is well worked
*/
if(copy_to_user(*tag, task->tag, sizeof(char) * task->tag_length) !=0)
;
return 1;
}
And this is the user program:
#include<stdio.h>
#include<stdlib.h>
int main()
{
char *ret=NULL;
int pid = 0;
printf("PID : ");
scanf("%4d", &pid);
if(syscall(339, pid, &ret)!=1) // syscall 339 is getProcTagSysCall
printf("pid %d does not exist\n", pid);
else
printf("Corresponding pid tag is %s \n",ret); //my output is %s = null
return 0;
}
Actually, I don't understand copy_to_user well, but I think copy_to_user(*tag, task->tag, sizeof(char) * task->tag_length) operates like the code below,
so I use copy_to_user as shown above.
#include<stdio.h>
int re();
void main(){
char *b = NULL;
if (re(&b))
printf("success");
printf("%s", b);
}
int re(char **str){
char *temp = "Gdg";
*str = temp;
return 1;
}
Is this a college assignment of some sort?
asmlinkage int sys_getProcTagSysCall(pid_t pid, char **tag){
What is this, Linux 2.6? What's up with ** instead of *?
printk("getProcTag system call \n\n");
Somewhat bad. All strings are supposed to be prefixed.
struct task_struct *task= (struct task_struct*) kmalloc(sizeof(struct task_struct),GFP_KERNEL);
What is going on here? Casting malloc makes no sense whatsoever, if you malloc you should have used sizeof(*task) instead, but you should not malloc in the first place. You want to find a task and in fact you just overwrite this pointer's value few lines later anyway.
read_lock(&tasklist_lock);
task = find_task_by_vpid(pid);
find_task_by_vpid requires RCU. The kernel would have told you that if you had debug enabled.
if(task == NULL )
{
printk("corresponding pid task does not exist\n");
read_unlock(&tasklist_lock);
return -EFAULT;
}
read_unlock(&tasklist_lock);
So... you unlock... but you did not get any kind of reference to the task.
printk("Corresponding pid task exist \n");
printk("tag is %s\n" , task->tag);
... in other words by the time you do task->tag, the task may already be gone. What requirements are there to access ->tag itself?
if(copy_to_user(*tag, task->tag, sizeof(char) * task->tag_length) !=0)
;
What's up with this? sizeof(char) is guaranteed to be 1.
I'm really confused by this entire business.
When you have a syscall which copies data to userspace and the amount of data is not known prior to the call, the syscall accepts both a buffer AND its size. Then you can return an appropriate error if the data you are trying to copy would not fit.
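The buffer-plus-size convention looks roughly like this. Treat it as a sketch of the calling convention only: get_tag() is a hypothetical plain C function standing in for the syscall body, tag and tag_len stand in for task->tag and task->tag_length, and memcpy() stands in for copy_to_user().

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Copies the tag into a caller-provided buffer of buflen bytes.
   Returns the tag length on success or a negative errno value. */
ssize_t get_tag(char *buf, size_t buflen, const char *tag, size_t tag_len)
{
    if (buf == NULL)
        return -EFAULT;
    if (tag_len + 1 > buflen) /* need room for the terminating NUL */
        return -ERANGE;
    memcpy(buf, tag, tag_len); /* copy_to_user() in the real syscall */
    buf[tag_len] = '\0';
    return (ssize_t)tag_len;
}
```

The caller owns the storage (an array or a malloc'ed block) and passes its size, so the kernel side never has to guess how much it may write.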
However, having a syscall in the first place looks incorrect. In linux per-task data is exposed to userspace in /proc/pid/. Figuring out how to add a file to proc is easy and left as an exercise for the reader.
It's quite obvious from the way you fixed it. copy_to_user() will only copy data between two memory regions - one accessible only to kernel and the other accessible also to user. It will not, however, handle any memory allocation. Userspace buffer has to be already allocated and you should pass address of this buffer to the kernel.
One more thing you can change is to change your syscall to use normal pointer to char instead of pointer to pointer which is useless.
Also note that you are leaking memory in your kernel code. You allocate memory for task_struct using kmalloc and then you override the only pointer you have to this memory when calling find_task_by_vpid() and this memory is never freed. find_task_by_vpid() will return a pointer to a task_struct which already exists in memory so there is no need to allocate any buffer for this.
I solved my problem by calling malloc in the user program.
I changed
char *b = NULL;
to
char *b = (char*)malloc(sizeof(char) * 100);
I don't know why this works. My guess is that copy_to_user takes the number of bytes as its third argument, so I have to malloc the buffer before a value can be assigned to it.
If anyone knows why adding malloc makes it work, please tell me.
Is the following Linux device driver code safe, or do I need to protect access to interrupt_flag with a spinlock?
static DECLARE_WAIT_QUEUE_HEAD(wq_head);
static int interrupt_flag = 0;
static ssize_t my_write(struct file* filp, const char* __user buffer, size_t length, loff_t* offset)
{
interrupt_flag = 0;
wait_event_interruptible(wq_head, interrupt_flag != 0);
}
static irqreturn_t handler(int irq, void* dev_id)
{
interrupt_flag = 1;
wake_up_interruptible(&wq_head);
return IRQ_HANDLED;
}
Basically, I kick off some event in my_write() and wait for the interrupt to indicate that it completes.
If so, which form of spin_lock() do I need to use? I thought spin_lock_irq() was appropriate, but when I tried that I got a warning about the IRQ handler enabling interrupts.
Doesn't wait_event_interruptible evaluate the interrupt_flag != 0 condition? That would imply that the lock should be held while it reads the flag, right?
No lock is needed in the example given. Memory barriers are needed after the store of the flag, and before the load -- to ensure visibility to the flag -- but the wait_event_* and wake_up_* functions provide those. See the section entitled "Sleep and wake-up functions" in this document: https://www.kernel.org/doc/Documentation/memory-barriers.txt
Before adding a lock, consider what is being protected. Generally locks are needed if you're setting two or more separate pieces of data and you need to ensure that another cpu/core doesn't see an incomplete intermediate state (after you started but before you finished). In this case, there's no point in protecting the storing / loading of the flag value because stores and loads of a properly aligned integer are always atomic.
So, depending on what else your driver is doing, it's quite possible you do need a lock, but it isn't needed for the snippet you've provided.
Yes you need a lock. With the given example (that uses int and no specific arch is mentioned), the process context may be interrupted while accessing the interrupt_flag. Upon return from the IRQ, it may continue and interrupt_flag may be left in inconsistent state.
Try this:
static DECLARE_WAIT_QUEUE_HEAD(wq_head);
static int interrupt_flag = 0;
DEFINE_SPINLOCK(lock);
static ssize_t my_write(struct file* filp, const char* __user buffer, size_t length, loff_t* offset)
{
/* spin_lock_irq() or spin_lock_irqsave() is OK here */
spin_lock_irq(&lock);
interrupt_flag = 0;
spin_unlock_irq(&lock);
wait_event_interruptible(wq_head, interrupt_flag != 0);
}
static irqreturn_t handler(int irq, void* dev_id)
{
unsigned long flags;
spin_lock_irqsave(&lock, flags);
interrupt_flag = 1;
spin_unlock_irqrestore(&lock, flags);
wake_up_interruptible(&wq_head);
return IRQ_HANDLED;
}
IMHO, the code has to be written without making any arch- or compiler-related assumptions (like the "properly aligned integer" in Gil Hamilton's answer).
Now if we can change the code and use atomic_t instead of the int flag, then no locks should be needed.
I'm using WPE PRO, and I can capture packets and send them back. I tried to do the same using WinSock 2 (the same library WPE PRO uses), but I don't know how to send a packet over an existing TCP connection like WPE PRO does.
http://wpepro.net/index.php?categoryid=2
How can I do it ?
Are you asking how to make someone else's program send data over its existing Winsock connection?
I've done exactly this but unfortunately do not have the code on-hand at the moment. If you give me an hour or two I can put up a working example using C; if you need one let me know and I will.
Edit: sample DLL to test at the bottom of the page if you or anyone else wants to; I can't. All I know is that it compiles. You just need to download (or write!) a freeware DLL injector program to test it; there are tons out there.
In the meantime, what you need to research is:
The very basics of how EXEs are executed.
DLL injection
API hooking
Windows Sockets API
1. The very basics of how EXEs are executed:
The entire process I'm about to explain boils down to this one principle. When you double-click an executable, Windows parses it and loads its code, etc. into memory. This is the key. The compiled code is all being put into RAM. What does this imply? Well, if the application's code is all in RAM, can we change the application's code while it's running by just changing some of its memory? After all, it's just a bunch of instructions.
The answer is yes, and that gives us the means of messing with another application: in this case, telling it to send some data over its open socket.
(This principle is also the reason you have to be careful writing programs in low-level languages like C, since if you put bad stuff in bad parts of RAM, it can crash the program or open you up to shellcode exploits.)
2. DLL injection:
The problem is, how do we know which memory to overwrite? Do we have access to that program's memory, especially the parts containing the instructions we want to change? You can write to another process' memory but it's more complicated. The easiest way to change their memory (again, when I say memory, we're talking about the machine code instructions being executed) is by having a DLL loaded and running within that process. Think of your DLL as a .c file you can add to another program and write your own code: you can access the program's variables, call its functions, anything; because it's running within the process.
DLL injection can be done through numerous methods. The usual is by calling the CreateRemoteThread() API function. Do a Google search on that.
3. API Hooking
What is API hooking? To put it more generally, it's "function hooking", we just happen to be interested in hooking API calls; in this case, the ones used for Sockets (socket(), send(), etc.).
Let's use an example. A target application written in C using Winsock. Let's see what they are doing and then show an example of what we WANT to make it do:
Their original source code creating a socket:
SOCKET ConnectSocket = INVALID_SOCKET;
ConnectSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
Now, that's the original program's source code. Our DLL won't have access to that because it's loaded within an EXE and an EXE is compiled (duh!). So let's say their call to socket() looked something like this after being compiled to machine code (assembly). I don't know assembly at all but this is just for illustration:
The assembly/machine code:
PUSH 06 ; IPPROTO_TCP
PUSH 01 ; SOCK_STREAM
PUSH 02 ; AF_INET
CALL WS2_32.socket ; This is one of the parts our DLL will need to intercept ("hook").
In order for us to make that program send data (using our DLL), we need to know the socket's handle. So we need to intercept their call to the socket function. Here are some considerations:
The last instruction there would need to be changed to: CALL OurOwnDLL.socket. That CALL instruction is just a value in memory somewhere (remember?) so we can do that with WriteProcessMemory. We'll get to that.
We want to take control of the target program, not crash it or make it behave strangely. So our code needs to be transparent. Our DLL which we will inject needs to have a socket function identical to the original, return the same value, etc. The only difference is, we will be logging the return value (SocketHandle) so that we can use it later when we want to send data.
We also need to know if/when the socket connects since we can't send data unless it is (assuming we're using TCP like most applications do). This means we need to also hook the Winsock connect API function and also duplicate that in our DLL.
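One detail worth spelling out before the DLL itself: the DLL below does not rewrite the caller's CALL but overwrites the first bytes of the API with a relative JMP (opcode 0xE9), and the operand of that instruction is an offset measured from the end of the 5-byte instruction, not an absolute address. A minimal sketch of building such a patch, assuming a 32-bit process (build_jmp_patch() is a hypothetical helper, not something the DLL below defines):

```c
#include <stdint.h>

/* Fill out[0..4] with `JMP rel32` as it must appear at address `from`
   in order to land on address `to`. */
void build_jmp_patch(uint8_t out[5], uint32_t from, uint32_t to)
{
    /* Offset is relative to the byte after the instruction; unsigned
       arithmetic wraps, which yields the right encoding for backward jumps. */
    uint32_t rel = to - (from + 5);

    out[0] = 0xE9;                          /* JMP rel32 opcode */
    out[1] = (uint8_t)(rel & 0xFF);         /* operand, little-endian */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
}
```

Here `from` would be the address returned by GetProcAddress for the API and `to` the address of the replacement function inside the injected DLL.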
DLL to inject and monitor the socket and connect functions (untested):
This C DLL will have everything in place to hook and unhook functions. I can't test it at the moment and I'm not even much of a C programmer so let me know if you come across any problems.
Compile this as a Windows DLL not using Unicode and inject it into a process that you know uses WS2_32's socket() and connect() functions and let me know if it works. I have no means to test, sorry. If you need further help or fixes, let me know.
/*
SocketHookDLL.c
Author: Daniel Elkins
License: Public Domain
Version: 1.0.0
Created: May 14th, 2014 at 12:23 AM
Updated: [Never]
Summary:
1. Link to the Winsock library so we can use its functions.
2. Export our own `socket` and `connect` functions so that
they can be called by the target application instead of
the original ones from WS2_32.
3. "Hook" the socket APIs by writing over the target's memory,
causing `CALL WS2_32.socket` to `CALL SocketHookDLL.socket`, using
WriteProcessMemory.
4. Make sure to keep a copy of the original memory for when we
no longer want to hook those socket functions (i.e. DLL detaching).
*/
#pragma comment(lib, "WS2_32.lib")
#include <WinSock2.h>
/* These functions hook and un-hook an API function. */
unsigned long hookFunction (const char * dllModule, const char * apiFunction, unsigned char * memoryBackup);
unsigned int unHookFunction (const char * dllModule, const char * apiFunction, unsigned char * memoryBackup);
/*
These functions (the ones we want to hook) are copies of the original Winsock's functions from Winsock2.h.
1. Calls OurDLL.hooked_socket() (unknowingly).
2. OurDLL.hooked_socket() calls the original Winsock.socket() function.
3. We take note of the returned SOCKET handle so we can use it later to send data.
4. OurDLL.hooked_socket() returns the SOCKET back to the target app so everything works as it should (hopefully!).
Note: You can change return values and parameters (such as the data being sent/received, like WPE does); just be aware that it will
also (hopefully, intentionally) change the behavior of the target application.
*/
SOCKET WSAAPI hooked_socket (int af, int type, int protocol);
int WSAAPI hooked_connect (SOCKET s, const struct sockaddr FAR * name, int namelen);
/* Backups of the original memory; need one for each API function you hook (if you want to unhook it later). */
unsigned char backupSocket[6];
unsigned char backupConnect[6];
/* Our SOCKET handle used by the target application. */
SOCKET targetsSocket = INVALID_SOCKET;
/* This is the very first code that gets executed once our DLL is injected: */
BOOL APIENTRY DllMain (HMODULE moduleHandle, DWORD reason, LPVOID reserved)
{
/*
We will hook the desired Socket APIs when attaching
to target EXE and UN-hook them when being detached.
*/
switch (reason)
{
case DLL_PROCESS_ATTACH:
/* Here goes nothing! */
hookFunction ("WS2_32.DLL", "socket", backupSocket);
hookFunction ("WS2_32.DLL", "connect", backupConnect);
break;
case DLL_THREAD_ATTACH:
break;
case DLL_PROCESS_DETACH:
unHookFunction ("WS2_32.DLL", "socket", backupSocket);
unHookFunction ("WS2_32.DLL", "connect", backupConnect);
break;
case DLL_THREAD_DETACH:
break;
}
return TRUE;
}
unsigned long hookFunction (const char * dllModule, const char * apiFunction, unsigned char * memoryBackup)
{
/*
Hook an API function:
=====================
1. Build the necessary assembly (machine code) opcodes to get our DLL called!
2. Get a handle to the API we're hooking.
3. Use ReadProcessMemory() to backup the original memory to un-hook the function later.
4. Use WriteProcessMemory to make changes to the instructions in memory.
*/
HANDLE thisTargetProcess;
HMODULE dllModuleHandle;
unsigned long apiAddress;
unsigned long memoryWritePosition;
unsigned char newOpcodes[6] = {
0xE9, 0x00, 0x00, 0x00, 0x00, 0xC3 // Step #1.
};
thisTargetProcess = GetCurrentProcess ();
// Step #2.
dllModuleHandle = GetModuleHandleA (dllModule);
if (!dllModuleHandle)
return 0;
apiAddress = (unsigned long) GetProcAddress (dllModuleHandle, apiFunction);
if (!apiAddress)
return 0;
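// Step #3.
ReadProcessMemory (thisTargetProcess, (void *) apiAddress, memoryBackup, 6, 0);
/* The JMP (0xE9) operand is relative to the end of the 5-byte instruction and must
   lead to our replacement, not back to the API. This function has no parameter for
   the replacement, so it is picked by name here. 32-bit process assumed. */
if (lstrcmpA (apiFunction, "socket") == 0)
    memoryWritePosition = (unsigned long) hooked_socket - apiAddress - 5;
else
    memoryWritePosition = (unsigned long) hooked_connect - apiAddress - 5;
memcpy (&newOpcodes[1], &memoryWritePosition, 4);
// Step #4.
WriteProcessMemory (thisTargetProcess, (void *) apiAddress, newOpcodes, 6, 0);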
return apiAddress;
}
unsigned int unHookFunction (const char * dllModule, const char * apiFunction, unsigned char * memoryBackup)
{
HANDLE thisTargetProcess;
HMODULE dllModuleHandle;
unsigned long apiAddress;
unsigned long memoryWritePosition;
thisTargetProcess = GetCurrentProcess ();
dllModuleHandle = GetModuleHandleA (dllModule);
if (!dllModuleHandle)
return 0;
apiAddress = (unsigned long) GetProcAddress (dllModuleHandle, apiFunction);
if (!apiAddress)
return 0;
if (WriteProcessMemory (thisTargetProcess, (void *) apiAddress, memoryBackup, 6, 0))
return 1;
return 0;
}
/* You may want to use a log file instead of a MessageBox due to time-outs, etc. */
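SOCKET WSAAPI hooked_socket (int af, int type, int protocol)
{
    /* Restore the original bytes first; otherwise socket() below would jump straight back here. Not thread-safe. */
    unHookFunction ("WS2_32.DLL", "socket", backupSocket);
    targetsSocket = socket (af, type, protocol);
    hookFunction ("WS2_32.DLL", "socket", backupSocket);
    MessageBox (NULL, "(Close this quickly)\r\n\r\nThe target's socket was hooked successfully!", "Hooked SOCKET", MB_OK);
    return targetsSocket;
}
int WSAAPI hooked_connect (SOCKET s, const struct sockaddr FAR * name, int namelen)
{
    int result;
    MessageBox (NULL, "(Close this quickly)\r\n\r\nThe target just connected to a remote address.", "Target Connected", MB_OK);
    /* Same unhook / call / re-hook sequence as in hooked_socket(). */
    unHookFunction ("WS2_32.DLL", "connect", backupConnect);
    result = connect (s, name, namelen);
    hookFunction ("WS2_32.DLL", "connect", backupConnect);
    return result;
}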