Why does Unix have fork() but not CreateProcess()? - winapi

I do not get why Unix has fork() for creating a new process. In the Win32 API we have CreateProcess(), which creates a new process, loads an executable into its address space, and then starts executing from the entry point. Unix, however, offers fork() for creating a new process, and I don't get why I would duplicate my process if I'd like to run another process.
So let me ask these two questions:
If fork() and then exec() is more efficient, why isn't there a function forkexec(const char *newProc) since we will call exec() after fork() almost in every case?
If it is not more efficient, why does fork() exist at all?

The fork() call is sufficient. It is also more flexible; it allows you to do things like adjust the I/O redirection in the child process, rather than complicating the system call that creates the process. With SUID or SGID programs, it allows the child to drop its elevated privileges before executing the other process.
If you want a complex way to create a process, look up the posix_spawn() function.
#include <spawn.h>
int posix_spawn(pid_t *restrict pid, const char *restrict path,
const posix_spawn_file_actions_t *file_actions,
const posix_spawnattr_t *restrict attrp,
char *const argv[restrict], char *const envp[restrict]);
int posix_spawnp(pid_t *restrict pid, const char *restrict file,
const posix_spawn_file_actions_t *file_actions,
const posix_spawnattr_t *restrict attrp,
char *const argv[restrict], char *const envp[restrict]);
The difference is that posix_spawnp() searches PATH for the executable.
There is a whole set of other functions for handling posix_spawn_file_actions_t and posix_spawnattr_t types (follow the 'See Also' links at the bottom of the referenced man page).
This is quite a bit more like CreateProcess() on Windows. For the most part, though, using fork() followed shortly by exec() is simpler.
I don't understand what you mean. The child process code will be written by me, so what is the difference between writing if (fork() == 0) and putting this code in the beginning of child's main()?
Very often, the code you execute is not written by you, so you can't modify what happens in the beginning of the child's process. Think of a shell; if the only programs you run from the shell are those you've written, life is going to be very impoverished.
Quite often, the code you execute will be called from many different places. In particular, think of a shell and a program that will sometimes be executed in a pipeline and sometimes executed without pipes. The called program cannot tell what I/O redirections and fixups it should do; the calling program knows.
If the calling program is running with elevated privileges (SUID or SGID privileges), it is normal to want to turn those 'off' before running another program. Relying on the other program to know what to do is ... foolish.

UNIX-like operating systems (at least newer Linux and BSD kernels) generally have a very efficient fork implementation -- it is "so cheap" that there are "threaded" implementations based upon it in some languages.
In the end the forkexec function is ~n -- for some small value of n -- lines of application code.
I sure wish windows had such a useful ForkProcess :(
Happy coding.
As cnicutar mentioned, Copy-On-Write (COW) is one strategy used.

There is a function that is roughly equivalent to forkexec: system(). It runs the command through the shell and waits for it to finish.
http://www.tutorialspoint.com/c_standard_library/c_function_system.htm
#include <stdio.h>
#include <stdlib.h> /* system() */
#include <string.h>
int main(void)
{
    char command[50];
    strcpy(command, "ls -l");
    system(command); /* runs the command via the shell and waits for it */
    return 0;
}

Related

Is there a race between starting and seeing yourself in WinApi's EnumProcesses()?

I just found this code in the wild:
def _scan_for_self(self):
    win32api.Sleep(2000) # sleep to give time for process to be seen in system table.
    basename = self.cmdline.split()[0]
    pids = win32process.EnumProcesses()
    if not pids:
        UserLog.warn("WindowsProcess", "no pids", pids)
    for pid in pids:
        try:
            handle = win32api.OpenProcess(
                win32con.PROCESS_QUERY_INFORMATION | win32con.PROCESS_VM_READ,
                pywintypes.FALSE, pid)
        except pywintypes.error, err:
            UserLog.warn("WindowsProcess", str(err))
            continue
        try:
            modlist = win32process.EnumProcessModules(handle)
        except pywintypes.error,err:
            UserLog.warn("WindowsProcess",str(err))
            continue
This line caught my eye:
win32api.Sleep(2000) # sleep to give time for process to be seen in system table.
It suggests that if you call EnumProcesses() too fast after starting, you won't see yourself. Is there any truth to this?
There is a race, but it's not the race the code tried to protect against.
A successful call to CreateProcess returns only after the kernel object representing the process has been created and enqueued into the kernel's process list. A subsequent call to EnumProcesses accesses the same list, and will immediately observe the newly created process object.
That is, unless the process object has since been destroyed. This isn't entirely unusual since processes in Windows are initialized in-process. The documentation even makes note of that:
Note that the function returns before the process has finished initialization. If a required DLL cannot be located or fails to initialize, the process is terminated.
What this means is that if a call to EnumProcesses immediately following a successful call to CreateProcess doesn't observe the newly created process, it does so because it was late rather than early. If you are late already then adding a delay will only make you more late.
Which swiftly leads to the actual race here: Process IDs uniquely identify processes only for a finite time interval. Once a process object is gone, its ID is up for grabs, and the system will reuse it at some point. The only reliable way to identify a process is by holding a handle to it.
Now it's anyone's guess what the author of _scan_for_self was trying to accomplish. As written, the code takes more time to do something that's probably altogether wrong [1] anyway.
[1] Turns out my gut feeling was correct. This is just your average POSIX developer who, in the process of learning that POSIX is insufficient, would rather call out Microsoft than actually use an all-around superior API.
The documentation for EnumProcesses (Win32 API - EnumProcesses function) does not mention anything about a delay needed to see the current process in the list it returns.
Microsoft's example of how to use EnumProcesses to enumerate all running processes (Enumerating All Processes) also does not contain any delay before calling EnumProcesses.
A small test application I created in C++ (see below) always reports that the current process is in the list (tested on Windows 10):
#include <Windows.h>
#include <Psapi.h>
#include <iostream>
#include <vector>
const DWORD MAX_NUM_PROCESSES = 4096;
DWORD aProcesses[MAX_NUM_PROCESSES];
int main(void)
{
    // Get the list of running process Ids:
    DWORD cbNeeded;
    if (!EnumProcesses(aProcesses, MAX_NUM_PROCESSES * sizeof(DWORD), &cbNeeded))
    {
        return 1;
    }
    // Check if current process is in the list:
    DWORD curProcId = GetCurrentProcessId();
    bool bFoundCurProcId{ false };
    DWORD numProcesses = cbNeeded / sizeof(DWORD);
    for (DWORD i=0; i<numProcesses; ++i)
    {
        if (aProcesses[i] == curProcId)
        {
            bFoundCurProcId = true;
        }
    }
    std::cout << "bFoundCurProcId: " << bFoundCurProcId << std::endl;
    return 0;
}
Note: I am aware that the fact that the program reported the expected result does not mean that there is no race. Maybe I just couldn't catch it manifesting. But trying to run code like that can give you a hint sometimes (especially if the result would have been that there is a race).
The fact that I never had a problem running this test (did it many times), together with the lack of any mention of the need for a delay in Microsoft's documentation make me believe that it is not required.
My conclusion is that either:
There is a unique issue when using it from python (doubt it).
or:
The code you found is doing something unnecessary.
There is no race.
EnumProcesses calls a NT API function that switches to kernel mode to walk the linked list of processes. Your own process has been added to the list before it starts running.

Call a shellcode without using pointer to function?

Is there a way to get the return value of a function that is in the shellcode, without using pointer to function?
#include <stdio.h>
unsigned char code[] = "\x55\x48\x89\xe5"
"\xb8\x05\x00\x00"
"\x00\x5d\xc3";
int main(void) {
int (*p)(void) = (int(*)(void))code;
printf("%d", p());
return 0;
}
Shellcode (see Wikipedia article Shellcode as well as this presentation Introduction to Shellcode Development) is machine code that is injected into an application in order to take over the application and run your own application within that application's process.
How the shellcode is injected into the application and starts running will vary depending on how the penetration is being done.
However for testing approaches for the actual shellcode, as opposed to approaches for injecting the shellcode in the first place, the testing is typically done with a simple program that allows you to (1) create the shellcode program that is to be injected as an array of bytes and (2) start the shellcode executing.
The simplest approach for this is the source code you have posted.
You have an array of unsigned char which contains the machine code to be executed.
You have a main() which creates a function pointer to the array of unsigned char bytes and then calls the shellcode through the function pointer.
However in a real world penetration what you would normally do is to use a technique whereby you would take over an application by injecting your shellcode into its process space and then triggering the execution of that shellcode. One such approach is a buffer overflow attack. See for example COEN 152 Computer Forensics Buffer Overflow Attack as well as Wikipedia article Buffer overflow.
See also
Shellcode in C program
Re-writing a small execve shellcode
Also note that the approaches for shellcode attacks will vary depending on the operating system that is being attacked. For instance see this article Basics of Windows shellcode writing which explains some of the intricacies of writing a shellcode for accessing system calls in Windows. Compare to this article providing a way for How to write a Linux x86 shellcode.

Make a system call to get list of processes

I'm new to kernel module programming and I need to make a system call to retrieve the system processes and show how much CPU they are consuming.
How can I make this call?
Why would you implement a system call for this? You don't want to add a syscall to the existing Linux API. This is the primary Linux interface to userspace, and nobody touches syscalls except top kernel developers who know what they are doing.
If you want to get a list of processes and their parameters and real-time statuses, use /proc. Every directory that's an integer in there is an existing process ID and contains a bunch of useful dynamic files which ps, top and others use to print their output.
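If you want to get a list of processes within the kernel (e.g. within a module), you should know that the processes are kept internally as a doubly linked list that starts with the init process (symbol init_task in the kernel). You should use the macros the kernel provides to walk that list, such as for_each_process (declared in include/linux/sched.h on older kernels and in include/linux/sched/signal.h since 4.11). Here's an example:
#include <linux/init.h>
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/rcupdate.h>
#include <linux/sched.h>
#include <linux/sched/signal.h> /* for_each_process() on kernels >= 4.11 */
static int __init ex_init(void)
{
    struct task_struct *task;
    rcu_read_lock(); /* protect the task list while we walk it */
    for_each_process(task)
        pr_info("%s [%d]\n", task->comm, task->pid);
    rcu_read_unlock();
    return 0;
}
static void __exit ex_fini(void)
{
}
module_init(ex_init);
module_exit(ex_fini);
MODULE_LICENSE("GPL");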
This should be okay to gather information. However, don't change anything in there unless you really know what you're doing (which will require a bit more reading).
There are syscalls for that, called open and read. Information about every process is kept in its /proc/{pid} directory. You can gather process information by reading the corresponding files.
More explained here: http://www.tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html

QProcess fails to execute external executable

I am struggling to find a solution to my problem, but I simply have no clue how to solve it.
I am creating a user interface for some programs I made (so you can start an executable simply by pressing a button).
So I thought of using Qt.
I read a lot about QProcess and tried to use it.
With the first executable of mine I tried to start it through QProcess::start(), but it didn't work, so I tried it with QProcess::execute():
QProcess *proc = new QProcess(this);
QDir::setCurrent("C:\\DIRTOTHEEXE\\");
QString program="HELLO.exe";
proc->execute(program);
This executes my program perfectly and works nice.
So I tried to do the same with my other exe, but it didn't work
QProcess *myproc = new QProcess(this);
QDir::setCurrent("C:\\DIRTOTHEEXE\\");
QString program="HelloWorld.exe";
myproc->start(program);
The called executable simply prints "Hello World" and returns 0 then.
So now my question is: What could cause this behaviour and why can't I use QProcess::start() for the first executable?
Btw: I also tried to set the workingDirectory() to the path of the exe, but also that didn't work.
Hope someone can help me.
EDIT:
So the program is executed but crashes right after printing out one line.
EDIT: Here the HelloWorld source.
#include <iostream>
using namespace std;
int main(int argc, char* argv[]) {
cout<<"HELLO WORLD!!"<<endl;
return 0;
}
QProcess has three functions for starting external processes:
start
execute
startDetached
The latter two, execute and startDetached are static, so don't need an instance of QProcess to call them.
If you use start, you should at least be calling waitForStarted() to let the process set up properly. The execute() function will wait for the process to finish, so calling waitForStarted is not required.
As you've only posted a small amount of code, we can't see exactly what you're trying to do afterwards. Is that code in a function that ends, or are you trying to retrieve the output of the process? If so, you definitely should be calling waitForStarted if using start().
If you only want to run the process without waiting for it to finish and your program is not bothered about interacting with the process, then use startDetached: -
QProcess::startDetached("C:\\DIRTOTHEEXE\\HELLO.exe");

How can a C/C++ program put itself into background?

What's the best way for a running C or C++ program that's been launched from the command line to put itself into the background, equivalent to if the user had launched from the unix shell with '&' at the end of the command? (But the user didn't.) It's a GUI app and doesn't need any shell I/O, so there's no reason to tie up the shell after launch. But I want a shell command launch to be auto-backgrounded without the '&' (or on Windows).
Ideally, I want a solution that would work on any of Linux, OS X, and Windows. (Or separate solutions that I can select with #ifdef.) It's ok to assume that this should be done right at the beginning of execution, as opposed to somewhere in the middle.
One solution is to have the main program be a script that launches the real binary, carefully putting it into the background. But it seems unsatisfying to need these coupled shell/binary pairs.
Another solution is to immediately launch another executed version (with 'system' or CreateProcess), with the same command line arguments, but putting the child in the background and then having the parent exit. But this seems clunky compared to the process putting itself into background.
Edited after a few answers: Yes, a fork() (or system(), or CreateProcess on Windows) is one way to sort of do this, that I hinted at in my original question. But all of these solutions make a SECOND process that is backgrounded, and then terminate the original process. I was wondering if there was a way to put the EXISTING process into the background. One difference is that if the app was launched from a script that recorded its process id (perhaps for later killing or other purpose), the newly forked or created process will have a different id and so will not be controllable by any launching script, if you see what I'm getting at.
Edit #2:
fork() isn't a good solution for OS X, where the man page for 'fork' says that it's unsafe if certain frameworks or libraries are being used. I tried it, and my app complains loudly at runtime: "The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec()."
I was intrigued by daemon(), but when I tried it on OS X, it gave the same error message, so I assume that it's just a fancy wrapper for fork() and has the same restrictions.
Excuse the OS X centrism, it just happens to be the system in front of me at the moment. But I am indeed looking for a solution to all three platforms.
My advice: don't do this, at least not under Linux/UNIX.
GUI programs under Linux/UNIX traditionally do not auto-background themselves. While this may occasionally be annoying to newbies, it has a number of advantages:
Makes it easy to capture standard error in case of core dumps / other problems that need debugging.
Makes it easy for a shell script to run the program and wait until it's completed.
Makes it easy for a shell script to run the program in the background and get its process id:
gui-program &
pid=$!
# do something with $pid later, such as check if the program is still running
If your program forks itself, this behavior will break.
"Scriptability" is useful in so many unexpected circumstances, even with GUI programs, that I would hesitate to explicitly break these behaviors.
Windows is another story. AFAIK, Windows programs automatically run in the background--even when invoked from a command shell--unless they explicitly request access to the command window.
On Linux, daemon() is what you're looking for, if I understand you correctly.
The way it's typically done on Unix-like OSes is to fork() at the beginning and exit from the parent. This won't work on Windows, but is much more elegant than launching another process where forking exists.
Three things need doing,
fork
setsid
redirect STDIN, STDOUT and STDERR to /dev/null
This applies to POSIX systems (all the ones you mention claim to be POSIX (but Windows stops at the claiming bit))
On UNIX, you need to fork twice in a row and let the parent die.
A process cannot put itself into the background, because it isn't the one in charge of background vs. foreground. That would be the shell, which is waiting for process exit. If you launch a process with an ampersand "&" at the end, then the shell does not wait for process exit.
But the only way the process can escape the shell is to fork off another child and then let its original self exit back to the waiting shell.
From the shell, you can background a process with Control-Z, then type "bg".
Backgrounding a process is a shell function, not an OS function.
If you want an app to start in the background, the typical trick is to write a shell script to launch it that launches it in the background.
#! /bin/sh
/path/to/myGuiApplication &
To followup on your edited question:
I was wondering if there was a way to put the EXISTING process into the background.
In a Unix-like OS, there really is not a way to do this that I know of. The shell is blocked because it is executing one of the variants of a wait() call, waiting for the child process to exit. There is not a way for the child process to remain running but somehow cause the shell's wait() to return with a "please stop watching me" status. The reason you have the child fork and exit the original is so the shell will return from wait().
Here is some pseudocode for Linux/UNIX:
initialization_code()
if(failure) exit(1)
if( fork() > 0 ) exit(0)
setsid()
setup_signal_handlers()
for(fd=0; fd<NOFILE; fd++) close(fd)
open("/dev/null", O_RDONLY)
open("/dev/null", O_WRONLY)
open("/dev/null", O_WRONLY)
chdir("/")
And congratulations, your program continues as an independent "daemonized" process without a controlling TTY and without any standard input or output.
Now, in Windows you simply build your program as a Win32 application with WinMain() instead of main(), and it runs without a console automatically. If you want to run as a service, you'll have to look that up because I've never written one and I don't really know how they work.
You edited your question, but you may still be missing the point that your question is a syntax error of sorts -- if the process wasn't put in the background to begin with and you want the PID to stay the same, you can't ignore the fact that the program which started the process is waiting on that PID and that is pretty much the definition of being in the foreground.
I think you need to think about why you want to both put something in the background and keep the PID the same. I suggest you probably don't need both of those constraints.
As others mentioned, fork() is how to do it on *nix. You can get fork() on Windows by using the Cygwin libraries (MinGW targets the native Windows runtime and does not provide fork()). But that will require you to switch to using GCC as your compiler.
In pure Windows world, you'd use CreateProcess (or one of its derivatives CreateProcessAsUser, CreateProcessWithLogonW).
The simplest form of backgrounding is:
if (fork() != 0) exit(0);
In Unix, if you want to background and disassociate from the tty completely, you would do:
Close all descriptors which may access a tty (usually 0, 1, and 2).
if (fork() != 0) exit(0);
setpgid(0,getpid()); /* Might be necessary to prevent a SIGHUP on shell exit. */
signal(SIGHUP,SIG_IGN); /* just in case, same as using nohup to launch program. */
fd=open("/dev/tty",O_RDWR);
ioctl(fd,TIOCNOTTY,0); /* Disassociates from the terminal */
close(fd);
if (fork() != 0) exit(0); /* just for good measure */
That should fully daemonize your program.
The most common way of doing this under Linux is via forking. The same should work on Mac, as for Windows I'm not 100% sure but I believe they have something similar.
Basically what happens is the process splits itself into two processes, and then the original one exits (returning control to the shell or whatever), and the second process continues to run in the background.
I'm not sure about Windows, but on UNIX-like systems, you can fork() then setsid() the forked process to move it into a new process group that is not connected to a terminal.
Under Windows, the closest thing you're going to get to fork() is loading your program as a Windows service, I think.
Here is a link to an intro article on Windows services...
CodeProject: Simple Windows Service Sample
So, as you say, just fork()ing will not do the trick. What you must do is fork() and then re-exec(), as this code sample does:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <CoreFoundation/CoreFoundation.h>
int main(int argc, char **argv)
{
int i, j;
for (i=1; i<argc; i++)
if (strcmp(argv[i], "--daemon") == 0)
{
for (j = i+1; j<argc; j++)
argv[j-1] = argv[j];
argv[argc - 1] = NULL;
if (fork()) return 0;
execv(argv[0], argv);
return 0;
}
sleep(1);
CFRunLoopRun();
CFStringRef hello = CFSTR("Hello, world!");
printf("str: %s\n", CFStringGetCStringPtr(hello, CFStringGetFastestEncoding(hello)));
return 0;
}
The loop is to check for a --daemon argument, and if it is present, remove it before re-execing so an infinite loop is avoided.
I don't think this will work if the binary is put into the path because argv[0] is not necessarily a full path, so it will need to be modified.
/* Daemonize */
pid_t pid;
pid = fork(); /* parent creates a little daemon (child) */
if (pid > 0)
    exit(0); /* parent exits */
while (1) {
    printf("Hello I'm your little daemon %d\n", (int)getpid()); /* the child daemon goes on */
    sleep(1);
}
/* try 'nohup' in Linux (usage: nohup <command> &) */
In Unix, I have learned to do that using fork().
If you want to put a running process into the background, fork it twice.
I was trying the solution.
Only one fork is needed from the parent process.
The most important point is that, after fork, the parent process must die by calling _exit(0); and NOT by calling exit(0);
When _exit(0); is used, the command prompt immediately returns on the shell.
This is the trick.
If you need a script to have the PID of the program, you can still get it after a fork.
When you fork, save the PID of the child in the parent process. When you exit the parent process, either output the PID to STD{OUT,ERR} or simply have a return pid; statement at the end of main(). Note that an exit status only carries the low 8 bits of the value, so printing the PID is the reliable option. A calling script can then get the PID of the program, although it requires a certain knowledge of how the program works.

Resources