The Beauty of fork(2)

Learning multi-tasking and multi-threading on Windows

Windows was the first advanced OS I learned and used. At the time, I thought it was a great OS because it could do what DOS could not. From Windows I learned about multitasking, where many processes run concurrently, and about virtual memory, which breaks the 640K memory barrier of DOS. But at that time I was purely a user and new to programming.

Later, I studied Java at university and started learning multi-threading. Multi-threading is a built-in feature of Java, and my textbook presented this as an advantage over other languages. With languages such as C/C++, you need external library support, the OS must support threads natively, and the resulting code is not always portable.

From Java I learned that multi-threading can increase the responsiveness and concurrency of an application. It lets the application work on other tasks, such as refreshing window contents and responding to mouse clicks, while it runs a lengthy task such as a complex database query or calculation.

After graduating, I spent two years writing Windows applications at work, and I began to learn how to write multi-threaded applications. I was using Visual C++.

So I concluded that multi-threading is a good fit for client applications. But what about server applications?

Server crashing experience

I once wrote a server application on Windows that served requests from clients over the network. Since it was my first server application, I made some mistakes, and it crashed occasionally. I found that some badly written code (by me!) was writing to a freed memory region. I fixed that, but in some rare cases the server still crashed. I discovered that certain message patterns triggered the crash, so I fixed the message parsing code. The server ran longer, but on very rare occasions it crashed again. I traced the code and fixed that problem along with some potential bugs. The server then ran well and went into production. But I was still not comfortable with the code I had written, and I felt that it would crash someday.

Then came another project, this time running on Solaris. I was new to programming on UNIX, and everything started from scratch. We hit the same problem again: the server, which handled TCP requests from external clients, crashed easily. The server ran as a single process that used threads intensively to serve multiple clients concurrently. The problem was that some misbehaving clients sent invalid messages to our servers, and some abused features that the server provided. This triggered latent bugs and consumed an unreasonable amount of memory, making the server unstable. Because all clients kept a persistent connection to the server and received real-time data over it continuously, an unexpected crash affected every connected client. We went through a long and tedious debugging process to make the server robust enough for production. That process took many months.

At the same time, I was trying out Apache HTTP Server in my spare time. Apache HTTP Server has about a 60% share of the web server market and has long been recognized as more robust and secure than others such as IIS and iPlanet. Its popularity aside, I wondered why it works so well. I downloaded the source code and followed the instructions to build, install and run it, and it worked. Since it is server software that handles TCP (HTTP) requests, I wanted to know the architectural differences between our server and Apache HTTP Server. Luckily, Apache HTTP Server comes with source code. After reading the code, documentation and online articles (see Resources), I noticed that it uses a parent-child process model instead of a single-process, multi-threaded model. The essence of the parent-child process model is fork(), a traditional UNIX system call that has existed in UNIX systems for over 30 years (see Resources).

fork(2) in UNIX software

One major difference between UNIX and Windows is that UNIX processes have a parent-child relationship. Every child process is created by the fork(2) system call, which is invoked by the parent process. Below is the description from the fork(2) manpage on Mac OS X Jaguar:

Fork() causes creation of a new process.  The new process (child process)
is an exact copy of the calling process (parent process) except for the
following:

    o   The child process has a unique process ID.

    o   The child process has a different parent process ID (i.e., the
        process ID of the parent process).

    o   The child process has its own copy of the parent's descriptors.
        These descriptors reference the same underlying objects, so
        that, for instance, file pointers in file objects are shared
        between the child and the parent, so that an lseek(2) on a
        descriptor in the child process can affect a subsequent read or
        write by the parent.  This descriptor copying is also used by
        the shell to establish standard input and output for newly cre-
        ated processes as well as to set up pipes.

    o   The child processes resource utilizations are set to 0; see
        setrlimit(2).

Parent-child process relationship

Child processes can access the parent's resources through file descriptors, which include pipes and sockets. That is, if the parent opens a file or creates a socket before creating the child, the child process gets its own copy of the parent's file descriptors. The child can use its copy of a descriptor to do its job. For example, the parent accepts a connection, which creates a client socket, and then creates a new child. The child process can access this client socket by calling read() and write(). This division of labor leaves the parent to accept new connections while the child handles the read and write operations.
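The descriptor inheritance described above can be tried without a network. The following is a minimal sketch, not code from any real server: socketpair() stands in for the accepted client socket, and the function name ask_child and the upper-casing "service" are made up for illustration.

```c
#include <ctype.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends msg to a forked child over a socket created before fork() and
 * stores the child's upper-cased reply in reply (at most cap-1 bytes).
 * Returns the reply length, or -1 on error. Hypothetical helper. */
ssize_t ask_child(const char *msg, char *reply, size_t cap)
{
    int sv[2];
    size_t len = strlen(msg);

    if (len == 0 || cap == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: serves the inherited socket */
        char buf[256];
        close(sv[0]);               /* this end belongs to the parent */
        ssize_t n = read(sv[1], buf, sizeof buf);
        for (ssize_t i = 0; i < n; i++)
            buf[i] = (char)toupper((unsigned char)buf[i]);
        if (n > 0 && write(sv[1], buf, (size_t)n) != n)
            _exit(1);
        _exit(0);
    }

    close(sv[1]);                   /* parent: the child has its own copy */
    ssize_t n = -1;
    if (write(sv[0], msg, len) == (ssize_t)len)
        n = read(sv[0], reply, cap - 1);
    if (n >= 0)
        reply[n] = '\0';
    close(sv[0]);
    waitpid(pid, NULL, 0);          /* reap the child */
    return n;
}
```

Calling ask_child("hello", reply, sizeof reply) should leave "HELLO" in reply. The child never opened the socket itself; it simply used the descriptor it inherited at fork() time.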

Moreover, communication channels can be established between a parent and its children. These mechanisms are called inter-process communication, or IPC for short, and include pipes, semaphores, shared memory, message queues and so on. They are all powerful and sophisticated features of UNIX systems.
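As a small illustration of the simplest of these mechanisms, the sketch below has a child send one integer back to its parent through a pipe. The function name sum_in_child is hypothetical and the "work" (adding two numbers) is only a placeholder.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that adds a and b and sends the sum back through a pipe.
 * Stores the sum in *out; returns 0 on success, -1 on error. */
int sum_in_child(int a, int b, int *out)
{
    int fds[2];                     /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: the writer */
        int sum = a + b;
        close(fds[0]);
        ssize_t n = write(fds[1], &sum, sizeof sum);
        _exit(n == (ssize_t)sizeof sum ? 0 : 1);
    }

    close(fds[1]);                  /* parent: the reader */
    int sum = 0;
    ssize_t n = read(fds[0], &sum, sizeof sum);
    close(fds[0]);
    waitpid(pid, NULL, 0);          /* reap the child */
    if (n != (ssize_t)sizeof sum)
        return -1;
    *out = sum;
    return 0;
}
```

The pipe must be created before fork() so that both processes hold descriptors for it; each side then closes the end it does not use.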

fork() gives the child a copy of all the memory pages owned by the parent (many modern kernels make this copy lazily, using copy-on-write). Therefore, child processes can read static data such as configuration settings and predefined constants. Since a child process runs in a separate address space, a modification to its memory does not affect the original copy in the parent process.
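The separation of address spaces is easy to observe. In this sketch the variable max_clients and the function compare_after_fork are invented for the example; the child overwrites the variable and reports its value through its exit status, while the parent checks its own copy.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int max_clients = 150;       /* hypothetical configuration value */

/* Forks a child that overwrites max_clients, then reports what each
 * process sees. Returns 0 on success, -1 on error. */
int compare_after_fork(int *parent_sees, int *child_sees)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        max_clients = 7;            /* changes only the child's copy */
        _exit(max_clients);         /* report it through the exit status */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    *child_sees = WEXITSTATUS(status);
    *parent_sees = max_clients;     /* the parent's copy is untouched */
    return 0;
}
```

After the call, the child's value is 7 and the parent's is still 150: the child inherited the setting but could not change it for anyone else.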

Apache httpd 1.3 processing model

Apache HTTP Server 1.3 uses fork() in a very ingenious way. First, the parent process is very lightweight. It only contains code for loading the configuration, creating the server socket and managing the forked child processes. This makes the parent very robust and unlikely to be exposed to security vulnerabilities. Only the parent runs as root, which is required when the server is configured to listen on port 80. All its child processes run as the "nobody" user, which has very limited privileges. With a thin parent process, all the complex code, including HTTP request handling and module invocation, lives in the child processes. Even if a particular module contains faulty code, only the child process is affected, not the parent. Moreover, when a child process exits abnormally, for example because of a bus error or segmentation fault, the parent process can detect it through the waitpid() call. The parent can log the error to a file and respawn another child process to resume the service. Even if some client requests are not handled properly because a child process crashed, the web server keeps its service running.
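The detect-and-respawn idea can be sketched in a few lines. This is not Apache's code, only an illustration of the waitpid() status macros; the names supervise and flaky_worker are hypothetical, and the worker crashes on purpose on its first two attempts.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*worker_fn)(int attempt);

/* Runs worker in a child process and restarts it whenever it is killed
 * by a signal, up to max_attempts tries. Returns the number of crashes
 * observed, or -1 on error. */
int supervise(worker_fn worker, int max_attempts)
{
    int crashes = 0;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {             /* child: all the risky code runs here */
            worker(attempt);
            _exit(0);
        }

        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFSIGNALED(status)) {  /* abnormal exit, e.g. SIGSEGV or SIGBUS */
            fprintf(stderr, "child %ld killed by signal %d, respawning\n",
                    (long)pid, WTERMSIG(status));
            crashes++;
            continue;
        }
        break;                      /* clean exit: nothing to restart */
    }
    return crashes;
}

/* Hypothetical worker that crashes on its first two attempts. */
void flaky_worker(int attempt)
{
    if (attempt < 2)
        raise(SIGSEGV);
}
```

With these definitions, supervise(flaky_worker, 5) logs two crashes and returns 2. The parent itself never runs the faulty code, so it survives every crash.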

The code below shows a typical parent-child process model in a general TCP server:

	listenfd = socket(...);	/* create the server socket */
	bind(listenfd, ...); 	/* bind it to the server address */
	listen(listenfd, ...);	/* start listening on the socket */
	while (1)
	{
		connfd = accept(listenfd, ...);	/* wait for a client */

		if ( (childpid = fork()) == 0)	/* child process */
		{
			close(listenfd);	/* the child never accepts */
			doconn(connfd);	/* process the connection */
			close(connfd);
			exit(0);
		}
		close(connfd);	/* parent: drop its copy only */
	}

The code shows that the server parent first creates the socket, then calls bind() followed by listen(). It then enters an infinite loop and calls accept(). When a client connects to the server, accept() returns a new socket for the client connection. The parent process then calls fork() to create a new child. Next, the parent calls close() on the accepted socket; this releases only the parent's copy of the descriptor, and the connection stays open in the child. The child process, on the other hand, closes the listening socket and processes the connection. After handling the connection, it closes the client socket and exits.

One might argue that the above model is very slow. Every time a new connection comes in, it calls fork() to create a new process, which copies all the memory pages owned by the parent. Moreover, fork() is a system call that must run in kernel mode, which involves a switch from user space to kernel space. For all these reasons, fork() is a heavyweight call. Hence, this model does not fit the requirements of an HTTP server, which must not only respond to each connection quickly but also handle a high load of requests.

Indeed, Apache 1.3 implements a pre-forking model that addresses this issue. The parent process forks some spare child processes in advance to cater for upcoming connections. Those child processes stay idle most of the time, but they all listen on the same socket. When a client connects, one of the child processes wakes up, accepts the connection and processes the HTTP request. After handling the request, it closes the connection, goes back to sleep and waits for the next connection. Since there are spare child processes, while one is handling an HTTP request, the other idle children can serve any new HTTP request.
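The pre-forking idea can be sketched as follows. This is an illustration under simplifying assumptions, not Apache's implementation: the names child_loop and prefork_demo are hypothetical, the server listens on a kernel-chosen loopback port, each "request" is answered with the child's process ID, and the same function also plays the client so that the example is self-contained.

```c
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16

/* Child body: sleep in accept() until a client arrives, serve it, repeat. */
static void child_loop(int listenfd)
{
    for (;;) {
        int connfd = accept(listenfd, NULL, NULL);
        if (connfd < 0) {
            if (errno == EINTR)
                continue;
            _exit(1);
        }
        long pid = (long)getpid();
        ssize_t n = write(connfd, &pid, sizeof pid);  /* "handle" the request */
        (void)n;
        close(connfd);
    }
}

/* Pre-forks nchildren workers on one listening socket, then connects
 * nclients times as a test client. Returns how many connections were
 * answered by a child, or -1 on error. */
int prefork_demo(int nchildren, int nclients)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    pid_t kids[MAX_CHILDREN];
    int started = 0, answered = 0;

    if (nchildren < 1 || nchildren > MAX_CHILDREN)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a port */

    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0)
        return -1;
    if (bind(listenfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(listenfd, 16) < 0 ||
        getsockname(listenfd, (struct sockaddr *)&addr, &len) < 0) {
        close(listenfd);
        return -1;
    }

    for (int i = 0; i < nchildren; i++) {   /* fork the spare children */
        kids[i] = fork();
        if (kids[i] == 0)
            child_loop(listenfd);           /* never returns */
        if (kids[i] > 0)
            started++;
    }

    for (int i = 0; started > 0 && i < nclients; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        long pid = 0;
        if (fd >= 0 &&
            connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0 &&
            read(fd, &pid, sizeof pid) == (ssize_t)sizeof pid &&
            pid != (long)getpid())          /* answered by a child, not us */
            answered++;
        if (fd >= 0)
            close(fd);
    }

    for (int i = 0; i < nchildren; i++) {   /* shut the children down */
        if (kids[i] > 0) {
            kill(kids[i], SIGTERM);
            waitpid(kids[i], NULL, 0);
        }
    }
    close(listenfd);
    return started > 0 ? answered : -1;
}
```

Note that no fork() happens on the request path: the children already exist and simply block in accept() on the socket they inherited, and the kernel hands each incoming connection to one of them.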

The child processes do not live forever. The configuration file can specify that a child process ends itself after handling more than a configured number of connections. So, if a bug in the child process leaks memory, the process will die after some number of connections, and the memory will be freed and reclaimed by the OS. The parent process detects the peaceful end of the child process and respawns a new one. On the other hand, the drawback is that memory leaks become very difficult to detect in the running environment.
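A rough sketch of this recycling policy follows. It is not taken from Apache; the function name serve_with_recycling is hypothetical, requests are simulated by a counter, and each child reports how many requests it handled through its exit status, which is why the per-child limit is capped at 255 here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Serves total simulated requests using children that each retire after
 * at most max_per_child requests (1..255, so the count fits in an exit
 * status). Returns the number of children spawned, or -1 on error. */
int serve_with_recycling(int total, int max_per_child)
{
    if (total < 0 || max_per_child < 1 || max_per_child > 255)
        return -1;

    int spawned = 0;
    int remaining = total;
    while (remaining > 0) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            int handled = 0;
            while (handled < max_per_child && handled < remaining)
                handled++;          /* stand-in for handling one connection */
            _exit(handled);         /* retire voluntarily; the OS reclaims
                                       everything, leaked or not */
        }
        spawned++;

        int status;
        if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
            return -1;
        remaining -= WEXITSTATUS(status);   /* clean exit: respawn if needed */
    }
    return spawned;
}
```

For example, serving 10 requests with a limit of 4 per child takes three children (4 + 4 + 2). Here the parent sees WIFEXITED rather than WIFSIGNALED, which is how it can tell a planned retirement from a crash.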

Implementing this child process management requires communication between the child processes and the parent. In Apache 1.3, a scoreboard provides this facility. It relies on the inter-process communication services of the UNIX OS. The build utility in Apache 1.3 detects which suitable IPC services, such as System V semaphores and shared memory, are available in the OS and uses them for the scoreboard functions.

New processing modules in Apache httpd 2.0

The processing model of the new Apache HTTP Server 2.0 is an evolution of its predecessor's. It introduces a new design called MPM, or Multi-Processing Modules (see Resources).

With MPMs on UNIX platforms, sites that need high scalability can choose the worker MPM. The worker MPM implements a hybrid multi-process, multi-threaded server. It uses a fixed number of child processes, each running many threads, to serve a large number of requests. For example, the web server could be configured to run 5 child processes, each with a maximum of 150 idle threads for processing requests. Since the number of child processes is limited, resource requirements are easy to control. Each child process offers high scalability and responsiveness to clients because a thread is much more lightweight than a process. Moreover, reliability is preserved because the parent process manages the child processes in the same manner as in the pre-forking model of Apache HTTP Server 1.3.

On UNIX platforms, the administrator can choose among different MPM implementations to suit the needs of different running environments. Sites that require stability and compatibility with older modules can use the prefork MPM, whose processing model is similar to the one in Apache HTTP Server 1.3. Moreover, MPMs provide a powerful and flexible way to ease porting to various OS platforms, such as BeOS, OS/2 and Windows. Developers can use OS-specific features in these modules to make the web server more efficient and native. On Windows, as there is no parent-child process concept, the web server must run with the "winnt" MPM, which implements a single-process, multi-threaded model. The MPM concept makes it possible to implement the web server on all these platforms.

Conclusion

As the Apache HTTP Server implementation shows, a server that uses fork() properly and ingeniously can be more robust, and even more scalable, than a single-process, multi-threaded server. Although lightweight POSIX threads were introduced in the mid-1990s, fork(), which has been part of the UNIX world for 30 years, remains a very important and irreplaceable system call. By mixing multiple processes and multiple threads in the server, robustness and responsiveness can be achieved while scalability remains high.

Resources