Signal 9 (SIGKILL) problems

I once had the problem that my openser crashed after heavy load. After some debugging I found out that one of the openser processes had been killed by Linux with SIGKILL (signal 9). When that happens, openser's main process shuts down by killing all child processes and itself.

Of course the question was: where did this SIGKILL come from, and why?

It took me quite some time to find out where a SIGKILL may come from, so I am writing down my conclusions here (no responsibility is taken for the correctness of this information):

Memory Problems

If your machine has bad memory (faulty hardware) and the openser process tries to access the broken memory area, the kernel may kill openser.

Memory Overcommit

Linux uses memory overcommit: it grants memory allocation requests even if there is not enough free memory to back them. If a process then tries to use the overcommitted memory and none is available, the out-of-memory killer (OOM killer) kills a process to free memory. For example, if openser has a memory leak and consumes all the memory while overcommit is active, the OOM killer may kill a process (e.g. your openser process) to free memory.
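One way to check whether the OOM killer was involved is to search the kernel log for its messages. The following is a minimal sketch in Python, not a definitive tool: the function name find_oom_kills is made up for this example, and the pattern assumes the kernel writes lines containing "Out of memory: Kill process <pid> (<name>)" or "Out of memory: Killed process <pid> (<name>)". The exact wording differs between kernel versions, so the pattern may need adjusting.

```python
import re

# Assumed log wording: older kernels print "Kill process", newer ones
# "Killed process". Adjust the pattern if your kernel logs differently.
OOM_KILL_RE = re.compile(
    r"Out of memory.*?Kill(?:ed)? process (\d+) \(([^)]+)\)"
)


def find_oom_kills(log_text):
    """Return a list of (pid, process_name) for OOM kills found in log_text."""
    kills = []
    for line in log_text.splitlines():
        match = OOM_KILL_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

The function only parses text, so it can be fed the output of dmesg or the contents of whatever file your distribution writes kernel messages to. An empty result does not prove the OOM killer was not involved, since old messages may already have been rotated out of the log.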

To avoid memory overcommit, /proc/sys/vm/overcommit_memory has to be set to “2”.
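Before changing the setting it helps to see which mode is currently active. The sketch below reads the value from /proc/sys/vm/overcommit_memory and maps it to a short description. The function names and the wording of the descriptions are my own, chosen for illustration. The path parameter exists only so the function can be tried against an ordinary file.

```python
from pathlib import Path

OVERCOMMIT_PATH = "/proc/sys/vm/overcommit_memory"

# Short descriptions of the three modes (wording is mine, not the kernel's).
MODES = {
    0: "heuristic overcommit (the usual default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit beyond the commit limit",
}


def read_overcommit_mode(path=OVERCOMMIT_PATH):
    """Return the overcommit mode stored in the given file as an integer."""
    return int(Path(path).read_text().strip())


def describe_overcommit_mode(mode):
    """Return a human-readable description of an overcommit mode."""
    return MODES.get(mode, "unknown mode")
```

Calling describe_overcommit_mode(read_overcommit_mode()) on a Linux machine shows the active mode. Changing the value requires root, for example with sysctl -w vm.overcommit_memory=2. Note that in mode 2 an allocation that would exceed the limit fails immediately, so the program has to cope with failed allocations instead of being killed later.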

Conclusion

If openser gets killed regularly with SIGKILL, there is probably a memory leak inside openser (leaking conventional memory, not openser's internal private and shared memory). Use top to watch the memory usage of openser.
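As an alternative to watching top by hand, the resident memory of a process can be sampled from /proc/<pid>/status and logged over time. This is a minimal sketch assuming a Linux /proc filesystem. The function names, the sampling interval and the number of samples are hypothetical choices for this example.

```python
import re
import time


def parse_vm_rss_kb(status_text):
    """Extract VmRSS (in kB) from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_rss_kb(pid):
    """Read the current resident set size of the given pid in kB."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())


def watch_vm_rss(pid, interval_s=60, samples=10):
    """Print a timestamped VmRSS reading every interval_s seconds."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), read_vm_rss_kb(pid), "kB")
        time.sleep(interval_s)
```

Run watch_vm_rss with the pid of one openser process. A value that keeps growing under constant load, and never drops again, points towards a leak.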

Further readings