Sunday 4 August 2013

Debugging a hung/frozen network application

Recently I read a question on a forum asking how to debug a process (a network application) that keeps freezing:

The process receives network data via TCP/IP. After running the process for a while and tracking the network load, it seems that the application gets into a frozen state and stops getting data. Other processes in the system that use the network interfaces work properly. The application comes out of this hung state by itself after several minutes.
Without knowing the OS, the nature of the application (is it a chat client, an ssh client, an ftp client?), which networking libraries it uses, whether the code is multi-threaded, and similar details, it is hard to advise any specific steps.
My answer was as follows:

  1. top: Check how much CPU and memory your process is using, and whether its CPU usage is abnormally high.
  2. pstack: This should show the stack frames of the process as it is executing at the time of the problem.
  3. netstat: Run this with the necessary options (tcp/udp) to check the state of the network sockets opened by your process.
  4. gcore -s -c: This forces a core dump of your process while the problem is happening. Analyze that core file with gdb, and use the where command at the gdb prompt to get a full backtrace of the process (which function it was executing last and the calls that led to it).
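
The four steps above could be wrapped into a small script so that everything is captured in one go while the process is still frozen. What follows is only a rough sketch under some assumptions: the function names build_commands and run_diagnostics are made up for illustration, and the flags are the ones I would expect on a typical Linux box (procps top, net-tools netstat, and the gcore script shipped with gdb, which takes -o rather than the -s -c form mentioned above). Other systems will need different options.

```python
import shutil
import subprocess


def build_commands(pid, core_prefix="core"):
    """Return the diagnostic commands for one process, in the order of the list above.

    The flags are assumptions for a typical Linux system (procps top,
    net-tools netstat, gdb's gcore); other platforms use different options.
    """
    pid = str(pid)
    return [
        ["top", "-b", "-n", "1", "-p", pid],  # one batch-mode snapshot of CPU/memory
        ["pstack", pid],                       # stack frames of the running process
        ["netstat", "-tanp"],                  # TCP sockets with their state and owning PID
        ["gcore", "-o", core_prefix, pid],     # writes <core_prefix>.<pid>; process keeps running
    ]


def run_diagnostics(pid, core_prefix="core"):
    """Run each command whose tool is installed and return {tool: output}."""
    results = {}
    for cmd in build_commands(pid, core_prefix):
        tool = cmd[0]
        if shutil.which(tool) is None:
            # Skip tools that are missing instead of failing the whole run.
            results[tool] = "(not installed)"
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
        results[tool] = proc.stdout + proc.stderr
    return results
```

Calling run_diagnostics with the PID of the frozen process returns the output of each tool keyed by its name. The core file written by gcore can then be opened in gdb together with the executable, and where at the gdb prompt gives the backtrace as described in step 4.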
