Thursday, January 21, 2010

Scalable timeout support for GHC's I/O manager

During the past month, Bryan O'Sullivan and I have worked on an event library that can replace GHC’s existing I/O manager. Bryan has focused on getting the epoll and kqueue back ends working, while I've worked on adding scalable support for timeouts. I'm now fairly happy with the timeout support and can report some early benchmark results.
Many real-world problems involve some kind of timeout, usually for error recovery. For example, while processing an HTTP request an HTTP server needs to close the connection and release any associated resources if there's no activity on the connection for some time.
I wrote a simple benchmark to measure the performance of our library compared to GHC's I/O manager. The benchmark forks off n threads that sleep for one millisecond and then waits for all threads to wake up. I ran the same benchmark for some different values of n. Here are the results:
Inserting and dispatching 10,000 timeouts takes 0.13 seconds with our library and 3.03 seconds with GHC's I/O manager. For 20,000 timeouts the times are 0.33 seconds and 22.52 seconds, respectively.
Our library scales better due to the use of a more efficient data structure, a priority search queue, that stores the pending timeouts. Priority search queues are an interesting blend of priority queues and search trees, supporting both efficient update by key and efficient access to the minimum value. Efficient update by key is important when we want to adjust a timeout. More on that later.
In this particular benchmark the cost of inserting a new timeout dominates. The priority search queue supports insertion in O(log n) time, so inserting n timeouts takes O(n log n) time. GHC's I/O manager maintains timeouts in a sorted list, which makes insertion an O(n) operation and inserting n timeouts an O(n2) operation.
Currently, the Haskell standard libraries don't offer much in the way of setting and modifying timeouts, being limited to the threadDelay function in Control.Concurrent. In a server implementation you would like to be able associate a timeout of e.g. ten seconds with each connection and then reset that timeout every time there's some activity on the connection. As mentioned above, priority search queues support efficient update of a value given a key so we already support this use case, even though there's no function in the Haskell standard libraries that lets you take advantage of this yet.


  1. * link to Priority search queue doc.
    * logaritmic scale for times so that the graph does not look like a bug.

  2. Diego,

    I've added a link to the priority search queue paper by Hinze and changed the graph to use a logarithmic scale for the y-axis.