Tuesday, December 7, 2010

ByteString support in network

Not so long ago, I merged the network-bytestring package into the network package. This addresses two problems with the old network API:

  • Performance: String is implemented as a linked list of Char values, each of which is a full Unicode code point, which isn't a very efficient way to store binary data. ByteString was designed with efficiency in mind and has a memory overhead of just a few words per string, which is acceptable given a typical network message size of e.g. 4096 bytes.

  • Correctness: String is for storing Unicode text, not binary data. This can lead to subtle errors if you send text over the network in an encoding other than ISO 8859-1.
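To make the correctness point concrete, here is a minimal sketch of what can go wrong when text is squeezed into bytes one character at a time. It assumes the bytestring and text packages are available; the sample string is made up for illustration, and Data.ByteString.Char8.pack stands in for any API that silently keeps only the low 8 bits of each Char.

```haskell
module Main (main) where

import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf8)

main :: IO ()
main = do
  -- A one-character string: GREEK SMALL LETTER LAMDA, code point U+03BB (955).
  let s = "\955"

  -- Char8.pack keeps only the low 8 bits of each Char, so the
  -- character is silently corrupted: 955 `mod` 256 == 187.
  print (B.unpack (C.pack s))               -- prints [187]

  -- Encoding explicitly (here as UTF-8) yields the bytes a peer
  -- can actually decode back to the original text.
  print (B.unpack (encodeUtf8 (T.pack s)))  -- prints [206,187]
```

With ByteString in the socket API, the choice of encoding becomes an explicit step in your code rather than something that happens implicitly inside send.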

The new API exposes two new modules: Network.Socket.ByteString and Network.Socket.ByteString.Lazy. Both define ByteString versions of the different variants of send and recv.
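As a rough sketch of what using the strict module looks like, the example below sends a ByteString over one end of a connected socket pair and reads it from the other end. It is written against a recent release of network, not necessarily the one described in this post, and it assumes a Unix-like system (socketPair with AF_UNIX is not available everywhere). The message and the 4096-byte buffer size are arbitrary choices for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Control.Exception (bracket)
import qualified Data.ByteString.Char8 as C
import Network.Socket
import qualified Network.Socket.ByteString as NSB

main :: IO ()
main =
  -- Create two connected sockets and make sure both get closed.
  bracket (socketPair AF_UNIX Stream defaultProtocol)
          (\(a, b) -> close a >> close b)
          $ \(a, b) -> do
    -- sendAll keeps sending until the whole ByteString is written.
    NSB.sendAll a "ping"
    -- recv returns at most 4096 bytes; on a stream socket it may
    -- return fewer bytes than were sent, so real code should loop.
    msg <- NSB.recv b 4096
    C.putStrLn msg
```

The module is imported qualified here only to avoid any clash with same-named functions from Network.Socket in older releases.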

While they are not formally deprecated, I would advise against using the String-based recv and send functions in new code, for the reasons given above.

In addition, the new modules add support for scatter/gather I/O. Scatter/gather I/O lets you, for example, send several small chunks of data with a single system call. This minimizes the number of context switches and, because the chunks don't have to be concatenated first, avoids unnecessary copying of data in user space.
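Here is a small sketch of gather-style sending using sendMany from Network.Socket.ByteString, which takes a list of ByteStrings and writes them out without concatenating them first. As before, this assumes a recent release of network and a Unix-like system for socketPair; the header, body, and trailer values are hypothetical and only stand in for the pieces of a real protocol message.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Control.Exception (bracket)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import Network.Socket
import qualified Network.Socket.ByteString as NSB

main :: IO ()
main =
  bracket (socketPair AF_UNIX Stream defaultProtocol)
          (\(a, b) -> close a >> close b)
          $ \(a, b) -> do
    -- Three separately built chunks of one logical message.
    let header  = "LEN 5\r\n" :: B.ByteString
        body    = "hello"     :: B.ByteString
        trailer = "\r\n"      :: B.ByteString
    -- Hand all chunks to the library at once instead of calling
    -- B.concat first and then sending the copy.
    NSB.sendMany a [header, body, trailer]
    -- A single recv is not guaranteed to return the whole message
    -- on a stream socket; this is kept simple for illustration.
    msg <- NSB.recv b 4096
    C.putStr msg
```

The alternative, B.concat [header, body, trailer] followed by sendAll, would allocate a new buffer and copy every chunk into it before the kernel ever sees the data.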