Monday, February 21, 2011

New release of hashable

I've just made a new release of the hashable package. New in this version is the possibility to provide a custom salt when hashing a value. This is useful if you need to compute several different hash values for the same input, for example when implementing cuckoo hashing.

There are also a few improvements and bug fixes in this release. Most notable is the increased resilience to badly distributed hashes due to leading zeroes in input strings or tuples.

Thanks to Jan-Willem Maessen and Antoine Latter, who both contributed to this release.

Friday, February 18, 2011

A new, faster containers library

I am delighted to announce the release of preview versions of two new packages:

unordered-containers 0.1
Efficient hashing-based container types.

hashable 1.1
Efficient hashing of common types.

These packages address a need for faster container types in Haskell, especially for string types like ByteString and Text. The package achieves better performance by using hashing and a Patricia trie (also known as a radix tree), instead of a balanced tree, in its implementation.

The provided HashMap type is three times faster than Data.Map for lookups and twice as fast for inserts, for all key types I've tried, including ByteString, String, and Int.

I'm calling this a preview release, even though the code has been well tested both for correctness and performance, as the API is quite basic and doesn't provide a HashSet type yet. One thing you'll notice is that the API is split into a value-strict and a value-lazy version, making it easier to use in a strict setting.

If you want to contribute, please get copies of the source trees from here:

  • git clone https://github.com/tibbe/unordered-containers.git
  • git clone https://github.com/tibbe/hashable.git

Alternatively, just fork the projects on GitHub.

As I mentioned in my talk on hashing-based containers earlier this week, I'm working on an even faster version, based on hash-array mapped tries, but it won't be ready until GHC 7.2.

Tuesday, February 15, 2011

Slides from my hashing-based containers talk at Galois

Today I gave a talk at Galois, titled "Faster persistent data structures through hashing". I presented the results from my recent work on trying to create a faster map data type for Haskell.

The talk was recorded so expect it to be posted on the Galois blog some time in the future. In the mean time, here are the slides I presented.

Many thanks to Don Stewart and Iavor Diatchki for organizing. Also, a big thank you to everyone who attended. I really enjoyed all the discussions we had after the talk.

Saturday, February 12, 2011

Slides from my talk on reasoning about laziness at BayHac

Here are my slides for today's tutorial on reasoning about laziness:

(There's a downloadable PDF version, too, if you find that an easier format to deal with.)

I've also posted a Git repository of the slide source code, in case anyone would like to use the slides for their own purposes.