Gsoc08-hashindex

Project Info

Improve hash index performance. This patch implements the item in the ToDo's list:

In hash indexes, consider storing the hash value with or instead of the key itself

There's two basic ideas in the patch.

We can keep more tuples in a bucket and reduce the index size. It also means all hash indexscans become lossy and have to be rechecked at the heap.

Keep the contents of each index page ordered by hash value and use binary instead of linear search to find the matching item(s) during an indexscan.

Here is some results of hash index with patch vs. btree index. You can see Details About The Test for more information about the test.

Linux = 2.6.24-16-generic

Processor = Intel(R) Core(TM)2Duo CPU T7500@2.20GHz

Memory = 1GB

Table size - 39916800 unique words, 1529MB

Query - 2000 equality query

Hash index without the patch is huge. Here are some simple results of index building on a smaller workload.

table Size - 3628800 unique words, 139MB