Hash test

From PostgreSQL wiki
Jump to navigationJump to search

This page is about the detail of the test step by step.

Before Test

Create table

postgres=# CREATE TABLE dict (word varchar(100));
postgres=# COPY dict FROM '/tmp/words';

Clear cache - shutdown the database and clear the kernel cache

$ sudo echo 3 | sudo tee /proc/sys/vm/drop_caches

Create a table to collect the io statistics

CREATE TABLE hash_stat(
	heap_blks_read integer,
	heap_blks_hit integer,
	idx_blks_read integer,
	idx_blks_hit integer
);

Test Hash Index

Create Index and get the build time

postgres=# \timing
postgres=# CREATE INDEX wordhash ON dict USING hash (word);
504650.100 ms

Get the stastics before query

postgres=# INSERT INTO 
hash_stat (heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit)
SELECT
heap_blks_read,
heap_blks_hit,
idx_blks_read,
idx_blks_hit
FROM pg_statio_all_tables
WHERE relname = 'dict';

Clear cache again and run pgbench. Get the query time of millisecond by 1000/tps.

$ sudo echo 3 | sudo tee /proc/sys/vm/drop_caches
$ pgbench -U postgres -n -f /tmp/query.sql  dict
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
tps = 0.192666 (including connections establishing)
tps = 0.192684 (excluding connections establishing)

Get the statistics after query

postgres=# INSERT INTO ...

Test Btree Index

Create btree index, get the build time and get the statistics before query

postgres=# DROP INDEX wordhash;
postgres=# \timing
postgres=# CREATE INDEX wordbtree ON dict USING btree (word);
844495.867 ms

It's the same with hash index test after that. For pgbench test, we get

$ pgbench -U postgres -n -f /tmp/query.sql  dict
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
tps = 0.176786 (including connections establishing)
tps = 0.176802 (excluding connections establishing)

Result of IO Statistics

state heap_blks_read heap_blks_hit idx_blks_read idx_blks_hit
hash - before query 394317 39916824 0 0
hash - after query 414394 39916947 19724 20276
btree - before query 610065 39916947 0 0
btree - after query 629956 39917056 21736 58425

we can get blocks_read during the query by (heap_blks_read + idx_blks_read). You've saw the result here.