Hash test
From PostgreSQL wiki
Jump to navigationJump to searchThis page is about the detail of the test step by step.
Before Test
Create table
postgres=# CREATE TABLE dict (word varchar(100)); postgres=# COPY dict FROM '/tmp/words';
Clear cache - shutdown the database and clear the kernel cache
$ sudo echo 3 | sudo tee /proc/sys/vm/drop_caches
Create a table to collect the io statistics
CREATE TABLE hash_stat( heap_blks_read integer, heap_blks_hit integer, idx_blks_read integer, idx_blks_hit integer );
Test Hash Index
Create Index and get the build time
postgres=# \timing postgres=# CREATE INDEX wordhash ON dict USING hash (word); 504650.100 ms
Get the stastics before query
postgres=# INSERT INTO hash_stat (heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit) SELECT heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit FROM pg_statio_all_tables WHERE relname = 'dict';
Clear cache again and run pgbench. Get the query time of millisecond by 1000/tps.
$ sudo echo 3 | sudo tee /proc/sys/vm/drop_caches $ pgbench -U postgres -n -f /tmp/query.sql dict transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 tps = 0.192666 (including connections establishing) tps = 0.192684 (excluding connections establishing)
Get the statistics after query
postgres=# INSERT INTO ...
Test Btree Index
Create btree index, get the build time and get the statistics before query
postgres=# DROP INDEX wordhash; postgres=# \timing postgres=# CREATE INDEX wordbtree ON dict USING btree (word); 844495.867 ms
It's the same with hash index test after that. For pgbench test, we get
$ pgbench -U postgres -n -f /tmp/query.sql dict transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 tps = 0.176786 (including connections establishing) tps = 0.176802 (excluding connections establishing)
Result of IO Statistics
state | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit |
hash - before query | 394317 | 39916824 | 0 | 0 |
hash - after query | 414394 | 39916947 | 19724 | 20276 |
btree - before query | 610065 | 39916947 | 0 | 0 |
btree - after query | 629956 | 39917056 | 21736 | 58425 |
we can get blocks_read during the query by (heap_blks_read + idx_blks_read). You've saw the result here.