Performance QA Testing

From PostgreSQL wiki

(Added link to perflab mailing list)
(add some free datasets that could be used for benchmarking - inspired by http://ronaldbradford.com/blog/seeking-public-data-for-benchmarks-2009-08-28/)
* Implement the [http://www.cs.umb.edu/~poneil/StarSchemaB.PDF Star Schema Benchmark].

== Datasets ==

Some public datasets that could be used to get realistic data for various kinds of benchmarks:

* [http://www.freebase.com/docs/data_dumps Freebase] - various wiki-style data on places/people/things - ~600MB compressed
* [http://www.imdb.com/interfaces#plain IMDB] - the IMDB database - see also http://code.google.com/p/imbi/
* [http://www.data.gov/ data.gov] - US federal government data collection - see also [http://www.sunlightlabs.com/ sunlightlabs]
* [http://wiki.dbpedia.org/Downloads DBpedia] - Wikipedia data export project
* [http://linux.dell.com/dvdstore/ Dell DVDstore] - Dell's DVD Store context data
* [http://www.eoddata.com/ eoddata] - historic stock market data (requires registration - licence?)
* [http://www.transtats.bts.gov/Tables.asp?DB_ID=120&DB_Name=Airline%20On-Time%20Performance%20Data&DB_Short_Name=On-Time RITA] - Airline On-Time Performance Data
* [http://wiki.openstreetmap.org/wiki/Planet.osm OpenStreetMap] - OpenStreetMap source data

Revision as of 12:52, 30 August 2009

This page centralizes the performance QA testing effort: available hardware, available tools, and continuous benchmarking.

The PostgreSQL Performance Lab is being created to give members of the PostgreSQL open source database community access to enterprise-class hardware for testing.

Testing will include industry-standard workloads such as OLTP, DSS and BI. The hardware will also be used for other practical, customer-oriented testing aimed at improving scalability (processor utilization, I/O, load balancing, etc.) and at managing large data sets (loading, backups, restores, replication, etc.).

Donations

For donation inquiries, please contact Josh Berkus <josh @t postgresql.org> and Joshua Drake <jdrake @t postgresql.org>.

Mailing List

There is a mailing list available to discuss administrative aspects of community equipment. Please continue to use the -hackers and -performance mailing lists for performance and technical discussions.

QA platforms

Tools

Ideas

  • look into sysbench: it has some locking issues on PostgreSQL, but at least read-only mode seems to work fine
  • collect the various small samples and test cases posted over the last few years on -performance, -hackers and -bugs into a test set
  • consider running tests with pgbench -M (simple|extended|prepared) to catch regressions in any one of those query modes
  • resurrect Jan Wieck's TPC-W implementation, available on pgfoundry
  • add full text search benchmarking using ftsbench from Teodor
  • XML benchmarking?
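
The pgbench idea in the list above could be sketched roughly as follows: run the same workload once per query mode and compare the reported throughput between builds. This is only a minimal sketch, not an existing tool. The database name `bench`, the client count, the duration and all helper names are assumptions made for illustration, and the database is assumed to have been initialized with `pgbench -i` beforehand.

```python
import re
import subprocess

# Query modes accepted by pgbench -M.
QUERY_MODES = ("simple", "extended", "prepared")


def build_pgbench_command(mode, dbname="bench", clients=4, seconds=60):
    """Return the argv list for one pgbench run in the given query mode.

    dbname, clients and seconds are illustrative defaults (assumptions).
    """
    if mode not in QUERY_MODES:
        raise ValueError(f"unknown query mode: {mode!r}")
    return ["pgbench", "-M", mode, "-c", str(clients), "-T", str(seconds), dbname]


def parse_tps(output):
    """Extract the first 'tps = <number>' figure from pgbench output, or None."""
    match = re.search(r"tps = ([0-9.]+)", output)
    return float(match.group(1)) if match else None


def run_all_modes(dbname="bench", clients=4, seconds=60):
    """Run pgbench once per query mode and return {mode: tps}."""
    results = {}
    for mode in QUERY_MODES:
        cmd = build_pgbench_command(mode, dbname, clients, seconds)
        completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
        results[mode] = parse_tps(completed.stdout)
    return results
```

Calling `run_all_modes()` on two builds and comparing the per-mode numbers would show whether a regression is limited to one query mode. A single run per mode is noisy, so a real harness would probably repeat each run several times.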

Information
