Sample Databases

From PostgreSQL wiki


Many database systems ship with sample databases. A good introduction to the popular ones, which also discusses samples available for other databases, is Sample Databases for PostgreSQL and More.

One trivial sample that PostgreSQL ships with is pgbench. It has the advantage of being built in and of including a scalable data generator: with the current version you can create databases of any size from approximately 16MB to 600GB.
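As a rough illustration of how the scalable generator is driven, the sketch below picks a pgbench scale factor for a target database size and builds the matching initialization command. The `-i` (initialize) and `-s` (scale factor) options are standard pgbench options, and each scale unit adds 100,000 rows to `pgbench_accounts`. The figure of about 16MB per scale unit is an assumption taken from the lower bound quoted above, and the database name `pgbench_sample` is hypothetical.

```python
import math

# Rows that pgbench adds to pgbench_accounts for each unit of scale factor.
ROWS_PER_SCALE_UNIT = 100_000

# Assumption: roughly 16MB of data per scale unit, based on the 16MB
# minimum size mentioned in the text. Real sizes vary by version.
APPROX_MB_PER_SCALE_UNIT = 16


def scale_for_target_size(target_mb):
    """Return the smallest scale factor whose estimated size reaches target_mb."""
    return max(1, math.ceil(target_mb / APPROX_MB_PER_SCALE_UNIT))


def pgbench_init_command(scale, dbname):
    """Build the argument list for initializing a pgbench database."""
    return ["pgbench", "-i", "-s", str(scale), dbname]


if __name__ == "__main__":
    # Aim for roughly 1GB of sample data in a hypothetical database.
    scale = scale_for_target_size(1024)
    print("estimated account rows:", scale * ROWS_PER_SCALE_UNIT)
    print(" ".join(pgbench_init_command(scale, "pgbench_sample")))
```

The target database must already exist before the printed command is run; the size estimate is only a starting point and should be checked against the actual database after initialization.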

PgFoundry Samples

The latest collection of PostgreSQL compatible database samples is at PgFoundry Sample Databases. It includes three commonly used benchmark databases:

  • World: Based on the MySQL World sample. Lists cities, countries, and the languages spoken in each country.
  • dellstore2: PostgreSQL port of a database-neutral e-commerce test application developed by Dell. The original data generator supports three size scales (10MB, 1GB, 100GB); currently only the smallest ("normal") data set has been ported to PostgreSQL. PostgreSQL 8.4: Windowing Functions uses this test data to show some advanced queries.
  • Pagila: Based on Sakila, MySQL's replacement for World. Sakila is itself inspired by the Dell DVD Store.

There are some other sample databases there as well, such as a USDA Food database and a large set of country data based on the ISO 3166 standards.

Other Samples

  • The land registry file is a 3.2GB CSV file of organic data that can be loaded into PostgreSQL. No registration required.
  • AdventureWorks 2014 for Postgres - Scripts to set up the OLTP part of the go-to database used in training classes and for sample apps on the Microsoft stack. The result is 68 tables containing HR, sales, product, and purchasing data organized across 5 schemas. It represents a fictitious bicycle parts wholesaler with a hierarchy of nearly 300 employees, 500 products, 20,000 customers, and 31,000 sales, each with an average of 4 line items. It is big enough to be interesting, but not unwieldy. In addition to being a well-rounded OLTP sample, it is also a good choice for demonstrating ETL into a data warehouse.
  • Book Town - used for the examples in Practical PostgreSQL
  • Mouse Genome sample data set. See instructions. Custom format dump, 1.3GB compressed; the restored database is tens of GB in size. MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human health and disease. MGI uses PostgreSQL in production [1] and gives researchers direct protocol access, so the custom format dump is not an afterthought. Apparently updated frequently.
  • Benchmarking databases such as DBT-2 or TPC-H can be used as samples.
  • Freebase - Various wiki style data on places/people/things - ~600MB compressed
  • IMDB - the IMDB database - see also
  • [2] - US federal government data collection see also sunlightlabs
  • DBpedia - Wikipedia data export project
  • eoddata - historic stock market data (requires registration - licence?)
  • RITA - Airline On-Time Performance Data
  • OpenStreetMap - OpenStreetMap source data
  • NCBI - biological annotation from NCBI's ENTREZ system (daily updated)
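Several of the samples above, such as the land registry file, are distributed as CSV rather than as database dumps. A minimal sketch of one way to prepare such a file for loading follows: it reads the header row and generates a staging table definition plus a matching COPY statement. It assumes the file has a header row and that the table name is a simple identifier; the table name `staging` and the column names in the usage example are hypothetical, and every column is declared as text so that types can be refined after loading.

```python
import csv
import io


def copy_statements(csv_text, table):
    """Build CREATE TABLE and COPY statements from a CSV header row.

    Assumes the first row of csv_text holds the column names and that
    `table` is a simple identifier that needs no quoting.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    # Quote each column name, doubling any embedded double quotes.
    cols = ['"{}"'.format(name.strip().replace('"', '""')) for name in header]
    create = "CREATE TABLE {} ({});".format(
        table, ", ".join(col + " text" for col in cols)
    )
    copy = "COPY {} ({}) FROM STDIN WITH (FORMAT csv, HEADER true);".format(
        table, ", ".join(cols)
    )
    return create, copy


if __name__ == "__main__":
    # Hypothetical two-column sample; only the header row is used.
    sample = "id,price\n1,100\n"
    for statement in copy_statements(sample, "staging"):
        print(statement)
```

The generated COPY statement reads from standard input, so it can be fed the file through psql or a client library; for a multi-gigabyte file, only the first line needs to be passed to this helper.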