PostgreSQL For Development With Vagrant

From PostgreSQL wiki
Jump to navigationJump to search

Why Run PostgreSQL in a VM?

There are many ways of installing and running PostgreSQL on OS X and Linux. This document is to make that process simpler by offering a single, consistent solution that works on both platforms and allows for multiple, isolated versions of PostgreSQL.

Running PostgreSQL in a virtual machine (VM) standardizes the underlying operating system (no more Linux vs OS X issues), provides a consistent environment (standard apt-get install), and completely isolates unrelated databases from each other. After the initial installation is complete, it takes about a minute to create a fresh PostgreSQL server.

Finally, the setup described in this guide can be used in addition to any other type of PostgreSQL installation you already have locally. As everything will be running in a virtual machine it will not interfere with other ways you may have installed PostgreSQL already.

The approach described in this document is to run PostgreSQL in a virtual machine automated using Vagrant.

Who is this for?

  • You want to learn SQL using PostgreSQL
  • You're developing web applications using PostgreSQL
  • You're developing desktop applications that will connect to PostgreSQL
  • You want a database on your local machine for data analysis
  • You work on team that uses both OS X and Linux (or you use both yourself) and want everybody to be able to run the same version of PostgreSQL
  • You want to run multiple copies of PostgreSQL concurrently
  • You want to be able to easily turn on and off your databases
  • You want your database to be running in a different timezone than your local desktop (i.e. UTC)

If you fall into any of these categories then this guide is probably for you.

Who is this not for?

  • You're deploying a production database

    This should be pretty obvious ... don't run production databases on developer laptops!

  • You need multiple people to access the same database

    This is a bad idea for development databases anyway. It's much easier for multiple developers to work independently with their own self contained system on their local machines. Integration testing should happen in a completely separate environment.

Installation

Virtual Box

Virtual Box is a hyper visor for x86 hardware. It's an application that runs on your computer (referred to as the "host") that runs virtual machines. Virtual machines allow you to simulate an entire virtual server on your desktop or laptop.

Mac OS X Installation

On Mac OS X, install it by downloading the installer and running it.

Linux Installation

On Linux, follow the instructions for your distribution. Generally this will be done by installing a system package for it. For example on

Vagrant

Vagrant is an automation wrapper for starting up and provisioning virtual servers. It is used to automate virtual machine creation. It can be used with a number of virtualization platforms, one of which is Virtual Box.

Vagrant makes it easy to reproducibly run the same build process to provision a server. It also provides a simple way to forward ports on the virtual server to your local machine.

Mac OS X Installation

On Mac OS X, install it by downloading the installer and running it.

Linux Installation

On Linux, install it by downloading the installer for you distribution and running it.

Note: Most Linux distributions have very out of date packages for Vagrant so you're best off downloading and installing it directly from the Vagrant website.

Database Setup

Once you have both Virtual Box and Vagrant installed you can spin up virtual machines using vagrant by simply running vagrant up. This command looks in your current directory for a file name Vagrantfile that describes the configuration and provisioning of the virtual machine.

Cloning the Sample

To try it out, clone the base pg-app-dev-vm git repo. Replace myapp with the name of your project.

# Clone the repository:
$ git clone https://github.com/jackdb/pg-app-dev-vm myapp

# Remove the .git, README, and LICENSE files:
$ cd myapp
$ rm -rf .git README.md LICENSE

This repository has two main parts, a Vagrantfile that specifies the shell script to initialize the virtual machine and the shell script itself (in Vagrant-setup/). Once started it will:

  • Create a fresh install of Ubuntu 12.04 LTS
  • Update the system packages
  • Add the PGDG apt repository and signing key
  • Install PostgreSQL version 9.3
  • Install the contrib packages for PostgreSQL version 9.3 (this allows for simpler extension installation via CREATE EXTENSION...)
  • Create a database user for your application
  • Create a database with your database user as the owner
  • Forward the database server port of 5432 to 15432 on your local machine

To change the database username and password, edit the file Vagrant-setup/bootstrap.sh. The defaults are myapp and dbpass respectively.

Starting Up The Virtual Machine

To start up the virtual machine enter the directory with the Vagrantfile and run:

$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
[... truncated ...]
Successfully created PostgreSQL dev virtual machine."

Your PostgreSQL database has been setup and can be accessed on your local machine on the forwarded port (default: 15432)
  Host: localhost
  Port: 15432
  Database: myapp
  Username: myapp
  Password: dbpass

Admin access to postgres user via VM:
  vagrant ssh
  sudo su - postgres

psql access to app database user via VM:
  vagrant ssh
  sudo su - postgres
  PGUSER=myapp PGPASSWORD=dbpass psql -h localhost myapp

Env variable for application development:
  DATABASE_URL=postgresql://myapp:dbpass@localhost:15432/myapp

Local command to access the database via psql:
  PGUSER=myapp PGPASSWORD=dbpass psql -h localhost -p 15432 myapp

The first time you run this it will take a few minutes as the Ubuntu 12.04 LTS base image will need to be downloaded. Subsequently it will be much faster as the image will be cached locally. Once the image is downloaded you should see the build process running and upon completion it will print a couple of ways you can access your database.

Shutting It Down

To stop the virtual machine run:

$ vagrant halt
[default] Removing cache buckets symlinks...
[default] Attempting graceful shutdown of VM...

This will "power off" the virtual machine. When you are not using your database you should do this so that it does not use up local system resources (having a lot of VMs running concurrently can use up a lot of memory).

Starting It Up Again

Now start it up again:

$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
[... truncated ...]
VM was already provisioned at: Sat Jan 11 14:55:20 UTC 2014
To run system updates manually login via 'vagrant ssh' and run 'apt-get update && apt-get upgrade'

Your PostgreSQL database has been setup and can be accessed on your local machine on the forwarded port (default: 15432)
  Host: localhost
  Port: 15432
  Database: myapp
  Username: myapp
  Password: dbpass

Admin access to postgres user via VM:
  vagrant ssh
  sudo su - postgres

psql access to app database user via VM:
  vagrant ssh
  sudo su - postgres
  PGUSER=myapp PGPASSWORD=dbpass psql -h localhost myapp

Env variable for application development:
  DATABASE_URL=postgresql://myapp:dbpass@localhost:15432/myapp

Local command to access the database via psql:
  PGUSER=myapp PGPASSWORD=dbpass psql -h localhost -p 15432 myapp

This second time you start it up it should be relatively quick (just a couple of seconds) as everything is already installed.

Accessing Your Database

Upon startup the Vagrant initalization script will print a number of ways to access your database.

From a Webapp You Are Developing

If you are developing a web application then you can use the following URL style environment variable:

DATABASE_URL=postgresql://myapp:dbpass@localhost:15432/myapp

Via psql as the postgres Admin User

To perform administrative tasks on your database, connect to the postgres admin user within the virtual machine:

# Change directory to your application's project directory containing the Vagrantfile
$ cd myapp

# Connect via SSH to the virtual machine
$ vagrant ssh

# Become the postgres user:
$ sudo su - postgres

# Then you can run psql, createdb and the other PostgreSQL client commands
postgres@precise64:~$ id
uid=106(postgres) gid=112(postgres) groups=112(postgres),111(ssl-cert)

postgres@precise64:~$ which psql
/usr/bin/psql

Via psql on your local machine

If you have the PostgreSQL client libraries installed on your local machine you can connect to your virtual machine database via:

$ PGUSER=myapp PGPASSWORD=dbpass psql -h localhost -p 15432 myapp

For other local tools and GUIs, use the database server details that gets printed on startup:

Your PostgreSQL database has been setup and can be accessed on your local machine on the forwarded port (default: 15432)
  Host: localhost
  Port: 15432
  Database: myapp
  Username: myapp
  Password: dbpass

Limitations

  • Client Libraries Not Installed Locally.

    As the database and all binaries are installed in the virtual machine, you will not have them on your local $PATH. This means that you cannot execute psql or pg_dump from the command line without first connecting to the virtual machine (via vagrant ssh).

    One workaround for this is to install that latest PostgreSQL client (9.3 as of writing this) on your local machine. You can then use it to connect to your virtual machine databases by connecting to the forwarded port in the same way that any applications you are developing would.

  • Increased Memory Usage

    As each database is running in it's own virtual machine if you run a lot of them concurrently you'll need quite a bit of memory. The simplest work around is only to start up the virtual machines that you are currently using and stop them via vagrant halt when you are not using them.

Gotchas

  • Conflicting Forwarded Ports

    If you are going to be running multiple virtual machines simulataneously then make sure to edit the forwarded port in Vagrantfile from 15432 to something unique per virtual machine. If you are only using the virtual machines one at a time then this is not necessary.