Using psycopg2 with PostgreSQL

From PostgreSQL wiki

Jump to: navigation, search

Psycopg2 is a fairly mature driver for interacting with PostgreSQL from the Python scripting language. It is written in C and provides to efficiently perform the full range of SQL operations against Postgres databases. This page deals with the 2nd version of the driver, not much older psycopg driver.

Contents

Overview

Links

Features

  • Multiple connections / connection objects (Does not force the use of a singleton)
  • Transaction management/Methods
  • Return columns as a Python Dictionary (Hash) with column names
  • Automatic filtering
  • Cursor object
  • Connection pooling (example to be added later)
  • Asynchronous queries (Thread Safe)

Complaints

  • Hard to find documentation
  • Sometimes hard to find distribution packages (python-Psycopg2)
  • Sometimes examples are not updated with code modifications

Examples

I will not assume you are perfectly familiar with Python in the below examples; therefore I have provided an excessive amount of comments and each example is a full-fledged script instead of just a snippet.

You may want to copy the examples into your favorite code editor for syntax highlighting!

All code examples were tested on Python 2.6 running OpenSUSE Linux.

Connect to Postgres

Connect to the Postgres Database using authentication. Catch and print a connection error if one occurs.

#!/usr/bin/python
import psycopg2
import sys
 
def main():
	#Define our connection string
	conn_string = "host='localhost' dbname='my_database' user='postgres' password='secret'"
 
	# print the connection string we will use to connect
	print "Connecting to database\n	->%s" % (conn_string)
 
	# get a connection, if a connect cannot be made an exception will be raised here
	conn = psycopg2.connect(conn_string)
 
	# conn.cursor will return a cursor object, you can use this cursor to perform queries
	cursor = conn.cursor()
	print "Connected!\n"
 
if __name__ == "__main__":
	main()

Perform a Select

This example shows how to connect to a database, and then obtain and use a cursor object to retrieve records from a table.

In this example we will assume your database is named "my_database" in the public schema and the table you are selecting from is named "my_table".

#!/usr/bin/python
import psycopg2
import sys
import pprint
 
def main():
	conn_string = "host='localhost' dbname='my_database' user='postgres' password='secret'"
	# print the connection string we will use to connect
	print "Connecting to database\n	->%s" % (conn_string)
 
	# get a connection, if a connect cannot be made an exception will be raised here
	conn = psycopg2.connect(conn_string)
 
	# conn.cursor will return a cursor object, you can use this cursor to perform queries
	cursor = conn.cursor()
 
	# execute our Query
	cursor.execute("SELECT * FROM my_table")
 
	# retrieve the records from the database
	records = cursor.fetchall()
 
	# print out the records using pretty print
	# note that the NAMES of the columns are not shown, instead just indexes.
	# for most people this isn't very useful so we'll show you how to return
	# columns as a dictionary (hash) in the next example.
	pprint.pprint(records)
 
if __name__ == "__main__":
	main()

Select/Fetch Records with Column Names

In this example we will perform a select just like we did above but this time we will return columns as a Python Dictionary so column names are present.

We will also use Psycopg2's prinf-style variable replacement, as well as a different fetch method to return a row (fetchone).

#!/usr/bin/python
import psycopg2
#note that we have to import the Psycopg2 extras library!
import psycopg2.extras
import sys
 
def main():
	conn_string = "host='localhost' dbname='my_database' user='postgres' password='secret'"
	# print the connection string we will use to connect
	print "Connecting to database\n	->%s" % (conn_string)
 
	# get a connection, if a connect cannot be made an exception will be raised here
	conn = psycopg2.connect(conn_string)
 
	# conn.cursor will return a cursor object, you can use this query to perform queries
	# note that in this example we pass a cursor_factory argument that will
	# dictionary cursor so COLUMNS will be returned as a dictionary so we
	# can access columns by their name instead of index.
	cursor = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)
 
	# tell postgres to use more work memory
	work_mem = 2048
 
	# by passing a tuple as the 2nd argument to the execution function our
	# %s string variable will get replaced with the order of variables in
	# the list. In this case there is only 1 variable.
	# Note that in python you specify a tuple with one item in it by placing
	# a comma after the first variable and surrounding it in parentheses.
	cursor.execute('SET work_mem TO %s', (work_mem,))
 
	# Then we get the work memory we just set -> we know we only want the
	# first ROW so we call fetchone.
	# then we use bracket access to get the FIRST value.
	# Note that even though we've returned the columns by name we can still
	# access columns by numeric index as well - which is really nice.
	cursor.execute('SHOW work_mem')
 
	# Call fetchone - which will fetch the first row returned from the
	# database.
	memory = cursor.fetchone()
 
	# access the column by numeric index:
	# even though we enabled columns by name I'm showing you this to
	# show that you can still access columns by index and iterate over them.
	print "Value: ", memory[0]
 
	# print the entire row 
	print "Row:	", memory
 
if __name__ == "__main__":
	main()

Fetch Records using a Server-Side Cursor

If you have an extremely large result set to retrieve from your database, or you would like to iterate through a tables records without first retrieving the entire table a cursor is exactly what you need.

A cursor keeps the database connection open and retrieves database records 1 by 1 as you request them.

There are several ways to accomplish this in Psycopg2, I will show you the most basic example.

For advanced usage see the documentation on the use of cursor.scroll() see:

#!/usr/bin/python
import psycopg2
#note that we have to import the Psycopg2 extras library!
import psycopg2.extras
import sys
 
def main():
	conn_string = "host='localhost' dbname='my_database' user='postgres' password='secret'"
	# print the connection string we will use to connect
	print "Connecting to database\n	->%s" % (conn_string)
 
	conn = psycopg2.connect(conn_string)
 
	# HERE IS THE IMPORTANT PART, by specifying a name for the cursor
	# psycopg2 creates a server-side cursor, which prevents all of the
	# records from being downloaded at once from the server.
	cursor = conn.cursor('cursor_unique_name', cursor_factory=psycopg2.extras.DictCursor)
	cursor.execute('SELECT * FROM my_table LIMIT 1000')
 
	# Because cursor objects are iterable we can just call 'for - in' on
	# the cursor object and the cursor will automatically advance itself
	# each iteration.
	# This loop should run 1000 times, assuming there are at least 1000
	# records in 'my_table'
	row_count = 0
	for row in cursor:
		row_count += 1
		print "row: %s    %s\n" % (row_count, row)
 
if __name__ == "__main__":
	main()
Personal tools