8.1.4 et. al. Security Release User's Guide
The PostgreSQL Global Development Group today released updates for all active branches. These updates patch a serious SQL-injection security hole. All PostgreSQL users running systems that are exposed to untrusted users or networks are advised to upgrade at the earliest possible opportunity. Note that a full fix may require upgrading client and driver libraries as well.
Why Was This Done?
A few weeks ago, members of our Japanese developer community contacted us with news of a SQL injection exploit for PostgreSQL with PHP in Far Eastern character encodings such as SJIS. It seemed that a clever attacker could exploit knowledge of how multi-byte encodings and string escaping work inside PostgreSQL in order to sneak injected SQL strings past all commonly used client-side safeguards. Subsequent investigation showed that related attacks would work in all multi-byte encodings, in particular UTF8 which is widely used world-wide.
See "Technical Details on the Security Hole" below for additional technical information on how the exploit works.
What Does This Mean To Me?
First, this means if you are running a PostgreSQL server which is exposed to untrusted input ... either from the web, or from internal users in security-conscious applications ... then you need to upgrade PostgreSQL at the nearest opportunity.
In addition to PostgreSQL itself, many of the PostgreSQL drivers will be upgraded this week (Java, PHP, Perl, Ruby, etc.) You will want to upgrade them at the same time. Unfortunately, some legacy applications will need to be repaired as well. The server changes will help you find out about these applications, but will not by themselves fix the code. This means that you will need to test your application on the new version before upgrading production software.
It is important to understand that the fundamental bug being dealt with is on the client side: a client that needs to "escape" a string value in order to insert it into a SQL query must do that escaping with awareness of the character set encoding it is using. Applications that rely on a library or driver function to do escaping will be safe after the library or driver has been upgraded. Applications that do their own "homebrew" escaping, for instance by using regexp substitution to double quotes and backslashes, are at serious risk if they are used in any multibyte character set encoding.
An example of an application at risk is a PHP program that uses addslashes() or magic_quotes. We note that these tools have been deprecated by the PHP group since version 4.0. Some other programming languages which do not yet have an updated driver may also be vulnerable. When in doubt, test or contact your database driver project.
Since it is unlikely that all applications at risk can be fixed immediately, we have put changes into the PostgreSQL server to cause it to reject queries that could have been corrupted by SQL injection attacks. With an updated server you can be sure that no such attacks will succeed in executing unwanted SQL commands --- but an attack would cause your application to receive an error from the server, which it might or might not be able to handle gracefully. So you will want to move forward with ensuring that any application coding deficiencies are fixed soon.
What Changed Exactly?
In order to completely close off this security hole, we have had to make changes in both the client and the server. Most of these changes have also been backported to versions 8.0, 7.4 and 7.3.
Changes In The Server
Two changes have been made in the PostgreSQL server to cause it to reject potentially-corrupted queries:
1) Whenever the client is using a multibyte encoding, the server will reject strings containing invalidly encoded multibyte characters. While PostgreSQL has been moving in this direction for some time, the checks are now back-ported to all supported versions, applied uniformly to all encodings, and cannot be disabled. This may cause problems for applications that formerly permitted "garbage characters" in string input. If you need to store data that is not in any recognized encoding, we recommend using SQL_ASCII encoding on both ends of the connection.
2) When the client is using a "client only" encoding (SJIS, BIG5, GBK, GB18030, or UHC) the server furthermore rejects uses of "\'" to represent a single quote mark in a SQL string literal. This historical usage has been deprecated for some time in favor of the SQL-standard representation "" (two single quote marks). However, it is still extremely common, and this change will doubtless cause issues for some applications. We have therefore added a server parameter "backslash_quote" that can be set to control how strict the server is:
* backslash_quote = on : Allow \' always (old behavior; INSECURE * backslash_quote = off : Reject \' always * backslash_quote = safe_encoding : Allow \' if client_encoding is safe
"safe_encoding" is the new default, because many practical applications operate with server-safe encodings and therefore need not be forced to change immediately. "on" can be used when absolutely necessary for application compatibility, but it is unsafe to use with clients exposed to untrusted input.
Client and Libpq Changes
If your application uses libpq (the C-language client library included with the core PostgreSQL distribution), then the important changes are in PQescapeString and PQescapeBytea, the two routines provided by that library for escaping strings. If your code uses only one PostgreSQL database connection at a time, then the updated versions of PQescapeString and PQescapeBytea will work fine for you. If your application uses multiple connections concurrently, you should adjust your code to use the new functions PQescapeStringConn and PQescapeByteaConn, which make sure that they do the escaping correctly for a particular database connection.
If your code is doing escaping "by hand", for instance by doubling quotes and backslashes, you really need to fix it to use the library routines instead. If you're avoiding the need for escaping at all, by sending variable strings as out-of-line parameters, then you've saved yourself a whole lot of trouble and can stop worrying.
Most other client-side libraries have comparable facilities for dealing with insertion of variable strings into SQL queries. Whatever you are using, you need to make sure that inserted strings are processed by a correctly updated function.