From PostgreSQL wiki
Revision as of 05:30, 8 June 2012 by Ringerc
This page is for tracking my (Peter Eisentraut) current R & D project on Usability Challenges in the PostgreSQL system.
Core server management and configuration
- Too much tuning
- Memory management: too complex, there are few useful guidelines, most things could be automated
- Vacuum: should be automatic -- yay, autovacuum
- Background writer configuration: Who needs that?
- Write-ahead log configuration: too complicated, should be automatic
- Free-space map: The server knows full well how much FSM it needs; see also memory management.
- Managability is lacking
- User accounts: still no good way to manage pg_hba.conf from SQL
- Statistics: too much data but most people don't know what to make of it
- Configuration files: too long, too many options that most people don't need
- Plugins: Using external modules is complicated, sometimes risky, hard to manage.
- Logging: Logging configurability is great but the default configuration is less than useful for new users.
- Tracing: Everything notwithstanding, it is still really hard at times to know what is happening, such as in nested PL/pgSQL calls, in cascaded foreign key actions, and other nested and cascaded contexts.* Clients:
- Health checking: Most people cannot even tell that their system might be misconfigured. Help them.
- No out-of-band monitoring is supported. If pg_ctl launched the postmaster but the postmaster can't start properly functioning backends, the only diagnostics are free-form text logs. This stinks for people trying to manage and automate PostgreSQL installs. An out-of-band monitoring tool is needed that can report things like the port(s) Pg is listening on, any errors produced when trying to start backends, memory status, running queries (w/o having to start a new backend just to query pg_stat_activity), lock status, etc. A command that chats with the postmaster by passing messages over shared memory would make sense, since if the postmaster successfully launched it bound to shm successfully.
- Database dumps aren't portable between different OSes because Pg uses OS-specific locale names and localisation implementation from the C library. A database created on one operating system may not necessarily be restorable to a different operating system without encoding conversion (which pg_restore and psql offers no tools for) or at minimum a manual
CREATE DATABASEwith a hopefully-compatible locale name.
- psql isn't good for robust, reliable automation and scripting. It's great for interactive use, but try to configure it for noninteractive use with proper error checking and useful output. --- It's a mess. Also noted in this research paper commentary.
- The default encodings/locales selected on Windows and Linux (UTF8) systems are incompatible with each other, so running
pg_dump -Fc -f dbname.backup dbnameon Linux then
pg_restore -C --dbname postgres dbname.backupon Windows (or vice versa) will fail to create the database and abort the restore! with an unhelpful error like
could not execute query: ERROR: invalid locale name en_US.UTF-8. Not cool! We need an encoding/locale equivalence table, and for future versions need some OS-independent locales or mappings in the server its self. See point above about OS dependent locales in core server.
- There's no database that
pg_restore --createcan connect to that's guaranteed to always exist. The user shouldn't have to specify a database for either of these commands - it's actively counter-intuitive - but since both
template1can be dropped and
template0can't be connected to, there's no DB they can always connect to by default. This leads to weird command lines like
pg_restore --create --dbname postgres mydb.backupto restore to a newly created database, probably but not necessarily called
mydb, not to the
postgresdatabase. If the user omits
--dbnamethey get a dump of the SQL of the restore to the console! This isn't good usability; these tools should have a read-only, non-droppable, always-available database they can always connect to when doing database-independent operations.
pg_dumpdoesn't global objects a database depends on, even when in
-Fcmode. This means that by default PostgreSQL database dumps cannot be restored correctly unless the user dumps additional information separately!.
pg_dumpshould include global objects like roles that are referred to by the database being dumped, so that backups are complete and correct by default.
pg_dumpalldoesn't support the custom format. You can't make an archive containing all databases on a cluster, or have it spit out one dump file per database plus a globals file. This must be done manually using scripting, and that's rather less than user friendly. Backups need to be easy to get right by default!
pg_restore -Cdoesn't let you choose what name you want for the new database, it forces it to the name the DB was dumped as. If you want a different name you have to manually create the DB with the correct encoding, permissions, ownership, roles, etc.
pg_restoredoesn't encourage the use of the
.backupfile extension expected by PgAdmin-III
PgAdmin-III (First point of contact for most newbies)
- PgAdmin-III usability may be somewhat lacking
- Using the "Restore" dialog with PgAdmin-III and pointing it a .sql dump produces an unhelpful error message. It should offer to run the SQL dump against the target database, at least when faced with a
pg_dumpsql dump containing its handy "--- PostgreSQL database dump" header (though that might get localised, need to check)
- PgAdmin-III offers no easy way to run an SQL script that may contain psql commands like
pg_dumpemits psql commands in dumps by default, that means users can't use the PgAdmin-III gui to restore an SQL dump.
- SQL scripts loaded into PgAdmin-III's SQL editor are wrapped in a transaction automatically and silently by PgAdmin-III. If any statement fails, the whole script fails. Because
pg_dumplikes to generate statements like
CREATE ROLE postgres;in dumps, this means dumps have to be edited by hand before they can be restored.
- PgAdmin-III uses the unhelpful
.backupsuffix for backups it creates with
pg_dump -Fcbehind the scenes. Backup of what? There's nothing in
pg_restorethat says files should have a .backup extension, nor does it encourage them to be created as such, so users who want to restore a backup created from the command line via PgAdmin-III often have to rename the file or change the filter before they can even see it in the file list to restore.
- It doesn't register a file extension handler for .backup files
- It doesn't understand .backup files if you drag and drop them onto PgAdmin-III or pass them as command-line arguments to the pgadmin3 executable, and doesn't offer a file->open menu item
- You have to select a database in PgAdmin-III before you can restore a .backup file - even if you intend to check the "Create database" option that uses
-Coption. That's really counter-intuitive; you should just be able to select the server you want to restore to, or use the restore item in the menu and be prompted for the target server.
- You don't get a choice of the database name to use for the newly created database with PgAdmin-III's Restore, Create Database option, it silently uses the db name in the backup file (and doesn't give you any indication of what it is). This is a
- PgAdmin-III's user interface blocks whenever its running a
pg_dump<code> or <code>pg_restorejob, so it feels like it's crashed/hung to a normal user.
- PgAdmin-III gives no guidance about why restore commands failed, just spitting out the command line output from pg_restore.
- Built-in replication can't replicate only some databases of a cluster; you have to replicate both my-critically-important-10MB-database and my-totally-unimportant-50GB-database with the same settings, same priority, etc. This is a usability challenge because it means people have to create and manage multiple clusters to control replication groups, and multiple clusters are hard to manage and configure.
See also Usability reviews