Usability Challenges
From PostgreSQL wiki
Jump to navigationJump to searchThis page is for tracking my (Peter Eisentraut) current R & D project on Usability Challenges in the PostgreSQL system.
Core server management and configuration
- Too much tuning
- Memory management: too complex, there are few useful guidelines, most things could be automated
- Vacuum: should be automatic -- yay, autovacuum
- Background writer configuration: Who needs that?
- Write-ahead log configuration: too complicated, should be automatic
- Free-space map: The server knows full well how much FSM it needs; see also memory management.
- Managability is lacking
- User accounts: still no good way to manage pg_hba.conf from SQL
- Statistics: too much data but most people don't know what to make of it
- Configuration files: too long, too many options that most people don't need
- Plugins: Using external modules is complicated, sometimes risky, hard to manage.
- Logging: Logging configurability is great but the default configuration is less than useful for new users.
- Tracing: Everything notwithstanding, it is still really hard at times to know what is happening, such as in nested PL/pgSQL calls, in cascaded foreign key actions, and other nested and cascaded contexts.* Clients:
- Health checking: Most people cannot even tell that their system might be misconfigured. Help them.
- No out-of-band monitoring is supported. If pg_ctl launched the postmaster but the postmaster can't start properly functioning backends, the only diagnostics are free-form text logs. This stinks for people trying to manage and automate PostgreSQL installs. An out-of-band monitoring tool is needed that can report things like the port(s) Pg is listening on, any errors produced when trying to start backends, memory status, running queries (w/o having to start a new backend just to query pg_stat_activity), lock status, etc. A command that chats with the postmaster by passing messages over shared memory would make sense, since if the postmaster successfully launched it bound to shm successfully.
- Database dumps aren't portable between different OSes because Pg uses OS-specific locale names and localisation implementation from the C library. A database created on one operating system may not necessarily be restorable to a different operating system without encoding conversion (which pg_restore and psql offers no tools for) or at minimum a manual
CREATE DATABASE
with a hopefully-compatible locale name.
psql
- psql isn't good for robust, reliable automation and scripting. It's great for interactive use, but try to configure it for noninteractive use with proper error checking and useful output. --- It's a mess. Also noted in this research paper commentary.
Backups, pg_dump
, pg_dumpall
and pg_restore
- The default encodings/locales selected on Windows and Linux (UTF8) systems are incompatible with each other, so running
pg_dump -Fc -f dbname.backup dbname
on Linux thenpg_restore -C --dbname postgres dbname.backup
on Windows (or vice versa) will fail to create the database and abort the restore! with an unhelpful error likecould not execute query: ERROR: invalid locale name en_US.UTF-8
. Not cool! We need an encoding/locale equivalence table, and for future versions need some OS-independent locales or mappings in the server its self. See point above about OS dependent locales in core server. - There's no database that
pg_dumpall
andpg_restore --create
can connect to that's guaranteed to always exist. The user shouldn't have to specify a database for either of these commands - it's actively counter-intuitive - but since bothpostgres
andtemplate1
can be dropped andtemplate0
can't be connected to, there's no DB they can always connect to by default. This leads to weird command lines likepg_restore --create --dbname postgres mydb.backup
to restore to a newly created database, probably but not necessarily calledmydb
, not to thepostgres
database. If the user omits--dbname
they get a dump of the SQL of the restore to the console! This isn't good usability; these tools should have a read-only, non-droppable, always-available database they can always connect to when doing database-independent operations. pg_dump
doesn't global objects a database depends on, even when in-Fc
mode. This means that by default PostgreSQL database dumps cannot be restored correctly unless the user dumps additional information separately!.pg_dump
should include global objects like roles that are referred to by the database being dumped, so that backups are complete and correct by default.pg_dumpall
doesn't support the custom format. You can't make an archive containing all databases on a cluster, or have it spit out one dump file per database plus a globals file. This must be done manually using scripting, and that's rather less than user friendly. Backups need to be easy to get right by default!pg_restore -C
doesn't let you choose what name you want for the new database, it forces it to the name the DB was dumped as. If you want a different name you have to manually create the DB with the correct encoding, permissions, ownership, roles, etc.pg_restore
doesn't encourage the use of the.backup
file extension expected by PgAdmin-III
PgAdmin-III (First point of contact for most newbies)
- PgAdmin-III usability may be somewhat lacking
- Using the "Restore" dialog with PgAdmin-III and pointing it a .sql dump produces an unhelpful error message. It should offer to run the SQL dump against the target database, at least when faced with a
pg_dump
sql dump containing its handy "--- PostgreSQL database dump" header (though that might get localised, need to check) - PgAdmin-III offers no easy way to run an SQL script that may contain psql commands like
\connect
. Sincepg_dump
emits psql commands in dumps by default, that means users can't use the PgAdmin-III gui to restore an SQL dump. - SQL scripts loaded into PgAdmin-III's SQL editor are wrapped in a transaction automatically and silently by PgAdmin-III. If any statement fails, the whole script fails. Because
pg_dump
likes to generate statements like\connect dbname
andCREATE ROLE postgres;
in dumps, this means dumps have to be edited by hand before they can be restored. - PgAdmin-III uses the unhelpful
.backup
suffix for backups it creates withpg_dump -Fc
behind the scenes. Backup of what? There's nothing inpg_restore
that says files should have a .backup extension, nor does it encourage them to be created as such, so users who want to restore a backup created from the command line via PgAdmin-III often have to rename the file or change the filter before they can even see it in the file list to restore. - It doesn't register a file extension handler for .backup files
- It doesn't understand .backup files if you drag and drop them onto PgAdmin-III or pass them as command-line arguments to the pgadmin3 executable, and doesn't offer a file->open menu item
- You have to select a database in PgAdmin-III before you can restore a .backup file - even if you intend to check the "Create database" option that uses
pg_restore
's-C
option. That's really counter-intuitive; you should just be able to select the server you want to restore to, or use the restore item in the menu and be prompted for the target server. - You don't get a choice of the database name to use for the newly created database with PgAdmin-III's Restore, Create Database option, it silently uses the db name in the backup file (and doesn't give you any indication of what it is). This is a
pg_restore
limitation. - PgAdmin-III's user interface blocks whenever its running a
pg_dump
or
pg_restore
job, so it feels like it's crashed/hung to a normal user. - PgAdmin-III gives no guidance about why restore commands failed, just spitting out the command line output from pg_restore.
Replication
- Built-in replication can't replicate only some databases of a cluster; you have to replicate both my-critically-important-10MB-database and my-totally-unimportant-50GB-database with the same settings, same priority, etc. This is a usability challenge because it means people have to create and manage multiple clusters to control replication groups, and multiple clusters are hard to manage and configure.
See also Usability reviews