Installation and Administration Best practices
From PostgreSQL wiki
|Line 153:||Line 153:|
Revision as of 01:45, 11 September 2009
This page is under construction.
This page will contain information about the best way to install and maintain a Postgresql database, including environment variables, paths and other relevant stuff.
Fell free to add your suggestions and knowledge to make it a 'hands on' resorce.
Self compiled vs. package distributed
Before installing, consider whether to use packages distributed with your operating system or whether to roll your own. (Compiling PostgreSQL on Windows might be a difficult task. It's quite easy on Linux/Unix boxes.)
Let's compare using a pre-built package and compiling PostgreSQL yourself. (Note: This is a rather Linux-centric view. For Windows, you almost always might want to use the binary packages provided for each release.)
|Using pre-built package from distribution||Compiling yourself.|
|Very easy to install - just use your package manager.||You might need to install gcc and some development packages just for building PostgreSQL.|
|Installation is dependent on distribution (location of config files, initial tablespace).||You may install everything in one place, just where you want it.|
|Startup-scripts are included and supposed to work.||You need to provide your own system startup scripts.|
|The packages might be out of date or new minor versions might not become available frequently.||You are free to use the latest stable version and perform upgrades at your will.|
|The package management knows about the PostgreSQL installation and will update it.||Your package management doesn't know anything about the installation. Dependent libraries might get uninstalled or replaced by newer, incompatible versions. (Note: This is rather unlikely. I've never seen it happen. PostgreSQL doesn't depend on any strange or fast-evolving packages.)|
Compiling and installing in Solaris
TODO: Add a workaround for the most common issues.
Compiling in Solaris using Sun Studio 12
./configure --prefix=/usr/local/pgsql84 CC=/opt/SUNWspro/bin/cc 'CFLAGS=-xO3 -xarch=native -xspace -W0,-Lt -W2,-Rcond_elim -Xa -xildoff -xc99=none -xCC' --datadir=/usr/local/pgsql84/data84 --enable-dtrace --enable-cassert --with-perl --with-python --with-libxml --with-libxslt --with-ossp-uuid --without-readline
- OSol don't have readline library
- Maybe you don't need --enable-cassert option.
UUID: There is a problem with UUID library in Open Solaris 200911. If some one have a workaround in this, please post it.
Multiple Versions on the same host
If you have to install multiple PostgreSQL versions at the same host, compile from source and call configure like this:
./configure --prefix=/opt/postgresql-8.2.11 --with-pgport=8200
That way, you never need to worry what version you are talking with - you just look at the port number.
Other way is changing port in postgresql.conf. Beware of that if you have am own init script, remeber to change values of PGDATA and PGUSER.
Making sure it starts up at system boot time
TODO: Provide a default init script (if there's not already one in contrib/).
Recommended values to be changed in big servers
In Linux, the SHMMAX value is setted in a historical value. So, depending on your server, the first value to be changed is SHMMAX and SHMALL.
Example of a high configuration:
fs.file-max = 32768 kernel.shmmax = 1073741824 kernel.shmall = 536870912
How to calculate? One of the equations is:
250kb + 8.2kb * shared_buffers +14.2kb * max_connections
Why this value is important? This value govern shared memory allocation with other values. If you try to assign high values in Postgresql but you don't touch this values, Postgresql may not run.
One interest link could be visited here.
Directories Location Recommended
Logs: Sometimes logs could be your best wy to determine an issue, but sometimes could add some overhead to your servers. If your RAID isn't enough powerful, you could have a separated storage for them (a disc or something else). Usually , Postgresql logs are in pg_log directory inside the PGDATA. You can have a soft link to this storage or directly an absolute path in postgresql.conf.
WAL: WAL is an important part of our databases, so it need the same level of attention. If you don't have RAID, maybe the first thing to think about is to store WAL files in other discs.
Configuration: If you have the PGDATA in very different location, you maybe want to have a /etc/postgresql/data<number>. Or, if you have several versions for test porpuses you maybe want to have 'debian' like tree (/etc/postgresql/<version>/<pgdata_number>). The default way is have config files inside the PGDATA.
Versioning sql scripts and configuration files
As you are doing right now (versioning the sql scripts), other best practice is to version the configuration files. Not only a simple versioning, remember that you have several envoronments (Development, Test, Production).
Other good practice, is to have versioned the DBA modifications in a separated script (SET STORAGE modifications, special indexes and rules, etc).
TODO: Paste a example.
Backup and Recovery strategies
TODO: How to perform those tasks.
Based on the article  and traduction of Gilberto Castillo (Cuba)
The users allowed to stablish connections to the BD's are controled by a simple file called pg_hba.conf. It ussually creates with default values (total access in local mode). But you can change the currently values using the --auth option when you create the cluster with initdb. By default the method is trust.
In Red Hat, initdb the starts like this:
initdb --pgdata='$PGDATA' --auth= 'ident sameuser'
ONe recommended installation for access by host mode is the md5 method for encrypted password. We can draw something like this:
host all all 192.168.1.0/24 md5
This mode, reduces the the access to the IP addreses that are included in 192.168.1.x and using the correct password for the user. Adding to this restriction, you must remember that in the postgres.conf you should modify listen_addresses variable listen_addresses.
Remember that you can create users wiht VALID UNTIL option. When you create a user you can calculate the timestamp with something like this:
You might replace 1 with the number of days that you want to enable the user(7=1 week, 30=month, etc.).
Remember: The first step before change trust for other method based in passwords, you should assign one to the user.
Changes in the pg_hba.conf only need reload signal.
You can create superusers for the databases, but remember that if youwant to restrict the access to the databases, the superusers still have permissions to drop the other BD's. Try to reduce the number of superusers or almost one.
TODO: explain the best ways to put on work this method
Replace your actual method with this:
Monitoring Index and table accesing and use
TODO: Explain how to monitor the use of tables and indexes to apply modification to the way to storage them.