PostgreSQL wiki - User contributions [en]

Simple Configuration Recommendation

2013-09-13T13:20:39Z

Rstephan:

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBAs) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management; it does not have the advanced and complicated storage management features of Oracle.

These recommendations are to standardize and simplify PostgreSQL database configurations.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or support vendor will have its own recommendations as to where to place the software and which account owns it.

Some package management software places the executables, libraries, man pages, and contrib files in various locations. Avoid these solutions; a simple, standard layout for the software installation is easier to manage.

The owner and group of the software, database files, and server processes should be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Name the installation directory after the first two components of the software version (9.0 in this example):
/opt/postgres/9.0

Be advised that upgrading the third digit of the version number usually entails only stopping the server, switching to the new software, and restarting the server. Upgrading the first or second digit, however, requires an upgrade of all of the data files. Keeping the software versions in separate directories helps with both kinds of upgrade.
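
As a rough sketch of the directory convention above, the following shell function derives the installation directory from a full version string. The names pg_install_dir and PG_BASE are illustrative assumptions, not part of any PostgreSQL tooling, and the two-component rule follows the version numbering described on this page.

```shell
#!/bin/sh
# Base directory recommended above; PG_BASE is an illustrative name.
PG_BASE=/opt/postgres

# pg_install_dir VERSION
# Keep only the first two components of VERSION (9.0.4 -> 9.0) and
# print the matching installation directory.
pg_install_dir() {
    version=$1
    major=$(printf '%s\n' "$version" | cut -d. -f1-2)
    printf '%s/%s\n' "$PG_BASE" "$major"
}

pg_install_dir 9.0.4
```

With the argument 9.0.4 the function prints /opt/postgres/9.0, so every 9.0.x release shares one directory while 9.1 would get its own.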

===Single Cluster and Database per Server===
The following database objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. An alternative and more manageable method of separating database objects within a single database server is through the use of schemas.

To run separate PostgreSQL clusters within one server, each cluster needs its own data area and TCP port number. However, the virtualization capabilities of OSes, such as Solaris zones and FreeBSD jails, or of hypervisors like [http://en.wikipedia.org/wiki/Xen Xen] and [http://en.wikipedia.org/wiki/Kernel-based_Virtual_Machine KVM], make creating multiple clusters within a single host unnecessary. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/pg_xlog.
|-
|/pglog
|The location of the server log files.
|-
|/pgdata-system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata-temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata-''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces: one for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. [http://en.wikipedia.org/wiki/ZFS ZFS] is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'
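
A plain cp overwrites an archive file that already exists. A slightly more defensive variant is sketched below as a shell function, on the assumption that it is saved in a wrapper script; the function name archive_wal and any script path are hypothetical. It refuses to overwrite and copies under a temporary name before renaming.

```shell
#!/bin/sh
# archive_wal SRC_PATH FILE_NAME [ARCHIVE_DIR]
# Copy one WAL segment into the archive without overwriting an existing
# file. A non-zero return tells PostgreSQL to keep the segment and retry.
archive_wal() {
    src=$1
    name=$2
    dir=${3:-/pgarchive}

    if [ -e "$dir/$name" ]; then
        echo "archive_wal: $dir/$name already exists" >&2
        return 1
    fi
    # Copy under a temporary name first so that a partial file never
    # appears under the final name, then rename it into place.
    cp "$src" "$dir/$name.tmp" && mv "$dir/$name.tmp" "$dir/$name"
}

# When run as a script by archive_command, pass the arguments through.
if [ "$#" -ge 2 ]; then
    archive_wal "$@"
fi
```

If the script were installed as, say, /opt/postgres/bin/archive_wal.sh (an assumed location), the setting would become archive_command = '/opt/postgres/bin/archive_wal.sh %p %f'.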

Set up server log file rotation (7 days or 10MB, whichever comes first). These settings take effect only when logging_collector is on.
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL statements.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.
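
The last step might look like the following sketch, which prints the SQL so that it can be reviewed and then piped into psql by a superuser. The function name app_database_sql and the database name appdb are assumptions made for illustration.

```shell
#!/bin/sh
# app_database_sql [DBNAME]
# Print the statements that create an application database and remove
# its public schema. "appdb" is a hypothetical default name.
app_database_sql() {
    db=${1:-appdb}
    # In an unquoted here-document "\\" yields a single backslash, so
    # the third line becomes the psql meta-command \connect.
    cat <<EOF
CREATE DATABASE $db;
\\connect $db
DROP SCHEMA public;
EOF
}

app_database_sql appdb
```

A typical invocation would be app_database_sql appdb | psql -X, run from an account that is allowed to create databases.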

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, like backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections for the postgres database user. Note: As of version 9.1, a pg_hba.conf entry of type local with the peer authentication method is preferable.

Create individual accounts for all the users that will be connecting directly to the database. DBAs will need superuser privileges; deployment representatives will need privileges to manipulate schema object definitions; developers will need select privileges on application objects to diagnose production issues.

Where possible, use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts named after the application schemas. Avoid connecting as those schema accounts; where possible, make them NOLOGIN.

When users are deploying object definitions into the application schemas, they will need to have the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that for any newly created object the ownership is set to the account that matches the schema.
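
One way to keep ownership consistent is to run each deployment script under the schema account with SET ROLE. The sketch below assumes that an administrator has already granted the schema role to the deploying user (for example GRANT app1 TO deploy_user, where deploy_user is hypothetical); the function name wrap_deploy_sql is likewise an assumption.

```shell
#!/bin/sh
# wrap_deploy_sql OWNER SCRIPT
# Print SCRIPT wrapped in SET ROLE / RESET ROLE so that every object it
# creates is owned by OWNER rather than by the connected user.
wrap_deploy_sql() {
    owner=$1
    script=$2
    printf 'SET ROLE %s;\n' "$owner"
    cat "$script"
    printf 'RESET ROLE;\n'
}
```

A deployment could then be run as wrap_deploy_sql app1 create_tables.sql | psql -X -d appdb -U deploy_user, with the file and database names standing in for local ones.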

To ease management of accounts, grant privileges to users through roles rather than through direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.
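
The connection rules in this section could be expressed in pg_hba.conf roughly as follows. The sketch prints the entries instead of editing the file; the database name, the pooled account, the server address, and the md5 method are all placeholders to be replaced with local values.

```shell
#!/bin/sh
# Print one pg_hba.conf entry with aligned columns.
hba_line() {
    printf '%-6s %-8s %-10s %-16s %s\n' "$@"
}

# hba_rules DATABASE POOL_USER APP_SERVER_CIDR
# Entries are matched top to bottom, so the superuser rules come first.
hba_rules() {
    hba_line local all postgres "" peer          # superuser: local socket only
    hba_line host all postgres 0.0.0.0/0 reject  # no remote superuser (IPv4)
    hba_line host all postgres ::/0 reject       # no remote superuser (IPv6)
    hba_line host "$1" "$2" "$3" md5             # pooled account, one server
}

hba_rules appdb app1_pool 192.0.2.10/32
```

Because no other entry mentions the pooled account, a connection attempt with it from any other address finds no matching rule and is refused.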

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/current/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is in hot backup mode only for the short time it takes to create the snapshots. In most recovery situations the on-line backups (the snapshots) are used instead of "pulling from tape".

File system delegated administration is an advantage for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.
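
The snapshot procedure described above can be sketched as a small shell function. It assumes a release that still provides pg_start_backup and pg_stop_backup (PostgreSQL 15 renamed them and changed how backup mode works), that psql finds its connection settings in the PG* environment variables, and that the dataset names passed in are the ones holding the cluster; all names here are illustrative.

```shell
#!/bin/sh
# run COMMAND...
# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# snapshot_backup LABEL DATASET...
# Enter backup mode, snapshot each ZFS dataset, then leave backup mode.
snapshot_backup() {
    label=$1
    shift
    run psql -X -c "SELECT pg_start_backup('$label');" || return 1
    status=0
    for dataset in "$@"; do
        run zfs snapshot "$dataset@$label" || status=1
    done
    # Always leave backup mode, even if a snapshot failed.
    run psql -X -c "SELECT pg_stop_backup();" || status=1
    return $status
}
```

Setting DRY_RUN=1 before calling, for example, snapshot_backup nightly pgdatapool/app1_data prints the commands without touching the database or the storage, which is a convenient way to review the sequence with the storage administrators.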

[[Category:Administration]]

Simple Configuration Recommendation

2013-09-13T13:18:18Z

Rstephan:

Database Schema Recommendations for an Application

2013-02-22T17:46:51Z

Rstephan:

==Database Schema Recommendations for an Application==

Within a PostgreSQL database cluster the basic methods for separating and name spacing objects are databases and schemas; see [http://www.postgresql.org/docs/current/static/manage-ag-overview.html Managing Databases] and [http://www.postgresql.org/docs/current/static/ddl-schemas.html Schema Data Definitions].

The recommendation is to create a single database with multiple named schemas. This differs from a common (and older) practice of creating multiple databases and storing the objects within the "public" schema. Additionally, it is recommended to remove the public schema.

These are some of the advantages to following this recommendation:

* Cross schema object access is possible from a single database connection.
* Granting access to a schema is executed through a GRANT statement versus a reconfiguration of the pg_hba.conf file.
* Schemas are the ANSI standard for object separation and name spacing.
* Managing only one database within a single server (PostgreSQL cluster).

An advantage of the separate database method is that sessions connected to a database cannot cross database boundaries, so separation of objects is more complete. However, if the desire is to have more separation of the objects (e.g., different environments), it is recommended to create a new server.

Note: Separation at the database level can be used for multi-tenant hosting when combined with the db_user_namespace configuration parameter. Schema separation again becomes the important feature for separating the database components within each of the hosted databases.

===Creating Tablespaces===

Create a set of tablespaces appropriate for each application. This provides for the physical separation of storage from one application to another. It also can provide for separation of storage within an application. Different storage of tables and indexes would enhance management flexibility and could provide additional performance.

PostgreSQL expects the host operating system to provide device management. Create a file system for each tablespace to separate and control the tablespace storage. The following is an example using ZFS with storage pools named pgdatapool and pgindexpool. Two 10 GB tablespaces will be used for an application, one for tables and one for indexes.

Unix commands:

zfs create -o mountpoint=/pgdata/app1_data pgdatapool/app1_data
zfs set refquota=10G pgdatapool/app1_data
zfs set refreservation=10G pgdatapool/app1_data
zfs create -o mountpoint=/pgdata/app1_index pgindexpool/app1_index
zfs set refquota=10G pgindexpool/app1_index
zfs set refreservation=10G pgindexpool/app1_index

SQL commands:

create tablespace app1_data location '/pgdata/app1_data';
create tablespace app1_index location '/pgdata/app1_index';

===Creating Accounts and Roles===

PostgreSQL database objects have an account (role) ownership. Create an account synonymous with the schema that will name space the objects. This account will be used as the owner for the objects. For security purposes do not allow anyone to connect to the account owning the application database objects. This account will need access to the tablespaces previously created. When application objects are created make sure the ownership is properly set.

create user app1 nologin;
grant create on tablespace app1_data to app1;
grant create on tablespace app1_index to app1;

Grant database object privileges to users or application accounts through the use of roles. Create a set of roles appropriate for an application. For simplicity create at least two roles, one to use for read/write privileges for the application and one for query privileges. As database objects are created for the application perform the appropriate grants to the roles.

create role app1_role;
create role app1_query_role;

Applications generally connect to the database using pooled (shared) accounts from application servers. Limit the knowledge of those accounts and their credentials. If possible, use definitions within the pg_hba.conf to make sure those accounts can only connect to the database from the defined application servers.

create user app1_pool password 'app1_pool';
grant app1_role to app1_pool;

===Create Application Schemas===

Create the application schema and grant the application users and roles the appropriate privileges. Make sure to create the application objects within the defined schema and set the appropriate object ownership.

create schema app1 authorization app1;
grant usage on schema app1 to app1_role;
grant usage on schema app1 to app1_query_role;
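
To show how the tablespaces, the owning account, and the two roles fit together, the sketch below prints the SQL for one table and one index. The customer table and the function name app_objects_sql are made up for illustration; the other names are the ones created above. The session running the output is assumed to be a superuser or a member of app1.

```shell
#!/bin/sh
# app_objects_sql
# Print SQL that creates a table and an index owned by app1, places them
# in the application tablespaces, and grants access through the roles.
app_objects_sql() {
    cat <<'EOF'
SET ROLE app1;
CREATE TABLE app1.customer (
    id   integer NOT NULL,
    name text    NOT NULL
) TABLESPACE app1_data;
CREATE UNIQUE INDEX customer_id_idx ON app1.customer (id) TABLESPACE app1_index;
GRANT SELECT, INSERT, UPDATE, DELETE ON app1.customer TO app1_role;
GRANT SELECT ON app1.customer TO app1_query_role;
RESET ROLE;
EOF
}

app_objects_sql
```

Piping the output into psql -X -d appdb (appdb being an assumed database name) leaves both objects owned by app1, with the pooled account reaching them only through app1_role.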

===Disclaimer===

These are the recommendations from an experienced DBA. They can be followed or ignored. Personally, following these few simple methods of managing application schemas has made my life easier.

[[Category:Administration]]

Database Schema Recommendations for an Application

2012-10-11T17:58:14Z

Rstephan:

File:Oracle-better-than-postgres.pdf

2012-09-12T21:10:16Z

Rstephan: uploaded a new version of "File:Oracle-better-than-postgres.pdf"

Database Schema Recommendations for an Application

2012-08-03T16:59:37Z

Rstephan: Undo revision 17977 by Rstephan (Talk)

==Database Schema Recommendations for an Application==

Within a PostgreSQL database cluster the basic methods for separating and name spacing objects is through [http://www.postgresql.org/docs/current/static/manage-ag-overview.html Managing Databases] and [http://www.postgresql.org/docs/current/static/ddl-schemas.html Schema Data Definitions].

The recommendation is to create a single database with multiple named schemas. This is different than a common (and older) practice of creating multiple databases and storing the objects within the "public" schema. Additionally, it is recommended to remove the public schema.

These are some of the advantages to following this recommendation:

* Cross schema object access is possible from a single database connection.
* Granting access to a schema is executed through a GRANT statement versus a reconfiguration of the pg_hba.conf file.
* Schemas are the ANSI standard for object separation and name spacing.
* Managing only one database within a single server (PostgreSQL cluster).

An advantage to using the separate database method is sessions connected to a database cannot cross database boundaries. Separation of objects is more complete. However, if the desire is have more separation of the objects (i.e., different environments), it is recommended to create a new server.

===Creating Tablespaces===

Create a set of tablespaces appropriate for each application. This provides for the physical separation of storage from one application to another. It also can provide for separation of storage within an application. Different storage of tables and indexes would enhance management flexibility and could provide additional performance.

PostgreSQL expects the host operating system to provide device management. Create a file system for each tablespace to separate and control the tablespace storage. The following is an example using ZFS with storage pools named pgdatapool and pgindexpool. Two 10 GB tablespaces will be used for an application, one for tables and one for indexes.

Unix commands:

zfs create -o mountpoint=/pgdata/app1_data pgdatapool/app1_data
zfs set refquota=10G pgdatapool/app1_data
zfs set refreference=10G pgdatapool/app1_data
zfs create -o mountpoint=/pgdata/app1_index pgindexpool/app1_index
zfs set refquota=10G pgindexpool/app1_index
zfs set refreference=10G pgindexpool/app1_index

SQL commands:

create tablespace app1_data location '/pgdata/app1_data';
create tablespace app1_index location '/pgdata/app1_index';

===Creating Accounts and Roles===

PostgreSQL database objects have an account (role) ownership. Create an account synonymous with the schema that will name space the objects. This account will be used as the owner for the objects. For security purposes do not allow anyone to connect to the account owning the application database objects. This account will need access to the tablespaces previously created. When application objects are created make sure the ownership is properly set.

create user app1 nologin;
grant create on tablespace app1_data to app1;
grant create on tablespace app1_index to app1;

Grant database object privileges to users or application accounts through the use of roles. Create a set of roles appropriate for an application. For simplicity create at least two roles, one to use for read/write privileges for the application and one for query privileges. As database objects are created for the application perform the appropriate grants to the roles.

create role app1_role;
create role app1_query_role;

Applications generally connect to the database using pooled (shared) accounts from application servers. Limit the knowledge of those accounts and their credentials. If possible, use definitions within the pg_hba.conf to make sure those accounts can only connect to the database from the defined application servers.

create user app1_pool password 'app1_pool';
grant app1_role to app1_pool;

===Create Application Schemas===

Create the application schema and grant the application users and roles the appropriate privileges. Make sure to create the application objects within the defined schema and set the appropriate object ownership.

create schema app1 authorization app1;
grant usage on schema app1 to app1_role;
grant usage on schema app1 to app1_query_role;

===Disclaimer===

These are the recommendations from an experienced DBA. They can be followed or ignored. Personally, following these few simple methods of managing application schemas has made my life easier.

[[Category:Administration]]

Database Schema Recommendations for an Application

2012-08-03T16:58:37Z

Rstephan: Undo revision 17978 by Rstephan (Talk)

==Database Schema Recommendations for an Application==

Concur test 1

Within a PostgreSQL database cluster, the basic methods for separating and name spacing objects are [http://www.postgresql.org/docs/current/static/manage-ag-overview.html Managing Databases] and [http://www.postgresql.org/docs/current/static/ddl-schemas.html Schema Data Definitions].

The recommendation is to create a single database with multiple named schemas. This is different than a common (and older) practice of creating multiple databases and storing the objects within the "public" schema. Additionally, it is recommended to remove the public schema.

These are some of the advantages to following this recommendation:

* Cross schema object access is possible from a single database connection.
* Granting access to a schema is executed through a GRANT statement versus a reconfiguration of the pg_hba.conf file.
* Schemas are the ANSI standard for object separation and name spacing.
* Managing only one database within a single server (PostgreSQL cluster).

An advantage of the separate database method is that sessions connected to a database cannot cross database boundaries, so separation of objects is more complete. However, if more separation of the objects is desired (e.g., for different environments), it is recommended to create a new server.
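
To illustrate the single-database approach, here is a small sketch. It assumes a second hypothetical schema app2 and hypothetical tables app1.orders and app2.customers; none of these names come from the recommendations themselves.

```sql
-- Cross-schema access from one connection: both schemas live in the same database.
select o.order_id, c.customer_name
from app1.orders as o
join app2.customers as c on c.customer_id = o.customer_id;

-- Removing the public schema; run once per database as a superuser.
-- Without CASCADE this fails if the schema still contains objects.
drop schema public;
```

With separate databases, the join above would not be possible from a single session.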

===Creating Tablespaces===

Create a set of tablespaces appropriate for each application. This provides for the physical separation of storage from one application to another. It also can provide for separation of storage within an application. Different storage of tables and indexes would enhance management flexibility and could provide additional performance.

PostgreSQL expects the host operating system to provide device management. Create a file system for each tablespace to separate and control the tablespace storage. The following is an example using ZFS with storage pools named pgdatapool and pgindexpool. Two 10 GB tablespaces will be used for an application, one for tables and one for indexes.

Unix commands:

zfs create -o mountpoint=/pgdata/app1_data pgdatapool/app1_data
zfs set refquota=10G pgdatapool/app1_data
zfs set refreservation=10G pgdatapool/app1_data
zfs create -o mountpoint=/pgdata/app1_index pgindexpool/app1_index
zfs set refquota=10G pgindexpool/app1_index
zfs set refreservation=10G pgindexpool/app1_index

SQL commands:

create tablespace app1_data location '/pgdata/app1_data';
create tablespace app1_index location '/pgdata/app1_index';

===Creating Accounts and Roles===

PostgreSQL database objects have an account (role) ownership. Create an account synonymous with the schema that will name space the objects. This account will be used as the owner for the objects. For security purposes do not allow anyone to connect to the account owning the application database objects. This account will need access to the tablespaces previously created. When application objects are created make sure the ownership is properly set.

create user app1 nologin;
grant create on tablespace app1_data to app1;
grant create on tablespace app1_index to app1;

Grant database object privileges to users or application accounts through the use of roles. Create a set of roles appropriate for an application. For simplicity create at least two roles, one to use for read/write privileges for the application and one for query privileges. As database objects are created for the application perform the appropriate grants to the roles.

create role app1_role;
create role app1_query_role;

Applications generally connect to the database using pooled (shared) accounts from application servers. Limit the knowledge of those accounts and their credentials. If possible, use definitions within the pg_hba.conf to make sure those accounts can only connect to the database from the defined application servers.

create user app1_pool password 'app1_pool';
grant app1_role to app1_pool;

===Create Application Schemas===

Create the application schema and grant the application users and roles the appropriate privileges. Make sure to create the application objects within the defined schema and set the appropriate object ownership.

create schema app1 authorization app1;
grant usage on schema app1 to app1_role;
grant usage on schema app1 to app1_query_role;

===Disclaimer===

These are the recommendations from an experienced DBA. They can be followed or ignored. Personally, following these few simple methods of managing application schemas has made my life easier.

[[Category:Administration]]

Database Schema Recommendations for an Application

2012-08-03T16:58:05Z

Rstephan:

==Database Schema Recommendations for an Application==

Concur test 2

Within a PostgreSQL database cluster, the basic methods for separating and name spacing objects are [http://www.postgresql.org/docs/current/static/manage-ag-overview.html Managing Databases] and [http://www.postgresql.org/docs/current/static/ddl-schemas.html Schema Data Definitions].

The recommendation is to create a single database with multiple named schemas. This is different than a common (and older) practice of creating multiple databases and storing the objects within the "public" schema. Additionally, it is recommended to remove the public schema.

These are some of the advantages to following this recommendation:

* Cross schema object access is possible from a single database connection.
* Granting access to a schema is executed through a GRANT statement versus a reconfiguration of the pg_hba.conf file.
* Schemas are the ANSI standard for object separation and name spacing.
* Managing only one database within a single server (PostgreSQL cluster).

An advantage of the separate database method is that sessions connected to a database cannot cross database boundaries, so separation of objects is more complete. However, if more separation of the objects is desired (e.g., for different environments), it is recommended to create a new server.

===Creating Tablespaces===

Create a set of tablespaces appropriate for each application. This provides for the physical separation of storage from one application to another. It also can provide for separation of storage within an application. Different storage of tables and indexes would enhance management flexibility and could provide additional performance.

PostgreSQL expects the host operating system to provide device management. Create a file system for each tablespace to separate and control the tablespace storage. The following is an example using ZFS with storage pools named pgdatapool and pgindexpool. Two 10 GB tablespaces will be used for an application, one for tables and one for indexes.

Unix commands:

zfs create -o mountpoint=/pgdata/app1_data pgdatapool/app1_data
zfs set refquota=10G pgdatapool/app1_data
zfs set refreservation=10G pgdatapool/app1_data
zfs create -o mountpoint=/pgdata/app1_index pgindexpool/app1_index
zfs set refquota=10G pgindexpool/app1_index
zfs set refreservation=10G pgindexpool/app1_index

SQL commands:

create tablespace app1_data location '/pgdata/app1_data';
create tablespace app1_index location '/pgdata/app1_index';

===Creating Accounts and Roles===

PostgreSQL database objects have an account (role) ownership. Create an account synonymous with the schema that will name space the objects. This account will be used as the owner for the objects. For security purposes do not allow anyone to connect to the account owning the application database objects. This account will need access to the tablespaces previously created. When application objects are created make sure the ownership is properly set.

create user app1 nologin;
grant create on tablespace app1_data to app1;
grant create on tablespace app1_index to app1;

Grant database object privileges to users or application accounts through the use of roles. Create a set of roles appropriate for an application. For simplicity create at least two roles, one to use for read/write privileges for the application and one for query privileges. As database objects are created for the application perform the appropriate grants to the roles.

create role app1_role;
create role app1_query_role;

Applications generally connect to the database using pooled (shared) accounts from application servers. Limit the knowledge of those accounts and their credentials. If possible, use definitions within the pg_hba.conf to make sure those accounts can only connect to the database from the defined application servers.

create user app1_pool password 'app1_pool';
grant app1_role to app1_pool;

===Create Application Schemas===

Create the application schema and grant the application users and roles the appropriate privileges. Make sure to create the application objects within the defined schema and set the appropriate object ownership.

create schema app1 authorization app1;
grant usage on schema app1 to app1_role;
grant usage on schema app1 to app1_query_role;

===Disclaimer===

These are the recommendations from an experienced DBA. They can be followed or ignored. Personally, following these few simple methods of managing application schemas has made my life easier.

[[Category:Administration]]

Database Schema Recommendations for an Application

2012-08-03T16:57:59Z

Rstephan:

Database Schema Recommendations for an Application

2012-07-25T13:12:29Z

Rstephan:

==Database Schema Recommendations for an Application==

Within a PostgreSQL database cluster, the basic methods for separating and name spacing objects are [http://www.postgresql.org/docs/current/static/manage-ag-overview.html Managing Databases] and [http://www.postgresql.org/docs/current/static/ddl-schemas.html Schema Data Definitions].

The recommendation is to create a single database with multiple named schemas. This is different than a common (and older) practice of creating multiple databases and storing the objects within the "public" schema. Additionally, it is recommended to remove the public schema.

These are some of the advantages to following this recommendation:

* Cross schema object access is possible from a single database connection.
* Granting access to a schema is executed through a GRANT statement versus a reconfiguration of the pg_hba.conf file.
* Schemas are the ANSI standard for object separation and name spacing.
* Managing only one database within a single server (PostgreSQL cluster).

An advantage of the separate database method is that sessions connected to a database cannot cross database boundaries, so separation of objects is more complete. However, if more separation of the objects is desired (e.g., for different environments), it is recommended to create a new server.

===Creating Tablespaces===

Create a set of tablespaces appropriate for each application. This provides for the physical separation of storage from one application to another. It also can provide for separation of storage within an application. Different storage of tables and indexes would enhance management flexibility and could provide additional performance.

PostgreSQL expects the host operating system to provide device management. Create a file system for each tablespace to separate and control the tablespace storage. The following is an example using ZFS with storage pools named pgdatapool and pgindexpool. Two 10 GB tablespaces will be used for an application, one for tables and one for indexes.

Unix commands:

zfs create -o mountpoint=/pgdata/app1_data pgdatapool/app1_data
zfs set refquota=10G pgdatapool/app1_data
zfs set refreservation=10G pgdatapool/app1_data
zfs create -o mountpoint=/pgdata/app1_index pgindexpool/app1_index
zfs set refquota=10G pgindexpool/app1_index
zfs set refreservation=10G pgindexpool/app1_index

SQL commands:

create tablespace app1_data location '/pgdata/app1_data';
create tablespace app1_index location '/pgdata/app1_index';

===Creating Accounts and Roles===

PostgreSQL database objects have an account (role) ownership. Create an account synonymous with the schema that will name space the objects. This account will be used as the owner for the objects. For security purposes do not allow anyone to connect to the account owning the application database objects. This account will need access to the tablespaces previously created. When application objects are created make sure the ownership is properly set.

create user app1 nologin;
grant create on tablespace app1_data to app1;
grant create on tablespace app1_index to app1;

Grant database object privileges to users or application accounts through the use of roles. Create a set of roles appropriate for an application. For simplicity create at least two roles, one to use for read/write privileges for the application and one for query privileges. As database objects are created for the application perform the appropriate grants to the roles.

create role app1_role;
create role app1_query_role;

Applications generally connect to the database using pooled (shared) accounts from application servers. Limit the knowledge of those accounts and their credentials. If possible, use definitions within the pg_hba.conf to make sure those accounts can only connect to the database from the defined application servers.

create user app1_pool password 'app1_pool';
grant app1_role to app1_pool;

===Create Application Schemas===

Create the application schema and grant the application users and roles the appropriate privileges. Make sure to create the application objects within the defined schema and set the appropriate object ownership.

create schema app1 authorization app1;
grant usage on schema app1 to app1_role;
grant usage on schema app1 to app1_query_role;

===Disclaimer===

These are the recommendations from an experienced DBA. They can be followed or ignored. Personally, following these few simple methods of managing application schemas has made my life easier.

[[Category:Administration]]

Database Schema Recommendations for an Application

2012-07-25T13:11:04Z

Rstephan: /* Database Schema Recommendations for an Application */

==Database Schema Recommendations for an Application==

Within a PostgreSQL database cluster, the basic methods for separating and name spacing objects are [http://www.postgresql.org/docs/current/static/manage-ag-overview.html Managing Databases] and [http://www.postgresql.org/docs/current/static/ddl-schemas.html Schema Data Definitions].

The recommendation is to create a single database with multiple named schemas. This is different than a common (and older) practice of creating multiple databases and storing the objects within the "public" schema. Additionally, it is recommended to remove the public schema.

These are some of the advantages to following this recommendation:

* Cross schema object access is possible from a single database connection.
* Granting access to a schema is executed through a GRANT statement versus a reconfiguration of the pg_hba.conf file.
* Schemas are the ANSI standard for object separation and name spacing.
* Managing only one database within a single server (PostgreSQL cluster).

An advantage of the separate database method is that sessions connected to a database cannot cross database boundaries, so separation of objects is more complete. However, if more separation of the objects is desired (e.g., for different environments), it is recommended to create a new server.

===Creating Tablespaces===

Create a set of tablespaces appropriate for each application. This provides for the physical separation of storage from one application to another. It also can provide for separation of storage within an application. Different storage of tables and indexes would enhance management flexibility and could provide additional performance.

PostgreSQL expects the host operating system to provide device management. Create a file system for each tablespace to separate and control the tablespace storage. The following is an example using ZFS with storage pools named pgdatapool and pgindexpool. Two 10 GB tablespaces will be used for an application, one for tables and one for indexes.

Unix commands:

zfs create -o mountpoint=/pgdata/app1_data pgdatapool/app1_data
zfs set refquota=10G pgdatapool/app1_data
zfs set refreservation=10G pgdatapool/app1_data
zfs create -o mountpoint=/pgdata/app1_index pgindexpool/app1_index
zfs set refquota=10G pgindexpool/app1_index
zfs set refreservation=10G pgindexpool/app1_index

SQL commands:

create tablespace app1_data location '/pgdata/app1_data';
create tablespace app1_index location '/pgdata/app1_index';

===Creating Accounts and Roles===

PostgreSQL database objects have an account (role) ownership. Create an account synonymous with the schema that will name space the objects. This account will be used as the owner for the objects. For security purposes do not allow anyone to connect to the account owning the application database objects. This account will need access to the tablespaces previously created. When application objects are created make sure the ownership is properly set.

create user app1 nologin;
grant create on tablespace app1_data to app1;
grant create on tablespace app1_index to app1;

Grant database object privileges to users or application accounts through the use of roles. Create a set of roles appropriate for an application. For simplicity create at least two roles, one to use for read/write privileges for the application and one for query privileges. As database objects are created for the application perform the appropriate grants to the roles.

create role app1_role;
create role app1_query_role;

Applications generally connect to the database using pooled (shared) accounts from application servers. Limit the knowledge of those accounts and their credentials. If possible, use definitions within the pg_hba.conf to make sure those accounts can only connect to the database from the defined application servers.

create user app1_pool password 'app1_pool';
grant app1_role to app1_pool;

===Create Application Schemas===

Create the application schema and grant the application users and roles the appropriate privileges. Make sure to create the application objects within the defined schema and set the appropriate object ownership.

create schema app1 authorization app1;
grant usage on schema app1 to app1_role;
grant usage on schema app1 to app1_query_role;

===Disclaimer===

These are the recommendations from an experienced DBA. They can be followed or ignored. Personally, following these few simple methods of managing application schemas has made my life easier.

Database Schema Recommendations for an Application

2012-07-25T13:00:57Z

Rstephan: Created page with "==Database Schema Recommendations for an Application== Within a PostgreSQL database cluster the basic methods for separating and name spacing objects is through [http://www.post…"

==Database Schema Recommendations for an Application==

Within a PostgreSQL database cluster, the basic methods for separating and name spacing objects are [http://www.postgresql.org/docs/current/static/manage-ag-overview.html Managing Databases] and [http://www.postgresql.org/docs/current/static/ddl-schemas.html Schema Data Definitions].

The recommendation is to create a single database with multiple named schemas. This is different than a common (and older) practice of creating multiple databases and storing the objects within the "public" schema. Additionally, it is recommended to remove the public schema.

These are some of the advantages to following this recommendation:

* Cross schema object access is possible from a single database connection.
* Granting access to a schema is executed through a GRANT statement versus a reconfiguration of the pg_hba.conf file.
* Schemas are the ANSI standard for object separation and name spacing.
* Managing only one database within a single server (PostgreSQL cluster).

An advantage of the separate database method is that sessions connected to a database cannot cross database boundaries, so separation of objects is more complete. However, if more separation of the objects is desired (e.g., for different environments), it is recommended to create a new server.

===Creating Tablespaces===

Create a set of tablespaces appropriate for each application. This provides for the physical separation of storage from one application to another. It also can provide for separation of storage within an application. Different storage of tables and indexes would enhance management flexibility and could provide additional performance.

PostgreSQL expects the host operating system to provide device management. Create a file system for each tablespace to separate and control the tablespace storage. The following is an example using ZFS with storage pools named pgdatapool and pgindexpool. Two 10 GB tablespaces will be used for an application, one for tables and one for indexes.

Unix commands:

zfs create -o mountpoint=/pgdata/app1_data pgdatapool/app1_data
zfs set refquota=10G pgdatapool/app1_data
zfs set refreservation=10G pgdatapool/app1_data
zfs create -o mountpoint=/pgdata/app1_index pgindexpool/app1_index
zfs set refquota=10G pgindexpool/app1_index
zfs set refreservation=10G pgindexpool/app1_index

SQL commands:

create tablespace app1_data location '/pgdata/app1_data';
create tablespace app1_index location '/pgdata/app1_index';

===Creating Accounts and Roles===

PostgreSQL database objects have an account (role) ownership. Create an account synonymous with the schema that will name space the objects. This account will be used as the owner for the objects. For security purposes do not allow anyone to connect to the account owning the application database objects. This account will need access to the tablespaces previously created. When application objects are created make sure the ownership is properly set.

create user app1 nologin;
grant create on tablespace app1_data to app1;
grant create on tablespace app1_index to app1;

Grant database object privileges to users or application accounts through the use of roles. Create a set of roles appropriate for an application. For simplicity create at least two roles, one to use for read/write privileges for the application and one for query privileges. As database objects are created for the application perform the appropriate grants to the roles.

create role app1_role;
create role app1_query_role;

Applications generally connect to the database using pooled (shared) accounts from application servers. Limit the knowledge of those accounts and their credentials. If possible, use definitions within the pg_hba.conf to make sure those accounts can only connect to the database from the defined application servers.

create user app1_pool password 'app1_pool';
grant app1_role to app1_pool;

===Create Application Schemas===

Create the application schema and grant the application users and roles the appropriate privileges. Make sure to create the application objects within the defined schema and set the appropriate object ownership.

create schema app1 authorization app1;
grant usage on schema app1 to app1_role;
grant usage on schema app1 to app1_query_role;

Category:Administration

2012-07-25T12:59:07Z

Rstephan: /* General Admin Topics */

== General Admin Topics ==

*[[Client Authentication]] (pg_hba.conf)
*[[Binary Replication Tutorial]]
*[[Planner Statistics]]
*[[Warm Standby]]
* [[Replication, Clustering, and Connection Pooling]]
* [[Shared Database Hosting]]
* [http://www.ibm.com/developerworks/opensource/library/os-postgresecurity/index.html Total security in a PostgreSQL database] 2009-11-17
* [[Simple Configuration Recommendation]]
* [[Database Schema Recommendations for an Application]]

== [[:Category:Backup|Backup]] ==

See the [[:Category:Backup|Backup]] category.

[http://www.postgresonline.com/journal/archives/186-postgresql90_pg_dumprestore.html Backup and Restore cheatsheet for PostgreSQL 9.0] Postgres OnLine Journal 2010-11-21

== Authentication ==
* [[Client Authentication]]

== Restoration and Recovery ==
* [http://svana.org/kleptog/pgsql/pgfsck.html PostgreSQL table checker and dumper tool] by Martijn van Oosterhout
* [[Adventures in PostgreSQL, Episode 1]] by Josh Berkus (2002-05)

== Routine maintenance and monitoring ==
*[[Vacuuming]]
*[[Monitoring]]
*[[Lock Monitoring]]
*[[Index Maintenance]]
*[[Disk Usage]]

== [[:Category:Windows|Windows-specific]] ==

[[:Category:Windows|Windows category]]
[[Category:General articles and guides]]

Simple Configuration Recommendation

2012-06-08T12:15:58Z

Rstephan: /* Software Location and Ownership */

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

These recommendations are to standardize and simplify PostgreSQL database configurations.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or supporter will have its own recommendations as to where to place the software and what account owns it.

Some package management software places the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is easier to manage.

The owner and group of the software, database files, and server processes should be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised that upgrading the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.

===Single Cluster and Database per Server===
The following database objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. An alternate and more manageable method to separate database objects within a single database server is through the use of schemas.

To separate PostgreSQL clusters within a server, different data areas and IP port numbers need to be created. However, the virtualization capabilities of OSes, like Solaris zones and FreeBSD jails, or hypervisors like [http://en.wikipedia.org/wiki/Xen Xen] and [http://en.wikipedia.org/wiki/Kernel-based_Virtual_Machine KVM] make creation of multiple clusters within a single host unnecessary. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces: one for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.
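
As a rough sketch of how the /pgdata/system and /pgdata/temp mount points might be put to use, the following assumes hypothetical names system_ts, temp_ts, and appdb. It also assumes each directory already exists, is empty, and is owned by the operating system account that runs the server.

```sql
-- Tablespace and database names are hypothetical.
create tablespace system_ts location '/pgdata/system';
create tablespace temp_ts   location '/pgdata/temp';

-- The database's default tablespace holds its catalog information.
create database appdb tablespace system_ts;

-- Direct temporary files for sessions in this database to the temp tablespace.
alter database appdb set temp_tablespaces = 'temp_ts';

-- Roles need CREATE privilege on a tablespace before it is used for their
-- temporary files. Granting to public is a broad choice made for this sketch.
grant create on tablespace temp_ts to public;
```

Because each tablespace sits on its own file system, its size can then be capped at the OS level rather than inside PostgreSQL.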

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. [http://en.wikipedia.org/wiki/ZFS ZFS] is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, like backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections for the postgres database user. Note: As of version 9.1, a local entry in pg_hba.conf using the peer authentication method is preferable.

Create individual accounts for all the users that will be connecting directly to the database. DBAs will need superuser privileges, deployment representatives will need privileges to manipulate schema object definitions, developers will need select privileges on application objects to diagnose production issues.

Where possible use centralized enterprise accounts (i.e., LDAP) for user account authentication.

Create accounts to be synonymous with application schemas. Avoid connecting to those schema accounts. In fact where possible make the account NOLOGIN.

When users are deploying object definitions into the application schemas, they will need to have the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that for any newly created object the ownership is set to the account that matches the schema.
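
The deployment workflow described above could look roughly like the following. The schema account app1, the individual account deployer1, its placeholder password, and the table names are all assumptions made for illustration.

```sql
-- Run as a DBA: give an individual deployment account the schema role.
create user deployer1 password 'change_me';  -- hypothetical account and placeholder password
grant app1 to deployer1;

-- Run in deployer1's session: act as app1 so new objects are owned by app1.
set role app1;
create table app1.customers (customer_id integer primary key);
reset role;

-- If an object was created without SET ROLE, correct the ownership afterwards
-- (app1.invoices is a hypothetical table created by deployer1 directly).
alter table app1.invoices owner to app1;
```

This keeps the schema account itself NOLOGIN while leaving an audit trail tied to individual logins.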

To ease management of accounts, use roles for granting privileges to users versus direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.
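
The schema-account arrangement described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name schema_role_sql, the schema name billing, the user alice, and the database appdb are all hypothetical, and the names are assumed to be simple lowercase identifiers that need no quoting.

```shell
#!/bin/sh
# schema_role_sql SCHEMA DEPLOYER   (hypothetical helper)
# Print the SQL that creates a NOLOGIN role named after the schema, creates
# the schema owned by that role, and grants the role to a deploying user.
schema_role_sql() {
    schema=$1
    deployer=$2
    cat <<EOF
CREATE ROLE $schema NOLOGIN;
CREATE SCHEMA $schema AUTHORIZATION $schema;
GRANT $schema TO $deployer;
EOF
}
```

The output can be piped to psql by a superuser, for example schema_role_sql billing alice | psql -d appdb. The deploying user would then issue SET ROLE billing before creating objects so that new objects are owned by the schema account.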

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/current/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is in hot backup mode only for the short time it takes to create the snapshots. In most recovery situations, the on-line backups (the snapshots) are used instead of "pulling from tape".

File system delegated administration is an advantage for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.
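
The backup sequence described in this section (enter backup mode, snapshot, leave backup mode) might be sketched as below. It assumes the 9.x functions pg_start_backup and pg_stop_backup, assumes that psql picks up its connection settings from the environment, and uses hypothetical ZFS dataset names; the helper names run and snapshot_backup are also assumptions. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# snapshot_backup LABEL DATASET...   (hypothetical helper)
# Put the cluster in backup mode, snapshot every listed ZFS dataset under
# the given label, then take the cluster out of backup mode.
snapshot_backup() {
    label=$1
    shift
    run psql -Atc "SELECT pg_start_backup('$label')" || return 1
    status=0
    for ds in "$@"; do
        run zfs snapshot "$ds@$label" || status=1
    done
    # Always leave backup mode, even if a snapshot failed.
    run psql -Atc "SELECT pg_stop_backup()" || status=1
    return $status
}
```

A trial run such as DRY_RUN=1 snapshot_backup nightly tank/pgcluster tank/pgdata lists the four commands in order without touching the database or the file systems.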

[[Category:Administration]]

Simple Configuration Recommendation

2012-06-08T12:13:44Z

Rstephan: /* Physical Database Backups */

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

These recommendations are to standardize and simplify PostgreSQL database configurations.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or support provider will have its own recommendations as to where to place the software and what account owns it.

Some package management software places the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is easier to manage.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.
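
As an illustration of the directory convention above, the following sketch derives the installation directory from a full version string. It assumes the three-part version numbers of the 9.x series; the function name install_prefix is hypothetical.

```shell
#!/bin/sh
# install_prefix VERSION   (hypothetical helper)
# Map a full version such as 9.0.4 to its installation directory,
# /opt/postgres/9.0, by dropping the third (minor release) digit.
install_prefix() {
    version=$1
    case $version in
        *.*.*) major=${version%.*} ;;   # 9.0.4 -> 9.0
        *)     major=$version ;;        # already two digits, e.g. 9.1
    esac
    echo "/opt/postgres/$major"
}
```

When building from source, the result could be passed to the build as ./configure --prefix="$(install_prefix 9.0.4)", after which the directory is handed to postgres:dba with chown -R.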

===Single Cluster and Database per Server===
The following objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. An alternate and more manageable method to separate database objects within a single database server is through the use of schemas.

To run separate PostgreSQL clusters within one server, each cluster needs its own data area and IP port number. However, the virtualization capabilities of OSes, like Solaris zones and FreeBSD jails, or of hypervisors, like [http://en.wikipedia.org/wiki/Xen Xen] and [http://en.wikipedia.org/wiki/Kernel-based_Virtual_Machine KVM], make the creation of multiple clusters within a single host unnecessary. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. [http://en.wikipedia.org/wiki/ZFS ZFS] is an example of a file system that has delegated administration.
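
On ZFS, the mount points in the table above could be created as one dataset per file system. The sketch below only prints the zfs create commands so that they can be reviewed with the storage administrators first; the pool name tank, the tablespace names, and the function names are assumptions made for illustration.

```shell
#!/bin/sh
# print_create POOL MOUNTPOINT   (hypothetical helper)
# Print one zfs create command; the dataset name is the mount point with
# its leading slash removed, placed under the pool.
print_create() {
    echo "zfs create -p -o mountpoint=$2 $1/${2#/}"
}

# layout_commands POOL APP_TABLESPACE...   (hypothetical helper)
# Print the commands for the fixed mount points from the table, followed by
# one command per application tablespace under /pgdata.
layout_commands() {
    pool=$1
    shift
    for mp in /pgarchive /pgbackup /pgcluster/data /pgcluster/log \
              /pgdata/system /pgdata/temp; do
        print_create "$pool" "$mp"
    done
    for ts in "$@"; do
        print_create "$pool" "/pgdata/$ts"
    done
}
```

For example, layout_commands tank app_data app_idx prints eight commands, one for each of the six fixed mount points and one for each of the two application tablespaces (tables and indexes).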

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (every 7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, such as backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections for the postgres database user. Note: As of version 9.1, the preferred method is a pg_hba.conf entry of type local with the peer authentication method.

Create individual accounts for all users who will connect directly to the database. DBAs will need superuser privileges, deployment representatives will need privileges to manipulate schema object definitions, and developers will need select privileges on application objects to diagnose production issues.

Where possible, use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts that are synonymous with the application schemas. Avoid connecting to those schema accounts; in fact, where possible, make each such account NOLOGIN.

When users deploy object definitions into the application schemas, they will need the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that the ownership of any newly created object is set to the account that matches the schema.

To ease management of accounts, grant privileges to users through roles rather than through direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/current/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is in hot backup mode only for the short time it takes to create the snapshots. In most recovery situations, the on-line backups (the snapshots) are used instead of "pulling from tape".

File system delegated administration is an advantage for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

Simple Configuration Recommendation

2012-06-06T16:24:26Z

Rstephan: /* Account Management */

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

These recommendations are to standardize and simplify PostgreSQL database configurations.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or support provider will have its own recommendations as to where to place the software and what account owns it.

Some package management software places the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is easier to manage.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.

===Single Cluster and Database per Server===
The following objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. An alternate and more manageable method to separate database objects within a single database server is through the use of schemas.

To run separate PostgreSQL clusters within one server, each cluster needs its own data area and IP port number. However, the virtualization capabilities of OSes, like Solaris zones and FreeBSD jails, or of hypervisors, like [http://en.wikipedia.org/wiki/Xen Xen] and [http://en.wikipedia.org/wiki/Kernel-based_Virtual_Machine KVM], make the creation of multiple clusters within a single host unnecessary. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. [http://en.wikipedia.org/wiki/ZFS ZFS] is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (every 7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, such as backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections for the postgres database user. Note: As of version 9.1, the preferred method is a pg_hba.conf entry of type local with the peer authentication method.

Create individual accounts for all users who will connect directly to the database. DBAs will need superuser privileges, deployment representatives will need privileges to manipulate schema object definitions, and developers will need select privileges on application objects to diagnose production issues.

Where possible, use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts that are synonymous with the application schemas. Avoid connecting to those schema accounts; in fact, where possible, make each such account NOLOGIN.

When users deploy object definitions into the application schemas, they will need the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that the ownership of any newly created object is set to the account that matches the schema.

To ease management of accounts, grant privileges to users through roles rather than through direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/current/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is in hot backup mode only for the short time it takes to create the snapshots. In most recovery situations, the on-line backups (the snapshots) are used instead of "pulling from tape".

Delegation of file system administration is once again a necessity for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

User:Rstephan

2012-06-01T18:26:13Z

Rstephan:

My name is Richard Stephan. I currently work at the [http://www.nyiso.com NYISO], and started using PostgreSQL 8.0 in 2005 for my own needs. In January of 2009, I finally convinced management to use PostgreSQL instead of Oracle for one of the applications. It was successful, and others have been created since.

As an advocate for PostgreSQL, I started a local users group in the Capital District of New York State - [[NYCDPUG]].

User:Rstephan

2012-06-01T18:26:01Z

Rstephan:

User:Rstephan

2012-06-01T18:23:37Z

Rstephan:

Simple Configuration Recommendation

2012-06-01T18:18:59Z

Rstephan: /* Single Cluster and Database per Server */

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

These recommendations are to standardize and simplify PostgreSQL database configurations.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or support provider will have its own recommendations as to where to place the software and what account owns it.

Some package management software places the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is easier to manage.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.

===Single Cluster and Database per Server===
The following objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. An alternate and more manageable method to separate database objects within a single database server is through the use of schemas.

To run separate PostgreSQL clusters within one server, each cluster needs its own data area and IP port number. However, the virtualization capabilities of OSes, like Solaris zones and FreeBSD jails, or of hypervisors, like [http://en.wikipedia.org/wiki/Xen Xen] and [http://en.wikipedia.org/wiki/Kernel-based_Virtual_Machine KVM], make the creation of multiple clusters within a single host unnecessary. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. [http://en.wikipedia.org/wiki/ZFS ZFS] is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (every 7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, such as backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections for the postgres database user. Note: As of version 9.1, the preferred method is a pg_hba.conf entry of type local with the peer authentication method.

Create individual accounts for all users who will connect directly to the database. DBAs will need superuser privileges, deployment representatives will need privileges to manipulate schema object definitions, and developers will need select privileges on application objects to diagnose production issues.

Where possible, use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts that are synonymous with the application schemas. Avoid connecting to those schema accounts; in fact, where possible, make each such account NOLOGIN.

When users deploy object definitions into the application schemas, they will need the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that the ownership of any newly created object is set to the account that matches the schema.

To ease management of accounts, grant privileges to users through roles rather than through direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/current/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is in hot backup mode only for the short time it takes to create the snapshots. In most recovery situations, the on-line backups (the snapshots) are used instead of "pulling from tape".

Delegation of file system administration is once again a necessity for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

Simple Configuration Recommendation

2012-06-01T18:12:57Z

Rstephan: /* File System Layouts */

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

These recommendations are to standardize and simplify PostgreSQL database configurations.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or support provider will have its own recommendations as to where to place the software and what account owns it.

Some package management software places the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is easier to manage.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.

===Single Cluster and Database per Server===
The following objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. An alternate and more manageable method to separate database objects within a single database server is through the use of schemas.

To run separate PostgreSQL clusters within one server, each cluster needs its own data area and IP port number. However, the virtualization capabilities of OSes, like Solaris zones, or of hypervisors, like Xen or KVM, make the creation of multiple clusters within a single host unnecessary. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a sub directory. For physical backups, use FULL{date}, convention for the sub directory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}
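
The /pgdata/temp entry above only takes effect once the server is told to use it. A minimal sketch, assuming a tablespace named temp (a hypothetical name) has already been created on /pgdata/temp:

```conf
# postgresql.conf (sketch): send temporary tables and sort files to a dedicated tablespace.
# "temp" is a hypothetical tablespace name; it must already exist in the cluster.
temp_tablespaces = 'temp'
```

If the setting is left empty, temporary files fall back to the default tablespace of the current database.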

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. [http://en.wikipedia.org/wiki/ZFS ZFS] is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.
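
One way to deny remote connections to the postgres database is a reject rule in pg_hba.conf. This is a minimal sketch, assuming the rules are placed ahead of any broader host entries, because the first matching line is the one that is applied:

```conf
# pg_hba.conf (sketch): refuse every TCP/IP connection to the "postgres" database.
# Connections over the local Unix socket are not affected by these lines.
# TYPE  DATABASE  USER  ADDRESS     METHOD
host    postgres  all   0.0.0.0/0   reject
host    postgres  all   ::/0        reject
```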

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, like backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections to the postgres database user. Note: In version 9.1, the pg_hba.conf connection type local with the authentication method peer is preferable.
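
The restriction on the postgres account could be expressed in pg_hba.conf roughly as follows. This is a sketch for version 9.1 or later, and it assumes "local" is meant strictly as the Unix socket, so loopback TCP connections are rejected as well:

```conf
# pg_hba.conf (sketch, 9.1 or later): superuser access only over the local Unix socket.
# peer maps the OS user "postgres" to the database user "postgres".
# TYPE  DATABASE  USER      ADDRESS     METHOD
local   all       postgres              peer
host    all       postgres  0.0.0.0/0   reject
host    all       postgres  ::/0        reject
```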

Create individual accounts for all the users that will be connecting directly to the database. DBAs will need superuser privileges, deployment representatives will need privileges to manipulate schema object definitions, developers will need select privileges on application objects to diagnose production issues.

Where possible, use centralized enterprise accounts (e.g., LDAP) for user account authentication.
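
As an illustration, LDAP authentication is configured per connection rule in pg_hba.conf. The server name, DN parts, and address range below are hypothetical placeholders, and the sketch assumes simple-bind mode:

```conf
# pg_hba.conf (sketch): simple-bind LDAP authentication for individual user accounts.
# ldap.example.com, the DN prefix/suffix, and 10.0.0.0/8 are hypothetical values.
# TYPE   DATABASE  USER  ADDRESS     METHOD  OPTIONS
hostssl  all       all   10.0.0.0/8  ldap    ldapserver=ldap.example.com ldapprefix="uid=" ldapsuffix=",ou=people,dc=example,dc=com"
```

LDAP only verifies the password; each user must still exist as a role in the database before connecting.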

Create accounts to be synonymous with application schemas. Avoid connecting to those schema accounts. In fact where possible make the account NOLOGIN.

When users are deploying object definitions into the application schemas, they will need to have the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that for any newly created object the ownership is set to the account that matches the schema.

To ease management of accounts, use roles for granting privileges to users versus direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.
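
A pooled account can be pinned to the application servers in pg_hba.conf. In this sketch the database name appdb, the account app_pool, and both addresses are hypothetical:

```conf
# pg_hba.conf (sketch): the pooled account may connect only from the application servers.
# appdb, app_pool, and the 192.0.2.x addresses are hypothetical.
# TYPE   DATABASE  USER      ADDRESS        METHOD
hostssl  appdb     app_pool  192.0.2.10/32  md5
hostssl  appdb     app_pool  192.0.2.11/32  md5
host     all       app_pool  0.0.0.0/0      reject
host     all       app_pool  ::/0           reject
```

The two allow rules must come before the reject rules, since pg_hba.conf is evaluated top to bottom and the first match wins.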

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/current/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is only in hot backup mode for the short time it takes to create the snapshot. For most recovery situations, the on-line backups (the snapshots) are used instead of "pulling from tape".

Delegation of file system administration is once again a necessity for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

Simple Configuration Recommendation

2012-06-01T18:09:49Z

Rstephan: /* File System Layouts */

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

These recommendations are to standardize and simplify PostgreSQL database configurations.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software and whoever the distributor, packager, or supporter will have their
recommendations as to where to place the software and what account owns the software.

Some package management software place the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is easier to manage.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.

===Single Cluster and Database per Server===
The following database objects are cluster wide within PostgreSQL, having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. An alternate and more manageable method to separate database objects within a single database server is through the use of schemas.

To separate PostgreSQL clusters within a server different data areas and IP port numbers need to be created. However, the virtualization capabilities of the OSes like Solaris’s zones or hypervisors like Xen or KVM make creation of multiple clusters within a single host unnecessary. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. ZFS is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, like backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections to the postgres database user. Note: In version 9.1, the pg_hba.conf connection type local with the authentication method peer is preferable.

Create individual accounts for all the users that will be connecting directly to the database. DBAs will need superuser privileges, deployment representatives will need privileges to manipulate schema object definitions, developers will need select privileges on application objects to diagnose production issues.

Where possible, use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts to be synonymous with application schemas. Avoid connecting to those schema accounts. In fact where possible make the account NOLOGIN.

When users are deploying object definitions into the application schemas, they will need to have the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that for any newly created object the ownership is set to the account that matches the schema.

To ease management of accounts, use roles for granting privileges to users versus direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/current/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is only in hot backup mode for the short time it takes to create the snapshot. For most recovery situations, the on-line backups (the snapshots) are used instead of "pulling from tape".

Delegation of file system administration is once again a necessity for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

Simple Configuration Recommendation

2012-05-31T12:00:35Z

Rstephan: /* Physical Database Backups */

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

These recommendations are to standardize and simplify PostgreSQL database configurations.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software and whoever the distributor, packager, or supporter will have their
recommendations as to where to place the software and what account owns the software.

Some package management software place the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is easier to manage.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.

===Single Cluster and Database per Server===
The following database objects are cluster wide within PostgreSQL, having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. An alternate and more manageable method to separate database objects within a single database server is through the use of schemas.

To separate PostgreSQL clusters within a server different data areas and IP port numbers need to be created. However, the virtualization capabilities of the OSes like Solaris’s zones or hypervisors like Xen or KVM make creation of multiple clusters within a single host unnecessary. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. ZFS is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, like backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections to the postgres database user. Note: In version 9.1, the pg_hba.conf connection type local with the authentication method peer is preferable.

Create individual accounts for all the users that will be connecting directly to the database. DBAs will need superuser privileges, deployment representatives will need privileges to manipulate schema object definitions, developers will need select privileges on application objects to diagnose production issues.

Where possible, use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts to be synonymous with application schemas. Avoid connecting to those schema accounts. In fact where possible make the account NOLOGIN.

When users are deploying object definitions into the application schemas, they will need to have the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that for any newly created object the ownership is set to the account that matches the schema.

To ease management of accounts, use roles for granting privileges to users versus direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/current/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is only in hot backup mode for the short time it takes to create the snapshot. For most recovery situations, the on-line backups (the snapshots) are used instead of "pulling from tape".

Delegation of file system administration is once again a necessity for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

Simple Configuration Recommendation

2012-05-31T11:54:43Z

Rstephan:

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

These recommendations are to standardize and simplify PostgreSQL database configurations.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software and whoever the distributor, packager, or supporter will have their
recommendations as to where to place the software and what account owns the software.

Some package management software place the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is easier to manage.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.

===Single Cluster and Database per Server===
The following database objects are cluster wide within PostgreSQL, having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. An alternate and more manageable method to separate database objects within a single database server is through the use of schemas.

To separate PostgreSQL clusters within a server different data areas and IP port numbers need to be created. However, the virtualization capabilities of the OSes like Solaris’s zones or hypervisors like Xen or KVM make creation of multiple clusters within a single host unnecessary. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. ZFS is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, like backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections to the postgres database user. Note: In version 9.1, the pg_hba.conf connection type local with the authentication method peer is preferable.

Create individual accounts for all the users that will be connecting directly to the database. DBAs will need superuser privileges, deployment representatives will need privileges to manipulate schema object definitions, developers will need select privileges on application objects to diagnose production issues.

Where possible, use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts to be synonymous with application schemas. Avoid connecting to those schema accounts. In fact where possible make the account NOLOGIN.

When users are deploying object definitions into the application schemas, they will need to have the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that for any newly created object the ownership is set to the account that matches the schema.

To ease management of accounts, use roles for granting privileges to users versus direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/9.1/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is only in hot backup mode for the short time it takes to create the snapshot. For most recovery situations, the on-line backups (the snapshots) are used instead of "pulling from tape".

Delegation of file system administration is once again a necessity for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

Simple Configuration Recommendation

2012-05-30T19:48:41Z

Rstephan: /* Physical Database Backups */

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or supporter will have their own recommendations as to where to place the software and what account owns the software.

Some package management software places the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is preferable.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.
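
The directory convention and the upgrade rule above can be sketched in Python. This is a minimal illustration, assuming dotted numeric version strings such as 9.0.4; the function names are hypothetical and are not part of any PostgreSQL tooling.

```python
from pathlib import PurePosixPath

# Base software directory recommended above.
BASE_DIR = PurePosixPath("/opt/postgres")


def release_series(version: str) -> str:
    """Return the first two digits of a version string, e.g. '9.0.4' -> '9.0'."""
    parts = version.split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unexpected version string: {version!r}")
    return ".".join(parts[:2])


def install_dir(version: str) -> PurePosixPath:
    """Software location for a given version, e.g. /opt/postgres/9.0."""
    return BASE_DIR / release_series(version)


def needs_data_file_upgrade(old: str, new: str) -> bool:
    """True when the first or second digit changes between the two versions."""
    return release_series(old) != release_series(new)


if __name__ == "__main__":
    print(install_dir("9.0.4"))
    print(needs_data_file_upgrade("9.0.4", "9.0.7"))
    print(needs_data_file_upgrade("9.0.7", "9.1.0"))
```

Because both 9.0.4 and 9.0.7 map to the same directory, a third-digit upgrade replaces the software in place, while a move to 9.1 gets a directory of its own next to the old one.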

===Single Cluster and Database per Server===
The following objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. The preferable method to separate database objects within a single database server is through the use of schemas.

To separate PostgreSQL clusters within a server different data areas and IP port numbers need to be created. However, the virtualization capabilities of the OSes like Solaris’s zones or hypervisors like Xen or KVM make creation of multiple clusters within a single host unnecessary and undesirable. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.
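
Whether each location really is its own file system can be checked with a few lines of Python. The following is a sketch only: the two application tablespace paths are hypothetical examples of the /pgdata/''app_tblspc'' pattern, and the function name is an assumption made for illustration.

```python
import os
from typing import Callable, Iterable, List

# Mount points from the table above. The last two entries are hypothetical
# examples of application tablespace locations.
EXPECTED_MOUNT_POINTS = [
    "/pgarchive",
    "/pgbackup",
    "/pgcluster/data",
    "/pgcluster/log",
    "/pgdata/system",
    "/pgdata/temp",
    "/pgdata/app_tables",   # hypothetical
    "/pgdata/app_indexes",  # hypothetical
]


def not_mounted(paths: Iterable[str],
                is_mount: Callable[[str], bool] = os.path.ismount) -> List[str]:
    """Return the paths that are not file system mount points."""
    return [p for p in paths if not is_mount(p)]


if __name__ == "__main__":
    for path in not_mounted(EXPECTED_MOUNT_POINTS):
        print(f"not a separate file system: {path}")
```

The mount test is passed in as a parameter so the check can be exercised without touching the real host.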

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. ZFS is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'
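
A small Python sketch can compare an existing postgresql.conf against a subset of the settings listed above. It assumes the simple "name = value" form with whole-line comments only; the real file format also allows the equals sign to be omitted and comments to trail a setting, which this sketch does not handle. The function names are hypothetical.

```python
from typing import Dict, Optional

# Subset of the settings recommended above.
RECOMMENDED = {
    "archive_mode": "on",
    "wal_level": "archive",
    "log_connections": "on",
    "log_disconnections": "on",
    "log_statement": "ddl",
    "ssl": "on",
}


def parse_conf(text: str) -> Dict[str, str]:
    """Parse 'name = value' lines, skipping blank lines and whole-line comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not in the simple form this sketch understands
        settings[name.strip()] = value.strip().strip("'")
    return settings


def deviations(text: str) -> Dict[str, Optional[str]]:
    """Map each recommended setting that differs to its actual value (None if unset)."""
    actual = parse_conf(text)
    return {name: actual.get(name)
            for name, wanted in RECOMMENDED.items()
            if actual.get(name) != wanted}


if __name__ == "__main__":
    sample = "archive_mode = on\nwal_level = 'archive'\nssl = off\n"
    for name, value in deviations(sample).items():
        print(f"{name}: found {value!r}, recommended {RECOMMENDED[name]!r}")
```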

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, like backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections to the postgres database user. Note: In version 9.1, the preferred pg_hba.conf entry for this is a local connection using the peer authentication method.

Create individual accounts for all the users that will be connecting directly to the database. DBAs will need superuser privileges; deployment representatives will need privileges to manipulate schema object definitions; developers will need select privileges on application objects to diagnose production issues.

Where possible use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts to be synonymous with application schemas. Avoid connecting to those schema accounts. In fact where possible make the account NOLOGIN.

When users are deploying object definitions into the application schemas, they will need to have the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that for any newly created object the ownership is set to the account that matches the schema.
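
The schema account arrangement described above can be sketched as a Python helper that emits the SQL statements. The schema and user names passed in are hypothetical, and the helper only accepts simple lower-case identifiers so that no quoting is needed.

```python
import re
from typing import List

# Only plain lower-case identifiers are accepted, to avoid quoting issues.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def _checked(name: str) -> str:
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def schema_account_sql(schema: str, deployers: List[str]) -> List[str]:
    """SQL for a NOLOGIN account that owns a schema of the same name."""
    schema = _checked(schema)
    statements = [
        f"CREATE ROLE {schema} NOLOGIN;",
        f"CREATE SCHEMA {schema} AUTHORIZATION {schema};",
    ]
    # Deployers receive the schema role instead of direct grants.
    for user in deployers:
        statements.append(f"GRANT {schema} TO {_checked(user)};")
    return statements


if __name__ == "__main__":
    for statement in schema_account_sql("sales", ["alice", "bob"]):
        print(statement)
```

A deployer who holds the role can issue SET ROLE to the schema account before creating objects, so that ownership of new objects falls to the account that matches the schema.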

To ease management of accounts, use roles for granting privileges to users versus direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.
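
Restricting a pooled account to the application servers is done in pg_hba.conf. The sketch below generates such entries in Python; the database name, account name, addresses, and the choice of md5 authentication over SSL are assumptions made for illustration, not a prescribed configuration.

```python
from typing import List


def pooled_account_hba(database: str, pool_user: str,
                       app_servers: List[str]) -> List[str]:
    """pg_hba.conf entries: superuser local only, pooled account per app server."""
    # The postgres superuser may connect only over the local socket.
    lines = ["local all postgres peer"]
    # One SSL-only entry per application server address (IPv4 assumed).
    for address in app_servers:
        lines.append(f"hostssl {database} {pool_user} {address}/32 md5")
    return lines


if __name__ == "__main__":
    for line in pooled_account_hba("appdb", "app_pool", ["10.0.0.5", "10.0.0.6"]):
        print(line)
```

Connections that match no entry are rejected, so the pooled account cannot be used from any host that is not listed.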

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/9.1/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.
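
The order of operations for a snapshot-based backup can be sketched as a list of commands built in Python. This is an illustration under stated assumptions: the ZFS dataset names are hypothetical, psql is assumed to connect as the postgres account by default, and the pg_start_backup and pg_stop_backup functions are those of the 9.x releases this page targets. The sketch only builds the commands; it does not run them.

```python
from typing import List


def snapshot_backup_plan(label: str, datasets: List[str]) -> List[List[str]]:
    """Commands, in order, for a hot backup based on file system snapshots."""
    if not label.replace("_", "").isalnum():
        raise ValueError("label must contain only letters, digits, and underscores")
    # 1. Place the database in hot backup mode.
    plan = [["psql", "-At", "-c", f"SELECT pg_start_backup('{label}');"]]
    # 2. Snapshot every file system that makes up the database storage.
    for dataset in datasets:
        plan.append(["zfs", "snapshot", f"{dataset}@{label}"])
    # 3. Take the database out of backup mode.
    plan.append(["psql", "-At", "-c", "SELECT pg_stop_backup();"])
    return plan


if __name__ == "__main__":
    for command in snapshot_backup_plan("FULL20120530",
                                        ["tank/pgcluster", "tank/pgdata"]):
        print(" ".join(command))
```

A script that actually executes this plan should make sure the final step runs even when a snapshot fails, so the database is not left in backup mode.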

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is only in hot backup mode for the short time it takes to create the snapshot. In most recovery situations the on-line backups (the snapshots) are used instead of "pulling from tape".

Delegation of file system administration is once again a necessity for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

Simple Configuration Recommendation

2012-05-30T19:46:12Z

Rstephan: /* Physical Database Backups */

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or supporter will have their own recommendations as to where to place the software and what account owns the software.

Some package management software places the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is preferable.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.

===Single Cluster and Database per Server===
The following objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. The preferable method to separate database objects within a single database server is through the use of schemas.

To separate PostgreSQL clusters within a server different data areas and IP port numbers need to be created. However, the virtualization capabilities of the OSes like Solaris’s zones or hypervisors like Xen or KVM make creation of multiple clusters within a single host unnecessary and undesirable. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. ZFS is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, like backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections to the postgres database user. Note: In version 9.1, the preferred pg_hba.conf entry for this is a local connection using the peer authentication method.

Create individual accounts for all the users that will be connecting directly to the database. DBAs will need superuser privileges; deployment representatives will need privileges to manipulate schema object definitions; developers will need select privileges on application objects to diagnose production issues.

Where possible use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts to be synonymous with application schemas. Avoid connecting to those schema accounts. In fact where possible make the account NOLOGIN.

When users are deploying object definitions into the application schemas, they will need to have the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that for any newly created object the ownership is set to the account that matches the schema.

To ease management of accounts, use roles for granting privileges to users versus direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the ''Continuous Archiving and Point-In-Time Recovery'' chapter in the PostgreSQL reference manual - http://www.postgresql.org/docs/9.1/static/continuous-archiving.html

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is only in hot backup mode for the short time it takes to create the snapshot. In most recovery situations the on-line backups (the snapshots) are used instead of "pulling from tape".

Delegation of file system administration is once again a necessity for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

Simple Configuration Recommendation

2012-05-30T19:38:20Z

Rstephan: /* Physical Database Backups */

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or supporter will have their own recommendations as to where to place the software and what account owns the software.

Some package management software places the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is preferable.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.

===Single Cluster and Database per Server===
The following objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. The preferable method to separate database objects within a single database server is through the use of schemas.

To separate PostgreSQL clusters within a server different data areas and IP port numbers need to be created. However, the virtualization capabilities of the OSes like Solaris’s zones or hypervisors like Xen or KVM make creation of multiple clusters within a single host unnecessary and undesirable. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. ZFS is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, like backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections to the postgres database user. Note: In version 9.1, the preferred pg_hba.conf entry for this is a local connection using the peer authentication method.

Create individual accounts for all the users that will be connecting directly to the database. DBAs will need superuser privileges; deployment representatives will need privileges to manipulate schema object definitions; developers will need select privileges on application objects to diagnose production issues.

Where possible use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts to be synonymous with application schemas. Avoid connecting to those schema accounts. In fact where possible make the account NOLOGIN.

When users are deploying object definitions into the application schemas, they will need to have the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that for any newly created object the ownership is set to the account that matches the schema.

To ease management of accounts, use roles for granting privileges to users versus direct grants.

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the [http://www.postgresql.org/docs/9.1/static/continuous-archiving.html Continuous Archiving and Point-In-Time Recovery] chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is only in hot backup mode for the short time it takes to create the snapshot. In most recovery situations the on-line backups (the snapshots) are used instead of "pulling from tape".

Delegation of file system administration is once again a necessity for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

Simple Configuration Recommendation

2012-05-30T19:30:23Z

Rstephan:

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or supporter will have their own recommendations as to where to place the software and what account owns the software.

Some package management software places the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is preferable.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised upgrading with the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.

===Single Cluster and Database per Server===
The following objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. The preferable method to separate database objects within a single database server is through the use of schemas.

To separate PostgreSQL clusters within a server different data areas and IP port numbers need to be created. However, the virtualization capabilities of the OSes like Solaris’s zones or hypervisors like Xen or KVM make creation of multiple clusters within a single host unnecessary and undesirable. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. ZFS is an example of a file system that has delegated administration.

===Server Configuration Information===
Use "Continuous Archiving" for Point-In-Time Recovery (PITR).
archive_mode = on
archive_command = 'cp %p /pgarchive/%f'
wal_level = 'archive'

Set up server log file rotation (7 days or 10MB, whichever comes first).
log_directory = '/pgcluster/log'
log_filename = 'postgresql-%Y%m%d_%H%M%S.log'
log_rotation_age = 7d
log_rotation_size = 10MB
log_truncate_on_rotation = off
log_line_prefix = '%t %c '

Gather connection information in server log file.
log_connections = on
log_disconnections = on

Log DDL transactions.
log_statement = 'ddl'

Enable SSL traffic.
ssl = on
ssl_ciphers = 'ALL'

Either drop the default postgres database or deny remote connections to it.

Create a database to place application schemas within. Drop the public schema.

===Account Management===
Avoid connecting to the database server as the database superuser, postgres. Management processes, like backups, will most likely still use the postgres account; however, users and applications should not. Allow only local connections to the postgres database user. Note: In version 9.1, the preferred pg_hba.conf entry for this is a local connection using the peer authentication method.

Create individual accounts for all the users that will be connecting directly to the database. DBAs will need superuser privileges; deployment representatives will need privileges to manipulate schema object definitions; developers will need select privileges on application objects to diagnose production issues.

Where possible use centralized enterprise accounts (e.g., LDAP) for user account authentication.

Create accounts to be synonymous with application schemas. Avoid connecting to those schema accounts. In fact where possible make the account NOLOGIN.

When users are deploying object definitions into the application schemas, they will need to have the appropriate privileges. Granting those users the role of the application schema is sufficient to allow this activity. Make sure that for any newly created object the ownership is set to the account that matches the schema.

To ease management of accounts, grant privileges to users through roles rather than through direct grants.
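
The schema-account pattern described above can be sketched as a handful of SQL statements. The Python helper below only builds the statement text; the function names and the example identifiers (billing, alice, bob) are hypothetical, and it assumes simple lowercase identifiers that need no quoting.

```python
def schema_role_statements(schema, deployers):
    """SQL for a NOLOGIN account that owns a schema of the same name,
    plus role grants for the users who deploy into that schema."""
    statements = [
        f"CREATE ROLE {schema} NOLOGIN;",
        f"CREATE SCHEMA {schema} AUTHORIZATION {schema};",
    ]
    for user in deployers:
        # Membership in the schema role is what lets a user deploy.
        statements.append(f"GRANT {schema} TO {user};")
    return statements


def fix_owner_statement(schema, table):
    """SQL that hands a newly created table to the schema account."""
    return f"ALTER TABLE {schema}.{table} OWNER TO {schema};"


for sql in schema_role_statements("billing", ["alice", "bob"]):
    print(sql)
print(fix_owner_statement("billing", "invoice"))
```

A deployer could instead issue SET ROLE for the schema account before creating objects, so that ownership is correct from the start and no ALTER is needed.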

Generally applications connect to the database using pooled (shared) accounts. Make sure those accounts can only connect to the database from the defined application servers. Users should not be allowed to log directly into the database using those pooled accounts.
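
One way to express the restriction on pooled accounts is in pg_hba.conf, where the first matching line decides how a connection is handled. The sketch below generates such lines; the database name, account name, addresses, and the choice of md5 are placeholders, and only IPv4 is covered.

```python
def pooled_account_hba(database, account, app_server_ips, method="md5"):
    """pg_hba.conf lines that admit a pooled account only from the
    listed application servers and reject it from everywhere else."""
    lines = [
        f"host    {database}    {account}    {ip}/32    {method}"
        for ip in app_server_ips
    ]
    # Records are matched top to bottom, so the catch-all reject goes last.
    lines.append(f"host    {database}    {account}    0.0.0.0/0    reject")
    return lines


for line in pooled_account_hba("appdb", "app_pool", ["10.0.0.5", "10.0.0.6"]):
    print(line)
```

The same idea applies to the postgres account: a single local line using peer authentication, followed by a host line that rejects it from any address.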

===Physical Database Backups===
To perform on-line backups it is important that the database be in archive log mode. Refer to the Continuous Archiving and Point-In-Time Recovery chapter in the PostgreSQL reference manual.

Using an advanced file system like ZFS that has snapshot/rollback capabilities has some significant advantages. Placing the database in hot backup mode, snapshotting the file systems that make up the database storage, and taking the database out of backup mode is preferable to using tar or cpio to copy all of the data files to an alternate location during the backup process.

After the snapshots have been taken, copying the data files to an alternate location for safekeeping is still an option; however, the database is only in hot backup mode for the short time it takes to create the snapshot. In most recovery situations the on-line backups (the snapshots) are used instead of "pulling from tape".
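
The snapshot sequence can be sketched as an ordered list of commands. This is only a dry-run plan builder: the dataset names are hypothetical, the psql connection options are assumptions, and it relies on the pg_start_backup and pg_stop_backup functions of the 9.x releases this page targets.

```python
def snapshot_backup_plan(label, datasets):
    """Return the commands, in order, for a snapshot-based hot backup.

    Nothing is executed here; a caller would run each entry in turn
    (for example with subprocess.run).
    """
    if not label.isalnum():
        # The label is embedded in SQL and in the snapshot name.
        raise ValueError("label must be alphanumeric")
    plan = [["psql", "-U", "postgres", "-c",
             f"SELECT pg_start_backup('{label}');"]]
    for dataset in datasets:
        plan.append(["zfs", "snapshot", f"{dataset}@backup_{label}"])
    plan.append(["psql", "-U", "postgres", "-c", "SELECT pg_stop_backup();"])
    return plan


for command in snapshot_backup_plan("nightly", ["tank/pgcluster", "tank/pgdata"]):
    print(" ".join(command))
```

A real script would make sure the final pg_stop_backup step runs even if one of the snapshot commands fails, so the database does not stay in backup mode.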

Delegation of file system administration is once again a necessity for management of the file system snapshots. DBAs will have to coordinate with the system and storage administration to facilitate the best practices.

[[Category:Administration]]

Simple Configuration Recommendation

2012-05-30T19:26:01Z

Rstephan: Created page with "==PostgreSQL Configuration Recommendations== Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closel…"

==PostgreSQL Configuration Recommendations==
Administration of database environments requires resources from separate disciplines. Database administrators (DBA) must work closely with the system and storage administrators. PostgreSQL relies heavily on the host operating system (OS) for storage management. It does not have the advanced and complicated features of Oracle for storage management.

===Software Location and Ownership===
The common location for PostgreSQL software on Linux is /usr/local/pgsql with the executables, source, and data existing in various subdirectories. However, PostgreSQL is open source software, and each distributor, packager, or supporter will have its own recommendations as to where to place the software and what account owns the software.

Some package management software places the executables, libraries, man pages, and contrib files in various places. Avoid these solutions. Having a standard simple configuration for the software installation is preferable.

The owner and group of the software, database files, and server processes will be postgres:dba. The UID and GID have to be worked out with system administration.

Create a base software destination directory:
/opt/postgres

Define the software installation using the first 2 digits of the software version (9.0 as the example):
/opt/postgres/9.0

Be advised that upgrading the third digit in the version number usually entails stopping the server, switching to the new software, and restarting the server. However, upgrading the first or second digit requires an upgrade of all of the data files. Keeping the software versions separate helps with upgrades.
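
The version rule above can be written down as a small check. This sketch assumes the three-part version numbers described here (such as 9.0.4); the helper names are made up for the example.

```python
def upgrade_kind(current, target):
    """Classify an upgrade by comparing the first two version digits."""
    cur = tuple(int(part) for part in current.split("."))
    new = tuple(int(part) for part in target.split("."))
    # Same first two digits: only the software is swapped.
    # Otherwise the data files have to be upgraded as well.
    return "minor" if cur[:2] == new[:2] else "major"


def install_dir(version, base="/opt/postgres"):
    """Installation directory named after the first two version digits."""
    return base + "/" + ".".join(version.split(".")[:2])


print(upgrade_kind("9.0.4", "9.0.13"), install_dir("9.0.13"))
# minor /opt/postgres/9.0
print(upgrade_kind("9.0.13", "9.1.2"), install_dir("9.1.2"))
# major /opt/postgres/9.1
```

Because a minor upgrade maps to the same directory and a major upgrade maps to a new one, the old and new major versions can sit side by side while the data files are upgraded.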

===Single Cluster and Database per Server===
The following database objects are cluster-wide within PostgreSQL, so having only one database per cluster is preferable:
* Configuration files
* WAL (on-line and archived) files
* Tablespaces
* User accounts and roles
* Server log file

An older style of database object separation was through the use of multiple databases. The preferable method to separate database objects within a single database server is through the use of schemas.

To run separate PostgreSQL clusters within a server, each cluster needs its own data area and IP port number. However, the virtualization capabilities of OSes like Solaris’s zones or hypervisors like Xen or KVM make creation of multiple clusters within a single host unnecessary and undesirable. The recommendation is to have only one PostgreSQL cluster per virtualized host.

===File System Layouts===
To create the most flexible and manageable environment, separate the various database components into their own file systems. Create the following file systems (mount points):

{| border="1"
|/pgarchive
|DB Archive location containing the archive log files.
|-
|/pgbackup
|DB Backup location containing the physical and logical backups. For logical backups (pg_dump), use EXPORTS as a subdirectory. For physical backups, use the FULL{date} convention for the subdirectory. However, physical backups could be handled with file system snapshots. More on this later.
|-
|/pgcluster/data
|PostgreSQL Cluster data directory (PGDATA environment variable). This will contain all of the configuration files and directories of a PostgreSQL cluster. Note: the on-line WAL files would be located in /pgcluster/data/pg_xlog.
|-
|/pgcluster/log
|The location of the server log files.
|-
|/pgdata/system
|The location of a database’s default tablespace. This is to be used at the creation of the database. The database catalog information will be stored here.
|-
|/pgdata/temp
|The location of the database’s default temporary tablespace. This is needed for temporary sort information. Note: The pgsql_tmp directory within the default tablespace will be used if a temporary tablespace is not defined.
|-
|/pgdata/''app_tblspc''
|Application tablespaces. For every schema there should be a minimum of two tablespaces. One for tables and one for indexes.
|}

PostgreSQL does not have declarative size limitations for its tablespaces and database objects; the OS is expected to manage the size of used devices. This is why it is recommended to create a separate mount point (file system) for every tablespace. This adds a layer of complexity especially in organizations that segregate storage and OS management from database management. However, that level of complexity is outweighed by the advantage of separation and segregation of database objects.
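
Since the OS is the only thing limiting tablespace growth, it can help to watch each mount point directly. The following is a rough sketch using the Python standard library; the 80 percent threshold and the helper names are arbitrary choices made for the example.

```python
import os
import shutil


def percent_used(mount_point):
    """Percentage of the file system at mount_point that is in use."""
    usage = shutil.disk_usage(mount_point)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def over_threshold(mount_points, limit=80.0):
    """Return the existing mount points at or above the usage limit."""
    return [
        path for path in mount_points
        if os.path.isdir(path) and percent_used(path) >= limit
    ]


# Mount points taken from the layout above; missing ones are skipped.
print(over_threshold(["/pgarchive", "/pgbackup", "/pgcluster/data",
                      "/pgdata/system", "/pgdata/temp"]))
```

With one file system per tablespace, a full mount point affects only the objects stored there, which is the separation the recommendation is after.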

It is desirable for the file system growth and management to be in the form of distributed administration. A DBA would be given a set of disk groups within a volume manager and then carve up the file systems accordingly. ZFS is an example of a file system that has delegated administration.


Category:Administration

2012-05-30T19:05:53Z

Rstephan: /* General Admin Topics */

== General Admin Topics ==

*[[Client Authentication]] (pg_hba.conf)
*[[Binary Replication Tutorial]]
*[[Planner Statistics]]
*[[Warm Standby]]
* [[Replication, Clustering, and Connection Pooling]]
* [[Shared Database Hosting]]
* [http://www.ibm.com/developerworks/opensource/library/os-postgresecurity/index.html Total security in a PostgreSQL database] 2009-11-17
* [[Simple Configuration Recommendation]]

== [[:Category:Backup|Backup]] ==

See the [[:Category:Backup|Backup]] category.

[http://www.postgresonline.com/journal/archives/186-postgresql90_pg_dumprestore.html Backup and Restore cheatsheet for PostgreSQL 9.0] Postgres OnLine Journal 2010-11-21

== Authentication ==
* [[Client Authentication]]

== Restoration and Recovery ==
* [http://svana.org/kleptog/pgsql/pgfsck.html PostgreSQL table checker and dumper tool] by Martijn van Oosterhout
* [[Adventures in PostgreSQL, Episode 1]] by Josh Berkus (2002-05)

== Routine maintenance and monitoring ==
*[[Vacuuming]]
*[[Monitoring]]
*[[Lock Monitoring]]
*[[Index Maintenance]]
*[[Disk Usage]]

== [[:Category:Windows|Windows-specific]] ==

[[:Category:Windows|Windows category]]
[[Category:General articles and guides]]

NYCDPUG

2012-05-24T20:06:40Z

Rstephan:

The New York Capital District PostgreSQL Users Group no longer exists. The following contains a listing of past presentations:
* January 7, 2010 - Introduction to PostgreSQL [http://wiki.postgresql.org/images/4/46/NYCDPUG_NYISO_PostgreSQL.pdf NYCDPUG_NYISO_PostgreSQL.pdf]
* March 4, 2010 - [http://momjian.us/main/writings/pgsql/inside_shmem.pdf Bruce Momjian's Inside PostgreSQL Shared Memory]
* May 6, 2010 - Separation of Duties [http://wiki.postgresql.org/images/1/1c/SeparationofDuties.pdf NYISO_Separation_of_Duties.pdf]
* March 3, 2011 - Securing data in transit using SSL and SSL/TLS LDAP.
* May 5, 2011 - Jim Mlodgenski's Scaling PostgreSQL with Stado
* November 3, 2011 - Why is Oracle Better than PostgreSQL? [http://wiki.postgresql.org/images/2/27/Oracle-better-than-postgres.pdf Oracle-better-than-postgres.pdf]

[[Category:Users group]]

NYCDPUG

2012-05-24T20:04:18Z

Rstephan:

File:Oracle-better-than-postgres.pdf

2012-05-24T20:00:43Z

Rstephan:

NYCDPUG

2012-05-24T19:58:58Z

Rstephan:

NYCDPUG

2012-05-24T19:58:41Z

Rstephan:

NYCDPUG

2011-08-05T18:49:22Z

Rstephan: Undo revision 15081 by Rstephan (Talk)

The New York Capital District PostgreSQL Users Group home page can be found at [http://nycdpug.x10hosting.com http://nycdpug.x10hosting.com]. The following contains a listing of past presentations:
* January 7, 2010 - Introduction to PostgreSQL [http://wiki.postgresql.org/images/4/46/NYCDPUG_NYISO_PostgreSQL.pdf NYCDPUG_NYISO_PostgreSQL.pdf]
* March 4, 2010 - [http://momjian.us/main/writings/pgsql/inside_shmem.pdf Bruce Momjian's Inside PostgreSQL Shared Memory]
* May 6, 2010 - Separation of Duties [http://wiki.postgresql.org/images/1/1c/SeparationofDuties.pdf NYISO_Separation_of_Duties.pdf]
* March 3, 2011 - Securing data in transit using SSL and SSL/TLS LDAP.
* May 5, 2011 - Jim Mlodgenski's Scaling PostgreSQL with Stado

NYCDPUG

2011-08-05T18:48:24Z

Rstephan: Undo revision 15080 by Rstephan (Talk)

NYCDPUG

2011-08-05T18:48:11Z

Rstephan:

PostgreSQL for Oracle DBAs

2011-08-05T18:42:04Z

Rstephan:

= Introduction =

The following article contains information to help an Oracle DBA understand
some terms and the management of a PostgreSQL database. This article is
intended to be an introduction to PostgreSQL, not a tutorial or a complete
definition of how to administer a PostgreSQL database. For complete
documentation refer to the [http://www.postgresql.org/docs/manuals/ PostgreSQL manuals].

= Oracle =

== Brief description: ==

* An Oracle database server consists of an Oracle instance and an Oracle database.
* An Oracle instance consists of the Oracle background processes and the allocated memory within the shared global area (SGA) and the program global area (PGA).
* The Oracle background processes consist of the following:
** Database Writer Process (DBWn)
** Log Writer Process (LGWR)
** Checkpoint Process (CKPT)
** System Monitor Process (SMON)
** Process Monitor Process (PMON)
** Recoverer Process (RECO)
** Archiver Processes (ARCn)
* An Oracle database consists of the database datafiles, control files, redo log files, archive log files, and parameter file.
* To remotely access an Oracle database, there exists a separate process referred to as the Oracle listener.
* In the Dedicated Server configuration (versus the Shared Server configuration) every established database session has its own process executing on the server.

To keep things simple, any comparisons with an Oracle database will always refer to a single instance managing a single database; RAC and Data Guard will not be mentioned. Note: PostgreSQL also has the concept of a warm standby (since 8.2) with the shipping of archive logs (introduced in 8.0).

= PostgreSQL =

== Database Server Processes ==

All of the server processes run the same database server program, postgres. There are no separately named processes like in Oracle for the different duties within the database environment. If you were to look at the process list (ps) the name of the processes would be postgres. However, on most platforms, PostgreSQL modifies its command title so that individual server processes can readily be identified. You may need to adjust the parameters used for commands such as ps and top to show these updated titles in place of the process name ("postgres").

The processes seen in a process list can be some of the following:

* Master process - launches the other processes, background and session processes.
* Writer process - background process that coordinates database writes, log writes and checkpoints.
* Stats collector process - background process collecting information about server activity.
* User session processes.

The server processes communicate with each other using semaphores and shared memory to ensure data integrity throughout concurrent data access.

== PostgreSQL Database Cluster ==

Within a server, one or more Oracle instances can be built. The databases are separate from one another usually sharing only the Oracle listener process. PostgreSQL has the concept of a ''database cluster''. A database cluster is a collection of databases that is stored at a common file system location (the "data area"). It is possible to have multiple database clusters, so long as they use different data areas and different communication ports.

The processes along with the file system components are all shared within the database cluster. All the data needed for a database cluster is stored within the cluster's data directory, commonly referred to as ''PGDATA'' (after the name of the environment variable that can be used to define it). The PGDATA directory contains several subdirectories and configuration files.

The following are some of the cluster configuration files:

* postgresql.conf - Parameter or main server configuration file.
* pg_hba.conf - Client authentication configuration file.
* pg_ident.conf - Map from OS account to PostgreSQL account file.

The cluster subdirectories:

* base - Subdirectory containing per-database subdirectories
* global - Subdirectory containing cluster-wide tables
** pg_auth - Authorization file containing user and role definitions.
** pg_control - Control file.
** pg_database - Information of databases within the cluster.
* pg_clog - Subdirectory containing transaction commit status data
* pg_multixact - Subdirectory containing multitransaction status data (used for shared row locks)
* pg_subtrans - Subdirectory containing subtransaction status data
* pg_tblspc - Subdirectory containing symbolic links to tablespaces
* pg_twophase - Subdirectory containing state files for prepared transactions
* pg_xlog - Subdirectory containing WAL (Write Ahead Log) files

By default, for each database in the cluster there is a subdirectory within PGDATA/base, named after the database's OID (object identifier) in pg_database. This subdirectory is the default location for the database's files; in particular, its system catalogs are stored there. Each table and index is stored in a separate file, named after the table or index's filenode number, which can be found in pg_class.relfilenode.
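
As a small illustration of that layout, the path of a table's first data file can be assembled from the database OID and the relfilenode. The helper name and the example numbers are made up, and the sketch covers only objects stored in the default location under PGDATA/base.

```python
import posixpath


def relation_file_path(pgdata, database_oid, relfilenode):
    """Path of the first data file of a table or index that lives in
    the default location under PGDATA/base."""
    return posixpath.join(pgdata, "base", str(database_oid), str(relfilenode))


print(relation_file_path("/pgcluster/data", 16384, 16397))
# /pgcluster/data/base/16384/16397
```

The two numbers would come from the oid column of pg_database and the relfilenode column of pg_class, respectively.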

Several components that Oracle DBAs usually equate to one database are shared between databases within a PostgreSQL cluster, including the parameter file, control file, redo logs, tablespaces, accounts, roles, and background processes.

== Tablespaces and Object Data Files ==

PostgreSQL introduced tablespace management in version 8.0. The physical representation of a tablespace within PostgreSQL is simple: it is a directory on the file system, and the mapping is done via symbolic links.

When a database is created, its default tablespace is where all of the database objects are stored by default. In Oracle this would be similar to the System, User, and Temporary tablespaces. If no default tablespace is defined during creation, the data files will go into a subdirectory of PGDATA/base. Preferably, the system catalog information and the application data structures would reside in separately managed tablespaces; PostgreSQL supports this arrangement.

As in Oracle, the definition of a PostgreSQL table determines which tablespace the object resides in. However, there exists no size limitation except physical boundaries placed on the device by the OS.

The individual table's data is stored within a file within the tablespace (or directory). The database software will split the table across multiple datafiles in the event the table's data surpasses 1 GB.
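
The 1 GB split can be illustrated with a few lines of arithmetic. In this sketch the first file carries the bare filenode number and later segments get a numeric suffix (.1, .2, and so on); the function name and the example sizes are hypothetical.

```python
SEGMENT_BYTES = 1024 ** 3  # tables are split into files of at most 1 GB


def segment_file_names(relfilenode, table_bytes):
    """File names used for a table of the given size: the bare
    filenode first, then filenode.1, filenode.2, and so on."""
    # Ceiling division; even an empty table has one file.
    count = max(1, -(-table_bytes // SEGMENT_BYTES))
    names = [str(relfilenode)]
    names.extend(f"{relfilenode}.{n}" for n in range(1, count))
    return names


print(segment_file_names(16397, 5 * 1024 ** 3 // 2))  # a 2.5 GB table
# ['16397', '16397.1', '16397.2']
```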

Since version 8.1, it's possible to partition a table over separate (or the same) tablespaces. This is based on PostgreSQL's table inheritance feature, using a capability of the query planner referred to as constraint exclusion.

There exists no capacity for separating out specific columns (like LOBs) into separately defined tablespaces. However, in addition to the data files that represent the table (in multiples of 1 GB) there is a separation of data files for columns within a table that are TOASTed. The PostgreSQL storage system called TOAST (The Oversized-Attribute Storage Technique) automatically stores values larger than a single database page into a secondary storage area per table. The TOAST technique allows for data columns up to 1 GB in size.

As in Oracle, the definition of an index determines which tablespace it resides within. Therefore, it is possible to gain the performance advantage of placing a table's data and its indexes on separate disks, relieving I/O contention during data manipulation.

In Oracle, temporary tablespaces hold sort information and the temporary evaluation space needed for distinct statements and the like. PostgreSQL originally had no concept of a temporary tablespace; however, it does require storage to be able to perform these activities as well. Within the "default" tablespace of the database (defined at database creation) there is a directory called pgsql_tmp. This directory holds the temporary storage needed for the evaluation. The files that get created within the directory exist only while the SQL statement is executing. They grow very fast, and are most likely designed for speed rather than space efficiency. Be aware that disk fragmentation could result from this, and there needs to be sufficient space on the disk to support the user queries. Since release 8.3, temporary tablespaces can be defined using the parameter ''temp_tablespaces''.

== REDO and Archiving ==

PostgreSQL uses ''Write-Ahead Logging'' (WAL) as its approach to transaction logging. WAL's central concept is that changes to data files (where tables and indexes reside) must be written only after those changes have been logged, that is, when log records describing the changes have been flushed to permanent storage. If we follow this procedure, we do not need to flush data pages to disk on every transaction commit, because we know that in the event of a crash we will be able to recover the database using the log: any changes that have not been applied to the data pages can be redone from the log records. (This is roll-forward recovery, also known as REDO.)
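
The write-ahead rule can be shown with a toy key/value store. This is a teaching sketch, not how PostgreSQL is implemented: the class name, the JSON record format, and the in-memory dictionary standing in for data pages are all invented for the example.

```python
import json
import os


class TinyWalStore:
    """Toy store that logs every change before applying it."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.pages = {}  # stands in for data pages; lost on a crash

    def set(self, key, value):
        record = json.dumps({"key": key, "value": value})
        # 1. Append the log record and flush it to permanent storage.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(record + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        # 2. Only then change the "data page"; it need not be flushed now.
        self.pages[key] = value

    @classmethod
    def recover(cls, wal_path):
        """Roll forward (REDO): rebuild the pages by replaying the log."""
        store = cls(wal_path)
        if os.path.exists(wal_path):
            with open(wal_path, encoding="utf-8") as wal:
                for line in wal:
                    record = json.loads(line)
                    store.pages[record["key"]] = record["value"]
        return store
```

After a simulated crash (discarding the store object), TinyWalStore.recover on the same log file returns a store with the same contents, because every change reached the log before it was applied.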

PostgreSQL maintains its WAL in the ''pg_xlog'' subdirectory of the cluster's data directory.

WAL was introduced into PostgreSQL in version 7.1. To maintain database consistency in case of a failure, previous releases forced all data modifications to disk before each transaction commit. With WAL, only one log file must be flushed to disk, greatly improving performance while adding capabilities like Point-In-Time Recovery and transaction archiving.

A PostgreSQL system theoretically produces an indefinitely long sequence of WAL records. The system physically divides this sequence into WAL segment files, which are normally 16MB apiece. The system normally creates a few segment files and then "recycles" them by renaming no-longer-needed segment files to higher segment numbers. If you were to perform a listing of the pg_xlog directory there would always be a handful of files changing names over time.

To add archiving of the WAL files there exists a parameter within the parameter file where a command is added to execute the archival process. Once this is done, operating system "on-line" backups also become available by calling the ''pg_start_backup'' and ''pg_stop_backup'' functions, which mark the beginning and the end of the backup. The server keeps writing to the datafiles during the backup while continuing to write the transactions to the WAL files and executing the archival process; replaying the archived WAL is what makes the copied datafiles consistent at recovery.

WAL archiving and the on-line backup commands were added in version 8.0.

== Rollback or Undo ==

It is interesting how the dynamic allocation of disk space is used for the storage and processing of records within tables. The files that represent the table grow as the table grows. They also grow with the transactions that are performed against it. In Oracle there is a concept of rollback or undo segments that hold the information for rolling back a transaction. In PostgreSQL this data is stored within the file that represents the table. So when deletes and updates are performed on a table, the file that represents the object will still contain the previous data. This space gets reused, but to force recovery of used space, a maintenance process called ''vacuum'' must be executed.
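
The effect can be pictured with a toy model of a table file. This is a deliberately simplified sketch with invented names: each slot is either live, dead (an old row version still occupying space), or free (reclaimed and reusable).

```python
class ToyTable:
    """Toy model of a table file whose old row versions stay in place."""

    def __init__(self):
        self.slots = []  # each slot is a [state, row] pair

    def insert(self, row):
        for slot in self.slots:
            if slot[0] == "free":
                slot[0], slot[1] = "live", row  # reuse reclaimed space
                return
        self.slots.append(["live", row])  # otherwise the file grows

    def delete(self, row):
        for slot in self.slots:
            if slot[0] == "live" and slot[1] == row:
                slot[0] = "dead"  # the old data stays in the file
                return
        raise KeyError(row)

    def update(self, old_row, new_row):
        # An update leaves the old version behind and adds a new one.
        self.delete(old_row)
        self.insert(new_row)

    def vacuum(self):
        """Mark dead slots as reusable; return how many were reclaimed."""
        reclaimed = 0
        for slot in self.slots:
            if slot[0] == "dead":
                slot[0], slot[1] = "free", None
                reclaimed += 1
        return reclaimed

    def live_rows(self):
        return [slot[1] for slot in self.slots if slot[0] == "live"]
```

In this model an update makes the file one slot longer, and only after vacuum can a later insert reuse the space of the old version instead of growing the file again.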

== Server Log File ==

Oracle has the alert log file. PostgreSQL has the server log file. A configuration option can even make the connection information we normally see within Oracle's listener.log appear in PostgreSQL's server log. The parameters within the server configuration file (postgresql.conf) determine the level, location, and name of the log file.

To help with the maintenance of the server log file (it grows rapidly), there exists functionality for rotating the server log file. Parameters can be set to determine when to rotate the file based on the size or age of the file. Management of the old files is then left to the administrator.

== Applications ==

The command ''initdb'' creates a new PostgreSQL database cluster.

The command ''psql'' starts the terminal-based front-end to PostgreSQL or SQL command prompt. Queries and commands can be executed interactively or through files. The psql command prompt has several attractive features:

* Thorough on-line help for both the psql commands and the SQL syntax.
* Command history and line editing.
* SQL commands could exist on multiple lines and are executed only after the semi-colon (;).
* Several SQL commands separated by semi-colons could be entered on a single line.
* Flexible output formatting.
* Multiple object description commands that are superior to Oracle's DESCRIBE.

Depending on the security configurations of the environments, connections can be established locally or remotely through TCP/IP. Depending on how each type of connection is secured, passwords may or may not be required to connect.

The command ''pg_ctl'' is a utility for displaying status, starting, stopping, or restarting the PostgreSQL database server (postgres). Although the server can be started through the postgres executable, pg_ctl encapsulates tasks such as redirecting log output, properly detaching from the terminal and process group, and providing options for controlled shutdown.

The commands ''pg_dump'' and ''pg_restore'' are utilities designed for exporting and importing the contents of a PostgreSQL database. Dumps can be output in either script or archive file formats. The script file format creates plain-text files containing the SQL commands required to reconstruct the database to the state it was at the time it was generated. The archive file format creates a file to be used with pg_restore to rebuild the database.
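
The two output formats lead to different restore commands, which the following sketch spells out as argument lists. The helper names, database names, and file names are placeholders; the archive case uses pg_dump's custom format.

```python
def dump_command(database, out_file, archive=True):
    """Arguments for pg_dump: custom archive format or plain SQL script."""
    fmt = "c" if archive else "p"
    return ["pg_dump", f"--format={fmt}", f"--file={out_file}", database]


def restore_command(database, dump_file, archive=True):
    """Arguments for restoring: archives go through pg_restore,
    script files are simply run with psql."""
    if archive:
        return ["pg_restore", f"--dbname={database}", dump_file]
    return ["psql", f"--dbname={database}", f"--file={dump_file}"]


print(" ".join(dump_command("appdb", "appdb.dump")))
print(" ".join(restore_command("appdb_copy", "appdb.dump")))
```

The lists can be handed to subprocess.run unchanged; the target database of a restore has to exist already unless the restore is told to create it.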

The archive file formats are designed to be portable across architectures. Historically, any type of upgrade to the PostgreSQL software would require a pg_dump of the database prior to the upgrade and a pg_restore after the upgrade. Now, for minor releases (i.e., a change in the third digit, as in 8.2.x), upgrades can be done in place. However, changing versions at the first or second digit still requires a pg_dump/pg_restore.

There exists a graphical tool called [http://www.pgadmin.org/ ''pgAdmin III''] developed separately. It is distributed with the Linux and Windows versions of PostgreSQL. Connection to a database server can be established remotely to perform administrative duties. Because the tool is designed to manage all aspects of the database environment, connection to the database must be through a super user account.

The pgAdmin III tool has the following standard attractive features:

* Intuitive layout
* Tree structure for creating and modifying database objects
* Reviewing and saving of SQL when altering or creating objects

[[Category:Oracle]]

User:Rstephan

2011-01-20T18:04:01Z

Rstephan: New page: My name is Richard Stephan. I currently work at the [http://www.nyiso.com NYISO], and started using PostgreSQL 8.0 in 2005 for my own needs. In January of 2009, I finally convinced manag...

NYCDPUG

2010-05-07T16:50:14Z

Rstephan:

NYCDPUG

2010-05-07T16:49:35Z

Rstephan:

File:SeparationofDuties.pdf

2010-05-07T16:48:28Z

Rstephan: uploaded a new version of "Image:SeparationofDuties.pdf"

File:SeparationofDuties.pdf

2010-05-07T16:46:41Z

Rstephan:

NYCDPUG

2010-05-07T16:45:32Z

Rstephan:

NYCDPUG

2010-03-30T16:46:11Z

Rstephan:

NYCDPUG

2010-03-30T16:43:14Z

Rstephan:

The New York Capital District PostgreSQL Users Group home page can be found at [http://nycdpug.x10hosting.com http://nycdpug.x10hosting.com]. The following contains a listing of past presentations:
* January 7, 2010 - Introduction to PostgreSQL [[Image:NYCDPUG_NYISO_PostgreSQL.pdf]]
* March 4, 2010 - [http://momjian.us/main/writings/pgsql/inside_shmem.pdf Bruce Momjian's Inside PostgreSQL Shared Memory]

File:NYCDPUG NYISO PostgreSQL.pdf

2010-03-30T16:38:44Z

Rstephan:

NYCDPUG

2010-03-30T16:36:04Z

Rstephan:

The New York Capital District PostgreSQL Users Group home page can be found at [http://nycdpug.x10hosting.com http://nycdpug.x10hosting.com]. The following contains a listing of past presentations:
* January 7, 2010 - Introduction to PostgreSQL
* March 4, 2010 - [http://momjian.us/main/writings/pgsql/inside_shmem.pdf Bruce Momjian's Inside PostgreSQL Shared Memory]

NYCDPUG

2010-03-30T16:29:56Z

Rstephan:

The New York Capital District PostgreSQL Users Group home page can be found at [http://nycdpug.x10hosting.com http://nycdpug.x10hosting.com]. The following list contains a listing of presentations:
* January 7, 2010 - PostgreSQL at the NYISO
* March 4, 2010 - [http://momjian.us/main/writings/pgsql/inside_shmem.pdf Bruce Momjian's Inside PostgreSQL Shared Memory]