9.1第五章
数据定义(Data Definition)
本章介绍如何创建一个保存数据的数据库结构。在关系型数据库里,裸数据是存储在表中的,因此本章的大部分内容都将用于介绍如何创建表以及如何修改他们,以及在控制表中存储的数据上有什么可以获得的特性。随后,我们讨论表是如何组织成模式的,以及如何给表赋予权限。最后,我们将简单查看一下影响数据存储的其它因素,比如继承、视图、函数、触发器。
表的基础 (Table Basics)
A table in a relational database is much like a table on paper: It consists of rows and columns. The number and
order of the columns is fixed, and each column has a name. The number of rows is variable — it reflects how much
data is stored at a given moment. SQL does not make any guarantees about the order of the rows in a table. When a
table is read, the rows will appear in an unspecified order, unless sorting is explicitly requested. This is
covered in Chapter 7. Furthermore, SQL does not assign unique identifiers to rows, so it is possible to have
several completely identical rows in a table. This is a consequence of the mathematical model that underlies SQL
but is usually not desirable. Later in this chapter we will see how to deal with this issue.
关系型数据库中的表非常类似纸面上的一张表:它由行和列组成。字段的数目是固定的,每个字段都有一个名字。行的数目是变化的(它反映了任意时刻存储的数据量)。SQL 对行的顺序没有任何承诺(除非你明确地要求排序)。这些内容在章7里介绍。另外,SQL 并不给行赋予唯一的标识,因此我们很可能在一个表中有好几个完全相同的行。这是作为 SQL 基础的下层数学模型的必然结果,但是这通常是我们不愿意看到的。本章稍后的部分将讨论如何处理这个问题。
Each column has a data type. The data type constrains the set of possible values that can be assigned to a column
and assigns semantics to the data stored in the column so that it can be used for computations. For instance, a
column declared to be of a numerical type will not accept arbitrary text strings, and the data stored in such a
column can be used for mathematical computations. By contrast, a column declared to be of a character string type
will accept almost any kind of data but it does not lend itself to mathematical calculations, although other
operations such as string concatenation are available.
每个字段都有一个数据类型。数据类型控制着一个字段所有可能值的集合,并且控制着字段中数据的语义,这样它就可以用于计算。比如,一个声明为数值类型的字段不会接受任意文本字符串,而存储在这种字段里的数据可以用于数学计算。相比之下,一个声明为字符串类型的字段接受几乎任意类型的数据,但是它们不能进行数学计算(不过可以进行像字符串连接之类的操作)。
PostgreSQL includes a sizable set of built-in data types that fit many applications. Users can also define their
own data types. Most built-in data types have obvious names and semantics, so we defer a detailed explanation to
Chapter 8. Some of the frequently used data types are integer for whole numbers, numeric for possibly fractional
numbers, text for character strings, date for dates, time for time-of-day values, and timestamp for values
containing both date and time.
PostgreSQL 包含一套可剪裁的内置数据类型,这些类型可以适用于许多应用。用户也可以定义它们自己的数据类型。大多数内置的数据类型有显而易见的名字和语义,因此我们把详细的解释放在了章8。常用的数据类型有:用于整数的 integer 、用于可能为分数的numeric 、用于字符串的 text 、用于日期的 date 、用于时间的 time 、用于时间戳的 timestamp 。
To create a table, you use the aptly named CREATE TABLE command. In this command you specify at least a name for
the new table, the names of the columns and the data type of each column. For example:
要创建一个表,可用使用 CREATE TABLE 命令。在这个命令里,你至少要为新表声明一个名字,还有各字段的名字以及其数据类型。比如:
CREATE TABLE my_first_table ( first_column text, second_column integer );
This creates a table named my_first_table with two columns. The first column is named first_column and has a data
type of text; the second column has the name second_column and the type integer. The table and column names follow
the identifier syntax explained in Section 4.1.1. The type names are usually also identifiers, but there are some
exceptions. Note that the column list is comma-separated and surrounded by parentheses.
这样就创建了一个有两个字段的名为 my_first_table 的表。第一个字段的名字是 first_column ,数据类型为 text ;第二个字段的名字是 second_column ,数据类型是 integer 。表和字段的名字遵循节4.1.1里面解释的标识符语法。类型名通常也是标识符(但是有一些例外)。请注意字段列表是逗号分隔的,并且用圆括弧包围。
Of course, the previous example was heavily contrived. Normally, you would give names to your tables and columns
that convey what kind of data they store. So let's look at a more realistic example:
当然,前面只是一个非常虚构的例子。通常,你会给表和字段取一个有意义的名字,所以还是让我们给一个比较现实的例子:
CREATE TABLE products ( product_no integer, name text, price numeric );
(The numeric type can store fractional components, as would be typical of monetary amounts.)
(numeric 类型可以存储分数部分,金额很可能有这样的分数部分。)
Tip: When you create many interrelated tables it is wise to choose a consistent naming pattern for the tables and
columns. For instance, there is a choice of using singular or plural nouns for table names, both of which are
favored by some theorist or other.
【提示】如果你创建了许多相互关联的表,那么最好为表和字段选择一致的命名模式。比如,表名字可以统一选择单数或者复数,两种选择都有这样那样的理论家支持。
There is a limit on how many columns a table can contain. Depending on the column types, it is between 250 and
1600. However, defining a table with anywhere near this many columns is highly unusual and often a questionable
design.
一个表能包含的字段数目是有限制的。根据字段类型的不同,这个数目可能在 250 到 1600 之间。不过,不管是哪一端的数字,如果你设计的表包含那么多的字段好像都很不可能发生,否则是设计上有问题的表现。
If you no longer need a table, you can remove it using the DROP TABLE command. For example:
如果你不再需要一个表,那么可以用 DROP TABLE 命令删除它。像这样:
DROP TABLE my_first_table; DROP TABLE products;
Attempting to drop a table that does not exist is an error. Nevertheless, it is common in SQL script files to
unconditionally try to drop each table before creating it, ignoring any error messages, so that the script works
whether or not the table exists. (If you like, you can use the DROP TABLE IF EXISTS variant to avoid the error
messages, but this is not standard SQL.)
试图删除一个不存在的表是一个错误。不过,在 SQL 脚本文件里,我们通常在创建表之前无条件删除它并忽略错误信息。当然你还可以使用 DROP TABLE IF EXISTS 来避免警告信息,不过这并不符合 SQL 标准。
If you need to modify a table that already exists, see Section 5.5 later in this chapter.
如果你需要修改一个已经存在的表,那么可以看看本章稍后的节5.5。
With the tools discussed so far you can create fully functional tables. The remainder of this chapter is concerned
with adding features to the table definition to ensure data integrity, security, or convenience. If you are eager
to fill your tables with data now you can skip ahead to Chapter 6 and read the rest of this chapter later.
使用到目前为止讨论的工具我们可以创建功能完整的表。本章剩下的部分是有关向表定义中增加特性、保证数据完整性、安全性或便利性的内容。如果你急于给表填充数据,那么你可以忽略余下的部分直接到章6,然后在稍后的时候再回来阅读本章。
默认值 (Default Values)
A column can be assigned a default value. When a new row is created and no values are specified for some of the
columns, those columns will be filled with their respective default values. A data manipulation command can also
request explicitly that a column be set to its default value, without having to know what that value is. (Details
about data manipulation commands are in Chapter 6.)
一个字段可以赋予缺省值。如果新创建了一个数据行,而有些字段的数值没有声明,那么这些字段将被填充为它们各自的缺省值。一条数据修改命令也可以明确地要求把一个字段设置为它的缺省值,而不用事先知道这个缺省值是什么。有关数据操作的命令在章6。
If no default value is declared explicitly, the default value is the null value. This usually makes sense because a
null value can be considered to represent unknown data.
如果没有明确声明缺省值,那么缺省值是 NULL 。这么做通常是合理的,因为 NULL 表示"未知"。
In a table definition, default values are listed after the column data type. For example:
在一个表定义里,缺省值是在字段数据类型后面列出的。比如:
CREATE TABLE products ( product_no integer, name text, price numeric DEFAULT 9.99 );
The default value can be an expression, which will be evaluated whenever the default value is inserted (not when
the table is created). A common example is for a timestamp column to have a default of CURRENT_TIMESTAMP, so that
it gets set to the time of row insertion. Another common example is generating a "serial number" for each row. In
PostgreSQL this is typically done by something like:
缺省值可以是一个表达式,它会在插入缺省值的时候计算(不是在创建表的时候)。一个常见的例子是一个 timestamp 字段可能有缺省值CURRENT_TIMESTAMP ,它表示插入行的时刻。另外一个常见的例子是为每一行生成一个"序列号"。在 PostgreSQL 里,通常是用类似下面这样的方法生成的:
CREATE TABLE products ( product_no integer DEFAULT nextval('products_product_no_seq'), ... );
where the nextval() function supplies successive values from a sequence object (see Section 9.15). This arrangement is sufficiently common that there's a special shorthand for it:
这里的 nextval() 从一个序列对象(参阅节9.12)提供后继的数值。这种做法非常普遍,以至于我们有一个专门的缩写用于此目的:
CREATE TABLE products ( product_no SERIAL, ... );
The SERIAL shorthand is discussed further in Section 8.1.4.
SERIAL 缩写在节8.1.4里有进一步描述。
约束(Constraints)
Data types are a way to limit the kind of data that can be stored in a table. For many applications, however, the
constraint they provide is too coarse. For example, a column containing a product price should probably only accept
positive values. But there is no standard data type that accepts only positive numbers. Another issue is that you
might want to constrain column data with respect to other columns or rows. For example, in a table containing
product information, there should be only one row for each product number.
数据类型是限制我们可以在表里存储什么数据的一种方法。不过,对于许多应用来说,这种限制实在是太粗糙了。比如,一个包含产品价格的字段应该只接受正数。但是没有哪种标准数据类型只接受正数。另外一个问题是你可能需要根据其它字段或者其它行的数据来约束字段数据。比如,在一个包含产品信息的表中,每个产品编号都应该只有一行。
To that end, SQL allows you to define constraints on columns and tables. Constraints give you as much control over
the data in your tables as you wish. If a user attempts to store data in a column that would violate a constraint,
an error is raised. This applies even if the value came from the default value definition.
对于这些问题,SQL 允许你在字段和表上定义约束。约束允许你对数据施加任意控制。如果用户企图在字段里存储违反约束的数据,那么就会抛出一个错误。这种情况同时也适用于数值来自缺省值的情况。
检测约束(Check Constraints)
A check constraint is the most generic constraint type. It allows you to specify that the value in a certain column
must satisfy a Boolean (truth-value) expression. For instance, to require positive product prices, you could use:
检查约束是最常见的约束类型。它允许你声明在某个字段里的数值必须使一个布尔表达式为真。比如,要强制一个正数的产品价格,你可以用:
CREATE TABLE products ( product_no integer, name text, price numeric CHECK (price > 0) );
As you see, the constraint definition comes after the data type, just like default value definitions. Default
values and constraints can be listed in any order. A check constraint consists of the key word CHECK followed by an
expression in parentheses. The check constraint expression should involve the column thus constrained, otherwise
the constraint would not make too much sense.
如你所见,约束定义在数据类型之后,就好像缺省值定义一样。缺省值和约束可以按任意顺序排列。一个检查约束由一个关键字 CHECK 后面跟一个放在圆括弧里的表达式组成。检查约束表达式应该包含受约束的字段,否则这个约束就没什么意义了。
You can also give the constraint a separate name. This clarifies error messages and allows you to refer to the
constraint when you need to change it. The syntax is:
你还可以给这个约束取一个独立的名字。这样就可以令错误信息更清晰,并且在你需要修改它的时候引用这个名字。语法是:
CREATE TABLE products ( product_no integer, name text, price numeric CONSTRAINT positive_price CHECK (price > 0) );
So, to specify a named constraint, use the key word CONSTRAINT followed by an identifier followed by the constraint
definition. (If you don't specify a constraint name in this way, the system chooses a name for you.)
因此,要声明一个命名约束,使用关键字 CONSTRAINT 后面跟一个标识符(作为名字),然后再跟约束定义。如果你不用这个方法声明约束,那么系统会自动为你选择一个名字。
A check constraint can also refer to several columns. Say you store a regular price and a discounted price, and you
want to ensure that the discounted price is lower than the regular price:
一个检查约束也可以引用多个字段。假设你存储一个正常价格和一个折扣价,并且你想保证折扣价比正常价低。
CREATE TABLE products ( product_no integer, name text, price numeric CHECK (price > 0), discounted_price numeric CHECK (discounted_price > 0), CHECK (price > discounted_price) );
The first two constraints should look familiar. The third one uses a new syntax. It is not attached to a particular
column, instead it appears as a separate item in the comma-separated column list. Column definitions and these
constraint definitions can be listed in mixed order.
头两个约束看上去很面熟。第三个使用了一个新的语法。它没有附着在某个字段上,而是在逗号分隔的字段列表中以一个独立行的形式出现。字段定义和约束定义可以按照任意顺序列出。
We say that the first two constraints are column constraints, whereas the third one is a table constraint because
it is written separately from any one column definition. Column constraints can also be written as table
constraints, while the reverse is not necessarily possible, since a column constraint is supposed to refer to only
the column it is attached to. (PostgreSQL doesn't enforce that rule, but you should follow it if you want your
table definitions to work with other database systems.) The above example could also be written as:
我们称头两个约束是"字段约束",而第三个约束是"表约束"(和字段定义分开写)。字段约束也可以写成表约束,而反过来很可能不行,因为系统假设字段约束只引用它所从属的字段。PostgreSQL 并不强制这条规则,但是如果你希望自己的表定义可以和其它数据库系统兼容,那么你最好还是遵循这条规则。上面的例子也可以这么写:
CREATE TABLE products ( product_no integer, name text, price numeric, CHECK (price > 0), discounted_price numeric, CHECK (discounted_price > 0), CHECK (price > discounted_price) );
or even:
或者是
CREATE TABLE products ( product_no integer, name text, price numeric CHECK (price > 0), discounted_price numeric, CHECK (discounted_price > 0 AND price > discounted_price) );
It's a matter of taste.
这只是风格的不同。
Names can be assigned to table constraints in the same way as column constraints:
和字段约束一样,我们也可以给表约束赋予名称,方法也相同:
CREATE TABLE products ( product_no integer, name text, price numeric, CHECK (price > 0), discounted_price numeric, CHECK (discounted_price > 0), CONSTRAINT valid_discount CHECK (price > discounted_price) );
It should be noted that a check constraint is satisfied if the check expression evaluates to true or the null
value. Since most expressions will evaluate to the null value if any operand is null, they will not prevent null
values in the constrained columns. To ensure that a column does not contain null values, the not-null constraint
described in the next section can be used.
我们还要注意的是,当约束表达式计算结果为 NULL 的时候,检查约束会被认为是满足条件的。因为大多数表达式在含有 NULL 操作数的时候结果都是 NULL ,所以这些约束不能阻止字段值为 NULL 。要确保一个字段值不为 NULL ,可以使用下一节介绍的非空约束。
非空约束(Not-Null Constraints)
A not-null constraint simply specifies that a column must not assume the null value. A syntax example:
非空约束只是简单地声明一个字段必须不能是 NULL 。下面是一个例子:
CREATE TABLE products ( product_no integer NOT NULL, name text NOT NULL, price numeric );
A not-null constraint is always written as a column constraint. A not-null constraint is functionally equivalent to
creating a check constraint CHECK (column_name IS NOT NULL), but in PostgreSQL creating an explicit not-null
constraint is more efficient. The drawback is that you cannot give explicit names to not-null constraints created
this way.
一个非空约束总是写成一个字段约束。非空约束在功能上等效于创建一个检查约束 CHECK (column_name IS NOT NULL) ,但在 PostgreSQL 里,创建一个明确的非空约束效率更高。缺点是你不能给它一个明确的名字。
Of course, a column can have more than one constraint. Just write the constraints one after another:
当然,一个字段可以有多个约束。只要一个接着一个写就可以了:
CREATE TABLE products ( product_no integer NOT NULL, name text NOT NULL, price numeric NOT NULL CHECK (price > 0) );
The order doesn't matter. It does not necessarily determine in which order the constraints are checked.
它们的顺序无所谓。顺序并不影响约束检查的顺序。
The NOT NULL constraint has an inverse: the NULL constraint. This does not mean that the column must be null, which
would surely be useless. Instead, this simply selects the default behavior that the column might be null. The NULL
constraint is not present in the SQL standard and should not be used in portable applications. (It was only added
to PostgreSQL to be compatible with some other database systems.) Some users, however, like it because it makes it
easy to toggle the constraint in a script file. For example, you could start with:
NOT NULL 约束有个相反的约束:NULL 约束。它并不意味着该字段必须是空,因为这样的字段也没用。它只是定义了该字段可以为空的这个缺省行为。在 SQL 标准里没有定义 NULL 约束,因此不应该在可移植的应用中使用它。在 PostgreSQL 里面增加这个约束只是为了和其它数据库系统兼容。不过,有些用户喜欢它,因为这个约束可以让他们很容易在脚本文件里切换约束。比如,你可以从下面这样开始
CREATE TABLE products ( product_no integer NULL, name text NULL, price numeric NULL );
and then insert the NOT key word where desired.
然后在需要的时候插入 NOT 关键字。
Tip: In most database designs the majority of columns should be marked not null.
【提示】在大多数数据库设计里,主要的字段都应该标记为非空。
唯一约束(Unique Constraints)
Unique constraints ensure that the data contained in a column or a group of columns is unique with respect to all
the rows in the table. The syntax is:
唯一约束保证在一个字段或者一组字段里的数据与表中其它行的数据相比是唯一的。它的语法是:
CREATE TABLE products ( product_no integer UNIQUE, name text, price numeric );
when written as a column constraint, and:
上面是写成字段约束,下面这个则写成表约束:
CREATE TABLE products ( product_no integer, name text, price numeric, UNIQUE (product_no) );
when written as a table constraint.
当写一个表约束的时候
If a unique constraint refers to a group of columns, the columns are listed separated by commas:
如果一个唯一约束引用一组字段,那么这些字段用逗号分隔列出:
CREATE TABLE example ( a integer, b integer, c integer, UNIQUE (a, c) );
This specifies that the combination of values in the indicated columns is unique across the whole table, though any
one of the columns need not be (and ordinarily isn't) unique.
这样就声明了特定字段值的组合在整个表范围内是唯一的。但是这些字段中的某个单独值可以不必是(并且通常也确实不是)唯一的。
You can assign your own name for a unique constraint, in the usual way:
你也可以给唯一约束赋予一个自己定义的名字,方法与前面相同:
CREATE TABLE products ( product_no integer CONSTRAINT must_be_different UNIQUE, name text, price numeric );
Adding a unique constraint will automatically create a unique btree index on the column or group of columns used in
the constraint.
添加一个唯一约束会在一个使用约束的列或列组中自动创建中使用唯一的btree索引
In general, a unique constraint is violated when there is more than one row in the table where the values of all of
the columns included in the constraint are equal. However, two null values are not considered equal in this
comparison. That means even in the presence of a unique constraint it is possible to store duplicate rows that
contain a null value in at least one of the constrained columns. This behavior conforms to the SQL standard, but we
have heard that other SQL databases might not follow this rule. So be careful when developing applications that are
intended to be portable.
通常,如果包含在唯一约束中的那几个字段在表中有多个相同的行,就违反了唯一约束。但是在这种比较中,NULL 被认为是不相等的。这就意味着,在多字段唯一约束的情况下,如果在至少一个字段上出现 NULL ,那么我们还是可以存储同样的这种数据行。这种行为遵循 SQL 标准,但是我们听说其它 SQL 数据库可能不遵循这个标准。因此如果你要开发可移植的程序,那么最好仔细些。
主键(Primary Keys)
Technically, a primary key constraint is simply a combination of a unique constraint and a not-null constraint. So,
the following two table definitions accept the same data:
从技术上讲,主键约束只是唯一约束和非空约束的组合。所以,下面两个表定义是等价的:
CREATE TABLE products ( product_no integer UNIQUE NOT NULL, name text, price numeric );
CREATE TABLE products ( product_no integer PRIMARY KEY, name text, price numeric );
Primary keys can also constrain more than one column; the syntax is similar to unique constraints:
主键也可以约束多于一个字段;其语法类似于唯一约束:
CREATE TABLE example ( a integer, b integer, c integer, PRIMARY KEY (a, c) );
A primary key indicates that a column or group of columns can be used as a unique identifier for rows in the table.
(This is a direct consequence of the definition of a primary key. Note that a unique constraint does not, by
itself, provide a unique identifier because it does not exclude null values.) This is useful both for documentation
purposes and for client applications. For example, a GUI application that allows modifying row values probably
needs to know the primary key of a table to be able to identify rows uniquely.
主键表示一个或多个字段的组合可以用于唯一标识表中的数据行。这是定义一个主键的直接结果。请注意:一个唯一约束实际上并不能提供一个唯一标识,因为它不排除 NULL 。这个功能对文档目的和客户应用都很有用。比如,一个可以修改行数值的 GUI 应用可能需要知道一个表的主键才能唯一地标识每一行。
Adding a primary key will automatically create a unique btree index on the column or group of columns used in the
primary key. 添加一个主健将在使用主健的列或列组中自动创建一个唯一的btree索引
A table can have at most one primary key. (There can be any number of unique and not-null constraints, which are
functionally the same thing, but only one can be identified as the primary key.) Relational database theory
dictates that every table must have a primary key. This rule is not enforced by PostgreSQL, but it is usually best
to follow it.
一个表最多可以有一个主键(但是它可以有多个唯一和非空约束)。关系型数据库理论告诉我们,每个表都必须有一个主键。PostgreSQL 并不强制这个规则,但我们最好还是遵循它。
外键(Foreign Keys)
A foreign key constraint specifies that the values in a column (or a group of columns) must match the values
appearing in some row of another table. We say this maintains the referential integrity between two related tables.
Say you have the product table that we have used several times already:
外键约束声明一个字段(或者一组字段)的数值必须匹配另外一个表中出现的数值。我们把这个行为称为两个相关表之间的参照完整性。
假设你有个产品表,我们可能使用了好几次:
CREATE TABLE products ( product_no integer PRIMARY KEY, name text, price numeric );
Let's also assume you have a table storing orders of those products. We want to ensure that the orders table only
contains orders of products that actually exist. So we define a foreign key constraint in the orders table that
references the products table:
假设你有一个存储这些产品的订单的表。我们想保证订单表只包含实际存在的产品。因此我们在订单表中定义一个外键约束引用产品表:
CREATE TABLE orders ( order_id integer PRIMARY KEY, product_no integer REFERENCES products (product_no), quantity integer );
Now it is impossible to create orders with product_no entries that do not appear in the products table.
We say that in this situation the orders table is the referencing table and the products table is the referenced
table. Similarly, there are referencing and referenced columns.
You can also shorten the above command to:
现在,我们不能创建任何其 product_no 没有在产品表中出现的订单。
在这种情况下我们把订单表叫做引用表,而产品表叫做被引用表。同样,也有引用字段和被引用字段。
你也可以把上面的命令简写成
CREATE TABLE orders ( order_id integer PRIMARY KEY, product_no integer REFERENCES products, quantity integer );
because in absence of a column list the primary key of the referenced table is used as the referenced column(s).
A foreign key can also constrain and reference a group of columns. As usual, it then needs to be written in table
constraint form. Here is a contrived syntax example:
因为如果缺少字段列表的话,就会引用被引用表的主键。
一个外键也可以约束和引用一组字段。同样,也需要写成表约束的形式。下面是一个捏造出来的语法例子:
CREATE TABLE t1 ( a integer PRIMARY KEY, b integer, c integer, FOREIGN KEY (b, c) REFERENCES other_table (c1, c2) );
Of course, the number and type of the constrained columns need to match the number and type of the referenced
columns.
You can assign your own name for a foreign key constraint, in the usual way.
当然,被约束的字段数目和类型需要和被引用字段数目和类型一致。
和平常一样,你也可以给外键约束赋予自定义的名字。
A table can contain more than one foreign key constraint. This is used to implement many-to-many relationships
between tables. Say you have tables about products and orders, but now you want to allow one order to contain
possibly many products (which the structure above did not allow). You could use this table structure:
一个表可以包含多于一个外键约束。这个特性用于实现表之间的多对多关系。比如你有关于产品和订单的表,但现在你想允许一个订单可以包含多种产品(上面那个结构是不允许这么做的),你可以使用这样的结构:
CREATE TABLE products ( product_no integer PRIMARY KEY, name text, price numeric );
CREATE TABLE orders ( order_id integer PRIMARY KEY, shipping_address text, ... );
CREATE TABLE order_items ( product_no integer REFERENCES products, order_id integer REFERENCES orders, quantity integer, PRIMARY KEY (product_no, order_id) );
Notice that the primary key overlaps with the foreign keys in the last table.
We know that the foreign keys disallow creation of orders that do not relate to any products. But what if a product
is removed after an order is created that references it? SQL allows you to handle that as well. Intuitively, we
have a few options:
注意最后的表的主键和外键是重叠的。
我们知道外键不允许创建和任何产品都无关的订单。但是如果一个订单创建之后其引用的产品被删除了怎么办?SQL 也允许你处理这个问题。简单说,我们有几种选择:
Disallow deleting a referenced product
不允许删除一个被引用的产品
Delete the orders as well
同时也删除订单
Something else?
其它的?
To illustrate this, let's implement the following policy on the many-to-many relationship example above: when
someone wants to remove a product that is still referenced by an order (via order_items), we disallow it. If
someone removes an order, the order items are removed as well:
为了说明这个问题,我们对上面的多对多关系制定下面的策略:如果有人想删除一种仍然被某个订单引用的产品(通过 order_items),那么就不允许这么做。如果有人删除了一个订单,那么订单项也被删除。
CREATE TABLE products ( product_no integer PRIMARY KEY, name text, price numeric );
CREATE TABLE orders ( order_id integer PRIMARY KEY, shipping_address text, ... );
CREATE TABLE order_items ( product_no integer REFERENCES products ON DELETE RESTRICT, order_id integer REFERENCES orders ON DELETE CASCADE, quantity integer, PRIMARY KEY (product_no, order_id) );
Restricting and cascading deletes are the two most common options. RESTRICT prevents deletion of a referenced row.
NO ACTION means that if any referencing rows still exist when the constraint is checked, an error is raised; this
is the default behavior if you do not specify anything. (The essential difference between these two choices is that
NO ACTION allows the check to be deferred until later in the transaction, whereas RESTRICT does not.) CASCADE
specifies that when a referenced row is deleted, row(s) referencing it should be automatically deleted as well.
There are two other options: SET NULL and SET DEFAULT. These cause the referencing columns to be set to nulls or
default values, respectively, when the referenced row is deleted. Note that these do not excuse you from observing
any constraints. For example, if an action specifies SET DEFAULT but the default value would not satisfy the
foreign key, the operation will fail.
限制和级联删除是两种最常见的选项。RESTRICT 禁止删除被引用的行。NO ACTION 的意思是如果在检查约束的时候还存在任何引用行,则抛出错误;如果你不声明任何东西,那么它就是缺省的行为。这两个选择的实际区别是:NO ACTION 允许约束检查推迟到事务的晚些时候,而 RESTRICT 不行。CASCADE 声明在删除一个被引用的行的时候,所有引用它的行也会被自动删除掉。在外键字段上的动作还有两个选项:SET NULL 和 SET DEFAULT ,它们导致在被引用行删除的时候,将引用它们的字段分别设置为 NULL 和缺省值。请注意这些选项并不能让你逃脱被观察和约束的境地。比如,如果一个动作声明 SET DEFAULT ,但是缺省值并不能满足外键,那么该动作就会失败。
Analogous to ON DELETE there is also ON UPDATE which is invoked when a referenced column is changed (updated). The
possible actions are the same.
与 ON DELETE 类似的还有 ON UPDATE 选项,它是在被引用字段修改(更新)的时候调用的,可用的动作是一样的。
Since a DELETE of a row from the referenced table or an UPDATE of a referenced column will require a scan of the
referencing table for rows matching the old value, it is often a good idea to index the referencing columns.
Because this is not always needed, and there are many choices available on how to index, declaration of a foreign
key constraint does not automatically create an index on the referencing columns.
More information about updating and deleting data is in Chapter 6. 删除引用表的一行或更新引用列需要为比配旧的值扫描相关表,有个好办法就是使用引用列的索引,因为这个不总是需要的, 这里也有好几个选择如何去索引,声明一个外键约束不要需要自动创建一个引用列的索引
有关更新和删除数据的更多信息可以在章6里找到。
Finally, we should mention that a foreign key must reference columns that either are a primary key or form a unique
constraint. If the foreign key references a unique constraint, there are some additional possibilities regarding
how null values are matched. These are explained in the reference documentation for CREATE TABLE.
最后,我们应该说明的是,一个外键必须要么引用一个主键,要么引用一个唯一约束。如果外键引用了一个唯一约束,那么在如何匹配 NULL 这个问题上还有一些其它的可能性。这些东西都在 CREATE TABLE 中解释。
排除约束( Exclusion Constraints)
Exclusion constraints ensure that if any two rows are compared on the specified columns or expressions using the
specified operators, at least one of these operator comparisons will return false or null. The syntax is:
排除约束虚确保使用指定操作比较两行指定的列或表达式相同,最后可操作比较可能会返回 false 或 null, 语法为
CREATE TABLE circles ( c circle, EXCLUDE USING gist (c WITH &&) );
See also CREATE TABLE ... CONSTRAINT ... EXCLUDE for details.
查看CREATE TABLE ... CONSTRAINT ... EXCLUDE 了解细节
Adding an exclusion constraint will automatically create an index of the type specified in the constraint
declaration.
添加一个排除约束将自动创建类型指定约束声明的索引
系统字短(System Columns)
Every table has several system columns that are implicitly defined by the system. Therefore, these names cannot be
used as names of user-defined columns. (Note that these restrictions are separate from whether the name is a key
word or not; quoting a name will not allow you to escape these restrictions.) You do not really need to be
concerned about these columns; just know they exist.
每个表都有几个系统字段,这些字段是由系统隐含定义的。因此,这些名字不能用于用户定义的字段名。请注意这些限制与这个名字是否关键字无关,把名字用引号括起来并不能让你逃离这些限制。你实际上不需要注意这些字段,只要知道它们存在就可以了。
oid The object identifier (object ID) of a row. This column is only present if the table was created using WITH OIDS,
or if the default_with_oids configuration variable was set at the time. This column is of type oid (same name as
the column); see Section 8.16 for more information about the type.
行对象标识符(对象ID)。这个字段只有在创建表的时候使用了 WITH OIDS 或者是配置参数 default_with_oids 的值为真时出现。这个字段的类型是 oid(和字段同名)。参阅节8.16获取有关这种类型的更多信息。
tableoid The OID of the table containing this row. This column is particularly handy for queries that select from
inheritance hierarchies (see Section 5.8), since without it, it's difficult to tell which individual table a row
came from. The tableoid can be joined against the oid column of pg_class to obtain the table name.
包含本行的表的 OID 。这个字段对那些从继承层次中选取的查询特别有用(参阅节5.8),因为如果没有它的话,我们就很难说明一行来自哪个独立的表。tableoid 可以和 pg_class 的 oid 字段连接起来获取表名字。
xmin The identity (transaction ID) of the inserting transaction for this row version. (A row version is an individual
state of a row; each update of a row creates a new row version for the same logical row.)
插入该行版本的事务标识(事务ID)。注意:在这个环境里,一个行版本是一行的一个状态;一行的每次更新都为同一个逻辑行创建一个新的行版本。
cmin The command identifier (starting at zero) within the inserting transaction.
在插入事务内部的命令标识(从零开始)。
xmax The identity (transaction ID) of the deleting transaction, or zero for an undeleted row version. It is possible for
this column to be nonzero in a visible row version. That usually indicates that the deleting transaction hasn't
committed yet, or that an attempted deletion was rolled back.
删除事务的标识(事务ID),如果不是被删除的行版本,那么是零。在一个可见行版本里,这个字段有可能是非零。这通常意味着删除事务还没有提交,或者是一个删除的企图被回滚掉了。
cmax The command identifier within the deleting transaction, or zero.
删除事务内部的命令标识符,或者是零。
ctid The physical location of the row version within its table. Note that although the ctid can be used to locate the
row version very quickly, a row's ctid will change if it is updated or moved by VACUUM FULL. Therefore ctid is
useless as a long-term row identifier. The OID, or even better a user-defined serial number, should be used to
identify logical rows.
一个行版本在它所处的表内的物理位置。请注意,尽管 ctid 可以用于非常快速地定位行版本,但每次 VACUUM FULL 之后,一个行的 ctid 都会被更新或者移动。因此 ctid 是不能作为长期的行标识符的。应该使用 OID ,或者更好是用户定义的序列号,来标识一个逻辑行。
OIDs are 32-bit quantities and are assigned from a single cluster-wide counter. In a large or long-lived database,
it is possible for the counter to wrap around. Hence, it is bad practice to assume that OIDs are unique, unless you
take steps to ensure that this is the case. If you need to identify the rows in a table, using a sequence generator
is strongly recommended. However, OIDs can be used as well, provided that a few additional precautions are taken:
OID 是32位的量,是在同一个集群内通用的计数器上赋值的。对于一个大型或者长时间使用的数据库,这个计数器是有可能重叠的。因此,假定 OID 唯一是非常错误的,除非你自己采取了措施来保证它们是唯一的。如果你需要标识表中的行,我们强烈建议使用序列号生成器。不过,也可以使用 OID ,只要采取几个注意事项即可:
A unique constraint should be created on the OID column of each table for which the OID will be used to identify
rows. When such a unique constraint (or unique index) exists, the system takes care not to generate an OID matching
an already-existing row. (Of course, this is only possible if the table contains fewer than 232 (4 billion) rows,
and in practice the table size had better be much less than that, or performance might suffer.)
在使用 OID 标识行的每个表的 OID 字段创建一个唯一约束。在唯一约束(或者唯一索引)存在的时候,系统会注意不去生成一个和现有行相同的 OID 。当然,只有在表中的数据行少于 232(40亿)行的时候才是可能的,而实际上表中的行最好远比这个小,要不性能就会受影响了。
OIDs should never be assumed to be unique across tables; use the combination of tableoid and row OID if you need a
database-wide identifier.
绝对不要假设 OID 是跨表唯一的;如果你需要全数据库范围内的标识,请使用 tableoid 和行的 OID 的组合。
Of course, the tables in question must be created WITH OIDS. As of PostgreSQL 8.1, WITHOUT OIDS is the default.
需要 OID 的表应该带着 WITH OIDS 创建。从 PostgreSQL 8.1开始,WITHOUT OIDS 是缺省的。
Transaction identifiers are also 32-bit quantities. In a long-lived database it is possible for transaction IDs to
wrap around. This is not a fatal problem given appropriate maintenance procedures; see Chapter 23 for details. It
is unwise, however, to depend on the uniqueness of transaction IDs over the long term (more than one billion
transactions).
事务标识符也是32位的量。在长时间运转的数据库里,它也可能会重叠。只要我们采取一些合适的维护步骤,这并不是很要命的问题;参阅章22获取细节。不过,在长时间运行的环境里(超过十亿次事务)依赖事务ID的唯一性并非明智的做法。
Command identifiers are also 32-bit quantities. This creates a hard limit of 232 (4 billion) SQL commands within a
single transaction. In practice this limit is not a problem — note that the limit is on the number of SQL
commands, not the number of rows processed. Also, as of PostgreSQL 8.3, only commands that actually modify the
database contents will consume a command identifier.
命令标识符也是32位的量。这样就在一个事务里有 232(四十亿)条 SQL 命令的硬限制。在现实里这个限制应该不是什么问题,需要注意的是这个限制是 SQL 命令的条数,而不是处理的行版本的条数。
修改表(Modifying Tables)
When you create a table and you realize that you made a mistake, or the requirements of the application change, you
can drop the table and create it again. But this is not a convenient option if the table is already filled with
data, or if the table is referenced by other database objects (for instance a foreign key constraint). Therefore
PostgreSQL provides a family of commands to make modifications to existing tables. Note that this is conceptually
distinct from altering the data contained in the table: here we are interested in altering the definition, or
structure, of the table.
如果你创建了一个表后发现自己犯了一个错误,或者是应用的需求发生了变化,那么你可以删除这个表然后重新创建它。但是如果这个表已经填充了许多数据,或者该表已经被其它数据库对象引用(比如一个外键约束),那这可不是一个方便的方法。因此 PostgreSQL 提供了一族命令用于修改现有表。请注意它在概念上和修改一个表中包含的数据是不一样的:这里我们感兴趣的是修改一个表的定义,或者说结构。
You can:
你可以
Add columns
增加字段
Remove columns
删除字段
Add constraints
增加约束
Remove constraints
删除约束
Change default values
修改缺省值
Change column data types
修改字段数据类型
Rename columns
重命名字段
Rename tables
重命名表
All these actions are performed using the ALTER TABLE command, whose reference page contains details beyond those
所有这些动作都是用 ALTER TABLE 命令执行的。
given here.
填加一列(Adding a Column)
To add a column, use a command like:
要增加一个字段,使用下面这样的命令:
ALTER TABLE products ADD COLUMN description text;
The new column is initially filled with whatever default value is given (null if you don't specify a DEFAULT
clause).
You can also define constraints on the column at the same time, using the usual syntax:
新增的字段对于表中已经存在的行而言最初将先填充所给出的缺省值(如果你没有声明 DEFAULT 子句,那么缺省是 NULL)。
你也可以同时在该字段上定义约束,使用通常的语法:
ALTER TABLE products ADD COLUMN description text CHECK (description <> '');
In fact all the options that can be applied to a column description in CREATE TABLE can be used here. Keep in mind
however that the default value must satisfy the given constraints, or the ADD will fail. Alternatively, you can add
constraints later (see below) after you've filled in the new column correctly.
实际上,所有在 CREATE TABLE 里描述的可以应用于字段的选项都可以在这里使用。不过,我们要注意的是缺省值必须满足给出的约束,否则 ADD 将会失败。另外,你可以在正确填充了新字段的数值之后再增加约束(见下文)。
Tip: Adding a column with a default requires updating each row of the table (to store the new column value).
However, if no default is specified, PostgreSQL is able to avoid the physical update. So if you intend to fill the
column with mostly nondefault values, it's best to add the column with no default, insert the correct values using
UPDATE, and then add any desired default as described below.
【提示】添加一个字段并填充缺省值将会导致更新表中的所有行(为了存储新字段的值),但如果没有指定缺省值,PostgreSQL 就可以避免物理更新。所以如果你将要在新字段中填充的值大多数都不等于缺省值,那么最好添加一个没有缺省值的字段,然后再使用 UPDATE 更新数据,最后使用下面的方法添加缺省值。
移去一列(Removing a Column)
To remove a column, use a command like:
要删除一个字段,使用下面这样的命令:
ALTER TABLE products DROP COLUMN description;
Whatever data was in the column disappears. Table constraints involving the column are dropped, too. However, if
the column is referenced by a foreign key constraint of another table, PostgreSQL will not silently drop that
constraint. You can authorize dropping everything that depends on the column by adding CASCADE:
不管字段里有啥数据,都会消失,和这个字段相关的约束也会被删除。不过,如果这个字段被另一个表的外键所引用,PostgreSQL 则不会隐含地删除该约束。你可以通过使用 CASCADE 指明删除任何依赖该字段的东西:
ALTER TABLE products DROP COLUMN description CASCADE;
See Section 5.12 for a description of the general mechanism behind this.
参阅节5.12获取有关这些操作背后的机制的信息。
添加一个约束 (Adding a Constraint)
To add a constraint, the table constraint syntax is used. For example:
要增加一个约束,必须使用表约束语法。比如:
ALTER TABLE products ADD CHECK (name <> ''); ALTER TABLE products ADD CONSTRAINT some_name UNIQUE (product_no); ALTER TABLE products ADD FOREIGN KEY (product_group_id) REFERENCES product_groups;
To add a not-null constraint, which cannot be written as a table constraint, use this syntax:
要增加一个不能写成表约束的非空约束,使用下面的语法:
ALTER TABLE products ALTER COLUMN product_no SET NOT NULL;
The constraint will be checked immediately, so the table data must satisfy the constraint before it can be added.
这个约束将立即进行检查,所以表在添加约束之前必须符合约束条件。
移去一个约束(Removing a Constraint)
To remove a constraint you need to know its name. If you gave it a name then that's easy. Otherwise the system
assigned a generated name, which you need to find out. The psql command \d tablename can be helpful here; other
interfaces might also provide a way to inspect table details. Then the command is:
要删除一个约束,你需要知道它的名字。如果你曾经给了它取了名字,那么事情就很简单。否则你就需要找出系统自动分配的名字。psql 的命令 \d tablename 可以这个帮忙;其它接口可能也提供了检查表的细节的方法。然后就是这条命令:
ALTER TABLE products DROP CONSTRAINT some_name;
(If you are dealing with a generated constraint name like $2, don't forget that you'll need to double-quote it to
make it a valid identifier.)
As with dropping a column, you need to add CASCADE if you want to drop a constraint that something else depends on.
An example is that a foreign key constraint depends on a unique or primary key constraint on the referenced column
(s).
This works the same for all constraint types except not-null constraints. To drop a not null constraint use:
如果你在处理一个生成的约束名,比如 $2 ,别忘了你需要给它添加双引号,让它成为一个有效的标识符。
和删除字段一样,如果你想删除被依赖的约束,你需要用 CASCADE 。一个例子是某个外键约束依赖被引用字段上的唯一约束或者主键约束。
除了非空约束外,所有约束类型都这么用。要删除非空约束,可以这样:
ALTER TABLE products ALTER COLUMN product_no DROP NOT NULL;
(Recall that not-null constraints do not have names.)
要记得非空约束没有名字。
改变一个列的默认值 (Changing a Column's Default Value)
To set a new default for a column, use a command like:
要给一个字段设置缺省值,可以使用一个像下面这样的命令:
ALTER TABLE products ALTER COLUMN price SET DEFAULT 7.77;
Note that this doesn't affect any existing rows in the table, it just changes the default for future INSERT
commands.
To remove any default value, use:
请注意这么做不会影响任何表中现有的数据行,它只是为将来的 INSERT 命令改变缺省值。
要删除缺省值,可以用
ALTER TABLE products ALTER COLUMN price DROP DEFAULT;
This is effectively the same as setting the default
这样实际上相当于把缺省设置为空。结果是,如果我们删除一个还没有定义的缺省值不算错误,因为缺省隐含就是 NULL 。
改变一列的数据类型(Changing a Column's Data Type)
To convert a column to a different data type, use a command like:
把一个字段转换成另外一种数据类型,使用下面的命令:
ALTER TABLE products ALTER COLUMN price TYPE numeric(10,2);
This will succeed only if each existing entry in the column can be converted to the new type by an implicit cast.
If a more complex conversion is needed, you can add a USING clause that specifies how to compute the new values
from the old.
只有在字段里现有的每个项都可以隐含的转换城新类型时才可能成功。如果需要更复杂的转换,你可以增加一个 USING 子句,它声明如何从旧值里计算新值。
PostgreSQL will attempt to convert the column's default value (if any) to the new type, as well as any constraints
that involve the column. But these conversions might fail, or might produce surprising results. It's often best to
drop any constraints on the column before altering its type, and then add back suitably modified constraints
afterwards.
PostgreSQL 将试图把字段的缺省值(如果存在)转换成新的类型,还有涉及该字段的任何约束。但是这些转换可能失败,或者可能生成奇怪的结果。在修改某字段类型之前,你最好删除那些约束,然后再把合适的约束添加上去。
重新命名一列(Renaming a Column)
To rename a column:
重命名一个字段:
ALTER TABLE products RENAME COLUMN product_no TO product_number;
重新命名一表(Renaming a Table)
To rename a table:
重命名一个表: ALTER TABLE products RENAME TO items;
权限(Privileges)
When an object is created, it is assigned an owner. The owner is normally the role that executed the creation
statement. For most kinds of objects, the initial state is that only the owner (or a superuser) can do anything
with the object. To allow other roles to use it, privileges must be granted. There are several different kinds of
privilege: SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER, CREATE, CONNECT, TEMPORARY, EXECUTE, and
USAGE. For more information on the different types of privileges supported by PostgreSQL, see the GRANT reference
page.
如果你创建了一个数据库对象,那么你就成为它的所有者。对象的所有者是合法的角色可以执行创建声明。对于大多数种类的对象,初始 状态中只有所有者(或超级用户)可以对这个对象做任何事。可以让其他用户使用它,必须赋予他们权限。有几种不同类型的权限: ELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER, CREATE, CONNECT, TEMPORARY, EXECUTE, and USAGE.如需更多有关PostgreSQL的支持权限的不同类型的信息,请参阅授予参考页
There are several different privileges: SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER, CREATE,
CONNECT, TEMPORARY, EXECUTE, and USAGE. The privileges applicable to a particular object vary depending on the
object's type (table, function, etc). For complete information on the different types of privileges supported by
PostgreSQL, refer to the GRANT reference page. The following sections and chapters will also show you how those
privileges are used.
有好多种不同的权限:SELECT, INSERT, UPDATE, DELETE, REFERENCES, TRIGGER, CREATE, CONNECT, TEMPORARY, EXECUTE, USAGE 。适用于特定对象的权限因对象类型(表/函数等)不同而不同。有关 PostgreSQL 所支持的不同类型的权限的完整信息,请参考 GRANT 的手册页。下面的章节将为你展示如何利用这些权限。
The right to modify or destroy an object is always the privilege of the owner only.
修改或者删除一个对象的权限永远是所有者独有的权限。
An object can be assigned to a new owner with an ALTER command of the appropriate kind for the object, e.g. ALTER
TABLE. Superusers can always do this; ordinary roles can only do it if they are both the current owner of the
object (or a member of the owning role) and a member of the new owning role.
一个对象可以为对象用一个合适类型ALTER命令分配到一个新的所有者,例如ALTER TABLE,超级用户总是可以做到这一点,如果是
当前对象的所有者(或所有者角色之一)或`新`所有者角色的一个成员,普通角色也可以做到这点
To assign privileges, the GRANT command is used. For example, if joe is an existing user, and accounts is an
existing table, the privilege to update the table can be granted with:
分配,使用GRANT命令。例如,如果joe是现有用户,帐户是现有的表,以更新的表的权限:
GRANT UPDATE ON accounts TO joe;
Writing ALL in place of a specific privilege grants all privileges that are relevant for the object type.
在权限的位置写上 ALL 则赋予所有与该对象类型相关的权限。
The special "user" name PUBLIC can be used to grant a privilege to every user on the system. Also, "group" roles
can be set up to help manage privileges when there are many users of a database — for details see Chapter 20.
To revoke a privilege, use the fittingly named REVOKE command:
名为 PUBLIC 的特殊"用户"可以用于将权限赋予系统中的所有用户。另外,还可以使用"组"角色来帮助管理一群用户的权限,细节可参见章20。
可以使用 REVOKE 命令撤销权限:
REVOKE ALL ON accounts FROM PUBLIC;
The special privileges of the object owner (i.e., the right to do DROP, GRANT, REVOKE, etc.) are always implicit in
being the owner, and cannot be granted or revoked. But the object owner can choose to revoke his own ordinary
privileges, for example to make a table read-only for himself as well as others.
对象所有者的特殊权限(也就是 DROP, GRANT, REVOKE 等权限)总是隐含地属于所有者,并且不能赋予或者撤销。但是对象所有者可以选择撤销自己的普通权限,比如把一个表做成对自己和别人都是只读的。
Ordinarily, only the object's owner (or a superuser) can grant or revoke privileges on an object. However, it is
possible to grant a privilege "with grant option", which gives the recipient the right to grant it in turn to
others. If the grant option is subsequently revoked then all who received the privilege from that recipient
(directly or through a chain of grants) will lose the privilege. For details see the GRANT and REVOKE reference
pages.
最初,只有对象所有者(或者超级用户)可以赋予或者撤销对象的权限。但是,我们可以赋予一个"with grant option"权限,这样就允许接受权限的人将该权限转授他人。如果授权选项后来被撤销,那么所有那些从这个接受者接受了权限的用户(直接或间级)都将失去该权限。细节详见 GRANT 和 REVOKE 手册页。
模式(Schemas)
A PostgreSQL database cluster contains one or more named databases. Users and groups of users are shared across the
entire cluster, but no other data is shared across databases. Any given client connection to the server can access
only the data in a single database, the one specified in the connection request.
一个 PostgreSQL 数据库集群包含一个或多个已命名数据库。用户和用户组在整个集群范围内是共享的,但是其它数据并不共享。任何与服务器连接的客户都只能访问那个在连接请求里声明的数据库。
Note: Users of a cluster do not necessarily have the privilege to access every database in the cluster. Sharing of
user names means that there cannot be different users named, say, joe in two databases in the same cluster; but the
system can be configured to allow joe access to only some of the databases.
【注意】集群中的用户并不一定要有访问集群内所有数据库的权限。共享用户名的意思是不能有重名用户。假定同一个集群里有两个数据库和一个 joe 用户,系统可以配置成只允许 joe 访问其中的一个数据库。
A database contains one or more named schemas, which in turn contain tables. Schemas also contain other kinds of
named objects, including data types, functions, and operators. The same object name can be used in different
schemas without conflict; for example, both schema1 and myschema can contain tables named mytable. Unlike
databases, schemas are not rigidly separated: a user can access objects in any of the schemas in the database he is
connected to, if he has privileges to do so.
一个数据库包含一个或多个已命名的模式,模式又包含表。模式还可以包含其它对象,包括数据类型、函数、操作符等。同一个对象名可以在不同的模式里使用而不会导致冲突;比如,schema1 和 myschema 都可以包含一个名为 mytable 的表。和数据库不同,模式不是严格分离的:只要有权限,一个用户可以访问他所连接的数据库中的任意模式中的对象。
There are several reasons why one might want to use schemas:
我们需要模式的原因有好多:
To allow many users to use one database without interfering with each other.
允许多个用户使用一个数据库而不会干扰其它用户。
To organize database objects into logical groups to make them more manageable.
把数据库对象组织成逻辑组,让它们更便于管理。
Third-party applications can be put into separate schemas so they do not collide with the names of other objects.
第三方的应用可以放在不同的模式中,这样它们就不会和其它对象的名字冲突。
Schemas are analogous to directories at the operating system level, except that schemas cannot be nested.
模式类似于操作系统层次的目录,只不过模式不能嵌套。
创建一个概要(Creating a Schema)
To create a schema, use the CREATE SCHEMA command. Give the schema a name of your choice. For example:
要创建一个模式,使用 CREATE SCHEMA 命令。给出你选择的模式名字。比如:
CREATE SCHEMA myschema;
To create or access objects in a schema, write a qualified name consisting of the schema name and table name
separated by a dot:
要创建或者访问在模式中的对象,写出一个受修饰的名字,这个名字包含模式名以及表名,它们之间用一个句点分开:
schema.table
This works anywhere a table name is expected, including the table modification commands and the data access
commands discussed in the following chapters. (For brevity we will speak of tables only, but the same ideas apply
to other kinds of named objects, such as types and functions.)
这个方式在任何需要表名字的地方都可用,包括后面章节讨论的表修改命令和数据访问命令。出于简化,我们将只讨论表,这个概念适用于所有其它已命名对象类型,比如数据类型和函数。
Actually, the even more general syntax
实际上,更一般的语法是
database.schema.table
can be used too, but at present this is just for pro forma compliance with the SQL standard. If you write a
database name, it must be the same as the database you are connected to.
So to create a table in the new schema, use:
这个语法也可以使用,但目前它只是为了和 SQL 标准形式上兼容。如果你写了一个数据库名,那么它必须和你当前连接的数据库同名。
要在新模式里创建一个表,用
CREATE TABLE myschema.mytable ( ... );
To drop a schema if it's empty (all objects in it have been dropped), use:
如果一个模式是空的(所有它里面的对象都已经删除),那么删除一个模式的命令如下:
DROP SCHEMA myschema;
To drop a schema including all contained objects, use:
要删除一个模式及其包含的所有对象,可以使用:
DROP SCHEMA myschema CASCADE;
See Section 5.12 for a description of the general mechanism behind this.
Often you will want to create a schema owned by someone else (since this is one of the ways to restrict the
activities of your users to well-defined namespaces). The syntax for that is:
参阅节5.11获取对隐藏在这些动作背后的东西的一般机制的描述。
通常你想创建一个他人拥有的模式(因为这是一种限制用户在定义良好的模式中的活动的方法)。其语法如下:
CREATE SCHEMA schemaname AUTHORIZATION username;
You can even omit the schema name, in which case the schema name will be the same as the user name. See Section
5.7.6 for how this can be useful.
Schema names beginning with pg_ are reserved for system purposes and cannot be created by users.
你甚至可以省略模式名字,这时模式名将和用户名同名。参阅节5.7.6获取这种情况的适用场合。
以 pg_ 开头的模式名是保留给系统使用的,用户不能创建这样的名字。
公共模式 (The Public Schema)
In the previous sections we created tables without specifying any schema names. By default such tables (and other
objects) are automatically put into a schema named "public". Every new database contains such a schema. Thus, the
following are equivalent:
在前面的小节里,我们没有声明任何模式名字就创建了表。缺省时,这样的表(以及其它对象)都自动放到一个叫做"public"的模式中去了。每个新数据库都包含一个这样的模式。因此,下面的命令是等效的:
CREATE TABLE products ( ... );
and:
或者
CREATE TABLE public.products ( ... );
概要搜索路径(The Schema Search Path)
Qualified names are tedious to write, and it's often best not to wire a particular schema name into applications
anyway. Therefore tables are often referred to by unqualified names, which consist of just the table name. The
system determines which table is meant by following a search path, which is a list of schemas to look in. The first
matching table in the search path is taken to be the one wanted. If there is no match in the search path, an error
is reported, even if matching table names exist in other schemas in the database.
The first schema named in the search path is called the current schema. Aside from being the first schema searched,
it is also the schema in which new tables will be created if the CREATE TABLE command does not specify a schema
name.
To show the current search path, use the following command:
全称的名字写起来非常费劲,并且我们最好不要在应用里直接写上特定的模式名。因此,表通常都是用未修饰的名字引用的,这样的名字里只有表名字。系统通过查找一个搜索路径来判断一个表究竟是哪个表,这个路径是一个需要查找的模式名列表。在搜索路径里找到的第一个表将被使用。如果在搜索路径中没有找到表,那么就报告一个错误(即使在数据库里的其它模式中存在此表也如此)。
在搜索路径中的第一个模式叫做"当前模式"。除了是搜索的第一个模式之外,它还是在 CREATE TABLE 没有声明模式名的时候,新建表的默认所在地。
要显示当前搜索路径,使用下面的命令:
SHOW search_path;
In the default setup this returns:
在缺省的设置中,返回下面的东西:
search_path -------------- "$user",public
The first element specifies that a schema with the same name as the current user is to be searched. If no such
schema exists, the entry is ignored. The second element refers to the public schema that we have seen already.
The first schema in the search path that exists is the default location for creating new objects. That is the
reason that by default objects are created in the public schema. When objects are referenced in any other context
without schema qualification (table modification, data modification, or query commands) the search path is
traversed until a matching object is found. Therefore, in the default configuration, any unqualified access again
can only refer to the public schema.
To put our new schema in the path, we use:
第一个元素声明搜索和当前用户同名的模式。因为还没有这样的模式存在,所以这条记录被忽略。第二个元素指向我们已经看过的公共模式。
搜索路径中第一个存在的模式是创建新对象的缺省位置。这就是为什么缺省的对象都会创建在 public 模式里的原因。如果在其它环境中引用对象且没有模式修饰,那么系统会遍历搜索路径,直到找到一个匹配的对象。因此,在缺省的配置里,任何未修饰的访问只能引用 public 模式。
要设置模式的搜索路径,可以用
SET search_path TO myschema,public;
(We omit the $user here because we have no immediate need for it.) And then we can access the table without schema
qualification:
(省略了 $user 是因为并不立即需要它)然后我们就可以不使用模式修饰来访问表了:
DROP TABLE mytable;
Also, since myschema is the first element in the path, new objects would by default be created in it.
We could also have written:
同样,因为 myschema 是路径中的第一个元素,新对象缺省时将创建在这里。
我们也可以写成
SET search_path TO myschema;
Then we no longer have access to the public schema without explicit qualification. There is nothing special about
the public schema except that it exists by default. It can be dropped, too.
See also Section 9.23 for other ways to manipulate the schema search path.
然后我们如果不明确修饰的话,就不能再访问 public 模式了。public 模式没有任何特殊之处,只不过它缺省时就存在。我们也可以删除它。
又见节9.23以获取其它操作模式搜索路径的方法。
The search path works in the same way for data type names, function names, and operator names as it does for table
names. Data type and function names can be qualified in exactly the same way as table names. If you need to write a
qualified operator name in an expression, there is a special provision: you must write
搜索路径对于数据类型名、函数名、操作符名的运作方式和表名完全相同。数据类型和函数名可以像表名一样加以修饰。如果你需要在表达式里写一个有模式修饰的操作符,你必须这么写:
OPERATOR(schema.operator)
This is needed to avoid syntactic ambiguity. An example is:
这样是为了避免语法歧义。下面是一个例子:
SELECT 3 OPERATOR(pg_catalog.+) 4;
In practice one usually relies on the search path for operators, so as not to have to write anything so ugly as
that.
实践中我们通常依赖搜索路径寻找操作符,这样就不用写这么难看的东西了。
策略和权限(Schemas and Privileges)
By default, users cannot access any objects in schemas they do not own. To allow that, the owner of the schema must
grant the USAGE privilege on the schema. To allow users to make use of the objects in the schema, additional
privileges might need to be granted, as appropriate for the object.
缺省时,用户无法访问模式中不属于他们所有的对象。为了让他们能够访问,模式的所有者需要在模式上赋予他们 USAGE 权限。为了让用户使用模式中的对象,我们可能需要赋予适合该对象的额外权限。
A user can also be allowed to create objects in someone else's schema. To allow that, the CREATE privilege on the
schema needs to be granted. Note that by default, everyone has CREATE and USAGE privileges on the schema public.
This allows all users that are able to connect to a given database to create objects in its public schema. If you
do not want to allow that, you can revoke that privilege:
用户也可以在别人的模式里创建对象。要允许这么做,需要被赋予在该模式上的 CREATE 权限。请注意,缺省时每个人都在 public 模式上有 CREATE 和 USAGE 权限。这样就允许所有可以连接到指定数据库上的用户在这里创建对象。如果你不打算这么做,可以撤销这个权限:
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
(The first "public" is the schema, the second "public" means "every user". In the first sense it is an identifier,
in the second sense it is a key word, hence the different capitalization; recall the guidelines from Section
4.1.1.)
第一个"public"是模式,第二个"PUBLIC"意思是"所有用户"。第一句里它是个标识符,而第二句里是个关键字,所以有不同的大小写。记住我们在节4.1.1里面说过的原则。
系统目录概要(The System Catalog Schema)
In addition to public and user-created schemas, each database contains a pg_catalog schema, which contains the
system tables and all the built-in data types, functions, and operators. pg_catalog is always effectively part of
the search path. If it is not named explicitly in the path then it is implicitly searched before searching the
path's schemas. This ensures that built-in names will always be findable. However, you can explicitly place
pg_catalog at the end of your search path if you prefer to have user-defined names override built-in names.
除了 public 和用户创建的模式之外,每个数据库都包含一个 pg_catalog 模式,它包含系统表和所有内置数据类型、函数、操作符。pg_catalog 总是搜索路径中的一部分。如果它没有明确出现在路径中,那么它隐含地在所有路径之前搜索。这样就保证了内置名字总是可以被搜索。不过,你可以明确地把 pg_catalog 放在搜索路径之后,如果你想使用用户自定义的名字覆盖内置的名字的话。
In PostgreSQL versions before 7.3, table names beginning with pg_ were reserved. This is no longer true: you can
create such a table name if you wish, in any non-system schema. However, it's best to continue to avoid such names,
to ensure that you won't suffer a conflict if some future version defines a system table named the same as your
table. (With the default search path, an unqualified reference to your table name would then be resolved as the
system table instead.) System tables will continue to follow the convention of having names beginning with pg_, so
that they will not conflict with unqualified user-table names so long as users avoid the pg_ prefix.
在 PostgreSQL 版本 7.3 之前,以 pg_ 开头的表名字是保留的。这个规则现在不正确了:如果必要,你可以创建这样的表名字,只要是在非系统模式里。不过,我们最好还是不要使用这样的名字,以保证自己将来不会和新版本冲突:那些版本也许会定义一些和你的表同名的表(在缺省搜索路径中,一个对你的表的无修饰引用将解析为系统表)。系统表将继续遵循以 pg_ 开头的传统,因此,只要你的表不是以 pg_ 开头,就不会和无修饰的用户表名字冲突。
使用模式(Usage Patterns)
Schemas can be used to organize your data in many ways. There are a few usage patterns that are recommended and are
easily supported by the default configuration:
模式可以用多种方式组织数据。下面是一些建议使用的模式,它们也很容易在缺省配置中得到支持:
If you do not create any schemas then all users access the public schema implicitly. This simulates the situation
where schemas are not available at all. This setup is mainly recommended when there is only a single user or a few
cooperating users in a database. This setup also allows smooth transition from the non-schema-aware world.
如果没有创建任何模式,那么所有用户隐含都访问 public 模式。这样就模拟了没有模式的时候的情景。这种设置建议主要用在只有一个用户或者数据库里只有几个可信用户的情形。这样的设置也允许我们平滑地从无模式的环境过渡。
You can create a schema for each user with the same name as that user. Recall that the default search path starts with $user, which resolves to the user name. Therefore, if each user has a separate schema, they access their own schemas by default. If you use this setup then you might also want to revoke access to the public schema (or drop it altogether), so users are truly constrained to their own schemas.
你可以为每个用户创建一个模式,名字和用户相同。要记得缺省的搜索路径从 $user 开始,它会解析为用户名。因此,如果每个用户都有一个独立的模式,那么他们缺省时访问他们自己的模式。如果你使用了这样的设置,那么你可能还想撤销对 public 模式的访问(或者删除它),这样,用户就真的限制于他们自己的模式了。
To install shared applications (tables to be used by everyone, additional functions provided by third parties,
etc.), put them into separate schemas. Remember to grant appropriate privileges to allow the other users to access
them. Users can then refer to these additional objects by qualifying the names with a schema name, or they can put
the additional schemas into their search path, as they choose.
要安装共享的应用(被所有人使用的表、第三方提供的额外函数等等),我们可以把它们放到独立的模式中。只要记得给需要访问它们的用户赋予合适的权限就可以了。然后用户就可以通过用一个模式名修饰来使用这些额外的对象,或者他们可以把额外的模式放到他们的搜索路径中。
可移植性(Portability)
In the SQL standard, the notion of objects in the same schema being owned by different users does not exist.
Moreover, some implementations do not allow you to create schemas that have a different name than their owner. In
fact, the concepts of schema and user are nearly equivalent in a database system that implements only the basic
schema support specified in the standard. Therefore, many users consider qualified names to really consist of
username.tablename. This is how PostgreSQL will effectively behave if you create a per-user schema for every user.
在 SQL 标准里,在同一个模式里的对象被不同的用户所有的概念是不存在的。而且,有些实现不允许你创建和它们的所有者不同名的模式。实际上,模式和用户的概念在那些只实现了标准中规定的基本模式支持的数据库系统里几乎是一样的。因此,许多用户考虑对名字加以修饰,使它们真正由 username.tablename 组成。如果你为每个用户都创建了一个模式,这实际上就是 PostgreSQL 的行为。
Also, there is no concept of a public schema in the SQL standard. For maximum conformance to the standard, you
should not use (perhaps even remove) the public schema.
同样,在 SQL 标准里也没有 public 模式的概念。为了最大限度地遵循标准,你不应该使用(可能甚至是应该删除) public 模式。
Of course, some SQL database systems might not implement schemas at all, or provide namespace support by allowing
(possibly limited) cross-database access. If you need to work with those systems, then maximum portability would be
achieved by not using schemas at all.
当然,有些数据库系统可能根本没有模式,或者是通过允许跨数据库访问来提供模式的功能。如果你需要在这些系统上干活,那么为了最大限度的移植性,应该根本不使用模式。
继承(Inheritance)
PostgreSQL implements table inheritance, which can be a useful tool for database designers. (SQL:1999 and later
define a type inheritance feature, which differs in many respects from the features described here.)
PostgreSQL 实现了表继承,这个特性对数据库设计人员来说是一个很有效的工具。(SQL999 及以后的标准定义了类型继承特性,和我们在这里描述的很多特性有区别。)
Let's start with an example: suppose we are trying to build a data model for cities. Each state has many cities,
but only one capital. We want to be able to quickly retrieve the capital city for any particular state. This can be
done by creating two tables, one for state capitals and one for cities that are not capitals. However, what happens
when we want to ask for data about a city, regardless of whether it is a capital or not? The inheritance feature
can help to resolve this problem. We define the capitals table so that it inherits from cities:
让我们从一个例子开始:假设我们试图制作一个城市数据模型。每个州都有许多城市,但是只有一个首府。我们希望能够迅速检索任何州的首府。这个任务可以通过创建两个表来实现,一个是州府表,一个是非州府表。不过,如果我们不管什么城市都想查该怎么办?继承的特性可以帮助我们解决这个问题。我们定义 capitals 表,它继承自 cities 表:
CREATE TABLE cities ( name text, population float, altitude int -- in feet );
CREATE TABLE capitals ( state char(2) ) INHERITS (cities);
In this case, the capitals table inherits all the columns of its parent table, cities. State capitals also have an
extra column, state, that shows their state.
在这种情况下,capitals 表继承它的父表 cities 中的所有属性。州首府有一个额外的 state 属性显示其所在的州。
In PostgreSQL, a table can inherit from zero or more other tables, and a query can reference either all rows of a
table or all rows of a table plus all of its descendant tables. The latter behavior is the default. For example,
the following query finds the names of all cities, including state capitals, that are located at an altitude over
500 feet:
在 PostgreSQL 里,一个表可以从零个或多个其它表中继承属性,而且一个查询既可以引用一个表中的所有行,也可以引用一个表及其所有后代表的行(后面这个是缺省行为)。比如,下面的查询查找所有海拔 500 英尺以上的城市名,包括州首府:
SELECT name, altitude FROM cities WHERE altitude > 500;
Given the sample data from the PostgreSQL tutorial (see Section 2.1), this returns:
使用 PostgreSQL 教程里面的数据(参阅节2.1),它返回:
name | altitude -----------+---------- Las Vegas | 2174 Mariposa | 1953 Madison | 845
On the other hand, the following query finds all the cities that are not state capitals and are situated at an
altitude over 500 feet:
另一方面,如果要找出不包括州首府的所有海拔超过 500 英尺的城市,查询应该是这样的:
SELECT name, altitude FROM ONLY cities WHERE altitude > 500; name | altitude -----------+---------- Las Vegas | 2174 Mariposa | 1953
Here the ONLY keyword indicates that the query should apply only to cities, and not any tables below cities in the
inheritance hierarchy. Many of the commands that we have already discussed — SELECT, UPDATE and DELETE — support
the ONLY keyword.
In some cases you might wish to know which table a particular row originated from. There is a system column called
tableoid in each table which can tell you the originating table:
cities 前面的 ONLY 表明该查询应该只针对 cities 而不包括其后代。许多我们已经讨论过的命令(SELECT, UPDATE, DELETE)都支持 ONLY 关键字。
有时候你可能想知道某个行版本来自哪个表。在每个表里我们都有一个 tableoid 系统属性可以告诉你源表是谁:
SELECT c.tableoid, c.name, c.altitude FROM cities c WHERE c.altitude > 500;
which returns:
将返回
tableoid | name | altitude ----------+-----------+---------- 139793 | Las Vegas | 2174 139793 | Mariposa | 1953 139798 | Madison | 845
(If you try to reproduce this example, you will probably get different numeric OIDs.) By doing a join with pg_class you can see the actual table names:
(你可能会得到不同的 OID) 通过和 pg_class 做一个连接,就可以看到实际的表名字:
SELECT p.relname, c.name, c.altitude FROM cities c, pg_class p WHERE c.altitude > 500 AND c.tableoid = p.oid;
which returns:
将返回
relname | name | altitude ----------+-----------+---------- cities | Las Vegas | 2174 cities | Mariposa | 1953 capitals | Madison | 845
Inheritance does not automatically propagate data from INSERT or COPY commands to other tables in the inheritance
hierarchy. In our example, the following INSERT statement will fail:
对于 INSERT 或 COPY ,继承并不自动影响其后代表。在我们的例子里,下面的 INSERT 语句将会失败:
INSERT INTO cities (name, population, altitude, state) VALUES ('New York', NULL, NULL, 'NY');
We might hope that the data would somehow be routed to the capitals table, but this does not happen: INSERT always
inserts into exactly the table specified. In some cases it is possible to redirect the insertion using a rule (see
Chapter 37). However that does not help for the above case because the cities table does not contain the column
state, and so the command will be rejected before the rule can be applied.
我们可能希望数据被传递到 capitals 表里面去,但这是不会发生的:INSERT 总是插入明确声明的那个表。在某些情况下,我们可以使用规则进行重定向插入(参阅章37)。不过它不能对上面的例子有什么帮助,因为 cities 表并不包含 state 字段,因此命令在规则施加之前就会被拒绝掉。
All check constraints and not-null constraints on a parent table are automatically inherited by its children. Other
types of constraints (unique, primary key, and foreign key constraints) are not inherited.
所有父表的检查约束和非空约束都会自动被所有子表继承。不过其它类型的约束(唯一、主键、外键)不会被继承。
A table can inherit from more than one parent table, in which case it has the union of the columns defined by the
parent tables. Any columns declared in the child table's definition are added to these. If the same column name
appears in multiple parent tables, or in both a parent table and the child's definition, then these columns are
"merged" so that there is only one such column in the child table. To be merged, columns must have the same data
types, else an error is raised. The merged column will have copies of all the check constraints coming from any one
of the column definitions it came from, and will be marked not-null if any of them are.
一个子表可以从多个父表继承,这种情况下它将拥有所有父表字段的总和,并且子表中定义的字段也会加入其中。如果同一个字段名出现在多个父表中,或者同时出现在父表和子表的定义里,那么这些字段就会被"融合",这样在子表里就只有一个这样的字段。要想融合,字段的数据类型必须相同,否则就会抛出一个错误。融合的字段将会拥有其父字段的所有检查约束,并且如果某个父字段存在非空约束,那么融合后的字段也必须是非空的。
Table inheritance is typically established when the child table is created, using the INHERITS clause of the CREATE
TABLE statement. Alternatively, a table which is already defined in a compatible way can have a new parent
relationship added, using the INHERIT variant of ALTER TABLE. To do this the new child table must already include
columns with the same names and types as the columns of the parent. It must also include check constraints with the
same names and check expressions as those of the parent. Similarly an inheritance link can be removed from a child
using the NO INHERIT variant of ALTER TABLE. Dynamically adding and removing inheritance links like this can be
useful when the inheritance relationship is being used for table partitioning (see Section 5.9).
表继承通常使用带 INHERITS 子句的 CREATE TABLE 语句定义。另外,一个已经用此方法定义的子表可以使用带 INHERIT 的 ALTER TABLE 命令添加一个新父表。注意:该子表必须已经包含新父表的所有字段且类型一致,此外新父表的每个约束的名字及其表达式都必须包含在此子表中。同样,一个继承链可以使用带 NO INHERIT 的 ALTER TABLE 命令从子表上删除。允许动态添加和删除继承链对基于继承关系的表分区(参见节5.9)很有用。
One convenient way to create a compatible table that will later be made a new child is to use the LIKE clause in
CREATE TABLE. This creates a new table with the same columns as the source table. If there are any CHECK
constraints defined on the source table, the INCLUDING CONSTRAINTS option to LIKE should be specified, as the new
child must have constraints matching the parent to be considered compatible.
创建一个将要作为子表的新表的便利途径是使用带 LIKE 子句的 CREATE TABLE 命令。它将创建一个与源表字段相同的新表。如果源表中存在约束,那么应该指定 LIKE 的 INCLUDING CONSTRAINTS 选项,因为子表必须包含源表中的 CHECK 约束。
A parent table cannot be dropped while any of its children remain. Neither can columns or check constraints of
child tables be dropped or altered if they are inherited from any parent tables. If you wish to remove a table and
all of its descendants, one easy way is to drop the parent table with the CASCADE option.
任何存在子表的父表都不能被删除,同样,子表中任何从父表继承的字段也不能被删除或修改。如果你想删除一个表及其所有后代,最简单的办法是使用 CASCADE 选项。
ALTER TABLE will propagate any changes in column data definitions and check constraints down the inheritance
hierarchy. Again, dropping columns that are depended on by other tables is only possible when using the CASCADE
option. ALTER TABLE follows the same rules for duplicate column merging and rejection that apply during CREATE
TABLE.
ALTER TABLE 会把所有数据定义和检查约束传播到后代里面去。另外,只有在使用 CASCADE 选项的情况下,才能删除父表的字段或者约束。ALTER TABLE 在重复字段融合和拒绝方面和 CREATE TABLE 的规则相同。
Note how table access permissions are handled. Querying a parent table can automatically access data in child
tables without further access privilege checking. This preserves the appearance that the data is (also) in the
parent table. Accessing the child tables directly is, however, not automatically allowed and would require further
privileges to be granted.
请注意表的访问权限是如何处理,查询一个父表会自动访问子表中的数据而不需要跟多的访问权限检测,它也保留父表中的数据可见性, 但直接访问子表是不自动的,而且需要增加更多的权限。
注意事项(Caveats)
Note that not all SQL commands are able to work on inheritance hierarchies. Commands that are used for data
querying, data modification, or schema modification (e.g., SELECT, UPDATE, DELETE, most variants of ALTER TABLE,
but not INSERT and ALTER TABLE ... RENAME) typically default to including child tables and support the ONLY
notation to exclude them. Commands that do database maintenance and tuning (e.g., REINDEX, VACUUM) typically only
work on individual, physical tables and do no support recursing over inheritance hierarchies. The respective
behavior of each individual command is documented in the reference part (Reference I, SQL Commands).
注意不是所有的SQL命令都能在继承结构中正常工作,用于数据查询,数据更新,或策略更新(例如 SELECT, UPDATE, DELETE, 还有ALTER TABLE,但不支持INSERT 和 ALTER TABLE 还有 RENAME)通常默认情况下包括子表和支持ONLY符号去排除他们。数据的维护和调优(例如 REINDEX, VACUUM)典型只在单独,物理表中工作,不支持遍历继承结构。有关跟多的命令细节在(Reference I, SQL Commands).
A serious limitation of the inheritance feature is that indexes (including unique constraints) and foreign key
constraints only apply to single tables, not to their inheritance children. This is true on both the referencing
and referenced sides of a foreign key constraint. Thus, in the terms of the above example:
继承的一个严重局限性是索引(包括唯一约束)和外键约束只能用于单个表,而不能包括它们的子表(不管对引用表还是被引用表都是如此),因此,在上面的例子里:
If we declared cities.name to be UNIQUE or a PRIMARY KEY, this would not stop the capitals table from having rows
with names duplicating rows in cities. And those duplicate rows would by default show up in queries from cities. In
fact, by default capitals would have no unique constraint at all, and so could contain multiple rows with the same
name. You could add a unique constraint to capitals, but this would not prevent duplication compared to cities.
1 即使我们声明 cities.name 为 UNIQUE 或 PRIMARY KEY 也不会阻止 capitals 表拥有重复名字的 cities 数据行。并且这些重复的行在查询 cities 表的时候会显示出来。实际上,缺省时 capitals 将完全没有唯一约束,因此可能包含带有同名的多个行。你应该给 capitals 增加唯一约束,但即使这样做也不能避免与 cities 的重复。
Similarly, if we were to specify that cities.name REFERENCES some other table, this constraint would not
automatically propagate to capitals. In this case you could work around it by manually adding the same REFERENCES
constraint to capitals.
2 类似的,即使我们声明 cities.name 参照(REFERENCES)某些其它的表,这个约束也不会自动传播到 capitals 表。在这种条件下,你可以通过手工给 capitals 表增加同样的 REFERENCES 约束来做到这点。
Specifying that another table's column REFERENCES cities(name) would allow the other table to contain city names,
but not capital names. There is no good workaround for this case.
3 声明一个其它表的字段为 REFERENCES cities(name) 将允许其它表包含城市名,但是不包含首府名。这种情况下没有很好的绕开办法。
These deficiencies will probably be fixed in some future release, but in the meantime considerable care is needed
in deciding whether inheritance is useful for your application.
这些缺点很可能在将来的版本中修补,但同时你也需要考虑一下,继承是否对你的问题真正有用。
Deprecated: In releases of PostgreSQL prior to 7.1, the default behavior was not to include child tables in
queries. This was found to be error prone and also in violation of the SQL standard. You can get the pre-7.1
behavior by turning off the sql_inheritance configuration option.
【已废弃】在7.1以前的 PostgreSQL 版本里,缺省的行为是不在查询里包含子表。后来发现这么做很容易出错并且也违反了 SQL 标柱。你可以通过关闭 sql_inheritance 配置选项来兼容以前的行为。
分区(Partitioning)
PostgreSQL supports basic table partitioning. This section describes why and how to implement partitioning as part of your database design.
PostgreSQL 支持基本的表分区功能。本节描述为什么需要表分区以及如何在数据库设计中使用表分区。
概述(Overview)
Partitioning refers to splitting what is logically one large table into smaller physical pieces. Partitioning can provide several benefits:
分区的意思是把逻辑上的一个大表分割成物理上的几块。分区可以提供若干好处:
Query performance can be improved dramatically in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. The partitioning substitutes for leading columns of indexes, reducing index size and making it more likely that the heavily-used parts of the indexes fit in memory.
1 某些类型的查询性能可以得到极大提升。特别是表中访问率较高的行位于一个单独分区或少数几个分区上的情况下。分区可以减少索引体积从而可以将高使用率部分的索引存放在内存中。如果索引不能全部放在内存中,那么在索引上的读和写都会产生更多的磁盘访问。
When queries or updates access a large percentage of a single partition, performance can be improved by taking advantage of sequential scan of that partition instead of using an index and random access reads scattered across the whole table.
2 当查询或更新一个分区的大部分记录时,连续扫描那个分区而不是使用索引离散的访问整个表可以获得巨大的性能提升。
Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design. ALTER TABLE is far faster than a bulk operation. It also entirely avoids the VACUUM overhead caused by a bulk DELETE.
3 如果需要大量加载或者删除的记录位于单独的分区上,那么可以通过直接读取或删除那个分区以获得巨大的性能提升,因为 ALTER TABLE 比操作大量的数据要快的多。它同时还可以避免由于大量 DELETE 导致的 VACUUM 超载。
Seldom-used data can be migrated to cheaper and slower storage media.
4 很少用的数据可以移动到便宜一些的慢速存储介质上。
The benefits will normally be worthwhile only when a table would otherwise be very large. The exact point at which a table will benefit from partitioning depends on the application, although a rule of thumb is that the size of the table should exceed the physical memory of the database server.
这种好处通常只有在表可能会变得非常大的情况下才有价值。到底多大的表会从分区中收益取决于具体的应用,不过有个基本的拇指规则就是表的大小超过了数据库服务器的物理内存大小。
Currently, PostgreSQL supports partitioning via table inheritance. Each partition must be created as a child table of a single parent table. The parent table itself is normally empty; it exists just to represent the entire data set. You should be familiar with inheritance (see Section 5.8) before attempting to set up partitioning.
目前,PostgreSQL 支持通过表继承进行分区。每个分区必须做为单独一个父表的子表进行创建。父表自身通常是空的,它的存在只是为了代表整个数据集。你在试图实现分区之前,应该先熟悉继承(参阅节5.8)。
The following forms of partitioning can be implemented in PostgreSQL:
PostgreSQL 可以实现下面形式的分区:
Range Partitioning
范围分区
The table is partitioned into "ranges" defined by a key column or set of columns, with no overlap between the ranges of values assigned to different partitions. For example one might partition by date ranges, or by ranges of identifiers for particular business objects.
表被一个或者多个关键字段分区成"范围",这些范围在不同的分区里没有重叠。比如,我们可以为特定的商业对象根据数据范围分区,或者根据标识符范围分区。
列表分区 List Partitioning
The table is partitioned by explicitly listing which key values appear in each partition.
表通过明确地列出每个分区里应该出现那些关键字值实现。
实现分区(Implementing Partitioning)
To set up a partitioned table, do the following:
要设置一个分区的表,做下面的步骤:
Create the "master" table, from which all of the partitions will inherit.
This table will contain no data. Do not define any check constraints on this table, unless you intend them to be applied equally to all partitions. There is no point in defining any indexes or unique constraints on it, either.
1 创建"主表",所有分区都从它继承。
这个表中没有数据,不要在这个表上定义任何检查约束,除非你希望约束同样也适用于所有分区。同样,在其上定义任何索引或者唯一约束也没有意义。
Create several "child" tables that each inherit from the master table. Normally, these tables will not add any columns to the set inherited from the master.
We will refer to the child tables as partitions, though they are in every way normal PostgreSQL tables.
2 创建几个"子表",每个都从主表上继承。通常,这些表不会增加任何字段。
我们将把子表称作分区,尽管它们就是普通的 PostgreSQL 表。
Add table constraints to the partition tables to define the allowed key values in each partition. Typical examples would be:
3 给分区表增加约束,定义每个分区允许的健值。
典型的例子是:
CHECK ( x = 1 ) CHECK ( county IN ( 'Oxfordshire', 'Buckinghamshire', 'Warwickshire' )) CHECK ( outletID >= 100 AND outletID < 200 )
Ensure that the constraints guarantee that there is no overlap between the key values permitted in different partitions. A common mistake is to set up range constraints like:
确保这些约束能够保证在不同的分区里不会有重叠的键值。一个常见的错误是设置下面这样的范围:
CHECK ( outletID BETWEEN 100 AND 200 ) CHECK ( outletID BETWEEN 200 AND 300 )
This is wrong since it is not clear which partition the key value 200 belongs in.
这样做是错误的,因为它没说清楚健值 200 属于那个范围。
Note that there is no difference in syntax between range and list partitioning; those terms are descriptive only.
请注意在范围和列表分区的语法方面没有什么区别;这些术语只是用于描述的。
For each partition, create an index on the key column(s), as well as any other indexes you might want. (The key index is not strictly necessary, but in most scenarios it is helpful. If you intend the key values to be unique then you should always create a unique or primary-key constraint for each partition.)
4 对于每个分区,在关键字字段上创建一个索引,以及其它你想创建的索引。关键字字段索引并非严格必需的,但是在大多数情况下它是很有帮助的。如果你希望关键字值是唯一的,那么你应该总是给每个分区创建一个唯一或者主键约束。
Optionally, define a trigger or rule to redirect data inserted into the master table to the appropriate partition.
5 另外,定义一个规则或者触发器,把对主表的修改重定向到合适的分区表。
Ensure that the constraint_exclusion configuration parameter is not disabled in postgresql.conf. If it is, queries will not be optimized as desired.
6 确保 postgresql.conf 里的配置参数 constraint_exclusion 是打开的。没有这个参数,查询不会按照需要进行优化。
For example, suppose we are constructing a database for a large ice cream company. The company measures peak temperatures every day as well as ice cream sales in each region. Conceptually, we want a table like:
比如,假设我们为一个巨大的冰激凌公司构造数据库。该公司每天都测量最高温度,以及每个地区的冰激凌销售。概念上,我们需要一个这样的表:
CREATE TABLE measurement ( city_id int not null, logdate date not null, peaktemp int, unitsales int );
We know that most queries will access just the last week's, month's or quarter's data, since the main use of this table will be to prepare online reports for management. To reduce the amount of old data that needs to be stored, we decide to only keep the most recent 3 years worth of data. At the beginning of each month we will remove the oldest month's data.
我们知道大多数查询都只会访问最后一周,最后一个月或者最后一个季度的数据,因为这个表的主要用途是为管理准备在线报告。为了减少需要存储的旧数据,我们决定值保留最近三年的有用数据。在每个月的开头,我们都会删除最旧的一个月的数据。
In this situation we can use partitioning to help us meet all of our different requirements for the measurements table. Following the steps outlined above, partitioning can be set up as follows:
在这种情况下,我们可以使用分区来帮助实现所有对表的不同需求。下面的步骤描述了上面的需求,分区可以这样设置:
The master table is the measurement table, declared exactly as above.
1 主表是 measurement 表,就像上面那样声明。
Next we create one partition for each active month:
2 然后我们为每个月创建一个分区:
CREATE TABLE measurement_y2006m02 ( ) INHERITS (measurement); CREATE TABLE measurement_y2006m03 ( ) INHERITS (measurement); ... CREATE TABLE measurement_y2007m11 ( ) INHERITS (measurement); CREATE TABLE measurement_y2007m12 ( ) INHERITS (measurement); CREATE TABLE measurement_y2008m01 ( ) INHERITS (measurement);
Each of the partitions are complete tables in their own right, but they inherit their definitions from the measurement table.
This solves one of our problems: deleting old data. Each month, all we will need to do is perform a DROP TABLE on the oldest child table and create a new child table for the new month's data.
每个分区都是拥有自己内容的完整的表,只是它们从 measurement 表继承定义。
这样就解决了我们的一个问题:删除旧数据。每个月,我们需要做的只是在最旧的子表上执行一个 DROP TABLE ,然后为新月份创建一个新的子表。
We must provide non-overlapping table constraints. Rather than just creating the partition tables as above, the table creation script should really be:
3 我们必须增加非重叠的表约束,所以我们的建表脚本就变成:
CREATE TABLE measurement_y2006m02 ( CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' ) ) INHERITS (measurement); CREATE TABLE measurement_y2006m03 ( CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' ) ) INHERITS (measurement); ... CREATE TABLE measurement_y2007m11 ( CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE '2007-12-01' ) ) INHERITS (measurement); CREATE TABLE measurement_y2007m12 ( CHECK ( logdate >= DATE '2007-12-01' AND logdate < DATE '2008-01-01' ) ) INHERITS (measurement); CREATE TABLE measurement_y2008m01 ( CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' ) ) INHERITS (measurement);
We probably need indexes on the key columns too:
我们可能还需要在关键字字段上有索引:
CREATE INDEX measurement_y2006m02_logdate ON measurement_y2006m02 (logdate); CREATE INDEX measurement_y2006m03_logdate ON measurement_y2006m03 (logdate); ... CREATE INDEX measurement_y2007m11_logdate ON measurement_y2007m11 (logdate); CREATE INDEX measurement_y2007m12_logdate ON measurement_y2007m12 (logdate); CREATE INDEX measurement_y2008m01_logdate ON measurement_y2008m01 (logdate);
We choose not to add further indexes at this time.
我们选择先不建立更多的索引。
We want our application to be able to say INSERT INTO measurement ... and have the data be redirected into the appropriate partition table. We can arrange that by attaching a suitable trigger function to the master table. If data will be added only to the latest partition, we can use a very simple trigger function:
我们想要我的应用程序能检测INSERT INTO,和数据重定向到相应的分区表,我们可以安置合适的触发函数到主表中,如果数据只能添加到最后一个分区,我们可以使用一个非常简单的饿触发函数 4
CREATE OR REPLACE FUNCTION measurement_insert_trigger() RETURNS TRIGGER AS $$ BEGIN INSERT INTO measurement_y2008m01 VALUES (NEW.*); RETURN NULL; END; $$ LANGUAGE plpgsql;
After creating the function, we create a trigger which calls the trigger function: 然后创建函数,我们创建一个触发叫做触发函数
CREATE TRIGGER insert_measurement_trigger BEFORE INSERT ON measurement FOR EACH ROW EXECUTE PROCEDURE measurement_insert_trigger();
We must redefine the trigger function each month so that it always points to the current partition. The trigger definition does not need to be updated, however. 我们必须重新定义每个月的触发函数,因为它一般是指向单前文职,触发定义不需要更新。 We might want to insert data and have the server automatically locate the partition into which the row should be added. We could do this with a more complex trigger function, for example: 我们可以需要插入数据,想要服务自动定位到应该添加到的行的分区,我们可以使用一个更加复杂的除法函数,例如
CREATE OR REPLACE FUNCTION measurement_insert_trigger() RETURNS TRIGGER AS $$ BEGIN IF ( NEW.logdate >= DATE '2006-02-01' AND NEW.logdate < DATE '2006-03-01' ) THEN INSERT INTO measurement_y2006m02 VALUES (NEW.*); ELSIF ( NEW.logdate >= DATE '2006-03-01' AND NEW.logdate < DATE '2006-04-01' ) THEN INSERT INTO measurement_y2006m03 VALUES (NEW.*); ... ELSIF ( NEW.logdate >= DATE '2008-01-01' AND NEW.logdate < DATE '2008-02-01' ) THEN INSERT INTO measurement_y2008m01 VALUES (NEW.*); ELSE RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!'; END IF; RETURN NULL; END; $$ LANGUAGE plpgsql;
The trigger definition is the same as before. Note that each IF test must exactly match the CHECK constraint for its partition. 触发像以往那样定义,注意每个IF测试在它们分区中必须有完全匹配CHEC约束。 While this function is more complex than the single-month case, it doesn't need to be updated as often, since branches can be added in advance of being needed. 尽管这个函数比单月的列子中的要复杂,但它不用经常更新,因为分支会提前加入 Note: In practice it might be best to check the newest partition first, if most inserts go into that partition. For simplicity we have shown the trigger's tests in the same order as in other parts of this example. 注意,这个实践中,它可能首先会更好的去检测新的分区,如果是插入到这个分区,为了简单起见,我们已经展示了这个列子同样的需求的触发测试 As we can see, a complex partitioning scheme could require a substantial amount of DDL. In the above example we would be creating a new partition each month, so it might be wise to write a script that generates the required DDL automatically. 正如我们可以看到,一个复杂的分区计划可能需要大量的DDL。在上面的例子中,我们将每月创建一个新的分区,而它需要写的脚本数量 比DDL自动生成的要多。
管理分取( Managing Partitions)
Normally the set of partitions established when initially defining the table are not intended to remain static. It is common to want to remove old partitions of data and periodically add new partitions for new data. One of the most important advantages of partitioning is precisely that it allows this otherwise painful task to be executed nearly instantaneously by manipulating the partition structure, rather than physically moving large amounts of data around.
通常分区集在定义表的时候就已经确定了,但我们常常需要周期性的删除旧分区并添加新分区。分区最重要的好处是它能恰到好处的适应这个需求:以极快的速度操作分区的结构,而不是痛苦的物理移动大量数据。
The simplest option for removing old data is simply to drop the partition that is no longer necessary:
删除旧数据最简单的方法是删除不再需要的分区:
DROP TABLE measurement_y2006m02;
This can very quickly delete millions of records because it doesn't have to individually delete every record.
Another option that is often preferable is to remove the partition from the partitioned table but retain access to it as a table in its own right:
这个命令可以迅速删除数包含数百万条记录的分区,因为它不需要单独删除每一条记录。
还可以在删除分区的同时保留其作为一个表访问的能力:
ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
This allows further operations to be performed on the data before it is dropped. For example, this is often a useful time to back up the data using COPY, pg_dump, or similar tools. It might also be a useful time to aggregate data into smaller formats, perform other data manipulations, or run reports.
Similarly we can add a new partition to handle new data. We can create an empty partition in the partitioned table just as the original partitions were created above:
这将允许将来对这些数据执行其它的操作(比如使用 COPY, pg_dump 之类的工具进行备份)。并且此时也是执行其它数据操(数据聚集或运行报表等)的有利时机。
同样,我们可以像前面创建最初的分区一样,创建一个新的空分区来处理新数据。
CREATE TABLE measurement_y2008m02 ( CHECK ( logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01' ) ) INHERITS (measurement);
As an alternative, it is sometimes more convenient to create the new table outside the partition structure, and make it a proper partition later. This allows the data to be loaded, checked, and transformed prior to it appearing in the partitioned table:
有时在分区结构之外创建新表并在一段时间之后将其变为分区更为方便。因为这将允许在该表变为分区之前对其中的数据进行加载、检查、转换之类的操作。
CREATE TABLE measurement_y2008m02 (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS); ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02 CHECK ( logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01' ); \copy measurement_y2008m02 from 'measurement_y2008m02' -- possibly some other data preparation work ALTER TABLE measurement_y2008m02 INHERIT measurement;
分区和约束排除(Partitioning and Constraint Exclusion)
Constraint exclusion is a query optimization technique that improves performance for partitioned tables defined in the fashion described above. As an example:
约束排除是一种查询优化技巧,它改进了用上述方法定义的表分区的性能。比如:
SET constraint_exclusion = on; SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
Without constraint exclusion, the above query would scan each of the partitions of the measurement table. With constraint exclusion enabled, the planner will examine the constraints of each partition and try to prove that the partition need not be scanned because it could not contain any rows meeting the query's WHERE clause. When the planner can prove this, it excludes the partition from the query plan.
如果没有约束排除,上面的查询会扫描 measurement 表中的每一个分区。打开了约束排除之后,规划器将检查每个分区的约束然后试图证明该分区不需要被扫描(因为它不能包含任何符合 WHERE 子句条件的数据行)。如果规划器可以证明这个,它就把该分区从查询规划里排除出去。
You can use the EXPLAIN command to show the difference between a plan with constraint_exclusion on and a plan with it off. A typical unoptimized plan for this type of table setup is:
你可以使用 EXPLAIN 命令显示一个规划在 constraint_exclusion 打开和关闭情况下的不同。用上面方法设置的表的典型的缺省规划是:
SET constraint_exclusion = off; EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; QUERY PLAN ----------------------------------------------------------------------------------------------- Aggregate (cost=158.66..158.68 rows=1 width=0) -> Append (cost=0.00..151.88 rows=2715 width=0) -> Seq Scan on measurement (cost=0.00..30.38 rows=543 width=0) Filter: (logdate >= '2008-01-01'::date) -> Seq Scan on measurement_y2006m02 measurement (cost=0.00..30.38 rows=543 width=0) Filter: (logdate >= '2008-01-01'::date) -> Seq Scan on measurement_y2006m03 measurement (cost=0.00..30.38 rows=543 width=0) Filter: (logdate >= '2008-01-01'::date) ... -> Seq Scan on measurement_y2007m12 measurement (cost=0.00..30.38 rows=543 width=0) Filter: (logdate >= '2008-01-01'::date) -> Seq Scan on measurement_y2008m01 measurement (cost=0.00..30.38 rows=543 width=0) Filter: (logdate >= '2008-01-01'::date)
Some or all of the partitions might use index scans instead of full-table sequential scans, but the point here is that there is no need to scan the older partitions at all to answer this query. When we enable constraint exclusion, we get a significantly cheaper plan that will deliver the same answer:
部分或者全部分区可能会使用索引扫描而不是全表扫描,不过这里要表达的意思是没有必要扫描旧分区就可以回答这个查询。在打开约束排除之后,我们可以得到生成同样回答的明显简化的规划:
SET constraint_exclusion = on; EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; QUERY PLAN ----------------------------------------------------------------------------------------------- Aggregate (cost=63.47..63.48 rows=1 width=0) -> Append (cost=0.00..60.75 rows=1086 width=0) -> Seq Scan on measurement (cost=0.00..30.38 rows=543 width=0) Filter: (logdate >= '2008-01-01'::date) -> Seq Scan on measurement_y2008m01 measurement (cost=0.00..30.38 rows=543 width=0) Filter: (logdate >= '2008-01-01'::date)
Note that constraint exclusion is driven only by CHECK constraints, not by the presence of indexes. Therefore it isn't necessary to define indexes on the key columns. Whether an index needs to be created for a given partition depends on whether you expect that queries that scan the partition will generally scan a large part of the partition or just a small part. An index will be helpful in the latter case but not the former.
请注意,约束排除只由 CHECK 约束驱动,而不会由索引驱动。因此,在关键字字段上定义索引是没有必要的。在给出的分区上是否需要建立索引取决于那些扫描该分区的查询通常是扫描该分区的一大部分还是只是一小部分。对于后者,索引通常都有帮助,对于前者则没有什么好处。
The default (and recommended) setting of constraint_exclusion is actually neither on nor off, but an intermediate setting called partition, which causes the technique to be applied only to queries that are likely to be working on partitioned tables. The on setting causes the planner to examine CHECK constraints in all queries, even simple ones that are unlikely to benefit.
默认(推荐)constraint_exclusion设置即不开也不关,但一个中间级设置调用分区会导致技术仅能应用在查询操作,而且可能只在分区表上工作。 开,设置会导致所有查询都在计划检测CHECK约束,即使是一个很简单的,这是不可能受益的。
替代分区方法(Alternative Partitioning Methods)
A different approach to redirecting inserts into the appropriate partition table is to set up rules, instead of a trigger, on the master table. For example:
CREATE RULE measurement_insert_y2006m02 AS ON INSERT TO measurement WHERE ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' ) DO INSTEAD INSERT INTO measurement_y2006m02 VALUES (NEW.*); ... CREATE RULE measurement_insert_y2008m01 AS ON INSERT TO measurement WHERE ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' ) DO INSTEAD INSERT INTO measurement_y2008m01 VALUES (NEW.*);
A rule has significantly more overhead than a trigger, but the overhead is paid once per query rather than once per row, so this method might be advantageous for bulk-insert situations. In most cases, however, the trigger method will offer better performance.
Be aware that COPY ignores rules. If you want to use COPY to insert data, you'll need to copy into the correct partition table rather than into the master. COPY does fire triggers, so you can use it normally if you use the trigger approach.
Another disadvantage of the rule approach is that there is no simple way to force an error if the set of rules doesn't cover the insertion date; the data will silently go into the master table instead.
Partitioning can also be arranged using a UNION ALL view, instead of table inheritance. For example,
CREATE VIEW measurement AS SELECT * FROM measurement_y2006m02 UNION ALL SELECT * FROM measurement_y2006m03 ... UNION ALL SELECT * FROM measurement_y2007m11 UNION ALL SELECT * FROM measurement_y2007m12 UNION ALL SELECT * FROM measurement_y2008m01;
However, the need to recreate the view adds an extra step to adding and dropping individual partitions of the data set. In practice this method has little to recommend it compared to using inheritance.
注意事项(Caveats)
The following caveats apply to partitioned tables:
下面的注意事项适合于已分区的表:
There is no automatic way to verify that all of the CHECK constraints are mutually exclusive. It is safer to create code that generates partitions and creates and/or modifies associated objects than to write each by hand.
The schemes shown here assume that the partition key column(s) of a row never change, or at least do not change enough to require it to move to another partition. An UPDATE that attempts to do that will fail because of the CHECK constraints. If you need to handle such cases, you can put suitable update triggers on the partition tables, but it makes management of the structure much more complicated.
If you are using manual VACUUM or ANALYZE commands, don't forget that you need to run them on each partition individually. A command like:
ANALYZE measurement; will only process the master table.
The following caveats apply to constraint exclusion:
Constraint exclusion only works when the query's WHERE clause contains constants. A parameterized query will not be optimized, since the planner cannot know which partitions the parameter value might select at run time. For the same reason, "stable" functions such as CURRENT_DATE must be avoided.
Keep the partitioning constraints simple, else the planner may not be able to prove that partitions don't need to be visited. Use simple equality conditions for list partitioning, or simple range tests for range partitioning, as illustrated in the preceding examples. A good rule of thumb is that partitioning constraints should contain only comparisons of the partitioning column(s) to constants using B-tree-indexable operators.
All constraints on all partitions of the master table are examined during constraint exclusion, so large numbers of partitions are likely to increase query planning time considerably. Partitioning using these techniques will work well with up to perhaps a hundred partitions; don't try to use many thousands of partitions.
外键数据(Foreign Data)
PostgreSQL implements portions of the SQL/MED specification, allowing you to access data that resides outside PostgreSQL using regular SQL queries. Such data is referred to as foreign data. (Note that this usage is not to be confused with foreign keys, which are a type of constraint within the database.)
Foreign data is accessed with help from a foreign data wrapper. A foreign data wrapper is a library that can communicate with an external data source, hiding the details of connecting to the data source and fetching data from it. There are several foreign data wrappers available, which can for example read plain data files residing on the server, or connect to another PostgreSQL instance. If none of the existing foreign data wrappers suit your needs, you can write your own; see Chapter 50.
To access foreign data, you need to create a foreign server object, which defines how to connect to a particular external data source, according to the set of options used by a particular foreign data wrapper. Then you need to create one or more foreign tables, which define the structure of the remote data. A foreign table can be used in queries just like a normal table, but a foreign table has no storage in the PostgreSQL server. Whenever it is used, PostgreSQL asks the foreign data wrapper to fetch the data from the external source.
Currently, foreign tables are read-only. This limitation may be fixed in a future release.
其他数据库对象(Other Database Objects)
Tables are the central objects in a relational database structure, because they hold your data. But they are not the only objects that exist in a database. Many other kinds of objects can be created to make the use and management of the data more efficient or convenient. They are not discussed in this chapter, but we give you a list here so that you are aware of what is possible:
在关系结构里,表是核心的对象,因为它们保存你的数据。但是它们并非存在于数据库中的唯一对象。我们可以创建许多其它类型的对象来让我们对数据的使用和管理变得更方便。我们没有在这一章里讨论这些对象,但是我们在这里会给你一个列表,这样你就知道什么是可能的。
Views
1 视图
Functions and operators
2 函数和操作符
Data types and domains
3 数据类型和域
Triggers and rewrite rules
4 触发器和重写规则
Detailed information on these topics appears in Part V.
这些主题的详细信息在 Part V 里面。
依赖项跟踪(Dependency Tracking)
When you create complex database structures involving many tables with foreign key constraints, views, triggers, functions, etc. you implicitly create a net of dependencies between the objects. For instance, a table with a foreign key constraint depends on the table it references.
如果你创建了一个包含许多表,并且带有外键约束、视图、触发器、函数等复杂的数据库结构。那么你就会在对象之间隐含地创建了一个依赖性的网络。比如,一个带有外键约束的表依赖于它所引用的表。
To ensure the integrity of the entire database structure, PostgreSQL makes sure that you cannot drop objects that other objects still depend on. For example, attempting to drop the products table we had considered in Section 5.3.5, with the orders table depending on it, would result in an error message such as this:
为了保证整个数据库结构的完整性,PostgreSQL 保证你无法删除那些还被其它对象依赖的对象。比如,试图删除在节5.3.5里被订单表所依赖的产品表是不能成功的,会有类似下面的错误信息出现:
DROP TABLE products;
NOTICE: constraint orders_product_no_fkey on table orders depends on table products ERROR: cannot drop table products because other objects depend on it HINT: Use DROP ... CASCADE to drop the dependent objects too.
The error message contains a useful hint: if you do not want to bother deleting all the dependent objects individually, you can run:
这个错误信息包含一个有用的提示:如果你不想麻烦的分别删除所有依赖对象,你可以运行
DROP TABLE products CASCADE;
and all the dependent objects will be removed. In this case, it doesn't remove the orders table, it only removes the foreign key constraint. (If you want to check what DROP ... CASCADE will do, run DROP without CASCADE and read the NOTICE messages.)
然后所有被依赖的对象都将被删除(并不删除订单表,只是删除外键约束)。如果你想检查 DROP ... CASCADE 会干什么,运行不带 CASCADE 的 DROP 然后阅读 NOTICE 信息。
All drop commands in PostgreSQL support specifying CASCADE. Of course, the nature of the possible dependencies varies with the type of the object. You can also write RESTRICT instead of CASCADE to get the default behavior, which is to prevent the dropping of objects that other objects depend on.
PostgreSQL 里的所有删除命令都支持声明 CASCADE 。当然,具体的依赖性实体取决于对象的类型。你也可以写 RESTRICT 而不是 CASCADE 以获取缺省的行为(仅限于删除那些其它对象所依赖的对象)。
Note: According to the SQL standard, specifying either RESTRICT or CASCADE is required. No database system actually enforces that rule, but whether the default behavior is RESTRICT or CASCADE varies across systems.
【注意】根据 SQL 标准,要求至少声明 RESTRICT 或 CASCADE 中的一个。实际上没有哪种数据库系统强制这一点,但是缺省的行为是 RESTRICT 还是 CASCADE 则因系统而异。
Note: Foreign key constraint dependencies and serial column dependencies from PostgreSQL versions prior to 7.3 are not maintained or created during the upgrade process. All other dependency types will be properly created during an upgrade from a pre-7.3 database.
【注意】在 PostgreSQL 7.3之前的外键约束依赖性和序列字段依赖性在升级过程中都不会得到维护或者创建。所有其它的依赖性类型 在从7.3版本以前的数据库升级过程中都将得到恰当的创建。