From PostgreSQL wiki
Jump to: navigation, search


II.SQL语句( The SQL Language)

This part describes the use of the SQL language in PostgreSQL. We start with describing the general syntax of SQL, then explain how to create the structures to hold data, how to populate the database, and how to query it. The middle part lists the available data types and functions for use in SQL commands. The rest treats several aspects that are important for tuning a database for optimal performance.

本章描述PostgreSQL中的 SQL 的语法。 我们开始描述SQL的主要语法,这些内容是理解随后各章的基础,在那些章里面将详细介绍 SQL 命令如何用于 定义和生成数据库,还有查询它们。中间部分列出了在SQL命令中的可用的数据类型和函数。剩下的部分有几个问题,关于一个数据库的最佳性能。

The information in this part is arranged so that a novice user can follow it start to end to gain a full understanding of the topics without having to refer forward too many times. The chapters are intended to be self-contained, so that advanced users can read the chapters individually as they choose. The information in this part is presented in a narrative fashion in topical units. Readers looking for a complete description of a particular command should see Part VI.

这一部分的信息是这样安排的:新手可以从头读到尾, 便可以获取有关主题的完整了解,而不需要向前 引用太多的次数。 里面的章节是设计成自包含的,这样高级用户就可以选择独立的章节来阅读。 这部分 的信息是按照主题单元以叙述的方式组织的。 如果你需要了解特定命令的完整描述,那么应该看看 Part VI

Readers of this part should know how to connect to a PostgreSQL database and issue SQL commands. Readers that are unfamiliar with these issues are encouraged to read Part I first. SQL commands are typically entered using the PostgreSQL interactive terminal psql, but other programs that have similar functionality can be used as well.

本书的读者应该知道如何与一个 PostgreSQL 数据库连接并发出 SQL 命令。 我们建议那些不熟悉 这些方面的读者首先阅读Part I。 通常 SQL 命令是用 PostgreSQL 交互终端 psql 输入的,但其 它有类似功能的程序也可以使用。

SQL语法(SQL Syntax)

语法结构(Lexical Structure)

SQL input consists of a sequence of commands. A command is composed of a sequence of tokens, terminated by a semicolon (";"). The end of the input stream also terminates a command. Which tokens are valid depends on the syntax of the particular command.

SQL 输入由一系列命令组成。 一条命令是由一系列记号构成, 用一个分号(";")结尾。 输入流的终 止也结束一条命令。哪些记号是合法的取决于特定命令的语法。

A token can be a key word, an identifier, a quoted identifier, a literal (or constant), or a special character symbol. Tokens are normally separated by whitespace (space, tab, newline), but need not be if there is no ambiguity (which is generally only the case if a special character is adjacent to some other token type).

记号可以是一个关键字, 一个标识符,一个 引号包围的标识符, 一个文本(或常量),或者是特殊的 字符符号。 记号通常由空白分隔(空格,tab,换行符),但如果不存在混淆的时候也可以不用 (通常 只是一个特殊字符与一些其它记号类型相联的时候)。

For example, the following is (syntactically) valid SQL input:

比如,下列命令是(语法上)合法的 SQL 输入:


This is a sequence of three commands, one per line (although this is not required; more than one command can be on a line, and commands can usefully be split across lines).

这里是三条命令的序列,每条一行(尽管并不要求这么做; 多条命令可以在一行里,并且命令可以合理 地分裂成多个行)。

Additionally, comments can occur in SQL input. They are not tokens, they are effectively equivalent to whitespace.

另外,在 SQL 输入里可以有注释。 它们不是记号,它们实际上等效于空白。

The SQL syntax is not very consistent regarding what tokens identify commands and which are operands or parameters. The first few tokens are generally the command name, so in the above example we would usually speak of a "SELECT", an "UPDATE", and an "INSERT" command. But for instance the UPDATE command always requires a SET token to appear in a certain position, and this particular variation of INSERT also requires a VALUES in order to be complete. The precise syntax rules for each command are described in Part VI.

如果从哪些记号标识命令,哪些是操作数或参数的角度考虑, SQL 语法并不是非常一致。通常头几个 记号是命令名字, 因此上面的例子我们通常可以说是一个"SELECT", 一个"UPDATE",和一 个"INSERT"命令。 不过, UPDATE 命令总是要求一个 SET 在某个位置出现,并且这个变体的 INSERT 还要求有一个 VALUES 才完整。每条命令的准确语法规则都在 Part VI 里描写。

标识和关键字 (Identifiers and Key Words)

Tokens such as SELECT, UPDATE, or VALUES in the example above are examples of key words, that is, words that have a fixed meaning in the SQL language. The tokens MY_TABLE and A are examples of identifiers. They identify names of tables, columns, or other database objects, depending on the command they are used in. Therefore they are sometimes simply called "names". Key words and identifiers have the same lexical structure, meaning that one cannot know whether a token is an identifier or a key word without knowing the language. A complete list of key words can be found in Appendix C.

象上面的例子里的 SELECT,UPDATE, 或 VALUES 这样的记号都是关键字的例子, 也就是那些在 SQL 语言里有固定含义的单词。 记号 MY_TABLE 和 A 是标识符的例子。 根据使用它们的命令的不 同,它们标识表,字段,或者其它数据库对象的名字。 因此,有时候只是简单地叫它们"名字"。 关键字 和标识符有着同样的词法结构,意思是我们在没有认识这种语言之前是无法区分一个记号是标识符还是 名字。 你可以在 Appendix C 里找到一个关键字的完整列表。

SQL identifiers and key words must begin with a letter (a-z, but also letters with diacritical marks and non-Latin letters) or an underscore (_). Subsequent characters in an identifier or key word can be letters, underscores, digits (0-9), or dollar signs ($). Note that dollar signs are not allowed in identifiers according to the letter of the SQL standard, so their use might render applications less portable. The SQL standard will not define a key word that contains digits or starts or ends with an underscore, so identifiers of this form are safe against possible conflict with future extensions of the standard.

SQL 标识符和关键字必须以一个字母开头 (a-z 以及带可区别标记的字母以及非拉丁字母 )或下划线 开头 (_)开头。标识符和关键字里随后的字符可以是字母,数字(0-9),或者下划线,还有($)符号,但美元符号不充许在标识符中,因为是SQL基本字 母,因此它们的使用可能会导致应用程序移植性下降。SQL基本字符不会定义为包含数字或下划线,因此这种形式的标识符号足以保证以后的基本字符扩展

The system uses no more than NAMEDATALEN-1 bytes of an identifier; longer names can be written in commands, but they will be truncated. By default, NAMEDATALEN is 64 so the maximum identifier length is 63 bytes. If this limit is problematic, it can be raised by changing the NAMEDATALEN constant in src/include/pg_config_manual.h.

系统使用不超过 NAMEDATALEN-1 个字符作为标识符; 你可以在命令中写更长的名字,但它们会被截 断。缺省时, NAMEDATALEN 是 64,因此标识符最大长度是 63 如果觉得这个限制有问题,那么你 可以在 src/include/pg_config_manual.h 里修改 NAMEDATALEN 来改变它。

Key words and unquoted identifiers are case insensitive. Therefore:



can equivalently be written as:


uPDaTE my_TabLE SeT a = 5;

A convention often used is to write key words in upper case and names in lower case, e.g.:


UPDATE my_table SET a = 5;

There is a second kind of identifier: the delimited identifier or quoted identifier. It is formed by enclosing an arbitrary sequence of characters in double-quotes ("). A delimited identifier is always an identifier, never a key word. So "select" could be used to refer to a column or table named "select", whereas an unquoted select would be taken as a key word and would therefore provoke a parse error when used where a table or column name is expected. The example can be written with quoted identifiers like this:

还有第二种标识符:分隔标识符 或引号包围的标识符。 它是通过在双引号(" ) 里包围转义字符序列 形成的。 分隔标识符总是一个标识符,而不是关键字。因此,你可以用 "SELECT" 表示一个字段名字 或者名字叫 "SELECT" 的表,而一个没有引号的 SELECT 将被当做一条命令的一部分,因此如果把 它当做一个表的名字或者字段名字用的话就会产生一个分析错误。 上面的例子可以用引起的标识符这么 写:

UPDATE "my_table" SET "a" = 5;

Quoted identifiers can contain any character, except the character with code zero. (To include a double quote, write two double quotes.) This allows constructing table or column names that would otherwise not be possible, such as ones containing spaces or ampersands. The length limitation still applies.

引号包围的标识符可以包含除引号本身以外的任何其它字符。 要包含一个双引号,我们可以写两个双引 号。 这样我们就可以构造那些原本是不允许的表或者字段名字, 比如那些包含空白或与号的名字。但长 度限制依旧。

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!这部分在 8.1文档中不存在的----开始

A variant of quoted identifiers allows including escaped Unicode characters identified by their code points. This variant starts with U& (upper or lower case U followed by ampersand) immediately before the opening double quote, without any spaces in between, for example U&"foo". (Note that this creates an ambiguity with the operator &. Use spaces around the operator to avoid this problem.) Inside the quotes, Unicode characters can be specified in escaped form by writing a backslash followed by the four-digit hexadecimal code point number or alternatively a backslash followed by a plus sign followed by a six-digit hexadecimal code point number. For example, the identifier "data" could be written as

一个引号标识变量充许在转义 Unicode 符号在它的代码行中, 一个变量(在U符号后的可以是大写或小写)在开双引用符号前有一个U&(在U符号后的可以是大写或小写),其中不包含任何空格,例如U&"foo".(注意这里创建用操作符号创建可能会产生歧义,在操作边上使用空格可以避免这个问题),在引号里面,Unicode字符可以由反斜线通过四位十六进制代码行,数字或另外的反斜线加上加号符接着六位十六进制代码数字转义的指定,例如标识符号“date”可以着样写


The following less trivial example writes the Russian word "slon" (elephant) in Cyrillic letters: 下面的小例子少写俄语单词“slon“为西里尔字母(大象):


If a different escape character than backslash is desired, it can be specified using the UESCAPE clause after the string, for example:

一个不同的转义字符比较需要反斜杠,它也可以通过UESCAPE短词放到字符串后面 ,例如:

U&"d!0061t!+000061" UESCAPE '!'

The escape character can be any single character other than a hexadecimal digit, the plus sign, a single quote, a double quote, or a whitespace character. Note that the escape character is written in single quotes, not double quotes. 转义字符可以为任意单一字符而不是一个十六进制数字,加号,一个单一引用,一个双引用或者空白字符。注意 转义字符写到单引用中,而不是双引用。 To include the escape character in the identifier literally, write it twice.


The Unicode escape syntax works only when the server encoding is UTF8. When other server encodings are used, only code points in the ASCII range (up to \007F) can be specified. Both the 4-digit and the 6-digit form can be used to specify UTF-16 surrogate pairs to compose characters with code points larger than U+FFFF, although the availability of the 6-digit form technically makes this unnecessary. (Surrogate pairs are not stored directly, but combined into a single code point that is then encoded in UTF-8.)

Unicode 转义语法仅在服务端解码为UTF8时才工作,当另服务端解码使用时,在ASCII范围(到\007F)中的代码行可以指定。4位和6位格式都可以用于替代指定UTF-16,组成超过U+FFFF的字符,尽管6位数格式用在这里是没必要的(替代'对'不是直接存储的,而是结合到单一代码行中,然后解码为UTF-8)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!1 这部分在 8.1文档中不存在的----结束

Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case. For example, the identifiers FOO, foo, and "foo" are considered the same by PostgreSQL, but "Foo" and "FOO" are different from these three and each other. (The folding of unquoted names to lower case in PostgreSQL is incompatible with the SQL standard, which says that unquoted names should be folded to upper case. Thus, foo should be equivalent to "FOO" not "foo" according to the standard. If you want to write portable applications you are advised to always quote a particular name or never quote it.)

把一个标识符用引号包围的起来同时也令它大小写相关,而没有引号包围起来的名字总是转成小写。 比 如,我们认为标识符 FOO,foo 和 "foo" 是一样的 PostgreSQL名字, 但 "Foo" 和 "FOO" 与上面三个以及它们之间都是不同的。 (PostgreSQL 里对未加引号的名子总是转换成小写, 这和 SQL 是不兼容的,SQL 里要求未用引号包围起来的名字总是转成大写。 因此 foo 等于 "FOO"。 如 果你想写可移植的程序,那么我们建议你要么就总是引号包围的某个名字,要么就坚决不引。)


There are three kinds of implicitly-typed constants in PostgreSQL: strings, bit strings, and numbers. Constants can also be specified with explicit types, which can enable more accurate representation and more efficient handling by the system. These alternatives are discussed in the following subsections.

在 PostgreSQL 里有三种隐含类型的常量: 字符串,位串,和数值。 常量也可以声明为明确的类型, 这样就可以使用更准确的表现形式以及可以通过系统更有效地处理。 这些候选的在后面的小节描述。

字符串常量(String Constants)

A string constant in SQL is an arbitrary sequence of characters bounded by single quotes ('), for example 'This is a string'. To include a single-quote character within a string constant, write two adjacent single quotes, e.g., 'Diannes horse'. Note that this is not the same as a double-quote character (").

SQL 里的一个字串文本是用单引号(')包围的任意字符序列, 比如,'This is a string'。 这种声明字串常量的方法是 SQL 标准定义的。 在这种类型的字串常量里嵌入单引号的标准兼容的做法 是敲入两个连续的单引号比如,'Diannes horse'。 但双引用字符(")是不同的。

Two string constants that are only separated by whitespace with at least one newline are concatenated and effectively treated as if the string had been written as one constant. For example:

两个只是通过至少有一个换行符的空白分隔的字符串常量会被连接在一起,并当做它们是写成一个常量 处理。 比如:

SELECT 'foo'

is equivalent to:


SELECT 'foobar';


SELECT 'foo'      'bar';

is not valid syntax. (This slightly bizarre behavior is specified by SQL; PostgreSQL is following the standard.)

是非法的语法,(这个略微有些怪异的行为是 SQL 声明的; PostgreSQL 遵循标准。)

C语言中的转意字符串常量(String Constants with C-style Escapes)


PostgreSQL also accepts "escape" string constants, which are an extension to the SQL standard. An escape string constant is specified by writing the letter E (upper or lower case) just before the opening single quote, e.g., E'foo'. (When continuing an escape string constant across lines, write E only before the first opening quote.) Within an escape string, a backslash character (\) begins a C-like backslash escape sequence, in which the combination of backslash and following character(s) represent a special byte value, as shown in Table 4-1.

PostgreSQL 也接受 “转义”字符串常量,是SQL的一个扩展,一个转义字符串常量由单词E(大写或小写)在开单引用前书写,例如,E'foo' (当在一个行中连续出现一个转义字符串常量,仅把E写到第一个开引号前)在一个转义字符串中,由一个反斜杠字符(\)开始,像C中的反斜杠转义序列,在表4-1中,描述反斜杠的成分和一个指定的字节值。

Table 4-1. Backslash Escape Sequences

反斜杠转义序列(Backslash Escape Sequence) 解释(Interpretation)
\b 空格
\f 换页
\n 新行
\r 回车
\t tab
\o, \oo, \ooo (o = 0 - 7) 八进制字节值
\xh, \xhh (h = 0 - 9, A - F) 十六进制字节值
\uxxxx, \Uxxxxxxxx (x = 0 - 9, A - F) 16位或32位十六进制的Unicode字符值

Any other character following a backslash is taken literally. Thus, to include a backslash character, write two backslashes (\\). Also, a single quote can be included in an escape string by writing \', in addition to the normal way of .

其他别的字符紧跟一个反斜线,因此,想包含一个反斜线时,就直接写2个反斜线(\\),除了正常的方式为 一个单引用也可以包含到转义字符中 如 \'

It is your responsibility that the byte sequences you create, especially when using the octal or hexadecimal escapes, compose valid characters in the server character set encoding. When the server encoding is UTF-8, then the Unicode escapes or the alternative Unicode escape syntax, explained in Section, should be used instead. (The alternative would be doing the UTF-8 encoding by hand and writing out the bytes, which would be very cumbersome.)

The Unicode escape syntax works fully only when the server encoding is UTF8. When other server encodings are used, only code points in the ASCII range (up to \u007F) can be specified. Both the 4-digit and the 8-digit form can be used to specify UTF-16 surrogate pairs to compose characters with code points larger than U+FFFF, although the availability of the 8-digit form technically makes this unnecessary. (When surrogate pairs are used when the server encoding is UTF8, they are first combined into a single code point that is then encoded in UTF-8.)

警告(Caution) If the configuration parameter standard_conforming_strings is off, then PostgreSQL recognizes backslash escapes in both regular and escape string constants. However, as of PostgreSQL 9.1, the default is on, meaning that backslash escapes are recognized only in escape string constants. This behavior is more standards-compliant, but might break applications which rely on the historical behavior, where backslash escapes were always recognized. As a workaround, you can set this parameter to off, but it is better to migrate away from using backslash escapes. If you need to use a backslash escape to represent a special character, write the string constant with an E.

In addition to standard_conforming_strings, the configuration parameters escape_string_warning and backslash_quote govern treatment of backslashes in string constants.

除了为standard_conforming_strings外,配置参数 escape_string_warning和 backslash_quote 在字符常量中也作为反斜杠对待

The character with the code zero cannot be in a string constant. 0代码的字符不能在一个字符串常量中。

UNICODE中的转意字符串常量(String Constants with Unicode Escapes)

PostgreSQL also supports another type of escape syntax for strings that allows specifying arbitrary Unicode characters by code point. A Unicode escape string constant starts with U& (upper or lower case letter U followed by ampersand) immediately before the opening quote, without any spaces in between, for example U&'foo'. (Note that this creates an ambiguity with the operator &. Use spaces around the operator to avoid this problem.) Inside the quotes, Unicode characters can be specified in escaped form by writing a backslash followed by the four-digit hexadecimal code point number or alternatively a backslash followed by a plus sign followed by a six-digit hexadecimal code point number. For example, the string 'data' could be written as

U&'d\0061t\+000061' The following less trivial example writes the Russian word "slon" (elephant) in Cyrillic letters:

U&'\0441\043B\043E\043D' If a different escape character than backslash is desired, it can be specified using the UESCAPE clause after the string, for example:

U&'d!0061t!+000061' UESCAPE '!' The escape character can be any single character other than a hexadecimal digit, the plus sign, a single quote, a double quote, or a whitespace character.

The Unicode escape syntax works only when the server encoding is UTF8. When other server encodings are used, only code points in the ASCII range (up to \007F) can be specified. Both the 4-digit and the 6-digit form can be used to specify UTF-16 surrogate pairs to compose characters with code points larger than U+FFFF, although the availability of the 6-digit form technically makes this unnecessary. (When surrogate pairs are used when the server encoding is UTF8, they are first combined into a single code point that is then encoded in UTF-8.)

Also, the Unicode escape syntax for string constants only works when the configuration parameter standard_conforming_strings is turned on. This is because otherwise this syntax could confuse clients that parse the SQL statements to the point that it could lead to SQL injections and similar security issues. If the parameter is set to off, this syntax will be rejected with an error message.

To include the escape character in the string literally, write it twice.


$符号引用字符串常量(Dollar-quoted String Constants)

While the standard syntax for specifying string constants is usually convenient, it can be difficult to understand when the desired string contains many single quotes or backslashes, since each of those must be doubled. To allow more readable queries in such situations, PostgreSQL provides another way, called "dollar quoting", to write string constants. A dollar-quoted string constant consists of a dollar sign ($), an optional "tag" of zero or more characters, another dollar sign, an arbitrary sequence of characters that makes up the string content, a dollar sign, the same tag that began this dollar quote, and a dollar sign. For example, here are two different ways to specify the string "Dianne's horse" using dollar quoting:

尽管声明字串常量的标准方法通常都很方便,但是如果字串包含很多单引号或者反斜杠, 那么理解字串 的内容可能就会变得很苦涩,因为每个单引号都要加倍。 为了让这种场合下的查询更具可读性, PostgreSQL 允许另外一种称作"美元符包围"的字串常量声明办法。 一个通过美元符包围声明的字串 常量由一个美元符号($),一个可选的零个或多个字符"记号", 另外一个美元符号,一个组成字串常 量的任意字符的序列,一个美元符号,以及一个和开始这个美元符包围的记号相同的记号,和一个美元 符号组成。 比如,下面是两个不同的方法,用美元符包围声明了前面的例子:

$$Dianne's horse$$ $SomeTag$Dianne's horse$SomeTag$

Notice that inside the dollar-quoted string, single quotes can be used without needing to be escaped. Indeed, no characters inside a dollar-quoted string are ever escaped: the string content is always written literally. Backslashes are not special, and neither are dollar signs, unless they are part of a sequence matching the opening tag.

请注意,在美元符包围的字串里,单引号可以不用逃逸使用。 实际上,在一个美元符包围的字串里,没 有什么字符需要逃逸: 字串内容总是按照字面内容写。反斜杠不是特殊的, 美元符自己也不是特殊的, 除非它们和开标签的一部分匹配。

It is possible to nest dollar-quoted string constants by choosing different tags at each nesting level. This is most commonly used in writing function definitions. For example:

我们可以通过在不同嵌套级别使用不同的美元符引号字串常量来实现嵌套。 最常见的是写函数定义的时 候。比如:

    RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);

Here, the sequence $q$[\t\r\n\v\\]$q$ represents a dollar-quoted literal string [\t\r\n\v\\], which will be recognized when the function body is executed by PostgreSQL. But since the sequence does not match the outer dollar quoting delimiter $function$, it is just some more characters within the constant so far as the outer string is concerned.

这里,序列 $q$[\t\r\n\v\\]$q$ 表示一个美元符包围的字串文本 [\t\r\n\v\\], 在函数 体被 PostgreSQL 执行的时候,它将被识别出来。 但是因为这个序列不匹配外层的美元符分隔符 $function$, 所以只要考虑了外层字串,它就只是常量里面的一些额外的字符而已。

The tag, if any, of a dollar-quoted string follows the same rules as an unquoted identifier, except that it cannot contain a dollar sign. Tags are case sensitive, so $tag$String content$tag$ is correct, but $TAG$String content$tag$ is not. 如果有标签的话,一个美元符包围的字串遵循和无引号包围的标识符相同的规则, 只是它不能包含美元 符。标签是大小写相关的,因此 $tag$String content$tag$ 是正确的,而 $TAG$String content$tag$ 不对。

A dollar-quoted string that follows a keyword or identifier must be separated from it by whitespace; otherwise the dollar quoting delimiter would be taken as part of the preceding identifier.

一个后面跟着关键字或者标识符的美元包围的字串必须用空白隔开; 否则美元符包围分隔符将会被认为 前面标识符的一部分。

Dollar quoting is not part of the SQL standard, but it is often a more convenient way to write complicated string literals than the standard-compliant single quote syntax. It is particularly useful when representing string constants inside other constants, as is often needed in procedural function definitions. With single-quote syntax, each backslash in the above example would have to be written as four backslashes, which would be reduced to two backslashes in parsing the original string constant, and then to one when the inner string constant is re-parsed during function execution.

美元符包围不是 SQL 标准,但是在写复杂的字串文本的时候,它通常比标准的单引号语法更方便。 尤 其是在其它常量里表现字串常量的时候更有用,比如经常在过程函数定义里面的。 如果用单引号语法, 每个上面例子里的反斜杠都必须写四个,它们在作为字串文本分析的时候会减少为两个, 然后在函数执 行的时候在内层字串常量里会再次被解析为一个。

字符串位的常量(Bit-string Constants)

Bit-string constants look like regular string constants with a B (upper or lower case) immediately before the opening quote (no intervening whitespace), e.g., B'1001'. The only characters allowed within bit-string constants are 0 and 1.

位串常量看起来很象在开引号前面有一个 B (大写或小写)的普通字符串(它们之间没有空白), 比如 B'1001'。位串常量里可以用的字符只有 0 和 1。

Alternatively, bit-string constants can be specified in hexadecimal notation, using a leading X (upper or lower case), e.g., X'1FF'. This notation is equivalent to a bit-string constant with four binary digits for each hexadecimal digit.

另外,位串常量可以用十六进制表示法声明,方法是使用前缀的 X (大写或者小写),比如, X'1FF'。 这种表示法等效于一个每个十六进制位四个二进制位的位串常量。

Both forms of bit-string constant can be continued across lines in the same way as regular string constants. Dollar quoting cannot be used in a bit-string constant.

两种形式的位串常量都可以象普通字串常量那样跨行连续。 美元符包围不能用于位串常量。

数值常量(Numeric Constants)

Numeric constants are accepted in these general forms:



where digits is one or more decimal digits (0 through 9). At least one digit must be before or after the decimal point, if one is used. At least one digit must follow the exponent marker (e), if one is present. There cannot be any spaces or other characters embedded in the constant. Note that any leading plus or minus sign is not actually considered part of the constant; it is an operator applied to the constant.

这里的 digits 是一个或多个十进制位(0 到 9)。 如果有小数点,那么至少有一位在小数点前面或 后面。如果出现了指数分隔符(e),那么至少有一个位跟在它后面。 在常量里不能有空格或者其他字 符嵌入在内。 请注意任何前导地正号或者负号实际上都不认为是常量的一部分; 它是施加于常量的一个 操作符。

These are some examples of valid numeric constants:



A numeric constant that contains neither a decimal point nor an exponent is initially presumed to be type integer if its value fits in type integer (32 bits); otherwise it is presumed to be type bigint if its value fits in type bigint (64 bits); otherwise it is taken to be type numeric. Constants that contain decimal points and/or exponents are always initially presumed to be type numeric.

如果一个数值常量既不包含小数点,也不包含指数操作符, 那么如果它的数值可以放在integer类型 中(32位), 则认为它是integer类型;如果它的数值可以放在 bigint中(64位),则认为它 是 bigint; 否则认为它是 numeric类型。包含小数点和/或指数操作符的常量总是被认为是 numeric类型。

The initially assigned data type of a numeric constant is just a starting point for the type resolution algorithms. In most cases the constant will be automatically coerced to the most appropriate type depending on context. When necessary, you can force a numeric value to be interpreted as a specific data type by casting it. For example, you can force a numeric value to be treated as type real (float4) by writing:

给一个数值常量赋予初始数据类型只是类型解析算法的开端。 在大多数情况下该常量会根据环境被自动 强制转换成最合适的类型。 必要时,你可以通过强制类型转换把一个数值解析成特定的数据类型。 比如, 你可以强制要求把一个数值当作类型real(float4)来看,方法是这么写:

REAL '1.23'  -- string style
1.23::REAL   -- PostgreSQL (historical) style

These are actually just special cases of the general casting notations discussed next.


其他类型常量(Constants of Other Types)

A constant of an arbitrary type can be entered using any one of the following notations:


type 'string'
CAST ( 'string' AS type )

The string constant's text is passed to the input conversion routine for the type called type. The result is a constant of the indicated type. The explicit type cast can be omitted if there is no ambiguity as to the type the constant must be (for example, when it is assigned directly to a table column), in which case it is automatically coerced.

在字串常量的文本将传递给那种叫 type 的类型的输入转换过程。 结果是这种类型的一个常量。如果不 存在该常量所属类型的歧义, 那么明确的类型映射可以省略(比如,当你把它直接赋予一个表字段的时 候), 这种情况下它会自动转换。

The string constant can be written using either regular SQL notation or dollar-quoting.

字串常量可以用普通 SQL 表示法或者美元符包围来书写。

It is also possible to specify a type coercion using a function-like syntax:


typename ( 'string' )

but not all type names can be used in this way; see Section 4.2.9 for details.

不过并非所有类型名可以这样使用;参阅 Section 4.2.9 获取细节。

The ::, CAST(), and function-call syntaxes can also be used to specify run-time type conversions of arbitrary expressions, as discussed in Section 4.2.9. To avoid syntactic ambiguity, the type 'string' syntax can only be used to specify the type of a simple literal constant. Another restriction on the type 'string' syntax is that it does not work for array types; use :: or CAST() to specify the type of an array constant.

这个::,CAST(),和函数调用语法也可以用于声明任意表达式的运行时类型转换, 如 Section 4.2.9 中讨论的那样。 但是 type 'string' 的形式只能用于声明一个文本常量的类型。 type 'string' 的另外一个限制是它不能用于数组类型;要用 :: 或者 CAST() 声明一个数组常量的类型。

The CAST() syntax conforms to SQL. The type 'string' syntax is a generalization of the standard: SQL specifies this syntax only for a few data types, but PostgreSQL allows it for all types. The syntax with :: is historical PostgreSQL usage, as is the function-call syntax.

CAST() 语法遵循 SQL。 type 'string' 语法是标准的一个推广:SQL 只是给少数几种数据类 型声明了这个语法, 但 PostgreSQL 允许将其用于所有类型。带 :: 的语法是 PostgreSQL 的历 史用法,函数调用语法也是。


An operator name is a sequence of up to NAMEDATALEN-1 (63 by default) characters from the following list:

一个操作符是最多 NAMEDATALEN-1 (缺省 63 个字符)个下列字符的序列:

+ - * / < > = ~ ! @ # % ^ & | ` ?

There are a few restrictions on operator names, however:


-- and /* cannot appear anywhere in an operator name, since they will be taken as the start of a comment.

• -- 和 /* 不能出现在操作符名字中的任何地方,因为它们会被当做注释开始对待。

A multiple-character operator name cannot end in + or -, unless the name also contains at least one of these characters:

~ ! @ # % ^ & | ` ?

• 多字符操作符不能以 + 或 - 结束, 除非其名字至少还包含下列操作符之一:

~ ! @ # % ^ & | ` ?

For example, @- is an allowed operator name, but *- is not. This restriction allows PostgreSQL to parse SQL-compliant queries without requiring spaces between tokens. When working with non-SQL-standard operator names, you will usually need to separate adjacent operators with spaces to avoid ambiguity. For example, if you have defined a left unary operator named @, you cannot write X*@Y; you must write X* @Y to ensure that PostgreSQL reads it as two operator names not one.

比如,@- 是允许的操作符名字, 但 *- 不是。这个限制允许 PostgreSQL 在不要求记号之 间有空白的情况下分析 SQL 兼容的查询。 当你使用非 SQL 标准的操作符名字的时候,你通常需要用空白分隔相邻的操作符以避免歧义。 比如, 如果你定义了一个叫 "@" 的左单目操作符,那么你就不能写 X*@Y;而是要写成 X* @Y 以确保 PostgreSQL 把它读成两个操作符,而不是一个。

特殊元(Special Characters)

Some characters that are not alphanumeric have a special meaning that is different from being an operator. Details on the usage can be found at the location where the respective syntax element is described. This section only exists to advise the existence and summarize the purposes of these characters.

有些非字母数字字符有一些特殊含义,因此不能用做操作符。 它们的用法的细节可以在相应的描述语法 元素的地方找到。 本节只是描述它们的存在和概括一下这些字符的目的。

A dollar sign ($) followed by digits is used to represent a positional parameter in the body of a function definition or a prepared statement. In other contexts the dollar sign can be part of an identifier or a dollar-quoted string constant.

• 美元符号($)后面跟着数字用于在一个函数体定义或者准备好的语句中 表示参数的位置。在其 他环境里美元符号可能是一个标识符名字或者是一个美元符包围的字串常量的一部分。

Parentheses (()) have their usual meaning to group expressions and enforce precedence. In some cases parentheses are required as part of the fixed syntax of a particular SQL command.

• 圆括弧(())用于分组和强制优先级的时候含义与平常一样。 有些场合里圆括弧是作为一个特 定 SQL 命令的固定语法的一部分要求的。

Brackets ([]) are used to select the elements of an array. See Section 8.14 for more information on arrays.

• 方括弧([])用于选取数组元素。 参阅 Section 8.10 获取更多信息。

Commas (,) are used in some syntactical constructs to separate the elements of a list.

• 逗号(,)在一些语法构造里用于分隔一个列表的元素。

The semicolon (;) terminates an SQL command. It cannot appear anywhere within a command, except within a string constant or quoted identifier.

• 分号(;)结束一条 SQL 命令。 它不能出现在一条命令里的任何地方,除非引号包围的来当做 字符串常量或者标识符用。

The colon (:) is used to select "slices" from arrays. (See Section 8.14.) In certain SQL dialects (such as Embedded SQL), the colon is used to prefix variable names.

• 冒号 (:)用于从数组中选取"片段"。(参阅 Section 8.10。)在一些 SQL 方言里(比如 嵌入 SQL ), 冒号用于前缀变量名。

The asterisk (*) is used in some contexts to denote all the fields of a table row or composite value. It also has a special meaning when used as the argument of an aggregate function, namely that the aggregate does not require any explicit parameter.

• 星号 (* 在某些环境里表示一个表行或者一个符合类型值的全部字段。 在用作聚集函数 COUNT 的参数时还有特殊含义。

The period (.) is used in numeric constants, and to separate schema, table, and column names.

• 句点 (.用在数字常量里,并用于分隔模式,表和字段名字。

注释 (Comments)

A comment is a sequence of characters beginning with double dashes and extending to the end of the line, e.g.:


-- This is a standard SQL comment

-- 这是标准的 SQL92 注释

Alternatively, C-style block comments can be used:

另外,还可以使用 C-风格的块注释:

/* multiline comment
 * with nesting: /* nested block comment */

where the comment begins with /* and extends to the matching occurrence of */. These block comments nest, as specified in the SQL standard but unlike C, so that one can comment out larger blocks of code that might contain existing block comments.

A comment is removed from the input stream before further syntax analysis and is effectively replaced by whitespace.

这里注释以 /* 开头并扩展到对应的 */。这些块注释可以嵌套,就象 SQL99 里说的那样, 但和 C 不一样,因此我们可以注释掉一大块已经包含块注释的代码。 注释在进一步的语法分析之前被从输入流删除并用空白代替。

语法的优先级(Lexical Precedence)

Table 4-2 shows the precedence and associativity of the operators in PostgreSQL. Most operators have the same precedence and are left-associative. The precedence and associativity of the operators is hard-wired into the parser. This can lead to non-intuitive behavior; for example the Boolean operators < and > have a different precedence than the Boolean operators <= and >=. Also, you will sometimes need to add parentheses when using combinations of binary and unary operators. For instance:

Table 4-2 显示了 PostgreSQL 里面的操作符的优先级和关联性。 大多数操作符都有相同的优先级 并且都是左关联的。 这种情况可能会有不那么直观的行为;比如,布尔操作符 < 和 > 和布尔操作符 <= 和 >= 之间有着不同的优先级。同样,当你把双目和单目操作符组合使用的时候, 有时候也需要加圆括 弧。比如

SELECT 5 ! - 6;

will be parsed as:


SELECT 5 ! (- 6);

because the parser has no idea — until it is too late — that ! is defined as a postfix operator, not an infix one. To get the desired behavior in this case, you must write:

因为分析器不知道 ! 定义成了后缀操作符, 而不是中缀操作符。— 知道的时候只能是太晚了 — 要在本 例中获得你需要的特性,你要写成

SELECT (5 !) - 6;

This is the price one pays for extensibility.


Table 4-2. Operator Precedence (decreasing)

Operator/Element Associativity Description
. left 表/列名字分隔
:: left PostgreSQL-风格的类型转换
[ ] left 数组元素的拣选
- right 一元减
^ left
* / % left 乘法,除法,模
+ - left 加,减
ISNULL 检测是否为 null
NOTNULL 检测是否不为 null
(any other) left all other native and user-defined operators
IN 集员
OVERLAPS 重叠的时间间隔
< > 小于,大于
= right 等于 ,附值
NOT right 逻辑否定
AND left 逻辑与
OR left 逻辑或

Note that the operator precedence rules also apply to user-defined operators that have the same names as the built-in operators mentioned above. For example, if you define a "+" operator for some custom data type it will have the same precedence as the built-in "+" operator, no matter what yours does.

请注意操作符优先级也适用于和上面提到的同名的内置操作符用户定义操作符。 比如,如果你为一些客 户数据类型定义一个 "+" 操作符, 那么它和内置的 "+" 操作符有同样的优先级,不管你干了什么。

When a schema-qualified operator name is used in the OPERATOR syntax, as for example in:

如果在 OPERATOR 语法里使用了模式修饰的操作符名, 比如

   SELECT 3 OPERATOR(pg_catalog.+) 4;

the OPERATOR construct is taken to have the default precedence shown in Table 4-2 for "any other" operator. This is true no matter which specific operator appears inside OPERATOR().

那么 OPERATOR 构造就会有 Table 4-2 表里面为"任何其它"操作符显示的缺省优先级。 不管什么特 定的操作符出现在 OPERATOR()里,都是这样。

值表达式(Value Expressions)

Value expressions are used in a variety of contexts, such as in the target list of the SELECT command, as new column values in INSERT or UPDATE, or in search conditions in a number of commands. The result of a value expression is sometimes called a scalar, to distinguish it from the result of a table expression (which is a table). Value expressions are therefore also called scalar expressions (or even simply expressions). The expression syntax allows the calculation of values from primitive parts using arithmetic, logical, set, and other operations.

值表达式用在各种语法环境中,比如在 SELECT 命令的目标列表中,在 INSERT 或 UPDATE 中用做 新的列值,或者在许多命令中的搜索条件中使用。 我们有时候把值表达式的结果叫做标量, 以便与一个 表表达式的结果相区别(是一个表)。因此值表达式也叫做标量表达式 (或者更简单的表达式)。表达 式语法允许对来自基本部分的数值进行算术,逻辑,集合,和其它操作的运算。

A value expression is one of the following:


A constant or literal value

• 一个常量或者文本值。

A column reference

• 一个字段引用。

A positional parameter reference, in the body of a function definition or prepared statement

• 一个位置参数引用,在函数声明体中。

A subscripted expression

• 一个下标表达式

A field selection expression

• 一个区域性拣选表达式

An operator invocation

• 一个操作符调用

A function call

• 一个函数调用。

An aggregate expression

• 一个标量子查询。

A window function call


A type cast

• 一个类型转换

A collation expression

• 一个集合表达式

A scalar subquery

• 一个标量子查询。

An array constructor

• 一个数组构造

A row constructor

• 一个行构造

Another value expression in parentheses (used to group subexpressions and override precedence)


In addition to this list, there are a number of constructs that can be classified as an expression but do not follow any general syntax rules. These generally have the semantics of a function or operator and are explained in the appropriate location in Chapter 9. An example is the IS NULL clause.

除了这个列表以外,还有许多构造可以归类为表达式,但是不遵循任何通用的语法规则。 它们通常有函 数或操作符的语义,并且在 Chapter 9 里合适的位置描述。 一个例子是 IS NULL 子句。

We have already discussed constants in Section 4.1.2. The following sections discuss the remaining options.

我们已经在 Section 4.1.2 里有讨论过的内容了。下面的节讨论剩下的选项。

列引用( Column References)

A column can be referenced in the form:



correlation is the name of a table (possibly qualified with a schema name), or an alias for a table defined by means of a FROM clause. The correlation name and separating dot can be omitted if the column name is unique across all the tables being used in the current query. (See also Chapter 7.)

correlation 是一个表的名字(可能有模式修饰), 或者是用FROM子句这样的方法定义的表的别 名,或者是关键字 NEW 或 OLD。 (NEW和 OLD只能出现在一条改写规则中, 而其他相关的名字可以 用于任意 SQL 语句中。) 如果在当前查询中所使用的所有表中,该字段名字是唯一的, 那么这个相关 名字和分隔用的点就可以省略。 (又见 Chapter 7。)

位置参数(Positional Parameters)

A positional parameter reference is used to indicate a value that is supplied externally to an SQL statement. Parameters are used in SQL function definitions and in prepared queries. Some client libraries also support specifying data values separately from the SQL command string, in which case parameters are used to refer to the out-of-line data values. The form of a parameter reference is:

位置参数引用用于标识从外部给一个 SQL 语句的一个参数。 参数用于 SQL 函数定义语句和准备好的 查询。 有些客户端库还支持在 SQL 命令字串外边声明数据值,这种情况下参数用于引用 SQL 字串行 外的数据。 一个参数的形式如下:


For example, consider the definition of a function, dept, as:

比如,看看一个函数 dept 的定义, 如下

    AS $$ SELECT * FROM dept WHERE name = $1 $$

Here the $1 references the value of the first function argument whenever the function is invoked.

在函数被调用的时候这里的 $1 将引用第一个函数的参数。


If an expression yields a value of an array type, then a specific element of the array value can be extracted by writing



or multiple adjacent elements (an "array slice") can be extracted by writing



(Here, the brackets [ ] are meant to appear literally.) Each subscript is itself an expression, which must yield an integer value.

(在这里,方括弧 [ ] 的意思是按照字面文本的方式出现。) 每个subscript自己都是一个表达式, 它必须生成一个整数值。

In general the array expression must be parenthesized, but the parentheses can be omitted when the expression to be subscripted is just a column reference or positional parameter. Also, multiple subscripts can be concatenated when the original array is multidimensional. For example:

通常,数组 expression 必须用圆括弧包围, 但如果要进行脚标计算的表达式只是一个字段引用或者 一个位置参数,那么圆括弧可以省略。 同样,如果源数组是多维的,那么多个脚标可以连接在一起。比 如,


The parentheses in the last example are required. See Section 8.14 for more about arrays.

最后一个例子里的圆括弧是必须的。参阅 Section 8.10 获取有关数组的更多信息。

区域性拣选(Field Selection)

If an expression yields a value of a composite type (row type), then a specific field of the row can be extracted by writing



In general the row expression must be parenthesized, but the parentheses can be omitted when the expression to be selected from is just a table reference or positional parameter. For example:

通常,行 expression 必须用圆括弧包围, 但是如果要选取的表达式只是一个表引用或者位置参数, 可以省略圆括弧。 比如


(Thus, a qualified column reference is actually just a special case of the field selection syntax.) (因此,一个全称的字段引用实际上只是一个字段选择语法的特例。)

An important special case is extracting a field from a table column that is of a composite type: 一个重要的特殊情况是从一个表中的列中提取一个字段,这是一个复合类型:


The parentheses are required here to show that compositecol is a column name not a table name, or that mytable is a table name not a schema name in the second case.


In a select list (see Section 7.3), you can ask for all fields of a composite value by writing .*:

在一个拣选列表中(查看 Section 7.3),你可以通过书写.*获得复合值的字段:


操作符的调用(Operator Invocations)

There are three possible syntaxes for an operator invocation:


expression operator expression (binary infix operator)
operator expression (unary prefix operator)
expression operator (unary postfix operator)

where the operator token follows the syntax rules of Section 4.1.3, or is one of the key words AND, OR, and NOT, or is a qualified operator name in the form:

这里的 operator 记号遵循语法规则: Section 4.1.3, 或者是记号:AND, OR,和 NOT 之一。 或者是一个被修饰的操作符名


Which particular operators exist and whether they are unary or binary depends on what operators have been defined by the system or the user. Chapter 9 describes the built-in operators.

具体存在哪个操作符以及它们是单目还是双目取决于系统或用户定义了什么操作符。Chapter 9 描述 了内置的操作符。

函数调用 (Function Calls)

The syntax for a function call is the name of a function (possibly qualified with a schema name), followed by its argument list enclosed in parentheses:

函数调用的语法是合法函数名字(可能有模式名修饰), 后面跟着在圆括弧里的它的参数列表:

function_name ([expression [, expression ... ]] )

For example, the following computes the square root of 2:

比如,下面的代码计算 2 的平方根:


The list of built-in functions is in Chapter 9. Other functions can be added by the user.

内置函数的列表在 Chapter 9 里。 其它函数可以由用户添加。

The arguments can optionally have names attached. See Section 4.3 for details.

参数可以附带一个可选名字请参见Section 4.3。 Note: A function that takes a single argument of composite type can optionally be called using field-selection syntax, and conversely field selection can be written in functional style. That is, the notations col(table) and table.col are interchangeable. This behavior is not SQL-standard but is provided in PostgreSQL because it allows use of functions to emulate "computed fields". For more information see Section 35.4.2.

注意: 一个函数有一个单一复合类型的参数,可以使用字段-拣选语法调用,反之字段拣选可以写在函数式风格。也就是说,表達方式为 列(表)和表。列是通用的。这种行为是不是SQL标准,但PostgreSQL提供了因为他充许函数模拟“计算字段”。更多的信息查看 Section 35.4.2

聚合表达式(Aggregate Expressions)

An aggregate expression represents the application of an aggregate function across the rows selected by a query. An aggregate function reduces multiple inputs to a single output value, such as the sum or average of the inputs. The syntax of an aggregate expression is one of the following:

一个聚集表达式代表一个聚集函数对一个查询选出的行的处理。 一个聚集函数把多个输入缩减为一个输 出值, 比如给输入求和或平均。一个聚集表达式的语法是下列之一:

aggregate_name (expression [ , ... ] [ order_by_clause ] ) aggregate_name (ALL expression [ , ... ] [ order_by_clause ] ) aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) aggregate_name ( * )

where aggregate_name is a previously defined aggregate (possibly qualified with a schema name), expression is any value expression that does not itself contain an aggregate expression or a window function call, and order_by_clause is a optional ORDER BY clause as described below.

这里 aggregate_name 是前面定义的聚集,(可能是全称), 而 expression 是一个本身不包 含聚集表达式的任意值表达式。

The first form of aggregate expression invokes the aggregate once for each input row. The second form is the same as the first, since ALL is the default. The third form invokes the aggregate once for each distinct value of the expression (or distinct set of values, for multiple expressions) found in the input rows. The last form invokes the aggregate once for each input row; since no particular input value is specified, it is generally only useful for the count(*) aggregate function.

第一种形式的聚集表达式为所有表达式生成非空值的输入行调用聚集。 (实际上,是否忽略空值由聚集 函数决定 — 但是所有标准的聚集函数都忽略它们。) 第二种形式和第一种一样,因为 ALL 是缺省值。 第三种形式为所有输入行里找到表达式的所有唯一的非空值调用聚集。 最后一种形式为每个输入行(不 管是空还是非空)调用一次聚集; 因为没有声明特定的输入值。通常它只是对 count(*) 聚集函数有用。

Most aggregate functions ignore null inputs, so that rows in which one or more of the expression(s) yield null are discarded. This can be assumed to be true, unless otherwise specified, for all built-in aggregates.

For example, count(*) yields the total number of input rows; count(f1) yields the number of input rows in which f1 is non-null, since count ignores nulls; and count(distinct f1) yields the number of distinct non-null values of f1.

比如,count(*) 生成输入行的总数; count(f1) 生成 f1 为非空的输入行数; count(distinct f1) 生成 f1 唯一非空的行数。

Ordinarily, the input rows are fed to the aggregate function in an unspecified order. In many cases this does not matter; for example, min produces the same result no matter what order it receives the inputs in. However, some aggregate functions (such as array_agg and string_agg) produce results that depend on the ordering of the input rows. When using such an aggregate, the optional order_by_clause can be used to specify the desired ordering. The order_by_clause has the same syntax as for a query-level ORDER BY clause, as described in Section 7.5, except that its expressions are always just expressions and cannot be output-column names or numbers. For example:

SELECT array_agg(a ORDER BY b DESC) FROM table; When dealing with multiple-argument aggregate functions, note that the ORDER BY clause goes after all the aggregate arguments. For example, write this:

SELECT string_agg(a, ',' ORDER BY a) FROM table; not this:

SELECT string_agg(a ORDER BY a, ',') FROM table; -- incorrect The latter is syntactically valid, but it represents a call of a single-argument aggregate function with two ORDER BY keys (the second one being rather useless since it's a constant).

If DISTINCT is specified in addition to an order_by_clause, then all the ORDER BY expressions must match regular arguments of the aggregate; that is, you cannot sort on an expression that is not included in the DISTINCT list.

Note: The ability to specify both DISTINCT and ORDER BY in an aggregate function is a PostgreSQL extension.

The predefined aggregate functions are described in Section 9.18. Other aggregate functions can be added by the user.

An aggregate expression can only appear in the result list or HAVING clause of a SELECT command. It is forbidden in other clauses, such as WHERE, because those clauses are logically evaluated before the results of aggregates are formed.

When an aggregate expression appears in a subquery (see Section 4.2.11 and Section 9.20), the aggregate is normally evaluated over the rows of the subquery. But an exception occurs if the aggregate's arguments contain only outer-level variables: the aggregate then belongs to the nearest such outer level, and is evaluated over the rows of that query. The aggregate expression as a whole is then an outer reference for the subquery it appears in, and acts as a constant over any one evaluation of that subquery. The restriction about appearing only in the result list or HAVING clause applies with respect to the query level that the aggregate belongs to.

如果一个聚集表达式出现在一个子查询里(参阅 Section 4.2.9 和 Section 9.16), 聚集通常是 在子查询的行上进行计算。但是如果聚集的参数只包含外层查询的变量则有一个例外: 这个聚集会属于 离他最近的外层查询,并且在该查询上进行计算。 该聚集表达式整体上属于它出现的子查询对外层查询 的引用,其作用相当于子查询任何一次计算中的一个常量。 这个聚集表达式的有关只能出现在结果列或 者 HAVING 子句的限制适用于聚集所属的查询层。

窗体函数的调用(Window Function Calls)

A window function call represents the application of an aggregate-like function over some portion of the rows selected by a query. Unlike regular aggregate function calls, this is not tied to grouping of the selected rows into a single output row — each row remains separate in the query output. However the window function is able to scan all the rows that would be part of the current row's group according to the grouping specification (PARTITION BY list) of the window function call. The syntax of a window function call is one of the following:

function_name ([expression [, expression ... ]]) OVER ( window_definition ) function_name ([expression [, expression ... ]]) OVER window_name function_name ( * ) OVER ( window_definition ) function_name ( * ) OVER window_name where window_definition has the syntax

[ existing_window_name ] [ PARTITION BY expression [, ...] ] [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ] [ frame_clause ] and the optional frame_clause can be one of

[ RANGE | ROWS ] frame_start [ RANGE | ROWS ] BETWEEN frame_start AND frame_end where frame_start and frame_end can be one of

UNBOUNDED PRECEDING value PRECEDING CURRENT ROW value FOLLOWING UNBOUNDED FOLLOWING Here, expression represents any value expression that does not itself contain window function calls. The PARTITION BY and ORDER BY lists have essentially the same syntax and semantics as GROUP BY and ORDER BY clauses of the whole query, except that their expressions are always just expressions and cannot be output-column names or numbers. window_name is a reference to a named window specification defined in the query's WINDOW clause. Named window specifications are usually referenced with just OVER window_name, but it is also possible to write a window name inside the parentheses and then optionally supply an ordering clause and/or frame clause (the referenced window must lack these clauses, if they are supplied here). This latter syntax follows the same rules as modifying an existing window name within the WINDOW clause; see the SELECT reference page for details.

The frame_clause specifies the set of rows constituting the window frame, for those window functions that act on the frame instead of the whole partition. If frame_end is omitted it defaults to CURRENT ROW. Restrictions are that frame_start cannot be UNBOUNDED FOLLOWING, frame_end cannot be UNBOUNDED PRECEDING, and the frame_end choice cannot appear earlier in the above list than the frame_start choice — for example RANGE BETWEEN CURRENT ROW AND value PRECEDING is not allowed. The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; it sets the frame to be all rows from the partition start up through the current row's last peer in the ORDER BY ordering (which means all rows if there is no ORDER BY). In general, UNBOUNDED PRECEDING means that the frame starts with the first row of the partition, and similarly UNBOUNDED FOLLOWING means that the frame ends with the last row of the partition (regardless of RANGE or ROWS mode). In ROWS mode, CURRENT ROW means that the frame starts or ends with the current row; but in RANGE mode it means that the frame starts or ends with the current row's first or last peer in the ORDER BY ordering. The value PRECEDING and value FOLLOWING cases are currently only allowed in ROWS mode. They indicate that the frame starts or ends with the row that many rows before or after the current row. value must be an integer expression not containing any variables, aggregate functions, or window functions. The value must not be null or negative; but it can be zero, which selects the current row itself.

The built-in window functions are described in Table 9-44. Other window functions can be added by the user. Also, any built-in or user-defined aggregate function can be used as a window function.

The syntaxes using * are used for calling parameter-less aggregate functions as window functions, for example count(*) OVER (PARTITION BY x ORDER BY y). * is customarily not used for non-aggregate window functions. Aggregate window functions, unlike normal aggregate functions, do not allow DISTINCT or ORDER BY to be used within the function argument list.

Window function calls are permitted only in the SELECT list and the ORDER BY clause of the query.

More information about window functions can be found in Section 3.5, Section 9.19, Section 7.2.4.

类型的转换(Type Casts)

A type cast specifies a conversion from one data type to another. PostgreSQL accepts two equivalent syntaxes for type casts:

一个类型转换声明一个从一种数据类型到另外一种数据类型的转换。 PostgreSQL 接受两种等效的类 型转换语法:

CAST ( expression AS type )

The CAST syntax conforms to SQL; the syntax with :: is historical PostgreSQL usage.

CAST 语法遵循 SQL;:: 的语法是 PostgreSQL 传统用法。

When a cast is applied to a value expression of a known type, it represents a run-time type conversion. The cast will succeed only if a suitable type conversion operation has been defined. Notice that this is subtly different from the use of casts with constants, as shown in Section A cast applied to an unadorned string literal represents the initial assignment of a type to a literal constant value, and so it will succeed for any type (if the contents of the string literal are acceptable input syntax for the data type).

如果对一个已知类型的值表达式应用转换,它代表一个运行时类型转换。 只有在定义了合适的类型转换 操作的情况下,该转换才能成功。 请注意这一点和用于常量的转换略有区别,如 Section 所 示。 一个应用于某个未修饰的字串文本的转换表示给一个字串文本数值赋予一个初始化类型, 因此它对 于任何类型都会成功(如果字串文本的内容符合该数据类型的输入语法接受。)

An explicit type cast can usually be omitted if there is no ambiguity as to the type that a value expression must produce (for example, when it is assigned to a table column); the system will automatically apply a type cast in such cases. However, automatic casting is only done for casts that are marked "OK to apply implicitly" in the system catalogs. Other casts must be invoked with explicit casting syntax. This restriction is intended to prevent surprising conversions from being applied silently.

如果对于一个值表达式生成的数值对某类型而言不存在混淆的情况, 那么我们可以省略明确的类型转换 (比如,在给一个表字段赋值的时候); 在这样的情况下,系统将自动附加一个类型转换。 不过,自动 转换只适用于那些系统表中标记着 "OK to apply implicitly" 的转换函数。 其它转换函数必须用明 确的转换语法调用。 这些限制是为了避免一些怪异的转换被应用。

It is also possible to specify a type cast using a function-like syntax:


typename ( expression )

However, this only works for types whose names are also valid as function names. For example, double precision cannot be used this way, but the equivalent float8 can. Also, the names interval, time, and timestamp can only be used in this fashion if they are double-quoted, because of syntactic conflicts. Therefore, the use of the function-like cast syntax leads to inconsistencies and should probably be avoided.

不过,这个方法只能用于那些名字同时也是有效函数名字的类型。 比如,double precision 就不 能这么用, 但是等效的 float8 可以。同样,interval, time,和 timestamp 如果加了双引 号也只能这么用, 因为存在语法冲突。因此,函数样的类型转换会导致不一致, 所以可能应该避免在新 应用中这么用。

Note: The function-like syntax is in fact just a function call. When one of the two standard cast syntaxes is used to do a run-time conversion, it will internally invoke a registered function to perform the conversion. By convention, these conversion functions have the same name as their output type, and thus the "function-like syntax" is nothing more than a direct invocation of the underlying conversion function. Obviously, this is not something that a portable application should rely on. For further details see CREATE CAST.

(函数样语法实际上就似乎一个函数调用。如果使用两种标准转换语法做运行时转换, 那么它将在内部调用一个已注册得函数执行转换。通常, 这种转换函数和它们得输出类型同名,但是这 个要点可不是那些可以移植的程序可以依赖的东西。)

校对表达式(Collation Expressions)

The COLLATE clause overrides the collation of an expression. It is appended to the expression it applies to:

expr COLLATE collation

where collation is a possibly schema-qualified identifier. The COLLATE clause binds tighter than operators; parentheses can be used when necessary.

If no collation is explicitly specified, the database system either derives a collation from the columns involved in the expression, or it defaults to the default collation of the database if no column is involved in the expression.

The two common uses of the COLLATE clause are overriding the sort order in an ORDER BY clause, for example:


and overriding the collation of a function or operator call that has locale-sensitive results, for example:

SELECT * FROM tbl WHERE a > 'foo' COLLATE "C";

Note that in the latter case the COLLATE clause is attached to an input argument of the operator we wish to affect. It doesn't matter which argument of the operator or function call the COLLATE clause is attached to, because the collation that is applied by the operator or function is derived by considering all arguments, and an explicit COLLATE clause will override the collations of all other arguments. (Attaching non-matching COLLATE clauses to more than one argument, however, is an error. For more details see Section 22.2.) Thus, this gives the same result as the previous example:

SELECT * FROM tbl WHERE a COLLATE "C" > 'foo';

But this is an error:

SELECT * FROM tbl WHERE (a > 'foo') COLLATE "C";

because it attempts to apply a collation to the result of the > operator, which is of the non-collatable data type boolean.

标量子查询(Scalar Subqueries)

A scalar subquery is an ordinary SELECT query in parentheses that returns exactly one row with one column. (See Chapter 7 for information about writing queries.) The SELECT query is executed and the single returned value is used in the surrounding value expression. It is an error to use a query that returns more than one row or more than one column as a scalar subquery. (But if, during a particular execution, the subquery returns no rows, there is no error; the scalar result is taken to be null.) The subquery can refer to variables from the surrounding query, which will act as constants during any one evaluation of the subquery. See also Section 9.20 for other expressions involving subqueries.

一个标量子查询是一个放在圆括弧里的普通 SELECT查询, 它只返回只有一个字段的一行。(参阅 Chapter 7 获取有关写查询的信息。) 该 SELECT 将被执行, 而其单个返回值将在周围的值表达式 中使用。 把一个返回超过一行或者超过一列的查询用做标量查询是错误的。 (不过,在特定的执行中, 子查询不返回行则不算错误;标量结果认为是NULL。) 该子查询可以引用周围查询的变量,那些变 量也是在计算任意子查询的时候当做常量使用的。 又见 Section 9.16。

For example, the following finds the largest city population in each state:


SELECT name, (SELECT max(pop) FROM cities WHERE cities.state = states.name)
    FROM states;

数组构造(Array Constructors)

An array constructor is an expression that builds an array value using values for its member elements. A simple array constructor consists of the key word ARRAY, a left square bracket [, a list of expressions (separated by commas) for the array element values, and finally a right square bracket ]. For example:

一个数组构造器是一个表达式,它从它的成员元素上构造一个数组值。 一个简单的数组构造器由关键字 ARRAY,一个左方括弧 [, 一个或多个表达式(用逗号分隔)表示数组元素值,以及最后一个右方括弧 ]。 比如

SELECT ARRAY[1,2,3+4];
(1 row)

By default, the array element type is the common type of the member expressions, determined using the same rules as for UNION or CASE constructs (see Section 10.5). You can override this by explicitly casting the array constructor to the desired type, for example:

SELECT ARRAY[1,2,22.7]::integer[];
(1 row)

This has the same effect as casting each expression to the array element type individually. For more on casting, see Section 4.2.9.

Multidimensional array values can be built by nesting array constructors. In the inner constructors, the key word ARRAY can be omitted. For example, these produce the same result:

多维数组值可以通过嵌套数组构造器的方法来制作。 在内层构造器里,关键字 ARRAY 可以省略。比如, 下面的两句生成同样的结果:

(1 row)

SELECT ARRAY[[1,2],[3,4]];
(1 row)

Since multidimensional arrays must be rectangular, inner constructors at the same level must produce sub-arrays of identical dimensions. Any cast applied to the outer ARRAY constructor propagates automatically to all the inner constructors.


Multidimensional array constructor elements can be anything yielding an array of the proper kind, not only a sub-ARRAY construct. For example:

多维数组构造器元素可以是任何生成合适数组的东西,而不仅仅是一个子 ARRAY 构造。 比如:

CREATE TABLE arr(f1 int[], f2 int[]);

INSERT INTO arr VALUES (ARRAY[[1,2],[3,4]], ARRAY[[5,6],[7,8]]);

SELECT ARRAY[f1, f2, '{{9,10},{11,12}}'::int[]] FROM arr;
(1 row)

You can construct an empty array, but since it's impossible to have an array with no type, you must explicitly cast your empty array to the desired type. For example:

我们也可以从一个子查询的结果中构造一个数组。在这种形式下, 数组构造器是用关键字 ARRAY 后面 跟着一个用圆括弧(不是方括弧)包围的子查询。 比如:

SELECT ARRAY[]::integer[];
(1 row)

It is also possible to construct an array from the results of a subquery. In this form, the array constructor is written with the key word ARRAY followed by a parenthesized (not bracketed) subquery. For example:

SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
(1 row)

The subquery must return a single column. The resulting one-dimensional array will have an element for each row in the subquery result, with an element type matching that of the subquery's output column.

子查询必须返回一个字段。生成的一维数组将为子查询里每行结果生成一个元素, 元素类型匹配子查询 的输出字段。

The subscripts of an array value built with ARRAY always begin with one. For more information about arrays, see Section 8.14.

用 ARRAY 建立的数组值的脚标总是从一开始。 有关数组的更多信息,参阅 Section 8.14。

行构造( Row Constructors)

A row constructor is an expression that builds a row value (also called a composite value) using values for its member fields. A row constructor consists of the key word ROW, a left parenthesis, zero or more expressions (separated by commas) for the row field values, and finally a right parenthesis. For example:

一个行构造器是一个从提供给它的成员字段数值中制作行数值(也叫复合类型值)的表达式。 一个行构 造器由关键字 ROW,一个左圆括弧, 零个或者多个用做行字段值的表达式(用逗号分隔),以及最后一 个右圆括弧。比如,

SELECT ROW(1,2.5,'this is a test');

The key word ROW is optional when there is more than one expression in the list.

如果在列表里有多个表达式,那么关键字 ROW 是可选的。

A row constructor can include the syntax rowvalue.*, which will be expanded to a list of the elements of the row value, just as occurs when the .* syntax is used at the top level of a SELECT list. For example, if table t has columns f1 and f2, these are the same:

缺省时,ROW 表达式创建的值是一个匿名的记录类型。如果必要,你可以把它转换成一个命名的复合类 型 — 既可以是一个表的行类型,也可以是一个用 CREATE TYPE AS 创建的复合类型。 可能会需要 一个明确的转换以避免歧义。比如:

SELECT ROW(t.*, 42) FROM t;
SELECT ROW(t.f1, t.f2, 42) FROM t;

Note: Before PostgreSQL 8.2, the .* syntax was not expanded, so that writing ROW(t.*, 42) created a two-field row whose first field was another row value. The new behavior is usually more useful. If you need the old behavior of nested row values, write the inner row value without .*, for instance ROW(t, 42).

By default, the value created by a ROW expression is of an anonymous record type. If necessary, it can be cast to a named composite type — either the row type of a table, or a composite type created with CREATE TYPE AS. An explicit cast might be needed to avoid ambiguity. For example:

CREATE TABLE mytable(f1 int, f2 float, f3 text);


-- No cast needed since only one getf1() exists

-- 因为只有一个 getf1() 存在,所以不需要类型转换

SELECT getf1(ROW(1,2.5,'this is a test'));
(1 row)

CREATE TYPE myrowtype AS (f1 int, f2 text, f3 numeric);


-- Now we need a cast to indicate which function to call:

-- 现在我们需要类型转换以表明调用哪个函数:

SELECT getf1(ROW(1,2.5,'this is a test'));

ERROR: function getf1(record) is not unique

SELECT getf1(ROW(1,2.5,'this is a test')::mytable);
(1 row)

SELECT getf1(CAST(ROW(11,'this is a test',2.5) AS myrowtype));
(1 row)

Row constructors can be used to build composite values to be stored in a composite-type table column, or to be passed to a function that accepts a composite parameter. Also, it is possible to compare two row values or test a row with IS NULL or IS NOT NULL, for example:

行构造器可以用于制作存储在复合类型表字段里面的复合类型值, 或者是传递给一个接受复合类型参数 的函数。还有,我们也可以比较两个行数值或者用 IS NULL 或 IS NOT NULL 测试一个行数值,比 如

SELECT ROW(1,2.5,'this is a test') = ROW(1, 3, 'not the same');

SELECT ROW(table.*) IS NULL FROM table;  -- detect all-null rows

For more detail see Section 9.21. Row constructors can also be used in connection with subqueries, as discussed in Section 9.20.

更多的细节,请参阅 Section 9.21。 行构造还可以用于连接子查询,这些在 Section 9.20 里面有 详细讨论。

表达式的值的规则(Expression Evaluation Rules)

The order of evaluation of subexpressions is not defined. In particular, the inputs of an operator or function are not necessarily evaluated left-to-right or in any other fixed order.

子表达式的计算顺序是没有定义的。特别要指出的是, 一个操作符或者函数的输入并不一定是按照从左 向右的顺序或者以某种特定的顺序进行计算的。

Furthermore, if the result of an expression can be determined by evaluating only some parts of it, then other subexpressions might not be evaluated at all. For instance, if one wrote:

另外,如果一个表达式的结果可以通过只判断它的一部分就可以得到, 那么其它子表达式就可以完全不 计算了。比如,如果我们这么写

SELECT true OR somefunc();

then somefunc() would (probably) not be called at all. The same would be the case if one wrote:

那么 somefunc() 就(可能)根本不会被调用。 如果我们写下面的,也可能会是这样

SELECT somefunc() OR true;

Note that this is not the same as the left-to-right "short-circuiting" of Boolean operators that is found in some programming languages.


As a consequence, it is unwise to use functions with side effects as part of complex expressions. It is particularly dangerous to rely on side effects or evaluation order in WHERE and HAVING clauses, since those clauses are extensively reprocessed as part of developing an execution plan. Boolean expressions (AND/OR/NOT combinations) in those clauses can be reorganized in any manner allowed by the laws of Boolean algebra.

因此,拿那些有副作用的函数作为复杂表达式的一部分是不明智的选择。 在 WHERE 和 HAVING 子句 里面依赖副作用或者是计算顺序是特别危险的, 因为这些子句都是作为生成一个执行规划的一部分进行 了大量的再处理。 在这些子句里的布尔表达式(AND/OR/NOT 的组合)可以以布尔代数运算律允许的 任意方式进行识别。

When it is essential to force evaluation order, a CASE construct (see Section 9.16) can be used. For example, this is an untrustworthy way of trying to avoid division by zero in a WHERE clause:

如果强制计算顺序非常重要,那么可以使用 CASE 构造(参阅 Section 9.13)。 比如,下面是一种 视图避免在 WHERE 子句里被零除的不可信的方法:

SELECT ... WHERE x > 0 AND y/x > 1.5;

But this is safe:

但是下面这样的是安全的: SELECT ... WHERE CASE WHEN x > 0 THEN y/x > 1.5 ELSE false END;

A CASE construct used in this fashion will defeat optimization attempts, so it should only be done when necessary. (In this particular example, it would be better to sidestep the problem by writing y > 1.5*x instead.)

用这种风格的 CASE 构造会阻止优化,因此应该只在必要的时候使用。 (在这个特殊的例子里,毫无疑 问写成 y > 1.5*x 更好。)

调用函数(Calling Functions)

PostgreSQL allows functions that have named parameters to be called using either positional or named notation. Named notation is especially useful for functions that have a large number of parameters, since it makes the associations between parameters and actual arguments more explicit and reliable. In positional notation, a function call is written with its argument values in the same order as they are defined in the function declaration. In named notation, the arguments are matched to the function parameters by name and can be written in any order.

In either notation, parameters that have default values given in the function declaration need not be written in the call at all. But this is particularly useful in named notation, since any combination of parameters can be omitted; while in positional notation parameters can only be omitted from right to left.

PostgreSQL also supports mixed notation, which combines positional and named notation. In this case, positional parameters are written first and named parameters appear after them.

The following examples will illustrate the usage of all three notations, using the following function definition:

CREATE FUNCTION concat_lower_or_upper(a text, b text, uppercase boolean DEFAULT false)
        WHEN $3 THEN UPPER($1 || ' ' || $2)
        ELSE LOWER($1 || ' ' || $2)

Function concat_lower_or_upper has two mandatory parameters, a and b. Additionally there is one optional parameter uppercase which defaults to false. The a and b inputs will be concatenated, and forced to either upper or lower case depending on the uppercase parameter. The remaining details of this function definition are not important here (see Chapter 35 for more information).

使用位置标记(Using Positional Notation)

Positional notation is the traditional mechanism for passing arguments to functions in PostgreSQL. An example is:

SELECT concat_lower_or_upper('Hello', 'World', true);
(1 row)

All arguments are specified in order. The result is upper case since uppercase is specified as true. Another example is:

SELECT concat_lower_or_upper('Hello', 'World');
 hello world
(1 row)

Here, the uppercase parameter is omitted, so it receives its default value of false, resulting in lower case output. In positional notation, arguments can be omitted from right to left so long as they have defaults.

使用命名标记 (Using Named Notation)

In named notation, each argument's name is specified using := to separate it from the argument expression. For example:

SELECT concat_lower_or_upper(a := 'Hello', b := 'World');
 hello world
(1 row)

Again, the argument uppercase was omitted so it is set to false implicitly. One advantage of using named notation is that the arguments may be specified in any order, for example:

SELECT concat_lower_or_upper(a := 'Hello', b := 'World', uppercase := true);
(1 row)
SELECT concat_lower_or_upper(a := 'Hello', uppercase := true, b := 'World');
(1 row)

使用混合标记(Using Mixed Notation)

The mixed notation combines positional and named notation. However, as already mentioned, named arguments cannot precede positional arguments. For example:


SELECT concat_lower_or_upper('Hello', 'World', uppercase := true);
(1 row)

In the above query, the arguments a and b are specified positionally, while uppercase is specified by name. In this example, that adds little except documentation. With a more complex function having numerous parameters that have default values, named or mixed notation can save a great deal of writing and reduce chances for error.