PostgreSQL wiki - User contributions [en]

FOSDEM/PGDay 2017 Developer Meeting

2017-01-04T14:27:28Z

Hanada:

A meeting of the interested PostgreSQL developers is being planned for Thursday 2nd February, 2017 at the Brussels Marriott Hotel, prior to FOSDEM/PGDay 2017. In order to keep the numbers manageable, this meeting is by '''invitation only'''. Unfortunately it is quite possible that we've overlooked important individuals during the planning of the event - if you feel you fall into this category and would like to attend, please contact Dave Page (dpage@pgadmin.org).

Please note that the attendee numbers have been kept low in order to keep the meeting more productive. Invitations have been sent only to developers who have been highly active on the database server over the 9.6 and 10 release cycles. We have not invited any contributors based on their contributions to related projects, or seniority in regional user groups or sponsoring companies.

This is a PostgreSQL Community event.

== Meeting Goals ==

* Review the progress of the 10.0 schedule, and formulate plans to address any issues
* Address any proposed timing, policy, or procedure issues
* Address any proposed [http://en.wikipedia.org/wiki/Wicked_problem Wicked problems]

== Time & Location ==

The meeting will be:

* 9:00AM to 5:00PM
* Brussels Marriott Hotel

Coffee, tea and snacks will be served starting at 8:45am. Lunch will be provided.

== RSVPs ==

The following people have RSVPed to the meeting (in alphabetical order, by surname) and will be attending:

* Oleg Bartunov
* Andrew Dunstan
* Etsuro Fujita
* Magnus Hagander
* Petr Jelinek
* Alexander Korotkov
* Noah Misch
* Bruce Momjian
* Simon Riggs
* Dave Page
* Masahiko Sawada
* Teodor Sigaev
* Tomas Vondra

The following people have sent their apologies:

* Dimitri Fontaine
* Kyotaro Horiguchi
* Shigeru Hanada
* Amit Kapila
* Tom Lane
* Thomas Munro
* Michael Paquier
* Craig Ringer

==Agenda Items==

Please add agenda items here!

* Sharding update

* Setting up the Release Management Team for Postgres 10.0 (Simon)

==Agenda==

{| border="1" cellpadding="4" cellspacing="0"
!Time
!Item
!Presenter
|- style="font-style:italic;background-color:lightgray;"
|09:00 - 09:10
|Welcome and introductions
|Dave

|-
|09:10 - 09:30
|10.0 Release Schedule
|All

|- style="font-style:italic;background-color:lightgray;"
|10:30 - 11:00
|Coffee break
|All

|- style="font-style:italic;background-color:lightgray;"
|12:30 - 13:30
|Lunch
|All

|- style="font-style:italic;background-color:lightgray;"
|15:00 - 15:30
|Tea break
|All

|-
|16:30 - 17:00
|Any other business
|Dave

|- style="font-style:italic;background-color:lightgray;"
|17:00
|Finish
|
|}

== Minutes ==

PgCon2015ClusterSummit

2015-06-18T18:01:20Z

Hanada: /* pgCon 2015 Cluster Hacker Summit */ Remove duplicate item

=pgCon 2015 Cluster Hacker Summit=

This year's Cluster Hacker Summit will be part of the [[PgCon_2015_Developer_Unconference]]. As such, this page will be used to coordinate sessions to propose for the Unconference, and eventually to list an agenda for the Clustering Track.

The Cluster Summit covers both general PostgreSQL clustering, as well as PostgresXC and PostgresXL development.

As it is part of the Developer Unconference, Clustering sessions will take place starting in the afternoon of Tuesday, June 16 through 5pm on Wednesday, June 17. If you are participating, and will not be able to make it on Tuesday, please note that in your attendance comments.

== Attendee RSVPs ==

* Josh Berkus (both days)
* Koichi Suzuki
* Tatsuo Ishii
* Yugo Nagata
* Steve Singer (arrive tuesday mid-afternoon)
* Jan Wieck (arrive tuesday evening)
* Shigeru Hanada
* Ahsan Hadi
* Ashutosh Bapat
* Bruce Momjian
* Etsuro Fujita
* Tetsuo Sakata
* Amit Langote
* Kyotaro Horiguchi
* Ozgun Erdogan (Wednesday)
* Marco Slot (Wednesday)
* Simon Riggs

== Suggested Sessions ==

For each session below please provide a title and a moderator/leader/speaker for the session.

=== Pgpool-II: toward next major version 3.5 ===

Firstly we report the project progress since last year's Cluster Summit: introducing pgpool-II 3.4. Then we explain the current status of pgpool-II 3.5 which is under development.

Session Leader: Tatsuo Ishii

==== Attendees ====

==== Meeting Notes ====

=== Horizontal Scalability and Sharding ===

Session Leaders: Ahsan Hadi, Ashutosh Bapat

==== Attendees ====

==== Meeting Notes ====

=== Slony ===

What is the future for Slony development? Are users interested in a new Slony based on Logical Decoding? Who's going to work on this?

Session Leaders: Steve Singer, Chris Browne, Jan Wieck

==== Session Notes ====

Slony Development

Leader: Steve Singer, Chris Browne, Jan Wieck

Note-taker: Josh Berkus

In the past year Slony development has been pretty stagnant. Steve has worked on a prototype with Logical Decoding. It works for a demo, but not all features work, and performance was not all that impressive. Part of that is how much we've optimized the Slony log. Logical Slony has somewhat lower write overhead, but the latency for assembly on the other side is not insignificant.

The stability and features of Slony are OK, but will it survive alongside modern features of Postgres? With some features it will be different. It does something which others don't.

Chris: only takes a few hours of work per year to keep up with Postgres releases. But we've been forgetting how to do releases and management.

Jan: mentioned limitations of UDR/BDR. But most users are on older versions. Even if 9.6 has everything built in, people will still be using older versions for 5 years. Slony was built to allow upgrading. If it doesn't make sense to maintain them then we won't.

Slony is still useful for upgrading across architectures and character encodings. Josh mentioned the early stage of development of UDR. Steve said that it would take more than one year to get Slony working with Logical Decoding and to make it as stable as trigger-based replication.

Chris asked what the issues were with LD and Slony development. Jan pointed out that for partial replication ... where you're replicating a few low-volume tables ... then saving all the WAL for the replication slot is a big loss. So LD and classic Slony log table need to coexist.

Josh said there are three reasons why he wants LD: (1) easier installation, (2) bloat in the queue tables caused by long-running transactions, and (3) removal overhead.

Re: bloat Jan mentioned Grittner's snapshot-too-old work. Josh said that work is partly because of Slony. You can't even truncate segments if there's a long-running transaction. Chris mentioned the idea of per-table logs.

Moving the reporting queries onto the replica can fix this issue, but it just moves the bloat around. More discussion around this. Doing this against a non-forwarding replica would prevent bloat.

Other requests:

* Parallel initial data copy.
* Call Slony as a library (mostly Python)
* Add tables to multiple versions of databases
* No table locks
* More visibility commands via Slonik, once it's a library
* Cancel subscribe set in progress

Libraries: one way to get around the multiple library issue is to expose the function API. But according to Jan the function API is not complete. Steve would rather have Slonik as a shared library. Calling the slony SPs directly isn't safe. Dave Cramer asked if we do a Python library how are we going to call it from Java? We'll create a C shared library.

Valentin suggests exposing the shared libraries via stored procedures. Chris pointed out that the lack of autonomous transactions prevents this. But if the library is wrapped in functions it can be exposed. Or create some kind of REST API. Jan said that they're planning on supporting pgAdmin4, so an abstracted library would support this. This is a good idea. Some people use GUIs, but some don't.

Steve asked if people are using it for DR. Josh mentioned one multi-region case. Valintin said more availability than DR.

Jan said that more visibility functions for Slonik is a must once it's a library. And it would make the API more stable because you wouldn't have to use the Slony catalog.

Steve asked whether it would help if Slony took only one exclusive lock at a time. One user said no, they don't get enough downtime. But others said yes. Rod really wants an add trigger command which takes an access exclusive lock. That's a major Postgres issue in terms of locking and tables. Jan speculated how that would be possible. Valintin tries to resolve this by setting statement_timeout and having it fail. The Slony team has some idea of how to do this. This is similar to concurrent index creation. Jan will look at doing CREATE TRIGGER CONCURRENTLY. It was suggested that there should be a general lock state where you ask for a lock concurrently. This should be a function call. Kevin suggested that we should call this "deferrable" rather than "concurrent", with timeouts.

Steve appealed for people to help implement some of these things. Jan will sign up for LOCK TABLE DEFERRABLE. Josh said that he'd help test/spec the API. The biggest issue with the syntax is variadic arguments.
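The "deferrable lock with timeouts" idea can be illustrated with an in-process analogy. This is only a sketch: it uses a Python `threading.Lock` rather than the PostgreSQL lock manager, and `acquire_deferrable` and its parameters are hypothetical names, not a proposed API. The point it shows is that each attempt gives up after a bounded wait instead of queueing behind a long-running holder.

```python
import threading
import time

def acquire_deferrable(lock, attempt_timeout=0.05, retries=5, backoff=0.01):
    """Try to take `lock`; each attempt gives up after `attempt_timeout`
    seconds instead of waiting indefinitely behind the current holder.
    Returns True if the lock was taken, False if every attempt timed out."""
    for _ in range(retries):
        if lock.acquire(timeout=attempt_timeout):
            return True
        time.sleep(backoff)  # back off so other work can proceed in between
    return False

table_lock = threading.Lock()

table_lock.acquire()  # simulate a long-running holder of the table lock
busy_result = acquire_deferrable(table_lock, retries=2)

table_lock.release()  # the holder finishes
free_result = acquire_deferrable(table_lock)
```

With the lock held elsewhere the caller gets `False` back and can retry later; once the holder releases, the same call succeeds.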

Chris asked if anyone was using logshipping, and someone said yes, and they're using the daemon to apply the logs.

Chris asked about DDL. A change in 2.2 allows passing in the DDL string instead of needing a file. PostgreSQL 9.5 will have DDL event replication. Do we care about building this? Chris said the easiest change would be to allow dropping a table to automatically drop it from replication. Rejecting DDL not run through execute script would be even better. It should be an option, which you can override. Josh voted on blocking DDL.

Steve's concern with blocking DDL is that people do this for legitimate reasons. Josh mentioned stupid dev tricks where they bring slony down by running rake on the production cluster. Also a library would be compatible with rake.

Do people see Slony as still being relevant in 2-3 years if BDR/UDR succeeds? Hard to be sure, since it's not mature yet.

They then set some targets:

* Parallel initial data copy: Slony 2.3+, requires Postgres 9.3+
* Call Slony as a library (mostly Python): Slony 2.3
* Add tables to multiple versions of databases: works with library
* No table locks: Postgres 9.6+
* More visibility commands via Slonik, once it's a library: Slony 2.4?
* Cancel subscribe set in progress
* Prevent DDL: Postgres 9.4, slony 2.3

Not likely to work on Slony + Logical Decoding, because it's bigger than all of the features above. Making it stable would take years, and it doesn't perform better. Valintin is working on LD systems and building libraries on top of LD like a Python library and replication to Kafka. Steve's code is on GitHub. Particularly, being able to switch between LD and triggers would be really complex.

Josh mentioned the issue around WAL log volume from the Slony buffers. One user requested the ability to execute a command on a group of slaves. So a kind of EXECUTE SCRIPT on a group of nodes.

=== Bi Directional Replication & Logical Decoding ===

Including DDL replication, Online Upgrade, Logical Replication, Bi Directional Replication etc

Session Leaders: Simon Riggs, Andres Freund

==== Meeting Notes ====

BDR and UDR etc.

Leader: Simon

Note-taker: Josh Berkus

This will be about development of BDR, not using BDR. Will be only 10 minutes about BDR.

BDR stands for Bi-Directional Replication, has been put together across a couple years. Has been released a couple months ago. Is being used in production at some sites. Have released 0.9.1 bugfixing 0.9. But it's not 1.0 yet. We want to get it into Postgres.

BDR allows replication to flow in two directions. It's logical replication, which permits making changes to the data as it's replicated. It translates the WAL stream (transaction log stream), using Logical Decoding to take action and stream changes. It works on a commit-by-commit basis.
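As a rough illustration of what "commit-by-commit" means, the sketch below buffers interleaved change records per transaction and emits a transaction only when its commit record is seen. The record layout `(xid, kind, payload)` and the function name `decode_by_commit` are assumptions made for this example; they are not the Logical Decoding or BDR interfaces.

```python
from collections import defaultdict

def decode_by_commit(wal_records):
    """Group interleaved change records by transaction and yield each
    transaction as one unit when its commit record arrives."""
    pending = defaultdict(list)  # xid -> changes buffered so far
    for xid, kind, payload in wal_records:
        if kind == "change":
            pending[xid].append(payload)
        elif kind == "commit":
            yield xid, pending.pop(xid, [])
        elif kind == "abort":
            pending.pop(xid, None)  # rolled-back work is never streamed

# Hypothetical log: transactions 1, 2 and 3 interleave; 3 rolls back.
wal = [
    (1, "change", "INSERT a"),
    (2, "change", "INSERT b"),
    (2, "commit", None),
    (1, "change", "UPDATE a"),
    (3, "change", "INSERT c"),
    (3, "abort", None),
    (1, "commit", None),
]
committed = list(decode_by_commit(wal))
```

Transaction 2 is emitted before transaction 1 because it committed first, even though transaction 1 wrote its first change earlier.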

Each server has a sender process which talks to an apply process on another server. These are implemented as background workers, a 9.3 feature. 9.4 with the BDR plugin supports one-way replication.

Two-way replication requires handling conflicts. This requires patches to Postgres, so they have a "spoon" of Postgres (not quite a fork). Most of this is in 9.5, but there are a couple of things still waiting for 9.6. Sequence AM was not included in 9.5, nor was WAL messaging, which sends non-transactional WAL messages. They implemented logical replication, then built upon that to make bi-directional replication possible. They are now building a system to handle DDL so that it can be replicated. The DDL is hard because we need it in "absolute form". The DDL deparse code is still in a module.

The other part of the system is zero-downtime upgrade. This uses UDR. It works with version 9.4.

Discussion of logical vs. binary replication. Grant asked what about parallelism on the apply side of the replication. Andres tested this, and it wasn't really a problem; the applier is much faster than the writes on the origin nodes.

Kevin took issue with the assertion that commit order of application doesn't allow seeing anomalies. He brought up an example case where that's not true with batch processing.

Discussion of features required in core started.

Open items for including features in core are:

* SEQAM - ready for commit
* WAL messages - need to discuss and have a flamewar but otherwise there.
* Metadata - where do we store metadata for the replication system? Connection information?
* Control - still use functions, or implement special-case DDL?
* DDL Replication Code

Metadata is currently stored in a mix of security labels and metadata tables. Is this a conflict with RLS? Shouldn't be, but it's a bit of a hack; it's done because it's extra data which is created and deleted with a table.

UDR Functions:
* subscribe

BDR:
* create group
* join group

Robert asked a question about synchronizing timestamps. This motivated some of the patches to 9.5.

Should we have full DDL for this, or should we have functions? Simon thinks functions are fine. Haas likes DDL because it's more self-documenting. Simon argued that there have been a lot of changes to the functions. It's been iterating. But once it's in core (Haas), people expect very stable APIs, so you won't be able to change them anyway. Some discussion about dump and restore followed. There are some things which can't be restored. Replication slots are a good example of this; Haas feels like that's kind of unfinished. There's a pretty good argument that you want to be able to restore your replication sets.

The problem of deprecating APIs (Smith) already exists. We can add more arguments. Which model do you want? As soon as you put it into syntax, it's a lot harder to change parameters etc. There's also the question of to what extent we want to keep backwards compatibility of replication stuff.

Do we need a generic concept of a supervisor worker, because people keep reinventing this concept?

What about the metadata? We want it to work even if people rename tables, etc. Does this work reasonably well for 100,000 table cases? Should work with relcache, should be fine.

Currently subscribe/group uses pg_dump, which requires passing in a connection string so that we can dump out the database. That's not the only problem with the dependency on pg_dump. Dependencies on external binaries are kind of an issue. Abstracting out pg_dump has been a TODO forever. Slony needs self-connection information too. You could have BDR GUCs, but that didn't work really well. This is different for each database.

pg_dump is used to create the initial snapshot of the data and structure. You can use pg_basebackup instead, but that copies everything. Other database tools do it table-by-table. Getting sufficient administration tools into core is critical. We don't want to have 5 separate sets of tools like we do now, most of which are buggy. Grant says: build the APIs really well, and it's easy to build the tools on top of the APIs. He doesn't want tools which work on the base stuff. We have different sets of tools because they have different use cases.

=== pg_shard v2.0 and Lessons Learned from NoSQL Databases ===

Session Leaders: Ozgun Erdogan, Marco Slot

==== Attendees ====

==== Meeting Notes ====

pg_shard 2.0

Ozgun explained how pg_shard is put together. There's a metadata node, which connects to a bunch of backend nodes. Each backend node contains multiple shards in one database. The shards are tables.

In pg_shard 2.0, the metadata will be fully distributed.

The metadata node tracks where shards are located. And shards can be redistributed.

So there are a few proposals for how to distribute metadata. There are several use-cases they are trying to answer:
* NoSQL use-case on the eventual consistency model
** real-time analytics over log data
* SAP HANA-like use case. ACID-compliant scalable RDBMS.

Not like Oracle RAC, which is shared disk.

The proposals for sharing metadata:
# Replicate metadata to all nodes, assuming commutative writes, that is, write order doesn't matter. So replicate change statements between all nodes. Use BDR.
# Decouple shard health from metadata. Delegate health to replication groups. Could be enhanced by streaming replication. Basically failover between pairs of nodes.

They explained the first proposal. If they get inserts onto one table and an insert fails, that node is marked invalid. Josh questioned whether or not this would ever become consistent. You would need to buffer the writes and replay them, or resync from the one healthy replica. It also requires that events can never conflict. It pretty much only supports the insert-only use case, because all writes have to be incremental. This is the AP proposal out of CAP.

For the 2nd proposal, RDS could handle that for us. The 2nd proposal relies on having small replication groups, which would fail over in the event of a node failure. Streaming replication could be used between replicas. You'd need small groups with at least 3 nodes.
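The commutativity requirement can be made concrete with a small sketch. The metadata model here (pairs of shard id and node name) and both helper functions are assumptions for illustration, not pg_shard's actual catalog. Insert-only writes reach the same state in any delivery order, while overwriting writes do not.

```python
import itertools

def apply_inserts(events):
    """Insert-only metadata: each event adds a (shard, node) placement."""
    state = set()
    for shard_id, node in events:
        state.add((shard_id, node))
    return state

def apply_overwrites(events):
    """Overwriting metadata: the last event seen for a shard wins."""
    placement = {}
    for shard_id, node in events:
        placement[shard_id] = node
    return placement

events = [(1, "node-a"), (2, "node-b"), (1, "node-c")]

# Replay the same events in every possible delivery order.
insert_states = {
    frozenset(apply_inserts(order))
    for order in itertools.permutations(events)
}
overwrite_states = {
    tuple(sorted(apply_overwrites(order).items()))
    for order in itertools.permutations(events)
}

converged = len(insert_states) == 1      # inserts commute
diverged = len(overwrite_states) > 1     # overwrites depend on order
```

This is why the notes describe the first proposal as supporting essentially only the insert-only case: as soon as two writes can conflict, nodes that see them in different orders end up with different metadata.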

Josh made a third proposal, using RAFT-like semantics to share metadata and make it mostly consistent. Various issues were pointed out with this.

Alvaro suggested requiring quorum every time you do a read. Some discussion of Paxos etc. ensued.

=== FDW Enhancements ===

[http://www.slideshare.net/babystarmonja/foreign-data-wrapper-enhancements Slides used for this session]

Session Leaders: Shigeru Hanada and Etsuro Fujita

Note-taker: Josh Berkus

==== Attendees ====

==== Meeting Notes ====

Enhancements proposed for 9.5:
* Inheritance Support: Committed: foreign table can be parent or child of other tables
* Update push-down: Returned with feedback: updates against foreign tables without fetching data from the remote node.
* Join Push-down: API committed: allows joining on the remote server.

Update pushdown requires certain conditions in the Update statement. Also it requires a new FDW API, called from nodeModifyTable.

Currently joins are performed on the local server, which can be very slow. If both tables are on the external server and joins are supported, we should be able to join over there. The FDW API is committed, but pgsql_fdw changes were not committed. The major issue was "how do we construct the remote query?"

Should we use a parse tree? They would like to support this in the Oracle and MySQL FDWs. A general SQL deparser would be ideal for this, but we don't have one.

We also want sort push-down. But there is a problem selecting the key for the sort. Josh asked why. Shigeru explained that the FDW sees only the plan tree, and the plan tree generates path information for each key, which includes multiple candidates.

Other possible enhancements:
* sort push-down
* aggregate push-down
* more aggressive join push-down

For sort push-down, we also need to mark a Foreign Scan as sorted. But problems: limiting sort key candidates. Do we need to introduce FOREIGN INDEX concepts? Should we have FDW catalogs? Also, how can we be sure that sorting on the target and the local server are identical (collations etc.)? And what about pre-sorted join results? He asked for ideas on how to implement this.

Tom suggested that if you took the overhead of doing an explain, you could check and see if it's doing a merge join on the remote node. It might be expensive to see the explain plan.

He doesn't have a really concrete idea how to implement aggregate push-down. Maybe they should implement a new FDW API, and replace the Aggregate node with a ForeignScan. Issues include how to determine the semantics of the GROUP BY clause. Also how do we map local functions to remote ones? There's stuff in the SQL standard for this but it's not very well defined.

More aggressive join push-down would support doing a foreign nestloop scan. One way to do that is with small local tables: we can push materialized data across the FDW and join against it. Or we could use a temporary table or a VALUES statement. If we know that the table is replicated on the remote side, we could join against it.
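The VALUES variant can be sketched as query construction: the keys of a small local table are shipped to the remote side as a VALUES list and joined there. Everything below is hypothetical, including `build_values_join`, the table and column names, and the `%s` placeholder style (borrowed from common Python database drivers); it is not how any existing FDW builds its remote query, and the identifiers are assumed to be trusted rather than user input.

```python
def build_values_join(remote_table, join_column, local_keys):
    """Build a remote query that joins `remote_table` against a small set of
    local keys sent across as a VALUES list. Returns (sql, params)."""
    if not local_keys:
        raise ValueError("need at least one local key")
    # One "(%s)" row per local key; values travel as parameters, not text.
    placeholders = ", ".join(["(%s)"] * len(local_keys))
    sql = (
        f"SELECT r.* FROM {remote_table} AS r "
        f"JOIN (VALUES {placeholders}) AS l({join_column}) "
        f"ON r.{join_column} = l.{join_column}"
    )
    return sql, list(local_keys)

sql, params = build_values_join("orders", "customer_id", [7, 42, 99])
```

This only pays off while the local side stays small, since every key is sent over the wire with the query.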

Other ideas?

Paul asked about extended types, like for PostGIS. Geometry operators aren't allowed to pass down through pgsql_fdw. He had to hack pgsql_fdw in order to pass those through. Maybe when you declare a server, you could declare which extensions are installed in the server, which would be checked in the FDW. Shigeru thinks this is a good idea. Right now we don't push them down because the operator might be different on the target.

Is an extension a useful unit for this? Tom and Paul think yes. Also we don't actually need to check versions. We also want to create mappings for individual functions though.

Marco asked about CSTORE_FDW. The FDW API requires us to read row-by-row, which kills some of the advantages of the column store. Josh asked about the COPY protocol; it would be good to copy into remote tables.

They also talked about pushing down pre-aggregates instead of finished aggregates. That is, count/sum instead of AVG. That way it will work with partitioned foreign tables. Basically, we would export the transition function somehow, like a MapReduce system. No idea how to do this. Also, how would it work with non-postgres systems?
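A small sketch of why pre-aggregates matter for partitioned foreign tables: each partition returns a (sum, count) pair, and the local side combines the pairs. The function names and sample data are invented for this example. Averaging the per-partition averages gives a different, wrong answer when partitions differ in size.

```python
def partial_avg(rows):
    """Runs on each partition: return the transition state (sum, count)
    instead of a finished AVG."""
    return (sum(rows), len(rows))

def combine(states):
    """Runs locally: merge the partition states and finish the aggregate."""
    total = sum(s for s, _ in states)
    count = sum(c for _, c in states)
    return total / count if count else None

# Three hypothetical partitions of unequal size.
partitions = [[10, 20, 30], [40], [50, 60]]

states = [partial_avg(p) for p in partitions]
global_avg = combine(states)

# Averaging finished per-partition AVGs weights every partition equally,
# which is not the average over all rows.
naive_avg = sum(sum(p) / len(p) for p in partitions) / len(partitions)
```

Here the combined result is the true average over all six rows, while the naive figure is skewed toward the single-row partition.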

PgCon2015ClusterSummit

2015-06-18T18:00:13Z

Hanada: /* FDW Enhancements */ Link the slides

=pgCon 2015 Cluster Hacker Summit=

This year's Cluster Hacker Summit will be part of the [[PgCon_2015_Developer_Unconference]]. As such, this page will be used to coordinate sessions to propose for the Unconference, and eventually to list an agenda for the Clustering Track.

The Cluster Summit covers both general PostgreSQL clustering, as well as PostgresXC and PostgresXL development.

As it is part of the Developer Unconference, Clustering sessions will take place starting in the afternoon of Tuesday, June 16 through 5pm on Wednesday, June 17. If you are participating, and will not be able to make it on Tuesday, please note that in your attendance comments.

== Attendee RSVPs ==

* Josh Berkus (both days)
* Koichi Suzuki
* Tatsuo Ishii
* Yugo Nagata
* Steve Singer (arrive tuesday mid-afternoon)
* Jan Wieck (arrive tuesday evening)
* Shigeru Hanada
* Ahsan Hadi
* Ashutosh Bapat
* Bruce Momjian
* Etsuro Fujita
* Tetsuo Sakata
* Amit Langote
* Kyotaro Horiguchi
* Ozgun Erdogan (Wednesday)
* Marco Slot (Wednesday)
* Simon Riggs

== Suggested Sessions ==

For each session below please provide a title and a moderator/leader/speaker for the session.

=== Pgpool-II: toward next major version 3.5 ===

Firstly we report the project progress since last year's Cluster Summit: introducing pgpool-II 3.4. Then we explain the current status of pgpool-II 3.5 which is under development.

Session Leader: Tatsuo Ishii

==== Attendees ====

==== Meeting Notes ====

=== FDW Enhancements ===

Session Leader: Shigeru Hanada and Etsuro Fujita

==== Attendees ====

==== Meeting Notes ====

=== Horizontal Scalability and Sharding ===

Session Leaders: Ahsan Hadi, Ashutosh Bapat

==== Attendees ====

==== Meeting Notes ====

=== Slony ===

What is the future for Slony development? Are users interested in a new Slony based on Logical Decoding? Who's going to work on this?

Session Leaders: Steve Singer, Chris Browne, Jan Wieck

==== Session Notes ====

Slony Development

Leader: Steve Singer, Chris Browne, Jan Wieck

Note-taker: Josh Berkus

In the past year Slony development has been pretty stagnant. Steve has worked on a prototype with Logical Decoding. Works for a demo, but not all features work, and performance was not all that impressive. Part of that is that how much we've optimized the Slony log. Logical slony has some lower write overhead, but the latency for assembly on the other side is not insignificant.

The stability of slony and features are OK, but will it survive with modern features of Postgres. With some features it will be different. It does something which others don't.

Chris: only takes a few hours of work per year to keep up with Postgres releases. But we've been forgetting how to do releases and management.

Jan: mentioned limitations of UDR/BDR. But most users are on older versions. Even if 9.6 has everything built in, people will still be using older versions for 5 years. Slony was built to allow upgrading. If it doesn't make sense to maintain them then we won't.

Slony is still useful for upgrading across architectures and character decodings. Josh mentioned the early stage of development of UDR. Steve said that it would take more than one year to get slony working with Logical Decoding and to make it as stable as trigger-based replication.

Chris asked what the issues were with LD and Slony development. Jan pointed out that for partial replication ... where you're replicating a few low-volume tables ... then saving all the WAL for the replication slot is a big loss. So LD and classic Slony log table need to coexist.

Josh said there's two reasons why he wants LD: (1) easier installation and (2) bloat in the queue tables caused by long-running transactions (3) removal overhead.

Re: bloat Jan mentioned Grittner's snapshot-too-old work. Josh said that work is partly because of Slony. You can't even truncate segments if there's a long-running transaction. Chris mentioned the idea of per-table logs.

Moving the reporting queries onto the replica can fix this issue, but it just moves the bloat around. More discussion around this. Doing this against a non-forwarding replica would prevent bloat.

Other requests:

* Parallel initial data copy.
* Call Slony as a library (mostly Python)
* Add tables to multiple versions of databases
* No table locks
* More visibility commands via Slonik, once it's a library
* Cancel subscribe set in progress

Libraries: one way to get around the multiple library issue is to expose the function API. But according to Jan the function API is not complete. Steve would rather have Slonik as a shared library. Calling the slony SPs directly isn't safe. Dave Cramer asked if we do a Python library how are we going to call it from Java? We'll create a C shared library.

Valentin suggestes exposing the shared libaries via stored procedures. Chris pointed out that the lack of autonomous transactions prevents this. But if the library is wrapped in functions it can be exposed. Or create some kind of REST API. Jan said that they're planning on supporting pgAdmin4, so an abstracted libary would support this. This is a good idea. Some people use GUIs, but some don't.

Steve asked if people are using it for DR. Josh mentioned one multi-region case. Valintin said more availability than DR.

Jan said that more visibility functions for Slonik is a must once it's a library. And it would make the API more stable because you wouldn't have to use the Slony catalog.

Steve asked if Slony could take one exclusive lock at a time would help. One user said no, they don't get enough downtime. But others said yes. Rod really wants an add trigger command which takes an access exclusive lock. That's a major postgres issue in terms of locking and tables. Jan speculated how that would be possible. Valintin tries to resolve this by setting statement_timeout and having it fail. The Slony team has some idea of how to do this. This is similar to concurrent index creation. Jan will look at doing CREATE TRIGGER CONCURRENTLY. Suggested that there should be a general lock state where you ask for a lock concurrently. This should be a function call. Kevin suggested that we should call this "deferrable" rather than Concurrent. With timeouts.

Steve appealed for people to help implement some of these things. Jan will sign up for LOCK TABLE DEFERRABLE. Josh said that he's help test/spec the API. Biggest issues with syntax is varadic arguments.

Chris asked if anyone was using logshipping, and someone said yes, and they're using the daemon to apply the logs.

Chris asked about DDL. Change in 2.2 which allows passing in the DDL string instead of needing a file. PostgresQL 9.5 will have DDL event replication. Do we care about building this? Chris said the most easy change would be to allow dropping a table to automatically drop it from replication. Rejecting DDL not run through execute script would be even better. It should be an option, which you can override. Josh voted on blocking DDL.

Steve's concern with blocking DDL is that people do this for legitimate reasons. Josh mentioned stupid dev tricks where they bring slony down by running rake on the production cluster. Also a library would be compatible with rake.

Do people see Slony as still being relevant in 2-3 years if BDR/UDR succeeds? Hard to be sure, since it's not mature yet.

They then set some targets:

* Parallel initial data copy: Slony 2.3+, requires Postgres 9.3+
* Call Slony as a library (mostly Python): Slony 2.3
* Add tables to multiple versions of databases: works with library
* No table locks: Postgres 9.6+
* More visibility commands via Slonik, once it's a library: Slony 2.4?
* Cancel subscribe set in progress
* Prevent DDL: Postgres 9.4, slony 2.3

Not likely to work on Slony + Logical Decoding, because it's bigger than all of the features above. Making it stable would take years, and it doesnt' perform better. Valintin is working on LD systems and building libaries on top of LD like a python library and replication to Kafka. Steve's code is on Github. Particularly, being able to switch between LD and triggers would be really complex.

Josh mentioned the issue around WAL log volume from the slony buffers. One user requested the ability to execute a command on a group of slaves. So an kind of EXECUTE SCRIPT on a group of nodes.

=== Bi Directional Replication & Logical Decoding ===

Including DDL replication, Online Upgrade, Logical Replication, Bi Directional Replication etc

Session Leaders: Simon Riggs, Andres Freund

==== Meeting Notes ====

BDR and UDR etc.

Leader: Simon

Note-taker: Josh Berkus

This will be about development of BDR, not using BDR. Will be only 10 minutes about BDR.

BDR stands for Bi-Directional Replication, has been put together across a couple years. Has been released a couple months ago. Is being used in production at some sites. Have released 0.9.1 bugfixing 0.9. But it's not 1.0 yet. We want to get it into Postgres.

BDR allows replication to flow in two directions. It's logical replication which permits making changes to the data as it's replicated. Translating the WAL stream (transaction log stream) and uses Logical Decoding to take action and stream changes. Works on a commit-by-commit basis.

Each server has a sender process which talks to an apply process on another server. These are implemented as background workers, a 9.3 feature. 9.4 with the BDR plugin supports one-way replication.

Two-way replication requires handling conflicts. This requests patches on postgres, so they have a "spoon" of Postgres (not quite a fork). Most of this is in 9.5, but there's a couple things still waiting for 9.6. Sequence AM was not included in 9.5, also WAL messaging, to send non-transactional WAL messages. Implemented logical replication, then built upon that to make bi-directional replication possible. Now building a system to handle DDL so that it can be replicated. The DDL is hard because we need it in "absolute form". The DDL deparse code is still in a module.

The other part of the system is zero-downtime upgrade. This uses UDR. It works with version 9.4.

Discussion of logical vs. binary replication. Grant asked about parallelism on the apply side of the replication. Andres tested this, and it wasn't really a problem; the applier is much faster than the writes on the origin nodes.

Kevin took issue with the assertion that commit order of application doesn't allow seeing anomalies. He brought up an example case where that's not true with batch processing.

Discussion of features required in core started.

Open items for including features in core are:

* SEQAM - ready for commit
* WAL messages - need to discuss and have a flamewar but otherwise there.
* Metadata - where do we store metadata for the replication system? Connection information?
* Control - still use functions, or implement special-case DDL?
* DDL Replication Code

Metadata is currently stored in a mix of security labels and metadata tables. Is this a conflict with RLS? Shouldn't be, but it's a bit of a hack; it's done because it's extra data which is created and deleted with a table.

UDR Functions:
* subscribe

BDR:
* create group
* join group

Robert asked a question about synchronizing timestamps. This motivated some of the patches to 9.5.

Should we have full DDL for this, or should we have functions? Simon thinks functions are fine. Haas likes DDL because it's more self-documenting. Simon argued that there have been a lot of changes to the functions; it's been iterating. But once it's in core (Haas), people expect very stable APIs, so you won't be able to change them anyway. Some discussion about dump and restore followed. There are some things which can't be restored. Replication slots are a good example of this; Haas feels that's kind of unfinished. There's a pretty good argument that you want to be able to restore your replication sets.

The problem of deprecating APIs (Smith) already exists. We can add more arguments. Which model do you want? As soon as you put it into syntax, it's a lot harder to change parameters etc. There's also the question of to what extent we want to keep backwards compatibility of replication stuff.

Do we need a generic concept of a supervisor worker, because people keep reinventing this concept?

What about the metadata? We want it to work even if people rename tables, etc. Does this work reasonably well for 100,000 table cases? Should work with relcache, should be fine.

Currently subscribe/group uses pg_dump, which requires passing in a connection string so that we can dump out the database. That's not the only problem with the dependency on pg_dump. Dependencies on external binaries are kind of an issue. Abstracting out pg_dump has been a TODO forever. Slony needs self-connection information too. You could have BDR GUCs, but that didn't work really well. This is different for each database.

pg_dump is used to create the initial snapshot of the data and structure. You can use pg_basebackup instead, but that copies everything. Other database tools do it table-by-table. Getting sufficient administration tools into core is critical. We don't want to have 5 separate sets of tools like we do now, most of which are buggy. Grant says: make the APIs really good, and it's easy to build the tools on top of the APIs. He doesn't want tools which work on the base stuff. We have different sets of tools because they have different use cases.

=== pg_shard v2.0 and Lessons Learned from NoSQL Databases ===

Session Leaders: Ozgun Erdogan, Marco Slot

==== Attendees ====

==== Meeting Notes ====

pg_shard 2.0

Ozgun explained how pg_shard is put together. There's a metadata node, which connects to a bunch of backend nodes. Each backend node contains multiple shards in one database. The shards are tables.

In pg_shard 2.0, the metadata will be fully distributed.

The metadata node tracks where shards are located, and shards can be redistributed.

So there are a few proposals for how to distribute metadata. There are several use-cases they are trying to answer:
* NoSQL use-case on the eventual consistency model
** real-time analytics over log data
* SAP Hana-like use case. ACID-compliant scalable RDBMS database.

Not like Oracle RAC, which is shared disk.

The proposals for sharing metadata:
# Replicate metadata to all nodes, assuming commutative writes ... that is, write order doesn't matter. So replicate change statements between all nodes. Use BDR.
# Decouple shard health from metadata. Delegate health to replication groups. Could be enhanced by streaming replication. Basically failover between pairs of nodes.

They explained the first proposal. If they get inserts onto one table and an insert fails, that node is marked invalid. Josh questioned whether or not this would ever become consistent. You would need to buffer the writes and replay them, or resync from the one healthy replica. It also requires that events can never conflict. It pretty much only supports the insert-only use case, because all writes have to be incremental. This is the AP proposal out of CAP.
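
To see why the first proposal leans on insert-only workloads, here is a small Python sketch that replays the same events in every possible order. It is only an illustration under assumed names: the <code>(shard_id, node)</code> event shape and both helper functions are hypothetical and are not taken from pg_shard.

```python
from itertools import permutations


def apply_inserts(events):
    """Insert-only replay: the state is the set of (shard_id, node) rows."""
    state = set()
    for shard_id, node in events:
        state.add((shard_id, node))
    return state


def apply_updates(events):
    """Last-writer-wins replay: each event overwrites the shard's placement."""
    state = {}
    for shard_id, node in events:
        state[shard_id] = node
    return state


# Hypothetical metadata events arriving at different nodes in different orders.
events = [(1, "node-a"), (2, "node-b"), (1, "node-c")]

insert_results = {frozenset(apply_inserts(p)) for p in permutations(events)}
update_results = {
    tuple(sorted(apply_updates(p).items())) for p in permutations(events)
}
```

Every ordering of the inserts ends in the same state, so nodes converge without coordination. The overwriting updates end in two different states depending on which write for shard 1 arrives last, which is the kind of conflict the proposal has to rule out.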

For the 2nd proposal, RDS could handle that for us. The 2nd proposal relies on having small replication groups, which would fail over in the event of a node failure. Streaming replication could be used between replicas. You'd need small groups with at least 3 nodes.

Josh made a third proposal, using RAFT-like semantics to share metadata and make it mostly consistent. Various issues were pointed out with this.

Alvaro suggested requiring quorum every time you do a read. Some discussion of Paxos etc. ensued.
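
A quorum read could look roughly like the Python sketch below. It is not something presented in the session: the replica representation (a dict of key to <code>(version, value)</code>, with <code>None</code> for an unreachable node) and the name <code>quorum_read</code> are assumptions, and the result is only guaranteed to be the latest value if writes are also acknowledged by a majority.

```python
from typing import Dict, List, Optional, Tuple

# Each replica maps a metadata key to (version, value); None models an
# unreachable node. Both conventions are assumptions for this sketch.
Replica = Optional[Dict[str, Tuple[int, Optional[str]]]]


def quorum_read(replicas: List[Replica], key: str) -> Optional[str]:
    """Return the highest-versioned value once a majority has answered."""
    needed = len(replicas) // 2 + 1
    # A reachable replica that has never seen the key answers with version 0.
    answers = [r.get(key, (0, None)) for r in replicas if r is not None]
    if len(answers) < needed:
        raise RuntimeError("quorum not reached")
    _version, value = max(answers, key=lambda a: a[0])
    return value


# Three metadata replicas, one of them down, one of them stale.
replicas = [{"shard_7": (2, "node-b")}, {"shard_7": (1, "node-a")}, None]
placement = quorum_read(replicas, "shard_7")
```

With two of three replicas reachable the read succeeds and prefers the newer version; with only one reachable it raises instead of returning possibly stale metadata.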

=== FDW Enhancements ===

[http://www.slideshare.net/babystarmonja/foreign-data-wrapper-enhancements Slides used for this session]

Session Leaders: Shigeru Hanada and Etsuro Fujita

Note-taker: Josh Berkus

==== Attendees ====

==== Meeting Notes ====

Enhancements proposed for 9.5:
* Inheritance Support: Committed: foreign table can be parent or child of other tables
* Update push-down: Returned with feedback: updates against foreign tables without fetching data from the remote node.
* Join Push-down: API committed: allows joining on the remote server.

Update pushdown requires certain conditions in the Update statement. Also it requires a new FDW API, called from nodeModifyTable.

Currently joins are performed on the local server, which can be very slow. If both tables are on the external server and joins are supported, we should be able to join over there. The FDW API is committed, but pgsql_fdw changes were not committed. The major issue was "how do we construct the remote query?"

Should we use a parse tree? They would like to support this in Oracle and MySQL FDWs. A general SQL deparser would be ideal for this, but we don't have one.

We also want sort push-down. But there is a problem selecting the key for the sort. Josh asked why. Shigeru explained that the FDW sees only the plan tree, and the plan tree generates path information for each key, which includes multiple candidates.

Other possible enhancements:
* sort push-down
* aggregate push-down
* more aggressive join push-down

For sort push-down, we also need to mark a Foreign Scan as sorted. But problems: limiting sort key candidates. Do we need to introduce FOREIGN INDEX concepts? Should we have FDW catalogs? Also, how can we be sure that sorting on the target and the local server are identical (collations etc.)? And what about pre-sorted join results? He asked for ideas on how to implement this.

Tom suggested that if you took the overhead of doing an explain, you could check and see if it's doing a merge join on the remote node. It might be expensive to see the explain plan.

He doesn't have a really concrete idea how to implement aggregate push-down. Maybe they should implement a new FDW API, and replace the Aggregate node with a ForeignScan. Issues include how to determine the semantics of the GROUP BY clause. Also how do we map local functions to remote ones? There's stuff in the SQL standard for this, but it's not very well defined.

More aggressive join push-down would support doing a foreign nestloop scan. One way to do that: with small local tables, we can push materialized data across the FDW and join against it. Or we could do a temporary table or VALUES statement. If we know that the table is replicated on the remote side, we could join against it.

Other ideas?

Paul asked about extended types, like for PostGIS. Geometry operators aren't allowed to pass down through pgsql_fdw. He had to hack pgsql_fdw in order to pass those through. Maybe when you declare a server, you could declare which extensions are installed in the server, which would be checked in FDW. Shigeru thinks this is a good idea. Right now we don't push them down because the operator might be different on the target.

Is an extension a useful unit for this? Tom and Paul think yes. Also we don't actually need to check versions. We also want to create mappings for individual functions though.

Marco asked about CSTORE_FDW. The FDW API requires us to read row-by-row, which kills some of the advantages of the column store. Josh asked about COPY protocol; it would be good to copy into remote tables.

They also talked about pushing down pre-aggregates instead of finished aggregates. That is, count/sum instead of AVG. That way it will work with partitioned foreign tables. Basically, we would export the transition function somehow, like a MapReduce system. No idea how to do this. Also, how would it work with non-postgres systems?
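
The count/sum idea can be sketched in Python as follows. This is a minimal illustration of partial aggregation, not an FDW API: the <code>PartialAvg</code> state and the three function names are hypothetical, and each inner list stands in for one partitioned foreign table.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PartialAvg:
    """Transition state for AVG: a running count and sum."""
    count: int = 0
    total: float = 0.0


def remote_partial(values: Iterable[float]) -> PartialAvg:
    """What each remote partition would compute and send back."""
    state = PartialAvg()
    for v in values:
        state.count += 1
        state.total += v
    return state


def combine(states: List[PartialAvg]) -> PartialAvg:
    """Merge the per-partition states on the local server."""
    merged = PartialAvg()
    for s in states:
        merged.count += s.count
        merged.total += s.total
    return merged


def finalize(state: PartialAvg) -> float:
    """Turn the merged state into the final AVG."""
    if state.count == 0:
        raise ValueError("AVG over zero rows")  # SQL would return NULL here
    return state.total / state.count


# Three hypothetical partitions of one foreign table.
partitions = [[1.0, 2.0, 3.0], [10.0], [4.0, 6.0]]
overall = finalize(combine([remote_partial(p) for p in partitions]))

# Averaging the finished per-partition AVGs gives a different, wrong answer.
naive = sum(sum(p) / len(p) for p in partitions) / len(partitions)
```

Combining the pre-aggregates gives 26/6, the same as averaging all six rows together, while averaging the three finished averages gives 17/3. That difference is why the finished aggregate cannot simply be pushed down to each partition.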

PgCon2015ClusterSummit

2015-06-18T17:59:02Z

Hanada: /* FDW Enhancements */ Format

=pgCon 2015 Cluster Hacker Summit=

This year's Cluster Hacker Summit will be part of the [[PgCon_2015_Developer_Unconference]]. As such, this page will be used to coordinate sessions to propose for the Unconference, and eventually to list an agenda for the Clustering Track.

The Cluster Summit covers general PostgreSQL clustering as well as Postgres-XC and Postgres-XL development.

As it is part of the Developer Unconference, Clustering sessions will take place starting in the afternoon of Tuesday, June 16 through 5pm on Wednesday, June 17. If you are participating, and will not be able to make it on Tuesday, please note that in your attendance comments.

== Attendee RSVPs ==

* Josh Berkus (both days)
* Koichi Suzuki
* Tatsuo Ishii
* Yugo Nagata
* Steve Singer (arrive tuesday mid-afternoon)
* Jan Wieck (arrive tuesday evening)
* Shigeru Hanada
* Ahsan Hadi
* Ashutosh Bapat
* Bruce Momjian
* Etsuro Fujita
* Tetsuo Sakata
* Amit Langote
* Kyotaro Horiguchi
* Ozgun Erdogan (Wednesday)
* Marco Slot (Wednesday)
* Simon Riggs

== Suggested Sessions ==

For each session below please provide a title and a moderator/leader/speaker for the session.

=== Pgpool-II: toward next major version 3.5 ===

Firstly we report the project progress since last year's Cluster Summit: introducing pgpool-II 3.4. Then we explain the current status of pgpool-II 3.5 which is under development.

Session Leader: Tatsuo Ishii

==== Attendees ====

==== Meeting Notes ====

=== FDW Enhancements ===

Session Leader: Shigeru Hanada and Etsuro Fujita

==== Attendees ====

==== Meeting Notes ====

=== Horizontal Scalability and Sharding ===

Session Leaders: Ahsan Hadi, Ashutosh Bapat

==== Attendees ====

==== Meeting Notes ====

=== Slony ===

What is the future for Slony development? Are users interested in a new Slony based on Logical Decoding? Who's going to work on this?

Session Leaders: Steve Singer, Chris Browne, Jan Wieck

==== Session Notes ====

Slony Development

Leader: Steve Singer, Chris Browne, Jan Wieck

Note-taker: Josh Berkus

In the past year Slony development has been pretty stagnant. Steve has worked on a prototype with Logical Decoding. It works for a demo, but not all features work, and performance was not all that impressive. Part of that is how much we've optimized the Slony log. Logical Slony has somewhat lower write overhead, but the latency for assembly on the other side is not insignificant.

The stability and features of Slony are OK, but will it survive with modern features of Postgres? With some features it will be different. It does something which others don't.

Chris: only takes a few hours of work per year to keep up with Postgres releases. But we've been forgetting how to do releases and management.

Jan: mentioned limitations of UDR/BDR. But most users are on older versions. Even if 9.6 has everything built in, people will still be using older versions for 5 years. Slony was built to allow upgrading. If it doesn't make sense to maintain them then we won't.

Slony is still useful for upgrading across architectures and character encodings. Josh mentioned the early stage of development of UDR. Steve said that it would take more than one year to get Slony working with Logical Decoding and to make it as stable as trigger-based replication.

Chris asked what the issues were with LD and Slony development. Jan pointed out that for partial replication ... where you're replicating a few low-volume tables ... then saving all the WAL for the replication slot is a big loss. So LD and classic Slony log table need to coexist.

Josh said there are three reasons why he wants LD: (1) easier installation, (2) bloat in the queue tables caused by long-running transactions, and (3) removal overhead.

Re: bloat Jan mentioned Grittner's snapshot-too-old work. Josh said that work is partly because of Slony. You can't even truncate segments if there's a long-running transaction. Chris mentioned the idea of per-table logs.

Moving the reporting queries onto the replica can fix this issue, but it just moves the bloat around. More discussion around this. Doing this against a non-forwarding replica would prevent bloat.

Other requests:

* Parallel initial data copy.
* Call Slony as a library (mostly Python)
* Add tables to multiple versions of databases
* No table locks
* More visibility commands via Slonik, once it's a library
* Cancel subscribe set in progress

Libraries: one way to get around the multiple library issue is to expose the function API. But according to Jan the function API is not complete. Steve would rather have Slonik as a shared library. Calling the slony SPs directly isn't safe. Dave Cramer asked if we do a Python library how are we going to call it from Java? We'll create a C shared library.

Valentin suggests exposing the shared libraries via stored procedures. Chris pointed out that the lack of autonomous transactions prevents this. But if the library is wrapped in functions it can be exposed. Or create some kind of REST API. Jan said that they're planning on supporting pgAdmin4, so an abstracted library would support this. This is a good idea. Some people use GUIs, but some don't.

Steve asked if people are using it for DR. Josh mentioned one multi-region case. Valentin said more availability than DR.

Jan said that more visibility functions for Slonik is a must once it's a library. And it would make the API more stable because you wouldn't have to use the Slony catalog.

Steve asked whether it would help if Slony could take one exclusive lock at a time. One user said no, they don't get enough downtime. But others said yes. Rod really wants an add trigger command which takes an access exclusive lock. That's a major Postgres issue in terms of locking and tables. Jan speculated how that would be possible. Valentin tries to resolve this by setting statement_timeout and having it fail. The Slony team has some idea of how to do this. This is similar to concurrent index creation. Jan will look at doing CREATE TRIGGER CONCURRENTLY. Suggested that there should be a general lock state where you ask for a lock concurrently. This should be a function call. Kevin suggested that we should call this "deferrable" rather than Concurrent. With timeouts.
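
The "ask for the lock, give up after a timeout, try again" pattern discussed here can be sketched with an ordinary Python lock. This is only an analogy using <code>threading.Lock</code>, not PostgreSQL locking or a proposed syntax: the function name <code>lock_deferrable</code> and its parameters are made up for illustration.

```python
import threading
import time


def lock_deferrable(lock, attempt_timeout=0.05, retry_delay=0.01, max_attempts=5):
    """Try to take the lock in short bursts instead of waiting indefinitely.

    Each attempt gives up after attempt_timeout seconds, so the caller never
    sits in the wait queue for long; it backs off and tries again later.
    """
    for _ in range(max_attempts):
        if lock.acquire(timeout=attempt_timeout):
            return True
        time.sleep(retry_delay)
    return False


# An uncontended lock is granted immediately.
free_lock = threading.Lock()
got_free = lock_deferrable(free_lock)

# A lock already held elsewhere is never granted; the caller gives up cleanly.
busy_lock = threading.Lock()
busy_lock.acquire()
got_busy = lock_deferrable(busy_lock, attempt_timeout=0.01, retry_delay=0.0,
                           max_attempts=2)
```

The design choice is that a failed attempt costs only a short wait and can be retried at a quieter moment, which resembles the statement_timeout workaround mentioned above.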

Steve appealed for people to help implement some of these things. Jan will sign up for LOCK TABLE DEFERRABLE. Josh said that he'd help test/spec the API. The biggest issue with syntax is variadic arguments.

Chris asked if anyone was using logshipping, and someone said yes, and they're using the daemon to apply the logs.

Chris asked about DDL. A change in 2.2 allows passing in the DDL string instead of needing a file. PostgreSQL 9.5 will have DDL event replication. Do we care about building this? Chris said the easiest change would be to allow dropping a table to automatically drop it from replication. Rejecting DDL not run through EXECUTE SCRIPT would be even better. It should be an option, which you can override. Josh voted for blocking DDL.

Steve's concern with blocking DDL is that people do this for legitimate reasons. Josh mentioned stupid dev tricks where they bring slony down by running rake on the production cluster. Also a library would be compatible with rake.

Do people see Slony as still being relevant in 2-3 years if BDR/UDR succeeds? Hard to be sure, since it's not mature yet.

They then set some targets:

* Parallel initial data copy: Slony 2.3+, requires Postgres 9.3+
* Call Slony as a library (mostly Python): Slony 2.3
* Add tables to multiple versions of databases: works with library
* No table locks: Postgres 9.6+
* More visibility commands via Slonik, once it's a library: Slony 2.4?
* Cancel subscribe set in progress
* Prevent DDL: Postgres 9.4, slony 2.3

Not likely to work on Slony + Logical Decoding, because it's bigger than all of the features above. Making it stable would take years, and it doesn't perform better. Valentin is working on LD systems and building libraries on top of LD, like a Python library and replication to Kafka. Steve's code is on GitHub. In particular, being able to switch between LD and triggers would be really complex.

Josh mentioned the issue around WAL volume from the Slony buffers. One user requested the ability to execute a command on a group of slaves: a kind of EXECUTE SCRIPT on a group of nodes.

=== Bi Directional Replication & Logical Decoding ===

Including DDL replication, Online Upgrade, Logical Replication, Bi Directional Replication etc

Session Leaders: Simon Riggs, Andres Freund

==== Meeting Notes ====

BDR and UDR etc.

Leader: Simon

Note-taker: Josh Berkus

This will be about development of BDR, not using BDR. Will be only 10 minutes about BDR.

BDR stands for Bi-Directional Replication and has been put together over a couple of years. It was released a couple of months ago and is being used in production at some sites. 0.9.1 has been released as a bugfix for 0.9, but it's not 1.0 yet. We want to get it into Postgres.

BDR allows replication to flow in two directions. It's logical replication, which permits making changes to the data as it's replicated. It translates the WAL stream (transaction log stream) and uses Logical Decoding to take action and stream changes. It works on a commit-by-commit basis.

Each server has a sender process which talks to an apply process on another server. These are implemented as background workers, a 9.3 feature. 9.4 with the BDR plugin supports one-way replication.

Two-way replication requires handling conflicts. This requires patches to Postgres, so they have a "spoon" of Postgres (not quite a fork). Most of this is in 9.5, but there are a couple of things still waiting for 9.6. Sequence AM was not included in 9.5, nor was WAL messaging, which sends non-transactional WAL messages. They implemented logical replication, then built upon that to make bi-directional replication possible. They are now building a system to handle DDL so that it can be replicated. The DDL is hard because we need it in "absolute form". The DDL deparse code is still in a module.

The other part of the system is zero-downtime upgrade. This uses UDR. It works with version 9.4.

Discussion of logical vs. binary replication. Grant asked about parallelism on the apply side of the replication. Andres tested this, and it wasn't really a problem; the applier is much faster than the writes on the origin nodes.

Kevin took issue with the assertion that commit order of application doesn't allow seeing anomalies. He brought up an example case where that's not true with batch processing.

Discussion of features required in core started.

Open items for including features in core are:

* SEQAM - ready for commit
* WAL messages - need to discuss and have a flamewar but otherwise there.
* Metadata - where do we store metadata for the replication system? Connection information?
* Control - still use functions, or implement special-case DDL?
* DDL Replication Code

Metadata is currently stored in a mix of security labels and metadata tables. Is this a conflict with RLS? Shouldn't be, but it's a bit of a hack; it's done because it's extra data which is created and deleted with a table.

UDR Functions:
* subscribe

BDR:
* create group
* join group

Robert asked a question about synchronizing timestamps. This motivated some of the patches to 9.5.

Should we have full DDL for this, or should we have functions? Simon thinks functions are fine. Haas likes DDL because it's more self-documenting. Simon argued that there have been a lot of changes to the functions; it's been iterating. But once it's in core (Haas), people expect very stable APIs, so you won't be able to change them anyway. Some discussion about dump and restore followed. There are some things which can't be restored. Replication slots are a good example of this; Haas feels that's kind of unfinished. There's a pretty good argument that you want to be able to restore your replication sets.

The problem of deprecating APIs (Smith) already exists. We can add more arguments. Which model do you want? As soon as you put it into syntax, it's a lot harder to change parameters etc. There's also the question of to what extent we want to keep backwards compatibility of replication stuff.

Do we need a generic concept of a supervisor worker, because people keep reinventing this concept?

What about the metadata? We want it to work even if people rename tables, etc. Does this work reasonably well for 100,000 table cases? Should work with relcache, should be fine.

Currently subscribe/group uses pg_dump, which requires passing in a connection string so that we can dump out the database. That's not the only problem with the dependency on pg_dump. Dependencies on external binaries are kind of an issue. Abstracting out pg_dump has been a TODO forever. Slony needs self-connection information too. You could have BDR GUCs, but that didn't work really well. This is different for each database.

pg_dump is used to create the initial snapshot of the data and structure. You can use pg_basebackup instead, but that copies everything. Other database tools do it table-by-table. Getting sufficient administration tools into core is critical. We don't want to have 5 separate sets of tools like we do now, most of which are buggy. Grant says: make the APIs really good, and it's easy to build the tools on top of the APIs. He doesn't want tools which work on the base stuff. We have different sets of tools because they have different use cases.

=== pg_shard v2.0 and Lessons Learned from NoSQL Databases ===

Session Leaders: Ozgun Erdogan, Marco Slot

==== Attendees ====

==== Meeting Notes ====

pg_shard 2.0

Ozgun explained how pg_shard is put together. There's a metadata node, which connects to a bunch of backend nodes. Each backend node contains multiple shards in one database. The shards are tables.

In pg_shard 2.0, the metadata will be fully distributed.

The metadata node tracks where shards are located, and shards can be redistributed.

So there are a few proposals for how to distribute metadata. There are several use-cases they are trying to answer:
* NoSQL use-case on the eventual consistency model
** real-time analytics over log data
* SAP Hana-like use case. ACID-compliant scalable RDBMS database.

Not like Oracle RAC, which is shared disk.

The proposals for sharing metadata:
# Replicate metadata to all nodes, assuming commutative writes ... that is, write order doesn't matter. So replicate change statements between all nodes. Use BDR.
# Decouple shard health from metadata. Delegate health to replication groups. Could be enhanced by streaming replication. Basically failover between pairs of nodes.

They explained the first proposal. If they get inserts onto one table and an insert fails, that node is marked invalid. Josh questioned whether or not this would ever become consistent. You would need to buffer the writes and replay them, or resync from the one healthy replica. It also requires that events can never conflict. It pretty much only supports the insert-only use case, because all writes have to be incremental. This is the AP proposal out of CAP.

For the 2nd proposal, RDS could handle that for us. The 2nd proposal relies on having small replication groups, which would fail over in the event of a node failure. Streaming replication could be used between replicas. You'd need small groups with at least 3 nodes.

Josh made a third proposal, using RAFT-like semantics to share metadata and make it mostly consistent. Various issues were pointed out with this.

Alvaro suggested requiring quorum every time you do a read. Some discussion of Paxos etc. ensued.

=== FDW Enhancements ===

Session Leaders: Shigeru Hanada and Etsuro Fujita

Note-taker: Josh Berkus

==== Attendees ====

==== Meeting Notes ====

Enhancements proposed for 9.5:
* Inheritance Support: Committed: foreign table can be parent or child of other tables
* Update push-down: Returned with feedback: updates against foreign tables without fetching data from the remote node.
* Join Push-down: API committed: allows joining on the remote server.

Update pushdown requires certain conditions in the Update statement. Also it requires a new FDW API, called from nodeModifyTable.

Currently joins are performed on the local server, which can be very slow. If both tables are on the external server and joins are supported, we should be able to join over there. The FDW API is committed, but pgsql_fdw changes were not committed. The major issue was "how do we construct the remote query?"

Should we use a parse tree? They would like to support this in Oracle and MySQL FDWs. A general SQL deparser would be ideal for this, but we don't have one.

We also want sort push-down. But there is a problem selecting the key for the sort. Josh asked why. Shigeru explained that the FDW sees only the plan tree, and the plan tree generates path information for each key, which includes multiple candidates.

Other possible enhancements:
* sort push-down
* aggregate push-down
* more aggressive join push-down

For sort push-down, we also need to mark a Foreign Scan as sorted. But problems: limiting sort key candidates. Do we need to introduce FOREIGN INDEX concepts? Should we have FDW catalogs? Also, how can we be sure that sorting on the target and the local server are identical (collations etc.)? And what about pre-sorted join results? He asked for ideas on how to implement this.

Tom suggested that if you took the overhead of doing an explain, you could check and see if it's doing a merge join on the remote node. It might be expensive to see the explain plan.

He doesn't have a really concrete idea how to implement aggregate push-down. Maybe they should implement a new FDW API, and replace the Aggregate node with a ForeignScan. Issues include how to determine the semantics of the GROUP BY clause. Also how do we map local functions to remote ones? There's stuff in the SQL standard for this, but it's not very well defined.

More aggressive join push-down would support doing a foreign nestloop scan. One way to do that: with small local tables, we can push materialized data across the FDW and join against it. Or we could do a temporary table or VALUES statement. If we know that the table is replicated on the remote side, we could join against it.

Other ideas?

Paul asked about extended types, like for PostGIS. Geometry operators aren't allowed to pass down through pgsql_fdw. He had to hack pgsql_fdw in order to pass those through. Maybe when you declare a server, you could declare which extensions are installed in the server, which would be checked in FDW. Shigeru thinks this is a good idea. Right now we don't push them down because the operator might be different on the target.

Is an extension a useful unit for this? Tom and Paul think yes. Also we don't actually need to check versions. We also want to create mappings for individual functions though.

Marco asked about CSTORE_FDW. The FDW API requires us to read row-by-row, which kills some of the advantages of the column store. Josh asked about COPY protocol; it would be good to copy into remote tables.

They also talked about pushing down pre-aggregates instead of finished aggregates. That is, count/sum instead of AVG. That way it will work with partitioned foreign tables. Basically, we would export the transition function somehow, like a MapReduce system. No idea how to do this. Also, how would it work with non-postgres systems?

PgCon 2015 Developer Unconference

2015-06-15T07:17:45Z

Hanada: /* Topics */

An Unconference-style multi-track (three tracks are currently planned) event for active PostgreSQL developers will be held from the afternoon of Tuesday 16 June, 2015 through Wednesday 17 June 2015 at the University of Ottawa, as part of PGCon 2015. This Unconference will be focused on technical PostgreSQL development discussions ranging from Clustering and replication to the infrastructure which runs postgresql.org.

'''Please add your name to the topics you are interested in attending!'''

== Topics ==

Developers are asked to propose topics which they wish to either present on or which they would like another individual to present on. All topics should be clearly related to PostgreSQL development. The topic should be added to the table below and any required attendees (presumably at least the presenter, and the requester if different) listed. Other attendees of the Unconference who are interested should list themselves as Optional. Note that non-technical topics related to PostgreSQL development will be addressed during the invite-only Developer meeting, being held in advance of the Unconference. Further, the Developer Unconference is for developers of PostgreSQL and user-oriented topics are not appropriate for this venue.

== Slot assignment ==

Slots will be assigned based on the topic's interest among the attendees of the Unconference (the number of individuals who listed themselves as attendees). Final determination on any particular topic will be made by the Unconference organizers. Please only participate if you are confident of your attendance at the Unconference.

== Venue ==

These meetings will be held at the University of Ottawa. The topics selected, the schedule and the specific room assignments will be published closer to the event and will be based on the information provided here. Please direct any questions to Dave Page (dpage@pgadmin.org).

== Sponsorship ==

The Developer Unconference will be sponsored by Salesforce.com, and by NTT Open Source for the Clustering Track.

== Attendees ==

While the Unconference is open to all attendees of PGCon, formal invitations will be sent to specific PostgreSQL developers, including the Core team, Major Contributors, Committers, and other developers who have been involved in the 9.4 release. These invitations are intended to encourage developers to attend the Unconference but we are unable to guarantee every invitee a speaking slot.

== RSVPs ==

The following people have RSVPed to the meeting (in alphabetical order, by surname):

* Ashutosh Bapat
* Oleg Bartunov
* Josh Berkus
* Christopher Browne
* Joe Conway
* Jeff Davis
* Andrew Dunstan
* Ozgun Erdogan
* Andres Freund
* Stephen Frost
* Masao Fujii
* Etsuro Fujita
* Peter Geoghegan
* Kevin Grittner
* Robert Haas
* Ahsan Hadi
* Magnus Hagander
* Shigeru Hanada
* Álvaro Herrera
* Kyotaro Horiguchi
* Thierry Husson (Wednesday @ 11am)
* Ayumi Ishii
* Tatsuo Ishii
* Stefan Kaltenbrunner
* Amit Kapila
* Konstantin Knizhnik
* KaiGai Kohei (arrive Tuesday evening)
* Alexander Korotkov
* Ilya Kosmodemiansky
* Tom Lane
* Amit Langote
* Grant McAlister
* Mack McCauley
* Noah Misch
* Bruce Momjian
* Yugo Nagata
* Satoshi Nagayasu
* Jim Nasby
* Dave Page
* Christophe Pettus
* Paul Ramsey
* Kumar Rajeev Rastogi
* Simon Riggs
* Tetsuo Sakata
* Masahiko Sawada
* Dilip Kumar
* Marco Slot (Wednesday)
* Greg Smith
* Steve Singer (arrive Tuesday mid-afternoon)
* Jose Luis Tallon (arrives Tuesday evening)
* Rod Taylor
* Tomas Vondra
* Jan Wieck (arrive Tuesday evening)
* Chris Winters
* Nat Wyatt
* Naoya Anzai (arrive Tuesday evening)
* David Steele (arrive Tuesday evening)
* Ingmar Alting
* Mehmet Emin KARAKAŞ
* Yasin TATAR
* Fabrízio de Royes Mello
* Euler Taveira
* Fabio Telles
* Dan Shuster
* Arul Shaji
* Motoyuki Kawaba (arrive Tuesday evening)
* Yurie Enomoto

=Topics=

'''Please add any topics you wish covered to the table.'''

'''For any topics you are requesting or presenting on, please add your name in the Required column.'''

'''For any topics you would like to attend, please add your name in the Interested column.'''

{| border="1" cellpadding="4" cellspacing="0"
!Topic
!Policy
!Taker of Notes
!Required Attendees
!Interested Attendees

|- style="background-color:lightgray;"
|Picture!
|Open
|
|All!
|All!

|- style="background-color:lightgray;"
|pgAdmin4
|Open
|
|Dave Page, Stephen Frost
|Magnus Hagander, Joe Conway, David Steele, Fabrízio de Royes Mello, Satoshi Nagayasu

|- style="background-color:lightgray;"
|Infrastructure Q&A
|Open
|
|Dave Page, Stephen Frost, Stefan Kaltenbrunner, Magnus Hagander, Joe Conway
|

|- style="background-color:lightgray;"
|WWW Team Meeting
|Open
|
|Dave Page, Stephen Frost, Stefan Kaltenbrunner, Magnus Hagander
|

|- style="background-color:lightgray;"
|Advocacy Team Meeting
|Open
|
|Stephen Frost
|Magnus Hagander, Greg Smith, Jim Nasby, Josh Berkus, Joe Conway

|- style="background-color:lightgray;"
|Vertical Scalability w.r.t Writes
|Open
|Amit Kapila
|Amit Kapila
|Greg Smith, Hannu Valtonen, Ilya Kosmodemiansky, Tomas Vondra, Grant McAlister, Joe Conway, Peter Geoghegan, Kyotaro Horiguchi, Simon Riggs, Amit Langote, Andres Freund, Robert Haas, David Steele, Rod Taylor, Jim Nasby, Chris Winters, Nat Wyatt, Noah Misch, Masao Fujii, Mehmet Emin KARAKAŞ, Christophe Pettus, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Andrew Dunstan, Mack McCauley, Masahiko Sawada, Shigeru HANADA

|- style="background-color:lightgray;"
|Security Team Meeting
|Closed
|
|Heikki Linnakangas, Stephen Frost, Magnus Hagander
|Noah Misch, Álvaro Herrera, Andres Freund, Robert Haas, Tom Lane, Andrew Dunstan

|- style="background-color:lightgray;"
|Native Compilation + LLVM
|Open
|
|Kumar Rajeev Rastogi
|Jeff Davis, Ozgun Erdogan, Tomas Vondra, Peter Geoghegan, Robert Haas, Chris Browne, Josh Berkus, Ingmar Alting, Masao Fujii, Christophe Pettus, Jose Luis Tallon

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|Horizontal Scalability / Sharding in PostgreSQL]] - ground covered so far and remaining to be covered.
|Open
|
|Ahsan Hadi, Ashutosh Bapat, Etsuro Fujita
|Hannu Valtonen, Jeff Davis, Amit Langote, Kyotaro Horiguchi, Tetsuo Sakata, Simon Riggs, Robert Haas, David Steele, Rod Taylor, Chris Browne, Jim Nasby, Josh Berkus, Chris Winters, Masao Fujii, Mehmet Emin KARAKAŞ, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Satoshi Nagayasu, Andrew Dunstan, Mack McCauley, Shigeru HANADA

|- style="background-color:lightgray;"
|[[PGCAC Board Meeting 2015]]
|Open*
|Josh Berkus
|Josh Berkus, Chris Browne, Steve Singer, Dan Langille, Dave Page
|

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|pgPool2 towards version 3.5]]
|Open
|
|Tatsuo Ishii
|Ashutosh Bapat, Ahsan Hadi, Yurie Enomoto

|- style="background-color:lightgray;"
|Partitioning
|Open
|
|Amit Langote
|Hannu Valtonen, Ashutosh Bapat, Jeff Davis, Kyotaro Horiguchi, KaiGai Kohei, Noah Misch, Tetsuo Sakata, Peter Geoghegan, Álvaro Herrera, Thierry Husson, Joe Conway, Naoya Anzai, Robert Haas, David Steele, Chris Browne, Jim Nasby, Josh Berkus, Masao Fujii, Mehmet Emin KARAKAŞ, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Andrew Dunstan, Jose Luis Tallon, Yurie Enomoto, Mack McCauley, Masahiko Sawada, Shigeru HANADA

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|Foreign Data Wrapper enhancements]]
|Open
|
|Shigeru Hanada, Etsuro Fujita
|KaiGai Kohei, Hannu Valtonen, Ashutosh Bapat, Jeff Davis, Amit Langote, Kyotaro Horiguchi, Noah Misch, Tetsuo Sakata, Naoya Anzai, Robert Haas, Jim Nasby, Josh Berkus, Chris Winters, Ingmar Alting, Mehmet Emin KARAKAŞ, Jose Luis Tallon

|- style="background-color:lightgray;"
|Utilization of modern semiconductors - GPU, SSD, NVRAM, FPGA, PMEM...
|Open
|
|KaiGai Kohei
|Matthew Wilcox, Josh Berkus, Satoshi Nagayasu, Jose Luis Tallon, Naoya Anzai, Mack McCauley, Shigeru HANADA

|- style="background-color:lightgray;"
|Native Columnar Storage
|Open
|
|Álvaro Herrera
|Ozgun Erdogan, Tomas Vondra, KaiGai Kohei, Amit Kapila, Josh Berkus, Naoya Anzai, Amit Langote, Robert Haas, David Steele, Rod Taylor, Chris Browne, Jim Nasby, Chris Winters, Nat Wyatt, Masao Fujii, Fabrízio de Royes Mello, Euler Taveira, Satoshi Nagayasu, Mack McCauley, Masahiko Sawada, Shigeru HANADA

|- style="background-color:lightgray;"
|Future of PostgreSQL shared-nothing cluster
|Open
|
|Konstantin Knizhnik, Alexander Korotkov, Oleg Bartunov
|Jeff Davis, Amit Langote, Kumar Rajeev Rastogi, Josh Berkus, Simon Riggs, Robert Haas, Jim Nasby, Masao Fujii, Christophe Pettus, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Yurie Enomoto, Masahiko Sawada, Shigeru HANADA

|- style="background-color:lightgray;"
|[[PostgreSQL and SMR Drives]] - the future of magnetic storage means very expensive random writes
|Open
|
|Jeff Davis
|Kumar Rajeev Rastogi, Noah Misch, Ilya Kosmodemiansky, Amit Kapila, Simon Riggs, Rod Taylor, Jim Nasby, Josh Berkus, Nat Wyatt, Christophe Pettus, Satoshi Nagayasu

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|Slony Development]]
|Open
|
| Steve Singer, Chris Browne, Jan Wieck
| Josh Berkus, Rod Taylor, Jim Nasby, Satoshi Nagayasu, Yurie Enomoto

|- style="background-color:lightgray;"
|[[DockerizingPostgres|Dockerizing Postgres]]
|Open
|
| Josh Berkus
| Simon Riggs, Nat Wyatt, Christophe Pettus, Fabrízio de Royes Mello

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|Bi Directional Replication & Logical Decoding|BDR]]
|Open
|
| Simon Riggs
| Andres Freund, Jim Nasby, Josh Berkus, Mehmet Emin KARAKAŞ, Christophe Pettus, Fabrízio de Royes Mello, Euler Taveira

|- style="background-color:lightgray;"
|Autonomous Transactions
|Open
|
| Simon Riggs, Kumar Rajeev Rastogi
| David Steele, Jim Nasby, Josh Berkus, Nat Wyatt, Masao Fujii, Euler Taveira, Andrew Dunstan, Masahiko Sawada

|- style="background-color:lightgray;"
|Audit Logging
|Open
|
| David Steele
| Josh Berkus, Nat Wyatt, Masao Fujii, Christophe Pettus, Fabio Telles, Satoshi Nagayasu, Yurie Enomoto, Mack McCauley, Masahiko Sawada

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|pg_shard v2.0 and Lessons Learned from NoSQL Databases ]]
|Open
|
| Ozgun Erdogan, Marco Slot
| Josh Berkus, Jim Nasby, Chris Winters, Mehmet Emin KARAKAŞ, Fabrízio de Royes Mello, Satoshi Nagayasu, Shigeru HANADA

|- style="background-color:lightgray;"
|Direction of json and jsonb
|Open
|
| Andrew Dunstan
| Josh Berkus, Christophe Pettus, Masahiko Sawada

|- style="background-color:lightgray;"
|Native Sparse Set Type
|Open
|
| Andrew Dunstan
| Josh Berkus

|- style="background-color:lightgray;"
|Testing Framework Adequacy
|Open
|
| Andrew Dunstan
| Josh Berkus, Christophe Pettus, Mack McCauley

|}

== pgAdmin4 ==

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Infrastructure Q&A ==

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== WWW Team Meeting ==

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Advocacy Team Meeting ==

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Vertical Scalability w.r.t Writes ==
Purpose of this discussion:
* Discuss the priority/importance of various performance and scalability problems
* Solutions/ideas to solve the most important problem(s)
* Is pgbench sufficient to capture the various kinds of real-world workloads?

Some of the important performance problems I have in mind are:
* Avoid/Reduce Vacuum Freeze
* Bloat
** Heap
** Index
* Instability in TPS due to checkpointer flush
* Tuple size
** Heap Tuple Header
** Alignment in index can lead to bigger index size for simple datatypes

Scalability bottlenecks:
* Locks
** ProcArrayLock
** WALWriteLock
** CLOGControlLock
** Lock for Relation Extension
* Writes, especially when data doesn't fit in shared buffers
** Write Performance
** Double Buffering
** In-memory table/tablespaces
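
The index alignment item in the list above can be illustrated with some rough arithmetic. The sketch below assumes an 8-byte index tuple header and 8-byte maximum alignment, as is typical on 64-bit builds; both constants and the helper names are assumptions made for illustration, and per-tuple line pointer overhead is ignored.

```python
# Illustrative arithmetic only; the constants below are assumptions
# (typical 64-bit build), not measurements.
MAXIMUM_ALIGNOF = 8        # assumed maximum alignment in bytes
INDEX_TUPLE_HEADER = 8     # assumed index tuple header size in bytes


def max_align(length, alignment=MAXIMUM_ALIGNOF):
    """Round length up to the next multiple of alignment (a power of two)."""
    return (length + alignment - 1) & ~(alignment - 1)


def index_tuple_size(key_width):
    """Estimated on-page size of an index tuple with one fixed-width key."""
    return max_align(INDEX_TUPLE_HEADER + key_width)


if __name__ == "__main__":
    for type_name, key_width in (("int2", 2), ("int4", 4), ("int8", 8)):
        print(type_name, key_width, index_tuple_size(key_width))
```

Under these assumptions a 2-byte, a 4-byte and an 8-byte key all produce a 16-byte index tuple, so choosing a narrower simple datatype does not make the index smaller.
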
=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Security Team Meeting ==

=== Meeting Notes ===
* This will be, ahem, secure, so nothing will be written here

== Partitioning ==
A proposal to enhance partitioning support in PostgreSQL was posted to -hackers last year and led to discussion of several implementation ideas. Late in that discussion, a crude WIP patch was also posted, containing experimental syntax, catalog changes, an idea for the internal representation, and a proof-of-concept INSERT tuple routing function demonstrating the practicality of that representation. It would be nice to carry the discussion forward while implementing a patch that can be proposed and reviewed early in the 9.6 development cycle. Points to discuss could be:

* New features and the old inheritance-based implementation
* Planner considerations for the new partitioned tables
* Need for a new Append-like executor node for partitioned tables
* DML/DDL restrictions on partitioned tables and partitions
* Any other considerations for partitioned tables and partitions that are explicitly defined as such at a layer above the storage layer
* Other points that come up
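
As a rough illustration of what INSERT tuple routing means for a range-partitioned table, here is a minimal Python sketch. It is not the internal representation used by the WIP patch mentioned above; the class name, the sorted list of exclusive upper bounds, and the partition names are all hypothetical.

```python
from bisect import bisect_right


class RangePartitionedTable:
    """Hypothetical range-partitioned table.

    Partition i accepts keys that are below upper_bounds[i] and not below
    upper_bounds[i - 1]; the first partition is unbounded below.
    """

    def __init__(self, upper_bounds, partition_names):
        if len(upper_bounds) != len(partition_names):
            raise ValueError("one upper bound is needed per partition")
        if list(upper_bounds) != sorted(upper_bounds):
            raise ValueError("upper bounds must be sorted")
        self.upper_bounds = list(upper_bounds)
        self.partition_names = list(partition_names)

    def route(self, key):
        """Return the name of the partition that should receive key."""
        # Binary search: count how many upper bounds are <= key.
        index = bisect_right(self.upper_bounds, key)
        if index == len(self.partition_names):
            raise LookupError(f"no partition accepts key {key!r}")
        return self.partition_names[index]


if __name__ == "__main__":
    table = RangePartitionedTable([100, 200, 300], ["p0", "p1", "p2"])
    for key in (5, 100, 299):
        print(key, table.route(key))
```

Keeping the bounds sorted lets routing use a binary search rather than testing each partition's constraint in turn, which is one motivation for a dedicated internal representation.
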

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Utilization of modern semiconductors ==
The recent evolution of semiconductor devices makes us reconsider the assumptions we stand on, and making use of their power is a key to innovation.
We'd like to discuss the future direction in the short term and in the middle/long term.

* GPU, FPGA - have an advantage for simple but massive amounts of calculation. They allow a DBMS to act as a data processing platform that works close to the data.

* SSD, NVRAM - likely game changers for the storage layer on both read and write workloads. A DBMS also has to pay attention to the characteristics of these devices.

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Future of PostgreSQL shared-nothing cluster ==

=== Meeting Notes ===
In 2015 the PostgreSQL Professional company started a project to migrate PostgreSQL-XL to the PostgreSQL 9.4 codebase and to improve its stability and usability. At this unconference session we'd like to discuss current progress and further development. More generally, we'd like to find ways to reduce the differences between PostgreSQL and its shared-nothing cluster fork so that the maintenance burden becomes manageable.

=== Attendees ===
* To be filled in

== PostgreSQL and SMR Drives ==

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Native Columnar Storage ==

See Alvaro's [http://www.postgresql.org/message-id/20150611230316.GM133018@postgresql.org email to Hackers].

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Audit Logging ==

Audit logging is an important part of an RDBMS for many users and applications. Discuss how best to incorporate audit logging into PostgreSQL and what must be included at a minimum to make the feature viable.

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Direction of json and jsonb ==

=== Meeting Notes ===
What are the future needs of the JSON types? Recent suggestions have included an indexable "exists" operator, the JSON Pointer and JSON Patch standards,
recursive merge, intersection, and being able to assign to a subdocument (json#>path as an lvalue). What are people using these types for, and what are
the major gaps in functionality?
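
To make two of the suggestions above concrete, the following Python sketch shows one possible meaning of "recursive merge" and of assigning to a subdocument by path, using plain dicts in place of json/jsonb values. The function names and the exact semantics (the right-hand side wins on conflicts, missing path elements are an error) are assumptions for discussion, not a proposed PostgreSQL API.

```python
import copy


def recursive_merge(left, right):
    """Merge two documents; nested objects are merged key by key and,
    for anything else, the value from the right-hand side wins."""
    if isinstance(left, dict) and isinstance(right, dict):
        merged = copy.deepcopy(left)
        for key, value in right.items():
            if key in left:
                merged[key] = recursive_merge(left[key], value)
            else:
                merged[key] = copy.deepcopy(value)
        return merged
    return copy.deepcopy(right)


def assign_at_path(document, path, value):
    """Return a copy of document with the element at path replaced,
    roughly what treating json#>path as an lvalue would do."""
    result = copy.deepcopy(document)
    node = result
    for key in path[:-1]:
        node = node[key]          # raises KeyError if the path is missing
    node[path[-1]] = value
    return result


if __name__ == "__main__":
    base = {"a": {"b": 1, "c": 2}}
    print(recursive_merge(base, {"a": {"c": 3}, "d": 4}))
    print(assign_at_path(base, ["a", "b"], 42))
```

Both helpers return new documents instead of modifying their inputs, which matches how SQL values behave.
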

=== Attendees ===
* To be filled in

== Native Sparse Set Type ==

Sets over small domains can be reasonably modeled by bitmaps, but sets over very large domains cannot.
Is there a need for such sets? How would we implement them? Arrays? Balanced trees? Something else?
What types of sets would we allow? Anything with Btree operators, or more restricted? What would the notation look like?
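
The trade-off described above can be sketched in Python as follows. A bitmap needs storage proportional to the size of the domain, while a sorted array (one of the candidate implementations mentioned) needs storage proportional to the number of members. Both classes and their method names are hypothetical and only meant to frame the discussion.

```python
from bisect import bisect_left


class BitmapSet:
    """Set over the integer domain 0 .. domain_size - 1, one bit per value."""

    def __init__(self, domain_size):
        self.domain_size = domain_size
        self.bits = bytearray((domain_size + 7) // 8)

    def add(self, value):
        if not 0 <= value < self.domain_size:
            raise ValueError("value outside the domain")
        self.bits[value >> 3] |= 1 << (value & 7)

    def __contains__(self, value):
        if not 0 <= value < self.domain_size:
            return False
        return bool(self.bits[value >> 3] & (1 << (value & 7)))

    def storage_bytes(self):
        return len(self.bits)


class SortedArraySet:
    """Sparse set kept as a sorted list; works for any ordered type."""

    def __init__(self):
        self.items = []

    def add(self, value):
        position = bisect_left(self.items, value)
        if position == len(self.items) or self.items[position] != value:
            self.items.insert(position, value)

    def __contains__(self, value):
        position = bisect_left(self.items, value)
        return position < len(self.items) and self.items[position] == value


if __name__ == "__main__":
    small = BitmapSet(64)
    small.add(3)
    sparse = SortedArraySet()
    sparse.add(10**12)
    print(3 in small, small.storage_bytes(), 10**12 in sparse, len(sparse.items))
```

A bitmap covering a domain of 10**12 values would need about 125 GB regardless of how many members the set has, whereas the sorted array above stores only the members actually present. The sorted array only requires an ordering on its elements, which corresponds to the "anything with Btree operators" option.
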

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Testing Framework Adequacy ==

The buildfarm is more than 10 years old, and the testing needs of Postgres and its software ecosystem have changed radically in that time.
What do we now need in the way of testing? How do we test complex arrangements such as the various sorts of replication in an automated way?
Do we need a new framework, or can the existing framework be adapted to our needs?

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

PgCon 2015 Developer Unconference

2015-06-15T07:16:59Z

Hanada: /* Topics */

An Unconference-style multi-track (three tracks are currently planned) event for active PostgreSQL developers will be held from the afternoon of Tuesday 16 June, 2015 through Wednesday 17 June 2015 at the University of Ottawa, as part of PGCon 2015. This Unconference will be focused on technical PostgreSQL development discussions ranging from Clustering and replication to the infrastructure which runs postgresql.org.

'''Please add your name to the topics you are interested in attending!'''

== Topics ==

Developers are asked to propose topics which they wish either to present on or to have another individual present on. All topics should be clearly related to PostgreSQL development. Each topic should be added to the table below, with any required attendees (presumably at least the presenter, and the requester if different) listed. Other attendees of the Unconference who are interested should list themselves in the Interested column. Note that non-technical topics related to PostgreSQL development will be addressed during the invite-only Developer meeting, held in advance of the Unconference. Further, the Developer Unconference is for developers of PostgreSQL, and user-oriented topics are not appropriate for this venue.

== Slot assignment ==

Slots will be assigned based on the topic's interest among the attendees of the Unconference (the number of individuals who listed themselves as attendees). Final determination on any particular topic will be made by the Unconference organizers. Please only participate if you are confident of your attendance at the Unconference.

== Venue ==

These meetings will be held at the University of Ottawa. The topics selected, the schedule and the specific room assignments will be published closer to the event and will be based on the information provided here. Please direct any questions to Dave Page (dpage@pgadmin.org).

== Sponsorship ==

The Developer Unconference will be sponsored by Salesforce.com, and by NTT Open Source for the Clustering Track.

== Attendees ==

While the Unconference is open to all attendees of PGCon, formal invitations will be sent to specific PostgreSQL developers, including the Core team, Major Contributors, Committers, and other developers who have been involved in the 9.4 release. These invitations are intended to encourage developers to attend the Unconference but we are unable to guarantee every invitee a speaking slot.

== RSVPs ==

The following people have RSVPed to the meeting (in alphabetical order, by surname):

* Ashutosh Bapat
* Oleg Bartunov
* Josh Berkus
* Christopher Browne
* Joe Conway
* Jeff Davis
* Andrew Dunstan
* Ozgun Erdogan
* Andres Freund
* Stephen Frost
* Masao Fujii
* Etsuro Fujita
* Peter Geoghegan
* Kevin Grittner
* Robert Haas
* Ahsan Hadi
* Magnus Hagander
* Shigeru Hanada
* Álvaro Herrera
* Kyotaro Horiguchi
* Thierry Husson (Wednesday @ 11am)
* Ayumi Ishii
* Tatsuo Ishii
* Stefan Kaltenbrunner
* Amit Kapila
* Konstantin Knizhnik
* KaiGai Kohei (arrive Tuesday evening)
* Alexander Korotkov
* Ilya Kosmodemiansky
* Tom Lane
* Amit Langote
* Grant McAlister
* Mack McCauley
* Noah Misch
* Bruce Momjian
* Yugo Nagata
* Satoshi Nagayasu
* Jim Nasby
* Dave Page
* Christophe Pettus
* Paul Ramsey
* Kumar Rajeev Rastogi
* Simon Riggs
* Tetsuo Sakata
* Masahiko Sawada
* Dilip Kumar
* Marco Slot (Wednesday)
* Greg Smith
* Steve Singer (arrive Tuesday mid-afternoon)
* Jose Luis Tallon (arrives Tuesday evening)
* Rod Taylor
* Tomas Vondra
* Jan Wieck (arrive Tuesday evening)
* Chris Winters
* Nat Wyatt
* Naoya Anzai (arrive Tuesday evening)
* David Steele (arrive Tuesday evening)
* Ingmar Alting
* Mehmet Emin KARAKAŞ
* Yasin TATAR
* Fabrízio de Royes Mello
* Euler Taveira
* Fabio Telles
* Dan Shuster
* Arul Shaji
* Motoyuki Kawaba (arrive Tuesday evening)
* Yurie Enomoto

=Topics=

'''Please add any topics you wish covered to the table.'''

'''For any topics you are requesting or presenting on, please add your name in the Required column.'''

'''For any topics you would like to attend, please add your name in the Interested column.'''

{| border="1" cellpadding="4" cellspacing="0"
!Topic
!Policy
!Taker of Notes
!Required Attendees
!Interested Attendees

|- style="background-color:lightgray;"
|Picture!
|Open
|
|All!
|All!

|- style="background-color:lightgray;"
|pgAdmin4
|Open
|
|Dave Page, Stephen Frost
|Magnus Hagander, Joe Conway, David Steele, Fabrízio de Royes Mello, Satoshi Nagayasu

|- style="background-color:lightgray;"
|Infrastructure Q&A
|Open
|
|Dave Page, Stephen Frost, Stefan Kaltenbrunner, Magnus Hagander, Joe Conway
|

|- style="background-color:lightgray;"
|WWW Team Meeting
|Open
|
|Dave Page, Stephen Frost, Stefan Kaltenbrunner, Magnus Hagander
|

|- style="background-color:lightgray;"
|Advocacy Team Meeting
|Open
|
|Stephen Frost
|Magnus Hagander, Greg Smith, Jim Nasby, Josh Berkus, Joe Conway

|- style="background-color:lightgray;"
|Vertical Scalability w.r.t Writes
|Open
|Amit Kapila
|Amit Kapila
|Greg Smith, Hannu Valtonen, Ilya Kosmodemiansky, Tomas Vondra, Grant McAlister, Joe Conway, Peter Geoghegan, Kyotaro Horiguchi, Simon Riggs, Amit Langote, Andres Freund, Robert Haas, David Steele, Rod Taylor, Jim Nasby, Chris Winters, Nat Wyatt, Noah Misch, Masao Fujii, Mehmet Emin KARAKAŞ, Christophe Pettus, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Andrew Dunstan, Mack McCauley, Masahiko Sawada, Shigeru HANADA

|- style="background-color:lightgray;"
|Security Team Meeting
|Closed
|
|Heikki Linnakangas, Stephen Frost, Magnus Hagander
|Noah Misch, Álvaro Herrera, Andres Freund, Robert Haas, Tom Lane, Andrew Dunstan

|- style="background-color:lightgray;"
|Native Compilation + LLVM
|Open
|
|Kumar Rajeev Rastogi
|Jeff Davis, Ozgun Erdogan, Tomas Vondra, Peter Geoghegan, Robert Haas, Chris Browne, Josh Berkus, Ingmar Alting, Masao Fujii, Christophe Pettus, Jose Luis Tallon

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|Horizontal Scalability / Sharding in PostgreSQL]] - ground covered so far and remaining to be covered.
|Open
|
|Ahsan Hadi, Ashutosh Bapat, Etsuro Fujita
|Hannu Valtonen, Jeff Davis, Amit Langote, Kyotaro Horiguchi, Tetsuo Sakata, Simon Riggs, Robert Haas, David Steele, Rod Taylor, Chris Browne, Jim Nasby, Josh Berkus, Chris Winters, Masao Fujii, Mehmet Emin KARAKAŞ, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Satoshi Nagayasu, Andrew Dunstan, Mack McCauley, Shigeru HANADA

|- style="background-color:lightgray;"
|[[PGCAC Board Meeting 2015]]
|Open*
|Josh Berkus
|Josh Berkus, Chris Browne, Steve Singer, Dan Langille, Dave Page
|

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|pgPool2 towards version 3.5]]
|Open
|
|Tatsuo Ishii
|Ashutosh Bapat, Ahsan Hadi, Yurie Enomoto

|- style="background-color:lightgray;"
|Partitioning
|Open
|
|Amit Langote
|Hannu Valtonen, Ashutosh Bapat, Jeff Davis, Kyotaro Horiguchi, KaiGai Kohei, Noah Misch, Tetsuo Sakata, Peter Geoghegan, Álvaro Herrera, Thierry Husson, Joe Conway, Naoya Anzai, Robert Haas, David Steele, Chris Browne, Jim Nasby, Josh Berkus, Masao Fujii, Mehmet Emin KARAKAŞ, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Andrew Dunstan, Jose Luis Tallon, Yurie Enomoto, Mack McCauley, Masahiko Sawada, Shigeru HANADA

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|Foreign Data Wrapper enhancements]]
|Open
|
|Shigeru Hanada, Etsuro Fujita
|KaiGai Kohei, Hannu Valtonen, Ashutosh Bapat, Jeff Davis, Amit Langote, Kyotaro Horiguchi, Noah Misch, Tetsuo Sakata, Naoya Anzai, Robert Haas, Jim Nasby, Josh Berkus, Chris Winters, Ingmar Alting, Mehmet Emin KARAKAŞ, Jose Luis Tallon

|- style="background-color:lightgray;"
|Utilization of modern semiconductors - GPU, SSD, NVRAM, FPGA, PMEM...
|Open
|
|KaiGai Kohei
|Matthew Wilcox, Josh Berkus, Satoshi Nagayasu, Jose Luis Tallon, Naoya Anzai, Mack McCauley, Shigeru HANADA

|- style="background-color:lightgray;"
|Native Columnar Storage
|Open
|
|Álvaro Herrera
|Ozgun Erdogan, Tomas Vondra, KaiGai Kohei, Amit Kapila, Josh Berkus, Naoya Anzai, Amit Langote, Robert Haas, David Steele, Rod Taylor, Chris Browne, Jim Nasby, Chris Winters, Nat Wyatt, Masao Fujii, Fabrízio de Royes Mello, Euler Taveira, Satoshi Nagayasu, Mack McCauley, Masahiko Sawada, Shigeru HANADA

|- style="background-color:lightgray;"
|Future of PostgreSQL shared-nothing cluster
|Open
|
|Konstantin Knizhnik, Alexander Korotkov, Oleg Bartunov
|Jeff Davis, Amit Langote, Kumar Rajeev Rastogi, Josh Berkus, Simon Riggs, Robert Haas, Jim Nasby, Masao Fujii, Christophe Pettus, Fabrízio de Royes Mello, Euler Taveira, Fabio Telles, Yurie Enomoto, Masahiko Sawada

|- style="background-color:lightgray;"
|[[PostgreSQL and SMR Drives]] - the future of magnetic storage means very expensive random writes
|Open
|
|Jeff Davis
|Kumar Rajeev Rastogi, Noah Misch, Ilya Kosmodemiansky, Amit Kapila, Simon Riggs, Rod Taylor, Jim Nasby, Josh Berkus, Nat Wyatt, Christophe Pettus, Satoshi Nagayasu

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|Slony Development]]
|Open
|
| Steve Singer, Chris Browne, Jan Wieck
| Josh Berkus, Rod Taylor, Jim Nasby, Satoshi Nagayasu, Yurie Enomoto

|- style="background-color:lightgray;"
|[[DockerizingPostgres|Dockerizing Postgres]]
|Open
|
| Josh Berkus
| Simon Riggs, Nat Wyatt, Christophe Pettus, Fabrízio de Royes Mello

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|Bi Directional Replication & Logical Decoding|BDR]]
|Open
|
| Simon Riggs
| Andres Freund, Jim Nasby, Josh Berkus, Mehmet Emin KARAKAŞ, Christophe Pettus, Fabrízio de Royes Mello, Euler Taveira

|- style="background-color:lightgray;"
|Autonomous Transactions
|Open
|
| Simon Riggs, Kumar Rajeev Rastogi
| David Steele, Jim Nasby, Josh Berkus, Nat Wyatt, Masao Fujii, Euler Taveira, Andrew Dunstan, Masahiko Sawada

|- style="background-color:lightgray;"
|Audit Logging
|Open
|
| David Steele
| Josh Berkus, Nat Wyatt, Masao Fujii, Christophe Pettus, Fabio Telles, Satoshi Nagayasu, Yurie Enomoto, Mack McCauley, Masahiko Sawada

|- style="background-color:lightgray;"
|[[PgCon2015ClusterSummit|pg_shard v2.0 and Lessons Learned from NoSQL Databases ]]
|Open
|
| Ozgun Erdogan, Marco Slot
| Josh Berkus, Jim Nasby, Chris Winters, Mehmet Emin KARAKAŞ, Fabrízio de Royes Mello, Satoshi Nagayasu, Shigeru HANADA

|- style="background-color:lightgray;"
|Direction of json and jsonb
|Open
|
| Andrew Dunstan
| Josh Berkus, Christophe Pettus, Masahiko Sawada

|- style="background-color:lightgray;"
|Native Sparse Set Type
|Open
|
| Andrew Dunstan
| Josh Berkus

|- style="background-color:lightgray;"
|Testing Framework Adequacy
|Open
|
| Andrew Dunstan
| Josh Berkus, Christophe Pettus, Mack McCauley

|}

== pgAdmin4 ==

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Infrastructure Q&A ==

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== WWW Team Meeting ==

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Advocacy Team Meeting ==

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Vertical Scalability w.r.t Writes ==
Purpose of this discussion:
* Discuss the priority/importance of various performance and scalability problems
* Solutions/ideas to solve the most important problem(s)
* Is pgbench sufficient to capture the various kinds of real-world workloads?

Some of the important performance problems I have in mind are:
* Avoid/Reduce Vacuum Freeze
* Bloat
** Heap
** Index
* Instability in TPS due to checkpointer flush
* Tuple size
** Heap Tuple Header
** Alignment in index can lead to bigger index size for simple datatypes

Scalability bottlenecks:
* Locks
** ProcArrayLock
** WALWriteLock
** CLOGControlLock
** Lock for Relation Extension
* Writes, especially when data doesn't fit in shared buffers
** Write Performance
** Double Buffering
** In-memory table/tablespaces
=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Security Team Meeting ==

=== Meeting Notes ===
* This will be, ahem, secure, so nothing will be written here

== Partitioning ==
A proposal to enhance partitioning support in PostgreSQL was posted to -hackers last year and led to discussion of several implementation ideas. Late in that discussion, a crude WIP patch was also posted, containing experimental syntax, catalog changes, an idea for the internal representation, and a proof-of-concept INSERT tuple routing function demonstrating the practicality of that representation. It would be nice to carry the discussion forward while implementing a patch that can be proposed and reviewed early in the 9.6 development cycle. Points to discuss could be:

* New features and the old inheritance-based implementation
* Planner considerations for the new partitioned tables
* Need for a new Append-like executor node for partitioned tables
* DML/DDL restrictions on partitioned tables and partitions
* Any other considerations for partitioned tables and partitions that are explicitly defined as such at a layer above the storage layer
* Other points that come up

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Utilization of modern semiconductors ==
The recent evolution of semiconductor devices makes us reconsider the assumptions we stand on, and making use of their power is a key to innovation.
We'd like to discuss the future direction in the short term and in the middle/long term.

* GPU, FPGA - have an advantage for simple but massive amounts of calculation. They allow a DBMS to act as a data processing platform that works close to the data.

* SSD, NVRAM - likely game changers for the storage layer on both read and write workloads. A DBMS also has to pay attention to the characteristics of these devices.

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Future of PostgreSQL shared-nothing cluster ==

=== Meeting Notes ===
In 2015 the PostgreSQL Professional company started a project to migrate PostgreSQL-XL to the PostgreSQL 9.4 codebase and to improve its stability and usability. At this unconference session we'd like to discuss current progress and further development. More generally, we'd like to find ways to reduce the differences between PostgreSQL and its shared-nothing cluster fork so that the maintenance burden becomes manageable.

=== Attendees ===
* To be filled in

== PostgreSQL and SMR Drives ==

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Native Columnar Storage ==

See Alvaro's [http://www.postgresql.org/message-id/20150611230316.GM133018@postgresql.org email to Hackers].

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Audit Logging ==

Audit logging is an important part of an RDBMS for many users and applications. Discuss how best to incorporate audit logging into PostgreSQL and what must be included at a minimum to make the feature viable.

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Direction of json and jsonb ==

=== Meeting Notes ===
What are the future needs of the JSON types? Recent suggestions have included an indexable "exists" operator, the JSON Pointer and JSON Patch standards,
recursive merge, intersection, and being able to assign to a subdocument (json#>path as an lvalue). What are people using these types for, and what are
the major gaps in functionality?

=== Attendees ===
* To be filled in

== Native Sparse Set Type ==

Sets over small domains can be reasonably modeled by bitmaps, but sets over very large domains cannot.
Is there a need for such sets? How would we implement them? Arrays? Balanced trees? Something else?
What types of sets would we allow? Anything with Btree operators, or more restricted? What would the notation look like?

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

== Testing Framework Adequacy ==

The buildfarm is more than 10 years old, and the testing needs of Postgres and its software ecosystem have changed radically in that time.
What do we now need in the way of testing? How do we test complex arrangements such as the various sorts of replication in an automated way?
Do we need a new framework, or can the existing framework be adapted to our needs?

=== Meeting Notes ===
* To be filled in

=== Attendees ===
* To be filled in

PgCon2015ClusterSummit

2015-05-27T07:26:43Z

Hanada: /* Attendee RSVPs */

PgCon2015ClusterSummit

2015-05-27T07:26:25Z

Hanada: /* Suggested Sessions */

Developer FAQ/ja

2014-07-04T04:57:30Z

Hanada: /* What style is used in the PostgreSQL source code? */

{{Languages}}

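== Getting Involved ==

=== How do I get involved in PostgreSQL development? ===

Download the source code and read it. For details, see "[[#How do I download the latest source tree, and how do I keep up with the latest sources?|How do I download the latest source tree, and how do I keep up with the latest sources?]]".

Subscribe to and read the [http://archives.postgresql.org/pgsql-hackers/ pgsql-hackers mailing list] (often called "hackers"). This is where the project's major developers and core members discuss development.

=== How do I download the latest source tree, and how do I keep up with the latest sources? ===

There are several ways to obtain the source tree. For occasional developers, the latest source tree snapshot is available at ftp://ftp.postgresql.org/pub/snapshot/

Regular developers obtain the source through anonymous access to the source code management system. The source tree is currently managed with git. For details on obtaining the source code from git, see http://developer.postgresql.org/pgdocs/postgres/git.html and [[Working with Git]].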

=== What development environment is required? ===

PostgreSQL is developed mostly in the C language. The source code targets the major UNIX platforms and Windows (XP, 2000 and later).

Most developers work on a Unix-like OS using open source tools such as [http://gcc.gnu.org GCC], [http://www.gnu.org/software/make/make.html GNU Make], [http://www.gnu.org/software/gdb/gdb.html GDB] and [http://www.gnu.org/software/autoconf/ Autoconf]. If you have contributed to open source software before, you are probably already familiar with these tools. Developers who use these tools on Windows use [http://www.mingw.org/ MinGW].
That said, most development on Windows is currently done with Microsoft's Visual Studio 2005 (version 8) development environment and its accompanying tools.

For the complete list of software required to build PostgreSQL, see "[http://www.postgresql.jp/document/current/html/install-requirements.html Requirements]".

Developers who rebuild the source code frequently can also pass the --enable-depend flag to configure. With this flag, when a header file is modified, all source files that depend on it are rebuilt as well.

Environment variables (for example CUSTOM_COPT) can be set in src/Makefile.custom. They are used for every compile.

=== Which items need work? ===
Outstanding features are collected in [[Todo]].

For each item, consult the mailing list [http://archives.postgresql.org/ archives], the SQL standard, and the recommended books (see: [[#開発者向きの良書はありますか?|books for developers]]).

=== How do I get involved in PostgreSQL web site development? ===
PostgreSQL web site development is discussed on the [http://archives.postgresql.org/pgsql-www/ pgsql-www mailing list] and is managed by the infrastructure team. The source code for the postgresql.org web site is stored in a Subversion repository and is provided as part of a [http://pgweb.postgresql.org Trac project].

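== Development Tools and Help ==

=== How is the source code organized? ===

Open [http://www.postgresql.org/developer/ext.backend.html How PostgreSQL Processes a Query] in a browser (it is also in the source tree at src/tools/backend/index.html). It briefly describes the data flow, the backend components shown in the flowchart, and the layout of shared memory. Clicking a rectangle in the flowchart displays its description. Clicking a directory name in a description takes you to that source directory, where you can read the actual code. Several README files in the source directories also describe the functions of individual modules, and these can be read in the browser as well.

Beyond the documentation in the source tree, papers and presentations describing the code are available at http://www.postgresql.org/developer/coding. Excellent presentation material can also be found at http://neilconway.org/talks/hacking/.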

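=== What tools are available for developers? ===

First, all the files in the src/tools directory are intended for developers.

RELEASE_CHANGES items that have to be changed for each release
backend description and flowchart of the backend directories
ccsym finds the standard defines made by your compiler
copyright copyright notices

entab converts spaces to tabs (used by pgindent)
find_static finds functions that could be made static
find_typedef finds typedefs in the source code
find_badmacros finds macros that use brackets incorrectly
fsync a script that compares the cost of the system calls used to sync files
make_ctags makes a vi 'tags' file in each directory
make_diff makes diffs between *.orig files and the source
make_etags makes emacs 'etags' files
make_keywords compares our keywords with SQL'92
make_mkid makes mkid ID files
pgcvslog generates the list of changes for each release
pginclude scripts for adding and removing include files
pgindent indents source files
pgtest a semi-automated build system
thread a thread testing script

src/include/catalog also contains the following files:

unused_oids a script that finds OIDs not yet used in the system catalogs
duplicate_oids a script that finds duplicate OIDs in the system catalogs

tools/backend has already been described in another question above.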

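Second, you need an editor that can handle tag files, so that you can jump from a function call directly to the function definition, descend further into lower-level functions, and then return to where you started. Many editors support tags or etags.

Third, get id-utils from ftp://ftp.gnu.org/gnu/id-utils/.

Running tools/make_mkid builds an archive of source symbols that can be searched rapidly.

Some developers use cscope (http://cscope.sf.net/). Others use glimpse (http://webglimpse.net/).

tools/make_diff is a tool for creating patch diff files. The patches are produced in context format. Use this context format when posting patches to the mailing lists.

pgindent reformats source code to the project's standard style. It is normally run at the end of each development cycle. For more on source code style, see also [[#What.27s_the_formatting_style_used_in_PostgreSQL_source_code.3F|this question]].

pginclude is a set of scripts that add needed #include lines and remove unneeded ones.

When adding built-in objects such as types or functions, you must assign OIDs to them. The convention is to assign OIDs in the range 1-9999 by hand, without duplicates. (Mechanically it would be enough for OIDs to be unique within each system catalog, but for clarity they are kept unique across the whole system.) The unused_oids script in src/include/catalog lists the OIDs that are currently unused. When assigning a new OID, consult unused_oids and pick an unused one, preferably near the OIDs of existing objects with related functionality. The duplicate_oids script detects mistaken, duplicate assignments.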
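The idea behind picking an unused OID can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions: the function name first_unused_oid and its interface are hypothetical, and it is not how the unused_oids script itself is implemented.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the PostgreSQL tree: return the lowest
 * value in [lo, hi] that does not appear in used[], or 0 if every value in
 * the range is taken.  used[] must be sorted in ascending order.
 */
static unsigned int
first_unused_oid(const unsigned int *used, size_t nused,
                 unsigned int lo, unsigned int hi)
{
    unsigned int candidate = lo;
    size_t      i;

    assert(lo >= 1 && lo <= hi);

    for (i = 0; i < nused; i++)
    {
        if (used[i] < candidate)
            continue;           /* below the range of interest, or a duplicate */
        if (used[i] > candidate)
            break;              /* found a gap at "candidate" */
        /* used[i] == candidate: this value is taken, try the next one */
        if (candidate == hi)
            return 0;           /* range exhausted */
        candidate++;
    }
    return candidate;
}
```

In practice you would still run unused_oids and duplicate_oids from src/include/catalog. The sketch only shows why a sorted list of assigned OIDs makes gaps easy to find.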

=== What's the formatting style used in PostgreSQL source code? ===

Our standard format is BSD style. Code is indented with tabs, and each tab is four spaces wide. Set the tab width of your editor or file viewer to four spaces.

For '''vi''', put the following in <code>.exrc</code> or <code>.vimrc</code>:
set tabstop=4 shiftwidth=4 noexpandtab

For '''less''' or '''more''', specify <code>-x4</code> to get the correct indentation.

The tools/editors directory contains sample settings for emacs, xemacs, and vim that help you keep to the PostgreSQL coding style.

pgindent formats code by running the operating system's indent tool with the appropriate flags. pgindent is run on all source files during the beta test period, so that every source file is automatically brought into a consistent format. Comments whose line breaks must be preserved as written should be formatted as block comments: start the comment with /*------. Block comments are never reflowed.

See also "[http://www.postgresql.jp/document/current/html/source-format.html Formatting]" in the documentation. In addition, [http://archives.postgresql.org/message-id/1221125165.5637.12.camel@abbas-laptop this posting] discusses naming conventions for variables and functions.

If you are wondering why we care so much about source code style, [http://ezine.daemonnews.org/200112/single_coding_style.html this article] describes the value of a consistent coding style.

=== Is there a diagram of the system catalogs? ===

Yes. See the following:
* [http://dalibo.org/_media/articles/catalog.png PNG format]
* [http://svn.postgresql.fr/repos/materials/advocacy/trunk/posters/catalogs83.svg SVG format]

=== <span id="開発者向きの良書はありますか?"></span>What books are good for developers? ===

Here are five:
* An Introduction to Database Systems, by C.J. Date, Addison-Wesley
* A Guide to the SQL Standard, by C.J. Date, et al., Addison-Wesley
* Fundamentals of Database Systems, by Elmasri and Navathe
* Transaction Processing, by Jim Gray and Andreas Reuter, Morgan Kaufmann
* Transactional Information Systems, by Gerhard Weikum and Gottfried Vossen, Morgan Kaufmann

=== What is configure all about? ===

The configure and configure.in files are part of the GNU autoconf package. configure checks various capabilities of the operating system and sets variables in C programs and Makefiles accordingly. autoconf is installed on the PostgreSQL main server. To add an option to configure, edit configure.in and then run autoconf to regenerate the configure file.

When a user runs configure, it tests various operating system capabilities and records the results in config.status and config.cache. It then processes a number of *.in files. For example, given Makefile.in, configure fills in all the @var@ parameters it contains and generates the Makefile.

If you need to edit a file, changing the files generated by configure is a waste of time. Edit the *.in files instead and run configure again to regenerate them. Running make distclean in the top-level directory removes every file that configure generates, leaving only the files distributed with the source code.
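To make the @var@ substitution step concrete, here is a toy C function that performs the same kind of replacement on a string. It is a sketch only: configure itself is a generated shell script, and the function name substitute_var and its interface are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Toy illustration only.  Copy "in" to "out", replacing each occurrence of
 * @name@ with "value".  Returns the number of substitutions made, or -1 if
 * "out" is too small to hold the result.
 */
static int
substitute_var(const char *in, const char *name, const char *value,
               char *out, size_t outsize)
{
    char        token[64];
    size_t      toklen;
    size_t      vallen = strlen(value);
    size_t      used = 0;
    int         count = 0;

    assert(outsize > 0);
    assert(strlen(name) + 3 <= sizeof(token));
    snprintf(token, sizeof(token), "@%s@", name);
    toklen = strlen(token);

    while (*in != '\0')
    {
        if (strncmp(in, token, toklen) == 0)
        {
            /* keep one byte free for the terminating null */
            if (used + vallen >= outsize)
                return -1;
            memcpy(out + used, value, vallen);
            used += vallen;
            in += toklen;
            count++;
        }
        else
        {
            if (used + 1 >= outsize)
                return -1;
            out[used++] = *in++;
        }
    }
    out[used] = '\0';
    return count;
}
```

For example, applying it to the text "CC = @CC@" with name "CC" and value "gcc" yields "CC = gcc", which is the effect configure has on a Makefile.in line.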

=== How do I add a new port? ===

Adding a new port requires changes in many places. Start with src/template and add an appropriate entry for your OS. Also use src/config.guess to add your OS to src/template/.similar. Do not match the OS version exactly: the configure test looks for an exact OS version first and, if none is found, looks for a match without the version number. Edit src/configure.in to add your new OS (see the configure question above). Then either run autoconf or patch src/configure as well.

Next, check src/include/port and add a suitable file for your new OS. With luck, src/include/storage/s_lock.h already contains locking code for your CPU. The src/makefiles directory also holds platform-specific Makefiles. If your OS needs special files, add them to the backend/port directory.

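=== Why don't you use threads, raw devices, async I/O, or other "cool" features? ===

The newest features that an operating system has just started to support are very attractive, but we resist that temptation.

First, we support more than 15 operating systems, so a new feature has to be widely available before we can adopt it. Second, many cool features do not actually deliver a dramatic improvement. Third, some new features have downsides, such as reduced reliability or a need for additional code. We therefore do not jump on new features right away. We wait until they have matured, and then ask for testing that shows a measurable improvement is possible.

As an example, here are the reasons threads are not currently used in the backend code:

* Historically, threads were unsupported on some platforms and buggy on others.
* If backends were threads within a single process, an error in one backend could corrupt the others.
* Compared with the rest of backend startup time, the speed advantage of threads is small.
* The backend code would become more complex.
* Terminating a backend process lets the OS release all resources completely and quickly, which prevents memory and file descriptor leaks.
* Debugging threaded programs is much harder than debugging processes, and core dumps are much less useful with threads.
* Sharing read-only executable mappings and using shared buffers makes processes, like threads, very memory efficient.
* Regularly creating and destroying processes helps prevent memory fragmentation, which can be hard to manage in long-running processes.

(Whether individual backend processes should use multiple threads to make use of multiple cores for a single query is a separate question that is not covered here.)

In short, we are not ignoring new features. We are just cautious about adopting them. The TODO list sometimes links to discussions that show our reasoning in these areas.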

=== How are branches managed? ===

For information on branch management and backporting, see [[Working_with_Git#Using_Back_Branches|Using Back Branches]] and [[Committing with Git]].

=== Where can I get a copy of the SQL standards? ===
You can buy a copy from [http://www.iso.ch/ ISO] or [http://www.ansi.org ANSI]. Search for ISO/ANSI 9075. The ANSI copy is cheaper, and the content is the same.

Because official copies of the SQL standard are expensive, many developers use one of the draft versions available on the Internet. Some of these are:
* SQL-92 http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
* SQL:1999 http://www.cse.iitb.ac.in/dbms/Data/Papers-Other/SQL1999/ansi-iso-9075-2-1999.pdf
* SQL:2003 http://www.wiscorp.com/sql_2003_standard.zip
* SQL:2008 (preliminary) http://www.wiscorp.com/sql200n.zip

The PostgreSQL documentation contains information about PostgreSQL and a section on [http://www.postgresql.jp/document/current/html/features.html SQL conformance].

Some further web pages about the SQL standard are:
* http://troels.arvin.dk/db/rdbms/links/#standards
* http://www.wiscorp.com/SQLStandards.html
* http://www.contrib.andrew.cmu.edu/~shadow/sql.html#syntax (SQL-92)
* http://dbs.uni-leipzig.de/en/lokal/standards.pdf (paper)

Note that reading a copy of the SQL standard is not necessary to become a PostgreSQL developer. Understanding the text of the standard is hard and takes years of experience, and in any case many PostgreSQL features are not specified by the SQL standard.

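=== Where can I get technical assistance? ===

Many technical questions are answered on the pgsql-hackers mailing list. Its archives are at http://archives.postgresql.org/pgsql-hackers/.

If you cannot find an earlier discussion or answer, feel free to post your question to the list.

Major contributors also answer technical questions, including questions about developing new features, on IRC (the #postgresql channel on irc.freenode.net).

=== Why haven't you replaced CVS with SVN, Git, Monotone, VSS, or some other system? ===

In September 2010, the PostgreSQL project replaced CVS with Git.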

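== Development Process ==

=== What do I do after choosing an item to work on? ===

Send an email to pgsql-hackers with a proposal for what you want to do (be prepared for it not to be accepted immediately). Working in isolation is not advisable: someone else may already be working on the same TODO item, or you may have misunderstood it. In the email, discuss both the internal implementation you plan to use and any user-visible changes (new syntax, for example). For complex patches, it is important to get community feedback before you actually start development. Without it, your patch is likely to be rejected. If your work is sponsored by a company, read [http://momjian.us/main/writings/pgsql/company_contributions/ this article] for tips on being more effective.

The queue of patches awaiting review is maintained on the wiki at [[CommitFest]].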

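=== How do I test my changes? ===

==== Basic system testing ====

The easiest way to test your code is to build it against the latest version of the source and confirm that it compiles without compiler warnings.

It is also a good idea to build with the --enable-cassert option to configure. This enables the assertions in the source code, which catch more cases of data corruption and invalid memory access, and in many cases makes debugging easier.

Then exercise your changes at run time using psql.

==== Regression test suite ====

The next step is to run the existing regression tests against your changes. To do this, type "make check" in the root directory of the source tree. If any test fails, investigate.

If you have deliberately changed existing behavior, the regression tests will fail even though the new behavior is not actually wrong. In that case, your patch should also update the regression tests.

==== Other run-time testing ====

The following tools are also commonly used during development:
* valgrind (http://valgrind.kde.org): memory testing
* gprof (part of GNU binutils) and oprofile (http://oprofile.sourceforge.net/): profiling

==== What about unit testing, static analysis, model checking...? ====

Test frameworks have already been discussed at length, and some developers use these ideas.

Keep in mind that the Makefiles do not track dependencies on include files, so after changing a header you need to run make clean followed by make. If you are using GCC, passing --enable-depend to configure makes the compiler compute the dependencies automatically.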

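=== I have developed a patch, what next? ===

Post your patch to pgsql-hackers@postgresql.org. To help your patch get reviewed and committed promptly, try to follow the guidelines in "[[Submitting a Patch]]".

=== What happens after my patch is submitted? ===

It will be reviewed by other developers. It may be accepted, or it may be sent back for further work. The process is described in more detail in "[[Submitting a Patch#Patch review and commit|Submitting a Patch]]".

=== How do I help with reviewing patches? ===

You are very welcome to help review the patches registered in the [[CommitFestInProgress|CommitFest]]. For details, see "[[Reviewing a Patch]]".

=== Do I need to sign a copyright assignment? ===

No. Contributors keep their copyright (as is the case in European countries anyway). They are simply considered to be part of the PostgreSQL Global Development Group. (Copyright cannot be assigned to PGDG, because it is not a legal entity.) This is the same approach taken by the Linux kernel and many other open source projects.

=== May I add my own copyright notice where appropriate? ===

No, please don't. We like to keep the legal information short and clear. We have also heard that this can cause problems for corporate users.

=== Doesn't the PostgreSQL license itself require the copyright notice to be kept in full? ===

Yes, it does. The existing notice in the name of the PostgreSQL Global Development Group already covers all copyright holders. Note, incidentally, that under US law no copyright notice is required for copyright to exist, and the same is true in European countries.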

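== Technical Questions ==
=== How do I efficiently access information in system catalogs from the backend code? ===

You first need to find the tuples (rows) you are interested in. There are two ways. The first is to call SearchSysCache() and related functions, which look up the system catalogs through predefined catalog indexes. This is the preferred way to access system catalogs, because the first call loads the needed rows into the cache and later calls return results without touching the underlying tables. The available caches are listed in src/backend/utils/cache/syscache.c. src/backend/utils/cache/lsyscache.c defines many column-specific cache lookup functions.

The rows returned are owned by the cache, so you must not modify or delete a row returned by SearchSysCache(). When you are done with it, release it with ReleaseSysCache(). Released cache entries may be discarded as needed. If you neglect to call ReleaseSysCache(), the cache entry remains locked until the end of the transaction. That is tolerable during development but not acceptable in release-worthy code.

If no system cache is available, you have to fetch the data directly from the table through the buffer cache that is shared by all backends. The backend loads rows into the buffer cache automatically. To do this, open the table with heap_open(), start a scan with heap_beginscan(), and call heap_getnext() repeatedly for as long as HeapTupleIsValid() returns true. Finish with heap_endscan(). Keys can be assigned to the scan, but no indexes are used: every row is compared against the keys and only the matching rows are returned.

If you know the block number and offset, you can also fetch a row with heap_fetch(). heap_fetch() locks and unlocks the row in the buffer cache automatically, but you must pass the Buffer pointer to ReleaseBuffer() when you are done.

Once you have a row, you can read the data common to all tuples, such as t_self and t_oid, simply by accessing the HeapTuple structure entries. To read table-specific columns, pass the HeapTuple pointer to the GETSTRUCT() macro and cast the result to the appropriate structure pointer, for example Form_pg_proc for pg_proc or Form_pg_type for pg_type. You can then access the fields through the structure pointer:

((Form_pg_class) GETSTRUCT(tuple))->relnatts

Note that this only works for columns that are fixed-width and never null, and only when all earlier columns are likewise fixed-width and never null. Otherwise the location of the column is variable, and you must use heap_getattr() or a related function to extract it from the row.

Also, avoid storing directly into the structure fields of a live row. The best approach is to pass the old row and the changes to heap_modifytuple(), which returns a new palloc'ed row that can be passed to heap_update(). To delete a row, pass its t_self to heap_delete(). t_self is used for heap_update() as well. Keep in mind where a row came from: it may be a system cache copy, which can go away after you call ReleaseSysCache(); it may have been read directly from a disk buffer, which goes away on heap_getnext(), heap_endscan(), or, in the heap_fetch() case, ReleaseBuffer(); or it may be a palloc'ed row, which you must pfree() when finished.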

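=== Why are table, column, type, function, and view names sometimes referenced as Name or NameData, and sometimes as char *? ===

Table, column, type, function, and view names are stored in system tables in columns of type Name. Name is a fixed-length, null-terminated string of NAMEDATALEN bytes (64 bytes by default).

typedef struct nameData
{
char data[NAMEDATALEN];
} NameData;
typedef NameData *Name;

Table, column, type, function, and view names that come into the backend via user queries are passed as variable-length, null-terminated character strings.

Many functions, such as heap_open(), are called with both kinds of name. Because a Name is null-terminated, it is safe to pass it to a function expecting a char *. There are many cases where an on-disk name (Name) is compared with a user-supplied name (char *), and many where Name and char * are used interchangeably.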
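The relationship between the two representations can be tried out in a standalone program. The struct below repeats the definition shown above with the default NAMEDATALEN of 64. The helpers name_from_cstring and name_matches are hypothetical names invented for this sketch and are not backend functions.

```c
#include <assert.h>
#include <string.h>

#define NAMEDATALEN 64          /* default value, as stated above */

typedef struct nameData
{
    char        data[NAMEDATALEN];
} NameData;
typedef NameData *Name;

/*
 * Hypothetical helper for this sketch: copy a variable-length C string into
 * a fixed-length Name, truncating if needed and zero-padding the rest so
 * the result is always null-terminated.
 */
static void
name_from_cstring(Name name, const char *str)
{
    assert(name != NULL && str != NULL);
    memset(name->data, 0, NAMEDATALEN);
    strncpy(name->data, str, NAMEDATALEN - 1);
}

/* Compare a fixed-length Name with a user-supplied char * string. */
static int
name_matches(const NameData *name, const char *str)
{
    /* name->data is null-terminated, so it can be used as a char * */
    return strcmp(name->data, str) == 0;
}
```

Because the data array always ends in a null byte, name->data can be handed to any routine that expects an ordinary C string, which is the interchangeability described above.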

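=== Why do we use Node and List to make data structures? ===
We do this because it provides a consistent and flexible way to pass data around inside the backend. Every Node has a NodeTag field that identifies its type. A List is a singly linked list that holds multiple Nodes. Whether the order of the elements in a List is significant depends on how the list is used.

Here are some of the List manipulation commands:

;lfirst(i)
;lfirst_int(i)
;lfirst_oid(i)
:return the data in list cell i, as a pointer, an integer, or an OID respectively.

;lnext(i)
:return the cell that follows i.

;foreach(i, list)
:loop through list, assigning each cell to i.

It is important to note that i is a ListCell *, not the data stored in the cell. You need to use one of the lfirst variants to get at the data in the cell.

Here is a typical piece of code that loops through a List containing Var * values and processes each one:

List *list;
ListCell *i;
...
foreach(i, list)
{
Var *var = (Var *) lfirst(i);
...
/* process var here */
}

;lcons(node, list)
:add node to the front of list, or create a new list containing node if list is NIL.

;lappend(list, node)
:add node to the end of list.

;list_concat(list1, list2)
:append list2 to the end of list1.

;list_length(list)
:return the length of list.

;list_nth(list, i)
:return the i'th element of list, counting from zero.

;lcons_int, ...
:there are integer versions of these (lcons_int, lappend_int) as well as OID versions (lcons_oid, lappend_oid).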
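The cell-versus-data distinction can be tried outside the backend with a toy list. Everything prefixed toy_ or Toy below is a stand-in invented for this sketch. PostgreSQL's own List, ListCell, lfirst and foreach are defined in the backend headers and differ in detail.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for this sketch; PostgreSQL's own definitions differ. */
typedef struct ToyCell
{
    void           *ptr_value;  /* the payload stored in this cell */
    struct ToyCell *next;
} ToyCell;

typedef struct ToyList
{
    int         length;
    ToyCell    *head;
    ToyCell    *tail;
} ToyList;

#define toy_lfirst(cell)    ((cell)->ptr_value)
#define toy_foreach(cell, list) \
    for ((cell) = (list) ? (list)->head : NULL; (cell) != NULL; (cell) = (cell)->next)

/* Append datum to the end of list; a NULL list means "create a new one". */
static ToyList *
toy_lappend(ToyList *list, void *datum)
{
    ToyCell    *cell = malloc(sizeof(ToyCell));

    assert(cell != NULL);
    cell->ptr_value = datum;
    cell->next = NULL;

    if (list == NULL)
    {
        list = malloc(sizeof(ToyList));
        assert(list != NULL);
        list->length = 0;
        list->head = list->tail = NULL;
    }
    if (list->tail != NULL)
        list->tail->next = cell;
    else
        list->head = cell;
    list->tail = cell;
    list->length++;
    return list;
}

/* Sum the ints a list points at, fetching each payload with toy_lfirst(). */
static int
toy_sum(ToyList *list)
{
    ToyCell    *i;
    int         total = 0;

    toy_foreach(i, list)
    {
        int        *value = (int *) toy_lfirst(i);  /* i is the cell, not the data */

        total += *value;
    }
    return total;
}
```

As in the backend idiom, the loop variable is a cell pointer and the payload is only reached through the lfirst-style accessor. The toy never frees its cells, which a real program would have to do.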

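You can print nodes easily inside gdb. First, disable gdb's output truncation:

(gdb) set print elements 0

Instead of printing values in gdb format, you can use the next two commands to print List, Node, and structure contents in more detail. Lists are unrolled into their Nodes, and Nodes are printed in detail. The first command prints in a short format and the second in a long format:

(gdb) call print(any_pointer)
(gdb) call pprint(any_pointer)

The output goes to the server log, or to your screen if you started a backend directly without a postmaster.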

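=== I just added a field to a structure. What else should I do? ===

Structures that are passed between the parser, rewriter, optimizer, and executor need some extra work. Most such structures are supported by routines in src/backend/nodes that create, copy, read, and write them. In particular, most node types need support in copyfuncs.c and equalfuncs.c, and some also need support in outfuncs.c and readfuncs.c. Make sure you add support for your new field in these files. Also look for any other places in the code that need to handle your new field. A good guide is how the existing fields of the structure are treated. mkid is helpful here (see [[#What_tools_are_available_for_developers.3F|available tools]]).

=== Why do we use palloc() and pfree() to allocate memory? ===

palloc() and pfree() are used in place of malloc() and free() because they make it easy to release all the memory allocated during a query when the query completes. This ensures that all memory is freed even if you have lost track of where it was allocated. There are also memory areas that are not per-query, and the backend frees the memory allocated in them at defined times.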
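The benefit of freeing everything at once can be seen with a toy allocator. This is not how palloc() is implemented. ToyContext, toy_alloc and toy_reset are hypothetical names, and the sketch only tracks chunks in a linked list so that they can all be released in one call.

```c
#include <assert.h>
#include <stdlib.h>

/* Header placed in front of every chunk so the context can find it later. */
typedef struct ToyChunk
{
    struct ToyChunk *next;
} ToyChunk;

/* Toy memory context: remembers every chunk allocated in it. */
typedef struct ToyContext
{
    ToyChunk   *chunks;
    int         nchunks;
} ToyContext;

/*
 * Allocate size bytes that belong to cxt (toy counterpart of palloc).
 * Note: the returned memory is only pointer-aligned in this toy.
 */
static void *
toy_alloc(ToyContext *cxt, size_t size)
{
    ToyChunk   *chunk = malloc(sizeof(ToyChunk) + size);

    assert(chunk != NULL);
    chunk->next = cxt->chunks;
    cxt->chunks = chunk;
    cxt->nchunks++;
    return chunk + 1;           /* usable memory starts after the header */
}

/* Release everything allocated in cxt; returns the number of chunks freed. */
static int
toy_reset(ToyContext *cxt)
{
    int         freed = 0;

    while (cxt->chunks != NULL)
    {
        ToyChunk   *next = cxt->chunks->next;

        free(cxt->chunks);
        cxt->chunks = next;
        freed++;
    }
    cxt->nchunks = 0;
    return freed;
}
```

The caller never has to remember individual pointers: a single reset at the end of the "query" releases every chunk, which is the property the paragraph above describes.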

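=== What is ereport()? ===

ereport() sends messages to the front end and can optionally terminate the current query. See "[http://www.postgresql.jp/document/current/html/error-message-reporting.html Reporting Errors Within the Server]" for details on how to use it.

=== What is CommandCounterIncrement()? ===

Normally, a statement cannot see the rows it modifies itself. This is what allows "UPDATE foo SET x = x + 1" to work correctly.

However, there are cases where a transaction needs to see the results of changes it made earlier in the same transaction. This is accomplished with a Command Counter. Incrementing the counter splits the transaction into pieces, and each piece can see the results of the pieces executed before it. CommandCounterIncrement() increments the Command Counter, starting a new piece of the transaction.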

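=== I need to make some changes to query parsing. Can you briefly explain the parser files? ===

The parser files live in the src/backend/parser directory.

scan.l defines the lexer, which splits a string containing an SQL statement into a stream of tokens. A token is usually a single word (containing no whitespace and delimited by whitespace), but it can also be an entire string enclosed in single or double quotes. The lexer is basically defined in terms of regular expressions that describe the different token types.

gram.y defines the grammar (the syntactical structure) of SQL statements, using the tokens generated by the lexer as basic building blocks. The grammar is defined in BNF notation. BNF resembles regular expressions but works on tokens rather than characters. In addition, patterns (called rules or productions in BNF) are named and may be recursive, that is, they can use themselves as sub-patterns.

The actual lexer is generated from scan.l by a tool called flex. The flex manual is at http://flex.sourceforge.net/manual/.

The actual parser is generated from gram.y by a tool called bison. The bison manual is at http://www.gnu.org/s/bison/.

Be warned that if you have never used flex or bison before, the learning curve is rather steep.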
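To give a feel for what "splitting a statement into tokens" means, here is a hand-written toy tokenizer. It is far simpler than scan.l, which is generated by flex from regular expressions. The function name toy_next_token and its behavior (whitespace-delimited words, quoted sections kept whole) are assumptions made for this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Toy tokenizer: copy the next token of *input into tok and advance *input
 * past it.  Returns 1 if a token was produced, 0 at end of input.  A section
 * starting with a single or double quote is kept as one token, including
 * the quotes; an unterminated quote runs to the end of the string.  Tokens
 * longer than toksize - 1 bytes are truncated.
 */
static int
toy_next_token(const char **input, char *tok, size_t toksize)
{
    const char *p = *input;
    size_t      len = 0;

    assert(toksize > 0);

    /* skip leading whitespace */
    while (isspace((unsigned char) *p))
        p++;
    if (*p == '\0')
    {
        *input = p;
        return 0;
    }

    if (*p == '\'' || *p == '"')
    {
        char        quote = *p;

        /* copy the opening quote and everything up to the closing quote */
        do
        {
            if (len < toksize - 1)
                tok[len++] = *p;
            p++;
        } while (*p != '\0' && *p != quote);

        /* copy the closing quote, if there is one */
        if (*p == quote)
        {
            if (len < toksize - 1)
                tok[len++] = *p;
            p++;
        }
    }
    else
    {
        /* an ordinary word runs until the next whitespace */
        while (*p != '\0' && !isspace((unsigned char) *p))
        {
            if (len < toksize - 1)
                tok[len++] = *p;
            p++;
        }
    }
    tok[len] = '\0';
    *input = p;
    return 1;
}
```

Calling it repeatedly on a statement returns one token per call until it reports the end of input. A real lexer additionally classifies each token (keyword, identifier, constant, operator), which this sketch does not attempt.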

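=== What debugging features are available? ===

First, if you are developing C code you should '''always''' check that it works in a build configured with the --enable-cassert and --enable-debug options. Enabling assertions turns on many sanity checks. Debug symbols let you use a debugger (such as gdb) to trace through misbehaving code.

The PostgreSQL server has a -d option that causes detailed messages to be logged (elog or ereport DEBUGn output). The -d option takes a number that specifies the debug level. Be warned that high debug levels generate large log files.

If the postmaster is running, open a window (console) and start psql. Then find the PID of the postgres process that psql is connected to with SELECT pg_backend_pid(), and attach a debugger to that PID. Set breakpoints in the debugger and then issue queries from the psql session. If you are looking for the place that generates an error or log message, set a breakpoint at errfinish. If you want to debug session startup, set the environment variable PGOPTIONS="-W n" before starting psql. This delays startup by n seconds so that you can attach the debugger to the postgres process, set appropriate breakpoints, and then continue through the startup sequence.

If the postmaster is not running, you can start a postgres backend from the command line and type SQL statements directly. This is rarely a good approach, however: the environment is nowhere near as friendly as psql (there is no command history, for instance), and you cannot test concurrent behavior. It may be useful if you have broken initdb, but otherwise it is not recommended.

You can also compile with profiling enabled to see which functions are taking execution time. Specify --enable-profiling to configure (and do not use --enable-cassert when measuring performance, because the cost of assertion checks is not negligible).
Profile files from server processes are written to the pgsql/data directory. Profile files from clients such as psql are placed in the client's current directory.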

[[Category:FAQ]]
[[Category:Japanese]]

20120924updaterelease/ja

2012-11-02T01:05:32Z

Hanada: Follow changes in the original (The 9.2 upgrade docs say to also set vacuum_freeze_table_age to 0)

{{Languages}}

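= 2012-09-24 Update Release: Details on the Data Corruption Issue =

== Description of the Issue ==

PostgreSQL versions 9.1 and 9.2 contain a bug in the flushing of dirty blocks from memory (or "[http://www.postgresql.jp/document/current/html/wal-configuration.html checkpointing]") that was accidentally introduced as a side effect of performance improvements and new features (primarily [[What%27s_new_in_PostgreSQL_9.1#Unlogged_Tables|unlogged tables]]). This bug can cause certain data not to be written to disk when the database is shut down or restarted for any of the following reasons:

* a PostgreSQL crash
* a server crash or power loss
* an "immediate" shutdown (pg_ctl -m immediate)
* "kill -9" or an out-of-memory kill of the postmaster service
* promotion of a standby database to master

In these circumstances, the database may suffer recoverable data corruption. A characteristic of this corruption is that queries may return results that look correct but are in fact wrong. It is therefore important that users who may have been affected carry out the recovery procedure promptly.

First, there is a low probability of corruption of BTREE and GIN indexes. A clean shutdown prevents this problem from spreading. If corruption has occurred, it will most likely show up as error messages when the index is used.

Second, there is a significant probability (close to 100% on standbys) of corruption of the relation's visibility map.

The PostgreSQL Global Development Group apologizes for the inconvenience caused by this issue.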

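== Steps for PostgreSQL 9.1 Users ==

If you are running 9.1 and your database has had an unexpected shutdown or failover in the past few months, so that you suspect it may be affected by the corruption, carry out the following steps:

# Download the new 9.1.6 packages
# Shut down PostgreSQL cleanly by one of the following means:
#* the init script or service manager
#* pg_ctl -m smart stop
#* pg_ctl -m fast stop
# Install 9.1.6
# As the 9.2 upgrade documentation advises, at this point, before restarting the database server, set [http://www.postgresql.jp/document/current/html/runtime-config-client.html#GUC-VACUUM-FREEZE-TABLE-AGE vacuum_freeze_table_age] to 0 in postgresql.conf, and remove that entry again once this procedure is complete. This is also the point at which vacuum_cost_delay can be set globally.
# Restart the database system
# Rebuild the BTree and GIN indexes in turn (see below)
# Schedule a manual vacuum of the whole database for a convenient low-load period (see below)

If you are planning to upgrade to PostgreSQL 9.2, it is important to run a VACUUM of the whole database first.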

== Steps for PostgreSQL 9.2 Users ==

If you are running 9.2.0 and your database has had an unexpected shutdown or failover in the past few months, so that you suspect it may be affected by the corruption, carry out the following steps:

# Download the new 9.2.1 packages
# Shut down PostgreSQL cleanly by one of the following means:
#* the init script or service manager
#* pg_ctl -m smart stop
#* pg_ctl -m fast stop
# Install 9.2.1
# As the 9.2 upgrade documentation advises, at this point, before restarting the database server, set [http://www.postgresql.jp/document/current/html/runtime-config-client.html#GUC-VACUUM-FREEZE-TABLE-AGE vacuum_freeze_table_age] to 0 in postgresql.conf, and remove that entry again once this procedure is complete. This is also the point at which vacuum_cost_delay can be set globally.
# Restart the database system
# Immediately VACUUM all tables in your databases
# Rebuild the BTree and GIN indexes in turn (see below)

== How to VACUUM All Tables ==

To repair visibility map corruption, you must run a vacuum that forces a scan of every database block so that the whole map is reset. Because this amounts to a scan of the entire database, it generates a substantial amount of I/O and can take quite a long time on large databases. One way to reduce the impact on concurrent database activity is to use a cost delay to throttle the vacuum:

SET [http://www.postgresql.jp/document/current/html/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST vacuum_cost_delay] = 50;

=== Interactive VACUUM ===

For each database, do the following:

# Log in to psql as the Postgres superuser
# Set vacuum_cost_delay, if you are going to use it
# Run "[http://www.postgresql.jp/document/current/html/sql-vacuum.html VACUUM ( FREEZE, VERBOSE, ANALYZE );]" (ANALYZE is optional)

This command produces a large amount of output so that you can follow the progress of the database-wide vacuum.

To keep track of what has and has not been vacuumed, you can also VACUUM tables one at a time instead of doing them all in one run.

=== vacuumdb ===

If you have several databases to vacuum, you may find it more convenient to use [http://www.postgresql.jp/document/current/html/app-vacuumdb.html vacuumdb] instead. In that case:

# Set vacuum_cost_delay in postgresql.conf, if you are going to use it (and reload the database)
# As the postgres superuser, run "vacuumdb -F -v -z -a"

Note that you may need to supply additional parameters to vacuumdb in order to connect to the database server. The -z (analyze) and -v (verbose) options are optional.

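== Rebuilding BTree/GIN Indexes ==

Indexes corrupted by the issue fixed in this update release are likely to produce error messages when accessed, so they should be easy to identify. However, some indexes may (although this is unlikely) be corrupted in a way that returns wrong answers without any error.

The VACUUM FREEZE recommended above repairs some kinds of index corruption. However, users with strong concerns about data integrity, and users who feel particularly at risk because their servers have had several crashes or failovers in the past, should rebuild their indexes as an additional step to eliminate every possible corruption.

=== Rebuilding individual indexes ===

Whether as a precaution or because you have found a corrupted index, you can rebuild indexes one at a time. The simplest way is to use [http://www.postgresql.jp/document/current/html/sql-reindex.html REINDEX].

REINDEX TABLE <tablename>;

Or, for a single index:

REINDEX INDEX <indexname>;

You can give REINDEX more RAM by raising maintenance_work_mem to as much as 1/8 of available RAM (up to a maximum of 2GB). REINDEX takes a write lock on the whole table and, depending on the size of the table, can take a considerable time to run. To rebuild an index under concurrent database load, you can use CREATE INDEX CONCURRENTLY:

CREATE INDEX CONCURRENTLY <indexname>_tmp <index_definition>;
BEGIN;
DROP INDEX <indexname>;
ALTER INDEX <indexname>_tmp RENAME TO <indexname>;
END;

This locks the table only during the final drop and rename steps, but it is more complicated.

Both approaches generate a substantial amount of I/O while running against large tables.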
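The sizing rule of thumb for maintenance_work_mem given above (1/8 of RAM, capped at 2GB) is simple arithmetic, sketched here as a small C function. The function name and the choice of megabytes as the unit are assumptions made for this illustration. PostgreSQL does not provide such a helper.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_MAINTENANCE_MB 2048     /* the 2GB ceiling mentioned above */

/*
 * Hypothetical helper: suggest a maintenance_work_mem value in megabytes,
 * following the rule of thumb of 1/8 of RAM, capped at 2GB.
 */
static uint64_t
suggested_maintenance_work_mem_mb(uint64_t ram_mb)
{
    uint64_t    suggestion;

    assert(ram_mb > 0);         /* caller must supply a real RAM size */

    suggestion = ram_mb / 8;
    if (suggestion > TOY_MAX_MAINTENANCE_MB)
        suggestion = TOY_MAX_MAINTENANCE_MB;
    return suggestion;
}
```

For a server with 8GB of RAM this gives 1024MB, and for anything with 16GB or more it gives the 2GB cap. Treat the result as a starting point for the session that runs REINDEX, not as a general server setting.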

=== Getting a list of BTree and GIN indexes ===

Whichever approach you take to rebuilding indexes, you can obtain a list of the BTree and GIN indexes in your database. BTree is the most common index type, so this will cover most of the indexes in your database. Because GiST indexes can be very large, you can leave them out of the rebuild.

Use this query:

SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE ( indexdef ILIKE '%USING btree%'
OR indexdef ILIKE '%USING GIN%' )
AND schemaname <> 'pg_catalog'
ORDER BY tablename, indexname;

=== Rebuilding all indexes ===

If you can afford the required downtime and want to be absolutely sure of preventing all corruption, you can rebuild every index in the database with the [http://www.postgresql.jp/document/current/html/app-reindexdb.html reindexdb utility]. Note that this command also rebuilds GiST indexes, even though they are not at risk of corruption.

To reindex one database, run the following as the postgres superuser:

reindexdb <databasename>

Or, to reindex all databases:

reindexdb -a

You may need additional reindexdb options in order to connect to the database server. Because reindexdb takes a lock on every table, one at a time, it is best run during downtime.

Running & Installing PostgreSQL On Native Windows/ja

2012-09-28T07:58:45Z

Hanada: Follow changes in the original + typo fixes

{{Languages}}
== Supported Platforms ==

=== What versions of Windows does PostgreSQL run on? ===

As of version 9.0 at least, PostgreSQL is supported on Windows XP and above. It runs on both 32-bit and 64-bit systems.

A given major version of the server is not tested on operating system versions that were released after a newer major version of the server came out. For example, Windows 7 was released after PostgreSQL 8.4, so PostgreSQL 8.3 is not supported on Windows 7. Likewise, when RHEL 6 is released in the near future, only PostgreSQL 9.0.x will be supported on it.
We aim to support each new version of Windows in at least the PostgreSQL major version that follows its release.

For the platforms supported by the one-click installer, see the installer's [http://www.postgresql.org/download/windows download page for Windows] rather than the main download page.

For platforms other than Windows, see the [[FAQ|main FAQ]] and the [http://www.postgresql.org/download/ main download page].

=== Which Windows platforms are *not* supported? ===

The PostgreSQL installer is not tested or supported on the following:

* Windows XP Embedded
* Windows 2000
* Windows NT 4
* Windows NT 3.5.x
* Windows 95/98/ME/3.x
* Windows CE
* Windows Mobile

These platforms are unsupported. Please do not ask for help with them on the mailing lists.

Some information is available about embedded Windows. For troubleshooting, see [[Troubleshooting Installation#Installation fails on windows embedded|installation on embedded versions of Windows]].

=== I heard that NT4 is supported. Is that true? ===
It is not officially supported and there are a few issues, listed below, but PostgreSQL may run on Windows NT4.
* The installer does not work properly, so you have to install by hand from a binary .zip release or by compiling the code.
* PostgreSQL uses the "reparse point" feature of the NTFS file system to implement tablespaces. Reparse points are not available on NT4, so tablespaces cannot be used.
* Windows NT4 has no 'runas.exe' command by default, which makes it difficult to start PostgreSQL from an administrator account.
* No testing is carried out on Windows NT 4 or Windows 2000, so very little is known about how well PostgreSQL works there. Newer versions may not run on these platforms.

Please do not ask about these old platforms on the mailing lists.
Some of the companies offering [http://www.postgresql.org/support/professional_support professional support] may be able to help, however.

=== What about Windows 95, 98, and ME? ===

PostgreSQL requires functionality that is not available on these platforms, so it will not run on them.
If you need to run PostgreSQL on these platforms, look into the [http://www.postgresql.org/files/documentation/faqs/text/FAQ_CYGWIN Cygwin] port, which supports the 9x platforms.

=== Is there a 64-bit build of PostgreSQL for Windows? ===

The [[64bit Windows port]] was released in [[PostgreSQL 9.0]].

In general, the 32-bit builds of earlier PostgreSQL versions work well on 64-bit Windows. They cannot make effective use of more than about 1GB of shared_buffers, but because the Windows kernel uses memory to cache disk reads, there is still a benefit to having 4GB or more of memory.

=== What about 64-bit ODBC drivers? ===

At the time of writing, the source code of [http://psqlodbc.projects.postgresql.org/ psqlODBC] includes 64-bit support, but there is no official binary release of a 64-bit ODBC driver. See the psqlODBC web site for details.

== Installation ==

=== What do I need to install PostgreSQL on Windows? ===

For the various download and installation options on Windows, see [http://www.postgresql.org/download/windows the PostgreSQL for Windows download page].

The easiest way to install PostgreSQL on Windows is to use the One Click installer package maintained by EnterpriseDB, which is available from the page linked above.
It installs a precompiled version of PostgreSQL together with pgAdmin (a graphical interface for administration and maintenance), and lets you select 'contrib' modules, which add specialized functionality, and procedural languages to install.
It also installs a program called StackBuilder, which helps you download and install additional components you may need, such as ODBC and JDBC drivers.

=== What do I need to compile PostgreSQL from source code? ===

See the [http://www.postgresql.org/docs/current/static/install-windows.html documentation] for how to compile PostgreSQL on Windows and which compilers and tools are supported.

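=== Why do I need a non-administrator account to run PostgreSQL under? ===

When an attacker gains entry to a computer through a bug in a software package, they obtain the permissions of the user account under which the service runs.
No such bug is currently known in PostgreSQL, but we enforce the use of a non-administrative service account so that the damage is minimized if an attacker does one day find a bug in PostgreSQL and use it to compromise the system.

This has long been common practice in the Unix world.
It is becoming standard practice in the Windows world as well, as Microsoft and other vendors work to improve the security of their systems.

Note that from release 8.2 onward, PostgreSQL can be run from an administrator account.
PostgreSQL 8.2 and later can irrevocably give up administrative rights at startup, which keeps the rest of the system safe even in the extremely unlikely event that PostgreSQL is compromised.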

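=== Can I install PostgreSQL on a FAT partition? ===

FAT32 is a terrible file system to run any database on, so PostgreSQL on FAT32 is neither supported nor tested.

PostgreSQL's number one priority is the integrity of your data. The FAT and FAT32 file systems are simple and do not provide the reliability or crash safety this requires. In addition, FAT offers no security features, so nothing prevents the data files themselves from being modified without authorization. Finally, PostgreSQL implements tablespaces using a feature called 'reparse points',
which is not available on FAT partitions.

The NTFS file system is a journalling file system with much better reliability and crash recovery. It also has a comprehensive access control system and provides the reparse points that PostgreSQL uses.

For these reasons, the PostgreSQL installer package will not initialise a database cluster on anything but an NTFS partition.
The server and utilities can be installed on any type of partition.

We recognise, however, that in rare cases a FAT partition is the only choice. In that case you can install PostgreSQL as usual without initialising the database cluster and, once installation is complete, run the 'initdb.exe' program by hand against the FAT partition.
Security and reliability will be compromised, though, and any attempt to create a tablespace will fail.
Do not use PostgreSQL on FAT32 in production.

The most common reason people ask about this is that they have a USB key or external hard drive and want to put a PostgreSQL database on it. Please don't. USB keys and external hard drives can be formatted with NTFS, and that is what you should do if you want to run a database on one. FAT is not crash-safe, and unplugging a drive without using "Safely Remove" in Windows counts as a crash as far as the drive is concerned, so corruption is quite likely. If you are going to use a removable drive with PostgreSQL to store any data you care about, reformatting it as NTFS is essential.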

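=== What file system permissions does PostgreSQL need? ===
The PostgreSQL service account needs ''read'' permissions on all directories leading up to the service directory.
It needs ''write'' permissions on the data directory ''only''.
In particular, it must ''not'' be granted anything beyond ''read'' permission on the directories containing the binary files.
(The installer sets up all directories under the installation directory, so this should not be a problem unless you have changed something.)

PostgreSQL also needs ''read'' permissions on system DLL files such as kernel32.dll and user32.dll.
This is normally granted by default.
The same applies to the CMD.EXE binary, but that may be locked down, in which case access must be granted.

If you run PostgreSQL on a multi-user system, remove the permissions of all non-administrative users from the PostgreSQL directories.
No user ''ever'' needs permissions on the PostgreSQL files.
All communication takes place through libpq connections.
Direct access to the data files can lead to information disclosure or system instability.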

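=== Why can't I select Unicode as an encoding? ===
Since PostgreSQL 8.1, the UTF-8 Unicode encoding is fully supported on Windows.
The Unicode ODBC driver supports UTF-16, and the JDBC driver fully supports Unicode.

The PostgreSQL server does not support the 2-byte UTF-16 or 4-byte UTF-32 Unicode encodings for internal data storage or for network communication.
UTF-16 is the default encoding on Windows, and it is what Windows users usually mean by "Unicode", so you might expect this to be a problem. In practice it is not, because the ODBC and JDBC drivers take care of it.
Programs that use libpq directly need to be aware of this, but handling it is not much work.

=== I installed in a non-English language, but all messages show up in English! ===
The language chosen during installation only determines the language used by the installer itself during installation.
To change the language of the messages of the installed product, you must install with the ''Natural language support'' feature.
Then edit the postgresql.conf file and change the value of the ''lc_messages'' parameter to the language you want.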

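== Common Installation Errors ==

=== PostgreSQL or the installer crashes at startup, fails to start, or hangs on start ===

Common causes of problems installing and running PostgreSQL on Windows are issues with the Windows Scripting Host, antivirus software, and third-party (non-Microsoft) software firewalls.
Problems with the password of the postgres service account also occur.

The following sections describe these problems. Please read them and try the steps before asking about installer problems.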

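==== Antivirus software ====

If you have any antivirus software installed, you '''must''' exclude the data directory used by PostgreSQL from it.
If that does not help, you may need to uninstall the antivirus software from the machine completely.

Antivirus software can interfere with PostgreSQL's operation, because PostgreSQL requires the file access commands in Windows to behave exactly as Microsoft documents them.
Many antivirus programs, through errors or accidental behaviour changes, cause these commands to misbehave subtly.
Most programs do not notice, because they access files in fairly simple ways.
PostgreSQL, however, continuously reads and writes the same set of files from multiple processes, so it tends to trigger programming and design mistakes in antivirus software.
Such problems can cause random and unpredictable errors or, at worst, data corruption.

Antivirus software also often slows PostgreSQL down dramatically.
For this reason you should exclude at least postgres.exe and the data directory so that the scanner ignores them.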

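===== Which antivirus software is known to work? =====

The systems used to build the Windows installers all run either Sophos AV or the free version of AVG.
On these systems PostgreSQL passes the full regression test suite even while those programs are running.
Microsoft Security Essentials is also known to work.

Problems have been reported with the ''nod32'' antivirus product.
If you use this product, add "postmaster.exe" to the list of excluded processes
(available under the advanced options).
This has been reported to resolve the problem.

Problems have also been reported with McAfee and Panda antivirus software and with the NetLimiter network monitoring software.
PostgreSQL does run alongside these packages in some cases, but in others it does not, and there is as yet no specific or recommended solution.
The problem may be specific to installation.
In some cases uninstalling the software was necessary.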

==== Software firewalls ====

If a third-party firewall is installed on the machine, try disabling or uninstalling it.
In practice there is no need for a third-party firewall on Windows XP and above, because the built-in firewall provided by Microsoft does an excellent job.
Some poorly written third-party firewalls do not uninstall cleanly,
so after uninstalling you may have to [http://support.microsoft.com/kb/299357 tell Windows to repair its network settings].

Many products disable the Windows Firewall when they are installed and fail to restore it when uninstalled, so if you used a third-party firewall in the past and have uninstalled it, check that the Windows Firewall has been turned back on.

==== The installer exits with a runtime error during installation ====

The installer may exit with an error such as ''An error occured executing the Microsoft VC++ runtime installer''.
This can happen only on Windows.

There are two main causes.

1) The Windows Scripting Host could not execute VBScripts.
This happens if the scripting host has been disabled (rare) or its installation is damaged.
A symptom of this problem is a message such as ''CScript Error: Can't find script engine "VBScript" for script "C:\...''.
It can often be fixed by re-registering the VBScript interpreter.
Click ''Start'' -> ''Run'', enter the following, and click ''OK''.

regsvr32 %systemroot%\system32\vbscript.dll

If that fails and you are on an older version of Windows, try [http://www.microsoft.com/downloads/en/results.aspx?freetext=windows+script+host&displaylang=en&stype=s_basic updating the scripting host].

2) The installer could not properly read or write files in the system ''TEMP'' directory.
This happens when the ''TEMP'' or ''TMP'' environment variable is set to a non-standard value.
It can be recognised by messages in the log file saying that a script could not be run or could not be found.
To fix it, make sure the ''TEMP'' or ''TMP'' variable is set to a correct value.

==== Problems with the postgres user's password ====

Dave Page has written a [http://pgsnake.blogspot.com/2010/07/postgresql-passwords-and-installers.html blog post] explaining the different passwords that are used and how to resolve common problems such as password resets.

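==== PATH environment variable ====

Problems also occur if ''cygwin'' is installed and the cygwin\bin directory is in the system PATH variable.
That cygwin directory contains DLL files related to interpreted languages (TCL, perl, python).
These have defects that can make the installer or the installed PostgreSQL hang or crash.
Remove the cygwin\bin directory from the path before running the installer!

Problems have also been reported when the PATH environment variable includes directories containing versions of libssl, libintl, or both.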

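==== Permission errors occur when installing and running initdb ====

Check that the PostgreSQL service account has permissions on every level of the directory hierarchy leading to the installation directory.
The installer sets permissions on the installation directory itself but not on its parent directories.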

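==== The installer claims that the specified account is an administrator, but it isn't! ====

Most often, the specified account really is an administrator or power user, even if you did not intend it to be.
Specifically, the check used by the installer tests for membership of the Administrators and Power Users groups.
Go back, open the Administrators group under "Local Users and Groups", and check its members.
Then check which groups (domain or local) are members of the Administrators group, then the members of those groups, and so on.
PostgreSQL checks nested groups at every level.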

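==== An error message says that PostgreSQL cannot be installed from a Terminal Services session ====

Unfortunately, this is correct.
The PostgreSQL backend does not run from a TS session.
The installer also has to start a standalone backend in order to run initdb.
The installation must therefore be performed from the console.
Note that on Windows Server 2003 you can access the actual console remotely, rather than just an administrative session.
To do so, start the Remote Desktop connection by running mstsc /console, then connect as usual.
This locks the server's local console and gives you control of it through that session.
In this situation you should be able to install PostgreSQL successfully.

==== I get an error such as "the user has not been granted the requested logon type at this computer" ====

Make sure that the specified PostgreSQL account has both the "Log on as a service" and "Log on locally" rights.
The "Log on locally" right is needed only during installation and can be removed afterwards if your security policy requires it.
(Rights can be granted and revoked with the "Local Security Policy" MMC snap-in.
"Log on locally" is granted by default, and "Log on as a service" is normally granted automatically by the installer.)

If you still have problems, enable auditing (using the "Local Security Policy" snap-in) and let us know which other rights the setup needs.

If the computer is a member of a domain, the security policy may be controlled at the domain level through Group Policy.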

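==== How do I delete the service account? It does not appear in the user list ====

The Windows GUI tools sometimes hide certain accounts, so they cannot be deleted from there.
These include the automatically created PostgreSQL service account, which may have been left over from a previous installation. To delete this account, use the NET command from the command line as follows.
NET USER <username> /DELETE
Here <username> is the user's Windows login name, for example ''postgres''.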

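== Common runtime problems ==

=== Installing a procedural language fails with a "dynamic load error". ===

In most cases this means that the actual language DLL for that procedural language is missing.
The PostgreSQL DLLs contain only the language bindings.
The DLLs of the language distribution must be present in the system PATH.
For the current list of DLLs required by the different procedural languages, see the [http://pginstaller.projects.postgresql.org installation instructions].

To find out exactly which DLL is missing, you can use the ''depends'' tool provided by Microsoft.
It is part of the Windows Support Tools, which are a separate install on the Windows CD.
Running ''depends plpython.dll'' (for PL/python) shows which imports are missing.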

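=== I started the server only once, but there are many postgres.exe processes. ===

This is normal.
PostgreSQL uses a multi-process architecture.
On an idle system you should see between two and five processes.
The number of processes grows as clients connect.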

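=== How do I set an environment variable? ===
PostgreSQL uses environment variables for several settings.
On most versions of Windows, you change environment variables by opening the properties of My Computer and selecting "Advanced".
Note that there are two kinds of environment variables.
System environment variables apply to all users, and user environment variables apply to the current user.
To set a variable for the PostgreSQL service, you must change the system environment variables.
After changing a system environment variable, you must restart the service.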

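=== I have plenty of hardware, but I can't get more than about 125 connections at once. ===
When running as a service, you may find that connections start to fail at roughly 125 or more simultaneous connections.
This can happen because some of the libraries that PostgreSQL depends on themselves depend on user32.dll.
user32.dll allocates memory from an area known as the desktop heap.
The desktop heap is allocated per logon session, and a non-interactive session is normally given 512 KB.
Each postgres process in normal operation consumes about 3.2 KB of desktop heap.
Together with other overhead, this exhausts the available heap at around 125 connections.
This does not happen when the server is started from the command line (more precisely, it happens only at a much higher connection count),
because an interactive logon session is normally given 3 MB of desktop heap.

The non-interactive desktop heap can be increased by changing the third SharedSection value in the registry, as described in [http://support.microsoft.com/kb/184802 this Microsoft Knowledge Base article].
Take great care when doing so, because specifying too large a value can prevent the system from booting.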
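
The desktop heap figures quoted above can be checked with a little arithmetic. The following Python sketch is only an illustration: the function name is hypothetical, and it computes an upper bound that ignores the "other overhead" mentioned in the text, which is why the observed limit (about 125) is lower than the result for a 512 KB heap.

```python
from fractions import Fraction
import math


def heap_process_ceiling(heap_kb, per_process_kb="3.2"):
    """Upper bound on postgres processes that fit in a desktop heap.

    Exact rational arithmetic avoids floating-point rounding on 3.2.
    Real limits are lower because other consumers share the same heap.
    """
    return math.floor(Fraction(heap_kb) / Fraction(per_process_kb))


service_ceiling = heap_process_ceiling(512)        # non-interactive session, 512 KB
console_ceiling = heap_process_ceiling(3 * 1024)   # interactive session, 3 MB
print(service_ceiling, console_ceiling)
```

The gap between the two results shows why the problem appears for services but is rarely seen when the server is started from an interactive command line.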

[[Category:FAQ]]
[[Category:Japanese]]
[[Category:Windows]]

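== Windows version-specific problems ==

==== Can I install 32-bit PostgreSQL on 64-bit Windows? ====

Recent 32-bit versions of PostgreSQL (8.3 and later) can be installed and used on 64-bit Windows XP and later. However, the 32-bit limits on maximum process address space (and therefore on shared memory) still apply.

64-bit programs can connect to a 32-bit PostgreSQL server, including from the computer the server runs on, provided that a 64-bit libpq or psqlODBC driver is installed on the computer where the program runs.

The 32-bit PostgreSQL server installs only the 32-bit libpq and psqlODBC, so on the computer where the server is installed only 32-bit programs can use the database unless you additionally install a 64-bit ODBC driver or libpq.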

==== Where is the PostgreSQL ODBC driver? I am using 32-bit PostgreSQL on 64-bit Windows. ====

To configure a data source for a 32-bit application using the 32-bit driver, you must use the 32-bit ODBC administrator.

Unless you have also installed the [[#What about 64-bit ODBC drivers?|64-bit version]] of [http://psqlodbc.projects.postgresql.org psqlODBC], a 32-bit installation of PostgreSQL provides only the 32-bit ODBC driver. This 32-bit ODBC driver can be used only by 32-bit programs and '''does not appear in the 64-bit ODBC administrator'''.

This is confusing because on 64-bit Windows <code>c:\windows\system32\odbcad32</code> is, despite its name, the '''64-bit''' ODBC driver administrator. This is an artifact of Windows development history. Evidently many applications and installers depend on odbcad32.exe having this name and path, so Microsoft was stuck with it even though the name had become absurd. The "system32" directory keeps its name on 64-bit Windows for the same reason. There is nothing PostgreSQL can do about this.

See: [http://support.microsoft.com/kb/942976 http://support.microsoft.com/kb/942976]

That article explains that the 32-bit ODBC administrator on 64-bit Windows is located at:

<pre>
%systemdrive%\Windows\SysWoW64\odbcad32.exe
</pre>

You can start it by entering this path into "Start->Run". The PostgreSQL ODBC driver appears in the 32-bit ODBC administrator.

64-bit applications '''cannot''' use the 32-bit ODBC driver. In other words, unless you have installed the 64-bit ODBC driver, only 32-bit applications can use the PostgreSQL ODBC driver.

==== Can I use 64-bit ODBC programs with a 32-bit PostgreSQL server? ====

Only if you have installed the 64-bit [http://psqlodbc.projects.postgresql.org psqlODBC] driver. See the installation section.

20120924updaterelease

2012-09-26T02:58:11Z

Hanada: add Japanese page

{{Languages}}

= Details of 2012-09-24 Update Release Data Corruption Issue =

== Description of the Problem ==

Versions 9.1 and 9.2 of PostgreSQL have a bug with flushing dirty blocks from memory, or "[http://www.postgresql.org/docs/current/static/wal-configuration.html checkpointing]", introduced accidentally as a side effect of performance optimizations and new features, mainly [[What%27s_new_in_PostgreSQL_9.1#Unlogged_Tables|Unlogged Tables]]. This bug can cause data of certain types to not be written to disk if the database shuts down or restarts for any of the following reasons:

* PostgreSQL crash
* Server crash or power loss
* "immediate" shutdown (pg_ctl -m immediate)
* "kill -9" or Out-Of-Memory-Kill of the postmaster service
* database is a standby which was promoted to master

Under these circumstances, the database can suffer from recoverable data corruption. The nature of this corruption is such that it can produce wrong, but seemingly valid, answers to queries, so it is critical that users who may have been affected by this corruption take steps to clean it up very soon.

First, there is a low probability of corruption of BTREE and GIN indexes. Shutting down cleanly will limit the further spread of this issue. If corruption has occurred, it is very likely to be visible in the form of error messages when the index is used.

Second, there is a significant probability of corruption of relation visibility maps (approaching 100% on standbys). This affects 9.1 very differently from 9.2, however. On PostgreSQL 9.1 the worst consequence is some transient inefficiency and/or failure to recover free space during VACUUM. On PostgreSQL 9.2, the visibility map is used during index-only scans, so these scans are likely to produce wrong answers.

The PostgreSQL Global Development Group apologizes for the inconvenience caused by these issues.

== Steps for Users of PostgreSQL 9.1 ==

If you are running 9.1, and suspect that you may be vulnerable to database corruption because your database has shut down unexpectedly or failed over during the last few months:

# Download new 9.1.6 packages
# Do a clean shutdown of PostgreSQL, using one of the following mechanisms:
#* init script or service manager
#* pg_ctl -m smart stop
#* pg_ctl -m fast stop
# Install 9.1.6
# Restart the database system
# Gradually rebuild all of your BTree and GIN indexes (see below)
# Schedule a manual vacuum of the whole database during a convenient slow period (see below)

If you are planning to upgrade to PostgreSQL 9.2 using pg_upgrade, it is critical for you to run the full database VACUUM first.

== Steps for Users of PostgreSQL 9.2 ==

If you are running 9.2.0, and suspect that you may be vulnerable to database corruption because your database has shut down unexpectedly or failed over during the last two weeks:

# Download new 9.2.1 packages
# Do a clean shutdown of PostgreSQL, using one of the following mechanisms:
#* init script or service manager
#* pg_ctl -m smart stop
#* pg_ctl -m fast stop
# Install 9.2.1
# Restart the database system
# VACUUM all tables in your database immediately
# Gradually rebuild all of your BTree and GIN indexes (see below)

== How to VACUUM All Tables ==

To correct corruption of the visibility map, users should run a vacuum and force a scan of all database blocks in order to reset the entire map. Since this means effectively scanning the entire database, it will generate considerable IO and take significant time to execute for large databases. One way to ameliorate the impact on concurrently running database load is to use cost delay to spread out the vacuum:

SET [http://www.postgresql.org/docs/9.2/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST vacuum_cost_delay] = 50;

=== Interactive VACUUM ===

For each database, you should:

# log in to psql as the Postgres superuser
# set vacuum_cost_delay, if doing so
# run "[http://www.postgresql.org/docs/9.2/static/sql-vacuum.html VACUUM ( FREEZE, VERBOSE, ANALYZE );]" (ANALYZE is optional)

This will produce a lot of output, allowing you to track progress of the full-database vacuum.

You can also VACUUM one table at a time instead of doing them all one after the other, provided that you have some way to track which tables have and have not been vacuumed.

=== vacuumdb ===

If you have multiple databases to vacuum, you may find it convenient to use the [http://www.postgresql.org/docs/9.2/static/app-vacuumdb.html vacuumdb utility] instead. This would work by:

# set vacuum_cost_delay in postgresql.conf, if doing so (and reload database)
# run "vacuumdb -F -v -z -a" as the postgres superuser

Note that you may need to give vacuumdb additional parameters in order to connect with the database server. The -z (analyze) and -v (verbose) options are optional.
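
For scripting the procedure, the command line above can be assembled programmatically. The following Python sketch only builds the argument list and does not run anything; the function name and the <code>extra_args</code> parameter are hypothetical, while the flags themselves (-F, -v, -z, -a) are the ones listed above.

```python
import shlex


def vacuumdb_command(analyze=True, verbose=True, extra_args=()):
    """Build the vacuumdb argument list described above.

    -F (freeze) and -a (all databases) are always included; -z and -v
    are the optional flags. extra_args is for connection options.
    """
    args = ["vacuumdb", "-F"]
    if verbose:
        args.append("-v")
    if analyze:
        args.append("-z")
    args.append("-a")
    args.extend(extra_args)
    return args


print(shlex.join(vacuumdb_command()))
```

The resulting list can be passed to <code>subprocess.run</code> when executed as the postgres superuser.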

== Rebuild BTree/GIN Indexes ==

It is likely that any indexes which are corrupted because of the issues fixed in this update release will display error messages when accessed, and can be easily identified. However, it is possible (though unlikely) that a few indexes may be corrupted so that they return incorrect answers without errors.

The VACUUM FREEZE recommended above will correct some types of index corruption. However, users who have strong data integrity concerns, or feel they are especially at risk due to multiple crashes or failovers in their server history, should take the extra step of rebuilding indexes in order to eliminate any possible corruption.

=== Rebuilding an Individual Index ===

Whether you are rebuilding as a precaution or because you have found an index corruption error, you can rebuild indexes one at a time. The simplest way is via [http://www.postgresql.org/docs/9.2/static/sql-reindex.html REINDEX].

REINDEX TABLE <tablename>;

or for a single index:

REINDEX INDEX <indexname>;

You may want to increase the RAM available to REINDEX, by increasing maintenance_work_mem, up to 1/8 of your available RAM (up to a maximum of 2GB). REINDEX takes a full table write lock, however, and depending on the size of the table, can take a considerable time to run. In order to rebuild indexes while under concurrent database load, use CREATE INDEX CONCURRENTLY:

CREATE INDEX CONCURRENTLY <indexname>_tmp <index_definition>;
BEGIN;
DROP INDEX <indexname>;
ALTER INDEX <indexname>_tmp RENAME TO <indexname>;
END;

This locks the table only during the final drop and rename stage. It is, however, more complex.

Either approach will generate considerable IO while running on large tables.
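
The maintenance_work_mem guideline in this section (one eighth of available RAM, capped at 2GB) can be worked out as in the following Python sketch. The function names are hypothetical, and the sketch assumes "available RAM" is simply a byte count you supply.

```python
GIB = 1024 ** 3


def suggested_maintenance_work_mem(available_ram_bytes):
    """One eighth of available RAM, capped at 2 GB, in bytes."""
    return min(available_ram_bytes // 8, 2 * GIB)


def as_setting(num_bytes):
    """Render a byte count as a value in MB for SET maintenance_work_mem."""
    return f"{num_bytes // (1024 ** 2)}MB"


for ram_gib in (4, 16, 64):
    value = suggested_maintenance_work_mem(ram_gib * GIB)
    print(ram_gib, "GiB RAM ->", as_setting(value))
```

Under this rule the cap is reached at 16 GiB of RAM, so larger machines gain nothing further from the formula.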

=== Getting a List of Btree and GIN Indexes ===

Regardless of your approach towards rebuilding your indexes, you may want to get a list of all BTree and GIN indexes in the database. BTree is the most common type of index, so this will include most of the indexes in your database. Given that GiST indexes can be quite large, though, you may want to omit them from rebuilding.

Use this query:

SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE ( indexdef ILIKE '%USING btree%'
OR indexdef ILIKE '%USING GIN%' )
AND schemaname <> 'pg_catalog'
ORDER BY tablename, indexname;
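
Each row returned by this query can be turned into the CREATE INDEX CONCURRENTLY sequence described earlier. The Python sketch below is a minimal illustration with a hypothetical function name. It assumes that <code>indexdef</code> has the plain form <code>CREATE [UNIQUE] INDEX name ON ...</code> with an unquoted index name, and it does not handle indexes that back constraints such as primary keys, which cannot simply be dropped.

```python
import re

_CREATE_RE = re.compile(r"^CREATE (UNIQUE )?INDEX (\S+) ON ", re.IGNORECASE)


def concurrent_rebuild_sql(indexname, indexdef):
    """Return the CREATE CONCURRENTLY / DROP / RENAME statements for one index."""
    match = _CREATE_RE.match(indexdef)
    if match is None or match.group(2) != indexname:
        raise ValueError(f"unexpected index definition: {indexdef!r}")
    unique = match.group(1) or ""
    rest = indexdef[match.end():]   # everything after "... ON "
    tmp = f"{indexname}_tmp"
    return [
        f"CREATE {unique}INDEX CONCURRENTLY {tmp} ON {rest};",
        "BEGIN;",
        f"DROP INDEX {indexname};",
        f"ALTER INDEX {tmp} RENAME TO {indexname};",
        "END;",
    ]


for line in concurrent_rebuild_sql(
    "accounts_bid_idx",
    "CREATE INDEX accounts_bid_idx ON accounts USING btree (bid)",
):
    print(line)
```

Review the generated statements before running them, because the first statement must run outside a transaction block and can take a long time on large tables.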

=== Reindexing Everything ===

If you can afford the required downtime, and want to be absolutely certain that you've prevented all corruption, you can reindex every index in your database using [http://www.postgresql.org/docs/9.2/static/app-reindexdb.html the reindexdb utility]. Note that this will cause GiST indexes to be rebuilt as well, even though they are not in danger of corruption.

Run the following as the postgres superuser to reindex one database:

reindexdb <databasename>

Or to reindex all databases:

reindexdb -a

Additional options may be required for reindexdb to connect to your database. Since reindexdb will take a lock on entire tables in your installation, one at a time, this is best done during a downtime.

20120924updaterelease/ja

2012-09-26T02:57:11Z

Hanada: translate into Japanese

{{Languages}}

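= Details of the 2012-09-24 Update Release Data Corruption Issue =

== Description of the Problem ==

Versions 9.1 and 9.2 of PostgreSQL have a bug in flushing dirty blocks from memory (or "[http://www.postgresql.jp/document/current/html/wal-configuration.html checkpointing]") that was introduced accidentally as a side effect of performance optimizations and new features (mainly [[What%27s_new_in_PostgreSQL_9.1#Unlogged_Tables|Unlogged Tables]]). This bug can cause certain types of data not to be written to disk if the database shuts down or restarts for any of the following reasons:

* PostgreSQL crash
* Server crash or power loss
* "immediate" shutdown (pg_ctl -m immediate)
* "kill -9" or Out-Of-Memory-Kill of the postmaster service
* The database is a standby that was promoted to master

Under these circumstances, the database can suffer recoverable data corruption. The nature of this corruption is that queries may return answers that look valid but are actually wrong. It is therefore critical that users who may have been affected take steps to clean it up immediately.

First, there is a low probability of corruption of BTREE and GIN indexes. Shutting down cleanly prevents further spread of this issue. If corruption has occurred, it will probably show up as error messages when the index is used.

Second, there is a significant probability of corruption of relation visibility maps (approaching 100% on standbys).

The PostgreSQL Global Development Group apologizes for the inconvenience caused by these issues.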

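== Steps for Users of PostgreSQL 9.1 ==

If you are running 9.1 and suspect that you may be affected by database corruption because your database has shut down unexpectedly or failed over during the last few months, take the following steps:

# Download the new 9.1.6 packages
# Do a clean shutdown of PostgreSQL, using one of the following mechanisms:
#* init script or service manager
#* pg_ctl -m smart stop
#* pg_ctl -m fast stop
# Install 9.1.6
# Restart the database system
# Gradually rebuild all of your BTree and GIN indexes (see below)
# Schedule a manual vacuum of the whole database during a convenient low-load period (see below)

If you are planning to upgrade to PostgreSQL 9.2, it is critical to run the full database VACUUM first.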

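== Steps for Users of PostgreSQL 9.2 ==

If you are running 9.2.0 and suspect that you may be affected by database corruption because your database has shut down unexpectedly or failed over during the last two weeks, take the following steps:

# Download the new 9.2.1 packages
# Do a clean shutdown of PostgreSQL, using one of the following mechanisms:
#* init script or service manager
#* pg_ctl -m smart stop
#* pg_ctl -m fast stop
# Install 9.2.1
# Restart the database system
# VACUUM all tables in your database immediately
# Gradually rebuild all of your BTree and GIN indexes (see below)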

== How to VACUUM All Tables ==

To repair corruption of the visibility map, users must run a vacuum that forces a scan of all database blocks in order to reset the entire map. Because this effectively means scanning the whole database, it generates considerable IO and takes a long time on large databases. One way to reduce the impact on concurrently running database load is to use cost delay to spread out the vacuum:

SET [http://www.postgresql.jp/document/current/html/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST vacuum_cost_delay] = 50;

=== Interactive VACUUM ===

For each database, you should:

# Log in to psql as the Postgres superuser
# Set vacuum_cost_delay, if you are using it
# Run "[http://www.postgresql.jp/document/current/html/sql-vacuum.html VACUUM ( FREEZE, VERBOSE, ANALYZE );]" (ANALYZE is optional)

This produces a lot of output, which lets you track the progress of the full-database vacuum.

You can also VACUUM one table at a time instead of doing them all in sequence, provided that you have some way to track which tables have and have not been vacuumed.

=== vacuumdb ===

If you have multiple databases to vacuum, you may find it more convenient to use [http://www.postgresql.jp/document/current/html/app-vacuumdb.html vacuumdb] instead. In that case, proceed as follows:

# Set vacuum_cost_delay in postgresql.conf, if you are using it (and reload the database)
# Run "vacuumdb -F -v -z -a" as the postgres superuser

Note that you may need to give vacuumdb additional parameters in order to connect to the database server. The -z (analyze) and -v (verbose) options are optional.

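== Rebuild BTree/GIN Indexes ==

Indexes that were corrupted by the issues fixed in this update release will probably display error messages when accessed, so they should be easy to identify. However, it is possible (though unlikely) that a few indexes are corrupted in such a way that they return incorrect answers without errors.

The VACUUM FREEZE recommended above corrects some types of index corruption. However, users who have strong data integrity concerns, or who feel especially at risk because of multiple crashes or failovers in their server history, should take the extra step of rebuilding indexes in order to eliminate any possible corruption.

=== Rebuilding an Individual Index ===

Whether you are rebuilding as a precaution or because you have found index corruption, you can rebuild indexes one at a time. The simplest way is to use [http://www.postgresql.jp/document/current/html/sql-reindex.html REINDEX].

REINDEX TABLE <tablename>;

or for a single index:

REINDEX INDEX <indexname>;

You may want to increase the RAM available to REINDEX by raising maintenance_work_mem to as much as 1/8 of your available RAM (up to a maximum of 2GB). REINDEX takes a write lock on the whole table and, depending on the size of the table, can take a considerable time to run. To rebuild indexes while under concurrent database load, use CREATE INDEX CONCURRENTLY: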

CREATE INDEX CONCURRENTLY <indexname>_tmp <index_definition>;
BEGIN;
DROP INDEX <indexname>;
ALTER INDEX <indexname>_tmp RENAME TO <indexname>;
END;

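This locks the table only during the final drop and rename stage. It is, however, more complex.

Either approach will generate considerable IO while running on large tables.

=== Getting a List of Btree and GIN Indexes ===

Regardless of how you rebuild your indexes, you may want to get a list of all BTree and GIN indexes in the database. BTree is the most common type of index, so this will include most of the indexes in your database. Given that GiST indexes can be quite large, you may want to leave them out of the rebuild.

Use this query: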

SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE ( indexdef ILIKE '%USING btree%'
OR indexdef ILIKE '%USING GIN%' )
AND schemaname <> 'pg_catalog'
ORDER BY tablename, indexname;

=== Reindexing Everything ===

If you can afford the required downtime and want to be absolutely certain that you have prevented all corruption, you can reindex every index in your database using [http://www.postgresql.jp/document/current/html/app-reindexdb.html the reindexdb utility]. Note that this also rebuilds GiST indexes, even though they are not in danger of corruption.

Run the following as the postgres superuser to reindex one database:

reindexdb <databasename>

Or to reindex all databases:

reindexdb -a

Additional options may be required for reindexdb to connect to your database server. Because reindexdb locks every table, one at a time, it is best run during a downtime.

Slow Counting/ja

2012-09-19T01:16:27Z

Hanada: Translate note about index-only scans in 9.2

{{Languages}}
'''Note that the following article applies only to PostgreSQL versions before 9.2. Index-only scans are now implemented.'''

Counting all rows in a table, as in the following example, is one operation where PostgreSQL is known to be slow.

<code><pre>
SELECT COUNT(*) FROM table
</pre></code>

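The reason this is slow is related to PostgreSQL's [[MVCC]] implementation.
Because multiple transactions can see different states of the data, there can be no straightforward way for "COUNT(*)" to summarize data across the whole table.
Put another way, PostgreSQL '''must''' walk through every row.
This normally means reading information about every row in the table with a sequential scan.
A good way to see how a query is being executed is to use EXPLAIN ANALYZE.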

<code><pre>
postgres=# EXPLAIN ANALYZE SELECT COUNT(*) FROM accounts;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Aggregate (cost=4499.00..4499.01 rows=1 width=0) (actual time=465.588..465.591 rows=1 loops=1)
-> Seq Scan on accounts (cost=0.00..4249.00 rows=100000 width=0) (actual time=0.011..239.212 rows=100000 loops=1)
Total runtime: 465.642 ms
(3 rows)
</pre></code>
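
The effect of MVCC on counting can be illustrated with a toy model. The Python sketch below is a deliberate simplification, not PostgreSQL's actual visibility rules: each row version records a hypothetical creating transaction (xmin) and deleting transaction (xmax), and a snapshot is modelled as the set of transactions it treats as committed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    xmin: int                    # transaction that created this version
    xmax: Optional[int] = None   # transaction that deleted it, if any


def visible(row, snapshot):
    """Toy rule: visible if the creator committed and any deleter has not."""
    created = row.xmin in snapshot
    deleted = row.xmax is not None and row.xmax in snapshot
    return created and not deleted


def count_rows(heap, snapshot):
    # No single stored counter can serve every snapshot, so each row is checked.
    return sum(1 for row in heap if visible(row, snapshot))


heap = [RowVersion(1), RowVersion(1, xmax=3), RowVersion(2), RowVersion(4)]
old_snapshot_count = count_rows(heap, snapshot={1, 2})
new_snapshot_count = count_rows(heap, snapshot={1, 2, 3})
print(old_snapshot_count, new_snapshot_count)
```

Two sessions looking at the same table at the same moment obtain different counts, which is why the result cannot be read from a single cached value.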

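It is worth noting that only this exact form of the aggregate has to be so pessimistic. If the query has a "WHERE" clause, as in the following example, PostgreSQL uses any available indexes on the restricted fields to limit how many records must be counted.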

<code><pre>
SELECT COUNT(*) FROM table WHERE status = 'something'
</pre></code>

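This can speed up such queries considerably. PostgreSQL still needs to read the resulting rows to verify that they exist. Other database systems may only need to consult the index in this situation.

== Estimating the row count ==

If you only need an approximate row count, PostgreSQL offers an alternative.
Use the reltuples field of the pg_class catalog, as follows.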

<code><pre>
pgbench=# select reltuples from pg_class where relname='tellers';
reltuples
-----------
250
</pre></code>

This assumes that ANALYZE is run on the table often enough to keep the statistics up to date.

Another commonly used approach is to count the rows in a table with a trigger-based mechanism.
One or both of these techniques are described in:
* [http://www.varlena.com/GeneralBits/120.php Counting Rows]
* [http://www.varlena.com/GeneralBits/49.php Tracking the Row Count]

* Original article: [[Why PostgreSQL Instead of MySQL: Comparing Reliability and Speed in 2007|Why PostgreSQL Instead of MySQL]] (which also discusses how this differs from MySQL)

[[Category:FAQ]]
[[Category:Performance]]
[[Category:Japanese]]

SQL/MED

2012-03-06T11:11:44Z

Hanada: /* FDW routines */ PlanForeignScan can return multiple paths, from 9.2

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will over time introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

Check out the list of all the [[foreign data wrappers]]

= Active Work In Progress =
== Add pgsql_fdw as a contrib module ==
[http://commitfest.postgresql.org/action/patch_view?id=667 "pgsql_fdw contrib module"] is under proposal at CF 2011-11. The goal of this proposal is to add pgsql_fdw as a contrib module.
== Smart planning ==
* We might have statistics of external data. The ANALYZE command would need a hook to delegate row sampling to each FDW. For this purpose, [http://commitfest.postgresql.org/action/patch_view?id=661 "Collecting statistics on foreign tables"] is under proposal at CF 2011-11. This proposal provides a handler function which allows FDWs to handle ANALYZE commands that are executed on foreign tables. In addition, contrib/file_fdw is enhanced to get sample rows from the actual data file and calculate statistics using existing routines in core.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Doing a JOIN (or JOINs) on the remote side would reduce the amount of data transferred from the external server.
== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Connection caching ==
Currently, connection caching is not implemented, in order to keep the focus on the FDW API. The ideas below were implemented at one point but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend looks for a reusable connection in the cache. If none is found, it calls FdwRoutine.ConnectServer() to create a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server which the connection uses.
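
The lookup-or-connect behaviour described above can be sketched in a few lines. The following Python sketch is only a model of the idea, not the removed C implementation: <code>ConnectionCache</code> and <code>fake_connect</code> are hypothetical names, and <code>fake_connect</code> stands in for FdwRoutine.ConnectServer().

```python
class ConnectionCache:
    """Per-backend cache of foreign-server connections, keyed by server name."""

    def __init__(self, connect_server):
        self._connect_server = connect_server   # stand-in for ConnectServer()
        self._connections = {}

    def get(self, server_name):
        conn = self._connections.get(server_name)
        if conn is None:
            # Cache miss: open a new connection and remember it by name.
            conn = self._connect_server(server_name)
            self._connections[server_name] = conn
        return conn


opened = []


def fake_connect(server_name):
    opened.append(server_name)
    return object()


cache = ConnectionCache(fake_connect)
first = cache.get("remote_postgresql_server")
second = cache.get("remote_postgresql_server")
print(first is second, opened)
```

Because the key is the server name, two foreign tables on the same server share one connection within a backend.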

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map the foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library. However, an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better, because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT value and column level options are omitted to simplify the patch and make review easy.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling. That does not seem realistic. Instead, a PostgreSQL-specific and C language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. It would be harder with other approaches like AM (access methods) or storage manager (smgr) layers to optimize complex queries like JOINs on several foreign tables in the same foreign server.

Only interfaces of FdwRoutine, FSConnection are defined in PostgreSQL core, and the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in some points:

* Add FdwPlan as container of FDW-specific planning information.
* Add FdwExecutionState as container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need a connection.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as follows:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In future, more planner hooks may be added to allow FDWs to optimize the query.
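
The order in which the scan callbacks are invoked can be illustrated with a small model. The Python sketch below is an analogy only, not PostgreSQL code: <code>ListScan</code> and <code>run_scan</code> are hypothetical names, the methods mirror BeginForeignScan, IterateForeignScan, ReScanForeignScan and EndForeignScan, and returning <code>None</code> stands in for the end-of-data signal.

```python
class ListScan:
    """Toy foreign scan over an in-memory list (hypothetical, not a real FDW)."""

    def __init__(self, rows):
        self._rows = rows
        self._pos = None
        self.events = []

    def begin_foreign_scan(self):
        self.events.append("begin")
        self._pos = 0

    def iterate_foreign_scan(self):
        # Return the next row, or None to signal that the data is exhausted.
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row

    def rescan_foreign_scan(self):
        self.events.append("rescan")
        self._pos = 0

    def end_foreign_scan(self):
        self.events.append("end")
        self._pos = None


def run_scan(scan, passes=1):
    """Drive the callbacks in the order an executor-like caller would."""
    out = []
    scan.begin_foreign_scan()
    for i in range(passes):
        if i > 0:
            scan.rescan_foreign_scan()   # restart from the first row
        while (row := scan.iterate_foreign_scan()) is not None:
            out.append(row)
    scan.end_foreign_scan()
    return out


scan = ListScan(["a", "b"])
print(run_scan(scan, passes=2), scan.events)
```

A rescan restarts the same scan without a second begin or end, which is the case for example when a foreign table is the inner side of a nested loop.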

=== Version 4 ===
In 9.2, PlanForeignScan was changed so that an FDW can return multiple scan paths for a foreign table, and this change gets rid of FdwPlan. The planner chooses the appropriate path from those provided by the FDW and creates a single ForeignScan node that holds a copy of the chosen path's fdw_private. PlanForeignScan is now responsible for creating the ForeignScan path node and adding it to the RelOptInfo (baserel). You can use create_foreignscan_path, which also changed in 9.2, to create a finished ForeignScan path node.

typedef void (*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

The PlanForeignScan of an FDW that does not support any push-down feature would look like this.

void
fooPlanForeignScan(Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel)
{
double rows;
Cost startup_cost, total_cost;
List *fdw_private;

/* Estimate # of rows returned by this scan */
rows = ...;

/* Estimate costs of this scan */
startup_cost = ...;
total_cost = ...;

/* Store FDW-private information as copy-able objects */
fdw_private = NIL;
fdw_private = lappend(fdw_private, makeNode(...));
...

/* Create path node and add it to baserel */
add_path(baserel, (Path *)
create_foreignscan_path(root, baserel,
rows, /* # of tuples in the table */
startup_cost, /* costs are required */
total_cost,
NIL, /* no pathkeys */
NULL, /* no outer rel either */
NIL, /* no param clause */
fdw_private));
}

In other FDW functions, fdw_private is available via ForeignScanState.

List *fdw_private;

fdw_private = ((ForeignScan *) node->ss.ps.plan)->fdw_private;
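
The flow described for 9.2 (the FDW adds several candidate paths, the planner picks one, and the scan node receives a copy of that path's fdw_private) can be modelled as below. This Python sketch is a deliberately simplified illustration: the classes only borrow the C names, <code>choose_path</code> is hypothetical, and the real add_path also discards dominated paths and considers more than total cost.

```python
from dataclasses import dataclass, field


@dataclass
class ForeignPath:
    rows: float
    startup_cost: float
    total_cost: float
    fdw_private: list = field(default_factory=list)


@dataclass
class RelOptInfo:
    pathlist: list = field(default_factory=list)


def add_path(rel, path):
    # Simplified: just collect the candidate paths.
    rel.pathlist.append(path)


def choose_path(rel):
    """Pick the path with the lowest total cost (toy selection rule)."""
    return min(rel.pathlist, key=lambda p: p.total_cost)


baserel = RelOptInfo()
add_path(baserel, ForeignPath(1000, 10.0, 250.0, ["full scan"]))
add_path(baserel, ForeignPath(1000, 50.0, 120.0, ["remote filter"]))
plan_private = list(choose_path(baserel).fdw_private)  # copied into the scan node
print(plan_private)
```

Only the private data of the winning path reaches the executor, so each path must carry everything the scan callbacks will need.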

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns an FDW routine set. A new pseudo type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used only as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server ID and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so the FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way each FDW prefers.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, FDWs can update their statistics in their own respective ways.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
    Scan        scan;
    bool        fsSystemCol;
    struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with FdwPlan which was stored in ForeignScan to initiate foreign query if the execution was not for EXPLAIN, and receive FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
    ScanState   ss;
    struct FdwRoutine *fdwroutine;
    void       *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data-wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.
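
The call sequence above (begin, iterate until the slot comes back empty, optionally rescan, end) can be illustrated with a self-contained toy. Nothing here is PostgreSQL code: ToySlot, ToyExecState, ToyScan and ToyFdwRoutine are hypothetical stand-ins for TupleTableSlot, FdwExecutionState, the FDW's private structure and FdwRoutine, and the "foreign table" is just an in-memory array of integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for TupleTableSlot: one int value, or empty. */
typedef struct ToySlot
{
    bool        empty;
    int         value;
} ToySlot;

/* Hypothetical stand-in for FdwExecutionState and its private area. */
typedef struct ToyExecState
{
    void       *private_data;
} ToyExecState;

/* Private data of this toy FDW: the rows to return and a cursor. */
typedef struct ToyScan
{
    const int  *rows;
    size_t      nrows;
    size_t      next;
} ToyScan;

typedef struct ToyFdwRoutine
{
    ToyExecState *(*BeginScan) (const int *rows, size_t nrows);
    void        (*Iterate) (ToyExecState *state, ToySlot *slot);
    void        (*ReScan) (ToyExecState *state);
    void        (*EndScan) (ToyExecState *state);
} ToyFdwRoutine;

ToyExecState *
toy_begin_scan(const int *rows, size_t nrows)
{
    ToyExecState *state = malloc(sizeof(ToyExecState));
    ToyScan    *scan = malloc(sizeof(ToyScan));

    assert(state != NULL && scan != NULL);
    scan->rows = rows;
    scan->nrows = nrows;
    scan->next = 0;
    state->private_data = scan;
    return state;
}

void
toy_iterate(ToyExecState *state, ToySlot *slot)
{
    ToyScan    *scan = state->private_data;

    if (scan->next >= scan->nrows)
    {
        slot->empty = true;     /* end of scan: leave the slot empty */
        return;
    }
    slot->empty = false;
    slot->value = scan->rows[scan->next++];
}

void
toy_rescan(ToyExecState *state)
{
    ToyScan    *scan = state->private_data;

    scan->next = 0;             /* restart from the first row */
}

void
toy_end_scan(ToyExecState *state)
{
    free(state->private_data);
    free(state);
}

const ToyFdwRoutine toy_routine = {
    toy_begin_scan, toy_iterate, toy_rescan, toy_end_scan
};

/* Executor side: pull tuples until the slot comes back empty. */
size_t
toy_count_tuples(const ToyFdwRoutine *routine, ToyExecState *state)
{
    ToySlot     slot;
    size_t      count = 0;

    for (;;)
    {
        routine->Iterate(state, &slot);
        if (slot.empty)
            break;
        count++;
    }
    return count;
}
```

The executor side never looks inside private_data; only the FDW's own routines interpret it, which is the point of the private area.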

== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for a column, and the key/value pairs are stored in attfdwoptions of pg_attribute.

Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin is not supported, although it is allowed in COPY FROM.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should probably not be allowed to create or alter foreign tables which use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw recognizes the file formats accepted by the COPY command by using the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, except that ''oids'' is not supported by file_fdw because it's a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It might be possible to integrate it with [http://www.postgresql.org/docs/9.1/static/dblink.html contrib/dblink] to share the code and connections.
dblink will be installed optionally, like the standard contrib modules.

=== Connection options ===
The connection options are constructed from the FDW options of the foreign-data wrapper, foreign server and user mapping. Only connection options are picked out, because FDW options might include non-connection options such as relname and nspname.
Note that, for security reasons, a non-superuser MUST specify a password in the FDW options, and the foreign server must require password authentication.
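
As a rough sketch of the filtering step, the following self-contained C code keeps only connection options from a merged option list and builds a "name=value" connection string. It is an illustration under assumptions: ToyOption, toy_is_conn_option() and toy_build_conninfo() are hypothetical names, the whitelist is a guess at a plausible subset, and values are assumed to contain no spaces or quotes, so no quoting is done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical key/value pair taken from wrapper, server or user mapping. */
typedef struct ToyOption
{
    const char *name;
    const char *value;
} ToyOption;

/* Assumed whitelist of connection options, for illustration only. */
static const char *const toy_conn_options[] = {
    "host", "port", "dbname", "user", "password"
};

bool
toy_is_conn_option(const char *name)
{
    size_t      i;

    for (i = 0; i < sizeof(toy_conn_options) / sizeof(toy_conn_options[0]); i++)
    {
        if (strcmp(name, toy_conn_options[i]) == 0)
            return true;
    }
    return false;
}

/*
 * Write "name=value" pairs for connection options only, separated by
 * spaces. Returns false if the buffer is too small.
 */
bool
toy_build_conninfo(const ToyOption *opts, size_t nopts,
                   char *buf, size_t buflen)
{
    size_t      used = 0;
    size_t      i;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';
    for (i = 0; i < nopts; i++)
    {
        int         n;

        if (!toy_is_conn_option(opts[i].name))
            continue;           /* e.g. relname, nspname */
        n = snprintf(buf + used, buflen - used, "%s%s=%s",
                     used > 0 ? " " : "", opts[i].name, opts[i].value);
        if (n < 0 || (size_t) n >= buflen - used)
            return false;
        used += (size_t) n;
    }
    return true;
}
```

A whitelist is used here rather than a blacklist so that unknown table-level options are never passed to the connection routine by accident.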

In the current implementation, for security reasons, FDW options of user mappings are visible only to users who have the SUPERUSER privilege or the USAGE privilege on the relevant SERVER.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== Cost estimation ===
ANALYZE for foreign tables is not supported in 9.0, so we can't store statistics in the local PostgreSQL server. One workaround is to get the EXPLAIN result from the remote server and use its cost values for local planning.

=== SELECT-clause optimization ===
Currently the SELECT clause is constructed as "SELECT col1, col2, col3, ...". If some of the columns are not used at all in the original query, they are replaced with NULL as an optimization. For example, if col2 is unused, the SELECT clause will be "SELECT col1, NULL, col3, ...". The main purpose of this optimization is to reduce the amount of data transferred from the remote server.
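
The NULL substitution can be sketched as plain string building. The code below is a minimal, self-contained illustration, not the FDW's actual deparsing code: toy_append() and toy_build_select() are hypothetical helpers, the set of used columns is passed in as a boolean array, and identifier quoting is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Append text to buf, tracking the current length; false on overflow. */
bool
toy_append(char *buf, size_t buflen, size_t *len, const char *text)
{
    size_t      n = strlen(text);

    if (*len + n + 1 > buflen)
        return false;
    memcpy(buf + *len, text, n + 1);
    *len += n;
    return true;
}

/*
 * Build "SELECT c1, NULL, c3 FROM rel": columns that the original query
 * never references are replaced with NULL so the column positions stay
 * the same while less data is transferred.
 */
bool
toy_build_select(const char *const *cols, const bool *is_used, size_t ncols,
                 const char *relname, char *buf, size_t buflen)
{
    size_t      len = 0;
    size_t      i;

    assert(ncols > 0 && buflen > 0);
    buf[0] = '\0';
    if (!toy_append(buf, buflen, &len, "SELECT "))
        return false;
    for (i = 0; i < ncols; i++)
    {
        if (i > 0 && !toy_append(buf, buflen, &len, ", "))
            return false;
        if (!toy_append(buf, buflen, &len, is_used[i] ? cols[i] : "NULL"))
            return false;
    }
    return toy_append(buf, buflen, &len, " FROM ") &&
        toy_append(buf, buflen, &len, relname);
}
```

Keeping a NULL placeholder rather than dropping the column means the result tuples still match the foreign table's column layout, so no remapping is needed on the local side.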

=== WHERE-clause push-down ===
WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.

To be pushed down, a condition must consist only of the node types listed below. For this purpose, we check each element of the RelOptInfo.baserestrictinfo list. If there are conditions which can't be pushed down, the remote server sends rows without applying those conditions, and the local server evaluates them and discards the rows which don't satisfy them.

{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.
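
The check described above amounts to a recursive walk that rejects a condition as soon as it meets a node type outside the table or a non-immutable operator or function. The following self-contained sketch shows the idea on a toy expression tree. ToyExpr and the TOY_* tags are hypothetical stand-ins for the planner's node types, only a subset of the table is modeled, and immutability is a plain flag instead of a catalog lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node tags mirroring part of the table above. */
typedef enum ToyTag
{
    TOY_CONST,
    TOY_VAR,
    TOY_PARAM_EXTERN,
    TOY_BOOL_EXPR,
    TOY_NULL_TEST,
    TOY_OP_EXPR,
    TOY_FUNC_EXPR,
    TOY_OTHER                   /* anything not in the table */
} ToyTag;

#define TOY_MAX_ARGS 2

typedef struct ToyExpr
{
    ToyTag      tag;
    bool        immutable;      /* only meaningful for operators/functions */
    const struct ToyExpr *args[TOY_MAX_ARGS];
} ToyExpr;

/* A condition is pushable only if every node in its tree is acceptable. */
bool
toy_is_pushable(const ToyExpr *expr)
{
    size_t      i;

    assert(expr != NULL);
    switch (expr->tag)
    {
        case TOY_CONST:
        case TOY_VAR:
        case TOY_PARAM_EXTERN:
        case TOY_BOOL_EXPR:
        case TOY_NULL_TEST:
            break;
        case TOY_OP_EXPR:
        case TOY_FUNC_EXPR:
            if (!expr->immutable)
                return false;
            break;
        default:
            return false;
    }
    for (i = 0; i < TOY_MAX_ARGS; i++)
    {
        if (expr->args[i] != NULL && !toy_is_pushable(expr->args[i]))
            return false;
    }
    return true;
}

/* Count pushable conditions; the rest must be evaluated locally. */
size_t
toy_count_pushable(const ToyExpr *const *conds, size_t nconds)
{
    size_t      count = 0;
    size_t      i;

    for (i = 0; i < nconds; i++)
    {
        if (toy_is_pushable(conds[i]))
            count++;
    }
    return count;
}
```

Note that one unacceptable node anywhere in the tree disqualifies the whole condition: in this sketch, "A AND B" is kept local if B calls a volatile function, even though A alone would be pushable.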

=== Retrieving result tuples ===
This FDW switches the method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for using a cursor and the number of rows fetched in one FETCH call, are configurable via FDW options of the SERVER and/or the FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.

We must ensure that the PGresult is released explicitly in every case, because libpq uses malloc rather than palloc. Copying results into a Tuplestorestate, as contrib/dblink does, is one solution, but it needs extra memory during the copy and adds some overhead. Another solution is to register a cleanup function with the resource owner and release the PGresult in that cleanup function. This method is already used to close libpq connections.

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> disconnects all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions into some fields in pg_class, ex. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some of the FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions in a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine the USER MAPPING?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. Users would not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: To have foreign tables support inheritance, tuples from a foreign table should supply tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if an FDW routine Analyze(tableoid or tablename), which returns pg_statistic records for the foreign table, were added.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A generic option like '''cost_factor''' would allow the overhead to be communicated to the planner.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only some of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be decreased by applying an internal parameter to the external query.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' fetching a tuple from a relation.

; Optimize parameter
: Some foreign scans use only some of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL would be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-12-19T02:02:27Z

Hanada: /* Active Work In Progress */ update active works

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

Check out the list of all the [[foreign data wrappers]]

= Active Work In Progress =
== Add pgsql_fdw as a contrib module ==
[http://commitfest.postgresql.org/action/patch_view?id=667 "pgsql_fdw contrib module"] is under proposal at CF 2011-11. The goal of this proposal is to add pgsql_fdw as a contrib module.
== Smart planning ==
* We might have statistics of external data. The ANALYZE command would need a hook to delegate row sampling to each FDW. For this purpose, [http://commitfest.postgresql.org/action/patch_view?id=661 "Collecting statistics on foreign tables"] is under proposal at CF 2011-11. This proposal provides a handler function which allows FDWs to handle ANALYZE commands executed on foreign tables. In addition, contrib/file_fdw is enhanced to get sample rows from the actual data file and calculate statistics by using existing routines in core.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Doing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server which the connection uses.
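
The lookup-or-connect behaviour can be sketched with a small fixed-size cache keyed by server name. This is an illustration only, not the removed implementation: ToyConnCache, toy_get_connection() and toy_fake_connect() are hypothetical, the callback stands in for FdwRoutine.ConnectServer(), and a connection is reduced to an integer handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_CACHE_SIZE 8
#define TOY_NAME_LEN   64

/* Hypothetical cached connection, identified by the server name. */
typedef struct ToyConnection
{
    bool        valid;
    char        name[TOY_NAME_LEN];
    int         handle;
} ToyConnection;

typedef struct ToyConnCache
{
    ToyConnection entries[TOY_CACHE_SIZE];
    int         connect_calls;  /* how often a real connect was needed */
} ToyConnCache;

/* Stand-in for FdwRoutine.ConnectServer(). */
typedef int (*ToyConnectFn) (const char *srvname);

void
toy_cache_init(ToyConnCache *cache)
{
    memset(cache, 0, sizeof(*cache));
}

/* Fake connect routine used for demonstration; returns a dummy handle. */
int
toy_fake_connect(const char *srvname)
{
    return (int) strlen(srvname);
}

/* Return the cached connection for srvname, connecting on a cache miss. */
ToyConnection *
toy_get_connection(ToyConnCache *cache, const char *srvname,
                   ToyConnectFn connect_server)
{
    ToyConnection *free_slot = NULL;
    size_t      i;

    assert(cache != NULL && srvname != NULL && connect_server != NULL);
    if (strlen(srvname) >= TOY_NAME_LEN)
        return NULL;            /* name does not fit in an entry */
    for (i = 0; i < TOY_CACHE_SIZE; i++)
    {
        ToyConnection *entry = &cache->entries[i];

        if (entry->valid && strcmp(entry->name, srvname) == 0)
            return entry;       /* cache hit: reuse the connection */
        if (!entry->valid && free_slot == NULL)
            free_slot = entry;
    }
    if (free_slot == NULL)
        return NULL;            /* cache full; eviction is not modeled */
    free_slot->handle = connect_server(srvname);
    strcpy(free_slot->name, srvname);
    free_slot->valid = true;
    cache->connect_calls++;
    return free_slot;
}
```

Two scans against the same server share one entry, so the connect routine runs once per server rather than once per scan.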

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which is used to specify the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
    FSConnection *(*ConnectServer)(ForeignServer *server, UserMapping *user);
    void        (*FreeFSConnection)(FSConnection *conn);
    void        (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
    void        (*BeginScan)(ForeignScanState *scanstate);
    void        (*Open)(ForeignScanState *scanstate);
    void        (*Iterate)(ForeignScanState *scanstate);
    void        (*Close)(ForeignScanState *scanstate);
    void        (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, like the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
    NodeTag     type;           /* FdwPlan needs copyObject() support for plan
                                 * caching */
    char       *explainInfo;    /* FDW-specific info shown in EXPLAIN VERBOSE */
    double      startup_cost;   /* Optimizer needs costs for each path */
    double      total_cost;
    List       *private;        /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
    void       *private;        /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
    FdwPlan    *(*PlanNative)(Oid serverid, char *query);
    FdwPlan    *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
    FdwPlan    *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
                               RelOptInfo *baserel);
    FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
    void        (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
    void        (*ReScan)(FdwExecutionState *state);
    void        (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
                                              PlannerInfo *root,
                                              RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
                                             struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
                                           int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
    NodeTag     type;

    PlanForeignScan_function PlanForeignScan;
    ExplainForeignScan_function ExplainForeignScan;
    BeginForeignScan_function BeginForeignScan;
    IterateForeignScan_function IterateForeignScan;
    ReScanForeignScan_function ReScanForeignScan;
    EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns an FDW routine set. A new pseudo type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This specification still supports using a foreign-data wrapper merely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server ID and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so the FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way each FDW prefers.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, FDWs can update their statistics in their own respective ways.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
    Scan        scan;
    bool        fsSystemCol;
    struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with FdwPlan which was stored in ForeignScan to initiate foreign query if the execution was not for EXPLAIN, and receive FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
    ScanState   ss;
    struct FdwRoutine *fdwroutine;
    void       *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data-wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.

== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for a column, and the key/value pairs are stored in attfdwoptions of pg_attribute.

Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin is not supported, although it is allowed in COPY FROM.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should probably not be allowed to create or alter foreign tables which use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw recognizes the same file formats as the COPY command, because it uses the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. The options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it is a legacy feature.

Unlike COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This wrapper can be used to connect to external PostgreSQL servers.
It might be integrated with [http://www.postgresql.org/docs/9.1/static/dblink.html contrib/dblink] to share code and connections.
dblink will be installed optionally, like the standard contrib modules.

=== Connection options ===
The connection options are constructed from the FDW options of the foreign-data wrapper, the foreign server and the user mapping. Only connection options are picked out, because FDW options might include non-connection options such as relname and nspname.
Note that, for security reasons, a non-superuser MUST specify a password in the FDW options, and the foreign server must require password authentication.

In the current implementation, for security reasons, FDW options of user mappings are visible only to users who have the SUPERUSER privilege or the USAGE privilege on the relevant SERVER.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== Cost estimation ===
ANALYZE for foreign tables is not supported in 9.0, so we can't store statistics in the local PostgreSQL server. One workaround is to get the EXPLAIN result from the remote server and use its cost values for local planning.

=== SELECT-clause optimization ===
Currently the SELECT clause is constructed as "SELECT col1, col2, col3, ...". Columns that are not used at all in the original query are replaced with NULL. For example, if col2 is unused, the SELECT clause will be "SELECT col1, NULL, col3, ...". The main purpose of this optimization is to reduce the amount of data transferred from the remote server.

=== WHERE-clause push-down ===
WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.

A condition can be pushed down only if it consists solely of the following node types. To decide this, we check each element of the RelOptInfo.baserestrictinfo list. For conditions which can't be pushed down, the remote server returns rows without applying them, and the local server evaluates the conditions and discards rows which don't satisfy them.

{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.

=== Retrieving result tuples ===
This FDW chooses the method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for using a cursor and the number of rows fetched in one FETCH call, are configurable via FDW options of the SERVER and/or the FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.

We must ensure that the PGresult is released explicitly in every case, because libpq uses malloc rather than palloc. Copying results into a Tuplestorestate is one solution, which is used in contrib/dblink, but it needs extra memory during the copy and adds some overhead. Another solution is to register a cleanup function with the resource owner and release the PGresult in that cleanup function. This method is already used to close libpq connections.

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage individual foreign connections. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some of the FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking a foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** syntax below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table must supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if we add an FDW routine Analyze(tableoid or tablename) which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A generic option like '''cost_factor''' would allow telling the planner about the overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' fetching a tuple from a relation.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve a part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-11-24T07:06:21Z

Hanada: /* Active Work In Progress */ 9.2 supports per-column FDW options

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

Check out the list of all the [[foreign data wrappers]]

= Active Work In Progress =
== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might keep statistics of external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server it connects to.
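
The lookup described above could be sketched in C roughly as follows. This is only an illustration, not the code that was removed: ToyConn, ConnCache, get_connection() and toy_connect() are hypothetical names, the fixed-size array stands in for whatever structure a real implementation would use, and toy_connect() plays the role of FdwRoutine.ConnectServer().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_CONNS 8
#define MAX_NAME_LEN 64

/* Hypothetical stand-in for FSConnection. */
typedef struct ToyConn
{
    int id;
} ToyConn;

/* Plays the role of FdwRoutine.ConnectServer() (assumed signature). */
typedef ToyConn *(*ConnectFn) (const char *srvname);

typedef struct ConnCache
{
    struct
    {
        char     name[MAX_NAME_LEN];    /* connection name == server name */
        ToyConn *conn;
    } entries[MAX_CACHED_CONNS];
    int nentries;
} ConnCache;

/* Return the cached connection for srvname, connecting on a cache miss. */
static ToyConn *
get_connection(ConnCache *cache, const char *srvname, ConnectFn connect_server)
{
    ToyConn *conn;
    int      i;

    assert(cache != NULL && srvname != NULL && connect_server != NULL);

    for (i = 0; i < cache->nentries; i++)
        if (strcmp(cache->entries[i].name, srvname) == 0)
            return cache->entries[i].conn;      /* cache hit: reuse */

    if (cache->nentries >= MAX_CACHED_CONNS || strlen(srvname) >= MAX_NAME_LEN)
        return NULL;                            /* cache full or name too long */

    conn = connect_server(srvname);
    if (conn == NULL)
        return NULL;                            /* connection failed */

    strcpy(cache->entries[cache->nentries].name, srvname);
    cache->entries[cache->nentries].conn = conn;
    cache->nentries++;
    return conn;
}

/* Toy connector that counts how often a real connection would be opened. */
static int     toy_connect_calls = 0;
static ToyConn toy_conns[MAX_CACHED_CONNS];

static ToyConn *
toy_connect(const char *srvname)
{
    ToyConn *conn;

    (void) srvname;
    if (toy_connect_calls >= MAX_CACHED_CONNS)
        return NULL;
    conn = &toy_conns[toy_connect_calls];
    conn->id = ++toy_connect_calls;
    return conn;
}
```

Two scans of tables on the same server would share one entry, so toy_connect() is called only once for them; a scan on another server adds a second entry.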

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which is used to specify the handler function to be used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options were omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a routine set specific to PostgreSQL and the C language would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as JOINs of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in some points:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so an FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way the FDW prefers.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to initiate the foreign query, unless the execution is for EXPLAIN, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:MarkPos() and RestrPos() are currently not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.
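
The call sequence above (BeginScan, repeated Iterate until the slot is empty, optional ReScan, then EndScan) can be illustrated with a toy wrapper. This is a simplified sketch, not the PostgreSQL API: ToyScanState, ToyFdwRoutine and drain_scan() are hypothetical, an in-memory int array stands in for the foreign table, and a bool return value stands in for an empty TupleTableSlot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scan state: an array stands in for the foreign table. */
typedef struct ToyScanState
{
    const int *rows;
    int        nrows;
    int        next;        /* index of the next row to return */
} ToyScanState;

/* Simplified routine table, loosely modelled on FdwRoutine. */
typedef struct ToyFdwRoutine
{
    void (*BeginScan) (ToyScanState *state, const int *rows, int nrows);
    bool (*Iterate) (ToyScanState *state, int *slot);  /* false = empty slot */
    void (*ReScan) (ToyScanState *state);
    void (*EndScan) (ToyScanState *state);
} ToyFdwRoutine;

static void
toy_begin(ToyScanState *state, const int *rows, int nrows)
{
    state->rows = rows;
    state->nrows = nrows;
    state->next = 0;
}

static bool
toy_iterate(ToyScanState *state, int *slot)
{
    if (state->next >= state->nrows)
        return false;               /* end of scan: slot stays empty */
    *slot = state->rows[state->next++];
    return true;
}

static void
toy_rescan(ToyScanState *state)
{
    state->next = 0;                /* restart from the first row */
}

static void
toy_end(ToyScanState *state)
{
    state->rows = NULL;
    state->nrows = 0;
    state->next = 0;
}

static const ToyFdwRoutine toy_fdw = {toy_begin, toy_iterate, toy_rescan, toy_end};

/* Executor-side loop: pull rows until Iterate() reports an empty slot. */
static long
drain_scan(const ToyFdwRoutine *fdw, ToyScanState *state)
{
    long sum = 0;
    int  value;

    assert(fdw != NULL && state != NULL);
    while (fdw->Iterate(state, &value))
        sum += value;
    return sum;
}
```

Once the scan is exhausted, further Iterate() calls keep reporting an empty slot until ReScan() resets the position, which is the behaviour a nested loop would rely on when it rescans its inner side.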

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan node, and FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.

== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. CREATE/ALTER FOREIGN TABLE accepts an OPTIONS clause for a column, and the key/value pairs are stored in attfdwoptions of pg_attribute.

Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin is not supported, although COPY FROM allows it.

Because the FDW reads files on the server side, security must be considered. Non-superusers should probably not be allowed, at least by default, to create or alter foreign tables that use file_fdw.

=== using COPY FROM routines ===
file_fdw recognizes the same file formats as the COPY command, because it uses the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. The options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it is a legacy feature.

Unlike COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This wrapper can be used to connect to external PostgreSQL servers.
It might be integrated with [http://www.postgresql.org/docs/9.1/static/dblink.html contrib/dblink] to share code and connections.
dblink will be installed optionally, like the standard contrib modules.

=== Connection options ===
The connection options are constructed from the FDW options of the foreign-data wrapper, the foreign server and the user mapping. Only connection options are picked out, because FDW options might include non-connection options such as relname and nspname.
Note that, for security reasons, a non-superuser MUST specify a password in the FDW options, and the foreign server must require password authentication.
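
Picking the connection options out of the merged FDW options might look like the following C sketch. ToyOption, is_conn_option() and filter_conn_options() are hypothetical names, and the hard-coded keyword list is an assumption made for brevity; a real wrapper could instead obtain the set of valid keywords from libpq, for example via PQconndefaults().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value pair taken from wrapper, server or user mapping. */
typedef struct ToyOption
{
    const char *keyword;
    const char *value;
} ToyOption;

/* True if keyword is a libpq connection keyword (abbreviated list). */
static bool
is_conn_option(const char *keyword)
{
    static const char *const conn_keywords[] = {
        "host", "hostaddr", "port", "dbname", "user", "password",
        "connect_timeout", "sslmode"
    };
    size_t i;

    for (i = 0; i < sizeof(conn_keywords) / sizeof(conn_keywords[0]); i++)
        if (strcmp(keyword, conn_keywords[i]) == 0)
            return true;
    return false;
}

/*
 * Copy only connection options from in[] to out[], preserving order.
 * Returns the number of options copied (at most maxout).
 */
static int
filter_conn_options(const ToyOption *in, int nin, ToyOption *out, int maxout)
{
    int i;
    int nout = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < nin && nout < maxout; i++)
        if (is_conn_option(in[i].keyword))
            out[nout++] = in[i];
    return nout;
}
```

Options such as relname and nspname are skipped here, so they never reach the connection string.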

In the current implementation, for security reasons, FDW options of user mappings are visible only to users who have the SUPERUSER privilege or the USAGE privilege on the relevant SERVER.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== Cost estimation ===
ANALYZE for foreign tables is not supported in 9.0, so we can't store statistics in the local PostgreSQL server. One workaround is to get the EXPLAIN result from the remote server and use its cost values for local planning.

=== SELECT-clause optimization ===
Currently the SELECT clause is constructed as "SELECT col1, col2, col3, ...". Columns that are not used at all in the original query are replaced with NULL. For example, if col2 is unused, the SELECT clause will be "SELECT col1, NULL, col3, ...". The main purpose of this optimization is to reduce the amount of data transferred from the remote server.
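
As a rough illustration of this rewriting, the following C sketch builds the target list from a list of column names and a flag per column saying whether the local query uses it. build_select_list() and its arguments are hypothetical; the real FDW derives this information from the plan rather than from a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "SELECT c1, c2, ..." into buf, replacing every unused column with
 * NULL so that its value is not shipped from the remote server.
 * Returns the length of the string, or -1 if buf is too small.
 */
static int
build_select_list(char *buf, size_t buflen,
                  const char *const *colnames, const bool *used, int ncols)
{
    size_t len;
    int    n;
    int    i;

    assert(buf != NULL && buflen > 0);

    n = snprintf(buf, buflen, "SELECT ");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    len = (size_t) n;

    for (i = 0; i < ncols; i++)
    {
        n = snprintf(buf + len, buflen - len, "%s%s",
                     i > 0 ? ", " : "",
                     used[i] ? colnames[i] : "NULL");
        if (n < 0 || (size_t) n >= buflen - len)
            return -1;              /* output was truncated */
        len += (size_t) n;
    }
    return (int) len;
}
```

Keeping a NULL placeholder instead of dropping the column means the positions in the result still match the foreign table's column order, so the local side needs no remapping.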

=== WHERE-clause push-down ===
WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.

A condition can be pushed down only if it consists solely of the following node types. To decide this, we check each element of the RelOptInfo.baserestrictinfo list. For conditions which can't be pushed down, the remote server returns rows without applying them, and the local server evaluates the conditions and discards rows which don't satisfy them.

{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}
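
The check implied by the table can be sketched as a recursive walk over the expression tree. The types below are deliberately simplified stand-ins, not PostgreSQL's node structures: ToyExpr, ToyTag, ToyVolatility and is_pushable() are hypothetical, each node has at most two children, and only a subset of the node types from the table is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum ToyTag
{
    N_CONST,            /* constant value */
    N_VAR,              /* table column reference */
    N_PARAM_EXTERN,     /* external parameter */
    N_PARAM_EXEC,       /* internal parameter: not pushable */
    N_BOOL,             /* AND / OR / NOT */
    N_NULLTEST,         /* IS [NOT] NULL */
    N_OP,               /* operator */
    N_FUNC              /* function call */
} ToyTag;

typedef enum ToyVolatility
{
    VOL_IMMUTABLE,
    VOL_STABLE,
    VOL_VOLATILE
} ToyVolatility;

typedef struct ToyExpr
{
    ToyTag                tag;
    ToyVolatility         volatility;   /* meaningful for N_OP and N_FUNC only */
    const struct ToyExpr *args[2];      /* unused children are NULL */
} ToyExpr;

/* True if the whole expression tree may be sent to the remote server. */
static bool
is_pushable(const ToyExpr *expr)
{
    if (expr == NULL)
        return true;                    /* missing child */

    switch (expr->tag)
    {
        case N_CONST:
        case N_VAR:
        case N_PARAM_EXTERN:
        case N_BOOL:
        case N_NULLTEST:
            break;
        case N_OP:
        case N_FUNC:
            /* the underlying function must be IMMUTABLE */
            if (expr->volatility != VOL_IMMUTABLE)
                return false;
            break;
        case N_PARAM_EXEC:
            return false;
    }
    /* every sub-expression must be pushable as well */
    return is_pushable(expr->args[0]) && is_pushable(expr->args[1]);
}
```

A single non-immutable function anywhere in the tree makes the whole condition non-pushable, which is why a clause such as "col < now()" has to be evaluated locally.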

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.

=== Retrieving result tuples ===
This FDW chooses the method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for using a cursor and the number of rows fetched in one FETCH call, are configurable via FDW options of the SERVER and/or the FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.
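
The decision and the override rule can be summarized in a small C sketch. The struct, the field names and the built-in defaults used in the test are assumptions for illustration only; the page does not name the actual options. A negative value means that the option was not specified on that object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option set; -1 means "not specified on this object". */
typedef struct FetchOptions
{
    int min_cursor_rows;    /* minimum estimated rows for using a cursor */
    int fetch_count;        /* rows fetched per FETCH call */
} FetchOptions;

/* FOREIGN TABLE setting wins over SERVER setting, which wins over default. */
static int
pick_option(int table_value, int server_value, int default_value)
{
    if (table_value >= 0)
        return table_value;
    if (server_value >= 0)
        return server_value;
    return default_value;
}

static FetchOptions
resolve_fetch_options(FetchOptions defaults, FetchOptions server,
                      FetchOptions table)
{
    FetchOptions result;

    result.min_cursor_rows = pick_option(table.min_cursor_rows,
                                         server.min_cursor_rows,
                                         defaults.min_cursor_rows);
    result.fetch_count = pick_option(table.fetch_count,
                                     server.fetch_count,
                                     defaults.fetch_count);
    return result;
}

/* Simple SELECT below the threshold, SQL-level cursor otherwise. */
static bool
use_cursor(double estimated_rows, const FetchOptions *opts)
{
    assert(opts != NULL);
    return estimated_rows >= (double) opts->min_cursor_rows;
}
```

With a threshold taken from the SERVER and a fetch count taken from the FOREIGN TABLE, both values end up in the resolved set, and only estimates at or above the threshold select the cursor path.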

We must ensure that the PGresult is released explicitly in every case, because libpq uses malloc rather than palloc. Copying results into a Tuplestorestate is one solution, which is used in contrib/dblink, but it needs extra memory during the copy and adds some overhead. Another solution is to register a cleanup function with the resource owner and release the PGresult in that cleanup function. This method is already used to close libpq connections.

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage individual foreign connections. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some of the FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine the USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must consider recursive locking via table inheritance.
: '''In 9.1, locking a foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table must supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if an FDW routine such as Analyze(tableoid or tablename), returning pg_statistic records for the foreign table, were added.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A generic option such as '''cost_factor''' would allow the overhead to be communicated to the planner.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN locally and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unneeded columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' fetching a tuple from a relation.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-11-24T07:03:45Z

Hanada: /* Per-column FDW option */ move description about pg_attribute from "Active work"

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in the database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

Check out the list of all the [[foreign data wrappers]]

= Active Work In Progress =
=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, each connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might keep statistics on external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
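
The lookup described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the removed implementation: the names ConnectionCache, CachedConnection, get_connection() and counting_connect() are hypothetical, a fixed-size array stands in for whatever hash table a backend would use, and an int stands in for the FDW's connection object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_CONNECTIONS 8

/* Hypothetical cache entry: the connection is named after its server. */
typedef struct CachedConnection
{
    char name[64];
    int  handle;            /* stand-in for an FDW-specific connection */
} CachedConnection;

typedef struct ConnectionCache
{
    CachedConnection entries[MAX_CACHED_CONNECTIONS];
    int              nentries;
} ConnectionCache;

/* Stand-in for FdwRoutine.ConnectServer(). */
typedef int (*ConnectServerFn)(const char *srvname);

/* Demo connector that only counts how often it is called. */
int connect_calls = 0;

int
counting_connect(const char *srvname)
{
    (void) srvname;
    return ++connect_calls;
}

/*
 * Return the cached connection for srvname, connecting only on a cache
 * miss. Returns NULL if the cache is full or the name is too long.
 */
CachedConnection *
get_connection(ConnectionCache *cache, const char *srvname,
               ConnectServerFn connect_server)
{
    CachedConnection *conn;

    assert(cache != NULL && srvname != NULL && connect_server != NULL);

    for (int i = 0; i < cache->nentries; i++)
    {
        if (strcmp(cache->entries[i].name, srvname) == 0)
            return &cache->entries[i];      /* reuse */
    }

    if (cache->nentries >= MAX_CACHED_CONNECTIONS ||
        strlen(srvname) >= sizeof(cache->entries[0].name))
        return NULL;

    conn = &cache->entries[cache->nentries++];
    strcpy(conn->name, srvname);
    conn->handle = connect_server(srvname);
    return conn;
}
```

Two scans against the same server share one entry, so the connector runs once per server name for the lifetime of the cache.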

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific, C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. It would be harder with other approaches, such as the AM (access method) or storage manager (smgr) layers, to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook that allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan is passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This still allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;
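
How a caller might resolve the fdwhandler column to a routine set can be sketched as below. This is a toy model under stated assumptions: FdwRoutineSketch, proc_table, demo_fdw_handler() and get_fdw_routine() are hypothetical names, the one-entry array stands in for pg_proc, and 16384 is an arbitrary example OID. The real backend would go through the function manager instead.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Reduced routine set: a single callback is enough for the sketch. */
typedef struct FdwRoutineSketch
{
    int (*IterateForeignScan)(void);
} FdwRoutineSketch;

/* A handler takes no arguments and returns the routine set. */
typedef const FdwRoutineSketch *(*FdwHandler)(void);

static int
demo_iterate(void)
{
    return 42;
}

static const FdwRoutineSketch demo_routine = { demo_iterate };

static const FdwRoutineSketch *
demo_fdw_handler(void)
{
    return &demo_routine;
}

/* Toy stand-in for pg_proc: maps a function OID to a C handler. */
typedef struct ProcEntry
{
    Oid        oid;
    FdwHandler fn;
} ProcEntry;

static const ProcEntry proc_table[] = {
    { 16384, demo_fdw_handler }
};

/*
 * Return the routine set for a wrapper, or NULL when the wrapper has no
 * handler (InvalidOid) or the OID is unknown. A NULL result means the
 * wrapper cannot back a foreign table.
 */
const FdwRoutineSketch *
get_fdw_routine(Oid fdwhandler)
{
    if (fdwhandler == InvalidOid)
        return NULL;

    for (size_t i = 0; i < sizeof(proc_table) / sizeof(proc_table[0]); i++)
    {
        if (proc_table[i].oid == fdwhandler)
            return proc_table[i].fn();
    }
    return NULL;
}
```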

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner is responsible for finding the best access path, so the FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way each FDW chooses.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. In this step, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should gain another function that is called from ANALYZE. With such a function, each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to initiate the foreign query, unless the execution is for EXPLAIN, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass foreign-data-wrapper-specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store it in ForeignScanState.fdw_state->private.
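
The call sequence and the use of the private area can be sketched with a toy wrapper that returns the integers 0 to 2. This is a simplified illustration, not executor code: the Sketch-suffixed types, the counter_* callbacks, drain_scan() and run_counter_scan() are hypothetical, an int replaces TupleTableSlot, a false return replaces an empty slot, and the private structure hangs directly off fdw_state rather than off a separate FdwExecutionState.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ForeignScanStateSketch ForeignScanStateSketch;

typedef struct FdwRoutineSketch
{
    void (*BeginScan)(ForeignScanStateSketch *node);
    bool (*Iterate)(ForeignScanStateSketch *node, int *slot);
    void (*ReScan)(ForeignScanStateSketch *node);
    void (*EndScan)(ForeignScanStateSketch *node);
} FdwRoutineSketch;

struct ForeignScanStateSketch
{
    const FdwRoutineSketch *fdwroutine;
    void                   *fdw_state;   /* FDW-private data */
};

/* Private state of the toy wrapper. */
typedef struct CounterState
{
    int next;
    int limit;
} CounterState;

static void
counter_begin(ForeignScanStateSketch *node)
{
    CounterState *st = malloc(sizeof(*st));

    assert(st != NULL);
    st->next = 0;
    st->limit = 3;
    node->fdw_state = st;
}

static bool
counter_iterate(ForeignScanStateSketch *node, int *slot)
{
    CounterState *st = node->fdw_state;

    if (st->next >= st->limit)
        return false;           /* end of scan */
    *slot = st->next++;
    return true;
}

static void
counter_rescan(ForeignScanStateSketch *node)
{
    ((CounterState *) node->fdw_state)->next = 0;
}

static void
counter_end(ForeignScanStateSketch *node)
{
    free(node->fdw_state);
    node->fdw_state = NULL;
}

const FdwRoutineSketch counter_fdw = {
    counter_begin, counter_iterate, counter_rescan, counter_end
};

/* Call Iterate() until the scan is exhausted; return the tuple count. */
int
drain_scan(ForeignScanStateSketch *node)
{
    int slot;
    int count = 0;

    while (node->fdwroutine->Iterate(node, &slot))
        count++;
    return count;
}

/* Begin, scan, rescan, scan again, end; return total tuples seen. */
int
run_counter_scan(void)
{
    ForeignScanStateSketch node = { &counter_fdw, NULL };
    int total;

    node.fdwroutine->BeginScan(&node);
    total = drain_scan(&node);
    node.fdwroutine->ReScan(&node);
    total += drain_scan(&node);
    node.fdwroutine->EndScan(&node);
    return total;
}
```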

== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for a column, and the key/value pairs are stored in attfdwoptions of pg_attribute.

Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, security must be considered. Non-superusers should probably not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw recognizes the same file formats as the COPY command by using the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. The options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it is a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It might be integrated with [http://www.postgresql.org/docs/9.1/static/dblink.html contrib/dblink] to share code and connections.
dblink will be installed optionally, like a standard contrib module.

=== Connection options ===
The connection options are assembled from the FDW options of the foreign-data wrapper, foreign server and user mapping. Only connection options are picked, because FDW options might include non-connection options such as relname and nspname.
Note that, for security reasons, a non-superuser MUST specify a password in the FDW options, and the foreign server MUST require password authentication.

In the current implementation, for security reasons, FDW options of user mappings are visible only to users who have the SUPERUSER privilege or the USAGE privilege on the relevant SERVER.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as when 'autocommit' is set to 'on'.

=== Cost estimation ===
ANALYZE for foreign tables is not supported in 9.0, so we can't store statistics in the local PostgreSQL server. One workaround is to get the EXPLAIN result from the remote server and use its cost values for local planning.

=== SELECT-clause optimization ===
Currently the SELECT clause is constructed as "SELECT col1, col2, col3, ...". If some columns are not used at all in the original query, they are replaced with NULL as an optimization. For example, if col2 is unused, the SELECT clause will be "SELECT col1, NULL, col3, ...". The main purpose of this optimization is to reduce the amount of data transferred from the remote server.
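
The NULL substitution can be sketched as a small string builder. This is a sketch under stated assumptions rather than the wrapper's deparse code: build_select_clause() and its parameters are hypothetical, the used[] flags are assumed to have been computed from the original query beforehand, and identifier quoting is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "SELECT ... FROM relname" into buf. Columns whose used[] flag is
 * false are emitted as NULL so that the remote server does not send them,
 * while the column positions stay the same.
 * Returns the length written, or -1 if buf is too small.
 */
int
build_select_clause(char *buf, size_t buflen,
                    const char *relname,
                    const char *const *colnames,
                    const bool *used, int ncols)
{
    size_t off;
    int    n;

    assert(buf != NULL && relname != NULL && ncols > 0);

    n = snprintf(buf, buflen, "SELECT ");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    off = (size_t) n;

    for (int i = 0; i < ncols; i++)
    {
        n = snprintf(buf + off, buflen - off, "%s%s",
                     i > 0 ? ", " : "",
                     used[i] ? colnames[i] : "NULL");
        if (n < 0 || (size_t) n >= buflen - off)
            return -1;
        off += (size_t) n;
    }

    n = snprintf(buf + off, buflen - off, " FROM %s", relname);
    if (n < 0 || (size_t) n >= buflen - off)
        return -1;
    off += (size_t) n;

    return (int) off;
}
```

Keeping a NULL placeholder instead of dropping the column means the result keeps the same column count and order as the foreign table definition, so the local side can map result columns by position.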

=== WHERE-clause push-down ===
WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.

To be pushed down, a condition must consist only of the following node types. To check this, we examine each element in the RelOptInfo.baserestrictinfo list. If some conditions can't be pushed down, the remote server returns rows without applying them, and the local server evaluates those conditions and discards rows that don't satisfy them.

{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

The foreign query never contains ORDER BY, LIMIT, OFFSET, GROUP BY, or HAVING clauses.
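
The node-type rules in the table can be sketched as a recursive check over an expression tree. The types here are simplified stand-ins, not the backend's node structures: NodeKind, ExprNode and is_pushable() are hypothetical, and the volatility of the underlying function is assumed to have been looked up already and stored in func_is_immutable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum NodeKind
{
    NK_CONST,
    NK_VAR,
    NK_ARRAY,
    NK_PARAM,
    NK_BOOL_EXPR,
    NK_NULL_TEST,
    NK_OP_EXPR,
    NK_DISTINCT_EXPR,
    NK_SCALAR_ARRAY_OP_EXPR,
    NK_FUNC_EXPR,
    NK_OTHER                    /* any node type not in the table */
} NodeKind;

typedef struct ExprNode
{
    NodeKind kind;
    bool     param_is_external;  /* meaningful for NK_PARAM only */
    bool     func_is_immutable;  /* operator's function or called function */
    const struct ExprNode *const *args;
    int      nargs;
} ExprNode;

/*
 * A condition can be pushed down only if every node in it is one of the
 * listed types, parameters are external, and every function involved
 * (directly or through an operator) is immutable.
 */
bool
is_pushable(const ExprNode *node)
{
    if (node == NULL)
        return false;

    switch (node->kind)
    {
        case NK_CONST:
        case NK_VAR:
        case NK_ARRAY:
        case NK_BOOL_EXPR:
        case NK_NULL_TEST:
            break;
        case NK_PARAM:
            if (!node->param_is_external)
                return false;
            break;
        case NK_OP_EXPR:
        case NK_DISTINCT_EXPR:
        case NK_SCALAR_ARRAY_OP_EXPR:
        case NK_FUNC_EXPR:
            if (!node->func_is_immutable)
                return false;
            break;
        default:
            return false;
    }

    for (int i = 0; i < node->nargs; i++)
    {
        if (!is_pushable(node->args[i]))
            return false;
    }
    return true;
}
```

A condition rejected by such a check stays on the local side, which matches the fallback described above: the remote server sends the rows unfiltered and the local server applies the condition.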

=== Retrieving result tuples ===
This FDW switches its method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for which a cursor is used and the number of rows fetched in one FETCH call, are configurable via FDW options of the SERVER and/or the FOREIGN TABLE. If an option is specified on both objects, the FOREIGN TABLE setting overrides the SERVER setting.
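
The option precedence and the threshold test can be sketched as follows. The option names min_cursor_rows and fetch_count, the built-in defaults of 1000 and 100, and all type and function names are assumptions made for this illustration; the page does not name the actual options or their defaults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum FetchMethod
{
    FETCH_SIMPLE_SELECT,
    FETCH_CURSOR
} FetchMethod;

/* Raw option values as text; NULL means the option was not specified. */
typedef struct FetchOptions
{
    const char *min_cursor_rows;
    const char *fetch_count;
} FetchOptions;

typedef struct FetchSettings
{
    double min_cursor_rows;
    long   fetch_count;
} FetchSettings;

/* The FOREIGN TABLE value wins over the SERVER value, then the default. */
static const char *
pick_option(const char *server_value, const char *table_value,
            const char *fallback)
{
    if (table_value != NULL)
        return table_value;
    if (server_value != NULL)
        return server_value;
    return fallback;
}

FetchSettings
resolve_fetch_settings(const FetchOptions *server, const FetchOptions *table)
{
    FetchSettings s;

    assert(server != NULL && table != NULL);

    s.min_cursor_rows = strtod(pick_option(server->min_cursor_rows,
                                           table->min_cursor_rows,
                                           "1000"), NULL);
    s.fetch_count = strtol(pick_option(server->fetch_count,
                                       table->fetch_count,
                                       "100"), NULL, 10);
    return s;
}

/* Below the threshold: one simple SELECT. Otherwise: a cursor. */
FetchMethod
choose_fetch_method(double estimated_rows, const FetchSettings *s)
{
    return estimated_rows < s->min_cursor_rows
        ? FETCH_SIMPLE_SELECT
        : FETCH_CURSOR;
}
```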

We must ensure that each PGresult is released explicitly in every case, because libpq allocates it with malloc rather than palloc. Copying results into a Tuplestorestate, as contrib/dblink does, is one solution, but it needs extra memory during the copy and adds some overhead. Another solution is to register a cleanup function with the resource owner and release the PGresult in that function. This method is already used to close libpq connections.

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should the foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection individually. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some of the FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine the USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must consider recursive locking via table inheritance.
: '''In 9.1, locking a foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table must supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if an FDW routine such as Analyze(tableoid or tablename), returning pg_statistic records for the foreign table, were added.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A generic option such as '''cost_factor''' would allow the overhead to be communicated to the planner.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN locally and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unneeded columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' fetching a tuple from a relation.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-22T07:46:42Z

Hanada: /* PostgreSQL */ use remote costs for local planning

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. This can be used like the [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might have statistics of external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}
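
The cache lookup described in this section can be sketched as follows. This is only an illustration under stated assumptions: the fixed-size array, the simplified <code>FSConnection</code> struct, the <code>get_connection()</code> helper and the <code>fake_connect()</code> callback (standing in for FdwRoutine.ConnectServer()) are all hypothetical and are not taken from any patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_CONNS 8

/* Hypothetical stand-in for an FDW-specific connection object. */
typedef struct FSConnection
{
    char server_name[64];   /* connection name == foreign server name */
    int  in_use;            /* is this cache slot occupied? */
} FSConnection;

/* Hypothetical callback playing the role of FdwRoutine.ConnectServer(). */
typedef int (*ConnectServer_fn)(const char *server_name);

/* Per-backend cache; lives as long as the process does. */
static FSConnection conn_cache[MAX_CACHED_CONNS];

/* Counts how many times a real connection had to be established. */
static int connect_calls = 0;

/* Toy connector that always succeeds (returns 1). */
static int
fake_connect(const char *server_name)
{
    (void) server_name;
    connect_calls++;
    return 1;
}

/*
 * Return the cached connection for server_name, connecting on a cache miss.
 * Returns NULL if the connection attempt fails or the cache is full.
 * Names longer than 63 bytes are truncated and would never match again;
 * this sketch assumes names fit in the buffer.
 */
static FSConnection *
get_connection(const char *server_name, ConnectServer_fn connect_fn)
{
    FSConnection *free_slot = NULL;
    int i;

    assert(server_name != NULL && connect_fn != NULL);

    for (i = 0; i < MAX_CACHED_CONNS; i++)
    {
        if (conn_cache[i].in_use &&
            strcmp(conn_cache[i].server_name, server_name) == 0)
            return &conn_cache[i];      /* cache hit: reuse the connection */
        if (!conn_cache[i].in_use && free_slot == NULL)
            free_slot = &conn_cache[i]; /* remember the first free slot */
    }

    /* Cache miss: establish a new connection and remember it. */
    if (free_slot == NULL || !connect_fn(server_name))
        return NULL;

    strncpy(free_slot->server_name, server_name,
            sizeof(free_slot->server_name) - 1);
    free_slot->server_name[sizeof(free_slot->server_name) - 1] = '\0';
    free_slot->in_use = 1;
    return free_slot;
}
```

Calling <code>get_connection()</code> twice with the same server name invokes the connect callback only once; the second call returns the cached entry.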

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompilation, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. It would be harder with other approaches, such as the access method (AM) or storage manager (smgr) layers, to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query *query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API has been defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so the FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way the FDW chooses.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with FdwPlan which was stored in ForeignScan to initiate foreign query if the execution was not for EXPLAIN, and receive FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() for ForeignScan are not supported, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data wrapper specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.
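
The calling sequence described in this section (begin, iterate until the slot comes back empty, optionally rescan, end) can be illustrated with a toy program. This is a minimal sketch that does not use PostgreSQL headers: <code>ToySlot</code>, <code>ToyExecState</code>, <code>ToyFdwRoutine</code> and <code>run_scan_sum()</code> are hypothetical simplifications of TupleTableSlot, FdwExecutionState, FdwRoutine and the executor, and the "foreign table" is just a fixed array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for TupleTableSlot: one int column plus an empty flag. */
typedef struct ToySlot
{
    int value;
    int is_empty;
} ToySlot;

/* Simplified stand-in for FdwExecutionState. */
typedef struct ToyExecState
{
    void *private_data;     /* FDW-private data */
} ToyExecState;

/* Simplified stand-in for the scan callbacks of FdwRoutine. */
typedef struct ToyFdwRoutine
{
    ToyExecState *(*BeginScan)(void);
    void (*Iterate)(ToyExecState *state, ToySlot *slot);
    void (*ReScan)(ToyExecState *state);
    void (*EndScan)(ToyExecState *state);
} ToyFdwRoutine;

/* A toy FDW that "scans" a fixed array of three rows. */
static const int toy_rows[] = {10, 20, 30};
#define TOY_NROWS ((int) (sizeof(toy_rows) / sizeof(toy_rows[0])))

static ToyExecState *
toy_begin(void)
{
    ToyExecState *state = malloc(sizeof(ToyExecState));
    int *pos = malloc(sizeof(int));

    assert(state != NULL && pos != NULL);
    *pos = 0;                       /* private data: current row position */
    state->private_data = pos;
    return state;
}

static void
toy_iterate(ToyExecState *state, ToySlot *slot)
{
    int *pos = state->private_data;

    if (*pos >= TOY_NROWS)
    {
        slot->is_empty = 1;         /* end of scan: leave the slot empty */
        return;
    }
    slot->value = toy_rows[(*pos)++];
    slot->is_empty = 0;
}

static void
toy_rescan(ToyExecState *state)
{
    *(int *) state->private_data = 0;   /* restart from the first row */
}

static void
toy_end(ToyExecState *state)
{
    free(state->private_data);
    free(state);
}

static const ToyFdwRoutine toy_routine = {
    toy_begin, toy_iterate, toy_rescan, toy_end
};

/*
 * Mimics the executor: begin the scan, iterate until the slot is empty,
 * optionally rescan once, then end the scan. Returns the sum of all rows seen.
 */
static int
run_scan_sum(const ToyFdwRoutine *fdw, int rescan_once)
{
    ToyExecState *state = fdw->BeginScan();
    ToySlot slot;
    int sum = 0;

    for (;;)
    {
        fdw->Iterate(state, &slot);
        if (slot.is_empty)
        {
            if (!rescan_once)
                break;
            rescan_once = 0;
            fdw->ReScan(state);
            continue;
        }
        sum += slot.value;
    }
    fdw->EndScan(state);
    return sum;
}
```

The FDW keeps its scan position in the private area, so the executor-side loop never needs to know how the wrapper tracks progress.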

== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause, and the key/value pairs are stored in the catalog.

Because of a syntax ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads from files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create or alter foreign tables which use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw can read the file formats recognized by the COPY command, by using the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it's a legacy feature.

Unlike COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It might be possible to integrate it with [http://www.postgresql.org/docs/9.1/static/dblink.html contrib/dblink] to share code and connections.
dblink will be installed optionally, like standard contrib modules.

=== Connection options ===
The connection options are constructed from the FDW options of the foreign-data wrapper, foreign server and user mapping. Only connection options are chosen, because the FDW options might include non-connection options such as relname and nspname.
Note that, for security reasons, a non-superuser MUST specify a password in the FDW options, and the foreign server must require password authentication.
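
Picking only the connection options out of a mixed option list might look like the sketch below. It is an illustration under assumptions: the <code>FdwOption</code> struct, the helper names and the short keyword list are hypothetical (a real wrapper would consult libpq's full keyword list), and values are copied without any quoting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical key/value pair for one FDW generic option. */
typedef struct FdwOption
{
    const char *key;
    const char *value;
} FdwOption;

/* Assumed subset of libpq connection keywords, for illustration only. */
static const char *const conn_keywords[] = {
    "host", "port", "dbname", "user", "password"
};

static int
is_connection_option(const char *key)
{
    size_t i;

    for (i = 0; i < sizeof(conn_keywords) / sizeof(conn_keywords[0]); i++)
        if (strcmp(key, conn_keywords[i]) == 0)
            return 1;
    return 0;
}

/*
 * Write "key=value" pairs for the connection options in opts into buf,
 * separated by single spaces; non-connection options are skipped.
 * Returns the number of options copied, or -1 if buf is too small.
 * Values containing spaces or quotes are NOT escaped in this sketch.
 */
static int
build_conninfo(const FdwOption *opts, size_t nopts, char *buf, size_t buflen)
{
    size_t used = 0;
    int copied = 0;
    size_t i;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';

    for (i = 0; i < nopts; i++)
    {
        int n;

        if (!is_connection_option(opts[i].key))
            continue;               /* e.g. relname, nspname */

        n = snprintf(buf + used, buflen - used, "%s%s=%s",
                     copied > 0 ? " " : "", opts[i].key, opts[i].value);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;              /* output would be truncated */
        used += (size_t) n;
        copied++;
    }
    return copied;
}
```

Given the options host, relname, port, nspname and dbname, only host, port and dbname end up in the resulting string.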

In the current implementation, for security reasons, FDW options of user mappings are visible only to users who have the SUPERUSER privilege or the USAGE privilege on the relevant SERVER.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== Cost estimation ===
ANALYZE for foreign tables is not supported in 9.0, so we can't store statistics in the local PostgreSQL server. One workaround is to get the EXPLAIN result from the remote server and use its cost values for local planning.
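
Extracting cost values from the first line of a remote EXPLAIN result could be done roughly as below. This is a sketch, not the wrapper's actual code: <code>RemoteEstimate</code> and <code>parse_explain_line()</code> are hypothetical names, and the parser assumes the default text format, in which costs are printed with two decimal places, e.g. <code>(cost=0.00..35.50 rows=2550 width=4)</code>.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the numbers taken from the remote plan. */
typedef struct RemoteEstimate
{
    double startup_cost;
    double total_cost;
    double rows;
    int    width;
} RemoteEstimate;

/*
 * Parse the top plan line of EXPLAIN text output.
 * Returns 1 and fills *est on success, 0 if no cost section is found
 * or it does not have the expected shape.
 */
static int
parse_explain_line(const char *line, RemoteEstimate *est)
{
    const char *p;

    assert(line != NULL && est != NULL);

    /* The cost section follows the node description. */
    p = strstr(line, "(cost=");
    if (p == NULL)
        return 0;

    return sscanf(p, "(cost=%lf..%lf rows=%lf width=%d",
                  &est->startup_cost, &est->total_cost,
                  &est->rows, &est->width) == 4;
}
```

The parsed startup and total costs could then be copied into the FDW's plan information for the local planner, possibly adjusted for connection overhead.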

=== SELECT-clause optimization ===
Currently the SELECT clause is constructed as "SELECT col1, col2, col3, ...". If some columns are not used at all in the original query, they are replaced with NULL as an optimization. For example, if col2 is unused, the SELECT clause will be "SELECT col1, NULL, col3, ...". The main purpose of this optimization is to reduce the amount of data transferred from the remote server.
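
The NULL substitution can be sketched as a small string builder. The function name <code>build_select_list()</code> and the <code>used</code> flag array are hypothetical; identifier quoting is omitted for brevity, so this is an illustration rather than a complete deparser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "SELECT c1, c2, ..." into buf, replacing every column whose
 * used[] flag is 0 with NULL so that the column positions stay the same.
 * Returns 0 on success, -1 if buf is too small.
 * Column names are emitted as-is (no identifier quoting).
 */
static int
build_select_list(const char *const *colnames, const int *used, size_t ncols,
                  char *buf, size_t buflen)
{
    size_t len;
    size_t i;
    int n;

    assert(buf != NULL && buflen > 0 && ncols > 0);

    n = snprintf(buf, buflen, "SELECT ");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    len = (size_t) n;

    for (i = 0; i < ncols; i++)
    {
        n = snprintf(buf + len, buflen - len, "%s%s",
                     i > 0 ? ", " : "",
                     used[i] ? colnames[i] : "NULL");
        if (n < 0 || (size_t) n >= buflen - len)
            return -1;              /* output would be truncated */
        len += (size_t) n;
    }
    return 0;
}
```

Keeping a NULL placeholder instead of dropping the column means the result columns still line up with the foreign table's attribute numbers.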

=== WHERE-clause push-down ===
WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.

To be pushed down, a condition must consist only of the node types listed below. For this purpose, we check each element in the RelOptInfo.baserestrictinfo list. If there are conditions which can't be pushed down, the remote server returns rows without applying those conditions, and the local server evaluates the rows and discards those which don't satisfy the conditions.

{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}
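
The rule in the table above amounts to a recursive walk over the expression tree. The sketch below uses a hypothetical, heavily simplified node type (<code>ToyExpr</code> with <code>TOY_*</code> tags and at most two arguments) rather than PostgreSQL's real node structures, and covers only some of the listed node types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node tags; not PostgreSQL's NodeTag values. */
typedef enum ToyTag
{
    TOY_CONST,
    TOY_VAR,
    TOY_PARAM_EXTERNAL,
    TOY_PARAM_INTERNAL,
    TOY_BOOLEXPR,
    TOY_NULLTEST,
    TOY_OPEXPR,
    TOY_FUNCEXPR,
    TOY_OTHER
} ToyTag;

/* Hypothetical expression node with up to two arguments. */
typedef struct ToyExpr
{
    ToyTag tag;
    int is_immutable;               /* meaningful for OPEXPR/FUNCEXPR only */
    const struct ToyExpr *args[2];  /* unused entries are NULL */
} ToyExpr;

/* Returns 1 if every non-NULL argument of expr can be pushed down. */
static int is_pushable(const ToyExpr *expr);

static int
args_pushable(const ToyExpr *expr)
{
    int i;

    for (i = 0; i < 2; i++)
        if (expr->args[i] != NULL && !is_pushable(expr->args[i]))
            return 0;
    return 1;
}

/* Returns 1 if the whole expression may be sent to the remote server. */
static int
is_pushable(const ToyExpr *expr)
{
    assert(expr != NULL);

    switch (expr->tag)
    {
        case TOY_CONST:
        case TOY_VAR:
        case TOY_PARAM_EXTERNAL:
            return 1;
        case TOY_BOOLEXPR:
        case TOY_NULLTEST:
            return args_pushable(expr);
        case TOY_OPEXPR:
        case TOY_FUNCEXPR:
            /* the underlying function must be IMMUTABLE */
            return expr->is_immutable && args_pushable(expr);
        default:
            return 0;               /* internal params and anything else */
    }
}
```

A condition that fails this check stays in the local plan and is evaluated against the rows returned by the remote server.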

None of ORDER BY, LIMIT, OFFSET, GROUP BY or HAVING is used in a foreign query.

=== Retrieving result tuples ===
This FDW switches the method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for using a cursor and the number of rows fetched in one FETCH call, are configurable via FDW options of the SERVER and/or the FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.
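
The option precedence and the threshold test can be sketched as follows. The struct <code>FetchOptions</code>, its field names, the <code>OPTION_UNSET</code> marker and the default values are all hypothetical; the page does not name the actual options.

```c
#include <assert.h>

/* Marker for "option not specified on this object" (hypothetical). */
#define OPTION_UNSET (-1L)

/* Hypothetical pair of tunables described in the text. */
typedef struct FetchOptions
{
    long min_cursor_rows;   /* minimum estimated rows for using a cursor */
    long fetch_count;       /* rows fetched per FETCH call */
} FetchOptions;

/*
 * Merge option sources: start from defaults, apply SERVER options, then
 * FOREIGN TABLE options, so that the table-level value wins when both exist.
 */
static FetchOptions
resolve_fetch_options(FetchOptions server, FetchOptions table,
                      FetchOptions defaults)
{
    FetchOptions eff = defaults;

    if (server.min_cursor_rows != OPTION_UNSET)
        eff.min_cursor_rows = server.min_cursor_rows;
    if (server.fetch_count != OPTION_UNSET)
        eff.fetch_count = server.fetch_count;

    if (table.min_cursor_rows != OPTION_UNSET)
        eff.min_cursor_rows = table.min_cursor_rows;
    if (table.fetch_count != OPTION_UNSET)
        eff.fetch_count = table.fetch_count;

    return eff;
}

/*
 * Returns 1 if a cursor should be used, 0 if a simple SELECT should
 * retrieve everything at once (estimate below the threshold).
 */
static int
use_cursor(double estimated_rows, const FetchOptions *opts)
{
    return estimated_rows >= (double) opts->min_cursor_rows;
}
```

With a server-level threshold of 5000 and no table-level override, an estimate of 4999 rows selects the simple SELECT path and 5000 rows selects the cursor path.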

We must ensure that PGresult is released explicitly in all cases, because libpq uses malloc rather than palloc. Copying results into a Tuplestorestate is one solution, which is used in contrib/dblink, but it needs extra memory during the copy and adds some overhead. Another solution is to register a cleanup function with the resource owner and release the PGresult in that cleanup function. This method is already used to close libpq connections.

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions into some fields in pg_class, ex. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions in a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING related to the owner of the function. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: To have foreign tables support inheritance, tuples from a foreign table should supply tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if an FDW routine such as Analyze(tableoid or tablename) were added to return pg_statistic records for the foreign table.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A generic option such as '''cost_factor''' would let the user tell the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Certain kinds of plan, e.g. a nested loop, generate internal parameters to pass value(s) from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' fetching a tuple from a relation.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-12T02:17:13Z

Hanada: /* WHERE-clause push-down */ use baserestrictinfo for push-down

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. This can be used like the [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might have statistics of external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish an actual connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
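
The lookup-or-connect logic described above can be sketched in plain C. This is only an illustration under stated assumptions: ToyConnection, toy_connect_server() and toy_get_connection() are hypothetical names, the fixed-size array stands in for whatever cache structure the removed patch used, and no real connection is opened.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CONNS 8
#define TOY_NAME_LEN  64

/* Hypothetical stand-in for an FDW-specific connection handle. */
typedef struct ToyConnection
{
    char srvname[TOY_NAME_LEN]; /* connection name == server name */
    int  in_use;                /* 1 if this slot holds a connection */
} ToyConnection;

ToyConnection toy_conn_cache[TOY_MAX_CONNS];
int           toy_connect_calls = 0;   /* counts simulated connects */

/* Simulates FdwRoutine.ConnectServer(): claims a free cache slot. */
ToyConnection *toy_connect_server(const char *srvname)
{
    int i;

    for (i = 0; i < TOY_MAX_CONNS; i++)
    {
        if (!toy_conn_cache[i].in_use)
        {
            strncpy(toy_conn_cache[i].srvname, srvname, TOY_NAME_LEN - 1);
            toy_conn_cache[i].srvname[TOY_NAME_LEN - 1] = '\0';
            toy_conn_cache[i].in_use = 1;
            toy_connect_calls++;
            return &toy_conn_cache[i];
        }
    }
    return NULL;                /* cache is full */
}

/* Return the cached connection for srvname, connecting only on a miss. */
ToyConnection *toy_get_connection(const char *srvname)
{
    int i;

    assert(srvname != NULL);
    for (i = 0; i < TOY_MAX_CONNS; i++)
    {
        if (toy_conn_cache[i].in_use &&
            strcmp(toy_conn_cache[i].srvname, srvname) == 0)
            return &toy_conn_cache[i];
    }
    return toy_connect_server(srvname);
}
```

Two scans against the same server share one entry, so toy_connect_calls grows only when a new server name is seen. A real cache would probably also need to consider the user mapping and cleanup at backend exit, which this sketch ignores.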

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as the file wrapper, would not need a connection.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query *query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as follows:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in earlier releases.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so the FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way the FDW prefers.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. In this step, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with FdwPlan which was stored in ForeignScan to initiate foreign query if the execution was not for EXPLAIN, and receive FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data wrapper specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.
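
The call sequence described in this section can be sketched without any PostgreSQL headers. The types below (ToySlot, ToyState, ToyFdwRoutine) are hypothetical stand-ins for TupleTableSlot, FdwExecutionState and FdwRoutine; the callbacks only loosely follow the Version 2 signatures, and the "remote data" is just an integer array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a TupleTableSlot with one integer column. */
typedef struct ToySlot
{
    int is_empty;       /* 1 when the scan has reached the end */
    int value;
} ToySlot;

/* Hypothetical stand-in for FdwExecutionState and its private area. */
typedef struct ToyState
{
    const int *rows;    /* pretend remote data */
    int        nrows;
    int        next;    /* FDW-private scan position */
} ToyState;

/* Scan callbacks, shaped after the Version 2 routine set. */
typedef struct ToyFdwRoutine
{
    void (*BeginScan)(ToyState *state);
    void (*Iterate)(ToyState *state, ToySlot *slot);
    void (*ReScan)(ToyState *state);
    void (*EndScan)(ToyState *state);
} ToyFdwRoutine;

void toy_begin(ToyState *state)  { state->next = 0; }
void toy_rescan(ToyState *state) { state->next = 0; }
void toy_end(ToyState *state)    { state->next = state->nrows; }

void toy_iterate(ToyState *state, ToySlot *slot)
{
    if (state->next < state->nrows)
    {
        slot->is_empty = 0;
        slot->value = state->rows[state->next++];
    }
    else
        slot->is_empty = 1;     /* end of scan: leave the slot empty */
}

const ToyFdwRoutine toy_routine = { toy_begin, toy_iterate, toy_rescan, toy_end };

/* Drive one full scan the way the executor would, summing the rows. */
long toy_execute_scan(const ToyFdwRoutine *fdw, ToyState *state)
{
    ToySlot slot;
    long    sum = 0;

    assert(fdw != NULL && state != NULL);
    fdw->BeginScan(state);              /* scan initialization */
    for (;;)
    {
        fdw->Iterate(state, &slot);     /* one tuple per call */
        if (slot.is_empty)
            break;                      /* empty slot means end of scan */
        sum += slot.value;
    }
    fdw->EndScan(state);                /* scan finalization */
    return sum;
}
```

The driver never looks inside ToyState; only the callbacks touch the scan position, which mirrors the idea that the private area belongs to the FDW. ReScan() is not used by the driver, but calling it resets the position so that the next Iterate() returns the first row again.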

== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause, and the key/value pairs are stored in the catalog.

Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should probably not be allowed to create or alter foreign tables which use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw can read the file formats recognized by the COPY command, by using the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, except that ''oids'' is not supported by file_fdw because it's a legacy feature.

Unlike COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It might be possible to integrate it with [http://www.postgresql.org/docs/9.1/static/dblink.html contrib/dblink] to share code and connections.
dblink will be installed optionally, like standard contrib modules.

=== Connection options ===
The connection options are constructed from the FDW options of the foreign-data wrapper, foreign server and user mapping. Only connection options are picked out, because FDW options might include non-connection options such as relname and nspname.
Note that, for security reasons, a non-superuser MUST specify a password in the FDW options, and the foreign server must require password authentication.

In the current implementation, for security reasons, FDW options of user mappings are visible only to users who have the SUPERUSER privilege or the USAGE privilege on the relevant SERVER.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as when 'autocommit' is set to 'on'.

=== SELECT-clause optimization ===
Currently the SELECT clause is constructed as "SELECT col1, col2, col3, ...". If some of the columns are not used at all in the original query, they are replaced with NULL as an optimization. For example, if col2 is unused, the SELECT clause will be "SELECT col1, NULL, col3, ...". The main purpose of this optimization is to reduce the amount of data transferred from the remote server.
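
A minimal sketch of that substitution in C, assuming the caller already knows which columns the original query references. toy_build_select() and its parameters are hypothetical names, and identifier quoting is deliberately left out to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "SELECT c1, NULL, c3 FROM rel" into buf.
 * used[i] != 0 means column i is referenced by the original query;
 * unused columns are sent as NULL so the position of each column is kept.
 * Returns 0 on success, -1 if buf is too small.
 */
int toy_build_select(char *buf, size_t buflen, const char *relname,
                     const char *const *colnames, const int *used, int ncols)
{
    size_t len;
    int    i;
    int    n;

    assert(buf != NULL && buflen > 0 && ncols > 0);

    n = snprintf(buf, buflen, "SELECT ");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    len = (size_t) n;

    for (i = 0; i < ncols; i++)
    {
        n = snprintf(buf + len, buflen - len, "%s%s",
                     i > 0 ? ", " : "",
                     used[i] ? colnames[i] : "NULL");
        if (n < 0 || (size_t) n >= buflen - len)
            return -1;
        len += (size_t) n;
    }

    n = snprintf(buf + len, buflen - len, " FROM %s", relname);
    if (n < 0 || (size_t) n >= buflen - len)
        return -1;
    return 0;
}
```

Keeping a NULL placeholder rather than dropping the column means the result set still has one field per foreign table column, so the code that maps result fields to local attributes needs no remapping.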

=== WHERE-clause push-down ===
WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.

To be pushed down, a condition must consist only of the node types listed below. To decide this, we check each element in the RelOptInfo.baserestrictinfo list. If there are conditions which can't be pushed down, the remote server sends rows without applying those conditions, and the local server evaluates the rows and discards those which don't satisfy them.

{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}
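
The check implied by the table is a recursive walk over the expression tree. The sketch below uses a hypothetical ToyExpr node with at most two arguments instead of PostgreSQL's Node types, covers only part of the table, and takes volatility from a flag that the caller fills in; a real implementation would look it up in the catalogs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node tags mirroring some rows of the table above. */
typedef enum ToyNodeTag
{
    TOY_CONST,
    TOY_VAR,
    TOY_PARAM_EXTERNAL,
    TOY_PARAM_INTERNAL,     /* not in the table: never pushed down */
    TOY_BOOL_EXPR,
    TOY_NULL_TEST,
    TOY_OP_EXPR,
    TOY_FUNC_EXPR,
    TOY_OTHER               /* any node type not in the table */
} ToyNodeTag;

typedef struct ToyExpr
{
    ToyNodeTag            tag;
    int                   immutable;  /* used for OP_EXPR and FUNC_EXPR only */
    const struct ToyExpr *args[2];    /* NULL when an argument is absent */
} ToyExpr;

/* Return 1 if the whole expression tree may be sent to the remote side. */
int toy_is_pushable(const ToyExpr *expr)
{
    int i;

    if (expr == NULL)
        return 1;                     /* absent argument */
    assert(expr->tag >= TOY_CONST && expr->tag <= TOY_OTHER);

    switch (expr->tag)
    {
        case TOY_CONST:
        case TOY_VAR:
        case TOY_PARAM_EXTERNAL:
        case TOY_BOOL_EXPR:
        case TOY_NULL_TEST:
            break;                    /* allowed; still check the arguments */
        case TOY_OP_EXPR:
        case TOY_FUNC_EXPR:
            if (!expr->immutable)
                return 0;             /* underlying function must be IMMUTABLE */
            break;
        default:
            return 0;                 /* unknown node type: evaluate locally */
    }

    for (i = 0; i < 2; i++)
    {
        if (!toy_is_pushable(expr->args[i]))
            return 0;
    }
    return 1;
}
```

One non-pushable node anywhere in a condition keeps the whole condition local, which matches the rule that such conditions are evaluated on the local server after the rows arrive.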

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.

=== Retrieving result tuples ===
This FDW switches its method of retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for which a cursor is used and the number of rows fetched in one FETCH call, are configurable via FDW options of the SERVER and/or the FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.

We must ensure that PGresult is released explicitly in every case, because libpq uses malloc rather than palloc. Copying results into a Tuplestorestate is one solution, which is used in contrib/dblink, but it needs extra memory during the copy and adds some overhead. Another solution is to register a cleanup function with the resource owner and release the PGresult in that cleanup function. This method is already used to close libpq connections.
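
The two rules above (the row threshold, and the table-level option overriding the server-level one) can be sketched as follows. All names are hypothetical, including the TOY_OPTION_UNSET marker and the built-in default, which the text does not specify.

```c
#include <assert.h>

typedef enum ToyFetchMode
{
    TOY_FETCH_SIMPLE_SELECT,    /* retrieve the whole result at once */
    TOY_FETCH_CURSOR            /* retrieve rows through a cursor in batches */
} ToyFetchMode;

#define TOY_OPTION_UNSET (-1L)  /* marks an option that was not specified */

/*
 * Resolve one numeric option: the FOREIGN TABLE value wins over the
 * SERVER value, and an assumed built-in default applies if neither is set.
 */
long toy_resolve_option(long server_value, long table_value, long default_value)
{
    if (table_value != TOY_OPTION_UNSET)
        return table_value;
    if (server_value != TOY_OPTION_UNSET)
        return server_value;
    return default_value;
}

/* Use a cursor only when the estimate reaches the configured minimum. */
ToyFetchMode toy_choose_fetch_mode(double estimated_rows, long min_cursor_rows)
{
    assert(min_cursor_rows >= 0);
    if (estimated_rows < (double) min_cursor_rows)
        return TOY_FETCH_SIMPLE_SELECT;
    return TOY_FETCH_CURSOR;
}
```

With a server-level minimum of 1000 and a table-level minimum of 50, a scan estimated at 100 rows would use a cursor, because the table-level value is the one that applies.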

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we also have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine the USER MAPPING?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. Users would probably not expect access to a foreign table through a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign table? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if an FDW routine Analyze(tableoid or tablename) were added which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A generic option like '''cost_factor''' would allow the overhead to be communicated to the planner.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only some of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' fetching a tuple from a relation.

; Optimize parameter
: Some foreign scans use only some of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters could also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL would be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-12T02:04:15Z

Hanada: /* WHERE-clause push-down */ separate SELECT-clause optimization

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might have statistics for external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish an actual connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as the file wrapper, would not need a connection.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query *query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as follows:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the set of FDW routines. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it cannot be used to define any foreign table. This allows a foreign-data wrapper to be used merely as a container of connection information, as in earlier releases.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server ID and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the relevant FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide appropriate costs for the scan, estimated in whatever way the FDW chooses.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage we do not provide a common mechanism for storing statistics. Once such a mechanism has been implemented, FdwRoutine should gain another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Unless the execution is for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan node to initiate the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via a TupleTableSlot.
:If the scan has reached the end, the slot is empty after the Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize the scan.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:MarkPos() and RestrPos() are currently not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, which need sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan node, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.
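
The call sequence described in this section (begin, iterate until the slot is empty, optionally rescan, end) can be sketched in plain C. This is only an illustration under stated assumptions: DemoSlot, DemoScanState and DemoFdwRoutine are simplified stand-ins invented here, not the PostgreSQL structures, and the "counter" wrapper is a hypothetical FDW that returns the integers 1 to 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for TupleTableSlot: holds one integer "tuple" or is empty. */
typedef struct DemoSlot
{
    bool empty;
    int  value;
} DemoSlot;

/* Stand-in for ForeignScanState; fdw_state is private to the wrapper. */
typedef struct DemoScanState
{
    const struct DemoFdwRoutine *fdwroutine;
    void     *fdw_state;
    DemoSlot  slot;
} DemoScanState;

/* Stand-in for FdwRoutine: a table of scan callbacks. */
typedef struct DemoFdwRoutine
{
    void (*BeginScan)(DemoScanState *node);
    void (*Iterate)(DemoScanState *node);
    void (*ReScan)(DemoScanState *node);
    void (*EndScan)(DemoScanState *node);
} DemoFdwRoutine;

/* Hypothetical wrapper: yields 1, 2, 3 and then an empty slot. */
typedef struct CounterState
{
    int next;
    int limit;
} CounterState;

static CounterState counter_storage;    /* static, to avoid malloc here */

static void counter_begin(DemoScanState *node)
{
    counter_storage.next = 1;
    counter_storage.limit = 3;
    node->fdw_state = &counter_storage;
}

static void counter_iterate(DemoScanState *node)
{
    CounterState *st = node->fdw_state;

    if (st->next > st->limit)
    {
        node->slot.empty = true;        /* end of scan */
        return;
    }
    node->slot.empty = false;
    node->slot.value = st->next++;
}

static void counter_rescan(DemoScanState *node)
{
    CounterState *st = node->fdw_state;

    st->next = 1;                       /* restart from the first tuple */
}

static void counter_end(DemoScanState *node)
{
    node->fdw_state = NULL;
}

static const DemoFdwRoutine counter_fdw = {
    counter_begin, counter_iterate, counter_rescan, counter_end
};

/* Plays the executor's role: scans `passes` times, sums fetched values. */
int demo_run_scan(const DemoFdwRoutine *routine, int passes)
{
    DemoScanState node = { routine, NULL, { true, 0 } };
    int sum = 0;

    assert(routine != NULL);
    node.fdwroutine->BeginScan(&node);
    for (int pass = 0; pass < passes; pass++)
    {
        if (pass > 0)
            node.fdwroutine->ReScan(&node);
        for (;;)
        {
            node.fdwroutine->Iterate(&node);
            if (node.slot.empty)
                break;
            sum += node.slot.value;
        }
    }
    node.fdwroutine->EndScan(&node);
    return sum;
}

int demo_counter_sum(int passes)
{
    return demo_run_scan(&counter_fdw, passes);
}
```

The driver never looks inside fdw_state; only the wrapper's callbacks interpret it, which is the separation the private area is meant to provide.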

== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for each column, and the key/value pairs are stored in the catalog.

Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should probably not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw recognizes the file formats supported by the COPY command, because it uses the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. The options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it is a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This wrapper can be used to connect to external PostgreSQL servers.
It might be integrated with [http://www.postgresql.org/docs/9.1/static/dblink.html contrib/dblink] to share code and connections.
dblink will be installed optionally, like the other standard contrib modules.

=== Connection options ===
The connection options are constructed from the FDW options of the foreign-data wrapper, the foreign server and the user mapping. Only connection options are picked, because FDW options might include non-connection options such as relname and nspname.
Note that, for security reasons, a non-superuser MUST specify a password in the FDW options, and the foreign server must require password authentication.
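
The filtering step can be sketched as follows. This is a minimal illustration, not the wrapper's actual code: DemoOption and both function names are invented here, and the keyword list is hard-coded, whereas a real wrapper would more likely obtain the accepted keywords from libpq (for example via PQconndefaults()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value pair, standing in for one generic option. */
typedef struct DemoOption
{
    const char *name;
    const char *value;
} DemoOption;

/* Assumed subset of libpq connection keywords, hard-coded for the sketch. */
static const char *const demo_conn_keywords[] = {
    "host", "port", "dbname", "user", "password"
};

/* Returns true if `name` is one of the known connection keywords. */
bool demo_is_conn_option(const char *name)
{
    size_t n = sizeof demo_conn_keywords / sizeof demo_conn_keywords[0];

    for (size_t i = 0; i < n; i++)
        if (strcmp(name, demo_conn_keywords[i]) == 0)
            return true;
    return false;
}

/*
 * Copies only the connection options from in[] to out[], preserving order.
 * out[] must have room for nin entries. Returns the number copied.
 */
size_t demo_filter_conn_options(const DemoOption *in, size_t nin,
                                DemoOption *out)
{
    size_t nout = 0;

    for (size_t i = 0; i < nin; i++)
        if (demo_is_conn_option(in[i].name))
            out[nout++] = in[i];
    return nout;
}
```

With the merged options of wrapper, server and user mapping as input, entries such as relname and nspname are dropped and only the pairs usable in a connection string remain.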

In the current implementation, for security reasons, FDW options of user mappings are visible only to users who have the SUPERUSER privilege or the USAGE privilege on the relevant SERVER.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== SELECT-clause optimization ===
Currently the SELECT clause is constructed as "SELECT col1, col2, col3, ...". Columns that are not used at all in the original query are replaced with NULL as an optimization. For example, if col2 is unused, the SELECT clause becomes "SELECT col1, NULL, col3, ...". The main purpose of this optimization is to reduce the amount of data transferred from the remote server.
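
A minimal sketch of that construction is shown below. The function name and signature are assumptions made for illustration; identifier quoting, which a real deparser would need, is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes "SELECT c1, NULL, c3 FROM relname" into buf, emitting NULL in
 * place of every column whose used[] flag is false, so that the column
 * positions of the result stay unchanged.
 * Returns the length written, or -1 if buf is too small.
 */
int build_remote_select(char *buf, size_t buflen,
                        const char *relname,
                        const char *const *colnames,
                        const bool *used, int ncols)
{
    size_t len;
    int    n;

    n = snprintf(buf, buflen, "SELECT ");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    len = (size_t) n;

    for (int i = 0; i < ncols; i++)
    {
        n = snprintf(buf + len, buflen - len, "%s%s",
                     i > 0 ? ", " : "",
                     used[i] ? colnames[i] : "NULL");
        if (n < 0 || (size_t) n >= buflen - len)
            return -1;
        len += (size_t) n;
    }

    n = snprintf(buf + len, buflen - len, " FROM %s", relname);
    if (n < 0 || (size_t) n >= buflen - len)
        return -1;
    len += (size_t) n;

    return (int) len;
}
```

Keeping a NULL placeholder rather than dropping the column means the local side can keep mapping result columns to table attributes by position.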

=== WHERE-clause push-down ===
WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: a condition in PlanState.qual is pushed down only if it consists solely of the following node types. Any other conditions are not sent; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.
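
The push-down rule in the table above amounts to a recursive walk over each condition. The sketch below uses a toy expression tree invented for illustration (DemoTag and DemoExpr are not PostgreSQL node types, and each node has at most two arguments); it only shows the shape of the check and of the remote/local split.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy node tags, loosely mirroring some rows of the table above. */
typedef enum DemoTag
{
    DEMO_CONST,
    DEMO_VAR,
    DEMO_PARAM_EXTERN,      /* external parameter: allowed */
    DEMO_PARAM_EXEC,        /* internal parameter: not allowed */
    DEMO_BOOL_EXPR,
    DEMO_NULL_TEST,
    DEMO_OP_EXPR,
    DEMO_FUNC_EXPR,
    DEMO_SUBLINK            /* example of an unsupported node */
} DemoTag;

typedef struct DemoExpr
{
    DemoTag tag;
    bool    immutable;      /* meaningful for operators and functions only */
    const struct DemoExpr *args[2];     /* NULL entries are unused */
} DemoExpr;

/* Returns true if the whole expression tree may be sent to the remote side. */
bool demo_is_pushable(const DemoExpr *expr)
{
    if (expr == NULL)
        return true;

    switch (expr->tag)
    {
        case DEMO_CONST:
        case DEMO_VAR:
        case DEMO_PARAM_EXTERN:
        case DEMO_BOOL_EXPR:
        case DEMO_NULL_TEST:
            break;
        case DEMO_OP_EXPR:
        case DEMO_FUNC_EXPR:
            if (!expr->immutable)
                return false;   /* underlying function must be IMMUTABLE */
            break;
        default:
            return false;       /* any other node type stays local */
    }
    return demo_is_pushable(expr->args[0]) &&
           demo_is_pushable(expr->args[1]);
}

/*
 * Marks each top-level condition as remote (pushed down) or local.
 * Returns the number of conditions that can be pushed down.
 */
int demo_split_quals(const DemoExpr *const *quals, int nquals, bool *remote)
{
    int pushed = 0;

    for (int i = 0; i < nquals; i++)
    {
        remote[i] = demo_is_pushable(quals[i]);
        if (remote[i])
            pushed++;
    }
    return pushed;
}
```

A single disallowed node anywhere in a condition keeps the whole condition local, which matches the rule that the local server evaluates whatever was not sent.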

=== Retrieving result tuples ===
This FDW switches its method of retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for using a cursor and the number of rows fetched in one FETCH call, are configurable via FDW options of the SERVER and/or the FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.
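
The decision and the override rule can be sketched as below. The field names min_cursor_rows and fetch_count, the DEMO_OPTION_UNSET marker and the fallback values 1000 and 100 are all assumptions made for this example; the page does not name the options or their defaults.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_OPTION_UNSET (-1L)     /* marks an option that was not given */

/* Hypothetical per-object settings (SERVER or FOREIGN TABLE). */
typedef struct DemoFetchOptions
{
    long min_cursor_rows;   /* use a cursor at or above this estimate */
    long fetch_count;       /* rows fetched per FETCH call */
} DemoFetchOptions;

/* Table-level value wins over server-level value, then the fallback. */
static long demo_pick(long table_value, long server_value, long fallback)
{
    if (table_value != DEMO_OPTION_UNSET)
        return table_value;
    if (server_value != DEMO_OPTION_UNSET)
        return server_value;
    return fallback;
}

/* Merges server and table options; fallbacks are arbitrary examples. */
DemoFetchOptions demo_resolve_options(DemoFetchOptions server,
                                      DemoFetchOptions table)
{
    DemoFetchOptions result;

    result.min_cursor_rows = demo_pick(table.min_cursor_rows,
                                       server.min_cursor_rows, 1000L);
    result.fetch_count = demo_pick(table.fetch_count,
                                   server.fetch_count, 100L);
    return result;
}

/* Simple SELECT below the threshold, SQL-level cursor otherwise. */
bool demo_use_cursor(double estimated_rows, DemoFetchOptions opts)
{
    return estimated_rows >= (double) opts.min_cursor_rows;
}
```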

We must ensure that a PGresult is released explicitly in every case, because libpq uses malloc rather than palloc. Copying results into a Tuplestorestate, as contrib/dblink does, is one solution, but it needs extra memory during the copy and adds some overhead. Another solution is to register a cleanup function with the resource owner and release the PGresult in that function. This method is already used to close libpq connections.

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection. We could export such functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table through a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking a foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: To let foreign tables support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if an FDW routine Analyze(tableoid or tablename), which returns pg_statistic records for the foreign table, were added.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A generic option such as '''cost_factor''' would allow the overhead to be communicated to the planner.
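
One possible reading of the '''cost_factor''' idea is sketched below. The formula, the DemoCost struct and all parameter names are assumptions made for illustration; the page does not specify how such an option would enter the cost calculation.

```c
#include <assert.h>

/* The two costs an FDW reports to the planner (see FdwPlan above). */
typedef struct DemoCost
{
    double startup_cost;
    double total_cost;
} DemoCost;

/*
 * Hypothetical estimate: the per-row cost of an equivalent local scan is
 * scaled by cost_factor to reflect remote-access overhead, and the cost
 * of establishing the connection is charged as startup cost.
 */
DemoCost demo_estimate_foreign_cost(double connect_cost, double rows,
                                    double local_cost_per_row,
                                    double cost_factor)
{
    DemoCost cost;

    assert(rows >= 0.0 && cost_factor > 0.0);
    cost.startup_cost = connect_cost;
    cost.total_cost = connect_cost + rows * local_cost_per_row * cost_factor;
    return cost;
}
```

With cost_factor set to 1 the estimate degenerates to the local cost plus connection overhead, so the option only needs to be set where remote access is known to be more expensive.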

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Certain kinds of plan, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' fetching a tuple from a relation.

; Optimize parameter
: Some foreign scans use only some of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-12T01:45:59Z

Hanada: /* PostgreSQL */ follow recent updates

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will over time introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might keep statistics of external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}
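
The lookup-or-connect behaviour described in this section can be sketched as a small name-keyed cache. Everything here is a stand-in invented for illustration: DemoConn, DemoConnCache and demo_connect_server() are not PostgreSQL APIs, the cache is a fixed-size array, and the "connection" is just a counter value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_CONNS 8
#define DEMO_NAME_LEN  64

/* One cached connection, identified by the foreign server's name. */
typedef struct DemoConn
{
    char name[DEMO_NAME_LEN];
    int  handle;            /* stand-in for a real connection object */
} DemoConn;

typedef struct DemoConnCache
{
    DemoConn entries[DEMO_MAX_CONNS];
    size_t   count;
    int      connects;      /* how many times we really connected */
} DemoConnCache;

/* Stand-in for FdwRoutine.ConnectServer(): hands out a fresh handle. */
static int demo_connect_server(DemoConnCache *cache)
{
    return ++cache->connects;
}

/*
 * Returns the cached connection for srvname, connecting and caching it on
 * first use. Returns NULL if the cache is full or the name is too long.
 */
DemoConn *demo_get_connection(DemoConnCache *cache, const char *srvname)
{
    for (size_t i = 0; i < cache->count; i++)
        if (strcmp(cache->entries[i].name, srvname) == 0)
            return &cache->entries[i];      /* reuse */

    if (cache->count == DEMO_MAX_CONNS || strlen(srvname) >= DEMO_NAME_LEN)
        return NULL;

    DemoConn *conn = &cache->entries[cache->count++];
    strcpy(conn->name, srvname);            /* length checked above */
    conn->handle = demo_connect_server(cache);
    return conn;
}
```

A view such as pg_foreign_connections would then simply enumerate the entries of this cache for the current session.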

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that does not seem realistic. Instead, a PostgreSQL-specific and C-language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the access method (AM) or storage manager (smgr) layers, it would be harder to optimize complex queries, for example a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in the following points:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Leave connection management to each FDW, because a simple FDW such as a file wrapper does not need a connection.
* Add a planner hook that allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan is passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan needs copyObject() support for plan caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as follows:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the set of FDW routines. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it cannot be used to define any foreign table. This allows a foreign-data wrapper to be used merely as a container of connection information, as in earlier releases.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server ID and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the relevant FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide appropriate costs for the scan, estimated in whatever way the FDW chooses.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage we do not provide a common mechanism for storing statistics. Once such a mechanism has been implemented, FdwRoutine should gain another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Unless the execution is for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan node to initiate the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via a TupleTableSlot.
:If the scan has reached the end, the slot is empty after the Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize the scan.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:MarkPos() and RestrPos() are currently not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, which need sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan node, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.

== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for each column, and the key/value pairs are stored in the catalog.

Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should probably not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw recognizes the same file formats as the COPY command, because it uses the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. The options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it is a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It might be possible to integrate it with [http://www.postgresql.org/docs/9.1/static/dblink.html contrib/dblink] to share code and connections.
dblink will be installed optionally, like the standard contrib modules.

=== Connection options ===
The connection options are constructed from the FDW options of the foreign-data wrapper, the foreign server and the user mapping. Only connection options are picked out, because the FDW options might include non-connection options such as relname and nspname.
Note that a non-superuser MUST specify a password in the FDW options, and the foreign server must require password authentication, for security reasons.
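The selection of connection options could look roughly like the sketch below. It is not the wrapper's actual code: DemoOption, the keyword list (a small subset of the libpq connection keywords) and build_conninfo() are assumptions made for illustration, and values that need quoting in a conninfo string are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical key/value pair, standing in for one FDW option. */
typedef struct DemoOption
{
    const char *key;
    const char *value;
} DemoOption;

/* A subset of libpq connection keywords, chosen for this sketch. */
static const char *const conn_keywords[] = {
    "host", "port", "dbname", "user", "password"
};

static int is_connection_option(const char *key)
{
    size_t i;

    for (i = 0; i < sizeof(conn_keywords) / sizeof(conn_keywords[0]); i++)
        if (strcmp(key, conn_keywords[i]) == 0)
            return 1;
    return 0;
}

/*
 * Write "key=value" pairs for connection options only into buf.
 * Returns the number of options kept.
 */
static int build_conninfo(const DemoOption *opts, size_t nopts,
                          char *buf, size_t buflen)
{
    int    kept = 0;
    size_t used = 0;
    size_t i;

    assert(buflen > 0);
    buf[0] = '\0';
    for (i = 0; i < nopts; i++)
    {
        int n;

        if (!is_connection_option(opts[i].key))
            continue;           /* e.g. relname, nspname */
        n = snprintf(buf + used, buflen - used, "%s%s=%s",
                     kept ? " " : "", opts[i].key, opts[i].value);
        if (n < 0 || (size_t) n >= buflen - used)
        {
            buf[used] = '\0';   /* drop the partial pair and stop */
            break;
        }
        used += (size_t) n;
        kept++;
    }
    return kept;
}
```

With the options host, relname, dbname and nspname, only host and dbname end up in the connection string.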

In the current implementation, the FDW options of user mappings are visible only to users who have the SUPERUSER privilege or the USAGE privilege on the relevant SERVER, for security reasons.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. Any other conditions are not sent; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

None of ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING is used in a foreign query.
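The push-down restriction amounts to a recursive whitelist check over the expression tree. A minimal sketch follows, assuming a toy expression type: DemoExpr and DemoTag are hypothetical and cover only some of the node types in the table, whereas the real check would walk PostgreSQL's Node tree and look up function volatility in the catalogs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy node tags mirroring part of the table above (hypothetical names). */
typedef enum DemoTag
{
    DEMO_CONST,
    DEMO_VAR,
    DEMO_PARAM_EXTERNAL,
    DEMO_PARAM_INTERNAL,        /* not in the whitelist */
    DEMO_BOOL_EXPR,
    DEMO_NULL_TEST,
    DEMO_OP_EXPR,
    DEMO_FUNC_EXPR
} DemoTag;

typedef struct DemoExpr
{
    DemoTag                tag;
    int                    immutable;   /* used for OP_EXPR and FUNC_EXPR only */
    const struct DemoExpr *args[2];     /* NULL when unused */
} DemoExpr;

/* Returns 1 if the whole expression may be sent to the remote server. */
static int is_pushable(const DemoExpr *e)
{
    size_t i;

    if (e == NULL)
        return 1;
    switch (e->tag)
    {
        case DEMO_CONST:
        case DEMO_VAR:
        case DEMO_PARAM_EXTERNAL:
        case DEMO_BOOL_EXPR:
        case DEMO_NULL_TEST:
            break;
        case DEMO_OP_EXPR:
        case DEMO_FUNC_EXPR:
            if (!e->immutable)
                return 0;       /* the underlying function must be IMMUTABLE */
            break;
        default:
            return 0;           /* any other node type stays local */
    }
    for (i = 0; i < 2; i++)
        if (!is_pushable(e->args[i]))
            return 0;
    return 1;
}
```

A condition that fails the check is simply left out of the remote query and evaluated locally on the returned rows.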

=== Retrieving result tuples ===
This FDW switches the method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are retrieved as they are needed.

Two numbers, the minimum number of rows for using a cursor and the number of rows fetched in one FETCH call, are configurable via FDW options of SERVER and/or FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.
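The override rule and the threshold test can be sketched as below. The option names min_cursor_rows and fetch_count, the DEMO_UNSET marker and both functions are assumptions made for this illustration; the page does not name the actual options.

```c
#include <assert.h>

#define DEMO_UNSET (-1L)        /* marks an option that was not specified */

/* Hypothetical container for the two configurable numbers. */
typedef struct DemoFetchOptions
{
    long min_cursor_rows;       /* minimum estimated rows to use a cursor */
    long fetch_count;           /* rows fetched by one FETCH call */
} DemoFetchOptions;

/* Table-level settings, when present, override server-level settings. */
static DemoFetchOptions resolve_options(DemoFetchOptions server,
                                        DemoFetchOptions table)
{
    DemoFetchOptions result = server;

    if (table.min_cursor_rows != DEMO_UNSET)
        result.min_cursor_rows = table.min_cursor_rows;
    if (table.fetch_count != DEMO_UNSET)
        result.fetch_count = table.fetch_count;
    return result;
}

/* Below the threshold a simple SELECT is used, otherwise a cursor. */
static int use_cursor(double estimated_rows, DemoFetchOptions opts)
{
    return estimated_rows >= (double) opts.min_cursor_rows;
}
```

Resolving the options once per scan keeps the decision in one place, so Iterate() only has to consult the result.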

We must ensure that every PGresult is released explicitly in all cases, because libpq uses malloc rather than palloc. One solution is to copy the results into a Tuplestorestate, as contrib/dblink does, but this needs extra memory during the copy and adds some overhead. Another solution is to register a cleanup function with the resource owner and release the PGresult in that function. This method is already used to close libpq connections.
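The cleanup-function approach can be illustrated with a toy model. Everything here is a stand-in: DemoOwner imitates a resource owner, DemoResult imitates a PGresult, and register_release() and release_all() are hypothetical helpers, not PostgreSQL or libpq functions.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*DemoReleaseCallback) (void *arg);

/* Toy resource owner holding up to four cleanup callbacks. */
typedef struct DemoOwner
{
    DemoReleaseCallback cb[4];
    void               *arg[4];
    int                 n;
} DemoOwner;

static int register_release(DemoOwner *owner, DemoReleaseCallback cb, void *arg)
{
    if (owner->n == 4)
        return 0;
    owner->cb[owner->n] = cb;
    owner->arg[owner->n] = arg;
    owner->n++;
    return 1;
}

/* Called when the owner is released, on success and on error alike. */
static void release_all(DemoOwner *owner)
{
    while (owner->n > 0)
    {
        owner->n--;
        owner->cb[owner->n] (owner->arg[owner->n]);
    }
}

/* Stand-in for a malloc'ed PGresult. */
typedef struct DemoResult
{
    int *rows;
} DemoResult;

static int results_freed = 0;

static void free_result(void *arg)
{
    DemoResult *res = arg;

    free(res->rows);
    free(res);
    results_freed++;
}

/* Allocate a result and hand its cleanup over to the owner immediately. */
static DemoResult *fetch_result(DemoOwner *owner, int nrows)
{
    DemoResult *res;

    assert(nrows > 0);
    res = malloc(sizeof(*res));
    if (res == NULL)
        return NULL;
    res->rows = calloc((size_t) nrows, sizeof(int));
    if (res->rows == NULL || !register_release(owner, free_result, res))
    {
        free(res->rows);
        free(res);
        return NULL;
    }
    return res;
}
```

Because the callback is registered as soon as the result exists, no later error path has to remember to free it.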

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> disconnects all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in some fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some of the FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. Users would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking a foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the forms below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if an FDW routine Analyze(tableoid or tablename) were added that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION such as '''cost_factor''' would allow this overhead to be communicated to the planner.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only some of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' a tuple has been fetched from a relation.

; Optimize parameter
: Some foreign scans use only some of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node holding the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-12T01:33:05Z

Hanada: per-column FDW options has been committed.

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might keep statistics for external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If there is none, it calls FdwRoutine.ConnectServer() to obtain a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
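The lookup-or-connect behaviour described above can be sketched like this. The fixed-size array, DemoConn and demo_connect_server() are assumptions for illustration only; the removed implementation may well have used a different data structure.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_CONNS 8

/* Hypothetical cache entry: a connection named after its server. */
typedef struct DemoConn
{
    char srvname[64];
    int  handle;
} DemoConn;

static DemoConn cache[DEMO_MAX_CONNS];
static int      cache_len = 0;
static int      connect_calls = 0;

/* Stand-in for FdwRoutine.ConnectServer(); returns a fake handle. */
static int demo_connect_server(const char *srvname)
{
    (void) srvname;
    return ++connect_calls;
}

/* Return the cached connection for srvname, connecting only on a miss. */
static DemoConn *get_connection(const char *srvname)
{
    int i;

    for (i = 0; i < cache_len; i++)
        if (strcmp(cache[i].srvname, srvname) == 0)
            return &cache[i];   /* reuse within the same backend */

    if (cache_len == DEMO_MAX_CONNS ||
        strlen(srvname) >= sizeof(cache[0].srvname))
        return NULL;            /* cache full or name too long */

    strcpy(cache[cache_len].srvname, srvname);
    cache[cache_len].handle = demo_connect_server(srvname);
    return &cache[cache_len++];
}
```

Two scans against the same server share one entry, so the connect routine runs once per server name.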

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options, specified with the OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompilation, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the layer that best balances query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as JOINs of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in the PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in some points:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook that allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan is passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as follows:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.
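To show how a wrapper fills in such a routine table, here is a toy that compiles without the PostgreSQL headers. It is a sketch under loud assumptions: TupleTableSlot and ForeignScanState are reduced to a few fields, the planning and EXPLAIN callbacks are left out, and the counter_* wrapper (which "scans" the integers 1 to 3) is entirely hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real definitions live in the PostgreSQL headers. */
typedef struct TupleTableSlot
{
    int is_empty;
    int value;
} TupleTableSlot;

typedef struct ForeignScanState
{
    TupleTableSlot slot;
    void          *fdw_state;
} ForeignScanState;

typedef void (*BeginForeignScan_function) (ForeignScanState *node, int eflags);
typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);
typedef void (*ReScanForeignScan_function) (ForeignScanState *node);
typedef void (*EndForeignScan_function) (ForeignScanState *node);

/* Execution-time subset of the routine table shown above. */
typedef struct FdwRoutine
{
    BeginForeignScan_function   BeginForeignScan;
    IterateForeignScan_function IterateForeignScan;
    ReScanForeignScan_function  ReScanForeignScan;
    EndForeignScan_function     EndForeignScan;
} FdwRoutine;

static int counter_pos;         /* private scan position of the toy wrapper */

static void counter_begin(ForeignScanState *node, int eflags)
{
    (void) eflags;
    counter_pos = 0;
    node->fdw_state = &counter_pos;
}

static TupleTableSlot *counter_iterate(ForeignScanState *node)
{
    int *pos = node->fdw_state;

    if (*pos < 3)
    {
        node->slot.is_empty = 0;
        node->slot.value = ++*pos;
    }
    else
        node->slot.is_empty = 1;    /* an empty slot signals end of scan */
    return &node->slot;
}

static void counter_rescan(ForeignScanState *node)
{
    *(int *) node->fdw_state = 0;
}

static void counter_end(ForeignScanState *node)
{
    node->fdw_state = NULL;
}

/* Plays the role of the handler function: returns the routine set. */
static const FdwRoutine *counter_fdw_handler(void)
{
    static const FdwRoutine routine = {
        counter_begin, counter_iterate, counter_rescan, counter_end
    };

    return &routine;
}
```

The caller only ever goes through the function pointers, which is what lets the core stay ignorant of each wrapper's internals.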

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This still allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so an FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the relevant FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide appropriate costs for the scan, estimated in whatever way the FDW chooses.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function that is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to start the foreign query, unless the execution is for EXPLAIN, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan has reached the end, the slot is empty after the Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could properly deparse Sort nodes into an ORDER BY clause and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and the FDW routines use it to manage the status of the scan.
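The call sequence in the list above can be sketched from the executor's side. This is a simplified model, not executor code: the types are reduced stand-ins, BeginScan() takes no plan or parameters here, and run_foreign_scan() together with the two-row demo wrapper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the types used by the Version 2 interface. */
typedef struct TupleTableSlot
{
    int is_empty;
    int value;
} TupleTableSlot;

typedef struct FdwExecutionState
{
    void *private;
} FdwExecutionState;

typedef struct FdwRoutine
{
    FdwExecutionState *(*BeginScan) (void);     /* real version takes plan and params */
    void (*Iterate) (FdwExecutionState *state, TupleTableSlot *slot);
    void (*ReScan) (FdwExecutionState *state);
    void (*EndScan) (FdwExecutionState *state);
} FdwRoutine;

/* Drive one scan node; returns the number of tuples seen over all passes. */
static int run_foreign_scan(const FdwRoutine *fdw, int for_explain, int passes)
{
    FdwExecutionState *state;
    TupleTableSlot     slot;
    int                count = 0;
    int                pass;

    if (for_explain)
        return 0;               /* EXPLAIN: the foreign query is never started */

    state = fdw->BeginScan();
    for (pass = 0; pass < passes; pass++)
    {
        if (pass > 0)
            fdw->ReScan(state); /* restart the scan from the beginning */
        for (;;)
        {
            fdw->Iterate(state, &slot);
            if (slot.is_empty)
                break;          /* an empty slot marks the end of the scan */
            count++;
        }
    }
    fdw->EndScan(state);
    return count;
}

/* Hypothetical wrapper that returns two rows per pass. */
static int               demo_pos;
static FdwExecutionState demo_state;

static FdwExecutionState *demo_begin(void)
{
    demo_pos = 0;
    demo_state.private = &demo_pos;
    return &demo_state;
}

static void demo_iterate(FdwExecutionState *state, TupleTableSlot *slot)
{
    int *pos = state->private;

    slot->is_empty = (*pos >= 2);
    if (!slot->is_empty)
        slot->value = ++*pos;
}

static void demo_rescan(FdwExecutionState *state)
{
    *(int *) state->private = 0;
}

static void demo_end(FdwExecutionState *state)
{
    state->private = NULL;
}

static const FdwRoutine demo_fdw = {
    demo_begin, demo_iterate, demo_rescan, demo_end
};
```

Running the demo wrapper for two passes yields four tuples, two before and two after the ReScan() call.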

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.

== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause, and the key/value pairs are stored in the catalog.

Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should perhaps not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw recognizes the same file formats as the COPY command, because it uses the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. The options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it is a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like the standard contrib modules.

=== Connection options ===
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, the foreign server and the user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that a non-superuser MUST specify a password in the GENERIC OPTIONS, and the foreign server must require password authentication, for security reasons.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. Any other conditions are not sent; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

None of ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING is used in a foreign query.

=== Retrieving result tuples ===
This FDW switches the method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are retrieved as they are needed.

Two numbers, the minimum number of rows for using a cursor and the number of rows fetched in one FETCH call, can be specified as generic options of SERVER and/or FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.

In either case, we must ensure that every PGresult is released explicitly, because libpq uses malloc rather than palloc. One solution is to copy the results into a Tuplestorestate, as contrib/dblink does, but this needs extra memory during the copy. Another solution is to register a cleanup function with the resource owner and release the PGresult in that function. This method is already used to close libpq connections.

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> disconnects all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in some fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some of the FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. Users would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The following syntax is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: To have foreign tables support inheritance, tuples from a foreign table should supply tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A generic option such as '''cost_factor''' would let the user tell the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN locally and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Certain kinds of plans, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. Applying an internal parameter to the remote query can reduce the number of records fetched from a foreign server.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' a tuple has been fetched from a relation.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters could also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T06:17:58Z

Hanada: /* Retrieving result tuples */ resource owner can be used to cleanup PGresult

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification has begun in PostgreSQL 8.4 and will over time introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for columns, and the key/value pairs are stored in the catalog.

Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column attfdwoptions of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might keep statistics of external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If there is none, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc as a function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but this doesn't seem realistic. Instead, a PostgreSQL-specific and C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Following discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook that allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query *query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as follows:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way the FDW chooses.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should get another function that is called from ANALYZE, with which each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to initiate the foreign query, unless the execution is for EXPLAIN, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates ForeignScanState from ForeignScan and FDW routines use it to manage the status of scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass foreign-data wrapper specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.
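The call sequence described in this section can be sketched in standalone C. This is only an illustration under stated assumptions: MockScanState, MockFdwRoutine, ArrayFdwState and run_scan() are hypothetical stand-ins, not PostgreSQL structures, and the "foreign data" is a plain in-memory array of integers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for ForeignScanState: the data source plus the
 * FDW-private pointer that the routines share between calls. */
typedef struct MockScanState
{
    const int  *rows;       /* pretend foreign data */
    size_t      nrows;
    void       *fdw_state;  /* FDW-private data */
} MockScanState;

/* Hypothetical stand-in for FdwRoutine: a table of callbacks. */
typedef struct MockFdwRoutine
{
    void (*BeginScan)(MockScanState *node);
    int  (*Iterate)(MockScanState *node, int *slot); /* 0 means end of scan */
    void (*ReScan)(MockScanState *node);
    void (*EndScan)(MockScanState *node);
} MockFdwRoutine;

/* Private state of this toy wrapper: the position of the next row. */
typedef struct ArrayFdwState
{
    size_t next;
} ArrayFdwState;

static void
array_begin(MockScanState *node)
{
    ArrayFdwState *state = malloc(sizeof(ArrayFdwState));

    if (state != NULL)
        state->next = 0;
    node->fdw_state = state;
}

static int
array_iterate(MockScanState *node, int *slot)
{
    ArrayFdwState *state = node->fdw_state;

    if (state == NULL || state->next >= node->nrows)
        return 0;               /* "empty slot": the scan reached the end */
    *slot = node->rows[state->next++];
    return 1;
}

static void
array_rescan(MockScanState *node)
{
    ArrayFdwState *state = node->fdw_state;

    if (state != NULL)
        state->next = 0;        /* restart from the first row */
}

static void
array_end(MockScanState *node)
{
    free(node->fdw_state);
    node->fdw_state = NULL;
}

static const MockFdwRoutine array_fdw = {
    array_begin, array_iterate, array_rescan, array_end
};

/* Executor-side driver: begin, drain the scan "passes" times with a
 * ReScan between passes, then end. Returns the sum of all rows seen. */
long
run_scan(const MockFdwRoutine *fdw, const int *rows, size_t nrows, int passes)
{
    MockScanState node = { rows, nrows, NULL };
    long        sum = 0;
    int         slot;
    int         pass;

    fdw->BeginScan(&node);
    for (pass = 0; pass < passes; pass++)
    {
        if (pass > 0)
            fdw->ReScan(&node);
        while (fdw->Iterate(&node, &slot))
            sum += slot;
    }
    fdw->EndScan(&node);
    return sum;
}
```

The driver never looks inside fdw_state; only the wrapper's own routines interpret it, which is the point of keeping the private area opaque to the core.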

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently, stdin is not supported, although it is allowed in COPY FROM.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should probably not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw recognizes the file formats supported by the COPY command by using the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it's a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares its code and connections.
dblink will be installed optionally, like the standard contrib modules.

=== Connection options ===
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. If there are other conditions, the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.
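The push-down rule above amounts to a recursive walk over the condition tree. The following is a minimal sketch in standalone C, assuming simplified stand-in types: MockExpr, MockNodeTag and MockVolatility are hypothetical and cover only a subset of the node types in the table, and real code would inspect PostgreSQL's own expression nodes and catalog entries instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of node tags. */
typedef enum MockNodeTag
{
    MOCK_CONST,
    MOCK_VAR,
    MOCK_PARAM_EXTERN,      /* Param with paramkind == PARAM_EXTERNAL */
    MOCK_PARAM_INTERNAL,    /* any other kind of Param */
    MOCK_BOOL_EXPR,
    MOCK_NULL_TEST,
    MOCK_OP_EXPR,
    MOCK_FUNC_EXPR,
    MOCK_OTHER              /* anything not listed in the table */
} MockNodeTag;

typedef enum MockVolatility
{
    MOCK_IMMUTABLE,
    MOCK_STABLE,
    MOCK_VOLATILE
} MockVolatility;

/* Hypothetical expression node with at most two children. */
typedef struct MockExpr
{
    MockNodeTag             tag;
    MockVolatility          volatility; /* used for operators and functions */
    const struct MockExpr  *args[2];    /* NULL when a child is absent */
} MockExpr;

/* Returns true only if every node in the tree may be sent to the remote
 * server: an allowed tag, and IMMUTABLE for operators and functions. */
bool
is_pushable(const MockExpr *expr)
{
    if (expr == NULL)
        return true;

    switch (expr->tag)
    {
        case MOCK_CONST:
        case MOCK_VAR:
        case MOCK_PARAM_EXTERN:
        case MOCK_BOOL_EXPR:
        case MOCK_NULL_TEST:
            break;
        case MOCK_OP_EXPR:
        case MOCK_FUNC_EXPR:
            if (expr->volatility != MOCK_IMMUTABLE)
                return false;
            break;
        default:
            return false;
    }
    return is_pushable(expr->args[0]) && is_pushable(expr->args[1]);
}
```

A condition for which is_pushable() returns false would be left out of the remote query and evaluated locally against the returned rows.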

=== Retrieving result tuples ===
This FDW switches the method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for which a cursor is used and the number of rows fetched in one FETCH call, can be specified as generic options of the SERVER and/or the FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.
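The option precedence and the cursor decision could look roughly like the sketch below. The names FetchOptions, OPTION_UNSET, resolve_fetch_options() and use_cursor() are hypothetical, as is the idea of passing built-in defaults explicitly; the page does not name the actual options.

```c
#include <stdbool.h>

/* Marker for "option not specified on this object" (assumed convention). */
#define OPTION_UNSET (-1L)

/* Hypothetical container for the two numbers described above. */
typedef struct FetchOptions
{
    long min_cursor_rows;   /* minimum estimated rows for which a cursor is used */
    long fetch_count;       /* rows fetched per FETCH call */
} FetchOptions;

/* The foreign table's value wins over the server's; defaults fill gaps. */
static long
pick_option(long table_value, long server_value, long default_value)
{
    if (table_value != OPTION_UNSET)
        return table_value;
    if (server_value != OPTION_UNSET)
        return server_value;
    return default_value;
}

FetchOptions
resolve_fetch_options(FetchOptions server, FetchOptions table,
                      FetchOptions defaults)
{
    FetchOptions result;

    result.min_cursor_rows = pick_option(table.min_cursor_rows,
                                         server.min_cursor_rows,
                                         defaults.min_cursor_rows);
    result.fetch_count = pick_option(table.fetch_count,
                                     server.fetch_count,
                                     defaults.fetch_count);
    return result;
}

/* A simple SELECT is used below the threshold, a cursor otherwise. */
bool
use_cursor(double estimated_rows, const FetchOptions *options)
{
    return estimated_rows >= (double) options->min_cursor_rows;
}
```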

In any case, we must ensure that each PGresult is released explicitly, because libpq allocates it with malloc rather than palloc. One solution is to copy the results into a Tuplestorestate, as contrib/dblink does, but this needs extra memory during the copy. Another solution is to register a cleanup function with the resource owner and release the PGresult in that function. This method is already used to close libpq connections.
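The cleanup-callback idea can be illustrated without PostgreSQL or libpq. In the sketch below, MockResourceOwner, owner_register(), owner_release() and MockResult are all hypothetical stand-ins: the owner simply remembers callbacks and runs them on release, and the malloc'ed buffer plays the role of a PGresult.

```c
#include <stdlib.h>

typedef void (*ReleaseCallback)(void *arg);

#define MAX_CALLBACKS 8

/* Hypothetical stand-in for a resource owner. */
typedef struct MockResourceOwner
{
    ReleaseCallback callback[MAX_CALLBACKS];
    void           *arg[MAX_CALLBACKS];
    int             ncallbacks;
} MockResourceOwner;

/* Remember a cleanup function; returns -1 when the table is full. */
int
owner_register(MockResourceOwner *owner, ReleaseCallback callback, void *arg)
{
    if (owner->ncallbacks >= MAX_CALLBACKS)
        return -1;
    owner->callback[owner->ncallbacks] = callback;
    owner->arg[owner->ncallbacks] = arg;
    owner->ncallbacks++;
    return 0;
}

/* Run all callbacks in reverse registration order, on both the normal
 * path and the error path of whatever owns the resources. */
void
owner_release(MockResourceOwner *owner)
{
    while (owner->ncallbacks > 0)
    {
        owner->ncallbacks--;
        owner->callback[owner->ncallbacks](owner->arg[owner->ncallbacks]);
    }
}

/* Stand-in for a malloc-based result that the memory context cannot free. */
typedef struct MockResult
{
    int *rows;
    int  released;
} MockResult;

static void
release_result(void *arg)
{
    MockResult *result = arg;

    free(result->rows);
    result->rows = NULL;
    result->released = 1;
}
```

Compared with copying into a tuplestore, this keeps a single copy of the result in memory, at the price of tying its lifetime to the owner that runs the callback.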

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should the foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection individually. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we also have pg_am for index access methods, which is a table-based approach. Note that FDW routines probably need to be written in C, because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect access to a foreign table through a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The following syntax is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: To have foreign tables support inheritance, tuples from a foreign table should supply tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A generic option such as '''cost_factor''' would let the user tell the planner about this overhead.
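One possible reading of such a cost_factor option is a plain multiplier on the cost that the same scan would have locally. The sketch below assumes exactly that; ScanCost and apply_cost_factor() are hypothetical names, and the page does not specify how the option would actually be applied.

```c
/* Hypothetical pair of planner costs for one scan. */
typedef struct ScanCost
{
    double startup_cost;
    double total_cost;
} ScanCost;

/* Scale a locally estimated cost by the user-supplied factor.
 * A non-positive factor is treated as "unset" and leaves the cost as is
 * (an assumption made for this sketch). */
ScanCost
apply_cost_factor(ScanCost local, double cost_factor)
{
    ScanCost foreign;

    if (cost_factor <= 0.0)
        cost_factor = 1.0;
    foreign.startup_cost = local.startup_cost * cost_factor;
    foreign.total_cost = local.total_cost * cost_factor;
    return foreign;
}
```

With a factor above 1.0 the planner would see the foreign scan as more expensive than an identical local scan, which is the overhead the option is meant to express.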

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN locally and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Certain kinds of plans, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. Applying an internal parameter to the remote query can reduce the number of records fetched from a foreign server.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' a tuple has been fetched from a relation.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters could also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T04:59:26Z

Hanada: /* for SQL-based FDWs */ pushing internal parameter down seems difficult

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features were merged in PostgreSQL 9.1alpha4:
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for each column, and the key/value pairs are stored in the catalog.

Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. This can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might have statistics for external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to obtain a concrete connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
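
The lookup-or-connect behaviour described above might be sketched as follows. This is only an illustration under stated assumptions, not the removed implementation: <code>ToyConnection</code>, <code>toy_connect_server()</code>, <code>toy_get_connection()</code> and the fixed-size array are hypothetical stand-ins, and no real connection is opened.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_CONNS 8   /* arbitrary cache size for this sketch */
#define NAME_LEN 64          /* room for a 63-byte identifier plus NUL */

/* Hypothetical stand-in for an FDW-specific connection object. */
typedef struct ToyConnection
{
    char name[NAME_LEN];     /* same as the foreign server name */
    int  in_use;             /* nonzero once the slot holds a connection */
} ToyConnection;

ToyConnection conn_cache[MAX_CACHED_CONNS];  /* zero-initialized: all free */
int connect_calls = 0;       /* counts how often we really "connected" */

/* Stand-in for FdwRoutine.ConnectServer(): only fills in the slot. */
void toy_connect_server(ToyConnection *conn, const char *srvname)
{
    strncpy(conn->name, srvname, NAME_LEN - 1);
    conn->name[NAME_LEN - 1] = '\0';
    conn->in_use = 1;
    connect_calls++;
}

/*
 * Return the cached connection named after the server, connecting on a
 * cache miss.  Returns NULL when the cache has no free slot left.
 */
ToyConnection *toy_get_connection(const char *srvname)
{
    ToyConnection *free_slot = NULL;
    int i;

    assert(srvname != NULL);

    for (i = 0; i < MAX_CACHED_CONNS; i++)
    {
        if (conn_cache[i].in_use)
        {
            if (strcmp(conn_cache[i].name, srvname) == 0)
                return &conn_cache[i];       /* cache hit: reuse */
        }
        else if (free_slot == NULL)
            free_slot = &conn_cache[i];      /* remember first free slot */
    }

    if (free_slot == NULL)
        return NULL;                         /* cache full */

    toy_connect_server(free_slot, srvname);  /* cache miss: connect and store */
    return free_slot;
}
```

Calling <code>toy_get_connection()</code> twice with the same server name returns the same slot and connects only once, which is the reuse the text describes.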

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options were omitted to simplify the patch and ease review.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but this doesn't seem realistic. Instead, a PostgreSQL-specific, C-language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as JOINs of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Leave connection management to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan needs copyObject() support for plan caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner and executor modules. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner is responsible for finding the best access path, so an FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way each FDW chooses.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function that is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with the FdwPlan stored in ForeignScan to initiate the foreign query, unless the execution is for EXPLAIN, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() for ForeignScan are not supported, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data wrapper specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.
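
As a rough illustration of how a private area can carry state between the scan routines, the following self-contained sketch mimics the BeginScan/Iterate/ReScan/EndScan sequence with toy types. Everything here is an assumption made for illustration: <code>ToySlot</code>, <code>ToyExecutionState</code>, <code>CounterState</code> and the <code>toy_*</code> functions are hypothetical and do not use the real PostgreSQL structures; the "foreign table" simply yields the integers 0 to limit-1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy replacement for TupleTableSlot: one integer column or "empty". */
typedef struct ToySlot
{
    bool empty;
    int  value;
} ToySlot;

/* Toy replacement for FdwExecutionState: only the private pointer. */
typedef struct ToyExecutionState
{
    void *private_data;      /* wrapper-specific state lives here */
} ToyExecutionState;

/* Private state of this particular toy wrapper. */
typedef struct CounterState
{
    int next;                /* next value to return */
    int limit;               /* number of rows in the "foreign table" */
} CounterState;

/* Like BeginScan(): allocate the state and attach private data. */
ToyExecutionState *toy_begin_scan(int limit)
{
    ToyExecutionState *state = malloc(sizeof(*state));
    CounterState *cs = malloc(sizeof(*cs));

    if (state == NULL || cs == NULL)
    {
        free(state);
        free(cs);
        return NULL;
    }
    cs->next = 0;
    cs->limit = limit;
    state->private_data = cs;
    return state;
}

/* Like Iterate(): fill the slot, or leave it empty at end of scan. */
void toy_iterate(ToyExecutionState *state, ToySlot *slot)
{
    CounterState *cs = state->private_data;

    if (cs->next >= cs->limit)
    {
        slot->empty = true;
        return;
    }
    slot->empty = false;
    slot->value = cs->next++;
}

/* Like ReScan(): restart from the first row. */
void toy_rescan(ToyExecutionState *state)
{
    CounterState *cs = state->private_data;

    cs->next = 0;
}

/* Like EndScan(): release the private data and the state itself. */
void toy_end_scan(ToyExecutionState *state)
{
    free(state->private_data);
    free(state);
}

/* Plays the executor's role: pull rows until the slot comes back empty. */
int toy_run_scan(ToyExecutionState *state)
{
    ToySlot slot;
    int nrows = 0;

    for (;;)
    {
        toy_iterate(state, &slot);
        if (slot.empty)
            break;
        nrows++;
    }
    return nrows;
}
```

The caller never looks inside <code>private_data</code>; only the wrapper's own routines cast it back to <code>CounterState</code>, which is the separation the private area is meant to provide.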

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation, included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads from files on the server side, some security issues should be considered. Non-superusers should perhaps not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw can recognize the file formats recognized by the COPY command, because it uses the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it's a legacy feature.

Unlike COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares code and connections with it.
dblink will be installed optionally, like standard contrib modules.

=== Connection options ===
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, every SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. If there are other conditions, the remote server will return rows without applying those conditions, and the local server will evaluate them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}
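
The restrictions in the table above amount to a recursive walk over the condition tree that rejects any node outside the allowed set, and any operator or function that is not IMMUTABLE. The sketch below shows the idea with a toy expression type. It is an assumption-laden illustration only: <code>ToyTag</code>, <code>ToyExpr</code> and <code>is_pushable()</code> are hypothetical names, the tree is limited to two arguments per node, and real code would inspect PostgreSQL's own node structures and look up function volatility in the catalogs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy node tags; TOY_PARAM_INTERNAL and TOY_OTHER stand for anything
 * that is not in the table of pushable node types. */
typedef enum ToyTag
{
    TOY_CONST,
    TOY_VAR,
    TOY_PARAM_EXTERNAL,
    TOY_PARAM_INTERNAL,
    TOY_BOOLEXPR,
    TOY_NULLTEST,
    TOY_OPEXPR,
    TOY_FUNCEXPR,
    TOY_OTHER
} ToyTag;

typedef struct ToyExpr
{
    ToyTag tag;
    bool   immutable;                /* only meaningful for OPEXPR/FUNCEXPR */
    const struct ToyExpr *args[2];   /* NULL when an argument is absent */
} ToyExpr;

/* Return true if the whole subtree may be sent to the foreign server. */
bool is_pushable(const ToyExpr *expr)
{
    if (expr == NULL)
        return true;                 /* absent argument: nothing to check */

    switch (expr->tag)
    {
        case TOY_CONST:
        case TOY_VAR:
        case TOY_PARAM_EXTERNAL:
        case TOY_BOOLEXPR:
        case TOY_NULLTEST:
            break;                   /* allowed node type */
        case TOY_OPEXPR:
        case TOY_FUNCEXPR:
            if (!expr->immutable)
                return false;        /* underlying function must be IMMUTABLE */
            break;
        default:
            return false;            /* anything else is evaluated locally */
    }

    /* A node is pushable only if all of its arguments are pushable too. */
    return is_pushable(expr->args[0]) && is_pushable(expr->args[1]);
}
```

Under this check a condition such as "col = 1" would be pushed down, while a comparison against a non-immutable function call would be left for the local server to evaluate.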

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.

=== Retrieving result tuples ===
This FDW switches the method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are retrieved as they are needed.

Two numbers, the minimum number of rows to use a cursor and the number of rows fetched in one FETCH call, can be specified as generic options of SERVER and/or FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.
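
The two rules above (FOREIGN TABLE options override SERVER options, and the row estimate is compared against a threshold) can be sketched in a few lines. This is a minimal illustration with assumed names: <code>OPT_UNSET</code>, <code>resolve_option()</code>, <code>choose_fetch_method()</code> and the built-in fallback value are hypothetical and not taken from the actual wrapper.

```c
#include <assert.h>

#define OPT_UNSET (-1L)      /* marks an option that was not specified */

typedef enum FetchMethod
{
    FETCH_SIMPLE_SELECT,     /* retrieve the whole result at once */
    FETCH_CURSOR             /* create a cursor and FETCH in batches */
} FetchMethod;

/*
 * Pick the effective value of a numeric option: the FOREIGN TABLE value
 * wins over the SERVER value, and a built-in fallback (an assumption of
 * this sketch) is used when neither is set.
 */
long resolve_option(long server_value, long table_value, long fallback)
{
    if (table_value != OPT_UNSET)
        return table_value;
    if (server_value != OPT_UNSET)
        return server_value;
    return fallback;
}

/* Fewer estimated rows than the threshold: simple SELECT, else cursor. */
FetchMethod choose_fetch_method(double estimated_rows, long min_cursor_rows)
{
    assert(min_cursor_rows >= 0);

    if (estimated_rows < (double) min_cursor_rows)
        return FETCH_SIMPLE_SELECT;
    return FETCH_CURSOR;
}
```

For example, with a threshold of 1000 set on the server and 50 set on the foreign table, the effective threshold is 50, so a scan estimated at 50 rows or more would use a cursor.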

In either case, received tuples have to be copied into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. Further research might show us another solution.

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support a number of utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine USER MAPPING?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** syntax below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables, by adding an FDW routine Analyze(tableoid or tablename) which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data, even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow the overhead to be communicated to the planner.
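
One possible reading of '''cost_factor''' is a simple multiplier applied to the costs that would be estimated for equivalent local data. The page does not specify how the option would actually be applied, so the sketch below is purely an assumption: <code>ToyCosts</code> and <code>apply_cost_factor()</code> are hypothetical, and scaling both startup and total cost by the same factor is just one of several plausible choices.

```c
#include <assert.h>

/* Toy pair of planner costs, mirroring startup_cost/total_cost. */
typedef struct ToyCosts
{
    double startup_cost;
    double total_cost;
} ToyCosts;

/*
 * Scale locally estimated costs by a per-server or per-table factor.
 * A factor of 1.0 means "as cheap as local data"; larger values model
 * network and remote-execution overhead.
 */
ToyCosts apply_cost_factor(ToyCosts local, double cost_factor)
{
    ToyCosts scaled;

    assert(cost_factor > 0.0);

    scaled.startup_cost = local.startup_cost * cost_factor;
    scaled.total_cost = local.total_cost * cost_factor;
    return scaled;
}
```

With a factor of 2.5, a scan estimated locally at startup cost 1.0 and total cost 10.0 would be presented to the planner as 2.5 and 25.0.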

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan, so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass value(s) from a parent node to a child node. The number of records fetched from a foreign server could be reduced by applying an internal parameter to the external query.
: This seems difficult in some cases, because the value of an internal parameter is determined '''after''' fetching a tuple from a relation.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters could also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve the result a part at a time, improving response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T04:55:42Z

Hanada: /* General */ now file_fdw shares exported COPY FROM routines

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features were merged in PostgreSQL 9.1alpha4:
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for each column, and the key/value pairs are stored in the catalog.

Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. This can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might have statistics for external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to obtain a concrete connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options were omitted to simplify the patch and ease review.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but this doesn't seem realistic. Instead, a PostgreSQL-specific, C-language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core, and the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Leave connection management to each FDW, because a simple FDW, such as a file wrapper, would not need a connection.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};
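
The call sequence implied by the Version 2 interface can be illustrated with a toy wrapper that scans an in-memory array. This is only a sketch under stated assumptions: the struct definitions are simplified stand-ins rather than the real PostgreSQL types, PlanRelScan takes the array directly instead of (Oid, PlannerInfo, RelOptInfo), and the names array_fdw and scan_sum are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the PostgreSQL structs; NOT the real definitions. */
typedef struct FdwPlan { double startup_cost; double total_cost; void *priv; } FdwPlan;
typedef struct FdwExecutionState { void *priv; } FdwExecutionState;
typedef struct TupleTableSlot { int is_empty; int value; } TupleTableSlot;

typedef struct FdwRoutine
{
    /* Assumption: the real PlanRelScan receives planner structures instead. */
    FdwPlan *(*PlanRelScan)(const int *rows, size_t nrows);
    FdwExecutionState *(*BeginScan)(FdwPlan *plan);
    void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
    void (*ReScan)(FdwExecutionState *state);
    void (*EndScan)(FdwExecutionState *state);
} FdwRoutine;

/* FDW-private data of the hypothetical "array" wrapper. */
typedef struct ArraySource { const int *rows; size_t nrows; } ArraySource;
typedef struct ArrayCursor { const ArraySource *src; size_t next; } ArrayCursor;

static FdwPlan *array_plan(const int *rows, size_t nrows)
{
    FdwPlan *plan = malloc(sizeof(*plan));
    ArraySource *src = malloc(sizeof(*src));

    assert(plan != NULL && src != NULL);
    src->rows = rows;
    src->nrows = nrows;
    plan->startup_cost = 0.0;
    plan->total_cost = (double) nrows;  /* arbitrary: one cost unit per row */
    plan->priv = src;
    return plan;
}

static FdwExecutionState *array_begin(FdwPlan *plan)
{
    FdwExecutionState *state = malloc(sizeof(*state));
    ArrayCursor *cur = malloc(sizeof(*cur));

    assert(state != NULL && cur != NULL);
    cur->src = plan->priv;
    cur->next = 0;
    state->priv = cur;
    return state;
}

/* Store the next row in the slot, or mark the slot empty at end of scan. */
static void array_iterate(FdwExecutionState *state, TupleTableSlot *slot)
{
    ArrayCursor *cur = state->priv;

    slot->is_empty = (cur->next >= cur->src->nrows);
    if (!slot->is_empty)
        slot->value = cur->src->rows[cur->next++];
}

static void array_rescan(FdwExecutionState *state)
{
    ((ArrayCursor *) state->priv)->next = 0;
}

static void array_end(FdwExecutionState *state)
{
    free(state->priv);
    free(state);
}

const FdwRoutine array_fdw = {
    array_plan, array_begin, array_iterate, array_rescan, array_end
};

/* Plays the role of planner plus executor: scan the data 'passes' times. */
long scan_sum(const FdwRoutine *fdw, const int *rows, size_t nrows, int passes)
{
    FdwPlan *plan = fdw->PlanRelScan(rows, nrows);
    FdwExecutionState *state = fdw->BeginScan(plan);
    TupleTableSlot slot;
    long sum = 0;

    for (int pass = 0; pass < passes; pass++)
    {
        if (pass > 0)
            fdw->ReScan(state);
        for (;;)
        {
            fdw->Iterate(state, &slot);
            if (slot.is_empty)
                break;
            sum += slot.value;
        }
    }
    fdw->EndScan(state);
    free(plan->priv);
    free(plan);
    return sum;
}
```

The driver only touches the wrapper through the function pointers, which is the point of the routine set: a wrapper for files or for a remote server could be swapped in without changing the scan loop.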

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so an FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way each FDW prefers.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as correctly as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with FdwPlan which was stored in ForeignScan to initiate foreign query if the execution was not for EXPLAIN, and receive FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates ForeignScanState from ForeignScan and FDW routines use it to manage the status of scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data wrapper specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.

= Foreign data wrappers =
== file_fdw ==
The file_fdw is a foreign-data wrapper implementation, included in the distribution of PostgreSQL 9.1 as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create or alter foreign tables which use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw can recognize the file formats which are recognized by the COPY command, by using the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it's a legacy feature.

Unlike COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares code and connections with it.
dblink will be installed optionally, like the standard contrib modules.

=== Connection options ===
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.
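
Merging the three option lists into a single connection string could look roughly like the sketch below. It is an illustration only: the Option struct and the build_conninfo function are hypothetical names, not part of the patch, and the lists are simply concatenated in the order wrapper, server, user mapping. The quoting follows the libpq connection string rule that single quotes and backslashes inside a quoted value are escaped with a backslash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value pair standing in for one generic option. */
typedef struct Option { const char *name; const char *value; } Option;

/* Append one character, keeping room for the terminating NUL. */
static int put(char *buf, size_t size, size_t *len, char c)
{
    if (*len + 1 >= size)
        return 0;
    buf[(*len)++] = c;
    return 1;
}

/* Append each option as name='value', separated by single spaces. */
static int append_options(char *buf, size_t size, size_t *len,
                          const Option *opts, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        const char *p;

        if (*len > 0 && !put(buf, size, len, ' '))
            return 0;
        for (p = opts[i].name; *p; p++)
            if (!put(buf, size, len, *p))
                return 0;
        if (!put(buf, size, len, '=') || !put(buf, size, len, '\''))
            return 0;
        for (p = opts[i].value; *p; p++)
        {
            /* Escape quote and backslash, as libpq conninfo strings require. */
            if (strchr("'\\", *p) != NULL && !put(buf, size, len, '\\'))
                return 0;
            if (!put(buf, size, len, *p))
                return 0;
        }
        if (!put(buf, size, len, '\''))
            return 0;
    }
    return 1;
}

/* Returns 1 on success; returns 0 and an empty string if buf is too small. */
int build_conninfo(char *buf, size_t size,
                   const Option *fdw, size_t nfdw,
                   const Option *server, size_t nserver,
                   const Option *user, size_t nuser)
{
    size_t len = 0;

    assert(buf != NULL);
    if (size == 0)
        return 0;
    if (!append_options(buf, size, &len, fdw, nfdw) ||
        !append_options(buf, size, &len, server, nserver) ||
        !append_options(buf, size, &len, user, nuser))
    {
        buf[0] = '\0';
        return 0;
    }
    buf[len] = '\0';
    return 1;
}
```

Because every option is copied into the string without filtering, the sketch also makes the exposure problem visible: the password travels through the same path as the host name.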

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as when 'autocommit' is set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions for the conditions; their PlanState.qual must consist of only the following node types. If there are other conditions, the remote server will send rows without the conditions, and the local server will evaluate the rows with the conditions.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.
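
The whitelist above amounts to a recursive check over the condition's expression tree. The following is a minimal sketch, assuming simplified stand-in node types (NodeKind, Expr) that are not the real PostgreSQL node definitions and that cover only part of the table; is_pushable is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in node tags; the real parser nodes carry far more detail. */
typedef enum NodeKind
{
    N_CONST, N_VAR, N_PARAM_EXTERN, N_PARAM_INTERNAL,
    N_BOOL_EXPR, N_NULL_TEST, N_OP_EXPR, N_FUNC_EXPR, N_SUBLINK
} NodeKind;

typedef struct Expr
{
    NodeKind kind;
    bool immutable;              /* only meaningful for N_OP_EXPR and N_FUNC_EXPR */
    const struct Expr *args[2];  /* NULL means "no argument" */
} Expr;

/* True if the whole expression may be sent to the foreign server. */
bool is_pushable(const Expr *e)
{
    if (e == NULL)
        return true;
    switch (e->kind)
    {
        case N_CONST:
        case N_VAR:
        case N_PARAM_EXTERN:
        case N_BOOL_EXPR:
        case N_NULL_TEST:
            break;
        case N_OP_EXPR:
        case N_FUNC_EXPR:
            /* The underlying function must be IMMUTABLE. */
            if (!e->immutable)
                return false;
            break;
        default:
            /* Anything not on the whitelist is evaluated locally. */
            return false;
    }
    return is_pushable(e->args[0]) && is_pushable(e->args[1]);
}
```

A condition for which the check fails is not an error: it is simply left out of the remote query and evaluated on the local server against the returned rows.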

=== Retrieving result tuples ===
This FDW switches the method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is less than the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for using a cursor and the number of rows fetched in one FETCH call, can be specified as generic options of SERVER and/or FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.
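
The decision described here can be sketched in a few lines of C. The names FetchMode, resolve_option and choose_fetch_mode, and the use of -1 for "option not given", are assumptions made for this illustration; the actual option names are not stated on this page.

```c
#include <assert.h>

typedef enum FetchMode { FETCH_ALL_AT_ONCE, FETCH_WITH_CURSOR } FetchMode;

#define OPTION_UNSET (-1L)   /* hypothetical marker: option was not specified */

/* The FOREIGN TABLE setting overrides the SERVER setting, then the default. */
long resolve_option(long server_value, long table_value, long fallback)
{
    if (table_value != OPTION_UNSET)
        return table_value;
    if (server_value != OPTION_UNSET)
        return server_value;
    return fallback;
}

/* Below the threshold fetch everything at once, otherwise use a cursor. */
FetchMode choose_fetch_mode(double estimated_rows, long min_cursor_rows)
{
    assert(min_cursor_rows >= 0);
    return estimated_rows < (double) min_cursor_rows
        ? FETCH_ALL_AT_ONCE
        : FETCH_WITH_CURSOR;
}
```

The same resolve_option helper would serve for both numbers, the cursor threshold and the FETCH batch size.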

In either case, received tuples have to be copied into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. Further research might show us another solution.

= Open questions =
There are still several issues in the FDW design and implementation:

; Which should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions into some fields in pg_class, ex. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some of fdw routines are similar to SETOF record function. We could merge them or share some of the internal routines. However, it seems to be hard to use SRF instead of FdwRoutine because FDW needs to support a couple of utility functions; connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions in a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. It would not be what the user expects if access to a foreign table via a SECURITY DEFINER function used the USER MAPPING related to the owner of the function. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must consider recursive locking via table inheritance.
: '''In 9.1, locking foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: To have foreign tables support inheritance, tuples from a foreign table should supply tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow the overhead to be communicated to the planner.
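
One way a '''cost_factor''' option might enter the estimate is as a multiplier on the per-row cost that the planner would charge for local data. The formula below is purely an assumption for illustration; ScanCost, estimate_foreign_cost and the connect_cost parameter are hypothetical and do not describe any implemented behavior.

```c
#include <assert.h>

/* Mirrors the two cost fields of FdwPlan shown earlier. */
typedef struct ScanCost { double startup_cost; double total_cost; } ScanCost;

/*
 * Assumed model: a fixed start-up charge for reaching the foreign server,
 * plus the local per-row cost scaled by cost_factor.
 */
ScanCost estimate_foreign_cost(double rows, double local_cost_per_row,
                               double connect_cost, double cost_factor)
{
    ScanCost cost;

    assert(rows >= 0.0 && cost_factor > 0.0);
    cost.startup_cost = connect_cost;
    cost.total_cost = connect_cost + rows * local_cost_per_row * cost_factor;
    return cost;
}
```

With cost_factor set to 1 and connect_cost set to 0 the estimate degenerates to the local one, so the option only expresses how much more expensive the foreign access is.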

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server can return the result after local JOINs in it.

; Optimize SELECT clause
: Some foreign scans need only some of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be decreased by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only some of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol level cursor, so the FDW for PostgreSQL executes SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursor is supported, the FDW for PostgreSQL will be able to retrieve a part of the result at a time to improve response.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: Rewriting query like pgpool, or replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T04:54:40Z

Hanada: /* DML */ execute-time constraint has been removed from proposal

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification has begun in PostgreSQL 8.4 and will over time introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for columns, and the key/value pairs are stored in the catalog.

Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column attfdwoptions of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might have statistics of external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server which the connection uses.
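
The lookup-or-connect behavior of such a cache can be sketched as follows. This is a hedged illustration, not the removed implementation: ConnCache, cache_get and the fixed-size array are hypothetical, FSConnection is reduced to a stand-in, and fake_connect exists only so that the sketch can be exercised without a real server.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED 8

typedef struct FSConnection { int id; } FSConnection;   /* stand-in only */
typedef FSConnection *(*ConnectFn)(const char *srvname);

typedef struct CacheEntry { char name[64]; FSConnection *conn; } CacheEntry;
typedef struct ConnCache { CacheEntry entries[MAX_CACHED]; size_t used; } ConnCache;

/* Return the cached connection for srvname, connecting on first use. */
FSConnection *cache_get(ConnCache *cache, const char *srvname,
                        ConnectFn connect_server)
{
    FSConnection *conn;

    assert(cache != NULL && srvname != NULL);
    for (size_t i = 0; i < cache->used; i++)
        if (strcmp(cache->entries[i].name, srvname) == 0)
            return cache->entries[i].conn;

    /* Not cached: refuse if the cache is full or the name does not fit. */
    if (cache->used == MAX_CACHED ||
        strlen(srvname) >= sizeof(cache->entries[0].name))
        return NULL;
    conn = connect_server(srvname);
    if (conn == NULL)
        return NULL;
    strcpy(cache->entries[cache->used].name, srvname);
    cache->entries[cache->used].conn = conn;
    cache->used++;
    return conn;
}

/* Demo-only connector that counts how often a "connection" is opened. */
static FSConnection fake_pool[MAX_CACHED];
int fake_connect_calls = 0;

FSConnection *fake_connect(const char *srvname)
{
    (void) srvname;
    if (fake_connect_calls >= MAX_CACHED)
        return NULL;
    fake_pool[fake_connect_calls].id = fake_connect_calls;
    return &fake_pool[fake_connect_calls++];
}
```

Keying the cache by server name alone matches the description above; a design that allowed several local roles to share one backend would presumably need the user as part of the key as well.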

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library. An approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better, because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core, and the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Leave connection management to each FDW, because a simple FDW, such as a file wrapper, would not need a connection.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so an FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way each FDW prefers.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as correctly as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Unless the execution is for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan node to initiate the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan has reached the end, the slot will be empty after the Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan node, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.
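
The begin/iterate/rescan/end life cycle and the use of the private area can be sketched with a toy wrapper that simply yields the integers 0 .. limit-1. Every type and function below is a simplified stand-in invented for this example (the real routines take executor types such as TupleTableSlot), so treat it as an outline of the control flow only.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for a tuple slot. */
typedef struct SketchSlot
{
    bool empty;   /* true once the scan is exhausted */
    int  value;   /* the single "column" of this toy tuple */
} SketchSlot;

/* Simplified stand-in for FdwExecutionState. */
typedef struct SketchExecState
{
    void *private_data;   /* wrapper-private data */
} SketchExecState;

/* Private state of the toy wrapper. */
typedef struct CounterState
{
    int next;
    int limit;
} CounterState;

static SketchExecState *
counter_begin_scan(int limit)
{
    SketchExecState *state = malloc(sizeof(*state));
    CounterState *priv = malloc(sizeof(*priv));

    if (state == NULL || priv == NULL)
    {
        free(state);
        free(priv);
        return NULL;
    }
    priv->next = 0;
    priv->limit = limit;
    state->private_data = priv;
    return state;
}

static void
counter_iterate(SketchExecState *state, SketchSlot *slot)
{
    CounterState *priv = state->private_data;

    if (priv->next >= priv->limit)
    {
        slot->empty = true;     /* an empty slot signals end of scan */
        return;
    }
    slot->empty = false;
    slot->value = priv->next++;
}

static void
counter_rescan(SketchExecState *state)
{
    CounterState *priv = state->private_data;

    priv->next = 0;             /* restart from the first tuple */
}

static void
counter_end_scan(SketchExecState *state)
{
    free(state->private_data);
    free(state);
}
```

The executor side would call these in the order listed above: begin once, iterate until the slot comes back empty, optionally rescan, and end once.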

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently, stdin is not supported, although COPY FROM allows it.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should perhaps not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw recognizes the file formats that the COPY command recognizes, by using the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it is a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This wrapper can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like the standard contrib modules.

=== Connection options ===
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".
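
The NULL-substitution idea can be sketched as a small query builder. This is an illustration only: the function name, its parameters and the <code>used</code> flags are hypothetical, and identifier quoting is left out for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build "SELECT a, NULL, c FROM tbl" into buf, emitting NULL for every
 * column the local query does not need.  Returns 0 on success or -1 if
 * buf is too small.
 */
static int
sketch_build_select(char *buf, size_t buflen,
                    const char *table,
                    const char *const *columns,
                    const bool *used,
                    size_t ncolumns)
{
    size_t off;
    size_t i;
    int n;

    n = snprintf(buf, buflen, "SELECT ");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    off = (size_t) n;

    for (i = 0; i < ncolumns; i++)
    {
        n = snprintf(buf + off, buflen - off, "%s%s",
                     i > 0 ? ", " : "",
                     used[i] ? columns[i] : "NULL");
        if (n < 0 || (size_t) n >= buflen - off)
            return -1;
        off += (size_t) n;
    }

    n = snprintf(buf + off, buflen - off, " FROM %s", table);
    if (n < 0 || (size_t) n >= buflen - off)
        return -1;
    return 0;
}
```

Keeping a NULL placeholder for each skipped column preserves the column positions of the result, so the local side can still map result columns to table attributes by index.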

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist only of the following node types. Any other conditions are not sent to the remote server; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.
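
The push-down restrictions in the table above amount to a recursive walk over the expression tree. The sketch below uses a made-up node struct and tag enum rather than the real PostgreSQL node types, and reduces the "underlying function is IMMUTABLE" lookup to a boolean flag, so it shows only the shape of the check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Tags mirroring the table above; this enum is a stand-in. */
typedef enum SketchTag
{
    SK_CONST, SK_VAR, SK_ARRAY, SK_PARAM_EXTERNAL, SK_PARAM_INTERNAL,
    SK_BOOL_EXPR, SK_NULL_TEST, SK_OP_EXPR, SK_DISTINCT_EXPR,
    SK_SCALAR_ARRAY_OP_EXPR, SK_FUNC_EXPR, SK_OTHER
} SketchTag;

typedef struct SketchNode
{
    SketchTag tag;
    bool immutable;             /* meaningful for operator/function nodes */
    size_t nargs;
    struct SketchNode **args;   /* child expressions */
} SketchNode;

/* Return true if the whole expression may be sent to the remote server. */
static bool
sketch_is_pushable(const SketchNode *node)
{
    size_t i;

    switch (node->tag)
    {
        case SK_CONST:
        case SK_VAR:
        case SK_ARRAY:
        case SK_PARAM_EXTERNAL:
        case SK_BOOL_EXPR:
        case SK_NULL_TEST:
            break;
        case SK_OP_EXPR:
        case SK_DISTINCT_EXPR:
        case SK_SCALAR_ARRAY_OP_EXPR:
        case SK_FUNC_EXPR:
            if (!node->immutable)
                return false;
            break;
        default:
            return false;       /* unlisted node type: evaluate locally */
    }

    for (i = 0; i < node->nargs; i++)
        if (!sketch_is_pushable(node->args[i]))
            return false;
    return true;
}
```

A condition that fails this check is simply left out of the remote query and evaluated locally against the returned rows.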

=== Retrieving result tuples ===
This FDW switches its method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is below the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for which a cursor is used and the number of rows fetched by one FETCH call, can be specified as generic options of SERVER and/or FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.
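
The override rule and the threshold test might look like the sketch below. The names, the "negative means unset" convention and the fallback default are assumptions for illustration; the source does not name the options or give default values.

```c
typedef enum SketchFetchMethod
{
    SKETCH_FETCH_SIMPLE_SELECT,   /* fetch the whole result at once */
    SKETCH_FETCH_CURSOR           /* create a cursor and FETCH in batches */
} SketchFetchMethod;

/* In this sketch a negative value means "option not specified". */
#define SKETCH_OPTION_UNSET (-1.0)

/* A FOREIGN TABLE option overrides the SERVER option of the same name. */
static double
sketch_resolve_option(double server_value, double table_value,
                      double default_value)
{
    if (table_value >= 0)
        return table_value;
    if (server_value >= 0)
        return server_value;
    return default_value;
}

/* Use a plain SELECT below the threshold, a cursor otherwise. */
static SketchFetchMethod
sketch_choose_fetch_method(double estimated_rows, double min_cursor_rows)
{
    return (estimated_rows < min_cursor_rows)
        ? SKETCH_FETCH_SIMPLE_SELECT
        : SKETCH_FETCH_CURSOR;
}
```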

In either case, received tuples have to be copied into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. Further research might reveal another solution.

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing pg_class fields, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine USER MAPPING?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking foreign tables is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overheads and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would let the user tell the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN locally and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only some of the columns. Unnecessary columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Certain kinds of plans, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only some of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T04:52:41Z

Hanada: /* Open questions */ mark some questions closed

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in a database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
== Per-column FDW option ==
As with other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for columns, and the key/value pairs are stored in the catalog.

Because of a syntax ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might keep statistics on external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If there is none, it calls FdwRoutine.ConnectServer() to get a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
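
A name-keyed cache of this kind could be outlined as follows. This is a sketch, not the removed implementation: the fixed-size array, the struct layout and <code>sketch_connect_server()</code> (which stands in for FdwRoutine.ConnectServer()) are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_CONNECTIONS 8
#define SKETCH_NAME_LEN 64

typedef struct SketchConnection
{
    char name[SKETCH_NAME_LEN];   /* same as the foreign server name */
    int  handle;                  /* stand-in for a real connection object */
    int  in_use;
} SketchConnection;

typedef struct SketchConnCache
{
    SketchConnection entries[SKETCH_MAX_CONNECTIONS];
    int connect_calls;            /* how many times we really connected */
} SketchConnCache;

/* Hypothetical connector; a real FDW would open a network connection. */
static int
sketch_connect_server(SketchConnCache *cache)
{
    return ++cache->connect_calls;
}

/* Return the cached connection for server_name, connecting on a miss. */
static SketchConnection *
sketch_get_connection(SketchConnCache *cache, const char *server_name)
{
    SketchConnection *free_slot = NULL;
    size_t i;

    for (i = 0; i < SKETCH_MAX_CONNECTIONS; i++)
    {
        SketchConnection *c = &cache->entries[i];

        if (c->in_use && strcmp(c->name, server_name) == 0)
            return c;                       /* cache hit */
        if (!c->in_use && free_slot == NULL)
            free_slot = c;
    }
    if (free_slot == NULL || strlen(server_name) >= SKETCH_NAME_LEN)
        return NULL;                        /* cache full or name too long */

    strcpy(free_slot->name, server_name);
    free_slot->handle = sketch_connect_server(cache);
    free_slot->in_use = 1;
    return free_slot;
}
```

Two scans against the same server would then share one connection for the lifetime of the backend, while a scan against a different server triggers a new connect.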

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries like a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Leave connection management to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook that allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan needs copyObject() support for plan caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query *query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.
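
To show how a handler hands such a routine set to its caller, here is a reduced sketch that compiles outside the PostgreSQL tree. The executor types are opaque stand-ins, the planner and EXPLAIN callbacks are left out, and the <code>dummy_*</code> functions and <code>SketchFdwRoutine</code> are hypothetical names; a real handler would be an SQL-callable C function returning fdw_handler.

```c
#include <stddef.h>

/* Opaque stand-ins for the executor types. */
typedef struct ForeignScanState ForeignScanState;
typedef struct TupleTableSlot TupleTableSlot;

typedef void (*BeginForeignScan_function) (ForeignScanState *node, int eflags);
typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);
typedef void (*ReScanForeignScan_function) (ForeignScanState *node);
typedef void (*EndForeignScan_function) (ForeignScanState *node);

/* Reduced routine table: scan callbacks only. */
typedef struct SketchFdwRoutine
{
    BeginForeignScan_function BeginForeignScan;
    IterateForeignScan_function IterateForeignScan;
    ReScanForeignScan_function ReScanForeignScan;
    EndForeignScan_function EndForeignScan;
} SketchFdwRoutine;

/* Do-nothing callbacks of a wrapper that exposes an empty table. */
static void
dummy_begin(ForeignScanState *node, int eflags)
{
    (void) node;
    (void) eflags;
}

static TupleTableSlot *
dummy_iterate(ForeignScanState *node)
{
    (void) node;
    return NULL;                /* no rows in this toy wrapper */
}

static void
dummy_rescan(ForeignScanState *node)
{
    (void) node;
}

static void
dummy_end(ForeignScanState *node)
{
    (void) node;
}

/* Plays the role of the handler function: returns the routine set. */
static const SketchFdwRoutine *
dummy_fdw_handler(void)
{
    static const SketchFdwRoutine routine = {
        dummy_begin, dummy_iterate, dummy_rescan, dummy_end
    };

    return &routine;
}
```

The caller never names the callbacks directly; it looks up the handler, calls it once, and then dispatches through the returned function pointers.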

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server ID and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner and executor modules. New ForeignPath and ForeignScan nodes will be added for this purpose.

=== Planner ===
The planner is responsible for finding the best access path, so each FDW should provide the cost of its ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() from the relevant FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should return costs for the scan, estimated in whatever way the FDW sees fit.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes the whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage, we don't provide a common mechanism for storing statistics. Once such a mechanism has been implemented, FdwRoutine should gain another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Unless the execution is for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan node to initiate the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan has reached the end, the slot will be empty after the Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan node, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently, stdin is not supported, although COPY FROM allows it.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should perhaps not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw recognizes the file formats that the COPY command recognizes, by using the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it is a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This wrapper can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like the standard contrib modules.

=== Connection options ===
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist only of the following node types. Any other conditions are not sent to the remote server; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.

=== Retrieving result tuples ===
This FDW switches its method for retrieving result tuples according to the estimated number of result rows.

If the estimated number of rows is below the threshold, a simple SELECT is used to retrieve the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for which a cursor is used and the number of rows fetched by one FETCH call, can be specified as generic options of SERVER and/or FOREIGN TABLE. If an option is specified on both objects, the latter overrides the former.

In either case, received tuples have to be copied into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. Further research might reveal another solution.

= Open questions =
There are still several issues in the FDW design and implementation:

; Where should foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection individually. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

== Resolved questions ==
; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; FdwRoutine vs. SETOF record function
: Some of the FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, index access methods use pg_am, which is a table-based approach. Note that FDW routines probably need to be written in C, because they access executor objects to extract expressions.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table through a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.
: '''In 9.1, locking foreign table is not supported.'''

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint (not implemented)
: CHECK and/or NOT NULL constraints defined on foreign columns could be evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command can update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION such as '''cost_factor''' would let the user tell the planner about this overhead.
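One way a cost_factor option might be applied is as a plain multiplier on locally estimated costs. The sketch below assumes exactly that; ToyCost and toy_apply_cost_factor() are hypothetical, and this page does not specify how the option would actually enter the cost model.

```c
#include <assert.h>

/* Startup and total cost of a scan, in the planner's arbitrary units. */
typedef struct {
    double startup_cost;
    double total_cost;
} ToyCost;

/* Scale a locally estimated cost by a user-supplied factor
 * (assumption: cost_factor acts as a simple multiplier). */
ToyCost toy_apply_cost_factor(ToyCost local, double cost_factor)
{
    assert(cost_factor > 0.0);

    ToyCost remote;
    remote.startup_cost = local.startup_cost * cost_factor;
    remote.total_cost   = local.total_cost * cost_factor;
    return remote;
}
```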

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unneeded columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Certain kinds of plan, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the remote query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node holding the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T04:44:28Z

Hanada: /* Retrieving all tuples at once */ pgsql_fdw uses cursor for huge result

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. The CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause per column, and the key/value pairs are stored in the catalog.

Because the grammar cannot otherwise distinguish "DEFAULT b_expr" from "OPTIONS ( ... )", the OPTIONS clause of a column must be specified before any constraints or default value.

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might keep statistics about external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server it connects to.
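The lookup-or-connect behaviour described above can be sketched with a small fixed-size cache keyed by server name. Everything here is hypothetical: ToyConn, ToyConnCache, toy_connect() and toy_get_connection() are illustrative stand-ins, and toy_connect() merely counts calls instead of opening a real connection through FdwRoutine.ConnectServer().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CONNS 8

typedef struct {
    char name[64];   /* connection name, same as the foreign server name */
    int  handle;     /* stand-in for a real connection object */
    int  in_use;     /* non-zero once the slot holds a connection */
} ToyConn;

typedef struct {
    ToyConn slots[TOY_MAX_CONNS];
    int     connects;    /* how many times a new connection was opened */
} ToyConnCache;

/* Stand-in for the real connect call: just hands out a new handle number. */
static int toy_connect(ToyConnCache *cache)
{
    return ++cache->connects;
}

/* Return the cached connection for server_name, opening one on a miss.
 * Returns NULL if the cache is full. */
ToyConn *toy_get_connection(ToyConnCache *cache, const char *server_name)
{
    assert(cache != NULL && server_name != NULL);

    ToyConn *free_slot = NULL;
    for (int i = 0; i < TOY_MAX_CONNS; i++) {
        ToyConn *c = &cache->slots[i];
        if (c->in_use && strcmp(c->name, server_name) == 0)
            return c;                        /* cache hit: reuse */
        if (!c->in_use && free_slot == NULL)
            free_slot = c;                   /* remember first empty slot */
    }
    if (free_slot == NULL)
        return NULL;

    strncpy(free_slot->name, server_name, sizeof free_slot->name - 1);
    free_slot->name[sizeof free_slot->name - 1] = '\0';
    free_slot->handle = toy_connect(cache);
    free_slot->in_use = 1;
    return free_slot;
}
```

Two scans against the same server would then share one connection for the lifetime of the cache, while a scan against a different server opens a second one.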

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library. An approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better, because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific, C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in some points:

* Add FdwPlan as container of FDW-specific planning information.
* Add FdwExecutionState as container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API has been defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so the FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide appropriate costs for the scan, estimated in whatever way the FDW chooses.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should get another function which is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with the FdwPlan stored in ForeignScan to initiate the foreign query, unless the execution is for EXPLAIN, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.
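The calling sequence above (begin, iterate until the slot comes back empty, optionally rescan, then end) can be sketched with a toy routine table. ToySlot, ToyScanState, ToyFdwRoutine and the toy_* functions are hypothetical simplifications made up for this example; the real routines operate on TupleTableSlot and executor state rather than on an array of integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { bool empty; int value; } ToySlot;
typedef struct { const int *rows; size_t nrows; size_t next; } ToyScanState;

/* Simplified counterpart of the BeginScan/Iterate/ReScan/EndScan set. */
typedef struct {
    void (*Begin)(ToyScanState *state, const int *rows, size_t nrows);
    void (*Iterate)(ToyScanState *state, ToySlot *slot);
    void (*ReScan)(ToyScanState *state);
    void (*End)(ToyScanState *state);
} ToyFdwRoutine;

static void toy_begin(ToyScanState *s, const int *rows, size_t nrows)
{
    s->rows = rows;
    s->nrows = nrows;
    s->next = 0;
}

/* Store the next row in the slot, or mark the slot empty at end of scan. */
static void toy_iterate(ToyScanState *s, ToySlot *slot)
{
    if (s->next >= s->nrows) {
        slot->empty = true;
        return;
    }
    slot->empty = false;
    slot->value = s->rows[s->next++];
}

static void toy_rescan(ToyScanState *s) { s->next = 0; }

static void toy_end(ToyScanState *s)
{
    s->rows = NULL;
    s->nrows = 0;
    s->next = 0;
}

const ToyFdwRoutine toy_routine = { toy_begin, toy_iterate, toy_rescan, toy_end };

/* Executor-style loop: pull rows until the slot is empty, summing values. */
long toy_scan_sum(const ToyFdwRoutine *fdw, ToyScanState *state)
{
    assert(fdw != NULL && state != NULL);

    long sum = 0;
    ToySlot slot;
    for (;;) {
        fdw->Iterate(state, &slot);
        if (slot.empty)
            break;
        sum += slot.value;
    }
    return sum;
}
```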

ExecInitForeignScan() generates ForeignScanState from ForeignScan and FDW routines use it to manage the status of scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data wrapper specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.

= Foreign data wrappers =
== file_fdw ==
The file_fdw is a foreign-data wrapper implementation included in the PostgreSQL 9.1 distribution as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently, stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should probably not be allowed to create or alter foreign tables which use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw can read the file formats recognized by the COPY command, by using the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it's a legacy feature.

Unlike COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], sharing code and connections.
dblink will be installed optionally, like standard contrib modules.

=== Connection options ===
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as when 'autocommit' is set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unneeded column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
Only conditions whose expression tree in PlanState.qual consists solely of the node types below can be pushed down. Any other condition is left out of the remote query: the remote server returns rows without applying it, and the local server evaluates it on the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING clauses are never included in the query sent to the foreign server.

=== Retrieving result tuples ===
This FDW chooses how to retrieve result tuples based on the estimated number of result rows.

If the estimated row count is below the threshold, a simple SELECT retrieves the whole result at once in the first call of Iterate() after Begin() or ReScan(). Otherwise, an SQL-level cursor is created at that point, and result rows are fetched as they are needed.

Two numbers, the minimum number of rows for which a cursor is used and the number of rows fetched per FETCH call, can be specified as generic options of the SERVER and/or the FOREIGN TABLE. If an option is specified on both objects, the FOREIGN TABLE setting overrides the SERVER setting.

In either case, received tuples have to be copied into a Tuplestorestate to avoid memory leaks on error, because libpq allocates result memory with malloc() rather than palloc(). Further research might reveal another solution.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some of the FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, index access methods use pg_am, which is a table-based approach. Note that FDW routines probably need to be written in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table through a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection individually. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint (not implemented)
: CHECK and/or NOT NULL constraints defined on foreign columns could be evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command can update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION such as '''cost_factor''' would let the user tell the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server can return the result after local JOINs in it.

; Optimize SELECT clause
: Some foreign scan need only a part of columns. Unnecessary columns in such a scan are omissible from the SELECT clause.

; Support internal parameter
: A certain kind of a plan, i.e. nested loop, generates internal parameter to pass value(s) from parent node to child node. The number of records acquired from an foreign server can be decreased by applying an internal parameter to external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL would be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node holding the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T04:18:05Z

Hanada: /* file_fdw */ describe what has been done for file_fdw

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for columns, and the key/value pairs are stored in the catalog.

Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might have statistics of external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to obtain a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
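The lookup-or-connect behaviour described above can be sketched in plain C. This is only an illustration, not the removed implementation: <code>ToyConnection</code>, <code>ToyConnCache</code>, <code>toy_connect_server()</code> and <code>toy_get_connection()</code> are hypothetical names, the cache is a fixed-size array, and the "connection" is just a counter standing in for whatever FdwRoutine.ConnectServer() would return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_CONNS 8
#define MAX_NAME_LEN 64

/* Hypothetical stand-in for an FDW-specific connection object. */
typedef struct ToyConnection
{
    char name[MAX_NAME_LEN];    /* same as the foreign server name */
    int  handle;                /* placeholder for a real connection */
} ToyConnection;

/* Per-backend cache; it lives as long as the backend does. */
typedef struct ToyConnCache
{
    ToyConnection entries[MAX_CACHED_CONNS];
    int used;           /* number of cached connections */
    int connect_calls;  /* how often a real connect was needed */
} ToyConnCache;

/* Hypothetical stand-in for FdwRoutine.ConnectServer(). */
static int
toy_connect_server(ToyConnCache *cache, const char *srvname)
{
    (void) srvname;                 /* a real FDW would connect here */
    return ++cache->connect_calls;  /* fake handle */
}

/* Return the cached connection for srvname, connecting on a cache miss. */
ToyConnection *
toy_get_connection(ToyConnCache *cache, const char *srvname)
{
    ToyConnection *conn;
    int i;

    for (i = 0; i < cache->used; i++)
        if (strcmp(cache->entries[i].name, srvname) == 0)
            return &cache->entries[i];  /* cache hit */

    if (cache->used == MAX_CACHED_CONNS || strlen(srvname) >= MAX_NAME_LEN)
        return NULL;                    /* cache full or name too long */

    conn = &cache->entries[cache->used++];
    strcpy(conn->name, srvname);
    conn->handle = toy_connect_server(cache, srvname);
    return conn;
}

/* Self-check: two scans of the same server share one connection. */
void
toy_conn_cache_demo(void)
{
    ToyConnCache cache;
    ToyConnection *first;
    ToyConnection *second;

    memset(&cache, 0, sizeof cache);
    first = toy_get_connection(&cache, "remote_postgresql_server");
    second = toy_get_connection(&cache, "remote_postgresql_server");
    assert(first != NULL && first == second);
    assert(cache.connect_calls == 1);
}
```

Keying the cache on the server name alone mirrors the naming rule above; a cache that also had to distinguish user mappings would need a wider key.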

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options with the OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but this doesn't seem realistic. Instead, a PostgreSQL-specific and C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the access method (AM) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API has been defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner is responsible for finding the best access path, so the FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way each FDW sees fit.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function that is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to initiate the foreign query, unless the execution is for EXPLAIN, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan has reached the end, the slot will be empty after the Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could properly deparse Sort nodes into an ORDER BY clause and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass foreign-data wrapper specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store it in ForeignScanState.fdw_state->private.
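The call sequence described in this section (begin, iterate until the slot is empty, rescan, end) and the use of a private area can be sketched without any PostgreSQL headers. Everything here is a hypothetical stand-in: <code>ToyRoutine</code>, <code>ToySlot</code> and <code>ToyExecutionState</code> only imitate the shape of the Version 2 structures, and the "foreign table" is an in-memory array of integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for TupleTableSlot: one int per tuple. */
typedef struct ToySlot
{
    bool empty;     /* true once the scan has reached the end */
    int  value;
} ToySlot;

/* Hypothetical stand-in for FdwExecutionState. */
typedef struct ToyExecutionState
{
    void *private_data;     /* wrapper-specific scan state */
} ToyExecutionState;

/* Callback set shaped like the Version 2 FdwRoutine. */
typedef struct ToyRoutine
{
    ToyExecutionState *(*BeginScan)(const int *rows, int nrows);
    void (*Iterate)(ToyExecutionState *state, ToySlot *slot);
    void (*ReScan)(ToyExecutionState *state);
    void (*EndScan)(ToyExecutionState *state);
} ToyRoutine;

/* Private state of this toy wrapper: a cursor over an array. */
typedef struct ArrayScan
{
    const int *rows;
    int nrows;
    int next;
} ArrayScan;

static ToyExecutionState *
array_begin(const int *rows, int nrows)
{
    ToyExecutionState *state = malloc(sizeof *state);
    ArrayScan *scan = malloc(sizeof *scan);

    if (state == NULL || scan == NULL)
    {
        free(state);
        free(scan);
        return NULL;
    }
    scan->rows = rows;
    scan->nrows = nrows;
    scan->next = 0;
    state->private_data = scan;
    return state;
}

static void
array_iterate(ToyExecutionState *state, ToySlot *slot)
{
    ArrayScan *scan = state->private_data;

    slot->empty = (scan->next >= scan->nrows);
    if (!slot->empty)
        slot->value = scan->rows[scan->next++];
}

static void
array_rescan(ToyExecutionState *state)
{
    ((ArrayScan *) state->private_data)->next = 0;
}

static void
array_end(ToyExecutionState *state)
{
    free(state->private_data);
    free(state);
}

const ToyRoutine array_fdw_routine = {
    array_begin, array_iterate, array_rescan, array_end
};

/* Drive two passes the way an executor node might; returns tuples seen, or -1. */
int
toy_scan_twice(const ToyRoutine *routine, const int *rows, int nrows)
{
    ToyExecutionState *state = routine->BeginScan(rows, nrows);
    ToySlot slot;
    int seen = 0;
    int pass;

    if (state == NULL)
        return -1;
    for (pass = 0; pass < 2; pass++)
    {
        for (routine->Iterate(state, &slot); !slot.empty;
             routine->Iterate(state, &slot))
            seen++;
        routine->ReScan(state);     /* rewind for the next pass */
    }
    routine->EndScan(state);
    assert(seen % 2 == 0);          /* both passes saw the same tuples */
    return seen;
}
```

The caller only sees the opaque state pointer; the array cursor stays private to the wrapper, which is the point of the private area.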

= Foreign data wrappers =
== file_fdw ==
file_fdw is a foreign-data wrapper implementation included in the distribution of PostgreSQL 9.1 as a contrib module. It can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command.
Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should probably not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw recognizes the file formats supported by the COPY command, by using the exported COPY FROM routines.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it is a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares code and connections with it.
dblink is installed optionally, like the standard contrib modules.

=== Connection options ===
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that non-superusers MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication, for security reasons.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".
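A minimal sketch of that optimization: build the remote SELECT list and emit NULL in place of every column the local plan does not need, so the row shape stays the same while less data is transferred. <code>toy_deparse_select()</code> and its arguments are hypothetical, and the sketch assumes simple identifiers that need no quoting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Append text to buf if it still fits; buf stays NUL-terminated. */
static bool
toy_append(char *buf, size_t buflen, size_t *len, const char *text)
{
    size_t n = strlen(text);

    if (*len + n >= buflen)
        return false;
    memcpy(buf + *len, text, n + 1);
    *len += n;
    return true;
}

/*
 * Write "SELECT c1, NULL, c3 FROM rel" into buf, keeping one output
 * column per table column so positions still match the local tuple
 * descriptor. Returns false if there are no columns or buf is too small.
 */
bool
toy_deparse_select(char *buf, size_t buflen, const char *relname,
                   const char *const *colnames, const bool *needed,
                   int ncols)
{
    size_t len = 0;
    int i;

    if (ncols <= 0 || !toy_append(buf, buflen, &len, "SELECT "))
        return false;
    for (i = 0; i < ncols; i++)
    {
        if (i > 0 && !toy_append(buf, buflen, &len, ", "))
            return false;
        if (!toy_append(buf, buflen, &len, needed[i] ? colnames[i] : "NULL"))
            return false;
    }
    return toy_append(buf, buflen, &len, " FROM ") &&
           toy_append(buf, buflen, &len, relname);
}

/* Self-check with a three-column table where only two columns are used. */
void
toy_deparse_demo(void)
{
    const char *const cols[] = {"id", "payload", "created"};
    const bool needed[] = {true, false, true};
    char sql[128];

    assert(toy_deparse_select(sql, sizeof sql, "remote_tbl", cols, needed, 3));
    assert(strcmp(sql, "SELECT id, NULL, created FROM remote_tbl") == 0);
}
```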

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist only of the following node types. Conditions that contain other node types are not sent; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

None of ORDER BY, LIMIT, OFFSET, GROUP BY or HAVING is used in a foreign query.
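The push-down rule in the table amounts to a recursive whitelist check over the expression tree. The following sketch uses a toy node type rather than the real PostgreSQL Node structures: <code>ToyExpr</code>, <code>ToyTag</code> and <code>toy_is_pushable()</code> are hypothetical, only some of the listed node types are modelled, and immutability is a precomputed flag instead of a catalog lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A subset of the node types from the table, plus one that is not listed. */
typedef enum ToyTag
{
    TOY_CONST,
    TOY_VAR,
    TOY_PARAM,
    TOY_BOOLEXPR,
    TOY_NULLTEST,
    TOY_OPEXPR,
    TOY_FUNCEXPR,
    TOY_SUBLINK     /* example of a node type that must stay local */
} ToyTag;

typedef struct ToyExpr
{
    ToyTag tag;
    bool   external;    /* TOY_PARAM: is it an external parameter? */
    bool   immutable;   /* TOY_OPEXPR/TOY_FUNCEXPR: is the function immutable? */
    const struct ToyExpr *args[2];  /* child expressions, NULL if unused */
} ToyExpr;

/* True if the whole expression may be sent to the foreign server. */
bool
toy_is_pushable(const ToyExpr *expr)
{
    if (expr == NULL)
        return true;    /* unused argument slot */

    switch (expr->tag)
    {
        case TOY_CONST:
        case TOY_VAR:
            return true;
        case TOY_PARAM:
            return expr->external;
        case TOY_OPEXPR:
        case TOY_FUNCEXPR:
            if (!expr->immutable)
                return false;
            break;      /* still need to check the arguments */
        case TOY_BOOLEXPR:
        case TOY_NULLTEST:
            break;
        default:
            return false;   /* anything not in the table stays local */
    }
    return toy_is_pushable(expr->args[0]) && toy_is_pushable(expr->args[1]);
}

/* Self-check: "col = 1" is pushable, a non-immutable call is not. */
void
toy_pushdown_demo(void)
{
    ToyExpr col = { TOY_VAR, false, false, { NULL, NULL } };
    ToyExpr one = { TOY_CONST, false, false, { NULL, NULL } };
    ToyExpr eq = { TOY_OPEXPR, false, true, { &col, &one } };
    ToyExpr rnd = { TOY_FUNCEXPR, false, false, { NULL, NULL } };

    assert(toy_is_pushable(&eq));
    assert(!toy_is_pushable(&rnd));
}
```

A single unsafe leaf makes the enclosing condition non-pushable, which matches the fallback described above: that condition is then evaluated locally on the returned rows.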

=== Retrieving all tuples at once ===
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. More research might be needed to avoid the copy.
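The memory trade-off of a cursor-based approach can be illustrated with a toy model. This sketch does not use libpq: <code>toy_fetch()</code> is a hypothetical stand-in for fetching the next batch of rows from a remote cursor, row ''i'' simply carries the value ''i'', and the statistics show how the batch size bounds the number of rows buffered locally at the cost of more round trips.

```c
#include <assert.h>

#define TOY_MAX_BATCH 64

/* Hypothetical remote cursor over total_rows rows. */
typedef struct ToyCursor
{
    int total_rows;
    int next_row;
} ToyCursor;

typedef struct ToyScanStats
{
    long sum;           /* something computed from every row */
    int  round_trips;   /* number of fetch requests sent */
    int  peak_buffered; /* most rows held in local memory at once */
} ToyScanStats;

/* Stand-in for fetching up to batch_size rows from the cursor. */
static int
toy_fetch(ToyCursor *cur, int *buf, int batch_size)
{
    int n = 0;

    while (n < batch_size && cur->next_row < cur->total_rows)
        buf[n++] = cur->next_row++;
    return n;
}

/* Scan the whole result in batches and report the cost of doing so. */
ToyScanStats
toy_scan_with_cursor(int total_rows, int batch_size)
{
    ToyCursor cur = { total_rows, 0 };
    ToyScanStats stats = { 0, 0, 0 };
    int buf[TOY_MAX_BATCH];
    int n;
    int i;

    if (batch_size < 1)
        batch_size = 1;
    if (batch_size > TOY_MAX_BATCH)
        batch_size = TOY_MAX_BATCH;

    do
    {
        n = toy_fetch(&cur, buf, batch_size);
        stats.round_trips++;
        if (n > stats.peak_buffered)
            stats.peak_buffered = n;
        for (i = 0; i < n; i++)
            stats.sum += buf[i];
    } while (n == batch_size);  /* a short batch means the cursor is exhausted */

    assert(stats.peak_buffered <= batch_size);
    return stats;
}
```

With ten rows, a batch size of 4 needs three round trips but never buffers more than four rows, while a batch size of 64 behaves like the current fetch-everything approach: one round trip with all ten rows in memory.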

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handling WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions in a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine the USER MAPPING?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint (not implemented)
: CHECK and NOT NULL constraints defined on foreign columns could be evaluated when the actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table must supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command can update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if an FDW routine Analyze(tableoid or tablename) is added that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A generic option such as '''cost_factor''' would let users tell the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL would be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node holding the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T03:16:30Z

Hanada: /* Active Work In Progress */ clarify that each work item is done or undone

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
== Per-column FDW option ==
Like other kinds of FDW objects, a column of a foreign table can have FDW options. This means that the CREATE/ALTER FOREIGN TABLE syntax accepts an OPTIONS clause for columns, and the key/value pairs are stored in the catalog.

Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute needs a new column, attfdwoptions, of type text[].

== Table partitioning ==
Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].
== Smart planning ==
* We might have statistics of external data. The ANALYZE command would need a hook to delegate row sampling to each FDW.
* set_foreign_size_estimates() has to be enhanced to reflect actual statistics.
== JOIN push down ==
Performing one or more JOINs on the remote side would reduce the amount of data transferred from the external server.
== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to obtain a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Finished works =
== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library. An approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better, because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ NOT NULL ],
...
)
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables can have generic options, specified with the OPTIONS syntax.

In the first version, column DEFAULT values and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling. That does not seem realistic. Instead, a routine set specific to PostgreSQL and the C language would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. It would be harder with other approaches, such as the AM (access method) or storage manager (smgr) layers, to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several points:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Leave connection management to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook that allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan needs copyObject() support for plan caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

=== Version 3 ===
Finally, the FDW API was defined in PostgreSQL 9.1 as below:
typedef FdwPlan *(*PlanForeignScan_function) (Oid foreigntableid,
PlannerInfo *root,
RelOptInfo *baserel);

typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);

typedef void (*BeginForeignScan_function) (ForeignScanState *node,
int eflags);

typedef TupleTableSlot *(*IterateForeignScan_function) (ForeignScanState *node);

typedef void (*ReScanForeignScan_function) (ForeignScanState *node);

typedef void (*EndForeignScan_function) (ForeignScanState *node);

typedef struct FdwRoutine
{
NodeTag type;

PlanForeignScan_function PlanForeignScan;
ExplainForeignScan_function ExplainForeignScan;
BeginForeignScan_function BeginForeignScan;
IterateForeignScan_function IterateForeignScan;
ReScanForeignScan_function ReScanForeignScan;
EndForeignScan_function EndForeignScan;
} FdwRoutine;

In future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it cannot be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server ID and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner and executor modules. New ForeignPath and ForeignScan nodes are introduced for this purpose.

=== Planner ===
The planner is responsible for finding the best access path, so the FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the relevant FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide appropriate costs for the scan, estimated in whatever way the FDW chooses.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage, we do not provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should gain another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
bool fsSystemCol;
struct FdwPlan *fdwplan;
} ForeignScan;

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to initiate the foreign query, unless the execution is only for EXPLAIN, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.
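
The call sequence above can be illustrated with a toy wrapper driven by a small loop. This is only a sketch under simplifying assumptions: TupleTableSlot, FdwPlan and FdwExecutionState are reduced to minimal stand-ins, BeginScan() omits its ParamListInfo argument, the private field is named private_data, and the counter_* functions and run_scan() are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins for the real PostgreSQL types (assumptions). */
typedef struct TupleTableSlot { bool empty; int value; } TupleTableSlot;
typedef struct FdwPlan { int nrows; } FdwPlan;
typedef struct FdwExecutionState { void *private_data; } FdwExecutionState;

typedef struct FdwRoutine
{
    FdwExecutionState *(*BeginScan)(FdwPlan *plan);
    void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
    void (*ReScan)(FdwExecutionState *state);
    void (*EndScan)(FdwExecutionState *state);
} FdwRoutine;

/* Private state of a toy FDW that "returns" the integers 0 .. nrows-1. */
typedef struct CounterState { int next; int nrows; } CounterState;

static FdwExecutionState *
counter_begin(FdwPlan *plan)
{
    FdwExecutionState *state = malloc(sizeof(*state));
    CounterState *cs = malloc(sizeof(*cs));

    assert(state != NULL && cs != NULL);
    cs->next = 0;
    cs->nrows = plan->nrows;
    state->private_data = cs;
    return state;
}

static void
counter_iterate(FdwExecutionState *state, TupleTableSlot *slot)
{
    CounterState *cs = state->private_data;

    /* An empty slot tells the caller that the scan has ended. */
    slot->empty = (cs->next >= cs->nrows);
    if (!slot->empty)
        slot->value = cs->next++;
}

static void
counter_rescan(FdwExecutionState *state)
{
    ((CounterState *) state->private_data)->next = 0;
}

static void
counter_end(FdwExecutionState *state)
{
    free(state->private_data);
    free(state);
}

static const FdwRoutine counter_fdw = {
    counter_begin, counter_iterate, counter_rescan, counter_end
};

/* Drive one scan in the order the executor would; return the sum of values. */
static int
run_scan(const FdwRoutine *fdw, FdwPlan *plan, bool rescan_midway)
{
    FdwExecutionState *state = fdw->BeginScan(plan);    /* init step */
    TupleTableSlot slot;
    int sum = 0;

    if (rescan_midway)
    {
        fdw->Iterate(state, &slot);     /* consume one tuple ... */
        fdw->ReScan(state);             /* ... then restart the scan */
    }
    for (;;)
    {
        fdw->Iterate(state, &slot);     /* fetch the next tuple */
        if (slot.empty)
            break;
        sum += slot.value;
    }
    fdw->EndScan(state);                /* release FDW resources */
    return sum;
}
```

Because ReScan() resets the private counter, a scan restarted midway yields the same tuples as a fresh one.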

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
ScanState ss;
struct FdwRoutine *fdwroutine;
void *fdw_state;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fdw_state->private.

= Foreign data wrappers =
== file_fdw ==
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should perhaps not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw uses the file formats recognized by the COPY command, so exporting the COPY FROM routines would help in implementing file_fdw.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, except that ''oids'' is not supported by file_fdw because it is a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares code and connections with it.
dblink will be installed optionally, like other standard contrib modules.

=== Connection options ===
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unneeded columns with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: the PlanState.qual must consist only of the following node types. Conditions that contain other node types are not sent; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}
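
The restriction in the table amounts to a recursive check over the condition tree. The sketch below shows the idea on a deliberately simplified expression type; the NodeKind enum, the Expr struct and is_pushable() are assumptions made for illustration and do not correspond to the real planner node definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified expression node; the real planner nodes are far richer. */
typedef enum NodeKind
{
    K_CONST, K_VAR, K_BOOL_EXPR, K_NULL_TEST, K_OP_EXPR, K_FUNC_EXPR,
    K_SUBLINK                   /* example of a node type not in the table */
} NodeKind;

typedef struct Expr
{
    NodeKind kind;
    bool immutable;             /* meaningful for K_OP_EXPR and K_FUNC_EXPR */
    const struct Expr *args[2]; /* child expressions, NULL when unused */
} Expr;

/* True if the whole expression tree may be sent to the remote server. */
static bool
is_pushable(const Expr *expr)
{
    if (expr == NULL)
        return true;            /* empty child slot */

    switch (expr->kind)
    {
        case K_CONST:
        case K_VAR:
        case K_BOOL_EXPR:
        case K_NULL_TEST:
            break;
        case K_OP_EXPR:
        case K_FUNC_EXPR:
            if (!expr->immutable)
                return false;   /* only IMMUTABLE functions are allowed */
            break;
        default:
            return false;       /* unsupported node type: evaluate locally */
    }
    return is_pushable(expr->args[0]) && is_pushable(expr->args[1]);
}
```

A condition is pushed down only if every node in its tree passes the check; anything else stays on the local server.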

None of ORDER BY, LIMIT, OFFSET, GROUP BY or HAVING is used in a foreign query.

=== Retrieving all tuples at once ===
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead, to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. Further research might show how to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING related to the owner of the function. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint (not implemented)
: CHECK and/or NOT NULL constraints defined on foreign columns can be evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command can update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename), which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow the overhead to be communicated to the planner.
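
One possible reading of '''cost_factor''' is a multiplier applied to the run cost that the planner would estimate for the same data stored locally. The sketch below assumes exactly that interpretation; the ScanCost struct and foreign_scan_cost() are hypothetical names, and a real FDW might well scale its costs differently.

```c
#include <assert.h>

/* Costs in the same units the planner uses for local scans. */
typedef struct ScanCost
{
    double startup_cost;
    double total_cost;
} ScanCost;

/*
 * Hypothetical cost model: keep the startup cost as estimated for local
 * data and scale only the run cost (total minus startup) by cost_factor.
 */
static ScanCost
foreign_scan_cost(double local_startup, double local_total, double cost_factor)
{
    ScanCost cost;
    double run_cost = local_total - local_startup;

    cost.startup_cost = local_startup;
    cost.total_cost = local_startup + run_cost * cost_factor;
    return cost;
}
```

With a factor of 1.0 the foreign scan costs the same as a local one, so the option only ever expresses the relative overhead of going through the wrapper.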

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan, so that the foreign server performs the JOIN itself and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only some of the columns. Unneeded columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Certain kinds of plan, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only some of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL would be able to retrieve part of the result at a time to improve response.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T02:45:29Z

Hanada: /* Built-in foreign data wrappers */ move to upper level

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features were merged in PostgreSQL 9.1alpha4:
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library. An approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better, because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The parent table is partitioned into multiple foreign tables, each connected to a different foreign server. This can be used much like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and columns of foreign tables can have generic options, specified with the OPTIONS syntax. Because the grammar is ambiguous between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, the NOT NULL constraint, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling. That does not seem realistic. Instead, a routine set specific to PostgreSQL and the C language would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. It would be harder with other approaches, such as the AM (access method) or storage manager (smgr) layers, to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several points:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Leave connection management to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook that allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan needs copyObject() support for plan caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it cannot be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server ID and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column, attgenoptions, of type text[].

In the first version, the syntax for defining column-level generic options is omitted.

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner and executor modules. New ForeignPath and ForeignScan nodes are introduced for this purpose.

=== Planner ===
The planner is responsible for finding the best access path, so the FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the relevant FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide appropriate costs for the scan, estimated in whatever way the FDW chooses.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage, we do not provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should gain another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
FdwPlan *fplan;
} ForeignScan;
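
As a rough illustration of what a PlanRelScan() implementation might compute, the sketch below fills the two cost fields of an FdwPlan. The trimmed FdwPlan, the RemoteEstimate struct and estimate_foreign_scan() are hypothetical stand-ins written for this example; they are not part of the proposed API, and the cost formula is only an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed stand-in for the FdwPlan described on this page. */
typedef struct FdwPlan
{
    const char *explainInfo;    /* text an FDW might show in EXPLAIN VERBOSE */
    double      startup_cost;
    double      total_cost;
} FdwPlan;

/* Hypothetical inputs an FDW could derive from its own statistics. */
typedef struct RemoteEstimate
{
    double      connect_cost;   /* one-off cost before the first row arrives */
    double      rows;           /* expected number of rows */
    double      per_row_cost;   /* cost to fetch and convert one row */
} RemoteEstimate;

/* Fill the cost fields the planner needs to compare paths. */
static void
estimate_foreign_scan(const RemoteEstimate *est, FdwPlan *plan)
{
    assert(est != NULL && plan != NULL);

    plan->explainInfo = "remote scan (sketch)";
    plan->startup_cost = est->connect_cost;
    plan->total_cost = est->connect_cost + est->rows * est->per_row_cost;
}
```

A real FDW would derive these numbers from whatever it knows about the remote source; the point here is only that startup and total costs are reported separately.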

=== Executor ===
The executor runs ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Unless the execution is for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to start the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via a TupleTableSlot.
:If the scan has reached the end, the slot will be empty after the Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:MarkPos() and RestrPos() are currently not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are left out because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
FdwExecutionState *fstate;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass FDW-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fstate->private.
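
The call sequence above can be sketched with a toy wrapper that scans an in-memory array. Everything here is a simplified stand-in made up for illustration: the TupleTableSlot holds a single int, BeginScan takes the array directly instead of an FdwPlan, and the array_* functions and run_scan() are hypothetical names. Only the shape (begin, iterate until the slot is empty, rescan, end, with state kept in the private area) follows the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the types named on this page. */
typedef struct FdwExecutionState { void *private; } FdwExecutionState;
typedef struct TupleTableSlot { bool empty; int value; } TupleTableSlot;

/* FDW-private scan state, stored in FdwExecutionState.private. */
typedef struct ArrayScan
{
    const int  *rows;
    size_t      nrows;
    size_t      next;
} ArrayScan;

/* BeginScan: allocate the execution state and the private area. */
static FdwExecutionState *
array_begin_scan(const int *rows, size_t nrows)
{
    FdwExecutionState *state = malloc(sizeof(*state));
    ArrayScan  *scan = malloc(sizeof(*scan));

    if (state == NULL || scan == NULL)
    {
        free(state);
        free(scan);
        return NULL;
    }
    scan->rows = rows;
    scan->nrows = nrows;
    scan->next = 0;
    state->private = scan;
    return state;
}

/* Iterate: store the next row in the slot, or mark the slot empty at the end. */
static void
array_iterate(FdwExecutionState *state, TupleTableSlot *slot)
{
    ArrayScan  *scan = state->private;

    if (scan->next >= scan->nrows)
    {
        slot->empty = true;
        return;
    }
    slot->empty = false;
    slot->value = scan->rows[scan->next++];
}

/* ReScan: restart from the first row. */
static void
array_rescan(FdwExecutionState *state)
{
    ArrayScan  *scan = state->private;

    scan->next = 0;
}

/* EndScan: release everything BeginScan allocated. */
static void
array_end_scan(FdwExecutionState *state)
{
    free(state->private);
    free(state);
}

/* Plays the role of the executor: pull rows until the slot comes back empty. */
static long
run_scan(FdwExecutionState *state)
{
    TupleTableSlot slot;
    long        sum = 0;

    for (;;)
    {
        array_iterate(state, &slot);
        if (slot.empty)
            break;
        sum += slot.value;
    }
    return sum;
}
```

Calling run_scan() a second time without array_rescan() yields no rows, which mirrors why the executor needs a separate ReScan entry point.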

== Connection caching ==
Connection caching is currently not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
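
A minimal sketch of such a cache, keyed by server name, might look like the following. The fixed-size array, the ConnCache struct, get_connection() and fake_connect_server() are all assumptions made for this example; in particular fake_connect_server() only stands in for FdwRoutine.ConnectServer() and opens nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_CONNS 8
#define MAX_NAME_LEN     64

/* Simplified stand-in: a connection is a name plus an opaque handle. */
typedef struct FSConnection
{
    char        name[MAX_NAME_LEN];     /* same as the foreign server name */
    int         handle;
} FSConnection;

typedef struct ConnCache
{
    FSConnection entries[MAX_CACHED_CONNS];
    size_t      used;
    int         connects;       /* how many new connections were opened */
} ConnCache;

/* Hypothetical stand-in for FdwRoutine.ConnectServer(): returns a fake handle. */
static int
fake_connect_server(const char *srvname)
{
    static int  next_handle = 100;

    (void) srvname;
    return next_handle++;
}

/* Return the cached connection for srvname, connecting only on a cache miss. */
static FSConnection *
get_connection(ConnCache *cache, const char *srvname)
{
    size_t      i;
    FSConnection *conn;

    for (i = 0; i < cache->used; i++)
    {
        if (strcmp(cache->entries[i].name, srvname) == 0)
            return &cache->entries[i];
    }

    /* Cache full or name too long: give up rather than overflow. */
    if (cache->used == MAX_CACHED_CONNS || strlen(srvname) >= MAX_NAME_LEN)
        return NULL;

    conn = &cache->entries[cache->used++];
    strcpy(conn->name, srvname);
    conn->handle = fake_connect_server(srvname);
    cache->connects++;
    return conn;
}
```

Two scans against the same server then share one entry, while a scan against a different server triggers a second connect.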

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

= Foreign data wrappers =
== file_fdw ==
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

stdin, although allowed in COPY FROM, is currently not supported.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create or alter foreign tables that use file_fdw, at least by default.

=== using COPY FROM routines ===
file_fdw uses the file formats recognized by the COPY command, so exporting the COPY FROM routines would help in implementing file_fdw.

=== generic options ===
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it's a legacy feature.

Unlike COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

== PostgreSQL ==
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like other standard contrib modules.

=== Connection options ===
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

=== No transaction management ===
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

=== WHERE-clause push-down ===
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unneeded column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: the PlanState.qual must consist only of the following node types. Conditions that contain other node types are not sent to the remote server; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.
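
The push-down restrictions amount to a recursive check over the expression tree. The sketch below uses a made-up, heavily simplified Expr node (at most two arguments, a flag for immutability, a flag for external parameters) and covers only some of the tags in the table; T_Other and is_pushable() are hypothetical names, and the real node structures in PostgreSQL look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A subset of the tags from the table, plus T_Other for anything else. */
typedef enum NodeTag
{
    T_Const,
    T_Var,
    T_Param,
    T_BoolExpr,
    T_NullTest,
    T_OpExpr,
    T_FuncExpr,
    T_Other
} NodeTag;

/* Simplified expression node; real PostgreSQL nodes are richer. */
typedef struct Expr
{
    NodeTag     tag;
    bool        immutable;      /* OpExpr/FuncExpr: underlying function is IMMUTABLE */
    bool        external;       /* Param: paramkind is PARAM_EXTERNAL */
    const struct Expr *args[2]; /* child expressions, NULL when unused */
} Expr;

/* True if the whole tree may be sent to the foreign server. */
static bool
is_pushable(const Expr *e)
{
    size_t      i;

    if (e == NULL)
        return true;

    switch (e->tag)
    {
        case T_Const:
        case T_Var:
        case T_BoolExpr:
        case T_NullTest:
            break;
        case T_Param:
            if (!e->external)
                return false;
            break;
        case T_OpExpr:
        case T_FuncExpr:
            if (!e->immutable)
                return false;
            break;
        default:
            return false;       /* unknown node type: evaluate locally */
    }

    /* One non-pushable child makes the whole condition non-pushable. */
    for (i = 0; i < 2; i++)
    {
        if (!is_pushable(e->args[i]))
            return false;
    }
    return true;
}
```

A condition that fails this check is simply kept on the local side, which matches the fallback behaviour described above.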

=== Retrieving all tuples at once ===
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq allocates memory with malloc() rather than palloc(). More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions (SRFs). We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: An FDW requires a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we also have pg_am for index access methods, which is a table-based approach. Note that FDW routines probably need to be written in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing pg_class fields, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate for determining the USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly while a function created with the SECURITY DEFINER option is executing. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage individual foreign connections. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign table? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint (not implemented)
: CHECK and/or NOT NULL constraints defined on foreign columns can be evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION such as '''cost_factor''' would allow this overhead to be communicated to the planner.

== for SQL-based FDWs ==
; JOINs of two foreign tables on the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN locally and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unneeded columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying internal parameters to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node that holds the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}
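
An FDW that reports these conditions needs some way to map a five-character SQLSTATE to its meaning. The lookup below is only a sketch: it copies a handful of rows from the table above, and FdwErrorCode, fdw_error_meaning() and is_fdw_class() are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct FdwErrorCode
{
    const char *sqlstate;
    const char *meaning;
} FdwErrorCode;

/* A few rows copied from the table above; not the complete list. */
static const FdwErrorCode fdw_error_codes[] = {
    {"HV000", "FDW-specific condition"},
    {"HV001", "MEMORY ALLOCATION ERROR"},
    {"HV00N", "UNABLE TO ESTABLISH CONNECTION"},
    {"HV00R", "TABLE NOT FOUND"},
};

/* Return the meaning for a known code, or NULL if it is not in the table. */
static const char *
fdw_error_meaning(const char *sqlstate)
{
    size_t      i;
    size_t      n = sizeof(fdw_error_codes) / sizeof(fdw_error_codes[0]);

    for (i = 0; i < n; i++)
    {
        if (strcmp(fdw_error_codes[i].sqlstate, sqlstate) == 0)
            return fdw_error_codes[i].meaning;
    }
    return NULL;
}

/* The first two characters of a SQLSTATE are its class; the FDW class is "HV". */
static bool
is_fdw_class(const char *sqlstate)
{
    return strncmp(sqlstate, "HV", 2) == 0;
}
```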

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T02:41:48Z

Hanada: /* Active Work In Progress */ remove WIP code disclosure

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features were merged in PostgreSQL 9.1alpha4:
*Make foreign data wrappers functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for accessing them should be the same as for normal local tables.

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The parent table is partitioned into multiple foreign tables, each connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and their columns can have generic options, specified with the OPTIONS syntax. Because of a syntax ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, NOT NULL constraints, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook that allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type, 'fdw_handler', is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;
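
The handler rule can be pictured with a small sketch in which a NULL function pointer plays the role of InvalidOid in fdwhandler. The trimmed FdwRoutine, the FdwCatalogEntry struct and the helper functions are hypothetical and exist only for this illustration; the real catalog stores a pg_proc OID, not a C pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Trimmed routine set: one callback is enough to show the idea. */
typedef struct FdwRoutine
{
    int         (*Iterate) (void);
} FdwRoutine;

/* A handler takes no arguments and returns the routine set. */
typedef FdwRoutine *(*fdw_handler) (void);

/* Stand-in for a pg_foreign_data_wrapper row; NULL models InvalidOid. */
typedef struct FdwCatalogEntry
{
    const char *fdwname;
    fdw_handler fdwhandler;
} FdwCatalogEntry;

static int
demo_iterate(void)
{
    return 42;
}

/* Example handler: hands out a statically allocated routine set. */
static FdwRoutine *
demo_fdw_handler(void)
{
    static FdwRoutine routine = { demo_iterate };

    return &routine;
}

/* A wrapper without a handler can hold options but not back foreign tables. */
static bool
can_define_foreign_table(const FdwCatalogEntry *fdw)
{
    return fdw->fdwhandler != NULL;
}

/* Resolve the routine set, or NULL when the wrapper has no handler. */
static FdwRoutine *
get_fdw_routine(const FdwCatalogEntry *fdw)
{
    return fdw->fdwhandler != NULL ? fdw->fdwhandler() : NULL;
}
```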

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, which stores the foreign server ID and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column, attgenoptions, of type text[].

In the first version, the syntax for defining column-level generic options is omitted.

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner and executor modules. New ForeignPath and ForeignScan nodes are added for this purpose.

=== Planner ===
The planner is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() from the relevant FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should return costs for the scan, estimated in whatever way the FDW sees fit.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage we don't provide a common mechanism for storing statistics. Once such a mechanism has been implemented, FdwRoutine should gain another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
FdwPlan *fplan;
} ForeignScan;

=== Executor ===
The executor runs ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Unless the execution is for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to start the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via a TupleTableSlot.
:If the scan has reached the end, the slot will be empty after the Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:MarkPos() and RestrPos() are currently not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are left out because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
FdwExecutionState *fstate;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass FDW-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fstate->private.

== Connection caching ==
Connection caching is currently not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

stdin, although allowed in COPY FROM, is currently not supported.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create or alter foreign tables that use file_fdw, at least by default.

==== using COPY FROM routines ====
file_fdw uses the file formats recognized by the COPY command, so exporting the COPY FROM routines would help in implementing file_fdw.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it's a legacy feature.

Unlike COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like other standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unneeded column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: the PlanState.qual must consist only of the following node types. Conditions that contain other node types are not sent to the remote server; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq allocates memory with malloc() rather than palloc(). More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions (SRFs). We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect access to a foreign table through a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> disconnects all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint (not implemented)
: CHECK and/or NOT NULL constraints defined on foreign columns can be evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow telling the planner about the overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only some of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. nested loop, generate internal parameters to pass value(s) from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only some of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-08-05T02:33:57Z

Hanada: add to category PostgreSQL 9.2

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will introduce powerful new features into PostgreSQL over time.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged in PostgreSQL 9.1Alpha4.
*Make foreign data wrapper functional
*Support FOREIGN TABLEs
contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that data can be retrieved from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_syntax''' branch contains syntax of SQL/MED
* '''fdw_scan''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library. An approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better here, because we already have pg_proc as a function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and columns of foreign tables can have generic options with the OPTIONS syntax. Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, NOT NULL constraints, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. It would be harder with other approaches, such as the AM (access method) or storage manager (smgr) layers, to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container of FDW-specific planning information.
* Add FdwExecutionState as a container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need a connection.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan needs copyObject() support for plan caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

In the first version, the syntax for defining column-level generic options is omitted.

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The Planner module is responsible for finding the best access path, so an FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way each FDW chooses.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
FdwPlan *fplan;
} ForeignScan;

=== Executor ===
The Executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Unless the execution is for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to initiate the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() for ForeignScan are not supported, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
FdwExecutionState *fstate;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data-wrapper-specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store it in ForeignScanState.fstate->private.
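
The call sequence described in this section can be sketched outside the server as follows. This is only an illustration under stated assumptions: the <code>Sketch*</code> types are simplified stand-ins for FdwRoutine, FdwExecutionState and TupleTableSlot, the "counter" wrapper is hypothetical, and BeginScan() takes a row count here instead of the real (FdwPlan *, ParamListInfo) arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for TupleTableSlot: empty == true signals the end of the scan. */
typedef struct SketchSlot { bool empty; int value; } SketchSlot;

/* Stand-in for FdwExecutionState: only the FDW-private pointer. */
typedef struct SketchExecState { void *private_data; } SketchExecState;

/* Stand-in for the Version 2 FdwRoutine (planner callbacks omitted). */
typedef struct SketchRoutine {
    SketchExecState *(*BeginScan)(int nrows);
    void (*Iterate)(SketchExecState *state, SketchSlot *slot);
    void (*ReScan)(SketchExecState *state);
    void (*EndScan)(SketchExecState *state);
} SketchRoutine;

/* Private state of a hypothetical wrapper that returns 1..nrows. */
typedef struct CounterPrivate { int next; int nrows; } CounterPrivate;

static SketchExecState *counter_begin(int nrows)
{
    SketchExecState *state = malloc(sizeof(*state));
    CounterPrivate *priv = malloc(sizeof(*priv));

    assert(state != NULL && priv != NULL);
    priv->next = 1;
    priv->nrows = nrows;
    state->private_data = priv;
    return state;
}

static void counter_iterate(SketchExecState *state, SketchSlot *slot)
{
    CounterPrivate *priv = state->private_data;

    if (priv->next > priv->nrows) {
        slot->empty = true;           /* end of scan: leave the slot empty */
        return;
    }
    slot->empty = false;
    slot->value = priv->next++;
}

static void counter_rescan(SketchExecState *state)
{
    ((CounterPrivate *) state->private_data)->next = 1;
}

static void counter_end(SketchExecState *state)
{
    free(state->private_data);
    free(state);
}

const SketchRoutine sketch_counter_routine = {
    counter_begin, counter_iterate, counter_rescan, counter_end
};

/* Drive a scan the way the executor entry points above would:
 * BeginScan once, Iterate until the slot is empty, optionally ReScan
 * once, then EndScan. Returns the sum of all values seen. */
int sketch_run_scan(const SketchRoutine *routine, int nrows, bool rescan_once)
{
    SketchExecState *state = routine->BeginScan(nrows);
    SketchSlot slot = { true, 0 };
    int sum = 0;

    for (;;) {
        routine->Iterate(state, &slot);
        if (slot.empty) {
            if (!rescan_once)
                break;
            rescan_once = false;
            routine->ReScan(state);   /* restart from the first tuple */
            continue;
        }
        sum += slot.value;
    }
    routine->EndScan(state);
    return sum;
}
```

Keeping all wrapper-specific state behind the private pointer is what lets the driver loop stay identical for every wrapper.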

== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server which the connection uses.
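
A minimal sketch of such a name-keyed cache follows. It is not the removed implementation: the fixed-size array, the <code>Sketch*</code> names and the integer handle (standing in for an FSConnection pointer) are all assumptions made for illustration, and the connect callback plays the role of FdwRoutine.ConnectServer().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_CONNS 8
#define SKETCH_NAME_LEN  64

typedef struct SketchConn {
    char name[SKETCH_NAME_LEN];   /* same as the foreign server name */
    int  handle;                  /* stand-in for an FSConnection pointer */
    int  in_use;
} SketchConn;

typedef struct SketchConnCache {
    SketchConn slots[SKETCH_MAX_CONNS];
    int        connects;          /* how many times the callback ran */
} SketchConnCache;

/* Plays the role of FdwRoutine.ConnectServer() in this sketch. */
typedef int (*SketchConnectFn)(const char *server_name);

/* Return the cached handle for server_name, connecting on a cache miss.
 * Returns -1 when the cache has no free slot. */
int sketch_get_connection(SketchConnCache *cache, const char *server_name,
                          SketchConnectFn connect_fn)
{
    SketchConn *free_slot = NULL;

    assert(cache != NULL && server_name != NULL && connect_fn != NULL);
    assert(strlen(server_name) < SKETCH_NAME_LEN);

    for (int i = 0; i < SKETCH_MAX_CONNS; i++) {
        SketchConn *slot = &cache->slots[i];

        if (slot->in_use && strcmp(slot->name, server_name) == 0)
            return slot->handle;              /* cache hit: reuse */
        if (!slot->in_use && free_slot == NULL)
            free_slot = slot;                 /* remember first free slot */
    }
    if (free_slot == NULL)
        return -1;

    free_slot->handle = connect_fn(server_name);
    strcpy(free_slot->name, server_name);     /* length checked above */
    free_slot->in_use = 1;
    cache->connects++;
    return free_slot->handle;
}

/* Hypothetical connect callback: derives a fake handle from the name. */
int sketch_fake_connect(const char *server_name)
{
    return 100 + (int) strlen(server_name);
}
```

A second lookup with the same server name returns the stored handle without invoking the callback again, which is the reuse behaviour described above.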

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads from files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create or alter foreign tables which use file_fdw, at least by default.

==== using COPY FROM routines ====
file_fdw uses the file formats recognized by the COPY command, so exporting the COPY FROM routines would help in implementing file_fdw.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, except that ''oids'' is not supported by file_fdw because it's a legacy feature.

Unlike in COPY, ''force_not_null'' is specified as a per-column generic option with a boolean value, not as a list of column names.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares code and connections with it.
dblink will be installed optionally, like standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because currently the FDW for PostgreSQL assumes all GENERIC OPTIONS are connection options.
Note that a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication, for security reasons.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as when 'autocommit' is set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".
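
The NULL-replacement idea might look like the sketch below. The function name, the <code>needed</code> flag array and the fixed output buffer are assumptions for illustration; a real deparser would also have to quote identifiers, which is skipped here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Append text to buf, tracking the used length; -1 if it does not fit. */
static int sketch_append(char *buf, size_t buflen, size_t *used,
                         const char *text)
{
    size_t len = strlen(text);

    if (*used + len + 1 > buflen)
        return -1;
    memcpy(buf + *used, text, len + 1);   /* copies the terminating NUL too */
    *used += len;
    return 0;
}

/* Build "SELECT col, NULL, ... FROM relname" into buf. Columns whose
 * needed[] flag is false are replaced by NULL so that the column count
 * and positions stay the same. Returns 0 on success, -1 on overflow. */
int sketch_build_select(char *buf, size_t buflen, const char *relname,
                        const char *const *colnames, const bool *needed,
                        size_t ncols)
{
    size_t used = 0;

    assert(buf != NULL && buflen > 0 && relname != NULL && ncols > 0);
    buf[0] = '\0';

    if (sketch_append(buf, buflen, &used, "SELECT ") != 0)
        return -1;
    for (size_t i = 0; i < ncols; i++) {
        if (i > 0 && sketch_append(buf, buflen, &used, ", ") != 0)
            return -1;
        if (sketch_append(buf, buflen, &used,
                          needed[i] ? colnames[i] : "NULL") != 0)
            return -1;
    }
    if (sketch_append(buf, buflen, &used, " FROM ") != 0)
        return -1;
    return sketch_append(buf, buflen, &used, relname);
}
```

Emitting NULL instead of dropping the column keeps the result layout identical to the foreign table definition, so the local side does not need to remap column positions.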

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. Any other condition is not sent to the remote server; the remote server returns rows without applying it, and the local server evaluates it against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

None of ORDER BY, LIMIT, OFFSET, GROUP BY, or HAVING is used in a foreign query.
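
The node-type restrictions in the table above amount to a recursive walk over the condition tree. The sketch below shows that walk with simplified stand-in types: <code>SketchNode</code> and its flags are assumptions, not the real PostgreSQL node structures, and volatility is passed in as a precomputed flag rather than looked up in the catalogs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_ARGS 2

/* Simplified stand-ins for the node tags listed in the table. */
typedef enum SketchTag {
    SK_CONST, SK_VAR, SK_ARRAY, SK_PARAM, SK_BOOL_EXPR, SK_NULL_TEST,
    SK_OP_EXPR, SK_DISTINCT_EXPR, SK_SCALAR_ARRAY_OP_EXPR, SK_FUNC_EXPR,
    SK_OTHER                      /* any node type not in the table */
} SketchTag;

typedef struct SketchNode {
    SketchTag tag;
    bool      param_is_external;  /* meaningful for SK_PARAM only */
    bool      func_is_immutable;  /* meaningful for operator/function nodes */
    size_t    nargs;
    const struct SketchNode *args[SKETCH_MAX_ARGS];
} SketchNode;

/* Return true when the whole expression tree may be sent to the remote
 * server, i.e. every node is of an allowed type and meets its note. */
bool sketch_is_pushable(const SketchNode *node)
{
    assert(node != NULL && node->nargs <= SKETCH_MAX_ARGS);

    switch (node->tag) {
    case SK_CONST:
    case SK_VAR:
    case SK_ARRAY:
    case SK_BOOL_EXPR:
    case SK_NULL_TEST:
        break;                            /* allowed unconditionally */
    case SK_PARAM:
        if (!node->param_is_external)     /* only PARAM_EXTERNAL */
            return false;
        break;
    case SK_OP_EXPR:
    case SK_DISTINCT_EXPR:
    case SK_SCALAR_ARRAY_OP_EXPR:
    case SK_FUNC_EXPR:
        if (!node->func_is_immutable)     /* underlying function check */
            return false;
        break;
    default:
        return false;                     /* unknown node: evaluate locally */
    }

    for (size_t i = 0; i < node->nargs; i++)
        if (!sketch_is_pushable(node->args[i]))
            return false;
    return true;
}
```

A condition that fails this check is simply left out of the remote query and evaluated on the local server, as described above.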

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq allocates memory with malloc() rather than palloc(). More research may be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some of the FDW routines are similar to a SETOF record function (SRF). We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect access to a foreign table through a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> disconnects all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint (not implemented)
: CHECK and/or NOT NULL constraints defined on foreign columns can be evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow telling the planner about the overhead.
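
One possible reading of '''cost_factor''' is a plain multiplier on a locally estimated cost. The sketch below assumes exactly that; the option's text format, the fallback to 1.0 and the choice to scale the startup cost as well are all assumptions, not decided behaviour.

```c
#include <assert.h>
#include <stdlib.h>

/* Mirrors the two cost fields of FdwPlan shown earlier. */
typedef struct SketchCost {
    double startup_cost;
    double total_cost;
} SketchCost;

/* Scale a locally estimated cost by the text value of a cost_factor
 * generic option. Falls back to a factor of 1.0 when the option is
 * absent or is not a positive number. */
SketchCost sketch_apply_cost_factor(SketchCost local,
                                    const char *cost_factor_text)
{
    double     factor = 1.0;
    SketchCost scaled;

    assert(local.startup_cost <= local.total_cost);

    if (cost_factor_text != NULL) {
        char  *end = NULL;
        double parsed = strtod(cost_factor_text, &end);

        /* accept only a fully parsed, positive value */
        if (end != cost_factor_text && *end == '\0' && parsed > 0.0)
            factor = parsed;
    }

    scaled.startup_cost = local.startup_cost * factor;
    scaled.total_cost = local.total_cost * factor;
    return scaled;
}
```

With a local estimate of 1.0 startup and 8.0 total, a cost_factor of '2.5' yields 2.5 and 20.0, so the planner sees the foreign scan as proportionally more expensive than the same scan on a local table.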

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only some of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. nested loop, generate internal parameters to pass value(s) from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only some of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]
[[Category:PostgreSQL 9.2]]

SQL/MED

2011-03-15T04:16:25Z

Hanada: /* Current Status */ some features have been merged into core

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

Basic features have been merged into PostgreSQL 9.1alpha4:
* Make foreign data wrapper functional
* Support FOREIGN TABLEs

contrib/file_fdw is available to retrieve external data from server-side files.

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The query syntax for foreign tables should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_syntax''' branch contains syntax of SQL/MED
* '''fdw_scan''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. This can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and columns of foreign tables can have generic options with the OPTIONS syntax. Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, NOT NULL constraints, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. It would be harder with other approaches, such as the AM (access method) or storage manager (smgr) layers, to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need a connection.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan needs copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In the future, more planner hooks might be added to allow FDWs to optimize the query.
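The call sequence implied by the version 2 interface can be illustrated with a toy, self-contained sketch. It is not the actual core code: FdwPlan, FdwExecutionState and TupleTableSlot are reduced to simplified stand-ins, PlanRelScan() takes a plain row count instead of the planner structures, the <code>private</code> field is renamed <code>private_data</code>, and the "counter" wrapper with its helper names is purely hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the core structures (assumptions, not the real ones). */
typedef struct FdwPlan {
    double startup_cost;
    double total_cost;
    void  *private_data;            /* called "private" in the proposal */
} FdwPlan;

typedef struct FdwExecutionState {
    void *private_data;
} FdwExecutionState;

typedef struct TupleTableSlot {
    int is_empty;                   /* 1 when the slot holds no tuple */
    int value;                      /* a single int column, for illustration */
} TupleTableSlot;

typedef struct FdwRoutine {
    FdwPlan           *(*PlanRelScan)(int nrows);   /* simplified signature */
    FdwExecutionState *(*BeginScan)(FdwPlan *plan);
    void               (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
    void               (*ReScan)(FdwExecutionState *state);
    void               (*EndScan)(FdwExecutionState *state);
} FdwRoutine;

/* Execution state of the hypothetical "counter" wrapper: yields 0 .. limit-1. */
typedef struct CounterState { int next; int limit; } CounterState;

/* Allocate or abort, loosely mimicking palloc()'s "never returns NULL". */
static void *xmalloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        abort();
    return p;
}

static FdwPlan *counter_plan(int nrows)
{
    FdwPlan *plan = xmalloc(sizeof *plan);
    int *limit = xmalloc(sizeof *limit);
    *limit = nrows;
    plan->startup_cost = 0.0;
    plan->total_cost = (double) nrows;  /* arbitrary: one cost unit per row */
    plan->private_data = limit;
    return plan;
}

static FdwExecutionState *counter_begin(FdwPlan *plan)
{
    FdwExecutionState *state = xmalloc(sizeof *state);
    CounterState *cs = xmalloc(sizeof *cs);
    cs->next = 0;
    cs->limit = *(int *) plan->private_data;
    state->private_data = cs;
    return state;
}

static void counter_iterate(FdwExecutionState *state, TupleTableSlot *slot)
{
    CounterState *cs = state->private_data;
    if (cs->next < cs->limit) {
        slot->is_empty = 0;
        slot->value = cs->next++;
    } else {
        slot->is_empty = 1;         /* an empty slot signals end of scan */
    }
}

static void counter_rescan(FdwExecutionState *state)
{
    ((CounterState *) state->private_data)->next = 0;
}

static void counter_end(FdwExecutionState *state)
{
    free(state->private_data);
    free(state);
}

/* Plays the role of the handler function: returns the routine set. */
const FdwRoutine *counter_fdw_handler(void)
{
    static const FdwRoutine routine = {
        counter_plan, counter_begin, counter_iterate, counter_rescan, counter_end
    };
    return &routine;
}

/* Drive one full scan in the order plan, begin, iterate, end; return the sum. */
int scan_sum(const FdwRoutine *fdw, int nrows)
{
    FdwPlan *plan = fdw->PlanRelScan(nrows);
    FdwExecutionState *state = fdw->BeginScan(plan);
    TupleTableSlot slot = { 1, 0 };
    int sum = 0;

    for (;;) {
        fdw->Iterate(state, &slot);
        if (slot.is_empty)
            break;
        sum += slot.value;
    }
    fdw->EndScan(state);
    free(plan->private_data);
    free(plan);
    return sum;
}
```

Calling <code>scan_sum(counter_fdw_handler(), 4)</code> walks the rows 0, 1, 2, 3 through the function-pointer table. The caller only ever sees the routine set, which is the point of returning it from a handler function.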

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column, attgenoptions, of type text[].

In the first version, the syntax for defining column-level generic options is omitted.

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the relevant FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way the FDW sees fit.

In future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function that is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
FdwPlan *fplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Unless the execution is only for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan node to initiate the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could properly deparse Sort nodes into an ORDER BY clause and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan node, and the FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
FdwExecutionState *fstate;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fstate->private.

== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If there is none, it calls FdwRoutine.ConnectServer() to establish a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
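The lookup-then-connect behaviour described here can be sketched as follows. This is only an illustration of the idea, not the removed implementation: the fixed-size array, the FSConnection stand-in, and the helper names <code>connect_server()</code>, <code>get_connection()</code> and <code>connection_count()</code> are all assumptions, and <code>connect_server()</code> merely pretends to call FdwRoutine.ConnectServer().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CACHED_CONNS 8

/* Stand-in for the wrapper-defined connection object (assumption). */
typedef struct FSConnection { int id; } FSConnection;

/* One cache entry: connections are identified by the server name. */
typedef struct ConnCacheEntry {
    char          srvname[64];
    FSConnection *conn;
} ConnCacheEntry;

static ConnCacheEntry conn_cache[MAX_CACHED_CONNS];
static int conn_cache_used = 0;
static int connect_calls = 0;   /* how many real connections were opened */

/* Hypothetical stand-in for FdwRoutine.ConnectServer(). */
static FSConnection *connect_server(const char *srvname)
{
    FSConnection *conn = malloc(sizeof *conn);
    (void) srvname;             /* a real wrapper would use the server options */
    if (conn != NULL)
        conn->id = ++connect_calls;
    return conn;
}

/* Return the cached connection for srvname, connecting on a cache miss.
 * Returns NULL if the name is too long, the cache is full, or connecting fails. */
FSConnection *get_connection(const char *srvname)
{
    FSConnection *conn;
    int i;

    for (i = 0; i < conn_cache_used; i++)
        if (strcmp(conn_cache[i].srvname, srvname) == 0)
            return conn_cache[i].conn;      /* cache hit: reuse */

    if (conn_cache_used == MAX_CACHED_CONNS)
        return NULL;
    if (strlen(srvname) >= sizeof conn_cache[0].srvname)
        return NULL;

    conn = connect_server(srvname);
    if (conn == NULL)
        return NULL;

    strcpy(conn_cache[conn_cache_used].srvname, srvname);
    conn_cache[conn_cache_used].conn = conn;
    conn_cache_used++;
    return conn;
}

/* Number of connections actually established so far. */
int connection_count(void)
{
    return connect_calls;
}
```

Two calls to <code>get_connection()</code> with the same server name return the same pointer, and only the first call opens a connection. A real cache would also need to key on the user mapping and to drop entries on disconnect, which this sketch leaves out.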

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create or alter foreign tables which use file_fdw, at least by default.

==== using COPY FROM routines ====
file_fdw uses the file formats recognized by the COPY command, so exporting the COPY FROM routines would help in implementing file_fdw.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it's a legacy feature.

Unlike in COPY, ''force_not_null'' can be specified as a per-column generic option with a boolean value, rather than as a list of column names.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares code and connections with it.
dblink will be installed optionally, like standard contrib modules.

==== Connection options ====
The connection options are constructed from all generic options of the foreign-data wrapper, the foreign server and the user mapping, because the FDW for PostgreSQL currently assumes that all generic options are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in the generic options, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".
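As a rough illustration of that optimization, the sketch below builds a remote query in which every column the local plan does not need is replaced by NULL, so that the column positions stay unchanged. The function name <code>deparse_select()</code> and its arguments are hypothetical, at least one column is assumed, and identifier quoting is deliberately left out.

```c
#include <stdio.h>
#include <string.h>

/* Build "SELECT c1, NULL, c3 FROM rel" into buf, emitting NULL for every
 * column whose needed[] flag is 0. Identifiers are copied as-is (no quoting).
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
int deparse_select(char *buf, size_t buflen, const char *relname,
                   const char *const *colnames, const int *needed, int ncols)
{
    size_t used;
    int n, i;

    n = snprintf(buf, buflen, "SELECT ");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    for (i = 0; i < ncols; i++) {
        n = snprintf(buf + used, buflen - used, "%s%s",
                     i > 0 ? ", " : "",
                     needed[i] ? colnames[i] : "NULL");
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    n = snprintf(buf + used, buflen - used, " FROM %s", relname);
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    return 0;
}
```

Keeping a NULL placeholder rather than dropping the column means the result set still has one field per foreign-table column, so the code that converts a PGresult row into a tuple would not need to remap positions.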

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions for the conditions; their PlanState.qual must consist of only the following node types. If there are other conditions, the remote server will send rows without the conditions, and the local server will evaluate the rows with the conditions.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

None of ORDER BY, LIMIT, OFFSET, GROUP BY or HAVING is used in a foreign query.
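The restriction on pushed-down conditions amounts to a recursive walk over the expression tree. The sketch below is an assumption-laden illustration: the <code>Expr</code> structure and its flags are simplified stand-ins rather than the real node definitions, each node has at most two arguments, and the Array, DistinctExpr and ScalarArrayOpExpr cases from the table are left out for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified node tags, named after the entries in the table above. */
typedef enum NodeTag {
    T_Const, T_Var, T_Param, T_BoolExpr, T_NullTest,
    T_OpExpr, T_FuncExpr, T_Other
} NodeTag;

/* Stand-in expression node (assumption: not the real PostgreSQL structs). */
typedef struct Expr {
    NodeTag tag;
    bool    param_is_external;      /* meaningful only for T_Param */
    bool    func_is_immutable;      /* meaningful only for T_OpExpr/T_FuncExpr */
    const struct Expr *args[2];     /* NULL when an argument is absent */
} Expr;

/* Return true if the whole expression tree may be sent to the remote server. */
bool is_pushable(const Expr *expr)
{
    if (expr == NULL)
        return true;                /* absent argument: nothing to check */

    switch (expr->tag) {
        case T_Const:
        case T_Var:
        case T_BoolExpr:
        case T_NullTest:
            break;
        case T_Param:
            if (!expr->param_is_external)
                return false;       /* only external parameters qualify */
            break;
        case T_OpExpr:
        case T_FuncExpr:
            if (!expr->func_is_immutable)
                return false;       /* underlying function must be IMMUTABLE */
            break;
        default:
            return false;           /* any other node type stays local */
    }
    return is_pushable(expr->args[0]) && is_pushable(expr->args[1]);
}
```

With this check, a condition such as <code>col = 1</code> would be pushed down, while a comparison against a non-immutable function call would be evaluated locally after the rows arrive.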

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After receiving the tuples as a PGresult, the FDW copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some of the FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. Users would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> disconnects all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint (not implemented)
: CHECK and/or NOT NULL constraints defined on foreign columns can be evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overheads and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data, even if the data definition and contents are the same. A generic option such as '''cost_factor''' would allow the overhead to be conveyed to the planner.
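One possible reading of the '''cost_factor''' idea is a simple multiplier applied to the cost that the same scan would have locally. The sketch below assumes exactly that. The text above does not specify how the option would be interpreted, and the names <code>parse_cost_factor()</code>, <code>scale_cost()</code> and <code>ScanCost</code> are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Cost pair, mirroring startup_cost and total_cost in FdwPlan. */
typedef struct ScanCost {
    double startup_cost;
    double total_cost;
} ScanCost;

/* Parse the text value of a cost_factor generic option.
 * Falls back to 1.0 (no adjustment) if the option is missing,
 * is not a plain number, or is not positive. */
double parse_cost_factor(const char *value)
{
    char *end;
    double factor;

    if (value == NULL)
        return 1.0;
    factor = strtod(value, &end);
    if (end == value || *end != '\0' || !(factor > 0.0))
        return 1.0;
    return factor;
}

/* Scale a locally estimated cost by the factor to model remote overhead. */
ScanCost scale_cost(ScanCost local, double cost_factor)
{
    ScanCost remote;

    remote.startup_cost = local.startup_cost * cost_factor;
    remote.total_cost = local.total_cost * cost_factor;
    return remote;
}
```

A single multiplier is the simplest choice. A wrapper could equally well add a fixed connection cost to startup_cost and scale only the per-row part, which would be a different design decision.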

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN locally and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Columns that are not needed can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying such an internal parameter to the remote query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL would be able to retrieve the result a part at a time, which would improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]
[[Category:PostgreSQL 9.1]]

SQL/MED

2010-12-24T07:49:33Z

Hanada: /* file_fdw */ mention about exporting COPY FROM routines

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The query syntax for foreign tables should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_syntax''' branch contains syntax of SQL/MED
* '''fdw_scan''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. This can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and columns of foreign tables can have generic options with the OPTIONS syntax. Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, NOT NULL constraints, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. It would be harder with other approaches, such as the AM (access method) or storage manager (smgr) layers, to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need a connection.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In the future, more planner hooks might be added to allow FDWs to optimize the query.
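As a rough illustration of how the Version 2 callbacks fit together, the sketch below implements a hypothetical "counter" wrapper that returns the integers 0..n-1. The struct definitions are simplified stand-ins, not the PostgreSQL ones: PlanRelScan() takes a row count instead of (Oid, PlannerInfo *, RelOptInfo *), BeginScan() omits ParamListInfo, and TupleTableSlot is reduced to a single integer. The counter_* functions, CounterState, and run_scan() are names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins for the PostgreSQL types named in the text. */
typedef struct FdwPlan
{
    double startup_cost;
    double total_cost;
    int    nrows;               /* hypothetical: rows the scan will produce */
} FdwPlan;

typedef struct FdwExecutionState
{
    void *private_data;         /* called "private" in the text */
} FdwExecutionState;

typedef struct TupleTableSlot
{
    bool empty;                 /* true when the slot holds no tuple */
    int  value;                 /* a one-column "tuple" */
} TupleTableSlot;

typedef struct FdwRoutine
{
    FdwPlan *(*PlanRelScan)(int nrows);             /* simplified signature */
    FdwExecutionState *(*BeginScan)(FdwPlan *plan);
    void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
    void (*ReScan)(FdwExecutionState *state);
    void (*EndScan)(FdwExecutionState *state);
} FdwRoutine;

/* Private scan state of the hypothetical "counter" wrapper. */
typedef struct CounterState
{
    int next;
    int limit;
} CounterState;

static FdwPlan *counter_plan(int nrows)
{
    FdwPlan *plan = malloc(sizeof(FdwPlan));

    assert(plan != NULL);
    plan->startup_cost = 0.0;
    plan->total_cost = 0.01 * nrows;    /* arbitrary per-row cost */
    plan->nrows = nrows;
    return plan;
}

static FdwExecutionState *counter_begin(FdwPlan *plan)
{
    FdwExecutionState *state = malloc(sizeof(FdwExecutionState));
    CounterState *cs = malloc(sizeof(CounterState));

    assert(state != NULL && cs != NULL);
    cs->next = 0;
    cs->limit = plan->nrows;
    state->private_data = cs;
    return state;
}

static void counter_iterate(FdwExecutionState *state, TupleTableSlot *slot)
{
    CounterState *cs = state->private_data;

    slot->empty = (cs->next >= cs->limit);  /* empty slot signals end of scan */
    if (!slot->empty)
        slot->value = cs->next++;
}

static void counter_rescan(FdwExecutionState *state)
{
    ((CounterState *) state->private_data)->next = 0;
}

static void counter_end(FdwExecutionState *state)
{
    free(state->private_data);
    free(state);
}

/* The routine set an FDW handler function would return. */
const FdwRoutine counter_routine = {
    counter_plan, counter_begin, counter_iterate, counter_rescan, counter_end
};

/* Drive one scan the way the executor would; returns the tuple count. */
int run_scan(const FdwRoutine *routine, int nrows)
{
    FdwPlan *plan = routine->PlanRelScan(nrows);
    FdwExecutionState *state = routine->BeginScan(plan);
    TupleTableSlot slot;
    int count = 0;

    for (routine->Iterate(state, &slot); !slot.empty;
         routine->Iterate(state, &slot))
        count++;

    routine->EndScan(state);
    free(plan);
    return count;
}
```

Calling run_scan(&counter_routine, 5) walks through plan, begin, iterate-until-empty, and end, and counts five tuples. Keeping all wrapper state behind the opaque private pointer is what lets core stay unaware of each wrapper's internals.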

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This specification keeps supporting the existing usage in which a foreign-data wrapper serves only as a container of connection information.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

In the first version, the syntax for defining column-level generic options is omitted.

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way the FDW prefers.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
FdwPlan *fplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Unless the execution is for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to initiate the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via a TupleTableSlot.
:If the scan has reached the end, the slot will be empty after the Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could properly deparse Sort nodes into an ORDER BY clause and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates ForeignScanState from ForeignScan and FDW routines use it to manage the status of scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
FdwExecutionState *fstate;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data wrapper specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fstate->private.

== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server which the connection uses.
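The lookup-then-connect behaviour described above can be sketched as a small name-keyed cache. This is only an illustration of the removed idea, not the removed code: FSConnection here is a stand-in struct, connect_server() merely simulates FdwRoutine.ConnectServer(), and get_connection(), connect_calls, and the fixed cache size are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_CONNS 8
#define CONN_NAME_LEN    64     /* names longer than 63 bytes are truncated */

/* Stand-in for FSConnection; a real one would wrap a remote handle. */
typedef struct FSConnection
{
    char name[CONN_NAME_LEN];   /* same as the foreign server's name */
    int  in_use;                /* 0 = free cache slot */
} FSConnection;

static FSConnection conn_cache[MAX_CACHED_CONNS];

/* Counts simulated ConnectServer() calls, to show that reuse happens. */
int connect_calls = 0;

/* Simulates FdwRoutine.ConnectServer(): claims a free slot or returns NULL. */
static FSConnection *connect_server(const char *srvname)
{
    for (int i = 0; i < MAX_CACHED_CONNS; i++)
    {
        if (!conn_cache[i].in_use)
        {
            strncpy(conn_cache[i].name, srvname, CONN_NAME_LEN - 1);
            conn_cache[i].name[CONN_NAME_LEN - 1] = '\0';
            conn_cache[i].in_use = 1;
            connect_calls++;
            return &conn_cache[i];
        }
    }
    return NULL;                /* cache full */
}

/* Return the cached connection for srvname, connecting only on a miss. */
FSConnection *get_connection(const char *srvname)
{
    assert(srvname != NULL);

    for (int i = 0; i < MAX_CACHED_CONNS; i++)
    {
        if (conn_cache[i].in_use && strcmp(conn_cache[i].name, srvname) == 0)
            return &conn_cache[i];
    }
    return connect_server(srvname);
}
```

Two scans against the same server name receive the same FSConnection pointer, while a scan against a different server triggers a second simulated connect. Eviction and per-user keys are left out of the sketch.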

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create or alter foreign tables which use file_fdw, at least by default.

==== using COPY FROM routines ====
file_fdw uses the file formats recognized by the COPY command, so exporting the COPY FROM routines would help in implementing file_fdw.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it's a legacy feature.

Unlike in COPY, ''force_not_null'' can be specified as a per-column generic option with a boolean value, not as a list of column names.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares its code and connections.
dblink will be installed optionally, like standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server MUST require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".
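The NULL substitution could look roughly like the sketch below, which builds the remote SELECT list from a per-column "needed" flag. build_select() and its parameters are hypothetical names for this example, and the sketch does no identifier quoting, which a real deparser would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write "SELECT c1, NULL, c3 FROM table" into buf, emitting NULL for every
 * column whose needed[] flag is false.  Returns the length of the query,
 * or -1 if buf is too small.
 */
int build_select(char *buf, size_t buflen, const char *table,
                 const char *const *cols, const bool *needed, int ncols)
{
    size_t used;
    int n;

    assert(buf != NULL && buflen > 0 && ncols > 0);

    n = snprintf(buf, buflen, "SELECT ");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    for (int i = 0; i < ncols; i++)
    {
        /* Keep every column position; only the expression changes. */
        n = snprintf(buf + used, buflen - used, "%s%s",
                     i > 0 ? ", " : "", needed[i] ? cols[i] : "NULL");
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    n = snprintf(buf + used, buflen - used, " FROM %s", table);
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    return (int) (used + (size_t) n);
}
```

For columns id, payload, ts with only id and ts needed, and a table named remote_tbl, this yields "SELECT id, NULL, ts FROM remote_tbl". Emitting NULL rather than dropping the column keeps the result's column positions aligned with the foreign table definition, which is presumably why the text suggests it.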

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. Conditions that contain other node types are not sent to the remote server; it returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.
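The node-type table above amounts to a recursive whitelist check over each condition. The sketch below shows that shape with a toy expression tree; Expr, its fields, and is_pushable() are assumptions for illustration and are far simpler than PostgreSQL's node structures. Array, DistinctExpr, and ScalarArrayOpExpr are left out for brevity, and T_SubLink stands in for any node type that is not in the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A subset of the tags from the table, plus one tag that is not listed. */
typedef enum NodeTag
{
    T_Const, T_Var, T_Param, T_BoolExpr, T_NullTest,
    T_OpExpr, T_FuncExpr, T_SubLink
} NodeTag;

/* Toy expression node with at most two children. */
typedef struct Expr
{
    NodeTag tag;
    bool    param_external;     /* meaningful for T_Param only */
    bool    immutable;          /* meaningful for T_OpExpr and T_FuncExpr */
    const struct Expr *args[2]; /* unused children are NULL */
} Expr;

/* True if the whole condition may be sent to the foreign server. */
bool is_pushable(const Expr *e)
{
    if (e == NULL)
        return true;            /* absent child */

    switch (e->tag)
    {
        case T_Const:
        case T_Var:
            assert(e->args[0] == NULL && e->args[1] == NULL);  /* leaves */
            return true;
        case T_Param:
            return e->param_external;   /* only PARAM_EXTERNAL qualifies */
        case T_BoolExpr:
        case T_NullTest:
            break;                      /* depends on the children */
        case T_OpExpr:
        case T_FuncExpr:
            if (!e->immutable)
                return false;           /* must be an IMMUTABLE function */
            break;
        default:
            return false;               /* node type not in the table */
    }
    return is_pushable(e->args[0]) && is_pushable(e->args[1]);
}
```

A condition for which is_pushable() returns false would be kept in the local qual and evaluated against the rows the remote server returns, as described above.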

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. We might need research to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions (SRFs). We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions in a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in some fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. It would not be what the user expects if an access to a foreign table via a SECURITY DEFINER function used the USER MAPPING related to the owner of the function. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign table? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the forms below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint (not implemented)
: CHECK and/or NOT NULL constraints defined on foreign columns can be evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: To let foreign tables support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command can update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION such as '''cost_factor''' would allow telling the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only some of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be decreased by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only some of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-12-24T07:43:50Z

Hanada: /* DML */

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in the database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will over time introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_syntax''' branch contains syntax of SQL/MED
* '''fdw_scan''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and their columns can have generic options, specified with the OPTIONS syntax. Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, NOT NULL constraints, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that one FDW library could be used by several DBMSes without recompiling. That does not seem realistic. Instead, a routine set specific to PostgreSQL and the C language would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the layer that best balances query optimization and data abstraction. With other approaches, such as the access method (AM) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in some points:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Leave connection management to each FDW, because a simple FDW such as a file wrapper would not need connections.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan is passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This specification keeps supporting the existing usage in which a foreign-data wrapper serves only as a container of connection information.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

In the first version, the syntax for defining column-level generic options is omitted.

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way the FDW prefers.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
FdwPlan *fplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create a ForeignScanState for the given ForeignScan plan node.
:Unless the execution is for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to initiate the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via a TupleTableSlot.
:If the scan has reached the end, the slot will be empty after the Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could properly deparse Sort nodes into an ORDER BY clause and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates ForeignScanState from ForeignScan and FDW routines use it to manage the status of scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
FdwExecutionState *fstate;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data wrapper specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fstate->private.

== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server which the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create foreign tables which use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it is a legacy feature.

''force_not_null'' is the only option that is read from per-column generic options. Its value should be a boolean such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like other standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist only of the following node types. Conditions that contain other node types are not sent to the remote server; it returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq allocates memory with malloc() rather than palloc(). More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Which should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection individually. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The forms below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint (not implemented)
: CHECK and/or NOT NULL constraints defined on foreign columns can be evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: To have foreign tables support inheritance, tuples from a foreign table should supply tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command can update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if we add an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow telling the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server can return the result after local JOINs in it.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loop, generate internal parameters to pass value(s) from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-12-24T07:36:20Z

Hanada: /* Planner */ move ForeignScan from Executor section

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in the database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_syntax''' branch contains syntax of SQL/MED
* '''fdw_scan''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and columns of foreign tables can have generic options with the OPTIONS syntax. Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, the NOT NULL constraint, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only interfaces of FdwRoutine, FSConnection are defined in PostgreSQL core, and the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in some points:

* Add FdwPlan as container of FDW-specific planning information.
* Add FdwExecutionState as container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This still allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

In the first version, the syntax for defining column-level generic options will be omitted.

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so an FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide proper costs for the scan, estimated in whatever way each FDW prefers.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should get another function that is called from ANALYZE. With such a function, each FDW can update its statistics in its own way.

In version 1, the planner generates a ForeignScan node for each foreign table in the query, and stores in it the FdwPlan returned by PlanRelScan().

typedef struct ForeignScan
{
Scan scan;
FdwPlan *fplan;
} ForeignScan;

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:Unless the query is being run only for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan node to start the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() for ForeignScan are not supported, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.
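
The call sequence described above can be sketched as a small stand-alone C fragment. This is only an illustration under simplifying assumptions: <code>ToySlot</code>, <code>ToyExecutionState</code> and <code>ToyFdwRoutine</code> are hypothetical stand-ins for TupleTableSlot, FdwExecutionState and FdwRoutine, BeginScan takes a row count instead of an FdwPlan, and the counting wrapper (<code>count_*</code>) is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins for the executor types (assumptions, not the real ones). */
typedef struct ToySlot { bool empty; int value; } ToySlot;
typedef struct ToyExecutionState { void *private_data; } ToyExecutionState;

typedef struct ToyFdwRoutine
{
    ToyExecutionState *(*BeginScan)(int nrows);
    void (*Iterate)(ToyExecutionState *state, ToySlot *slot);
    void (*ReScan)(ToyExecutionState *state);
    void (*EndScan)(ToyExecutionState *state);
} ToyFdwRoutine;

/* Hypothetical wrapper that "scans" the integers 0 .. nrows-1. */
typedef struct CountState { int next; int nrows; } CountState;

static ToyExecutionState *count_begin(int nrows)
{
    ToyExecutionState *state = malloc(sizeof(*state));
    CountState *cs = malloc(sizeof(*cs));

    assert(state != NULL && cs != NULL);
    cs->next = 0;
    cs->nrows = nrows;
    state->private_data = cs;   /* FDW-private area */
    return state;
}

static void count_iterate(ToyExecutionState *state, ToySlot *slot)
{
    CountState *cs = state->private_data;

    if (cs->next >= cs->nrows)
    {
        slot->empty = true;     /* end of scan: leave the slot empty */
        return;
    }
    slot->empty = false;
    slot->value = cs->next++;
}

static void count_rescan(ToyExecutionState *state)
{
    ((CountState *) state->private_data)->next = 0;
}

static void count_end(ToyExecutionState *state)
{
    free(state->private_data);
    free(state);
}

static const ToyFdwRoutine count_fdw = {
    count_begin, count_iterate, count_rescan, count_end
};

/* Mimics the init / scan / end sequence and returns the sum of all rows. */
static long scan_sum(const ToyFdwRoutine *routine, int nrows)
{
    ToyExecutionState *state = routine->BeginScan(nrows);
    ToySlot slot;
    long sum = 0;

    for (routine->Iterate(state, &slot); !slot.empty; routine->Iterate(state, &slot))
        sum += slot.value;
    routine->EndScan(state);
    return sum;
}
```

Calling <code>scan_sum(&count_fdw, 4)</code> walks the rows 0, 1, 2, 3 and stops when Iterate() leaves the slot empty, which is the same end-of-scan signal described for ExecForeignScan().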

ExecInitForeignScan() generates ForeignScanState from ForeignScan and FDW routines use it to manage the status of scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
FdwExecutionState *fstate;
} ForeignScanState;

FdwExecutionState has a private area that can be used to pass FDW-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fstate->private.

== Connection caching ==
Currently, connection caching is not implemented, so that development can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
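
A name-keyed cache of this kind could look roughly like the following C sketch. It is not the removed implementation: <code>ToyConnection</code>, <code>ConnCache</code>, <code>toy_connect_server()</code> and the fixed cache size are all assumptions made for illustration, and the "connection" is just an integer handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_CONNS 8      /* arbitrary limit for the sketch */

typedef struct ToyConnection
{
    char name[64];              /* same as the foreign server name */
    int  handle;                /* stand-in for a real connection object */
} ToyConnection;

typedef struct ConnCache
{
    ToyConnection entries[MAX_CACHED_CONNS];
    int           count;
    int           connects;     /* how many real connections were opened */
} ConnCache;

/* Stand-in for FdwRoutine.ConnectServer(): hands out a fake handle. */
static int toy_connect_server(ConnCache *cache)
{
    return 100 + cache->connects++;
}

/*
 * Return the cached connection for srvname, connecting on a cache miss.
 * Returns NULL if the cache is full or the name is too long.
 */
static ToyConnection *get_connection(ConnCache *cache, const char *srvname)
{
    ToyConnection *conn;

    assert(cache != NULL && srvname != NULL);

    for (int i = 0; i < cache->count; i++)
        if (strcmp(cache->entries[i].name, srvname) == 0)
            return &cache->entries[i];      /* cache hit: reuse */

    if (cache->count >= MAX_CACHED_CONNS ||
        strlen(srvname) >= sizeof(cache->entries[0].name))
        return NULL;

    conn = &cache->entries[cache->count++];
    strcpy(conn->name, srvname);            /* length checked above */
    conn->handle = toy_connect_server(cache);
    return conn;
}
```

Two lookups with the same server name return the same entry and open only one connection, which is the reuse behaviour the paragraph describes.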

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin is not supported, although COPY FROM allows it.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create foreign tables that use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it is a legacy feature.

''force_not_null'' is the only option that is read from per-column generic options. Its value should be a boolean such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like other standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.
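
As a rough illustration of how the three option lists might be merged into one libpq-style connection string, consider the C sketch below. The names <code>ToyOption</code>, <code>append_options()</code> and <code>build_conninfo()</code> are hypothetical and not taken from the patch; the only external fact relied on is that libpq connection strings accept <code>keyword='value'</code> pairs in which single quotes and backslashes inside the value are escaped with a backslash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One generic option, e.g. { "host", "somehost" } (hypothetical type). */
typedef struct ToyOption { const char *name; const char *value; } ToyOption;

/* Append "name='value' " pairs to buf; return false if buf is too small. */
static bool append_options(char *buf, size_t buflen,
                           const ToyOption *opts, int nopts)
{
    size_t used = strlen(buf);

    for (int i = 0; i < nopts; i++)
    {
        /* Worst case: every value byte escaped, plus "='", "' " and NUL. */
        size_t need = strlen(opts[i].name) + 2 * strlen(opts[i].value) + 5;

        if (used + need > buflen)
            return false;

        used += (size_t) sprintf(buf + used, "%s='", opts[i].name);
        for (const char *p = opts[i].value; *p != '\0'; p++)
        {
            if (*p == '\'' || *p == '\\')
                buf[used++] = '\\';     /* escape quote and backslash */
            buf[used++] = *p;
        }
        buf[used++] = '\'';
        buf[used++] = ' ';
        buf[used] = '\0';
    }
    return true;
}

/* Concatenate wrapper, server and user-mapping options, in that order. */
static bool build_conninfo(char *buf, size_t buflen,
                           const ToyOption *fdw_opts, int n_fdw,
                           const ToyOption *srv_opts, int n_srv,
                           const ToyOption *um_opts, int n_um)
{
    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';
    return append_options(buf, buflen, fdw_opts, n_fdw)
        && append_options(buf, buflen, srv_opts, n_srv)
        && append_options(buf, buflen, um_opts, n_um);
}
```

Because every generic option is copied through unconditionally, any option that libpq does not recognize would make the connection attempt fail, which is one consequence of assuming that all GENERIC OPTIONS are connection options.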

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist only of the following node types. Conditions that contain other node types are not sent to the remote server; it returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.
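
The node-type restrictions in the table can be read as a recursive check over each condition's expression tree. The C sketch below illustrates that idea with a hypothetical, much simplified node type (<code>ToyExpr</code>, at most two arguments, and only a subset of the listed node kinds); it is not the actual deparsing code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of node kinds for the sketch. */
typedef enum ToyNodeTag
{
    TOY_CONST, TOY_VAR, TOY_PARAM_EXTERN, TOY_PARAM_INTERNAL,
    TOY_BOOL_EXPR, TOY_NULL_TEST, TOY_OP_EXPR, TOY_FUNC_EXPR, TOY_OTHER
} ToyNodeTag;

typedef struct ToyExpr
{
    ToyNodeTag            tag;
    bool                  immutable;  /* used for TOY_OP_EXPR / TOY_FUNC_EXPR */
    const struct ToyExpr *args[2];    /* NULL when unused */
} ToyExpr;

/* True if the whole expression tree may be sent to the foreign server. */
static bool is_pushable(const ToyExpr *expr)
{
    if (expr == NULL)
        return true;

    switch (expr->tag)
    {
        case TOY_CONST:
        case TOY_VAR:
        case TOY_PARAM_EXTERN:
        case TOY_BOOL_EXPR:
        case TOY_NULL_TEST:
            break;
        case TOY_OP_EXPR:
        case TOY_FUNC_EXPR:
            if (!expr->immutable)
                return false;         /* only IMMUTABLE functions qualify */
            break;
        default:
            return false;             /* internal params and anything else */
    }
    return is_pushable(expr->args[0]) && is_pushable(expr->args[1]);
}

/* Count the top-level conditions that can be evaluated remotely. */
static int count_pushable(const ToyExpr *const *quals, int nquals)
{
    int n = 0;

    assert(quals != NULL || nquals == 0);
    for (int i = 0; i < nquals; i++)
        if (is_pushable(quals[i]))
            n++;
    return n;
}
```

Conditions for which <code>is_pushable()</code> returns false would stay in the local qual list and be applied to the rows returned by the foreign server.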

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq allocates memory with malloc() rather than palloc(). More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Which should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection individually. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The forms below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and/or NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: To have foreign tables support inheritance, tuples from a foreign table should supply tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A generic option such as '''cost_factor''' would let the user tell the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unneeded columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-12-24T07:33:37Z

Hanada: /* Executor */ use new FdwRoutine

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification has begun in PostgreSQL 8.4 and will over time introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_syntax''' branch contains syntax of SQL/MED
* '''fdw_scan''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like the [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and their columns can have generic options, specified with the OPTIONS syntax. Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, NOT NULL constraints, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but this doesn't seem realistic. Instead, a PostgreSQL-specific and C-language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as JOINs of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need connections.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In the future, more planner hooks might be added to allow FDWs to optimize the query.
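
The call sequence implied by the Version 2 routine set can be sketched as follows. This is only an illustration under stated assumptions: the types are simplified stand-ins for the real PostgreSQL structures (the <code>private</code> field is spelled <code>private_data</code>, the ParamListInfo argument of BeginScan() is dropped, and a single int stands in for a tuple), and <code>counter_*</code> and <code>run_counter_scan()</code> are hypothetical names that do not exist in the patch.

```c
#include <stdlib.h>

/* Simplified stand-ins for the PostgreSQL types named in the text. */
typedef struct FdwPlan
{
    double startup_cost;
    double total_cost;
    void  *private_data;    /* FDW-private planning data */
} FdwPlan;

typedef struct FdwExecutionState
{
    void *private_data;     /* FDW-private execution data */
} FdwExecutionState;

typedef struct TupleTableSlot
{
    int is_empty;           /* 1 when the scan has reached the end */
    int value;              /* one int column stands in for a tuple */
} TupleTableSlot;

typedef struct FdwRoutine
{
    FdwExecutionState *(*BeginScan)(FdwPlan *plan);
    void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
    void (*ReScan)(FdwExecutionState *state);
    void (*EndScan)(FdwExecutionState *state);
} FdwRoutine;

/* Hypothetical wrapper whose "foreign table" holds 0 .. limit-1. */
typedef struct CounterState
{
    int next;
    int limit;
} CounterState;

static FdwExecutionState *counter_begin(FdwPlan *plan)
{
    FdwExecutionState *state = malloc(sizeof(*state));
    CounterState *cs = malloc(sizeof(*cs));

    if (state == NULL || cs == NULL)
    {
        free(state);
        free(cs);
        return NULL;
    }
    cs->next = 0;
    cs->limit = *(int *) plan->private_data;
    state->private_data = cs;
    return state;
}

static void counter_iterate(FdwExecutionState *state, TupleTableSlot *slot)
{
    CounterState *cs = state->private_data;

    if (cs->next >= cs->limit)
    {
        slot->is_empty = 1;     /* an empty slot signals end of scan */
        return;
    }
    slot->is_empty = 0;
    slot->value = cs->next++;
}

static void counter_rescan(FdwExecutionState *state)
{
    CounterState *cs = state->private_data;

    cs->next = 0;               /* restart from the first tuple */
}

static void counter_end(FdwExecutionState *state)
{
    free(state->private_data);
    free(state);
}

static const FdwRoutine counter_routine = {
    counter_begin, counter_iterate, counter_rescan, counter_end
};

/* Executor-style driver: scans once, then once more per rescan,
 * and returns the sum of all values seen (or -1 on allocation failure). */
int run_counter_scan(int limit, int rescans)
{
    FdwPlan plan = {0.0, 0.0, &limit};
    FdwExecutionState *state = counter_routine.BeginScan(&plan);
    TupleTableSlot slot;
    int sum = 0;

    if (state == NULL)
        return -1;
    for (int pass = 0; pass <= rescans; pass++)
    {
        if (pass > 0)
            counter_routine.ReScan(state);
        for (;;)
        {
            counter_routine.Iterate(state, &slot);
            if (slot.is_empty)
                break;
            sum += slot.value;
        }
    }
    counter_routine.EndScan(state);
    return sum;
}
```

The driver never looks inside the private state; it only relies on the empty slot to detect the end of the scan, which is the contract described for Iterate() in this design.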

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This still allows a foreign-data wrapper to be used only as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

In the first version, the syntax for defining column-level generic options will be omitted.

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so an FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the relevant FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide appropriate costs for the scan, estimated in whatever way the FDW chooses.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

;ExecInitForeignScan()
:Create ForeignScanState for the given ForeignScan plan node.
:If the execution is not for EXPLAIN, call FdwRoutine.BeginScan() with the FdwPlan stored in the ForeignScan to initiate the foreign query, and receive an FdwExecutionState.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table via TupleTableSlot.
:If the scan reaches the end, the slot will be empty after Iterate() call.
;ExecForeignReScan()
:Call FdwRoutine.ReScan() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.EndScan() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() for ForeignScan are not supported, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could properly deparse Sort nodes into an ORDER BY clause and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
FdwExecutionState *fstate;
} ForeignScanState;

FdwExecutionState has a private area which can be used to pass foreign-data-wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.fstate->private.

== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server which the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}
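
The lookup-or-connect behaviour described in this section could look roughly like the sketch below. It is not the removed implementation: <code>ConnCache</code>, <code>get_connection()</code>, <code>discard_all()</code> and the fixed-size array are assumptions made for illustration, and <code>connect_server()</code> merely stands in for FdwRoutine.ConnectServer() without opening any real connection.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_CACHED_CONNECTIONS 8

/* Stand-in for the FDW-defined connection object. */
typedef struct FSConnection
{
    char srvname[64];   /* connection name == foreign server name */
    int  id;            /* sequence number, for illustration only */
} FSConnection;

/* Hypothetical per-backend cache, keyed by server name. */
typedef struct ConnCache
{
    FSConnection *entries[MAX_CACHED_CONNECTIONS];
    int count;
    int connects;       /* how many real "connections" were made */
} ConnCache;

/* Stand-in for FdwRoutine.ConnectServer(): allocates a dummy object. */
static FSConnection *connect_server(ConnCache *cache, const char *srvname)
{
    FSConnection *conn = malloc(sizeof(*conn));

    if (conn == NULL)
        return NULL;
    /* Names longer than 63 bytes are truncated in this sketch. */
    strncpy(conn->srvname, srvname, sizeof(conn->srvname) - 1);
    conn->srvname[sizeof(conn->srvname) - 1] = '\0';
    conn->id = ++cache->connects;
    return conn;
}

/* Return the cached connection for srvname, connecting on a miss. */
FSConnection *get_connection(ConnCache *cache, const char *srvname)
{
    FSConnection *conn;

    for (int i = 0; i < cache->count; i++)
        if (strcmp(cache->entries[i]->srvname, srvname) == 0)
            return cache->entries[i];

    if (cache->count >= MAX_CACHED_CONNECTIONS)
        return NULL;    /* cache full; a real cache would grow or evict */

    conn = connect_server(cache, srvname);
    if (conn != NULL)
        cache->entries[cache->count++] = conn;
    return conn;
}

/* Drop every cached connection, as DISCARD ALL is described to do. */
void discard_all(ConnCache *cache)
{
    for (int i = 0; i < cache->count; i++)
        free(cache->entries[i]);
    cache->count = 0;
}
```

A view like pg_foreign_connections would then simply enumerate the entries of such a cache.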

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin is not supported, although it is allowed in COPY FROM.

Because the FDW reads from files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create foreign tables which use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it's a legacy feature.

The ''force_not_null'' option is the only one read from per-column generic options. It should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares its code and connections.
dblink will be installed optionally, like standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as with 'autocommit' set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. Conditions containing other node types are not sent to the remote server; the remote server returns rows without applying them, and the local server evaluates them against those rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.
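
The push-down rule in the table above amounts to a recursive walk over the condition tree. The sketch below shows one way to express it, under loud assumptions: <code>Expr</code>, <code>ExprKind</code>, <code>Volatility</code> and <code>is_pushable()</code> are hypothetical simplifications, not the real PostgreSQL node structures, and only a few of the listed node types are modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified node kinds (not the real NodeTag values). */
typedef enum ExprKind
{
    EX_CONST,           /* Const */
    EX_VAR,             /* Var */
    EX_PARAM_EXTERNAL,  /* Param with paramkind == PARAM_EXTERNAL */
    EX_PARAM_INTERNAL,  /* any other Param: never pushed down */
    EX_BOOL,            /* BoolExpr: AND, OR, NOT */
    EX_NULLTEST,        /* NullTest: IS [NOT] NULL */
    EX_OP,              /* OpExpr */
    EX_FUNC             /* FuncExpr */
} ExprKind;

typedef enum Volatility
{
    VOL_IMMUTABLE,
    VOL_STABLE,
    VOL_VOLATILE
} Volatility;

typedef struct Expr
{
    ExprKind kind;
    Volatility volatility;      /* only meaningful for EX_OP / EX_FUNC */
    const struct Expr *args[2]; /* at most two children in this sketch */
    int nargs;
} Expr;

/* True if the whole expression tree may be sent to the foreign server. */
bool is_pushable(const Expr *expr)
{
    switch (expr->kind)
    {
        case EX_CONST:
        case EX_VAR:
        case EX_PARAM_EXTERNAL:
        case EX_BOOL:
        case EX_NULLTEST:
            break;
        case EX_OP:
        case EX_FUNC:
            /* operators and functions must be IMMUTABLE */
            if (expr->volatility != VOL_IMMUTABLE)
                return false;
            break;
        default:
            return false;       /* unlisted node type: evaluate locally */
    }

    /* every child must be pushable as well */
    for (int i = 0; i < expr->nargs; i++)
        if (!is_pushable(expr->args[i]))
            return false;
    return true;
}
```

A condition for which the check fails stays in the local plan and is evaluated against the rows the remote server returns.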

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine the USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all foreign connections, but we might provide SQL functions to manage each foreign connection individually. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table must supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A generic option such as '''cost_factor''' would let the user tell the planner about this overhead.
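
As a rough illustration of the '''cost_factor''' idea, the sketch below scales a locally estimated cost by a factor read from generic options stored as "key=value" strings, the form used in the text[] option columns. Both function names and the fallback of 1.0 for a missing or invalid value are assumptions; the page does not define how the option would be applied.

```c
#include <stdlib.h>
#include <string.h>

/* Look up "cost_factor=<number>" among "key=value" option strings.
 * Returns 1.0 (no adjustment) if the option is absent or not a
 * positive number. */
double get_cost_factor(const char *const *options, int noptions)
{
    static const char key[] = "cost_factor=";
    const size_t keylen = sizeof(key) - 1;

    for (int i = 0; i < noptions; i++)
    {
        if (strncmp(options[i], key, keylen) == 0)
        {
            const char *text = options[i] + keylen;
            char *end;
            double factor = strtod(text, &end);

            /* accept only a fully parsed, positive value */
            if (end != text && *end == '\0' && factor > 0.0)
                return factor;
        }
    }
    return 1.0;
}

/* Scale a cost estimated as if the data were local. */
double scale_cost(double local_cost, const char *const *options, int noptions)
{
    return local_cost * get_cost_factor(options, noptions);
}
```

With options such as <code>cost_factor=2.5</code> on a foreign server, a scan estimated at 100.0 for local data would be costed at 250.0.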

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unneeded columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-12-24T07:15:28Z

Hanada: /* Planner */ PlanRelScan() is called from create_foreignscan_path()

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification has begun in PostgreSQL 8.4 and will over time introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_syntax''' branch contains syntax of SQL/MED
* '''fdw_scan''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and columns of foreign tables can have generic options with OPTIONS syntax. Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, NOT NULL constraints, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but this doesn't seem realistic. Instead, a PostgreSQL-specific and C language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as JOINs of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as container of FDW-specific planning information.
* Add FdwExecutionState as container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need a connection.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In the future, more planner hooks might be added to allow FDWs to optimize the query.
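
As a rough illustration of how the Version 2 callbacks fit together, the following self-contained C sketch drives a toy in-memory wrapper through BeginScan, Iterate, ReScan and EndScan. It is not the proposed implementation: the structs are simplified stand-ins for the ones above, PlanRelScan and BeginScan take fewer arguments than in the proposal, and <code>ToySlot</code>, <code>toy_fdw_handler</code>, <code>toy_scan_twice</code> and the <code>memfdw_*</code> functions are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the structs above; NOT the real definitions. */
typedef struct FdwPlan
{
    double startup_cost;
    double total_cost;
} FdwPlan;

typedef struct FdwExecutionState
{
    void *fdw_private;          /* corresponds to "private" above */
} FdwExecutionState;

/* Hypothetical replacement for TupleTableSlot: one int column. */
typedef struct ToySlot
{
    bool empty;                 /* true once the scan is exhausted */
    int  value;
} ToySlot;

typedef struct FdwRoutine
{
    FdwPlan           *(*PlanRelScan)(void);    /* planner arguments omitted */
    FdwExecutionState *(*BeginScan)(FdwPlan *plan);
    void               (*Iterate)(FdwExecutionState *state, ToySlot *slot);
    void               (*ReScan)(FdwExecutionState *state);
    void               (*EndScan)(FdwExecutionState *state);
} FdwRoutine;

/* The "foreign" data source: a fixed in-memory array. */
static const int toy_rows[] = {10, 20, 30};
#define TOY_NROWS (sizeof(toy_rows) / sizeof(toy_rows[0]))

static FdwPlan *memfdw_plan(void)
{
    static FdwPlan plan;

    plan.startup_cost = 0.0;
    plan.total_cost = (double) TOY_NROWS;   /* assumed: one cost unit per row */
    return &plan;
}

static FdwExecutionState *memfdw_begin(FdwPlan *plan)
{
    FdwExecutionState *state = malloc(sizeof(*state));
    size_t *pos = malloc(sizeof(*pos));

    (void) plan;                /* a real FDW would read its plan here */
    assert(state != NULL && pos != NULL);
    *pos = 0;
    state->fdw_private = pos;
    return state;
}

static void memfdw_iterate(FdwExecutionState *state, ToySlot *slot)
{
    size_t *pos = state->fdw_private;

    slot->empty = (*pos >= TOY_NROWS);
    if (!slot->empty)
        slot->value = toy_rows[(*pos)++];
}

static void memfdw_rescan(FdwExecutionState *state)
{
    *(size_t *) state->fdw_private = 0;     /* restart from the first row */
}

static void memfdw_end(FdwExecutionState *state)
{
    free(state->fdw_private);
    free(state);
}

/* Plays the role of the fdw_handler function: returns the routine set. */
const FdwRoutine *toy_fdw_handler(void)
{
    static const FdwRoutine routine = {
        memfdw_plan, memfdw_begin, memfdw_iterate, memfdw_rescan, memfdw_end
    };
    return &routine;
}

/* Executor-style driver: scan all rows, rescan, scan again; returns the sum. */
int toy_scan_twice(void)
{
    const FdwRoutine *r = toy_fdw_handler();
    FdwExecutionState *state = r->BeginScan(r->PlanRelScan());
    ToySlot slot;
    int sum = 0;

    for (int pass = 0; pass < 2; pass++)
    {
        for (r->Iterate(state, &slot); !slot.empty; r->Iterate(state, &slot))
            sum += slot.value;
        r->ReScan(state);
    }
    r->EndScan(state);
    return sum;
}
```

The driver only ever talks to the wrapper through the function pointers, which is the point of the design: the scan position lives in the FDW-private part of the execution state, so the caller needs no knowledge of how a particular wrapper stores its progress.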

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

In the first version, the syntax for defining column-level generic options will be omitted.

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The Planner module is responsible for finding the best access path, so each FDW should provide the cost for a ForeignPath.

In the planning phase, create_foreignscan_path() calls PlanRelScan() of the related FDW's FdwRoutine for each ForeignScan node. PlanRelScan() should provide appropriate costs for the scan, estimated in whatever way each FDW sees fit.

In the future, additional planner hooks might be added for:

# Pass-through mode (one ForeignScan node executes whole query)
# Query optimization such as merging multiple foreign tables into one remote query

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, each FDW can update its statistics in its own way.

=== Executor ===
The Executor module executes ForeignScan nodes by calling FDW routines.

typedef struct ForeignScan
{
Scan scan;

/* no additional fields now, but might be added later */
} ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for detail).
:Call FdwRoutine.Open() to prepare for query execution, for example by deparsing SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() for ForeignScan are not supported, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could properly deparse Sort nodes into an ORDER BY clause and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
ForeignDataWrapper *wrapper;
ForeignServer *server;
FSConnection *conn;
UserMapping *user;
ForeignTable *table;
FdwReply *reply;
} ForeignScanState;

FdwReply is an abstract type used to pass foreign-data wrapper specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store it in ForeignScanState.reply by casting it to FdwReply.

== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If there is no reusable connection in the cache, it calls FdwRoutine.ConnectServer() to establish a new connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server which the connection uses.
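
The lookup-or-connect behaviour described above can be sketched as a small cache keyed by server name. This is only an illustration under several assumptions: <code>ToyConnection</code>, <code>toy_connect_server</code>, <code>get_cached_connection</code> and <code>toy_connect_count</code> are hypothetical names, the fixed-size array and the 64-byte name buffer are arbitrary choices for the example, and in the design above the connection itself would come from FdwRoutine.ConnectServer().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_CONNS 8      /* arbitrary size for this sketch */
#define CONN_NAME_LEN    64     /* assumed maximum name length, incl. NUL */

typedef struct ToyConnection
{
    char name[CONN_NAME_LEN];   /* connection name == foreign server name */
    int  in_use;                /* 0 = free slot, 1 = holds a connection */
} ToyConnection;

static ToyConnection conn_cache[MAX_CACHED_CONNS];
static int connect_calls = 0;   /* how many times we "really" connected */

/* Stand-in for FdwRoutine.ConnectServer(): just fills in the cache slot. */
static void toy_connect_server(ToyConnection *conn, const char *srvname)
{
    strncpy(conn->name, srvname, CONN_NAME_LEN - 1);
    conn->name[CONN_NAME_LEN - 1] = '\0';
    conn->in_use = 1;
    connect_calls++;
}

/*
 * Return the cached connection for srvname, connecting first if the cache
 * has no entry for it.  Returns NULL when the cache is full.
 */
ToyConnection *get_cached_connection(const char *srvname)
{
    ToyConnection *free_slot = NULL;

    assert(srvname != NULL);
    for (size_t i = 0; i < MAX_CACHED_CONNS; i++)
    {
        if (conn_cache[i].in_use &&
            strncmp(conn_cache[i].name, srvname, CONN_NAME_LEN - 1) == 0)
            return &conn_cache[i];          /* cache hit: reuse */
        if (!conn_cache[i].in_use && free_slot == NULL)
            free_slot = &conn_cache[i];     /* remember first free slot */
    }
    if (free_slot == NULL)
        return NULL;    /* full; a real cache would evict or grow instead */

    toy_connect_server(free_slot, srvname); /* cache miss: connect and store */
    return free_slot;
}

int toy_connect_count(void)
{
    return connect_calls;
}
```

Requesting the same server name twice returns the same entry and connects only once, which is the reuse the paragraph describes. Keying on the server name alone follows the text; a cache that also had to distinguish user mappings would need a wider key.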

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create foreign tables which use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, but ''oids'' is not supported by file_fdw because it's a legacy feature.

The ''force_not_null'' is the only option which is read from per-column generic option. It should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares its code and connections.
dblink will be installed optionally, like standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. If there are other conditions, the remote server will send rows without applying those conditions, and the local server will evaluate them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}
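
The rule in the table can be read as a recursive test over the expression tree: a condition is sent to the remote server only if every node in it has one of the listed tags, every Param is external, and every operator or function involved is IMMUTABLE. The sketch below expresses that test over a hypothetical, much simplified <code>ToyExpr</code> node with at most two children. The <code>ToyTag</code> enum, the <code>immutable</code> and <code>external</code> flags and <code>is_pushable</code> are assumptions made for this example and do not correspond to PostgreSQL's real node structs or catalog lookups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node tags mirroring the "Tag name" column of the table. */
typedef enum ToyTag
{
    T_Const, T_Var, T_Array, T_Param, T_BoolExpr, T_NullTest,
    T_OpExpr, T_DistinctExpr, T_ScalarArrayOpExpr, T_FuncExpr,
    T_Other                     /* any node type not listed in the table */
} ToyTag;

typedef struct ToyExpr
{
    ToyTag tag;
    bool   immutable;           /* operator/function nodes: is it IMMUTABLE? */
    bool   external;            /* T_Param only: is it PARAM_EXTERNAL? */
    const struct ToyExpr *args[2];  /* child expressions, NULL if unused */
} ToyExpr;

/* Return true if the whole expression tree may be sent to the remote server. */
bool is_pushable(const ToyExpr *expr)
{
    if (expr == NULL)
        return true;            /* an absent child imposes no restriction */

    switch (expr->tag)
    {
        case T_Const:
        case T_Var:
        case T_Array:
        case T_BoolExpr:
        case T_NullTest:
            break;              /* always allowed; children checked below */
        case T_Param:
            if (!expr->external)
                return false;   /* only external parameters are allowed */
            break;
        case T_OpExpr:
        case T_DistinctExpr:
        case T_ScalarArrayOpExpr:
        case T_FuncExpr:
            if (!expr->immutable)
                return false;   /* underlying function must be IMMUTABLE */
            break;
        default:
            return false;       /* node type not in the table */
    }
    return is_pushable(expr->args[0]) && is_pushable(expr->args[1]);
}
```

A condition for which the test fails is not an error: it is simply left out of the remote query and evaluated locally against the rows the remote server returns.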

None of ORDER BY, LIMIT, OFFSET, GROUP BY or HAVING is used in a foreign query.

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). Cursors could be used instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING related to the owner of the function. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> disconnects all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from the core like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and/or NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: To have foreign tables support inheritance, tuples from a foreign table should supply tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overheads and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow the overhead to be communicated to the planner.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass value(s) from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.
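
The second option, replacing the function call with a constant before the condition is deparsed, might look roughly like the following. The <code>ToyNode</code> tree, the <code>ToyKind</code> tags and <code>fold_current_timestamp</code> are hypothetical simplifications invented for this sketch, and a plain <code>time_t</code> stands in for a timestamp value.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical node kinds; real expression trees have many more. */
typedef enum ToyKind
{
    TOY_CONST,                      /* literal value */
    TOY_FUNC_CURRENT_TIMESTAMP,     /* call to CURRENT_TIMESTAMP */
    TOY_OP                          /* binary operator with two children */
} ToyKind;

typedef struct ToyNode
{
    ToyKind kind;
    time_t  value;                  /* valid only when kind == TOY_CONST */
    struct ToyNode *left;
    struct ToyNode *right;
} ToyNode;

/*
 * Replace every CURRENT_TIMESTAMP call in the tree with a constant holding
 * "now", evaluated once by the caller on the local server.  Returns the
 * number of nodes that were replaced.
 */
int fold_current_timestamp(ToyNode *node, time_t now)
{
    if (node == NULL)
        return 0;

    if (node->kind == TOY_FUNC_CURRENT_TIMESTAMP)
    {
        node->kind = TOY_CONST;     /* the call becomes a plain constant */
        node->value = now;
        return 1;
    }
    return fold_current_timestamp(node->left, now) +
           fold_current_timestamp(node->right, now);
}
```

Because the value is computed once on the local side and passed in, every occurrence in the condition receives the same constant, and the remote server never evaluates its own clock.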

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-12-21T08:04:32Z

Hanada: /* pg_catalog.pg_attribute */

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will over time introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that data can be retrieved from foreign servers through foreign tables. The syntax for querying foreign tables should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_syntax''' branch contains syntax of SQL/MED
* '''fdw_scan''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library. An approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better, because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and columns of foreign tables can have generic options with OPTIONS syntax. Because of the syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, NOT NULL constraints, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but this doesn't seem realistic. Instead, a PostgreSQL-specific and C language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as JOINs of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Per discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as container of FDW-specific planning information.
* Add FdwExecutionState as container of FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as a file wrapper, would not need a connection.
* Add a planner hook which allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In the future, more planner hooks might be added to allow FDWs to optimize the query.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

In the first version, the syntax for defining column-level generic options will be omitted.

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The Planner module is responsible for finding the best access path, so each FDW should provide the cost for a ForeignPath.

In the planning phase, cost_foreignscan() calls EstimateCosts() of the related FDW's FdwRoutine for each ForeignScan node.

EstimateCosts() should provide appropriate costs, estimated in whatever way each FDW sees fit.

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, each FDW can update its statistics in its own way.

=== Executor ===
The Executor module executes ForeignScan nodes by calling FDW routines.

typedef struct ForeignScan
{
Scan scan;

/* no additional fields now, but might be added later */
} ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for detail).
:Call FdwRoutine.Open() to prepare for query execution, for example by deparsing SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() for ForeignScan are not supported, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could properly deparse Sort nodes into an ORDER BY clause and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
ForeignDataWrapper *wrapper;
ForeignServer *server;
FSConnection *conn;
UserMapping *user;
ForeignTable *table;
FdwReply *reply;
} ForeignScanState;

FdwReply is an abstract type used to pass FDW-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.reply by casting the pointer to FdwReply *.
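
The casting pattern can be sketched in plain C as follows. SketchScanState stands in for ForeignScanState, reduced to the single field used here, and SketchFileReply and the sketch_* functions are hypothetical names invented for this example; only the idea of hiding wrapper-private data behind an opaque FdwReply pointer comes from the design above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct FdwReply FdwReply;    /* opaque to the core, never defined */

/* Stand-in for ForeignScanState, reduced to the one field used here. */
typedef struct SketchScanState
{
    FdwReply   *reply;
} SketchScanState;

/* Hypothetical private state of one particular wrapper. */
typedef struct SketchFileReply
{
    long        next_row;
    long        total_rows;
} SketchFileReply;

/* Allocate the private state and hide it behind the opaque pointer. */
void
sketch_open(SketchScanState *state, long total_rows)
{
    SketchFileReply *priv;

    assert(state != NULL);
    priv = malloc(sizeof(*priv));
    if (priv == NULL)
        abort();
    priv->next_row = 0;
    priv->total_rows = total_rows;
    state->reply = (FdwReply *) priv;
}

/* Cast back to the private type; return 1 while rows remain, 0 at the end. */
int
sketch_iterate(SketchScanState *state)
{
    SketchFileReply *priv = (SketchFileReply *) state->reply;

    if (priv->next_row >= priv->total_rows)
        return 0;
    priv->next_row++;
    return 1;
}

/* Release the private state. */
void
sketch_close(SketchScanState *state)
{
    free(state->reply);
    state->reply = NULL;
}
```

The core never looks inside the pointer, so each wrapper is free to choose its own layout. Inside the backend the allocation would use palloc() rather than malloc().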

== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If there is none, it calls FdwRoutine.ConnectServer() to get a concrete connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
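
The lookup-then-connect behaviour described above can be sketched with a small fixed-size cache keyed by server name. Everything here is hypothetical: SketchConn, the cache size, and sketch_connect_server(), which merely stands in for FdwRoutine.ConnectServer() and hands out a counter instead of a real connection.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_CONNS 8

typedef struct SketchConn
{
    char        name[64];   /* same as the foreign server's name */
    int         handle;     /* stand-in for the FDW's concrete connection */
    int         in_use;
} SketchConn;

static SketchConn sketch_cache[SKETCH_MAX_CONNS];
static int  sketch_connect_calls = 0;

/* Stand-in for FdwRoutine.ConnectServer(); counts how often it is called. */
static int
sketch_connect_server(const char *srvname)
{
    (void) srvname;
    return ++sketch_connect_calls;
}

/*
 * Return the cached connection for srvname, connecting on a cache miss.
 * Returns NULL if the cache is full. Names are assumed to fit in 63 bytes.
 */
SketchConn *
sketch_get_connection(const char *srvname)
{
    SketchConn *free_slot = NULL;

    assert(srvname != NULL);
    for (int i = 0; i < SKETCH_MAX_CONNS; i++)
    {
        if (sketch_cache[i].in_use &&
            strcmp(sketch_cache[i].name, srvname) == 0)
            return &sketch_cache[i];        /* reuse the cached connection */
        if (!sketch_cache[i].in_use && free_slot == NULL)
            free_slot = &sketch_cache[i];
    }
    if (free_slot == NULL)
        return NULL;

    strncpy(free_slot->name, srvname, sizeof(free_slot->name) - 1);
    free_slot->name[sizeof(free_slot->name) - 1] = '\0';
    free_slot->handle = sketch_connect_server(srvname);
    free_slot->in_use = 1;
    return free_slot;
}
```

Because the cache lives in static storage it lasts as long as the process, which matches the "lifetime of the backend" rule above. A real implementation would more likely use the backend's hash table facilities than a linear array.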

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently stdin is not supported, although COPY FROM allows it.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create foreign tables that use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it is a legacy feature.

''force_not_null'' is the only option that is read from per-column generic options. It should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like other standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, the foreign server and the user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unneeded column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
Only some conditions qualify: their PlanState.qual must consist of only the following node types. Any other conditions are left out of the remote query, so the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.
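
The node-type restriction in the table above amounts to a recursive walk over the condition tree. The following C sketch shows that walk on a deliberately simplified, hypothetical tree: SketchTag, SketchExpr and sketch_is_pushable() are invented for this example and cover only some of the listed node types, with at most two operands per node. The real check would walk PostgreSQL's own expression nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified node tags; the last two are never pushed down. */
typedef enum SketchTag
{
    SK_CONST, SK_VAR, SK_PARAM_EXTERN, SK_BOOL_EXPR, SK_NULL_TEST,
    SK_OP_EXPR, SK_FUNC_EXPR, SK_PARAM_EXEC, SK_SUBLINK
} SketchTag;

typedef struct SketchExpr
{
    SketchTag   tag;
    bool        immutable;  /* only meaningful for SK_OP_EXPR / SK_FUNC_EXPR */
    const struct SketchExpr *left;      /* operands, NULL if absent */
    const struct SketchExpr *right;
} SketchExpr;

/* True if the whole tree consists of node types the remote side may evaluate. */
bool
sketch_is_pushable(const SketchExpr *expr)
{
    if (expr == NULL)
        return true;            /* absent operand */

    switch (expr->tag)
    {
        case SK_CONST:
        case SK_VAR:
        case SK_PARAM_EXTERN:
            /* leaves carry no operands */
            assert(expr->left == NULL && expr->right == NULL);
            return true;
        case SK_BOOL_EXPR:
        case SK_NULL_TEST:
            break;
        case SK_OP_EXPR:
        case SK_FUNC_EXPR:
            if (!expr->immutable)
                return false;   /* the result could differ on the remote side */
            break;
        default:
            return false;       /* any other node type stays local */
    }
    return sketch_is_pushable(expr->left) && sketch_is_pushable(expr->right);
}
```

In this sketch a condition such as "col = 1" qualifies when "=" is immutable, while a comparison against a non-immutable function call does not and would be evaluated locally.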

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq allocates memory with malloc() rather than palloc(). More research might be needed to avoid the copy.
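
The fetch-on-first-Iterate() behaviour can be sketched without libpq as follows. SketchScan, sketch_fetch_all() and the lazy_* functions are hypothetical; sketch_fetch_all() stands in for running the remote query and copying every row into a local store, and simply produces three fixed rows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_ROWS 16

typedef struct SketchScan
{
    bool        fetched;        /* has the whole result been retrieved? */
    int         rows[SKETCH_MAX_ROWS];
    size_t      nrows;
    size_t      next;
    int         fetch_calls;    /* how often the remote query was run */
} SketchScan;

/* Stand-in for executing the remote query and storing all rows locally. */
static void
sketch_fetch_all(SketchScan *scan)
{
    scan->nrows = 3;
    for (size_t i = 0; i < scan->nrows; i++)
        scan->rows[i] = (int) (i + 1) * 10;
    scan->fetch_calls++;
}

/* Open() only prepares the scan; nothing is fetched yet. */
void
lazy_open(SketchScan *scan)
{
    scan->fetched = false;
    scan->nrows = 0;
    scan->next = 0;
    scan->fetch_calls = 0;
}

/* ReOpen() discards the result, so the next Iterate() fetches again. */
void
lazy_reopen(SketchScan *scan)
{
    scan->fetched = false;
    scan->next = 0;
}

/* Iterate() fetches everything on its first call, then returns row by row. */
bool
lazy_iterate(SketchScan *scan, int *row)
{
    assert(scan != NULL && row != NULL);
    if (!scan->fetched)
    {
        sketch_fetch_all(scan);
        scan->fetched = true;
        scan->next = 0;
    }
    if (scan->next >= scan->nrows)
        return false;
    *row = scan->rows[scan->next++];
    return true;
}
```

The whole result sits in memory between calls, which is exactly the cost that a cursor-based approach would avoid.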

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions (SRFs). We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: An FDW requires a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate for determining the USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage individual foreign connections. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign table? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table must supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if we add an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data differs from the cost of accessing local data, even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would let the user tell the planner about the overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the join itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only some of the columns. Unneeded columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Certain kinds of plan, e.g. a nested loop, generate internal parameters to pass values from a parent node to a child node. The number of rows fetched from a foreign server can be reduced by applying an internal parameter to the remote query.

; Optimize parameter
: Some foreign scans use only some of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node holding the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-12-21T07:58:14Z

Hanada: /* FDW routines */ new FdwRoutine was proposed

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The query syntax for foreign tables should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_syntax''' branch contains syntax of SQL/MED
* '''fdw_scan''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW handler function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with FDW handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, each connected to a different foreign server. This can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and their columns can have generic options, specified with the OPTIONS syntax. Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, NOT NULL constraints, column DEFAULT values, and column-level options are omitted to simplify the patch and make review easier.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
=== Version 1 ===
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as access method (AM) or storage manager (smgr) layers, it would be harder to optimize complex queries such as JOINs of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in the PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

=== Version 2 ===
Following discussion and [http://archives.postgresql.org/pgsql-hackers/2010-11/msg01713.php Heikki Linnakangas's proposal], FdwRoutine was changed in several respects:

* Add FdwPlan as a container for FDW-specific planning information.
* Add FdwExecutionState as a container for FDW-specific execution information.
* Connection management is left to each FDW, because a simple FDW, such as the file wrapper, would not need connections.
* Add a planner hook that allows FDWs to generate an FDW-specific plan from RelOptInfo and other information. That plan will be passed to BeginScan() to execute the scan.

struct FdwPlan {
NodeTag type; /* FdwPlan need copyObject() support for plan
caching */
char *explainInfo; /* FDW-specific info shown in EXPLAIN VERBOSE */
double startup_cost; /* Optimizer needs costs for each path */
double total_cost;
List *private; /* FDW can store private data as copy-able objects */
};

struct FdwExecutionState
{
void *private; /* FDW-private data */
};

struct FdwRoutine
{
#ifdef IN_THE_FUTURE
FdwPlan *(*PlanNative)(Oid serverid, char *query);
FdwPlan *(*PlanQuery)(PlannerInfo *root, Query query);
#endif
FdwPlan *(*PlanRelScan)(Oid foreigntableid, PlannerInfo *root,
RelOptInfo *baserel);
FdwExecutionState *(*BeginScan)(FdwPlan *plan, ParamListInfo params);
void (*Iterate)(FdwExecutionState *state, TupleTableSlot *slot);
void (*ReScan)(FdwExecutionState *state);
void (*EndScan)(FdwExecutionState *state);
};

In the future, more planner hooks might be added to allow FDWs to optimize the query.
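
To show how a routine table of this shape is meant to be driven, here is a self-contained C sketch. It does not use the PostgreSQL types above: SketchState, SketchRoutine and the counter_* callbacks are hypothetical simplifications, with an int standing in for the TupleTableSlot and an Iterate() that returns 0 at end of scan, which is an assumption of this sketch rather than part of the proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-scan state: yields the integers 0 .. limit-1. */
typedef struct SketchState
{
    int         next;
    int         limit;
} SketchState;

/* Simplified analogue of FdwRoutine: a table of function pointers. */
typedef struct SketchRoutine
{
    void        (*BeginScan) (SketchState *state, int limit);
    int         (*Iterate) (SketchState *state, int *slot);  /* 0 = no more rows */
    void        (*ReScan) (SketchState *state);
    void        (*EndScan) (SketchState *state);
} SketchRoutine;

static void
counter_begin(SketchState *state, int limit)
{
    state->next = 0;
    state->limit = limit;
}

static int
counter_iterate(SketchState *state, int *slot)
{
    if (state->next >= state->limit)
        return 0;
    *slot = state->next++;
    return 1;
}

static void
counter_rescan(SketchState *state)
{
    state->next = 0;
}

static void
counter_end(SketchState *state)
{
    state->next = state->limit = 0;
}

/* Plays the role of the handler function: returns the routine set. */
const SketchRoutine *
sketch_counter_handler(void)
{
    static const SketchRoutine routine = {
        counter_begin, counter_iterate, counter_rescan, counter_end
    };

    return &routine;
}

/* Executor-side loop: begin, iterate until exhausted, end. */
int
sketch_run_scan(const SketchRoutine *routine, int limit)
{
    SketchState state;
    int         slot;
    int         sum = 0;

    assert(routine != NULL);
    routine->BeginScan(&state, limit);
    while (routine->Iterate(&state, &slot))
        sum += slot;
    routine->EndScan(&state);
    return sum;
}
```

The caller only ever sees the function pointers, so a different wrapper can be swapped in by returning a different table from its handler.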

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server ID and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column, attgenoptions, of type text[].

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, cost_foreignscan() calls EstimateCosts() from the related FDW's FdwRoutine for each ForeignScan node.

EstimateCosts() should return costs estimated in whatever way suits the FDW.

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage we don't provide a common mechanism for storing statistics. Once such a mechanism has been implemented, FdwRoutine should gain another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

=== Executor ===
The executor runs ForeignScan nodes by calling FDW routines.

typedef struct ForeignScan
{
Scan scan;

/* no additional fields now, but might be added later */
} ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for details).
:Call FdwRoutine.Open() to prepare the query for execution, for example by deparsing SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:MarkPos() and RestrPos() are currently not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
ForeignDataWrapper *wrapper;
ForeignServer *server;
FSConnection *conn;
UserMapping *user;
ForeignTable *table;
FdwReply *reply;
} ForeignScanState;

FdwReply is an abstract type used to pass FDW-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store it in ForeignScanState.reply by casting the pointer to FdwReply *.

== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If there is none, it calls FdwRoutine.ConnectServer() to get a concrete connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently stdin is not supported, although COPY FROM allows it.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create foreign tables that use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. Options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it is a legacy feature.

''force_not_null'' is the only option that is read from per-column generic options. It should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like other standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, the foreign server and the user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unneeded column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
Only some conditions qualify: their PlanState.qual must consist of only the following node types. Any other conditions are left out of the remote query, so the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq allocates memory with malloc() rather than palloc(). More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions (SRFs). We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: An FDW requires a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate for determining the USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage individual foreign connections. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign table? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table must supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A generic option like '''cost_factor''' would let users tell the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: Possible approaches are rewriting the query as pgpool does, or replacing the FuncExpr node with a Const node that holds the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-12-21T07:18:22Z

Hanada: /* Syntax */ simplified syntax for version 1

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in the database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* The '''master''' branch is a copy of postgres' HEAD.
* The '''fdw_syntax''' branch contains the syntax of SQL/MED.
* The '''fdw_scan''' branch contains the core functionality of SQL/MED.
* The '''pgsql_fdw''' branch contains the FDW for external PostgreSQL servers.
* The '''file_fdw''' branch contains the FDW for flat files.

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

 -- Register a function that returns the FDW handler function set.
 CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
     AS 'MODULE_PATHNAME'
     LANGUAGE C;
 
 -- Create a foreign data wrapper with the FDW handler.
 CREATE FOREIGN DATA WRAPPER postgresql
     HANDLER postgresql_fdw_handler
     VALIDATOR postgresql_fdw_validator;

CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

 -- Create a foreign server.
 CREATE SERVER remote_postgresql_server
     FOREIGN DATA WRAPPER postgresql
     OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );
 
 -- Create a user mapping.
 CREATE USER MAPPING FOR postgres
     SERVER remote_postgresql_server
     OPTIONS ( user 'someuser', password 'secret' );

These two statements are not changed.

 -- Create a foreign table.
 CREATE FOREIGN TABLE schemaname.tablename (
     column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
     ...
 )
     INHERITS ( parent )
     SERVER remote_postgresql_server
     OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. This can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and columns of foreign tables can have generic options, given with the OPTIONS syntax. Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

In the first version, the NOT NULL constraint, column DEFAULT values, and column-level options are omitted to simplify the patch and ease review.
[http://archives.postgresql.org/pgsql-hackers/2010-12/msg01168.php hackers-ML archive]

== FDW routines ==
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but this doesn't seem realistic. Instead, a PostgreSQL-specific and C-specific routine set would be feasible:

 /* FDW interface routines */
 typedef struct FdwRoutine
 {
     FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
     void (*FreeFSConnection)(FSConnection *conn);
     void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
     void (*BeginScan)(ForeignScanState *scanstate);
     void (*Open)(ForeignScanState *scanstate);
     void (*Iterate)(ForeignScanState *scanstate);
     void (*Close)(ForeignScanState *scanstate);
     void (*ReOpen)(ForeignScanState *scanstate);
 } FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as JOINs of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.
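
The handler idea above can be sketched outside PostgreSQL as a plain table of function pointers. This is only an illustration: <code>SketchScanState</code>, <code>SketchFdwRoutine</code> and the <code>sketch_*</code> functions are hypothetical stand-ins, not the real ForeignScanState or FdwRoutine, and the "foreign data" is just three hard-coded integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ForeignScanState (not the real executor struct). */
typedef struct SketchScanState
{
    int next;     /* index of the next row to return */
    int nrows;    /* number of rows the fake source holds */
    int current;  /* value produced by the last Iterate call */
    int has_row;  /* 1 if 'current' holds a row, 0 at end of scan */
} SketchScanState;

/* Reduced routine table: only the scan callbacks of FdwRoutine. */
typedef struct SketchFdwRoutine
{
    void (*Open)(SketchScanState *scanstate);
    void (*Iterate)(SketchScanState *scanstate);
    void (*Close)(SketchScanState *scanstate);
    void (*ReOpen)(SketchScanState *scanstate);
} SketchFdwRoutine;

static void sketch_open(SketchScanState *s)
{
    s->next = 0;
    s->nrows = 3;
    s->has_row = 0;
}

static void sketch_iterate(SketchScanState *s)
{
    if (s->next < s->nrows)
    {
        s->current = s->next * 10;  /* fake rows: 0, 10, 20 */
        s->next++;
        s->has_row = 1;
    }
    else
        s->has_row = 0;
}

static void sketch_close(SketchScanState *s)
{
    s->nrows = 0;
    s->has_row = 0;
}

static void sketch_reopen(SketchScanState *s)
{
    s->next = 0;
    s->has_row = 0;
}

/* Plays the role of the handler function: returns the routine set. */
static const SketchFdwRoutine *sketch_fdw_handler(void)
{
    static const SketchFdwRoutine routine = {
        sketch_open, sketch_iterate, sketch_close, sketch_reopen
    };
    return &routine;
}

/* Drives one scan through the table, the way a caller in core might. */
static int sketch_scan_sum(const SketchFdwRoutine *r)
{
    SketchScanState s;
    int sum = 0;

    assert(r != NULL);
    r->Open(&s);
    for (r->Iterate(&s); s.has_row; r->Iterate(&s))
        sum += s.current;
    r->Close(&s);
    return sum;
}
```

The caller only ever sees the routine table, so a different wrapper can be swapped in by returning a different table from its handler.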

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This specification still supports using a foreign-data wrapper purely as a container of connection information, as before.

 CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
     fdwname name NOT NULL UNIQUE,
     fdwowner oid NOT NULL REFERENCES pg_authid (oid),
     fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
     fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
     fdwacl aclitem[],
     fdwoptions text[]
 )
 WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

 CREATE TABLE pg_catalog.pg_foreign_table (
     ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
     ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
     ftoptions text[]
 )
 WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, cost_foreignscan() calls EstimateCosts() of the related FDW's FdwRoutine for each ForeignScan node.

EstimateCosts() should return costs estimated in whatever way the FDW sees fit.

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should gain another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.
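
As a rough illustration of what an EstimateCosts() implementation might compute, the sketch below derives a startup cost and a total cost from a row estimate. The formula, the <code>CostEstimate</code> struct and the function name are assumptions made for this example; the page does not prescribe any formula, and the <code>cost_factor</code> multiplier is borrowed from the idea listed under future improvements.

```c
#include <assert.h>

/* Hypothetical result type: the two costs the planner compares. */
typedef struct CostEstimate
{
    double startup_cost;  /* cost before the first row can be returned */
    double total_cost;    /* cost to return all rows */
} CostEstimate;

/*
 * Assumed model, not taken from any real FDW:
 *   startup = cost of opening the remote connection
 *   total   = startup + rows * per-row cost * cost_factor
 * where cost_factor > 1 means foreign rows are dearer than local ones.
 */
static CostEstimate cost_estimate_foreign_scan(double rows,
                                               double per_row_cost,
                                               double connect_cost,
                                               double cost_factor)
{
    CostEstimate c;

    assert(rows >= 0.0 && cost_factor > 0.0);
    c.startup_cost = connect_cost;
    c.total_cost = c.startup_cost + rows * per_row_cost * cost_factor;
    return c;
}
```

With 100 rows, a per-row cost of 0.5, a connection cost of 10 and a factor of 2, this model yields a startup cost of 10 and a total cost of 110.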

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

 typedef struct ForeignScan
 {
     Scan scan;
 
     /* no additional fields now, but might be added later */
 } ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for details).
:Call FdwRoutine.Open() to prepare query execution, for example by deparsing SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the state of the scan.

 typedef struct ForeignScanState
 {
     ScanState ss;
     FdwRoutine *routine;
     ForeignDataWrapper *wrapper;
     ForeignServer *server;
     FSConnection *conn;
     UserMapping *user;
     ForeignTable *table;
     FdwReply *reply;
 } ForeignScanState;

FdwReply is an abstract type used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store a pointer to it in ForeignScanState.reply by casting the pointer to FdwReply *.
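
The opaque-pointer pattern described here can be shown in a few lines of standalone C. In this sketch only the name <code>FdwReply</code> comes from the page; <code>ReplyScanState</code>, <code>FileReply</code> and the <code>reply_*</code> functions are hypothetical, and <code>malloc()</code> stands in for whatever allocator a real wrapper would use.

```c
#include <assert.h>
#include <stdlib.h>

/* Opaque to the caller: declared but never defined, as an abstract type. */
typedef struct FdwReply FdwReply;

/* Hypothetical reduced scan state: only the field this sketch needs. */
typedef struct ReplyScanState
{
    FdwReply *reply;  /* wrapper-private data, meaningless to core */
} ReplyScanState;

/* Private state of an imaginary file-reading wrapper. */
typedef struct FileReply
{
    int lineno;  /* number of lines handed out so far */
} FileReply;

/* Open: allocate private state and hide it behind the abstract pointer. */
static void reply_open(ReplyScanState *s)
{
    FileReply *priv = malloc(sizeof(*priv));

    assert(priv != NULL);
    priv->lineno = 0;
    s->reply = (FdwReply *) priv;
}

/* Iterate: cast back to the private type and advance; returns the line number. */
static int reply_iterate(ReplyScanState *s)
{
    FileReply *priv = (FileReply *) s->reply;

    return ++priv->lineno;
}

/* Close: release the private state. */
static void reply_close(ReplyScanState *s)
{
    free(s->reply);
    s->reply = NULL;
}
```

Core code can carry the pointer around without knowing its layout; only the wrapper that created it casts it back.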

== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
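
A minimal sketch of such a name-keyed cache follows, assuming a small fixed-size array and a counter in place of a real call to FdwRoutine.ConnectServer(). <code>CachedConn</code>, <code>conn_cache_get</code> and <code>conn_connect_calls</code> are names invented for this example and do not reflect the removed implementation.

```c
#include <assert.h>
#include <string.h>

#define CONN_CACHE_MAX 8

/* Hypothetical cache entry: one connection per foreign server name. */
typedef struct CachedConn
{
    char srvname[64];  /* cache key: name of the foreign server */
    int  id;           /* stands in for the real connection handle */
} CachedConn;

static CachedConn conn_cache[CONN_CACHE_MAX];
static int conn_cache_len = 0;
static int conn_connect_calls = 0;  /* how often we had to "connect" */

/*
 * Return the cached connection for srvname, connecting first if there is
 * none. Returns NULL when the cache is full.
 */
static CachedConn *conn_cache_get(const char *srvname)
{
    int i;

    assert(srvname != NULL);
    assert(strlen(srvname) < sizeof(conn_cache[0].srvname));

    /* Reuse an existing connection with the same name. */
    for (i = 0; i < conn_cache_len; i++)
        if (strcmp(conn_cache[i].srvname, srvname) == 0)
            return &conn_cache[i];

    if (conn_cache_len == CONN_CACHE_MAX)
        return NULL;

    /* Cache miss: this is where ConnectServer() would be called. */
    conn_connect_calls++;
    strcpy(conn_cache[conn_cache_len].srvname, srvname);
    conn_cache[conn_cache_len].id = conn_connect_calls;
    return &conn_cache[conn_cache_len++];
}
```

Two lookups for the same server name return the same entry and connect only once; a lookup for a different name adds a second entry.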

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create foreign tables that use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. The options of the COPY FROM statement are accepted, except ''oids'', which file_fdw does not support because it's a legacy feature.

The ''force_not_null'' option is the only one read from per-column generic options. It takes a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares code and connections with it.
dblink will be installed optionally, like other standard contrib modules.

==== Connection options ====
The connection options are constructed from all generic options of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all generic options are connection options.
Note that a non-superuser MUST specify a password in the generic options, and the foreign server must require password authentication, for security reasons.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. Other conditions are not sent to the remote server; it returns rows without applying them, and the local server evaluates them on the returned rows.
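
The NULL-replacement idea can be sketched as a small deparsing helper. This is a standalone illustration, not code from pgsql_fdw: <code>deparse_select</code> and <code>deparse_append</code> are hypothetical names, the caller is assumed to know which columns the scan needs, and identifier quoting is deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append s to buf if it fits; returns 1 on success, 0 if buf is too small. */
static int deparse_append(char *buf, size_t buflen, size_t *len, const char *s)
{
    size_t n = strlen(s);

    if (*len + n + 1 > buflen)
        return 0;
    memcpy(buf + *len, s, n + 1);  /* copies the terminating NUL too */
    *len += n;
    return 1;
}

/*
 * Build "SELECT c1, NULL, c3 FROM rel": columns with used[i] == 0 are sent
 * as NULL so the column positions stay the same as in "SELECT *".
 * Returns 1 on success, 0 if the buffer is too small.
 */
static int deparse_select(char *buf, size_t buflen, const char *relname,
                          const char *const *colnames, const int *used,
                          int ncols)
{
    size_t len = 0;
    int i;

    assert(buf != NULL && buflen > 0 && ncols > 0);
    buf[0] = '\0';

    if (!deparse_append(buf, buflen, &len, "SELECT "))
        return 0;
    for (i = 0; i < ncols; i++)
    {
        if (i > 0 && !deparse_append(buf, buflen, &len, ", "))
            return 0;
        if (!deparse_append(buf, buflen, &len, used[i] ? colnames[i] : "NULL"))
            return 0;
    }
    return deparse_append(buf, buflen, &len, " FROM ")
        && deparse_append(buf, buflen, &len, relname);
}
```

Keeping a NULL placeholder rather than dropping the column means the local tuple descriptor does not have to change, at the price of still transferring one NULL per unused column.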
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERN"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.
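
The whitelist in the table above amounts to a recursive walk over each condition. The sketch below models that walk on a toy expression tree; <code>QualNode</code>, <code>QualTag</code> and both functions are hypothetical and much simpler than PostgreSQL's real node types, and each node is assumed to carry at most two arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Toy node tags; Q_OTHER stands for any node type not in the whitelist. */
typedef enum QualTag
{
    Q_CONST, Q_VAR, Q_PARAM, Q_BOOLEXPR, Q_NULLTEST,
    Q_OPEXPR, Q_FUNCEXPR, Q_OTHER
} QualTag;

typedef struct QualNode
{
    QualTag tag;
    int immutable;  /* Q_OPEXPR/Q_FUNCEXPR: underlying function is IMMUTABLE */
    int external;   /* Q_PARAM: parameter is an external one */
    const struct QualNode *args[2];  /* children, NULL if absent */
} QualNode;

/* Returns 1 if the whole subtree uses only whitelisted node types. */
static int qual_is_pushable(const QualNode *n)
{
    int i;

    if (n == NULL)
        return 1;  /* an absent argument restricts nothing */

    switch (n->tag)
    {
        case Q_CONST:
        case Q_VAR:
        case Q_BOOLEXPR:
        case Q_NULLTEST:
            break;
        case Q_PARAM:
            if (!n->external)
                return 0;
            break;
        case Q_OPEXPR:
        case Q_FUNCEXPR:
            if (!n->immutable)
                return 0;
            break;
        default:
            return 0;
    }

    for (i = 0; i < 2; i++)
        if (!qual_is_pushable(n->args[i]))
            return 0;
    return 1;
}

/* Counts the conditions that can be sent to the remote server. */
static int qual_count_pushable(const QualNode *const *quals, int nquals)
{
    int i, n = 0;

    assert(quals != NULL && nquals >= 0);
    for (i = 0; i < nquals; i++)
        if (qual_is_pushable(quals[i]))
            n++;
    return n;
}
```

Conditions that fail the check stay in the local qual list and are evaluated on the rows returned by the foreign server.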

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate the memory. Further research might find a way to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines resemble a SETOF record function (SRF). We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions in a C++-like interface. However, we also have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing pg_class fields, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. Users would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table must supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A generic option like '''cost_factor''' would let users tell the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: Possible approaches are rewriting the query as pgpool does, or replacing the FuncExpr node with a Const node that holds the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-12-21T06:56:43Z

Hanada: /* Active Work In Progress */ separate fdw_core into two

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in the database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* The '''master''' branch is a copy of postgres' HEAD.
* The '''fdw_syntax''' branch contains the syntax of SQL/MED.
* The '''fdw_scan''' branch contains the core functionality of SQL/MED.
* The '''pgsql_fdw''' branch contains the FDW for external PostgreSQL servers.
* The '''file_fdw''' branch contains the FDW for flat files.

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

 -- Register a function that returns the FDW connector function set.
 CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
     AS 'MODULE_PATHNAME'
     LANGUAGE C;
 
 -- Create a foreign data wrapper with a connection handler.
 CREATE FOREIGN DATA WRAPPER postgresql
     HANDLER postgresql_fdw_handler
     VALIDATOR postgresql_fdw_validator;

CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

 -- Create a foreign server.
 CREATE SERVER remote_postgresql_server
     FOREIGN DATA WRAPPER postgresql
     OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );
 
 -- Create a user mapping.
 CREATE USER MAPPING FOR postgres
     SERVER remote_postgresql_server
     OPTIONS ( user 'someuser', password 'secret' );

These two statements are not changed.

 -- Create a foreign table.
 CREATE FOREIGN TABLE schemaname.tablename (
     column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
     ...
 )
     INHERITS ( parent )
     SERVER remote_postgresql_server
     OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to different foreign servers. It can be used like as [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and their columns can have generic options, specified with the OPTIONS syntax. Because the grammar is ambiguous between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

== FDW routines ==
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a routine set specific to PostgreSQL and the C language would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner and executor modules. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner is responsible for finding the best access path, so an FDW should provide the cost of a ForeignPath.

In the planning phase, cost_foreignscan() calls EstimateCosts() of the related FDW's FdwRoutine for each ForeignScan node.

EstimateCosts() should provide appropriate costs, estimated in whatever way each FDW chooses.

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should get another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.
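As an illustration only, an EstimateCosts()-style callback could fill in a startup cost and a total cost from a simple model. The sketch below is self-contained and does not use PostgreSQL headers: <code>ToyForeignPath</code>, <code>toy_estimate_costs()</code> and the parameters <code>connect_cost</code> and <code>per_row_cost</code> are hypothetical names, and the linear model is an assumption rather than the model of any real FDW.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the cost fields of a ForeignPath. */
typedef struct ToyForeignPath
{
    double startup_cost;    /* cost before the first row can be returned */
    double total_cost;      /* cost to return all rows */
} ToyForeignPath;

/*
 * Toy cost model (an assumption, not taken from any real FDW):
 * a fixed connection overhead plus a per-row transfer cost.
 */
void
toy_estimate_costs(ToyForeignPath *path, double rows,
                   double connect_cost, double per_row_cost)
{
    assert(path != NULL);
    assert(rows >= 0.0);

    path->startup_cost = connect_cost;
    path->total_cost = connect_cost + rows * per_row_cost;
}
```

With such a model, an FDW that keeps its own row-count statistics would only need to supply <code>rows</code>; the two cost constants could plausibly come from generic options.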

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

typedef struct ForeignScan
{
Scan scan;

/* no additional fields now, but might be added later */
} ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for detail).
:Call FdwRoutine.Open() to prepare to execute the query, for example by deparsing it into SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
ForeignDataWrapper *wrapper;
ForeignServer *server;
FSConnection *conn;
UserMapping *user;
ForeignTable *table;
FdwReply *reply;
} ForeignScanState;

FdwReply is an abstract type used to pass foreign-data-wrapper-specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store it in ForeignScanState.reply by casting it to FdwReply.
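The casting pattern and the Open/Iterate/ReOpen/Close call sequence can be sketched in standalone C. This is a toy model, not PostgreSQL code: <code>ToyScanState</code> and <code>ToyFdwRoutine</code> are reduced, hypothetical stand-ins for ForeignScanState and FdwRoutine, the <code>has_tuple</code> and <code>tuple</code> fields are invented for the example, and the "foreign table" is a fixed in-memory array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct FdwReply FdwReply;           /* opaque to the core */
typedef struct ToyScanState ToyScanState;

/* Reduced routine table: scan callbacks only. */
typedef struct ToyFdwRoutine
{
    void (*Open)(ToyScanState *ss);
    void (*Iterate)(ToyScanState *ss);
    void (*Close)(ToyScanState *ss);
    void (*ReOpen)(ToyScanState *ss);
} ToyFdwRoutine;

struct ToyScanState
{
    const ToyFdwRoutine *routine;
    FdwReply   *reply;      /* FDW-private state, cast by the FDW */
    int         has_tuple;  /* hypothetical: set by Iterate() */
    int         tuple;      /* hypothetical one-column "tuple" */
};

/* Private state of a toy FDW that scans a fixed array. */
typedef struct ArrayReply
{
    size_t next;            /* index of the next row to return */
} ArrayReply;

static const int toy_rows[] = {10, 20, 30};
#define TOY_NROWS (sizeof(toy_rows) / sizeof(toy_rows[0]))

static void array_open(ToyScanState *ss)
{
    ArrayReply *r = malloc(sizeof(ArrayReply));

    assert(r != NULL);
    r->next = 0;
    ss->reply = (FdwReply *) r;     /* store private state as FdwReply */
}

static void array_iterate(ToyScanState *ss)
{
    ArrayReply *r = (ArrayReply *) ss->reply;   /* cast back */

    ss->has_tuple = (r->next < TOY_NROWS);
    if (ss->has_tuple)
        ss->tuple = toy_rows[r->next++];
}

static void array_reopen(ToyScanState *ss)
{
    ((ArrayReply *) ss->reply)->next = 0;       /* restart the scan */
}

static void array_close(ToyScanState *ss)
{
    free(ss->reply);
    ss->reply = NULL;
}

const ToyFdwRoutine toy_array_routine = {
    array_open, array_iterate, array_close, array_reopen
};

/* Drive an already-opened scan to its end; return the sum of the tuples. */
long toy_drain(ToyScanState *ss)
{
    long sum = 0;

    for (;;)
    {
        ss->routine->Iterate(ss);
        if (!ss->has_tuple)
            break;
        sum += ss->tuple;
    }
    return sum;
}
```

The core side only ever sees the opaque <code>FdwReply *</code>; each callback casts it back to the wrapper's own struct, so different wrappers can keep entirely different state behind the same field.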

== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If there is none, it calls FdwRoutine.ConnectServer() to get an actual connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}
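The lookup-or-connect behaviour described in this section can be sketched as a small name-keyed cache. Everything here is hypothetical and standalone: <code>ToyConnection</code>, <code>ToyConnCache</code>, <code>toy_cache_get()</code> and <code>toy_connect()</code> are invented for the example, with <code>toy_connect()</code> standing in for FdwRoutine.ConnectServer(). A real implementation would more likely use a hash table and would have to handle disconnection and errors.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CONNS 8
#define TOY_NAME_LEN  64

typedef struct ToyConnection
{
    char name[TOY_NAME_LEN];    /* same as the foreign server's name */
} ToyConnection;

typedef ToyConnection *(*ToyConnectFn)(const char *srvname);

typedef struct ToyConnCache
{
    ToyConnection *entries[TOY_MAX_CONNS];
    size_t         count;
} ToyConnCache;

/* Counts how often a real connection attempt was made. */
int toy_connect_calls = 0;

/* Toy connector: hands out slots from a static pool. */
ToyConnection *toy_connect(const char *srvname)
{
    static ToyConnection pool[TOY_MAX_CONNS];
    static size_t used = 0;
    ToyConnection *conn;

    if (used == TOY_MAX_CONNS)
        return NULL;
    conn = &pool[used++];
    strncpy(conn->name, srvname, TOY_NAME_LEN - 1);
    conn->name[TOY_NAME_LEN - 1] = '\0';
    toy_connect_calls++;
    return conn;
}

/* Return the cached connection for srvname, connecting on a cache miss. */
ToyConnection *toy_cache_get(ToyConnCache *cache, const char *srvname,
                             ToyConnectFn connect_fn)
{
    size_t i;
    ToyConnection *conn;

    assert(cache != NULL && srvname != NULL && connect_fn != NULL);

    for (i = 0; i < cache->count; i++)
        if (strcmp(cache->entries[i]->name, srvname) == 0)
            return cache->entries[i];       /* cache hit: reuse */

    if (cache->count == TOY_MAX_CONNS)
        return NULL;                        /* cache full */

    conn = connect_fn(srvname);             /* cache miss: connect */
    if (conn != NULL)
        cache->entries[cache->count++] = conn;
    return conn;
}
```

Two scans against the same server name share one connection, while a scan against a different server triggers exactly one additional connection attempt.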

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently stdin is not supported, although it is allowed in COPY FROM.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create foreign tables that use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. The options of the COPY FROM statement are accepted, except that ''oids'' is not supported by file_fdw because it's a legacy feature.

''force_not_null'' is the only option that is read from per-column generic options. It should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares code and connections with it.
dblink will be installed optionally, like the standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.
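A minimal sketch of how generic options from several objects might be concatenated into one libpq-style connection string. <code>ToyOption</code> and <code>toy_append_options()</code> are hypothetical names, not part of the patch, and the sketch assumes option values contain no spaces, quotes or backslashes; a real implementation would have to quote and escape them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name/value pair standing in for one generic option. */
typedef struct ToyOption
{
    const char *name;
    const char *value;
} ToyOption;

/*
 * Append "name=value" pairs to the NUL-terminated string in buf,
 * separated by single spaces.  Returns 0 on success, -1 if buf is
 * too small.  Values are NOT quoted or escaped (see the assumption
 * stated above).
 */
int toy_append_options(char *buf, size_t buflen,
                       const ToyOption *opts, size_t nopts)
{
    size_t used;
    size_t i;

    assert(buf != NULL && buflen > 0);
    used = strlen(buf);
    assert(used < buflen);

    for (i = 0; i < nopts; i++)
    {
        size_t room = buflen - used;
        int n = snprintf(buf + used, room, "%s%s=%s",
                         used > 0 ? " " : "",
                         opts[i].name, opts[i].value);

        if (n < 0 || (size_t) n >= room)
            return -1;              /* output was truncated */
        used += (size_t) n;
    }
    return 0;
}
```

Calling the function once per object (wrapper, server, user mapping) on the same buffer yields a single string such as <code>host=somehost port=5432 dbname=remotedb user=someuser password=secret</code>.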

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. Conditions that contain other node types are not sent to the remote server; it returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

Neither ORDER BY, LIMIT, OFFSET, GROUP BY nor HAVING is used in a foreign query.
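The whitelist in the table above amounts to a recursive walk over the condition tree. The sketch below is a toy model with invented types (<code>ToyExpr</code>, <code>ToyNodeTag</code>, <code>toy_is_pushable()</code>); it covers only some of the listed node types, assumes at most two arguments per node, and represents immutability as a plain flag instead of a catalog lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy node tags; TOY_OTHER stands for any node type not in the table. */
typedef enum ToyNodeTag
{
    TOY_CONST,
    TOY_VAR,
    TOY_PARAM_EXTERN,
    TOY_BOOL_EXPR,
    TOY_NULL_TEST,
    TOY_OP_EXPR,
    TOY_FUNC_EXPR,
    TOY_OTHER
} ToyNodeTag;

typedef struct ToyExpr
{
    ToyNodeTag tag;
    bool immutable;                 /* meaningful for OP/FUNC nodes only */
    const struct ToyExpr *args[2];  /* NULL marks an unused slot */
} ToyExpr;

/* True if the whole expression tree may be sent to the remote server. */
bool toy_is_pushable(const ToyExpr *e)
{
    size_t i;

    if (e == NULL)
        return true;                /* unused argument slot */

    switch (e->tag)
    {
        case TOY_CONST:
        case TOY_VAR:
        case TOY_PARAM_EXTERN:
        case TOY_BOOL_EXPR:
        case TOY_NULL_TEST:
            break;
        case TOY_OP_EXPR:
        case TOY_FUNC_EXPR:
            if (!e->immutable)
                return false;       /* must be an IMMUTABLE function */
            break;
        default:
            return false;           /* node type not on the whitelist */
    }

    for (i = 0; i < 2; i++)
        if (!toy_is_pushable(e->args[i]))
            return false;
    return true;
}
```

Under this model a condition like <code>col = 1</code> is pushable, whereas <code>col < now()</code> is not, because the function call below the operator is not immutable; such a condition would be evaluated locally.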

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead, to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. More research is needed to find a way to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. It would not be what the user expects if access to a foreign table via a SECURITY DEFINER function used the USER MAPPING related to the owner of the function. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> disconnects all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** The syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data, even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow the overhead to be communicated to the planner.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN locally and returns only the result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. a nested loop, generate internal parameters to pass value(s) from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL would be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.
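The second option, replacing the function node with a constant before deparsing, can be illustrated on a toy expression tree. <code>ToyNode</code>, <code>ToyTag</code> and <code>toy_fold_timestamp()</code> are hypothetical, and the timestamp is modelled as a plain <code>long</code>; real code would operate on FuncExpr and Const nodes.

```c
#include <assert.h>
#include <stddef.h>

typedef enum ToyTag
{
    TOY_CONST,
    TOY_VAR,
    TOY_OP,
    TOY_CURRENT_TIMESTAMP
} ToyTag;

typedef struct ToyNode
{
    ToyTag tag;
    long   value;               /* payload, used by TOY_CONST only */
    struct ToyNode *left;
    struct ToyNode *right;
} ToyNode;

/*
 * Replace every TOY_CURRENT_TIMESTAMP node in the tree by a constant
 * holding `now`, so that both servers see the same value.
 * Returns the number of nodes replaced.
 */
int toy_fold_timestamp(ToyNode *node, long now)
{
    int replaced = 0;

    if (node == NULL)
        return 0;

    if (node->tag == TOY_CURRENT_TIMESTAMP)
    {
        node->tag = TOY_CONST;
        node->value = now;
        replaced = 1;
    }
    return replaced
        + toy_fold_timestamp(node->left, now)
        + toy_fold_timestamp(node->right, now);
}
```

After folding, the condition contains only constants and column references, so it would pass a whitelist-style push-down check that rejects non-immutable function calls.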

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-11-25T09:00:26Z

Hanada: /* file_fdw */ file_fdw is a contrib module now, and doesn't support oids

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_core''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option, and FDW routines are exported directly from the library. An approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better, because we already have pg_proc as an existing function manager.

-- Register a function that returns FDW connector function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with connection handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. This can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and their columns can have generic options, specified with the OPTIONS syntax. Because the grammar is ambiguous between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

== FDW routines ==
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a routine set specific to PostgreSQL and the C language would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as before.
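The lookup from a wrapper's fdwhandler value to its routine set can be modelled in a few lines of standalone C. The names <code>ToyRoutine</code>, <code>ToyProc</code>, <code>toy_file_handler()</code> and <code>toy_get_routine()</code> are hypothetical, the small array stands in for a pg_proc lookup, and the OID 16384 is an arbitrary example value; only <code>Oid</code> and <code>InvalidOid</code> mirror PostgreSQL's own definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical, minimal routine set returned by a handler. */
typedef struct ToyRoutine
{
    const char *label;
} ToyRoutine;

/* A handler takes no arguments and returns the routine set. */
typedef const ToyRoutine *(*ToyHandlerFn)(void);

static const ToyRoutine toy_file_routine = {"toy_file"};

const ToyRoutine *toy_file_handler(void)
{
    return &toy_file_routine;
}

/* Stand-in for looking a function up in pg_proc by OID. */
typedef struct ToyProc
{
    Oid          oid;
    ToyHandlerFn fn;
} ToyProc;

#define TOY_FILE_HANDLER_OID ((Oid) 16384)  /* arbitrary example OID */

static const ToyProc toy_procs[] = {
    {TOY_FILE_HANDLER_OID, toy_file_handler}
};

/*
 * Resolve a wrapper's fdwhandler to its routine set.  Returns NULL
 * when the wrapper has no handler (InvalidOid) or the OID is unknown,
 * in which case no foreign table can be defined on the wrapper.
 */
const ToyRoutine *toy_get_routine(Oid fdwhandler)
{
    size_t i;

    if (fdwhandler == InvalidOid)
        return NULL;

    for (i = 0; i < sizeof(toy_procs) / sizeof(toy_procs[0]); i++)
        if (toy_procs[i].oid == fdwhandler)
            return toy_procs[i].fn();
    return NULL;
}
```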

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner and executor modules. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner is responsible for finding the best access path, so an FDW should provide the cost of a ForeignPath.

In the planning phase, cost_foreignscan() calls EstimateCosts() of the related FDW's FdwRoutine for each ForeignScan node.

EstimateCosts() should provide appropriate costs, estimated in whatever way each FDW chooses.

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should get another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

typedef struct ForeignScan
{
Scan scan;

/* no additional fields now, but might be added later */
} ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for detail).
:Call FdwRoutine.Open() to prepare to execute the query, for example by deparsing it into SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted results. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the status of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
ForeignDataWrapper *wrapper;
ForeignServer *server;
FSConnection *conn;
UserMapping *user;
ForeignTable *table;
FdwReply *reply;
} ForeignScanState;

FdwReply is an abstract type used to pass foreign-data-wrapper-specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store it in ForeignScanState.reply by casting it to FdwReply.

== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If there is none, it calls FdwRoutine.ConnectServer() to get an actual connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented as a contrib module.
Its implementation is based on COPY FROM, but the two are not integrated.

Currently stdin is not supported, although it is allowed in COPY FROM.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create foreign tables that use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. The options of the COPY FROM statement are accepted, except that ''oids'' is not supported by file_fdw because it's a legacy feature.

''force_not_null'' is the only option that is read from per-column generic options. It should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares code and connections with it.
dblink will be installed optionally, like the standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: their PlanState.qual must consist of only the following node types. Conditions that contain other node types are not sent to the remote server; it returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"; pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"; pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error: libpq allocates the PGresult with malloc() rather than palloc(), so that memory is not released automatically when an error aborts the query. More research is needed to find a way to avoid the copy.
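
To illustrate why cursors would help with memory consumption, here is a minimal C model that does not use libpq at all. <code>RemoteCursor</code>, <code>fetch_batch()</code> and <code>peak_buffered_rows()</code> are hypothetical names; the model only counts rows, assuming the local side must hold one whole batch at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a remote result set of `total` rows. */
typedef struct RemoteCursor
{
    int total;  /* rows the remote query would produce */
    int sent;   /* rows already handed to the local side */
} RemoteCursor;

/*
 * Fetch up to `batch` rows, the way "FETCH batch FROM cursor" would.
 * Returns the number of rows in this batch; 0 means the cursor is exhausted.
 */
int
fetch_batch(RemoteCursor *cur, int batch)
{
    int remaining;
    int n;

    assert(cur != NULL && batch > 0);
    remaining = cur->total - cur->sent;
    n = remaining < batch ? remaining : batch;
    cur->sent += n;
    return n;
}

/*
 * Drain the cursor and report the largest number of rows that had to be
 * buffered locally at any one time.
 */
int
peak_buffered_rows(int total, int batch)
{
    RemoteCursor cur = {total, 0};
    int peak = 0;
    int n;

    while ((n = fetch_batch(&cur, batch)) > 0)
    {
        if (n > peak)
            peak = n;
    }
    return peak;
}
```

Retrieving everything at once corresponds to a batch size equal to the full result, so the peak equals the result size; with a smaller batch the peak is bounded by the batch size, at the price of more round trips.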

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to a SETOF record function (SRF). We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions in a C++-like interface. However, we also have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate for determining the USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect an access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection individually. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the forms below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when the actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table must supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command can update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow telling the planner about this overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-11-25T08:18:56Z

Hanada: /* Active Work In Progress */ fix typo

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign-data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_core''' branch contains core functionality of SQL/MED
* '''pgsql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW connector function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with connection handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are not changed.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and their columns can have generic options specified with the OPTIONS syntax. Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

== FDW routines ==
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. It would be harder with other approaches, such as the AM (access method) or storage manager (smgr) layers, to optimize complex queries like a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as in the past.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner and executor modules. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so the FDW should provide the cost of a ForeignPath.

In the planning phase, cost_foreignscan() calls EstimateCosts() of the related FDW's FdwRoutine for each ForeignScan node.

EstimateCosts() should provide appropriate costs, estimated in whatever way each FDW chooses.

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism for storing statistics. Once such a mechanism has been implemented, FdwRoutine should get another function that is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

typedef struct ForeignScan
{
Scan scan;

/* no additional fields now, but might be added later */
} ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for detail).
:Call FdwRoutine.Open() to prepare for executing the query, e.g. by deparsing SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could properly deparse Sort nodes into an ORDER BY clause and supported MarkPos() and RestrPos(), then merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
ForeignDataWrapper *wrapper;
ForeignServer *server;
FSConnection *conn;
UserMapping *user;
ForeignTable *table;
FdwReply *reply;
} ForeignScanState;

FdwReply is an abstract type used to pass foreign-data-wrapper-specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store it in ForeignScanState.reply by casting it to FdwReply.
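
The following self-contained C sketch models how the executor entry points might drive the Open/Iterate/ReOpen/Close routines, and how a wrapper can keep private state behind the opaque FdwReply pointer. It is not the actual executor code: <code>MockScanState</code>, <code>MockFdwRoutine</code>, <code>CounterReply</code> and <code>run_scan()</code> are hypothetical, and here Iterate returns an int to signal end of scan, which is an assumption of the sketch rather than the signature shown above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct FdwReply FdwReply;       /* opaque to the executor */

/* Reduced, hypothetical stand-in for ForeignScanState. */
typedef struct MockScanState
{
    FdwReply *reply;                    /* wrapper-private scan data */
    int       current;                  /* last value produced by Iterate */
} MockScanState;

/* Reduced, hypothetical stand-in for FdwRoutine. */
typedef struct MockFdwRoutine
{
    void (*Open)(MockScanState *ss);
    int  (*Iterate)(MockScanState *ss); /* 1 = row produced, 0 = end */
    void (*ReOpen)(MockScanState *ss);
    void (*Close)(MockScanState *ss);
} MockFdwRoutine;

/* Private state of a toy wrapper that produces the values 1..limit. */
typedef struct CounterReply
{
    int next;
    int limit;
} CounterReply;

void
counter_open(MockScanState *ss)
{
    CounterReply *r = malloc(sizeof(CounterReply));

    assert(r != NULL);
    r->next = 1;
    r->limit = 3;
    ss->reply = (FdwReply *) r;         /* store private data as FdwReply */
}

int
counter_iterate(MockScanState *ss)
{
    CounterReply *r = (CounterReply *) ss->reply;

    if (r->next > r->limit)
        return 0;
    ss->current = r->next++;
    return 1;
}

void
counter_reopen(MockScanState *ss)
{
    ((CounterReply *) ss->reply)->next = 1;   /* restart from the first row */
}

void
counter_close(MockScanState *ss)
{
    free(ss->reply);
    ss->reply = NULL;
}

const MockFdwRoutine counter_routine = {
    counter_open, counter_iterate, counter_reopen, counter_close
};

/* Run one scan, optionally rescanning once, and return the sum of all rows. */
int
run_scan(const MockFdwRoutine *routine, int rescan)
{
    MockScanState ss = {NULL, 0};
    int sum = 0;

    routine->Open(&ss);                 /* ExecInitForeignScan() */
    while (routine->Iterate(&ss))       /* ExecForeignScan() */
        sum += ss.current;
    if (rescan)
    {
        routine->ReOpen(&ss);           /* ExecForeignReScan() */
        while (routine->Iterate(&ss))
            sum += ss.current;
    }
    routine->Close(&ss);                /* end of scan */
    return sum;
}
```

The executor side only ever sees the <code>FdwReply *</code> pointer; the wrapper alone knows the concrete type and is responsible for freeing it in Close.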

== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were implemented once but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If no reusable connection is in the cache, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
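
A name-keyed cache of this kind could look roughly like the C sketch below. It is a simplified model, not the removed implementation: <code>ConnCache</code>, <code>get_connection()</code>, <code>mock_connect()</code> and <code>discard_all()</code> are hypothetical, an integer handle stands in for an FSConnection pointer, and a fixed-size array replaces whatever lookup structure the backend would really use.

```c
#include <assert.h>
#include <string.h>

#define MAX_CACHED_CONNS 8
#define NAME_LEN 64

typedef struct CachedConn
{
    char name[NAME_LEN];    /* connection name == foreign server name */
    int  handle;            /* stand-in for an FSConnection pointer */
    int  in_use;
} CachedConn;

typedef struct ConnCache
{
    CachedConn slots[MAX_CACHED_CONNS];
    int        connects;    /* how many new connections were opened */
} ConnCache;

/* Hypothetical replacement for FdwRoutine.ConnectServer(). */
int
mock_connect(ConnCache *cache)
{
    return ++cache->connects;
}

/*
 * Return the cached connection for server_name, opening and caching a new
 * one on a miss. Returns -1 if the name is too long or the cache is full.
 */
int
get_connection(ConnCache *cache, const char *server_name)
{
    int free_slot = -1;

    assert(cache != NULL && server_name != NULL);
    if (strlen(server_name) >= NAME_LEN)
        return -1;

    for (int i = 0; i < MAX_CACHED_CONNS; i++)
    {
        if (cache->slots[i].in_use)
        {
            if (strcmp(cache->slots[i].name, server_name) == 0)
                return cache->slots[i].handle;      /* cache hit */
        }
        else if (free_slot < 0)
            free_slot = i;                          /* remember first free slot */
    }
    if (free_slot < 0)
        return -1;

    strcpy(cache->slots[free_slot].name, server_name);
    cache->slots[free_slot].handle = mock_connect(cache);
    cache->slots[free_slot].in_use = 1;
    return cache->slots[free_slot].handle;
}

/* Forget every cached connection. */
void
discard_all(ConnCache *cache)
{
    for (int i = 0; i < MAX_CACHED_CONNS; i++)
        cache->slots[i].in_use = 0;
}
```

Because the key is the server name alone, two scans on the same server share one connection; a real implementation would also have to decide how the user mapping affects reuse.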

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented in the core and installed initially by initdb. Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin is not supported, although it is allowed in COPY FROM.

Because the FDW reads files on the server side, some security issues should be considered. Perhaps non-superusers should not be allowed to create foreign tables that use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. The options of the COPY FROM statement are accepted, except ''oids''. The first column of the file is treated as the oid automatically if the foreign table has been defined with the "WITH OIDS" option.

''force_not_null'' is the only option that differs from its COPY FROM counterpart. It should be specified as a per-column generic option and should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like the standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, the foreign server and the user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server MUST require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions that can be pushed down: their PlanState.qual must consist only of the following node types. Conditions that do not qualify are not sent; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"; pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"; pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}
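
The restrictions in the table above amount to a recursive check over the expression tree. The C sketch below models that check on a toy tree; it does not use PostgreSQL's real node structs. <code>MockExpr</code>, <code>MockTag</code> and <code>is_pushdown_safe()</code> are hypothetical, and the <code>immutable</code> flag stands in for a catalog lookup of the underlying function's volatility.

```c
#include <assert.h>
#include <stddef.h>

/* Toy node tags mirroring the rows of the table above. */
typedef enum MockTag
{
    MT_CONST, MT_VAR, MT_ARRAY, MT_PARAM, MT_BOOLEXPR, MT_NULLTEST,
    MT_OPEXPR, MT_DISTINCTEXPR, MT_SCALARARRAYOPEXPR, MT_FUNCEXPR,
    MT_OTHER                            /* any node type not in the table */
} MockTag;

typedef struct MockExpr
{
    MockTag tag;
    int     external;   /* Param only: paramkind == PARAM_EXTERNAL */
    int     immutable;  /* operator/function nodes: function is IMMUTABLE */
    const struct MockExpr *args[2];     /* up to two children, NULL if unused */
} MockExpr;

/* Return 1 if the whole expression may be sent to the remote server. */
int
is_pushdown_safe(const MockExpr *e)
{
    if (e == NULL)
        return 1;                       /* missing child: nothing to check */

    switch (e->tag)
    {
        case MT_CONST:
        case MT_VAR:
        case MT_ARRAY:
        case MT_BOOLEXPR:
        case MT_NULLTEST:
            break;                      /* allowed unconditionally */
        case MT_PARAM:
            if (!e->external)
                return 0;               /* internal parameters stay local */
            break;
        case MT_OPEXPR:
        case MT_DISTINCTEXPR:
        case MT_SCALARARRAYOPEXPR:
        case MT_FUNCEXPR:
            if (!e->immutable)
                return 0;               /* non-IMMUTABLE function: keep local */
            break;
        default:
            return 0;                   /* unknown node type */
    }

    /* A node is safe only if all of its children are safe too. */
    return is_pushdown_safe(e->args[0]) && is_pushdown_safe(e->args[1]);
}
```

A condition for which the check returns 0 would be left out of the remote query and evaluated locally on the returned rows.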

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error: libpq allocates the PGresult with malloc() rather than palloc(), so that memory is not released automatically when an error aborts the query. More research is needed to find a way to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to a SETOF record function (SRF). We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions in a C++-like interface. However, we also have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate for determining the USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would not expect an access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection individually. We could export those functions from the core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the forms below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when the actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table must supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command can update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow telling the planner about this overhead.
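
One possible reading of the '''cost_factor''' idea is sketched below in C. The formula is an assumption made for illustration, not a planned design: <code>ScanCost</code> and <code>estimate_foreign_cost()</code> are hypothetical, and the sketch simply scales the run cost that the same scan would have locally.

```c
#include <assert.h>

/* Startup and total cost, in the planner's abstract cost units. */
typedef struct ScanCost
{
    double startup;
    double total;
} ScanCost;

/*
 * Scale the run cost of an equivalent local scan by cost_factor.
 * A factor of 1.0 means "as cheap as local data"; larger values tell the
 * planner that fetching rows from the foreign server is more expensive.
 * (Hypothetical formula, chosen only to illustrate the option.)
 */
ScanCost
estimate_foreign_cost(double local_startup, double local_run, double cost_factor)
{
    ScanCost c;

    assert(cost_factor > 0.0);
    c.startup = local_startup;
    c.total = local_startup + local_run * cost_factor;
    return c;
}
```

With such an option set per server or per table, the planner could prefer plans that touch an expensive foreign table fewer times without needing real remote statistics.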

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors are supported, the FDW for PostgreSQL will be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-11-25T08:17:50Z

Hanada: /* Active Work In Progress */ chage names of git branch for development

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4, and further features will be added over successive releases.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign-data wrappers so that data can be retrieved from foreign servers through foreign tables. Queries against foreign tables should use the same syntax as queries against ordinary local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_core''' branch contains core functionality of SQL/MED
* '''postgresql_fdw''' branch contains FDW for external PostgreSQL servers
* '''file_fdw''' branch contains FDW for flat files

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library. An approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better, because we already have pg_proc as a function manager.

-- Register a function that returns FDW connector function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with connection handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The parent table is partitioned into multiple foreign tables, each connected to a different foreign server. This can be used like the [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and their columns can have generic options, given with the OPTIONS syntax. Because the grammar would otherwise be ambiguous between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

== FDW routines ==
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that an FDW library could be used by several DBMSes without recompilation, but that doesn't seem realistic. Instead, a PostgreSQL-specific, C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be called from the executor module. The executor seems to be the layer that best balances query optimization against data abstraction. With other approaches, such as the access method (AM) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; their actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This keeps the existing usage possible, in which a foreign-data wrapper serves only as a container of connection information.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, which stores the foreign server id and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, cost_foreignscan() calls EstimateCosts() of the relevant FDW's FdwRoutine for each ForeignScan node.

EstimateCosts() should return costs estimated in whatever way suits the FDW.

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should get another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

typedef struct ForeignScan
{
Scan scan;

/* no additional fields now, but might be added later */
} ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for detail).
:Call FdwRoutine.Open() to prepare for query execution, for example by deparsing the query into SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.
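
The call sequence above can be sketched in standalone C. This is only an illustration under stated assumptions: ToyScanState and ToyFdwRoutine are hypothetical stand-ins for ForeignScanState and FdwRoutine, the "foreign table" is an in-memory array, and, unlike the void Iterate() in the proposed interface, the toy Iterate() returns a flag so the loop can detect the end of data.

```c
#include <stddef.h>

/* Hypothetical stand-in for ForeignScanState: scans an int array. */
typedef struct ToyScanState
{
    const int *rows;     /* the "remote" data */
    size_t     nrows;
    size_t     next;     /* index of the next row to return */
    int        current;  /* last row fetched by Iterate() */
    int        is_open;
} ToyScanState;

/* Hypothetical subset of the routine set, one callback per executor hook. */
typedef struct ToyFdwRoutine
{
    void (*Open)(ToyScanState *state);
    int  (*Iterate)(ToyScanState *state);  /* 1 = row fetched, 0 = end of data */
    void (*ReOpen)(ToyScanState *state);
    void (*Close)(ToyScanState *state);
} ToyFdwRoutine;

static void toy_open(ToyScanState *s)   { s->next = 0; s->is_open = 1; }
static void toy_reopen(ToyScanState *s) { s->next = 0; }
static void toy_close(ToyScanState *s)  { s->is_open = 0; }

static int toy_iterate(ToyScanState *s)
{
    if (!s->is_open || s->next >= s->nrows)
        return 0;
    s->current = s->rows[s->next++];
    return 1;
}

static const ToyFdwRoutine toy_routine = {
    toy_open, toy_iterate, toy_reopen, toy_close
};

/*
 * Drive one scan in the order the executor hooks would: initialize,
 * fetch until exhausted, rescan, fetch again, finish.
 * Returns the sum of all rows seen over both passes.
 */
static long toy_run_scan(const ToyFdwRoutine *r, ToyScanState *s)
{
    long sum = 0;

    r->Open(s);              /* ExecInitForeignScan() */
    while (r->Iterate(s))    /* ExecForeignScan() */
        sum += s->current;
    r->ReOpen(s);            /* ExecForeignReScan() */
    while (r->Iterate(s))
        sum += s->current;
    r->Close(s);             /* end of the scan */
    return sum;
}
```

Calling toy_run_scan(&toy_routine, &state) on a state that points at a small array walks the array twice, once before and once after the rescan. A real FDW would keep its position in wrapper-private data rather than in the scan state itself.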

ExecInitForeignScan() generates ForeignScanState from ForeignScan and FDW routines use it to manage the status of scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
ForeignDataWrapper *wrapper;
ForeignServer *server;
FSConnection *conn;
UserMapping *user;
ForeignTable *table;
FdwReply *reply;
} ForeignScanState;

FdwReply is an abstract type used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store it in ForeignScanState.reply by casting its pointer to FdwReply *.
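
A minimal sketch of this casting pattern follows. It assumes FdwReply is an opaque struct type as far as the core is concerned; DemoScanState, FileReply and the demo_* functions are hypothetical names introduced only for illustration, and calloc()/free() stand in for the palloc()-based allocation a real backend would use.

```c
#include <stdlib.h>

/* Opaque to the core: only the wrapper knows what is behind the pointer. */
typedef struct FdwReply FdwReply;

/* Hypothetical stand-in for ForeignScanState, reduced to the reply field. */
typedef struct DemoScanState
{
    FdwReply *reply;   /* wrapper-private data, stored through a cast */
} DemoScanState;

/* Private state of one hypothetical wrapper. */
typedef struct FileReply
{
    long lineno;       /* number of rows returned so far */
} FileReply;

/* Open(): allocate the private state and hide it behind FdwReply *. */
static void demo_open(DemoScanState *state)
{
    FileReply *priv = calloc(1, sizeof(FileReply));

    state->reply = (FdwReply *) priv;   /* NULL if allocation failed */
}

/* Iterate(): cast back to the private type before using it. */
static void demo_iterate(DemoScanState *state)
{
    FileReply *priv = (FileReply *) state->reply;

    if (priv != NULL)
        priv->lineno++;
}

/* Read the private counter; returns -1 if no private state is attached. */
static long demo_lineno(const DemoScanState *state)
{
    const FileReply *priv = (const FileReply *) state->reply;

    return priv != NULL ? priv->lineno : -1;
}

/* Close(): release the private state. */
static void demo_close(DemoScanState *state)
{
    free(state->reply);
    state->reply = NULL;
}
```

The core only ever stores and passes the FdwReply pointer; every routine of the wrapper casts it back to its own type, so different wrappers can keep entirely different state in the same field.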

== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
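
The lookup-or-connect behaviour described above might look like the following sketch. Everything here is an assumption made for illustration: ConnCache, CachedConn, fake_connect() and cache_get_connection() are hypothetical names, an integer handle stands in for FSConnection *, and a fixed-size array replaces whatever hash table a real backend would use.

```c
#include <stddef.h>
#include <string.h>

#define CACHE_MAX 8

/* One cached connection; the server name doubles as the connection name. */
typedef struct CachedConn
{
    char name[64];
    int  handle;       /* stand-in for FSConnection * */
} CachedConn;

typedef struct ConnCache
{
    CachedConn entries[CACHE_MAX];
    size_t     count;
    int        connects;   /* how many times a real connect was needed */
} ConnCache;

/* Hypothetical stand-in for FdwRoutine.ConnectServer(). */
static int fake_connect(ConnCache *cache)
{
    return ++cache->connects;
}

/*
 * Return the cached handle for srvname, connecting and caching on a miss.
 * Returns -1 if the cache is full or the name does not fit.
 */
static int cache_get_connection(ConnCache *cache, const char *srvname)
{
    size_t i;

    for (i = 0; i < cache->count; i++)
        if (strcmp(cache->entries[i].name, srvname) == 0)
            return cache->entries[i].handle;      /* cache hit: reuse */

    if (cache->count >= CACHE_MAX ||
        strlen(srvname) >= sizeof(cache->entries[0].name))
        return -1;

    /* cache miss: connect once and remember the connection by name */
    strcpy(cache->entries[cache->count].name, srvname);
    cache->entries[cache->count].handle = fake_connect(cache);
    return cache->entries[cache->count++].handle;
}
```

Two scans against the same server name get the same handle and trigger only one connect. Keying on the server name alone mirrors the text; a cache keyed on server and user mapping together would be a different design.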

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented in core and installed by initdb. Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, security issues must be considered. Non-superusers should probably not be allowed to create foreign tables that use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the filename, is passed via generic options. All options of the COPY FROM statement are accepted except ''oids''. The first column of the file is automatically treated as the oid if the foreign table has been defined with the "WITH OIDS" option.

''force_not_null'' is the only option that differs from its COPY FROM counterpart. It should be specified as a per-column generic option and should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares code and connections with it.
dblink will be installed optionally, like other standard contrib modules.

==== Connection options ====
The connection options are constructed from all generic options of the foreign-data wrapper, the foreign server and the user mapping, because the FDW for PostgreSQL currently assumes that all generic options are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in the generic options, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement sent to the foreign server is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the select list is always "SELECT *". It could be optimized by replacing unneeded column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
Only some conditions can be pushed down: their PlanState.qual must consist of only the following node types. Any other conditions are not sent; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}
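
The whitelist in the table can be read as a recursive check over the condition tree. The sketch below is not the WIP implementation: DemoNode and DemoNodeTag are hypothetical stand-ins for the planner's node types, only a subset of the tags from the table is modelled, and the immutable flag stands in for looking up the volatility of the underlying function.

```c
#include <stddef.h>

/* Hypothetical tags; DEMO_SUBLINK represents any node type not in the table. */
typedef enum DemoNodeTag
{
    DEMO_CONST,
    DEMO_VAR,
    DEMO_PARAM_EXTERN,
    DEMO_BOOL_EXPR,
    DEMO_NULL_TEST,
    DEMO_OP_EXPR,
    DEMO_FUNC_EXPR,
    DEMO_SUBLINK
} DemoNodeTag;

typedef struct DemoNode
{
    DemoNodeTag            tag;
    int                    immutable;  /* OP_EXPR/FUNC_EXPR: function is IMMUTABLE */
    const struct DemoNode *args[2];    /* child expressions, NULL if unused */
} DemoNode;

/* Return 1 if the whole expression tree may be sent to the foreign server. */
static int demo_is_pushable(const DemoNode *node)
{
    size_t i;

    if (node == NULL)
        return 1;

    switch (node->tag)
    {
        case DEMO_CONST:
        case DEMO_VAR:
        case DEMO_PARAM_EXTERN:
        case DEMO_BOOL_EXPR:
        case DEMO_NULL_TEST:
            break;                      /* always allowed */
        case DEMO_OP_EXPR:
        case DEMO_FUNC_EXPR:
            if (!node->immutable)
                return 0;               /* must be backed by an IMMUTABLE function */
            break;
        default:
            return 0;                   /* anything else is evaluated locally */
    }

    for (i = 0; i < 2; i++)
        if (!demo_is_pushable(node->args[i]))
            return 0;
    return 1;
}
```

With this check, a condition such as "col = 1" passes, while a comparison against a non-immutable function call is rejected as a whole and left for local evaluation.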

None of ORDER BY, LIMIT, OFFSET, GROUP BY or HAVING is used in a foreign query.

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq, at the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq allocates memory with malloc() rather than palloc(). More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, for index access methods we have pg_am, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING ?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should foreign connection management functions be exported from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage each foreign connection individually. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign server? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntax below is valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser is rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables if an FDW routine Analyze(tableoid or tablename) were added that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data differs from the cost of accessing local data even if the data definition and contents are the same. A generic option like '''cost_factor''' would let the user tell the planner about the overhead.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns only the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unneeded columns can be omitted from the SELECT clause of such a scan.

; Support internal parameter
: Some kinds of plan, e.g. nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only some of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters of PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: libpq currently has no protocol-level cursor support, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could fetch the result in parts and return the first rows sooner.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: Either rewrite the query as pgpool does, or replace the FuncExpr node with a Const node holding the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-11-18T12:41:39Z

Hanada: /* file_fdw */ oids option has gone

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4, and further features will be added over successive releases.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign-data wrappers so that data can be retrieved from foreign servers through foreign tables. Queries against foreign tables should use the same syntax as queries against ordinary local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_select_simple''' branch contains minimal implementation of SQL/MED query support.
* '''fdw_table''' branch contains all features proposed.

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library. An approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better, because we already have pg_proc as a function manager.

-- Register a function that returns FDW connector function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with connection handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The parent table is partitioned into multiple foreign tables, each connected to a different foreign server. This can be used like the [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and their columns can have generic options, given with the OPTIONS syntax. Because the grammar would otherwise be ambiguous between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

== FDW routines ==
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that an FDW library could be used by several DBMSes without recompilation, but that doesn't seem realistic. Instead, a PostgreSQL-specific, C-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*BeginScan)(ForeignScanState *scanstate);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be called from the executor module. The executor seems to be the layer that best balances query optimization against data abstraction. With other approaches, such as the access method (AM) or storage manager (smgr) layers, it would be harder to optimize complex queries such as a JOIN of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; their actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This keeps the existing usage possible, in which a foreign-data wrapper serves only as a container of connection information.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, which stores the foreign server id and the generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for the purpose.

=== Planner ===
The planner is responsible for finding the best access path, so each FDW should provide the cost of a ForeignPath.

In the planning phase, cost_foreignscan() calls EstimateCosts() of the relevant FDW's FdwRoutine for each ForeignScan node.

EstimateCosts() should return costs estimated in whatever way suits the FDW.

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should get another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

=== Executor ===
The executor executes ForeignScan nodes by calling FDW routines.

typedef struct ForeignScan
{
Scan scan;

/* no additional fields now, but might be added later */
} ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for detail).
:Call FdwRoutine.Open() to prepare for query execution, for example by deparsing the query into SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.

ExecInitForeignScan() generates ForeignScanState from ForeignScan and FDW routines use it to manage the status of scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
ForeignDataWrapper *wrapper;
ForeignServer *server;
FSConnection *conn;
UserMapping *user;
ForeignTable *table;
FdwReply *reply;
} ForeignScanState;

FdwReply is an abstract type used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store it in ForeignScanState.reply by casting its pointer to FdwReply *.

== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish an actual connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
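The lookup-or-connect behaviour described above could look roughly like the sketch below. It is a self-contained illustration, not the removed implementation: the fixed-size array, get_foreign_connection(), the simplified ConnectServerFn signature (the real ConnectServer() takes a ForeignServer and a UserMapping) and the fake_connect_server() helper are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

#define CONN_CACHE_SIZE 8       /* arbitrary limit for this sketch */
#define CONN_NAME_LEN   64      /* room for a 63-byte name plus terminator */

/* Stand-in for the FDW-defined connection object (assumption). */
typedef struct FSConnection { int handle; } FSConnection;

/* Simplified stand-in for FdwRoutine.ConnectServer(). */
typedef FSConnection *(*ConnectServerFn)(const char *srvname);

typedef struct ConnCacheEntry
{
    char          connname[CONN_NAME_LEN];  /* same as the server name */
    FSConnection *conn;
} ConnCacheEntry;

static ConnCacheEntry conn_cache[CONN_CACHE_SIZE];
static int            conn_cache_used = 0;

/* Return the cached connection for srvname, connecting on a cache miss. */
FSConnection *get_foreign_connection(const char *srvname,
                                     ConnectServerFn connect_server)
{
    FSConnection *conn;
    int           i;

    if (strlen(srvname) >= CONN_NAME_LEN)
        return NULL;                        /* name too long for the cache */

    for (i = 0; i < conn_cache_used; i++)
        if (strcmp(conn_cache[i].connname, srvname) == 0)
            return conn_cache[i].conn;      /* cache hit: reuse */

    if (conn_cache_used == CONN_CACHE_SIZE)
        return NULL;                        /* cache full in this sketch */

    conn = connect_server(srvname);         /* cache miss: connect */
    if (conn == NULL)
        return NULL;

    strcpy(conn_cache[conn_cache_used].connname, srvname);
    conn_cache[conn_cache_used].conn = conn;
    conn_cache_used++;
    return conn;
}

/* Toy connect function that only counts how often it is called. */
static FSConnection fake_conns[CONN_CACHE_SIZE];
int fake_connect_calls = 0;

FSConnection *fake_connect_server(const char *srvname)
{
    (void) srvname;
    if (fake_connect_calls >= CONN_CACHE_SIZE)
        return NULL;
    fake_conns[fake_connect_calls].handle = fake_connect_calls;
    return &fake_conns[fake_connect_calls++];
}
```

Two scans of tables on the same server then share one connection, while a scan against a different server triggers a second call to the connect function.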

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented in core and installed by initdb. Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should perhaps not be allowed to create foreign tables that use file_fdw, at least by default.

==== generic options ====
Information about the source file, such as the file name, is passed via generic options. The options of the COPY FROM statement are accepted, except ''oids''. The first column of the file is automatically treated as the OID if the foreign table has been defined with the "WITH OIDS" option.

''force_not_null'' is the only option that differs from its COPY FROM counterpart: it should be specified as a per-column generic option and should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares its code and connections.
dblink will be installed optionally, like the standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in the GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: the PlanState.qual must consist of only the following node types. Any other conditions are not sent to the remote server; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.
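The node-type restrictions in the table above amount to a recursive walk over each condition. The sketch below illustrates the idea on a toy expression tree; it does not use the real PostgreSQL node types, and SketchNode, SketchTag and is_pushable() are hypothetical names. The <code>immutable</code> flag stands in for a catalog lookup of the underlying function's volatility.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy node tags mirroring some of the rows of the table (assumption). */
typedef enum SketchTag
{
    SK_CONST, SK_VAR, SK_PARAM, SK_BOOL_EXPR, SK_NULL_TEST,
    SK_OP_EXPR, SK_FUNC_EXPR,
    SK_OTHER                    /* anything not listed in the table */
} SketchTag;

typedef struct SketchNode
{
    SketchTag tag;
    bool      immutable;        /* OpExpr/FuncExpr: function is IMMUTABLE */
    bool      external;         /* Param: paramkind == PARAM_EXTERNAL */
    const struct SketchNode *args[2];   /* children, NULL if unused */
} SketchNode;

/* True if the whole expression tree may be sent to the remote server. */
bool is_pushable(const SketchNode *node)
{
    if (node == NULL)
        return true;

    switch (node->tag)
    {
        case SK_CONST:
        case SK_VAR:
        case SK_BOOL_EXPR:
        case SK_NULL_TEST:
            break;
        case SK_PARAM:
            if (!node->external)
                return false;   /* internal parameters stay local */
            break;
        case SK_OP_EXPR:
        case SK_FUNC_EXPR:
            if (!node->immutable)
                return false;   /* result might differ on the remote side */
            break;
        default:
            return false;       /* unknown node type: evaluate locally */
    }

    /* A node is pushable only if all of its children are. */
    return is_pushable(node->args[0]) && is_pushable(node->args[1]);
}
```

A condition that fails the check is simply left out of the remote WHERE clause and evaluated locally on the returned rows, so rejecting a node is always safe, only slower.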

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. It would probably surprise the user if access to a foreign table via a SECURITY DEFINER function used the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage individual foreign connections. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign table? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) that returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow the overhead to be communicated to the planner.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plans, e.g. nested loops, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of the EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL would be able to retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.
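The second option, replacing the function node with a constant before the condition is deparsed, can be illustrated on a toy expression tree. The sketch below is an assumption-laden illustration: TsExpr, TsKind and fold_current_timestamp() are hypothetical names, and the timestamp is modelled as a plain integer rather than a PostgreSQL Datum.

```c
#include <stddef.h>

/* Toy expression kinds; these are not the real PostgreSQL node tags. */
typedef enum TsKind
{
    TS_CONST,                   /* literal value */
    TS_VAR,                     /* column reference */
    TS_CURRENT_TIMESTAMP,       /* call to CURRENT_TIMESTAMP */
    TS_OP                       /* binary operator */
} TsKind;

typedef struct TsExpr
{
    TsKind         kind;
    long long      value;       /* used by TS_CONST only */
    struct TsExpr *left;
    struct TsExpr *right;
} TsExpr;

/*
 * Replace every CURRENT_TIMESTAMP node with a constant holding the
 * locally evaluated value. Returns the number of nodes replaced.
 */
int fold_current_timestamp(TsExpr *expr, long long now)
{
    int replaced = 0;

    if (expr == NULL)
        return 0;

    if (expr->kind == TS_CURRENT_TIMESTAMP)
    {
        expr->kind = TS_CONST;
        expr->value = now;
        replaced = 1;
    }

    return replaced
        + fold_current_timestamp(expr->left, now)
        + fold_current_timestamp(expr->right, now);
}
```

After folding, the condition contains only constants, columns and operators, so the local and remote servers compare against the same timestamp value.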

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-11-17T06:39:13Z

Hanada: /* FDW routines */ add BeginScan() for initiation of scan

'''SQL/MED''' is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in the database

= Current Status =
The implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The query syntax for foreign tables should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_select_simple''' branch contains minimal implementation of SQL/MED query support.
* '''fdw_table''' branch contains all features proposed.

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

 -- Register a function that returns the FDW routine set.
 CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
     AS 'MODULE_PATHNAME'
     LANGUAGE C;
 
 -- Create a foreign data wrapper with a handler function.
 CREATE FOREIGN DATA WRAPPER postgresql
     HANDLER postgresql_fdw_handler
     VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function used to access external data.

 -- Create a foreign server.
 CREATE SERVER remote_postgresql_server
     FOREIGN DATA WRAPPER postgresql
     OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );
 
 -- Create a user mapping.
 CREATE USER MAPPING FOR postgres
     SERVER remote_postgresql_server
     OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

 -- Create a foreign table.
 CREATE FOREIGN TABLE schemaname.tablename (
     column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
     ...
 )
 INHERITS ( parent )
 SERVER remote_postgresql_server
 OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. This can be used like a [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and their columns can have generic options specified with the OPTIONS syntax. Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

== FDW routines ==
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-specific routine set would be feasible:

 /* FDW interface routines */
 typedef struct FdwRoutine
 {
     FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
     void (*FreeFSConnection)(FSConnection *conn);
     void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
     void (*BeginScan)(ForeignScanState *scanstate);
     void (*Open)(ForeignScanState *scanstate);
     void (*Iterate)(ForeignScanState *scanstate);
     void (*Close)(ForeignScanState *scanstate);
     void (*ReOpen)(ForeignScanState *scanstate);
 } FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as JOINs of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid in fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This allows a foreign-data wrapper to be used purely as a container of connection information, as before.

 CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
     fdwname name NOT NULL UNIQUE,
     fdwowner oid NOT NULL REFERENCES pg_authid (oid),
     fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
     fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
     fdwacl aclitem[],
     fdwoptions text[]
 )
 WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server ID and the generic options for the foreign table.

 CREATE TABLE pg_catalog.pg_foreign_table (
     ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
     ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
     ftoptions text[]
 )
 WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column attgenoptions of type text[].

== Planner and Executor changes ==
The access layer of foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The Planner module is responsible for finding the best access path, so an FDW should provide the cost of a ForeignPath.

In the planning phase, cost_foreignscan() calls EstimateCosts() of the relevant FDW's FdwRoutine for each ForeignScan node.

EstimateCosts() should provide appropriate costs, estimated in whatever way each FDW chooses.

To estimate costs as accurately as possible, FDWs might want to keep their own statistics. At this stage, we don't provide a common mechanism for storing statistics. Once such a mechanism has been implemented, FdwRoutine should gain another function that is called from ANALYZE, so that each FDW can update its statistics in its own way.

=== Executor ===
The Executor module executes ForeignScan nodes by calling FDW routines.

 typedef struct ForeignScan
 {
     Scan scan;
 
     /* no additional fields now, but might be added later */
 } ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for detail).
:Call FdwRoutine.Open() to prepare for query execution, for example by deparsing the plan into SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.
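The way the executor entry points delegate to the FDW routines can be sketched with plain function pointers. The code below is a toy model rather than executor code: the cut-down FdwRoutine and ForeignScanState, the sketch_*() functions and the toy wrapper that yields the values 1, 2 and 3 are all assumptions made for illustration. In particular, a real Iterate() would fill a tuple slot instead of setting the <code>has_tuple</code> and <code>tuple</code> fields.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct ForeignScanState ForeignScanState;

/* Cut-down routine set; field names follow the FdwRoutine above. */
typedef struct FdwRoutine
{
    void (*Open)(ForeignScanState *scanstate);
    void (*Iterate)(ForeignScanState *scanstate);
    void (*Close)(ForeignScanState *scanstate);
    void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

/* Cut-down scan state (assumption). */
struct ForeignScanState
{
    const FdwRoutine *routine;
    int  next;                  /* toy scan position */
    bool has_tuple;             /* stand-in for a filled tuple slot */
    int  tuple;                 /* stand-in for the tuple itself */
};

/* Toy wrapper that yields the values 1, 2 and 3. */
static void toy_open(ForeignScanState *s)   { s->next = 1; s->has_tuple = false; }
static void toy_reopen(ForeignScanState *s) { s->next = 1; s->has_tuple = false; }
static void toy_close(ForeignScanState *s)  { s->has_tuple = false; }
static void toy_iterate(ForeignScanState *s)
{
    s->has_tuple = (s->next <= 3);
    if (s->has_tuple)
        s->tuple = s->next++;
}

const FdwRoutine toy_fdw_routine = {
    .Open = toy_open,
    .Iterate = toy_iterate,
    .Close = toy_close,
    .ReOpen = toy_reopen
};

/* Executor side: each entry point delegates to one FDW routine. */
void sketch_init_foreign_scan(ForeignScanState *node, const FdwRoutine *routine)
{
    node->routine = routine;
    node->routine->Open(node);
}

/* Returns false once the foreign table is exhausted. */
bool sketch_foreign_scan(ForeignScanState *node, int *tuple)
{
    node->routine->Iterate(node);
    if (!node->has_tuple)
        return false;
    *tuple = node->tuple;
    return true;
}

void sketch_foreign_rescan(ForeignScanState *node)
{
    node->routine->ReOpen(node);
}

void sketch_end_foreign_scan(ForeignScanState *node)
{
    node->routine->Close(node);
}
```

The executor side never looks at how the wrapper produces its tuples, which is what lets very different data sources share the same ForeignScan node.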

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan node, and FDW routines use it to manage the state of the scan.

 typedef struct ForeignScanState
 {
     ScanState ss;
     FdwRoutine *routine;
     ForeignDataWrapper *wrapper;
     ForeignServer *server;
     FSConnection *conn;
     UserMapping *user;
     ForeignTable *table;
     FdwReply *reply;
 } ForeignScanState;

FdwReply is an abstract type used to pass wrapper-specific data between FDW routines. Each foreign-data wrapper can define its own private data structure and store a pointer to it in ForeignScanState.reply, cast to FdwReply *.

== Connection caching ==
Currently, connection caching is not implemented, so that work can focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish an actual connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented in core and installed by initdb. Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should perhaps not be allowed to create foreign tables that use file_fdw, at least by default.

==== generic options ====
The file name is passed with the ''filename'' generic option, and options that are valid in the COPY FROM statement are also accepted. ''force_not_null'' is the only option that differs from its COPY counterpart: it should be specified as a per-column generic option and should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]] and shares its code and connections.
dblink will be installed optionally, like the standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in the GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed-down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions: the PlanState.qual must consist of only the following node types. Any other conditions are not sent to the remote server; the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

ORDER BY, LIMIT, OFFSET, GROUP BY and HAVING are never used in a foreign query.

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to SETOF record functions. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine, because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C, because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine USER MAPPING?
: The current implementation uses OuterUserId rather than CurrentUserId to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId is changed implicitly during execution of a function created with the SECURITY DEFINER option. It would probably surprise the user if access to a foreign table via a SECURITY DEFINER function used the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> closes all connections, but we might provide SQL functions to manage individual foreign connections. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode, because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock it in other modes. Should we relax the restriction if the target is a foreign table? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are valid too:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow the overhead to be communicated to the planner.

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]

SQL/MED

2010-11-17T06:37:53Z

Hanada: /* Active Work In Progress */ removed wrong link to git branch

'''SQL/MED''' (Management of External Data) is a part of the SQL standard that deals with how a database management system can integrate data stored outside the database. There are two components in SQL/MED:

; Foreign Table
: a transparent access method for external data
; [[DATALINK]]
: a special SQL type intended to store URLs in the database

= Current Status =
Implementation of this specification began in PostgreSQL 8.4 and will, over time, introduce powerful new features into PostgreSQL.

* [http://www.pgcon.org/2009/schedule/events/142.en.html SQL/MED: Doping for PostgreSQL]
* [http://developer.postgresql.org/pgdocs/postgres/sql-createforeigndatawrapper.html CREATE FOREIGN DATA WRAPPER]

= Active Work In Progress =
This is a project for PostgreSQL 9.1 to add FDW routines to foreign data wrappers so that we can retrieve data from foreign servers through foreign tables. The syntax for querying them should be the same as for normal local tables.

WIP code is available at: http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=summary
* '''master''' branch is a copy of postgres' HEAD.
* '''fdw_select_simple''' branch contains a minimal implementation of SQL/MED query support.
* '''fdw_table''' branch contains all features proposed.

== Syntax ==
In the SQL standard, 'CREATE FOREIGN DATA WRAPPER' has a 'LIBRARY' option and FDW routines are exported directly from the library, but an approach like '[http://developer.postgresql.org/pgdocs/postgres/sql-createlanguage.html CREATE LANGUAGE]' would be better because we already have pg_proc, an existing function manager.

-- Register a function that returns FDW connector function set.
CREATE FUNCTION postgresql_fdw_handler() RETURNS fdw_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Create a foreign data wrapper with connection handler.
CREATE FOREIGN DATA WRAPPER postgresql
HANDLER postgresql_fdw_handler
VALIDATOR postgresql_fdw_validator;
CREATE FOREIGN DATA WRAPPER now has a HANDLER clause, which specifies the handler function to be used to access external data.

-- Create a foreign server.
CREATE SERVER remote_postgresql_server
FOREIGN DATA WRAPPER postgresql
OPTIONS ( host 'somehost', port '5432', dbname 'remotedb' );

-- Create a user mapping.
CREATE USER MAPPING FOR postgres
SERVER remote_postgresql_server
OPTIONS ( user 'someuser', password 'secret' );
These two statements are unchanged.

-- Create a foreign table.
CREATE FOREIGN TABLE schemaname.tablename (
column_name ''type_name'' [ OPTIONS ( ... ) ] [ ''constraints'' | DEFAULT ''default value'' [...] ],
...
)
INHERITS ( parent )
SERVER remote_postgresql_server
OPTIONS ( ... );

Foreign tables should support inheritance and [[table partitioning]] for scale-out [[clustering]]. The main parent table is partitioned into multiple foreign tables, and each foreign table is connected to a different foreign server. It can be used like the [[PL/Proxy#Partitioned remote function call|partitioned remote function call]] in [[PL/Proxy]].

Foreign tables and columns of foreign tables can have generic options with the OPTIONS syntax. Because of a syntactic ambiguity between "DEFAULT b_expr" and "OPTIONS ( ... )", the OPTIONS clause for a column must be specified before any constraints or default value.

== FDW routines ==
In the SQL standard, FDW routines are designed to have a portable application binary interface, so that FDW libraries could be used by several DBMSes without recompiling, but that doesn't seem realistic. Instead, a PostgreSQL-specific and C-language-specific routine set would be feasible:

/* FDW interface routines */
typedef struct FdwRoutine
{
FSConnection * (*ConnectServer)(ForeignServer *server, UserMapping *user);
void (*FreeFSConnection)(FSConnection *conn);
void (*EstimateCosts)(ForeignPath *path, PlannerInfo *root, RelOptInfo *baserel);
void (*Open)(ForeignScanState *scanstate);
void (*Iterate)(ForeignScanState *scanstate);
void (*Close)(ForeignScanState *scanstate);
void (*ReOpen)(ForeignScanState *scanstate);
} FdwRoutine;

FDW routines are designed to be used in the executor module. The executor seems to be the best-balanced layer for query optimization and data abstraction. With other approaches, such as the AM (access method) or storage manager (smgr) layers, it would be harder to optimize complex queries such as JOINs of several foreign tables on the same foreign server.

Only the interfaces of FdwRoutine and FSConnection are defined in PostgreSQL core; the actual contents are implemented by each FDW library.

In contrast, ForeignServer and UserMapping are implemented in core.
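
The handler approach described above can be illustrated with a small stand-alone sketch. It is not PostgreSQL code: the struct definitions are simplified stand-ins, only two of the routines are shown, and every name beginning with <code>demo_</code> is hypothetical.

```c
#include <stdlib.h>

/* Simplified stand-ins for the core types (assumptions for this sketch). */
typedef struct ForeignServer { const char *servername; } ForeignServer;
typedef struct UserMapping   { const char *username; } UserMapping;

/* In the design above, the contents of FSConnection belong to the FDW. */
typedef struct FSConnection
{
    const ForeignServer *server;
    int is_open;
} FSConnection;

/* Reduced routine set: only the connection routines are shown. */
typedef struct FdwRoutine
{
    FSConnection *(*ConnectServer)(ForeignServer *server, UserMapping *user);
    void (*FreeFSConnection)(FSConnection *conn);
} FdwRoutine;

/* Hypothetical FDW-side implementation of ConnectServer. */
static FSConnection *demo_connect_server(ForeignServer *server, UserMapping *user)
{
    FSConnection *conn = malloc(sizeof *conn);

    (void) user;            /* a real FDW would authenticate with this */
    if (conn == NULL)
        return NULL;
    conn->server = server;
    conn->is_open = 1;
    return conn;
}

/* Hypothetical FDW-side implementation of FreeFSConnection. */
static void demo_free_connection(FSConnection *conn)
{
    free(conn);
}

/* Plays the role of the handler function: returns the routine set. */
const FdwRoutine *demo_fdw_handler(void)
{
    static const FdwRoutine routine = {
        demo_connect_server,
        demo_free_connection
    };

    return &routine;
}
```

The caller only ever sees the function pointers, which is what lets each FDW library keep its connection layout private.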

== On-disk structure ==
=== pg_catalog.pg_foreign_data_wrapper ===
An FDW handler function returns the FDW routine set. A new pseudo-type 'fdw_handler' is added to represent the routine set. FDW handlers take no arguments and return the fdw_handler type.

An FDW handler is registered in the fdwhandler column of the pg_foreign_data_wrapper catalog. InvalidOid for fdwhandler means that the foreign-data wrapper has no FDW handler, so it can't be used to define any foreign table. This still allows a foreign-data wrapper to be used purely as a container of connection information, as before.

CREATE TABLE pg_catalog.pg_foreign_data_wrapper (
fdwname name NOT NULL UNIQUE,
fdwowner oid NOT NULL REFERENCES pg_authid (oid),
fdwvalidator oid NOT NULL REFERENCES pg_proc (oid),
fdwhandler oid NOT NULL REFERENCES pg_proc (oid),
fdwacl aclitem[],
fdwoptions text[]
)
WITH OIDS;

=== pg_catalog.pg_foreign_table ===
A foreign table is registered in pg_class with relkind = 'f' (RELKIND_FOREIGN_TABLE). It also has a corresponding pg_foreign_table tuple, in which we store the foreign server id and generic options for the foreign table.

CREATE TABLE pg_catalog.pg_foreign_table (
ftrelid oid PRIMARY KEY REFERENCES pg_class (oid),
ftserver oid NOT NULL REFERENCES pg_foreign_server (oid),
ftoptions text[]
)
WITHOUT OIDS;

=== pg_catalog.pg_attribute ===
To store per-column generic options, pg_attribute has a new column, attgenoptions, of type text[].

== Planner and Executor changes ==
The access layer for foreign tables will be implemented in the planner module and the executor module. We will have new ForeignPath and ForeignScan nodes for this purpose.

=== Planner ===
The planner module is responsible for finding the best access path, so the FDW should provide the cost for a ForeignPath.

In the planning phase, cost_foreignscan() calls EstimateCosts() of the related FDW's FdwRoutine for each ForeignScan node.

EstimateCosts() should return costs estimated in whatever way each FDW chooses.

To estimate costs as accurately as possible, FDWs might want to have their own statistics. At this stage, we don't provide a common mechanism to store statistics. Once such a mechanism has been implemented, FdwRoutine should have another function which is called from ANALYZE. With such a function, FDWs can update their statistics in their own way.

=== Executor ===
The executor module executes ForeignScan nodes by calling FDW routines.

typedef struct ForeignScan
{
Scan scan;

/* no additional fields now, but might be added later */
} ForeignScan;

;ExecInitForeignScan()
:Collect catalog information about the foreign table.
:Connect to the foreign server if needed (see [[SQL/MED#Connection caching|connection caching]] for detail).
:Call FdwRoutine.Open() to prepare for query execution, e.g. by deparsing the SQL.
;ExecForeignScan()
:Call FdwRoutine.Iterate() to retrieve a tuple from the foreign table.
;ExecForeignReScan()
:Call FdwRoutine.ReOpen() to re-initialize scanning.
;ExecEndForeignScan()
:Call FdwRoutine.Close() to finalize the foreign scan.
;ExecForeignMarkPos()/ExecForeignRestrPos()
:Currently MarkPos() and RestrPos() are not supported for ForeignScan, so ExecSupportsMarkRestore() returns false for ForeignScan. They are not supported because they are used to perform merge joins, and a merge join needs sorted input. If an FDW could deparse Sort nodes into an ORDER BY clause properly and supported MarkPos() and RestrPos(), merge joins of foreign tables could be supported.

ExecInitForeignScan() generates a ForeignScanState from the ForeignScan, and FDW routines use it to manage the state of the scan.

typedef struct ForeignScanState
{
ScanState ss;
FdwRoutine *routine;
ForeignDataWrapper *wrapper;
ForeignServer *server;
FSConnection *conn;
UserMapping *user;
ForeignTable *table;
FdwReply *reply;
} ForeignScanState;

FdwReply is an abstract type used to pass foreign-data-wrapper-specific data between FDW routines. Each foreign-data wrapper can define a private data structure and store it in ForeignScanState.reply by casting it to FdwReply.
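
The Open/Iterate/ReOpen/Close sequence and the FdwReply cast can be sketched as follows. This is a stand-alone illustration, not the WIP implementation: ForeignScanState is reduced to the fields the sketch needs, the <code>has_tuple</code> and <code>tuple</code> fields are hypothetical substitutes for the scan tuple slot, and the "remote" rows are a fixed in-memory array.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Opaque to the core side; each FDW casts it to its own private type. */
typedef struct FdwReply FdwReply;

/* Reduced scan state (assumption: stand-in for the real structure). */
typedef struct ForeignScanState
{
    FdwReply *reply;
    bool has_tuple;     /* hypothetical: true if 'tuple' holds a row */
    int tuple;          /* hypothetical: the current row */
} ForeignScanState;

/* Private state of the hypothetical demo FDW. */
typedef struct DemoReply
{
    const int *rows;
    int nrows;
    int next;
} DemoReply;

static const int demo_rows[] = { 10, 20, 30 };

/* Open(): allocate private state and store it via a cast to FdwReply. */
void demo_open(ForeignScanState *state)
{
    DemoReply *reply = malloc(sizeof *reply);

    if (reply == NULL)
        abort();
    reply->rows = demo_rows;
    reply->nrows = 3;
    reply->next = 0;
    state->reply = (FdwReply *) reply;
    state->has_tuple = false;
}

/* Iterate(): fetch the next row, or signal the end of the scan. */
void demo_iterate(ForeignScanState *state)
{
    DemoReply *reply = (DemoReply *) state->reply;

    if (reply->next < reply->nrows)
    {
        state->tuple = reply->rows[reply->next++];
        state->has_tuple = true;
    }
    else
        state->has_tuple = false;
}

/* ReOpen(): restart the scan from the first row. */
void demo_reopen(ForeignScanState *state)
{
    ((DemoReply *) state->reply)->next = 0;
    state->has_tuple = false;
}

/* Close(): release the private state. */
void demo_close(ForeignScanState *state)
{
    free(state->reply);
    state->reply = NULL;
    state->has_tuple = false;
}

/* Drives one full scan the way the executor entry points would. */
int demo_scan_sum(void)
{
    ForeignScanState state = { 0 };
    int sum = 0;

    demo_open(&state);
    for (demo_iterate(&state); state.has_tuple; demo_iterate(&state))
        sum += state.tuple;
    demo_close(&state);
    return sum;
}
```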

== Connection caching ==
Currently, connection caching is not implemented, in order to focus on the FDW API. The ideas below were once implemented but have since been removed.

Connections to foreign servers are cached and reused during the lifetime of the backend. When a scan of a foreign table is initialized in ExecInitForeignScan(), the backend searches the cache for a reusable connection. If none is found, it calls FdwRoutine.ConnectServer() to establish a connection and stores it in the connection cache.

Connections are identified by name. A connection's name is the same as the name of the server that the connection uses.
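
The lookup-then-connect behaviour could look roughly like the sketch below. It is only an illustration under simplifying assumptions: a fixed-size array replaces whatever structure the backend would really use, <code>demo_connect()</code> stands in for FdwRoutine.ConnectServer(), and all <code>demo_</code> names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_CACHE_SIZE 8
#define DEMO_NAME_LEN   64   /* matches PostgreSQL's default NAMEDATALEN */

/* Hypothetical cached connection, identified by the server name. */
typedef struct FSConnection
{
    char name[DEMO_NAME_LEN];
    int in_use;
} FSConnection;

static FSConnection demo_cache[DEMO_CACHE_SIZE];
static int demo_connect_calls = 0;

/* Stand-in for FdwRoutine.ConnectServer(): fills an empty cache slot. */
static void demo_connect(FSConnection *conn, const char *srvname)
{
    demo_connect_calls++;
    strncpy(conn->name, srvname, DEMO_NAME_LEN - 1);
    conn->name[DEMO_NAME_LEN - 1] = '\0';
    conn->in_use = 1;
}

/*
 * Return the cached connection for srvname, connecting first if there is
 * no entry yet.  Returns NULL when the cache is full.
 */
FSConnection *demo_get_connection(const char *srvname)
{
    FSConnection *free_slot = NULL;

    for (int i = 0; i < DEMO_CACHE_SIZE; i++)
    {
        if (demo_cache[i].in_use)
        {
            if (strcmp(demo_cache[i].name, srvname) == 0)
                return &demo_cache[i];      /* cache hit: reuse */
        }
        else if (free_slot == NULL)
            free_slot = &demo_cache[i];     /* remember first empty slot */
    }

    if (free_slot == NULL)
        return NULL;
    demo_connect(free_slot, srvname);       /* cache miss: connect */
    return free_slot;
}

/* Number of real connection attempts made so far. */
int demo_connect_count(void)
{
    return demo_connect_calls;
}
```

Keying the cache on the server name alone mirrors the text above; a cache that must distinguish local roles would need a wider key.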

The pg_foreign_connections view displays all the foreign connections that are available in the current session.

{| border="1"
!Name
!Type
!Reference
!Description
|-
|connname
|Text
|
|name of the connection
|-
|srvname
|Name
|pg_foreign_server.srvname
|name of the foreign server
|-
|usename
|Name
|pg_authid.rolname
|name of the local role which was used to map foreign user
|-
|fdwname
|Name
|pg_foreign_data_wrapper.fdwname
|name of the foreign data wrapper which was used to connect to the foreign server
|}

== Built-in foreign data wrappers ==
=== file_fdw ===
This can be used to read data from files in the server's local file system, like the <code>COPY FROM</code> command. It is implemented in core and installed by initdb. Its implementation is based on COPY FROM, but the two are not integrated.

Currently, stdin, although allowed in COPY FROM, is not supported.

Because the FDW reads files on the server side, some security issues should be considered. Non-superusers should probably not be allowed to create foreign tables which use file_fdw, at least by default.

==== generic options ====
The file name is passed with the ''filename'' generic option, and options which are valid in the COPY FROM statement are also accepted. ''force_not_null'' is the only option that differs from its COPY counterpart; it should be specified as a per-column generic option and should be a boolean value such as ''true'' or ''false''.

=== PostgreSQL ===
This can be used to connect to external PostgreSQL servers.
It is integrated with contrib/[[dblink]], and shares code and connections with it.
dblink will be installed optionally, like the standard contrib modules.

==== Connection options ====
The connection options are constructed from all GENERIC OPTIONS of the foreign-data wrapper, foreign server and user mapping, because the FDW for PostgreSQL currently assumes that all GENERIC OPTIONS are connection options.
Note that, for security reasons, a non-superuser MUST specify a password in GENERIC OPTIONS, and the foreign server must require password authentication.

In the current implementation, the password is exposed in the same way as other options. It might be necessary to hide some generic options, including the password, for security reasons.

==== No transaction management ====
The FDW for PostgreSQL never emits transaction commands such as BEGIN, ROLLBACK and COMMIT. Thus, each SQL statement is executed in its own transaction, as if 'autocommit' were set to 'on'.

==== WHERE-clause push-down ====
Currently the SELECT clause is always "SELECT *". It could be optimized by replacing unnecessary column names with "NULL".

WHERE clauses in the original query are [http://wiki.postgresql.org/wiki/ClusterFeatures#Function_scan_push-down pushed down] into the reconstructed query sent to the foreign server.
There are restrictions on the conditions; their PlanState.qual must consist of only the following node types. Conditions that contain anything else are not sent: the remote server returns rows without applying them, and the local server evaluates them against the returned rows.
{| border="1"
! Element
! Tag name
! Note
|-
|Constant value
|Const
|
|-
|Table column reference
|Var
|
|-
|Array of some type
|Array
|expression like "'{1, 2, 3}'"
|-
|External parameter
|Param
|"External" means that "Param.paramkind == PARAM_EXTERNAL"
|-
|Bool expression
|BoolExpr
|expressions such as "A AND B", "A OR B", "NOT A"
|-
|NULL test
|NullTest
|expressions like "IS [NOT] NULL"
|-
|Operator
|OpExpr
|pg_operator.oprcode MUST be an IMMUTABLE function
|-
|DISTINCT operator
|DistinctExpr
|expressions like "A IS DISTINCT FROM B"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Scalar array operator
|ScalarArrayOpExpr
|expressions such as "ANY (...)", "ALL (...)"
pg_operator.oprcode MUST be an IMMUTABLE function
|-
|Function call
|FuncExpr
|MUST be an IMMUTABLE function
|}

None of ORDER BY, LIMIT, OFFSET, GROUP BY or HAVING is used in a foreign query.
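
The restriction in the table above amounts to a recursive walk over each condition. The following stand-alone sketch shows the idea on a toy expression tree; the node tags and the <code>DemoExpr</code> type are hypothetical simplifications (only some of the listed node types are modelled, and each node has at most two children), not PostgreSQL's node structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy node tags covering a subset of the table above (assumption). */
typedef enum DemoTag
{
    DEMO_CONST,
    DEMO_VAR,
    DEMO_PARAM_EXTERN,  /* Param with paramkind == PARAM_EXTERNAL */
    DEMO_PARAM_EXEC,    /* internal parameter: not pushable */
    DEMO_BOOLEXPR,
    DEMO_NULLTEST,
    DEMO_OPEXPR,
    DEMO_FUNCEXPR
} DemoTag;

/* Toy expression node with at most two children. */
typedef struct DemoExpr
{
    DemoTag tag;
    bool immutable;                 /* only used for OPEXPR and FUNCEXPR */
    const struct DemoExpr *args[2];
} DemoExpr;

/*
 * Return true if the whole condition may be sent to the foreign server.
 * A single disallowed node anywhere in the tree keeps the condition local.
 */
bool demo_is_pushable(const DemoExpr *expr)
{
    if (expr == NULL)
        return true;

    switch (expr->tag)
    {
        case DEMO_CONST:
        case DEMO_VAR:
        case DEMO_PARAM_EXTERN:
        case DEMO_BOOLEXPR:
        case DEMO_NULLTEST:
            break;
        case DEMO_OPEXPR:
        case DEMO_FUNCEXPR:
            if (!expr->immutable)
                return false;       /* volatile or stable: evaluate locally */
            break;
        default:
            return false;           /* any other node type stays local */
    }

    return demo_is_pushable(expr->args[0]) &&
           demo_is_pushable(expr->args[1]);
}
```

A condition that fails the check is simply left out of the remote query and evaluated locally on the returned rows, as described above.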

==== Retrieving all tuples at once ====
The FDW retrieves all of the result tuples at once with libpq on the first call of Iterate() after Open() or ReOpen(). We could use cursors instead to avoid excessive memory consumption for huge result sets.

After it receives the tuples as a PGresult, it copies them into a Tuplestorestate to avoid memory leaks on error, because libpq uses malloc() rather than palloc() to allocate memory. More research might be needed to avoid the copy.

= Open questions =
There are still several issues in the FDW design and implementation:

; FdwRoutine vs. SETOF record function
: Some FDW routines are similar to a SETOF record function. We could merge them or share some of the internal routines. However, it seems hard to use an SRF instead of FdwRoutine because an FDW needs to support several utility functions: connect, disconnect, handle WHERE conditions, etc.

; fdw_handler vs. function table like pg_am
: FDW routines require a set of functions. The fdw_handler can pack those functions into a C++-like interface. However, we have pg_am for index access methods, which is a table-based approach. Note that we probably need to write FDW routines in C because they access executor objects to extract expressions.

; pg_foreign_table.ftoptions vs. pg_class.reloptions
: We could store ftserver and ftoptions in existing fields of pg_class, e.g. relam and reloptions, because we probably won't use those fields for foreign tables.

; Which user identifier is appropriate to determine the USER MAPPING?
: The current implementation uses OuterUserId, not CurrentUserId, to determine the USER MAPPING. OuterUserId is the role that the user specified explicitly with SET ROLE or SET SESSION AUTHORIZATION, whereas CurrentUserId changes implicitly during execution of a function created with the SECURITY DEFINER option. A user would probably not expect access to a foreign table via a SECURITY DEFINER function to use the USER MAPPING of the function's owner. Is this an appropriate specification?

; Where should we export foreign connection management functions from?
: Currently <code>DISCARD ALL</code> disconnects all connections, but we might provide SQL functions to manage each foreign connection. We could export those functions from core, like pg_connect()/pg_disconnect(), or continue to use contrib/dblink if they are optional.

; Locking a foreign table
: Currently a foreign table can be locked only in ACCESS SHARE mode because only the SELECT privilege can be granted on a foreign table. For a normal table, at least one of the INSERT/UPDATE/DELETE privileges is required to lock in other modes. Should we relax the restriction if the target is a foreign table? We must also consider recursive locking via table inheritance.

= Supported features =
== DDL ==
* ALTER FOREIGN DATA WRAPPER name {HANDLER name|NO HANDLER}
* CREATE FOREIGN TABLE name INHERITS (parent)
** Inherit a plain relation (tableoid system attribute is supported too)
* DROP FOREIGN TABLE
* ALTER FOREIGN TABLE name RENAME TO newname
* ALTER FOREIGN TABLE name RENAME COLUMN column TO newname
* ALTER FOREIGN TABLE name {ADD|DROP} column
* ALTER FOREIGN TABLE name {ADD|DROP} constraint
** Only NOT NULL and CHECK constraints are supported.
* ALTER FOREIGN TABLE name OWNER TO owner
* {GRANT|REVOKE} SELECT [(column list)] ON FOREIGN TABLE name {TO|FROM} user
** the syntaxes below are also valid:
*** {GRANT|REVOKE} SELECT [(column list)] ON name {TO|FROM} user
*** {GRANT|REVOKE} SELECT [(column list)] ON TABLE name {TO|FROM} user
* CREATE RULE ... TO foreign_table
* COMMENT ON FOREIGN TABLE name IS 'table comment'
* COMMENT ON COLUMN name.column IS 'column comment'

== DML ==
* SELECT statement using:
** multiple foreign-data wrappers
** multiple foreign servers
** multiple foreign tables (JOIN, UNION, Subquery, etc.)
** PREPARE/EXECUTE statement with parameters
* Deny execution of INSERT/UPDATE/DELETE for a foreign table
* Deny execution of VACUUM/TRUNCATE/CLUSTER for a foreign table
* Lock foreign tables and their children recursively

; Execute-time constraint
: CHECK and NOT NULL constraints defined on foreign columns are evaluated when actual tuples are retrieved from the foreign server.

; Support tableoid system column
: For foreign tables to support inheritance, tuples from a foreign table should supply the tableoid column.

== pg_dump ==
* dumping schema (definition) of foreign tables
** contents of a foreign table are not dumped because they are not part of the database
* dumping foreign-data wrappers with HANDLER specification
* dumping foreign-data wrappers, servers and user mappings excluding built-in objects

= Future improvements =
== General ==
; FDW as a source for COPY FROM
: COPY FROM will be adjusted to use a foreign table as an input source. The traditional TSV and CSV parser will be rebuilt as a built-in '''File data wrapper'''. For this purpose, FDW routines should be designed to be able to read many tuples as a stream. Overhead and result caching should be avoided in this layer.

; Smart planning
: The ANALYZE command could update pg_statistic and part of pg_class (reltuples and relpages) for foreign tables by adding an FDW routine Analyze(tableoid or tablename) which returns pg_statistic records for the foreign table.
: The cost of accessing foreign data will differ from the cost of accessing local data even if the data definition and contents are the same. A GENERIC OPTION like '''cost_factor''' would allow the overhead to be communicated to the planner.
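
One way a '''cost_factor''' option might be applied is sketched below. The semantics are an assumption made for illustration only (a positive multiplier on the cost of the equivalent local scan, defaulting to 1.0 when the option is missing or malformed), and <code>demo_scaled_cost()</code> is a hypothetical helper, not a proposed API.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Scale a locally estimated cost by the text value of a cost_factor
 * generic option.  A missing, malformed or non-positive value leaves
 * the cost unchanged.
 */
double demo_scaled_cost(double local_cost, const char *cost_factor)
{
    double factor = 1.0;

    if (cost_factor != NULL)
    {
        char *end;
        double parsed = strtod(cost_factor, &end);

        /* accept only a fully parsed, positive number */
        if (end != cost_factor && *end == '\0' && parsed > 0.0)
            factor = parsed;
    }

    return local_cost * factor;
}
```

Generic options are stored as text, which is why the sketch parses a string rather than taking a number.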

== for SQL-based FDWs ==
; JOINs of two foreign tables in the same server
: They could be merged into one ForeignScan so that the foreign server performs the JOIN itself and returns the joined result.

; Optimize SELECT clause
: Some foreign scans need only a subset of the columns. Unnecessary columns in such a scan can be omitted from the SELECT clause.

; Support internal parameter
: Certain kinds of plan, e.g. nested loop, generate internal parameters to pass values from a parent node to a child node. The number of records fetched from a foreign server can be reduced by applying an internal parameter to the external query.

; Optimize parameter
: Some foreign scans use only a subset of the parameters of an EXECUTE statement. Unused parameters can be omitted from the parameters passed to PQexecParams(). Parameters can also be passed in binary format to avoid conversion between text and binary.

; Support cursor mode for huge result
: Currently libpq does not support protocol-level cursors, so the FDW for PostgreSQL executes the SELECT statement directly via PQexecParams() and retrieves all tuples at once. If parameterized cursors were supported, the FDW for PostgreSQL could retrieve part of the result at a time to improve response time.

; Push-down WHERE clause including CURRENT_TIMESTAMP
: This could be done by rewriting the query as pgpool does, or by replacing the FuncExpr node with a Const node representing the result of CURRENT_TIMESTAMP.

= SQL Conformance =
{| border="1"
|+ Foreign table features in the SQL standard
! Identifier
! Description
! Status
|-
| M004
| Foreign data support
|
|-
| M005
| Foreign schema support
|
|-
| M006
| GetSQLString routine
|
|-
| M007
| TransmitRequest
|
|-
| M009
| GetOpts and GetStatistics routines
|
|-
| M010
| Foreign data wrapper support
|
|-
| M018
| Foreign data wrapper interface routines in Ada
| (not planned)
|-
| M019
| Foreign data wrapper interface routines in C
|
|-
| M020
| Foreign data wrapper interface routines in COBOL
| (not planned)
|-
| M021
| Foreign data wrapper interface routines in Fortran
| (not planned)
|-
| M022
| Foreign data wrapper interface routines in MUMPS
| (not planned)
|-
| M023
| Foreign data wrapper interface routines in Pascal
| (not planned)
|-
| M024
| Foreign data wrapper interface routines in PL/I
| (not planned)
|-
| M030
| SQL-server foreign data support
|
|-
| M031
| Foreign data wrapper general routines
|
|}

{| border="1"
|+ Error codes for FDWs
! Code
! Meaning
|-
| HV000
| FDW-specific condition
|-
| HV001
| MEMORY ALLOCATION ERROR
|-
| HV002
| DYNAMIC PARAMETER VALUE NEEDED
|-
| HV004
| INVALID DATA TYPE
|-
| HV005
| COLUMN NAME NOT FOUND
|-
| HV006
| INVALID DATA TYPE DESCRIPTORS
|-
| HV007
| INVALID COLUMN NAME
|-
| HV008
| INVALID COLUMN NUMBER
|-
| HV009
| INVALID USE OF NULL POINTER
|-
| HV00A
| INVALID STRING FORMAT
|-
| HV00B
| INVALID HANDLE
|-
| HV00C
| INVALID OPTION INDEX
|-
| HV00D
| INVALID OPTION NAME
|-
| HV00J
| OPTION NAME NOT FOUND
|-
| HV00K
| REPLY HANDLE
|-
| HV00L
| UNABLE TO CREATE EXECUTION
|-
| HV00M
| UNABLE TO CREATE REPLY
|-
| HV00N
| UNABLE TO ESTABLISH CONNECTION
|-
| HV00P
| NO SCHEMAS
|-
| HV00Q
| SCHEMA NOT FOUND
|-
| HV00R
| TABLE NOT FOUND
|-
| HV010
| FUNCTION SEQUENCE ERROR
|-
| HV014
| LIMIT ON NUMBER OF HANDLES EXCEEDED
|-
| HV021
| INCONSISTENT DESCRIPTOR INFORMATION
|-
| HV024
| INVALID ATTRIBUTE VALUE
|-
| HV090
| INVALID STRING LENGTH OR BUFFER LENGTH
|-
| HV091
| INVALID DESCRIPTOR FIELD IDENTIFIER
|-
| 0X000
| invalid foreign server specification
|-
| 0Y000
| pass-through specific condition
|-
| 0Y001
| INVALID CURSOR OPTION
|-
| 0Y002
| INVALID CURSOR ALLOCATION
|}

[[Category:SQL/MED]]