WWW FDW

From PostgreSQL wiki

(Difference between revisions)
Jump to: navigation, search
(Created page with "== Idea == Provide ability to work with data from web service using SQL. == Benefits == * ability to select data from web service using different criterias (filters, sort, agg…")
 
m (While it may be pronounced "shed-yule" in some parts of the world, it's spelled with a 'c'.)
Line 164: Line 164:
 
Central place is realization of extension core, which will be developed & QAed independently from any callback/web_services implementations. In this case it will be really easy to reuse this extension with minimum of coding for new services. From the other side, any complicated service can be handled with more complex callbacks.
 
Central place is realization of extension core, which will be developed & QAed independently from any callback/web_services implementations. In this case it will be really easy to reuse this extension with minimum of coding for new services. From the other side, any complicated service can be handled with more complex callbacks.
  
== Shedule ==
+
== Schedule ==
  
 
'''TODO'''
 
'''TODO'''
  
 
== TODO ==
 
== TODO ==

Revision as of 14:23, 28 September 2011

Contents

Idea

Provide ability to work with data from web service using SQL.

Benefits

  • ability to select data from web service using different criterias (filters, sort, aggregation, other sql features);
  • ability to join/union data from web service with other table data (probably output of another web service);
  • export of data from web service into table view;
  • common interface for any web service.

Details

Will be implemented using Foreign data wrappers (SQL/MED).

Main complexity here is to build easy interface for wide range of web services. Main goal for this implementation will be RESTful services.

Configuration

Each FDW will have following configuration options:

  • uri - base URI for web service;
  • uri_select - part of URI for selection command, default - "";
  • uri_insert - part of URI for insert command, default - "" (reserved for future);
  • uri_delete - part of URI for delete command, default - "" (reserved for future);
  • uri_update - part of URI for update command, default - "" (reserved for future);
  • uri_callback - callback for URI forming (in case it's not static);
  • method_select - http method for select command, default value - "GET";
  • method_insert - http method for insert command, default value - "PUT";
  • method_delete - http method for delete command, default value - "DELETE";
  • method_update - http method for update command, default value - "POST";
  • request_serialize_callback - callback for forming request (for non standard cases, will be covered later);
  • response_type - web service response content type, possible values:
    • json - default value;
    • xml
    • yaml
    • other
  • response_deserialize_callback - callback for parsing responses of "other" type;
  • response_iterate_callback - callback for extracting/iterating rows from response.

Commands

There is following correspondence between sql commands and http methods:

  • SELECT - GET
  • INSERT - PUT
  • DELETE - DELETE
  • UPDATE - POST

There are configurable options for correspondence above.

In the current version (postgres 9.1) there is no implementation for FDW INSERT/DELETE/UPDATE. But stuff related for INSERT/DELETE/UPDATE is reserved for future.

Request

Since all request parameters to web service are pairs (key,value), therefore default behavior for request preparation is: pack pairs (column, value) from sql query as "column=pair". Operators differ from "=" will raise an error (if request_serialize_callback wasn't specified).

request_serialize_callback is designed for non standard request preparation (aliases for parameters, default parameters etc).

Response

In case response type is one of:

  • json
  • xml
  • yaml

response will be parsed and response elements will be casted to postgres types. If response_iterate_callback is specified it will be called from IterateForeignScan_function with result structure and bigint iterator. Other way first array (if it has all elements of the same type/structure) will be returned (breadth-first search).

postgres has following data types which will cover needed data structures:

In case of "other" response type response_deserialize_callback will be called. It's return value is array of result rows.

Callbacks

Example:

CREATE OR REPLACE FUNCTION example_response_iterate_callback(e anyelement, idx bigint) RETURNS g AS $$
DECLARE
	r	g;
BEGIN
	BEGIN
		IF idx > array_length(e.items, 1) THEN
			RETURN	NULL;
		END IF;
		r.title	:= e.items[idx].title;
		r.link	:= e.items[idx].link;
		r.snippet	:= e.items[idx].snippet;
	EXCEPTION
		WHEN OTHERS THEN
			RAISE EXCEPTION 'error occured, sqlstate: %', SQLSTATE;
	END;
	RETURN	r;
END; $$ LANGUAGE PLPGSQL;

Function is passed as FDW option:

CREATE SERVER www_server FOREIGN DATA WRAPPER www_fdw OPTIONS (response_iterate_callback 'example_response_iterate_callback($1,$2)');

Call to this function is implemented through SPI.

Examples

Extension installation:

-- Install the extension
CREATE EXTENSION www_fdw;

-- Create the foreign server, a pointer to the web service.
CREATE SERVER www_service FOREIGN DATA WRAPPER www_fdw 
    OPTIONS (uri 'URI', ...OTHER_PARAMETERS_HERE...);

CREATE USER MAPPING FOR current_user SERVER www_service;

Example for google search API:

CREATE SERVER www_fdw_google_search_server FOREIGN DATA WRAPPER www_fdw 
    OPTIONS (uri 'https://www.googleapis.com/customsearch/v1?alt=qson&key=KEY');

CREATE FOREIGN TABLE www_fdw_google_search (
  title text,
  link text,
  snippet text
) SERVER www_fdw_google_search_server;

Queries examples:

select title,snippet from www_fdw_google_search where q=’cat dog’ limit 1;
    title                                            |             snippet
-----------------------------------------------------+-----------------------------------------------------------------------------------------
 CatDog - Wikipedia, the free encyclopedia           | CatDog is an American animated television series which first aired on April 4, 1998, ...

-- example of union with full text search:
select * from documents where body @@ plainto_tsquery('cat dog')
union
select title,snippet from www_fdw_google_search where q=’cat dog’ limit 1;
    name                                             |             body
-----------------------------------------------------+-----------------------------------------------------------------------------------------
 cat and dog                                         | cat and dog set on a pavement
 CatDog - Wikipedia, the free encyclopedia           | CatDog is an American animated television series which first aired on April 4, 1998, ...

Other interesting examples:

More

Besides providing specialized callbacks there is another way: building web service "in the middle", which translates formats. In simplest case it implements GET returning json array with row objects.

Central place is realization of extension core, which will be developed & QAed independently from any callback/web_services implementations. In this case it will be really easy to reuse this extension with minimum of coding for new services. From the other side, any complicated service can be handled with more complex callbacks.

Schedule

TODO

TODO

Personal tools