Gsoc08-collation

Proposal

Abstract

The current version of PostgreSQL supports only one collation per database cluster, set by initdb. This does not meet the requirements of users developing multilingual applications.

The goal of this work is to implement collation at the database level and to lay the foundations for further national language support development. Users will be able to set the collation when creating a database or change the collation of an existing one, in particular through the commands CREATE DATABASE... COLLATE … and ALTER DATABASE … COLLATE …, following the ANSI standard. The work will also make it possible for users to create their own collection of collations through the commands CREATE COLLATION … FROM … USING and DROP COLLATION, again following the ANSI standard.

Further information

This work will be used as my bachelor thesis. Learning PostgreSQL's internals better will be a great experience for me. I will continue working on this project for my master's thesis and will add more functionality; the idea is to implement collation per column. The part of this work that I am applying for in Google Summer of Code 2008 will implement collation functionality at the database level and create a foundation for further multi-language support development. This will be a significant benefit for the open source community.

The initial part of my work has been completed and submitted as part of a patch contributed by Alexey Slynko. I am now at the stage of adding collation catalogs, which will be important for further multi-language support.

Users and developers have been asking for improved multi-language support. This requirement has already been added to the official PostgreSQL TODO list.

Implementation

Catalogs

A new catalog, pg_collation, will be defined.
pg_collation will contain the SQL standard collations plus an optional default collation (when the default is set to something other than an SQL standard one).
pg_type, pg_attribute and pg_namespace will be extended with references to their default records in pg_collation.
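
The catalog layout described above can be pictured with a small model. This is only a sketch under stated assumptions, not the layout used by the patch: the column names (oid, name, is_standard, collation_oid), the OID values and the example locale cs_CZ.UTF-8 are all hypothetical, and ucs_basic is used merely as an example of an SQL standard collation name.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Collation:
    """One row of the proposed pg_collation catalog (hypothetical columns)."""
    oid: int
    name: str
    is_standard: bool  # True for collations defined by the SQL standard


@dataclass
class CatalogRow:
    """A row of pg_type, pg_attribute or pg_namespace, reduced to what matters here."""
    name: str
    collation_oid: int  # the proposed reference into pg_collation


# pg_collation: SQL standard collations plus one optional non-standard default.
pg_collation: Dict[int, Collation] = {
    1: Collation(oid=1, name="ucs_basic", is_standard=True),
    2: Collation(oid=2, name="cs_CZ.UTF-8", is_standard=False),  # assumed default
}

# pg_type rows pointing at the default record.
pg_type: List[CatalogRow] = [
    CatalogRow(name="text", collation_oid=2),
    CatalogRow(name="varchar", collation_oid=2),
]


def collation_of(row: CatalogRow) -> Collation:
    """Follow the reference from a catalog row to its pg_collation record."""
    return pg_collation[row.collation_oid]


text_collation = collation_of(pg_type[0]).name
```

Storing a reference rather than the collation name in each row means the default can be changed in one place per row without duplicating collation definitions.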

initdb

pg_collation will contain the predefined records required by the SQL standard and, optionally, one non-standard record set when initdb creates the cluster (the one using the system locale).
This record will be referenced by the relevant columns of pg_type, pg_attribute and pg_namespace and will be considered the default collation that is inherited.
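
The seeding step described above might look roughly like the following. It is a minimal sketch with assumptions: the function names seed_pg_collation and seed_defaults are invented for illustration, the list of standard collations is reduced to a single example entry, and falling back to the first standard collation when no system locale is given is a guess at the intended behaviour.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: a stand-in for the set of SQL standard collations.
STANDARD_COLLATIONS: List[str] = ["ucs_basic"]

# The catalogs that, per the proposal, reference the default collation.
REFERENCING_CATALOGS = ("pg_type", "pg_attribute", "pg_namespace")


def seed_pg_collation(system_locale: Optional[str]) -> Tuple[List[str], str]:
    """Return the initial pg_collation records and the name of the default one."""
    records = list(STANDARD_COLLATIONS)
    default = records[0]  # assumed fallback when no locale-based record is added
    if system_locale is not None and system_locale not in records:
        records.append(system_locale)  # the optional non-standard record
        default = system_locale
    return records, default


def seed_defaults(default: str) -> Dict[str, str]:
    """Point every referencing catalog at the default collation."""
    return {catalog: default for catalog in REFERENCING_CATALOGS}


records, default = seed_pg_collation("cs_CZ.UTF-8")
defaults = seed_defaults(default)
```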

CREATE DATABASE ... COLLATE ...

After the new database has been copied, its collation will be either the default (the same as the cluster collation) or the one given in the COLLATE clause. The pg_type, pg_attribute and pg_namespace catalogs are then updated.
The database is reindexed.

When switching to another database, the database collation will be retrieved from the pg_type record for the type text.
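
The sequence above (copy, apply COLLATE, update catalogs, reindex, then read the collation back from the text type) can be sketched as a toy model. Everything here is an assumption made for illustration: databases are plain dictionaries, the "reindexed" flag stands in for a real reindex, every catalog row is rewritten rather than only those that pointed at the old default, and the reindex is skipped when no COLLATE clause is given.

```python
import copy
from typing import Optional

REFERENCING_CATALOGS = ("pg_type", "pg_attribute", "pg_namespace")


def create_database(template: dict, collate: Optional[str] = None) -> dict:
    """Model of CREATE DATABASE ... COLLATE ... on a dictionary-based database."""
    db = copy.deepcopy(template)  # step 1: copy the template database
    db["reindexed"] = False
    if collate is not None:
        # step 2: update the catalogs that reference the default collation
        for catalog in REFERENCING_CATALOGS:
            for key in db[catalog]:
                db[catalog][key] = collate
        # step 3: indexes copied from the template follow the old ordering
        db["reindexed"] = True
    return db


def database_collation(db: dict) -> str:
    """The database collation is read from the pg_type record for type text."""
    return db["pg_type"]["text"]


template = {
    "pg_type": {"text": "cs_CZ.UTF-8", "varchar": "cs_CZ.UTF-8"},
    "pg_attribute": {"pg_class.relname": "cs_CZ.UTF-8"},
    "pg_namespace": {"public": "cs_CZ.UTF-8"},
}

german_db = create_database(template, collate="de_DE.UTF-8")
plain_db = create_database(template)
```

Here database_collation(german_db) yields the collation from the COLLATE clause, while plain_db keeps the collation inherited from the template.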

Mail archive

Downloads

Work in progress patch
