Largeobject Enhancement

From PostgreSQL wiki
Jump to navigationJump to search

This wikipage is a memo of unofficial TODO items to develop the Todo items for Binary Data.

Largeobject interfaces on TOAST values

The TOAST relations are defined as follows, it is an identical structure existing pg_largeobejct.

pg_toast_%u (
    chunk_id        oid,
    chunk_seq       int4,
    chunk_data      bytea,
    unique(chunk_id, chunk_seq)
)
pg_largeobject(
    loid            oid,
    pageno          int4,
    data            bytea,
    unique(loid, pageno)
)

It is quite natural to deploy the contents of largeobjects on its toast relations, and largeobject interfaces to be enhanced to access partial stuff of toast data structure. In this case, the pg_largeobject will become to be used to manage only matadata of largeobjects.

It also enables to resolve a matter in very large TEXT or BYTEA data. In the specification, it allows to store 1Gbytes length varlena data, but it needs to expand whole of the chunk data frames on the local memory when it is fetched. Obviously, it is not suitable to store very large data.

Issues

Toast formatting

When the given data is toasted, it can be compressed to reduce the consumption of storage, for columns with attstorage being 'm' or 'x'. However, it makes hard to estimate what chunk stores required data region, so it is not suitable for partial accesses. The current TEXT and BYTEA types are defined with pg_type.typstorage being 'x', so it needs a new varlena type to store an uncompressed data to allow partial accesses.

When a varlena data is enough small, it can be stored within a tuple inlined, without any external toast relation. However, it makes impossible to update them partially, because it needs whole of the updates.

Thus, a new varlena data type which allows largeobject interfaces must have the following characteristics:

  • It always uses external toast relation, independent from the size of given varlena.
  • It never compress the given varlena, to estimate what chunk stores what offset correctly.

In my plan, a new varlena type BLOB will be added with above requirements.

Holes of chunks

The current TOASTing mechanism does not consider holes of chunks. The largeobject feature can contain discontinuous page frames. If we try to read the holed page, it returns zero'ed pages. The reworked TOASTing mechanism needs to handle holed pages.

An arguable thing is what value should be returned from toast_raw_datum_size() and toast_datum_size().

Schema support

Statements

ALTER LARGE OBJECT

postgres=# CREATE SCHEMA lotest;
CREATE SCHEMA
postgres=# ALTER LARGE OBJECT 1234 SET SCHEMA lotest;
ALTER LARGE OBJECT
postgres=# DROP SCHEMA lotest;
ERROR:  cannot drop schema lotest because other objects depend on it
DETAIL:  largeobject 1234 depends on schema lotest
HINT:  Use DROP ... CASCADE to drop the dependent objects too.
postgres=# DROP SCHEMA lotest CASCADE;
NOTICE:  drop cascades to largeobject 1234
DROP SCHEMA

CREATE LARGE OBJECT

It seems to me that CREATE [TEMP] LARGE OBJECT with fully qualified namespace is a useful feature.

Named large object

Currently, largeobject interfaces only support numeric identifier, however, it is not impossible to use human readable identifier as follows:

postgres=# SELECT lo_open('my_lobject', x'40000'::int);
postgres=# SELECT lo_open('pg_temp.your_lobject', x'40000'::int);

The later example qualifies the name of largeobject by pg_temp schema.

Privilege to create large objects

In the database DAC model, ACL_CREATE is checked on the schema which stores a new database object, however, largeobjects are not dominated by any certain schema. If we need to check permission on creation of largeobjects, it needs to be covered with a certain schema object.

Completed Items

Add security checks for large objects

This offers to support management of ownerships and read/write privileges for large objects. The read and write permissions are obvious. These should be checked on loread(), lowrite() and lo_truncate(). Also, only owner of the large objects (and superuser) can drop them with lo_unlink().

Data structure

A new system catalog to manage the metadata (ownership and security attribute) of a largeobject separately from page frames. The pg_largeobject_metadata.oid points to pg_largeobject.loid.

Statement