Project to implement Column Oriented storage in PostgreSQL
As of 9.4, PostgreSQL is a row-oriented database server. The project and requirements described here have the objective of implementing performance improvements using column storage techniques.
Overview of Column-Oriented storage
Row oriented storage means that columns in the table are generally stored in a single heap, with each column stored on a single tuple.
Various database systems have implemented physical storage so that individual columns are stored separately, with a values of each tuple stored in an array or similar structure, which is known as Column Oriented DBMS: http://en.wikipedia.org/wiki/Column-oriented_DBMS
Column oriented storage has become associated with Data Warehousing and Business Intelligence systems where in conjunction with other related features it can enhance performance. While true, this is a simplistic viewpoint and we need to understand where the speed comes from and how that could work for PostgreSQL.
Particularly good description of columnar storage is here http://docs.aws.amazon.com/redshift/latest/dg/c_columnar_storage_disk_mem_mgmnt.html
What's wrong with TOAST?
TOAST moves data out of the main heap. What's wrong with that?
The Oversized Attribute Storage Technique (TOAST) does a great job at what it does - storing large values for columns. There are some improvements that can be made, such as improving compression performance or making it more configurable. However, those don't change the basic nature of TOAST: it is designed for storage, not as a means to improve performance of queries.
Specifically, TOAST can be slow for queries because
- TOAST stores all column values separately - multiple row values are not stored adjacent to each other
- TOAST chops up values into chunks, though without much pattern - the column value must be reconstructed for use in search queries
So we need a more Column-Oriented Attribute Storage Technique, COAST...
Advantages of COAST
Column Oriented storage would give PostgreSQL
- Faster business intelligence queries
- Ability to improve compression of individual columns
- Ability for PostgreSQL to exceed the current 1600 column limit in one tuple
What would the UI look like?
Just as we set storage parameters for each column for TOAST, we would set them for COAST also...
ALTER TABLE foo ALTER COLUMN foocol SET ORIENTATION COLUMN;
We might also want to store columns in groups together, like so
ALTER TABLE foo ALTER COLUMN foocol1 SET COLUMN GROUP 2, ALTER COLUMN foocol2 SET COLUMN GROUP 2;
which would give us the ability to store columns in secondary relations in groups, like Cassandra.
From a User Interface perspective, that's it. We want these performance gains to be mostly transparent to the user.
But we also want to continue to use Row Oriented Storage as the default choice, probably forever.
Just as a TOASTed tuple needs to be de-TOASTed, COAST would require a step where we reconstruct the final result tuple. Since the tuple value for a column-oriented attribute is not stored in the heap we must fetch it or join it back to the heap. This last stage could be quite expensive if the query returns a large number of columns AND rows. Clearly we could push down aggregation into the individual columns, if that were possible.
Dangers of Simple Solutions
Row vs. Column Stores http://db.csail.mit.edu/projects/cstore/abadi-sigmod08.pdf
"The elevator pitch behind (the column store) performance difference is straightforward: column-stores are more I/O efficient for read-only queries since they only have to read from disk (or from memory) those attributes accessed by a query. This simplistic view leads to the assumption that one can obtain the performance benefits of a column-store using a row-store: either by vertically partitioning the schema, or by indexing every column so that columns can be accessed independently. In this paper, we demonstrate that this assumption is false. We compare the performance of a commercial row-store under a variety of different configurations with a column-store and show that the row-store performance is significantly slower on a recently proposed data warehouse benchmark. We then analyze the performance difference and show that there are some important differences between the two systems at the query executor level (in addition to the obvious differences at the storage layer level). Using the column-store, we then tease apart these differences, demonstrating the impact on performance of a variety of column-oriented query execution techniques, including vectorized query processing, compression, and a new join algorithm we introduce in this paper. We conclude that while it is not impossible for a row-store to achieve some of the performance advantages of a column-store, changes must be made to both the storage layer and the query executor to fully obtain the benefits of a column-oriented approach."
That's all a little bit negative, but it helps to illustrate that the changes are fairly deep, not just surface ones. Not sure why that means a row-store can't implement a column store; in fact, clearly they can, since there is at least one fork of PostgreSQL that is now a column store, possibly more like three+.
Basically, we end up by saying that if we
- store columns individually
- compress them all together and then
- access them without decompression
- using bulk access methods
then we'll go faster.
Which is interesting, but given the current nature of PostgreSQL's transactional mechanisms, that is hard to utilise directly, without a mechanism to pack data structures once transactions have completed.
Anyway, that's going some way forwards into design details, so lets keep it light for now.
Damn, lets just use JSONB
JSONB allows us to jam data together into column groups in a JSON blob. We can access data even while compressed, so we can certainly get query performance improvements.
But this data storage is still row oriented (as is Cassandra, MongoDB, etc - sorry chaps) and so won't offer some of the same performance gains.
All of the quoted advantages of Column Storage, its best to look a little at what the disadvantages are also.
- Read mostly database - loading of data made while holding full table locks
- Optimised for insert only case - no deletes, no updates
- Fixed width fields
- Fixed value ranges within data fields
- Lack of performance support for complex data types
Now comes the hard part...
Funding is provided by European Union under FP7 grant agreement 318633 - AXLE Project - http://www.axleproject.eu/