On columnar storage (2)

From: Alvaro Herrera <alvherre(at)2ndQuadrant(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: On columnar storage (2)
Date: 2015-08-31 22:53:28
Message-ID: 20150831225328.GM2912@alvherre.pgsql
Lists: pgsql-hackers

As discussed in
https://www.postgresql.org/message-id/20150611230316.GM133018@postgresql.org
we've been working on implementing columnar storage for Postgres.
Here's some initial code to show our general idea, and to gather
comments related to what we're building. This is not a complete patch,
and we don't claim that it works! This is in very early stages, and we
have a lot of work to do to get this in working shape.

This was proposed during the Developer's Unconference in Ottawa earlier
this year. While some questions were raised about some elements of our
design, we don't think they were outright objections, so we have pressed
forward on the expectation that any limitations can be fixed before this
is final if they are critical, or in subsequent commits if not.

The commit messages for each patch should explain what we've done in
enough technical detail, and hopefully provide a high-level overview of
what we're developing.

The first few pieces are "ready for comment" -- feel free to speak up
about the catalog additions, the new COLUMN STORE bits we added to the
grammar, the way we handle column stores in the relcache, or the
mechanics to create column store catalog entries.

The latter half of the patch series is much less well cooked; for
example, the colstore_dummy module is just a simple experiment to let us
verify that the API is working. The planner and executor code are
mostly stubs, and we are not yet sure which executor nodes we would like
to have. While we have discussed this topic internally a lot, we haven't
yet formed final opinions. Of course, the stub implementations are not
doing the proper things, and in many cases they do nothing at all.

Still, we believe this shows the general spirit of things, which is that
we would like these new objects to be first-class citizens in the Postgres
architecture:

a) so that the optimizer will be able to extract as much benefit as
possible from columnar storage: it won't be at arm's length through an
opaque interface, but rather directly wired into plans, and will
eventually have a Path representation.

b) so that it is possible to implement things such as tables that live
completely in columnar storage, as mentioned by Tom regarding Salesforce's
extant columnar storage.

Please don't think that the commits attached below represent development
history. We played with the early pieces for quite a while before
settling on what you see here. The presented split is intended to ease
reading. We continue to play with the planner and executor code,
getting familiar enough with it that we can write something
that actually works.

This patch is a joint effort of Tomáš Vondra and myself, with
contributions from Simon Riggs. A lot of the code attributed to me
in the commit messages was actually authored by Tomáš. (Git
decided to lay blame on me because I split the commits.)

The research leading to these results has received funding from the
European Union’s Seventh Framework Programme (FP7/2007-2015) under grant
agreement n° 318633.

--
Álvaro Herrera Developer, http://www.PostgreSQL.org/

Attachment Content-Type Size
0001-initial-README-for-column-stores.patch text/x-diff 6.9 KB
0002-New-docs-section-on-Data-Definition-Column-Stores.patch text/x-diff 3.9 KB
0003-Add-RELKIND_COLUMN_STORE-to-pg_class.h.patch text/x-diff 1.1 KB
0004-Add-PG_COLSTORE_NAMESPACE-to-pg_namespace.patch text/x-diff 938 bytes
0005-relcache-don-t-consider-nonzero-pg_class.relam-as-an.patch text/x-diff 1.3 KB
0006-Add-column-store-catalogs.patch text/x-diff 7.2 KB
0007-add-syscaches-for-column-store-catalogs.patch text/x-diff 1.8 KB
0008-Add-COLUMN-STORE-clause-to-CREATE-TABLE.patch text/x-diff 20.5 KB
0009-add-pg_class.relhascstore.patch text/x-diff 14.7 KB
0010-Add-ColumnStoreOptInfo-to-RelationData.patch text/x-diff 16.2 KB
0011-Infrastructure-to-create-column-stores.patch text/x-diff 41.0 KB
0012-add-psql-d-support-for-column-stores.patch text/x-diff 2.3 KB
0013-add-colstore-function-to-dbsize.patch text/x-diff 4.2 KB
0014-Add-a-generic-API-for-column-stores-to-implement.patch text/x-diff 13.5 KB
0015-New-command-CREATE-COLUMN-STORE-ACCESS-METHOD.patch text/x-diff 19.3 KB
0016-First-column-store-implementation-colstore_dummy.patch text/x-diff 18.4 KB
0017-Add-ColumnStoreMaterial-node.patch text/x-diff 5.3 KB
0018-initial-planning-of-ColumnStoreMaterialize-nodes.patch text/x-diff 11.0 KB
0019-Some-stub-executor-code.patch text/x-diff 13.1 KB
0020-Add-FormColumnStoreDatum-and-FilterHeapTuple.patch text/x-diff 5.8 KB
0021-initial-implementation-of-nodeModifyTable.patch text/x-diff 22.9 KB
0022-COPY-use-colstore-batch-stuff.patch text/x-diff 3.5 KB
0023-regression-tests-for-cstore.patch text/x-diff 18.6 KB
0024-Add-known-bugs-file.patch text/x-diff 848 bytes
