Re: On columnar storage (2)

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: On columnar storage (2)
Date: 2015-12-30 20:26:05
Message-ID: 20151230202605.GA58441@alvherre.pgsql
Lists: pgsql-hackers

Jeff Janes wrote:
> Could we get this rebased past the merge of the parallel execution commits?

Here you go.

Actually, this is not just a rebase, but rather a heavily revamped
version of the previous patch. This is now functional to some degree (I
bet you could break it with complex queries or perhaps even with simple
table inheritance -- but all TPC-H queries work, and many of them are
faster than with the original code), using the design that was proposed
previously: column stores are considered separate relations and added to
the plan tree, with a suitable join condition to their main table.
There's a new executor node called ColumnStoreScan which has special
glue code to call a specific column store implementation, previously
created with the provided CREATE COLUMN STORE ACCESS METHOD command.
We provide a sample access method, called "vertical" (for vertical
partitioning) which is the simplest we could make, to have something to
test. It's not actually column-oriented.
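
To make the "separate relation plus join" design concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not code from the patch: the class and function names are hypothetical, and it assumes the column store is joined to its main table on a shared row identifier (the message does not say what the actual join condition is).

```python
from dataclasses import dataclass, field


@dataclass
class HeapTable:
    """Main table: row id -> values of the columns kept in the heap."""
    rows: dict = field(default_factory=dict)


@dataclass
class ColumnStore:
    """Separate relation holding some columns, keyed by the same row id.

    Assumption: a shared row id is the join key. The real patch may differ.
    """
    columns: tuple
    rows: dict = field(default_factory=dict)


def heap_scan(table):
    """Emit (row id, values) for every row of the main table."""
    for row_id, values in table.rows.items():
        yield row_id, dict(values)


def column_store_scan(store, wanted):
    """Toy stand-in for a ColumnStoreScan node: emit only the wanted columns."""
    for row_id, values in store.rows.items():
        yield row_id, {name: values[name] for name in wanted}


def join_on_row_id(heap_iter, store_iter):
    """Join the two scans on row id, building a lookup from the store side."""
    lookup = dict(store_iter)
    for row_id, heap_values in heap_iter:
        if row_id in lookup:
            yield {**heap_values, **lookup[row_id]}


heap = HeapTable(rows={1: {"id": 10}, 2: {"id": 20}})
store = ColumnStore(
    columns=("price", "note"),
    rows={1: {"price": 5.0, "note": "a"}, 2: {"price": 7.5, "note": "b"}},
)

# Only "price" is requested, so "note" is never read from the store.
result = list(join_on_row_id(heap_scan(heap), column_store_scan(store, ["price"])))
```

The point of the shape is that the store is scanned by its own node and only contributes the columns a query needs; everything else about the plan is an ordinary join.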

There's a lot of optimizer trickery to make this thing work (most of it
by David Rowley). We have a first step that mutates the join tree to
add the nodes we need; at that point we also mutate the Var nodes that
point to columns that are in the store, so that they point to the column
store instead of to the relation. David also added code to prune
colstore relations that are "unused" -- this is more tricky than it
sounds because the join code somewhere adds all Vars for the relations
in the range table.
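
The two planner steps described above (redirecting Vars to the column store, then pruning stores nothing refers to) can be illustrated with a small Python sketch. This is a simplified model, not the patch's code: `Var` here is a hypothetical two-field stand-in for the planner's Var node, and the dictionary layout describing the column stores is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    relid: int    # range-table index of the relation the Var points to
    attname: str  # column name (the real planner uses attribute numbers)


def remap_vars(vars_, colstores):
    """Point Vars at the column store that holds their column.

    colstores maps colstore relid -> (parent relid, set of column names).
    Vars for columns that stay in the main table are left unchanged.
    """
    owner = {}
    for cs_relid, (parent, cols) in colstores.items():
        for col in cols:
            owner[(parent, col)] = cs_relid
    return [Var(owner.get((v.relid, v.attname), v.relid), v.attname) for v in vars_]


def prune_unused(colstores, vars_):
    """Keep only the column stores that some remapped Var still references."""
    used = {v.relid for v in vars_}
    return {relid: info for relid, info in colstores.items() if relid in used}


# Relation 1 is the main table; relations 2 and 3 are its column stores.
colstores = {2: (1, {"price"}), 3: (1, {"note"})}
query_vars = [Var(1, "id"), Var(1, "price")]

remapped = remap_vars(query_vars, colstores)
kept = prune_unused(colstores, remapped)
```

In this example the query never touches `note`, so store 3 is dropped and only store 2 needs to be joined. The sketch sidesteps the difficulty mentioned above: pruning is only this simple when the Var list contains just the columns the query really uses.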

Back on the executor side there's some code added to ModifyTable and COPY so
that they put data into the column store, using the access method
routines.

Another thing we needed was to implement "physical attributes", which is
a cut-down version of the logical column mapping patch that Tomas and I
spent so long trying to get to work. This version was implemented from
scratch by David; it's more limited in scope compared to the previous
version but it's enough to get colstores working.

I have a version of this patch that's split into smaller commits, easier
to read. I can share that if anyone's interested.

Now, I don't actually intend any of this for application. It's
more to start some discussion on where we want to go next. Simon,
David, Tomas and I have discussed this at length and we have various
ideas on where to go from here. I (and/or somebody else) will post
later about this.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
column-stores-2.patch text/x-diff 388.6 KB