Re: Table AM modifications to accept column projection lists

From: Nikita Malakhov <hukutoc(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Table AM modifications to accept column projection lists
Date: 2022-09-05 16:51:32
Message-ID: CAN-LCVMkL4DDPn4s53TxATdhTdGNa1v0FJ4B-3sDdB-yizraGQ@mail.gmail.com
Lists: pgsql-hackers

Hi hackers!

This is the original patch rebased onto v15 master with conflicts resolved.
I'm currently studying it and the latest comments in the original thread,
and will try to go the way that was mentioned in the thread (last message):
[1]
https://stratos.seas.harvard.edu/files/stratos/files/columnstoresfntdbs.pdf
[2] https://github.com/zhihuiFan/postgres/tree/lazy_material_v2
I agree it is not in a state for review, so I've decided not to change the
patch status, just to revive the thread, because we found that the
Pluggable Storage API is somewhat insufficient.
Thanks for the recommendations, I'll check that out.
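
For anyone new to the thread, the projection idea recapped in the quoted
text below can be illustrated with a standalone toy model in C. This is
only a sketch and not code from the patch: ToyTable, ToyScan,
toy_beginscan(), toy_getnext() and toy_sum_column() are hypothetical names
invented for illustration, and none of them correspond to the real table AM
callbacks. The assumption here is that the projection list is a simple
bitmask and that columns outside the mask are returned as NULL; the actual
patch may represent both differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NCOLS 4
#define TOY_NROWS 3

/* Hypothetical toy column store: one array per column. */
typedef struct ToyTable
{
	int			cols[TOY_NCOLS][TOY_NROWS];
	size_t		fetches;		/* column values read from "disk" so far */
} ToyTable;

/* Hypothetical scan state carrying the projection list. */
typedef struct ToyScan
{
	ToyTable   *table;
	uint32_t	proj;			/* bit i set => column i is needed */
	size_t		next_row;
} ToyScan;

/* Start a scan; the caller states up front which columns it needs. */
static ToyScan
toy_beginscan(ToyTable *table, uint32_t proj)
{
	ToyScan		scan = {table, proj, 0};

	return scan;
}

/*
 * Return the next row, reading only the projected columns.  Columns that
 * are not in the projection are reported as NULL without being read.
 */
static bool
toy_getnext(ToyScan *scan, int values[TOY_NCOLS], bool isnull[TOY_NCOLS])
{
	if (scan->next_row >= TOY_NROWS)
		return false;

	for (int col = 0; col < TOY_NCOLS; col++)
	{
		if (scan->proj & (UINT32_C(1) << col))
		{
			values[col] = scan->table->cols[col][scan->next_row];
			isnull[col] = false;
			scan->table->fetches++;
		}
		else
		{
			values[col] = 0;
			isnull[col] = true;
		}
	}
	scan->next_row++;
	return true;
}

/* Sum one column while projecting only that column. */
static long
toy_sum_column(ToyTable *table, int col)
{
	int			values[TOY_NCOLS];
	bool		isnull[TOY_NCOLS];
	long		sum = 0;

	assert(col >= 0 && col < TOY_NCOLS);

	ToyScan		scan = toy_beginscan(table, UINT32_C(1) << col);

	while (toy_getnext(&scan, values, isnull))
		sum += values[col];
	return sum;
}
```

In this toy, toy_sum_column() reads TOY_NROWS values in total, whereas a
scan with every bit of the mask set reads TOY_NCOLS values per row. That
difference is the saving a columnar AM is after.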

On Mon, Sep 5, 2022 at 7:36 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> On Mon, Sep 05, 2022 at 05:38:51PM +0300, Nikita Malakhov wrote:
> > Due to experiments with columnar data storage I've decided to revive this
> > thread - Table AM modifications to accept column projection lists
> > <https://www.postgresql.org/message-id/flat/CAE-ML+9RmTNzKCNTZPQf8O3b-UjHWGFbSoXpQa3Wvuc8YBbEQw(at)mail(dot)gmail(dot)com>
> >
> > To remind:
> >
> > This patch introduces a set of changes to the table AM APIs, making them
> > accept a column projection list. That helps columnar table AMs, so that
> > they don't need to fetch all columns from disk, but only the ones
> > actually needed.
> >
> > The set of changes in this patch is not exhaustive -
> > there are many more opportunities that are discussed in the TODO section
> > below. Before digging deeper, we want to elicit early feedback on the
> > API changes and the column extraction logic.
> >
> > TableAM APIs that have been modified are:
> >
> > 1. Sequential scan APIs
> > 2. Index scan APIs
> > 3. API to lock and return a row
> > 4. API to fetch a single row
> >
> > We have seen performance benefits in Zedstore for many of the optimized
> > operations [0]. This patch is extracted from the larger patch shared in
> > [0].
>
> What parts of the original patch were left out ? This seems to be the
> same size as the original.
>
> With some special build options like -DWRITE_READ_PARSE_PLAN_TREES, this
> currently fails with:
>
> WARNING: outfuncs/readfuncs failed to produce equal parse tree
>
> There's poor code coverage in PopulateNeededColumnsForScan(),
> IndexNext(), check_default_partition_contents() and nodeSeqscan.c.
> https://cirrus-ci.com/task/5516554904272896
>
> https://api.cirrus-ci.com/v1/artifact/task/5516554904272896/coverage/coverage/00-index.html
>
> Is it currently possible to hit those code paths in postgres ? If not,
> you may need to invent a minimal columnar extension to allow exercising
> that.
>
> Note that the cirrusci link is on top of my branch which runs "extended"
> checks in cirrusci, but you can also run code coverage report locally
> with --enable-coverage.
>
> When you mail next, please run pgindent first (BTW there's a debian
> package in PGDG for pgindent).
>
> --
> Justin
>

--
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/
