Re: Pluggable Storage - Andres's take

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pluggable Storage - Andres's take
Date: 2018-09-21 07:40:10
Message-ID: CAJrrPGc2ca=iho_FOJpYxQNBqvJQK-3OUvmdo2VG7hDJhfOzzg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 21, 2018 at 5:05 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2018-09-21 16:57:43 +1000, Haribabu Kommi wrote:
>
> > For example, in a sequential scan, the heap returns the slot with
> > the tuple or with a value array of all the columns; the data then gets
> > filtered, and the unnecessary columns are removed later by projection.
> > This works fine for row-based storage. For columnar storage, if
> > the storage knows that the upper layers need only particular columns,
> > then it can directly return the specified columns and there is no
> > need for a projection step. This will also help columnar storage
> > return the proper columns faster.
>
> I think this is an important feature, but I feel fairly strongly that we
> should only tackle it in a second version. This patchset is already
> pretty darn large. It's imo not just helpful for columnar, but even for
> heap - we e.g. spend a lot of time deforming columns that are never
> accessed. That's particularly harmful when the leading columns are all
> NOT NULL and fixed width, but even if not, it's painful.
>

OK. Thanks for your opinion.
Then I will first try to clean up the open items of the existing patch.

> > Is it good to pass the plan to the storage, so that it can find out
> > the columns that need to be returned?
>
> I don't think that's the right approach - this should be a level *below*
> plan nodes, not reference them. I suspect we're going to have to have a
> new table_scan_set_columnlist() option or such.
>

The table_scan_set_columnlist() API can be a good solution to share
the columns that are expected.
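
To make the idea concrete, here is a minimal standalone sketch in C of
what passing a column list to a scan could look like. Everything in it is
hypothetical: ToyScanDesc, toy_scan_begin(), toy_scan_set_columnlist() and
toy_scan_getnext() are made-up names for illustration only, column numbers
are 0-based, and nothing here is the actual table AM interface from the
patch. It only shows that a columnar layout can skip the columns the
caller did not ask for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NATTS 4             /* toy table with four int columns */

/* Hypothetical scan descriptor; not PostgreSQL's real scan descriptor. */
typedef struct ToyScanDesc
{
    const int **columns;        /* one array per column (columnar layout) */
    size_t      nrows;
    size_t      next_row;
    bool        needed[TOY_NATTS];  /* columns the caller asked for */
    size_t      columns_read;   /* how many column values were fetched */
} ToyScanDesc;

/* Start a scan; by default every column is returned. */
static void
toy_scan_begin(ToyScanDesc *scan, const int **columns, size_t nrows)
{
    scan->columns = columns;
    scan->nrows = nrows;
    scan->next_row = 0;
    scan->columns_read = 0;
    for (int i = 0; i < TOY_NATTS; i++)
        scan->needed[i] = true;
}

/* Hypothetical analogue of table_scan_set_columnlist(): restrict the scan. */
static void
toy_scan_set_columnlist(ToyScanDesc *scan, const int *attnums, int natts)
{
    for (int i = 0; i < TOY_NATTS; i++)
        scan->needed[i] = false;
    for (int i = 0; i < natts; i++)
    {
        assert(attnums[i] >= 0 && attnums[i] < TOY_NATTS);
        scan->needed[attnums[i]] = true;
    }
}

/* Fetch the next row; only requested columns are written into values[]. */
static bool
toy_scan_getnext(ToyScanDesc *scan, int *values)
{
    if (scan->next_row >= scan->nrows)
        return false;
    for (int i = 0; i < TOY_NATTS; i++)
    {
        if (!scan->needed[i])
            continue;           /* untouched: no I/O or deforming cost */
        values[i] = scan->columns[i][scan->next_row];
        scan->columns_read++;
    }
    scan->next_row++;
    return true;
}
```

A caller would invoke toy_scan_set_columnlist() once after
toy_scan_begin() and before the first toy_scan_getnext(); entries of
values[] for columns that were not requested are simply left alone.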

> > Also, if the projection can be handled in the storage itself for some
> > scenarios, the callers need to be informed that there is no need to
> > perform the projection separately.
>
> I don't think that should be done in the storage layer - that's probably
> better done introducing custom scan nodes and such. This has costing
> implications etc, so this needs to happen *before* planning is finished.
>

Sorry, my explanation was wrong. Assume a scenario where the target list
contains only plain columns of a table, these columns have already been
passed to the storage using the new API proposed above, and there is a
one-to-one mapping between them. Based on that information, it would be
good to decide whether the projection is required or not.
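
As a rough illustration of that check, here is a small standalone C
sketch. The types and the function name (ToyTargetEntry,
toy_projection_needed()) are invented for this example and are not
existing PostgreSQL code. It also assumes that the storage returns exactly
the requested columns, in the requested order; under that assumption the
projection can be skipped only when every target entry is a plain column
and position i of the target list refers to position i of the column list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a target list entry. */
typedef enum ToyExprKind
{
    TOY_PLAIN_COLUMN,           /* a bare column reference */
    TOY_EXPRESSION              /* anything that must be computed */
} ToyExprKind;

typedef struct ToyTargetEntry
{
    ToyExprKind kind;
    int         attnum;         /* 0-based; used only for TOY_PLAIN_COLUMN */
} ToyTargetEntry;

/*
 * Return false only when the target list maps one-to-one, in order, onto
 * the column list that was handed to the storage.
 */
static bool
toy_projection_needed(const ToyTargetEntry *tlist, int ntargets,
                      const int *scan_columns, int nscan_columns)
{
    assert(ntargets >= 0 && nscan_columns >= 0);

    if (ntargets != nscan_columns)
        return true;            /* different widths: must project */

    for (int i = 0; i < ntargets; i++)
    {
        if (tlist[i].kind != TOY_PLAIN_COLUMN)
            return true;        /* an expression needs evaluation */
        if (tlist[i].attnum != scan_columns[i])
            return true;        /* column order differs */
    }
    return false;               /* storage output already matches */
}
```

Where such a decision would live is a separate question; as noted in the
quoted reply, anything with costing implications has to happen before
planning is finished.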

Regards,
Haribabu Kommi
Fujitsu Australia
