Re: Pluggable Storage - Andres's take

From: Andres Freund <andres(at)anarazel(dot)de>
To: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pluggable Storage - Andres's take
Date: 2018-09-21 07:05:34
Message-ID: 20180921070534.tzplrjq4fol2zoti@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2018-09-21 16:57:43 +1000, Haribabu Kommi wrote:
> During the porting of Fujitsu in-memory columnar store on top of pluggable
> storage, I found that the callers of the "heap_beginscan" are expecting
> the returned data is always contains all the records.

Right.

> For example, in the sequential scan, the heap returns the slot with
> the tuple or with value array of all the columns and then the data gets
> filtered and later removed the unnecessary columns with projection.
> This works fine for the row based storage. For columnar storage, if
> the storage knows that upper layers needs only particular columns,
> then they can directly return the specified columns and there is no
> need of projection step. This will help the columnar storage also
> to return proper columns in a faster way.

I think this is an important feature, but I feel fairly strongly that we
should only tackle it in a second version. This patchset is already
pretty darn large. It's imo not just helpful for columnar, but even for
heap - we e.g. spend a lot of time deforming columns that are never
accessed. That's particularly harmful when the leading columns are all
NOT NULL and fixed width, but even if not, it's painful.

> Is it good to pass the plan to the storage, so that they can find out
> the columns that needs to be returned?

I don't think that's the right approach - this should be a level *below*
plan nodes, not reference them. I suspect we're going to have to have a
new table_scan_set_columnlist() option or such.

> And also if the projection can handle in the storage itself for some
> scenarios, need to be informed the callers that there is no need to
> perform the projection extra.

I don't think that should be done in the storage layer - that's probably
better done introducing custom scan nodes and such. This has costing
implications etc, so this needs to happen *before* planning is finished.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Haribabu Kommi 2018-09-21 07:09:34 Re: View to get all the extension control file details
Previous Message Haribabu Kommi 2018-09-21 06:57:43 Re: Pluggable Storage - Andres's take