Re: Extracting only the columns needed for a query

From: Ashwin Agrawal <aagrawal(at)pivotal(dot)io>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Extracting only the columns needed for a query
Date: 2019-06-16 18:26:33
Message-ID: CALfoeiugKFT+5PGceRgJEDjNOjEL8bxGe3UzAgVByOsXoVCcMg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 15, 2019 at 10:02 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> > Approach B: after parsing and/or after planning
>
> If we wanted to do something about this, making the planner record
> the set of used columns seems like the thing to do. We could avoid
> the expense of doing it when it's not needed by setting up an AM/FDW/
> etc property or callback to request it.
>

Sounds good. In the Zedstore patch, we have added an AM property to convey
that the AM leverages column projection, and we currently skip the physical
tlist optimization based on it. So yes, the same property can similarly be
leveraged for other planning needs.
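
To illustrate the kind of opt-in property discussed here, below is a minimal
standalone sketch. It does not use the real TableAmRoutine; the struct
SketchTableAm, its wants_column_projection field, and both functions are
hypothetical names invented for this example. The only point is that the
planner would pay for tracking used columns only when the AM asks for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an AM routine table (not the real TableAmRoutine). */
typedef struct SketchTableAm
{
    const char *name;
    bool        wants_column_projection;   /* hypothetical opt-in property */
} SketchTableAm;

/* Planner side: record the set of used columns only if the AM requested it. */
bool
sketch_should_track_columns(const SketchTableAm *am)
{
    assert(am != NULL);
    return am->wants_column_projection;
}

/*
 * A projecting AM wants the narrow target list, so the physical tlist
 * optimization (returning all columns as stored) is skipped for it.
 */
bool
sketch_use_physical_tlist(const SketchTableAm *am)
{
    return !sketch_should_track_columns(am);
}
```

With this shape, a row-oriented AM leaves the property unset and sees no
change in planning behavior or cost.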

> Another reason for having the planner do this is that presumably, in
> an AM that's excited about this, the set of fetched columns should
> play into the cost estimates for the scan. I've not been paying
> enough attention to the tableam work to know if we've got hooks for
> the AM to affect scan costing ... but if we don't, that seems like
> a hole that needs plugged.
>

The AM callback relation_estimate_size exists currently, and the planner
leverages it. Via this callback it fetches tuples, pages, etc. So our thought
is to extend this API, if possible, to pass down the needed columns and help
perform better costing for the query. Though we think that if we wish to
leverage this function, we need to know the list of columns before planning,
and hence might need to use the query tree.
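
As a rough illustration of why the needed columns matter for costing, here is
a standalone sketch of a column-aware page estimate. It is not the actual
relation_estimate_size signature; SketchRelStats, sketch_estimate_pages, the
per-column average widths, and the 8192-byte block size are all assumptions
made for the example, and it ignores compression and per-page overhead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BLOCK_SIZE 8192.0    /* assumed block size in bytes */

/* Hypothetical per-relation statistics for a column store. */
typedef struct SketchRelStats
{
    double        ntuples;      /* estimated number of rows */
    int           ncols;        /* number of columns */
    const double *avg_width;    /* average bytes per value, one per column */
} SketchRelStats;

/*
 * Estimate pages to read when only the columns flagged in needed[] are
 * fetched. Passing needed == NULL means "all columns".
 */
double
sketch_estimate_pages(const SketchRelStats *rel, const bool *needed)
{
    double bytes_per_row = 0.0;
    double pages;

    assert(rel != NULL && rel->ncols > 0);

    for (int i = 0; i < rel->ncols; i++)
    {
        if (needed == NULL || needed[i])
            bytes_per_row += rel->avg_width[i];
    }

    pages = rel->ntuples * bytes_per_row / SKETCH_BLOCK_SIZE;
    return pages < 1.0 ? 1.0 : pages;   /* never report less than one page */
}
```

For a table with one wide column and two narrow ones, a query touching only
a narrow column would be costed at a small fraction of the full-table pages.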

> > Approach B, however, does not work for utility statements which do
> > not go through planning.
>
> I'm not sure why you're excited about that case? Utility statements
> tend to be pretty much all-or-nothing as far as data access goes.
>

Statements like COPY, CREATE INDEX, CREATE CONSTRAINTS, etc. can benefit from
scanning only a subset of columns. For example, in Zedstore we currently
extract the needed columns for CREATE INDEX by walking
indexInfo->ii_Predicate and indexInfo->ii_Expressions. For COPY, we currently
use cstate->attnumlist to know which columns need to be scanned.
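
A simplified, standalone sketch of both ways of deriving the column set
follows. The SketchNode expression tree, the 64-column bitmask, and the
function names are hypothetical and stand in for PostgreSQL's Node trees and
Bitmapset; this is not the Zedstore code, only an illustration of walking an
expression versus reading an explicit attribute-number list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SK_VAR, SK_CONST, SK_OP } SketchNodeTag;

/* Hypothetical miniature expression node (stand-in for a real Node tree). */
typedef struct SketchNode
{
    SketchNodeTag            tag;
    int                      attno;   /* 1-based column number, SK_VAR only */
    const struct SketchNode *left;    /* operands, SK_OP only */
    const struct SketchNode *right;
} SketchNode;

/* CREATE INDEX style: walk an expression and OR in every referenced column. */
uint64_t
sketch_collect_columns(const SketchNode *node, uint64_t cols)
{
    if (node == NULL)
        return cols;
    if (node->tag == SK_VAR)
    {
        assert(node->attno >= 1 && node->attno <= 64);
        return cols | (UINT64_C(1) << (node->attno - 1));
    }
    cols = sketch_collect_columns(node->left, cols);
    return sketch_collect_columns(node->right, cols);
}

/* COPY style: the needed columns are already an explicit list of attnums. */
uint64_t
sketch_columns_from_attnums(const int *attnums, size_t n)
{
    uint64_t cols = 0;

    for (size_t i = 0; i < n; i++)
    {
        assert(attnums[i] >= 1 && attnums[i] <= 64);
        cols |= UINT64_C(1) << (attnums[i] - 1);
    }
    return cols;
}
```

For an index with both a predicate and expressions, the same mask would be
threaded through one walk per tree so that the scan fetches the union.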
