Re: Is it possible and worthy to optimize scanRTEForColumn()?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rui Hai Jiang <ruihaijiang(at)msn(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is it possible and worthy to optimize scanRTEForColumn()?
Date: 2017-12-08 19:48:54
Message-ID: 20171208194854.43azdxqsaj3pg33b@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2017-12-08 14:41:14 -0500, Tom Lane wrote:
> Yeah, if someone were holding a gun on me and saying "make that particular
> function faster", I'd think about a hash table rather than scanning a
> list. Perhaps a hash table with all the column names exposed by FROM,
> not one hash per RTE.

That sounds right.

> However, if you have a FROM that exposes a lot of column names, and
> then the query only looks up a few of them, you might come out behind
> by building a hash table :-(

Hm, I don't think that's that big of a deal - you don't need many
lookups to make a hashtable worthwhile if the alternative is exhaustive
scans through linked lists. I'd be more concerned about the pretty
common case we're most of the time hitting now, where there's just a
handfull of columns selected from about as many available columns, the
additional allocations and such might show up.

> I'm still unconvinced that this is the first place to improve for
> wide tables, anyway.

I've run a few profiles with wide columns lately, and on the read-mostly
side without prepared statements this is very commonly the biggest entry
by a lot. Over 90% of the time in one of them.

If the queries select a large number of the underlying rows tupledesc
computations, and their syscache lookups, become the bottleneck. That's
partially what lead me to microoptimize syscache ~two months back. The
real solution there is going to be to move tupledesc computations to the
planner, but that's a bigger piece of work than I can take on right now.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-12-08 19:54:49 Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com
Previous Message Tom Lane 2017-12-08 19:41:14 Re: Is it possible and worthy to optimize scanRTEForColumn()?