Re: benchmarking the query planner

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "jd(at)commandprompt(dot)com" <jd(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: benchmarking the query planner
Date: 2008-12-11 23:43:48
Message-ID: 13761.1229039028@sss.pgh.pa.us
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Thu, 2008-12-11 at 17:45 -0500, Tom Lane wrote:
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>>> I would like it even more if there was a data type specific default.
>>> Currently we have a special case for boolean, but that's it.
>>
>> No, we don't (or if we do I'd be interested to know where).

> Your commit, selfuncs.c, 7 Jul.

As with Robert's pointer, that's about coping with missing stats,
not about determining what stats to collect.

> ... neither of those were ones I was thinking about. I see 3 main classes:
> * data with small number of distinct values (e.g. boolean, smallint)
> * data with many distinct values
> * data where every value is typically unique (e.g. text)

These three categories are already dealt with in an entirely
type-independent fashion by the heuristics in compute_scalar_stats.
I think it's quite appropriate to drive them off the number of observed
values, not guesses about what a particular datatype is used for.
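
To make the "number of observed values" idea concrete, here is a minimal sketch in C. It is not the code in compute_scalar_stats; the names ColumnClass, cmp_int, and classify_sample are hypothetical, and the sample is assumed to be a plain array of ints. It sorts the sample, counts distinct values and values seen more than once, and picks one of the three classes from those counts alone, without consulting the datatype.

```c
#include <stdlib.h>

/* Hypothetical labels for the three classes discussed above. */
typedef enum
{
    FEW_DISTINCT,   /* every sampled value repeats: assume a small, closed set */
    MANY_DISTINCT,  /* some values repeat, some do not */
    ALL_UNIQUE      /* no value repeats in the sample */
} ColumnClass;

/* qsort comparator for ints, written to avoid overflow on subtraction. */
static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * Classify a sample using only the values observed in it.
 * Sorts the sample in place. An empty sample is reported as ALL_UNIQUE,
 * which is an arbitrary choice made for this illustration.
 */
static ColumnClass
classify_sample(int *sample, size_t n, size_t *ndistinct_out)
{
    size_t ndistinct = 0;   /* number of distinct values seen */
    size_t nmultiple = 0;   /* distinct values seen more than once */
    size_t i = 0;

    qsort(sample, n, sizeof(int), cmp_int);

    while (i < n)
    {
        size_t j = i + 1;

        /* Advance j past the run of values equal to sample[i]. */
        while (j < n && sample[j] == sample[i])
            j++;

        ndistinct++;
        if (j - i > 1)
            nmultiple++;
        i = j;
    }

    if (ndistinct_out != NULL)
        *ndistinct_out = ndistinct;

    if (nmultiple == 0)
        return ALL_UNIQUE;
    if (nmultiple == ndistinct)
        return FEW_DISTINCT;
    return MANY_DISTINCT;
}
```

With this sketch, a sample such as {1, 0, 1, 0, 1, 0} comes out as FEW_DISTINCT whether the column is declared boolean, smallint, or text, which is the point of driving the decision off the data rather than the type.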

regards, tom lane
