Re: anti-join chosen even when slower than old plan

From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: anti-join chosen even when slower than old plan
Date: 2011-01-20 16:36:05
Message-ID: AANLkTinj+3EybMFibRMtZ=Mu=nsOxH=QKMy450XKy5QX@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

2011/1/19 Bruce Momjian <bruce(at)momjian(dot)us>:
> Robert Haas wrote:
>> On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> > Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> >> Yeah. ?For Kevin's case, it seems like we want the caching percentage
>> >> to vary not so much based on which table we're hitting at the moment
>> >> but on how much of it we're actually reading.
>> >
>> > Well, we could certainly take the expected number of pages to read and
>> > compare that to effective_cache_size. ?The thing that's missing in that
>> > equation is how much other stuff is competing for cache space. ?I've
>> > tried to avoid having the planner need to know the total size of the
>> > database cluster, but it's kind of hard to avoid that if you want to
>> > model this honestly.
>>
>> I'm not sure I agree with that.  I mean, you could easily have a
>> database that is much larger than effective_cache_size, but only that
>> much of it is hot.  Or, the hot portion could move around over time.
>> And for reasons of both technical complexity and plan stability, I
>> don't think we want to try to model that.  It seems perfectly
>> reasonable to say that reading 25% of effective_cache_size will be
>> more expensive *per-page* than reading 5% of effective_cache_size,
>> independently of what the total cluster size is.
>
> Late reply, but one idea is to have the executor store hit counts for
> later use by the optimizer.  Only the executor knows how many pages it
> had to request from the kernel for a query.  Perhaps getrusage could
> tell us how often we hit the disk.

AFAIK getrusage does not provide access to real IO counters but
filesystem's ones. :-(

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2011-01-20 17:04:59 Re: the XID question
Previous Message Cédric Villemain 2011-01-20 16:16:05 Re: anti-join chosen even when slower than old plan