
Re: anti-join chosen even when slower than old plan

From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: anti-join chosen even when slower than old plan
Date: 2010-11-12 09:15:17
Message-ID: AANLkTimRqw+vd=83=RW3VbX8U_4fVJDpJZfMxa6uLvs3@mail.gmail.com
Lists: pgsql-performance
2010/11/11 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Yeah.  For Kevin's case, it seems like we want the caching percentage
>>> to vary not so much based on which table we're hitting at the moment
>>> but on how much of it we're actually reading.
>>
>> Well, we could certainly take the expected number of pages to read and
>> compare that to effective_cache_size.  The thing that's missing in that
>> equation is how much other stuff is competing for cache space.  I've
>> tried to avoid having the planner need to know the total size of the
>> database cluster, but it's kind of hard to avoid that if you want to
>> model this honestly.
>
> I'm not sure I agree with that.  I mean, you could easily have a
> database that is much larger than effective_cache_size, but only that
> much of it is hot.  Or, the hot portion could move around over time.
> And for reasons of both technical complexity and plan stability, I
> don't think we want to try to model that.  It seems perfectly
> reasonable to say that reading 25% of effective_cache_size will be
> more expensive *per-page* than reading 5% of effective_cache_size,
> independently of what the total cluster size is.
>
>> Would it be at all workable to have an estimate that so many megs of a
>> table are in cache (independently of any other table), and then we could
>> scale the cost based on the expected number of pages to read versus that
>> number?  The trick here is that DBAs really aren't going to want to set
>> such a per-table number (at least, most of the time) so we need a
>> formula to get to a default estimate for that number based on some simple
>> system-wide parameters.  I'm not sure if that's any easier.
>
> That's an interesting idea.  For the sake of argument, suppose we
> assume that a relation which is less than 5% of effective_cache_size
> will be fully cached; and for anything larger, we'll assume that much
> of it is cached.  Consider a 4GB machine with effective_cache_size set
> to 3GB.  Then we'll assume that any relation smaller than 153MB is
> 100% cached, a 1GB table is 15% cached, and a 3GB table is 5%
> cached.  That doesn't seem quite right, though: the caching percentage
> drops off very quickly after you exceed the threshold.
>
> *thinks*
>
> I'm wondering if we could do something with a formula like 3 *
> amount_of_data_to_read / (3 * amount_of_data_to_read +
> effective_cache_size) = percentage NOT cached.  That is, if we're
> reading an amount of data equal to effective_cache_size, we assume 25%
> caching, and plot a smooth curve through that point.  In the examples
> above, we would assume that a 150MB read is 87% cached, a 1GB read is
> 50% cached, and a 3GB read is 25% cached.


But isn't that already the behavior of the effective_cache_size usage?

See index_pages_fetched() in costsize.c


>
>> BTW, it seems that all these variants have an implicit assumption that
>> if you're reading a small part of the table it's probably part of the
>> working set; which is an assumption that could be 100% wrong.  I don't
>> see a way around it without trying to characterize the data access at
>> an unworkably fine level, though.
>
> Me neither, but I think it will frequently be true, and I'm not sure
> it will hurt very much when it isn't.  I mean, if you execute the same
> query repeatedly, that data will become hot soon enough.  If you
> execute a lot of different queries that each touch a small portion of
> a big, cold table, we might underestimate the costs of the index
> probes, but so what?  There's probably no better strategy for
> accessing that table anyway.  Perhaps you can construct an example
> where this underestimate affects the join order in an undesirable
> fashion, but I'm having a hard time getting worked up about that as a
> potential problem case.  Our current system - where we essentially
> assume that the caching percentage is uniform across the board - can
> have the same problem in less artificial cases.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
