
Re: anti-join chosen even when slower than old plan

From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: anti-join chosen even when slower than old plan
Date: 2010-11-12 09:15:17
Message-ID: AANLkTimRqw+vd=83=RW3VbX8U_4fVJDpJZfMxa6uLvs3@mail.gmail.com
Lists: pgsql-performance
2010/11/11 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Yeah.  For Kevin's case, it seems like we want the caching percentage
>>> to vary not so much based on which table we're hitting at the moment
>>> but on how much of it we're actually reading.
>>
>> Well, we could certainly take the expected number of pages to read and
>> compare that to effective_cache_size.  The thing that's missing in that
>> equation is how much other stuff is competing for cache space.  I've
>> tried to avoid having the planner need to know the total size of the
>> database cluster, but it's kind of hard to avoid that if you want to
>> model this honestly.
>
> I'm not sure I agree with that.  I mean, you could easily have a
> database that is much larger than effective_cache_size, but only that
> much of it is hot.  Or, the hot portion could move around over time.
> And for reasons of both technical complexity and plan stability, I
> don't think we want to try to model that.  It seems perfectly
> reasonable to say that reading 25% of effective_cache_size will be
> more expensive *per-page* than reading 5% of effective_cache_size,
> independently of what the total cluster size is.
>
>> Would it be at all workable to have an estimate that so many megs of a
>> table are in cache (independently of any other table), and then we could
>> scale the cost based on the expected number of pages to read versus that
>> number?  The trick here is that DBAs really aren't going to want to set
>> such a per-table number (at least, most of the time) so we need a
>> formula to get to a default estimate for that number based on some simple
>> system-wide parameters.  I'm not sure if that's any easier.
>
> That's an interesting idea.  For the sake of argument, suppose we
> assume that a relation which is less than 5% of effective_cache_size
> will be fully cached; and for anything larger, we'll assume that much
> of it is cached.  Consider a 4GB machine with effective_cache_size set
> to 3GB.  Then we'll assume that any relation smaller than 153MB is
> 100% cached, a 1GB table is 15% cached, and a 3GB table is 5%
> cached.  That doesn't seem quite right, though: the caching percentage
> drops off very quickly after you exceed the threshold.
>
> *thinks*
>
> I'm wondering if we could do something with a formula like 3 *
> amount_of_data_to_read / (3 * amount_of_data_to_read +
> effective_cache_size) = percentage NOT cached.  That is, if we're
> reading an amount of data equal to effective_cache_size, we assume 25%
> caching, and plot a smooth curve through that point.  In the examples
> above, we would assume that a 150MB read is 87% cached, a 1GB read is
> 50% cached, and a 3GB read is 25% cached.


But isn't that already the behavior of the effective_cache_size usage?

See index_pages_fetched() in costsize.c


>
>> BTW, it seems that all these variants have an implicit assumption that
>> if you're reading a small part of the table it's probably part of the
>> working set; which is an assumption that could be 100% wrong.  I don't
>> see a way around it without trying to characterize the data access at
>> an unworkably fine level, though.
>
> Me neither, but I think it will frequently be true, and I'm not sure
> it will hurt very much when it isn't.  I mean, if you execute the same
> query repeatedly, that data will become hot soon enough.  If you
> execute a lot of different queries that each touch a small portion of
> a big, cold table, we might underestimate the costs of the index
> probes, but so what?  There's probably no better strategy for
> accessing that table anyway.  Perhaps you can construct an example
> where this underestimate affects the join order in an undesirable
> fashion, but I'm having a hard time getting worked up about that as a
> potential problem case.  Our current system - where we essentially
> assume that the caching percentage is uniform across the board - can
> have the same problem in less artificial cases.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
