Re: Erroneous cost estimation for nested loop join

From:	KAWAMICHI Ryoji <kawamichi(at)tkl(dot)iis(dot)u-tokyo(dot)ac(dot)jp>
To:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: Erroneous cost estimation for nested loop join
Date:	2015-11-30 07:29:43
Message-ID:	1883234852.1782871.1448868583436.JavaMail.zimbra@tkl.iis.u-tokyo.ac.jp
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> - If we're sequential scanning a small table, let's say less than 1/4
> of shared_buffers, which is the point where synchronized scans kick
> in, then assume the data is coming from shared_buffers.
> - If we're scanning a medium-sized table, let's say less than
> effective_cache_size, then assume the data is coming from the OS
> cache. Maybe this is the same cost as the previous case, or maybe
> it's slightly more.
> - Otherwise, assume that the first effective_cache_size pages are
> coming from cache and the rest has to be read from disk. This is
> perhaps unrealistic, but we don't want the cost curve to be
> discontinuous.

I think this improvement is quite reasonable, and I expect it will be
merged into the current optimizer code.
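
As a rough illustration of the three cases quoted above, the proposal
could be sketched like this. This is not PostgreSQL code: the function
name seq_scan_io_cost and the three per-page cost constants are
hypothetical placeholders chosen only for the example, and all sizes are
assumed to be expressed in pages.

```python
def seq_scan_io_cost(table_pages,
                     shared_buffers_pages,
                     effective_cache_size_pages,
                     buffer_page_cost=0.1,    # hypothetical: page found in shared_buffers
                     os_cache_page_cost=0.1,  # hypothetical: page found in the OS cache
                     disk_page_cost=1.0):     # hypothetical: page read from disk
    """Illustrative I/O cost of a sequential scan under the three quoted cases."""
    if table_pages < 0:
        raise ValueError("table_pages must be non-negative")

    # Case 1: small table (less than 1/4 of shared_buffers),
    # assume every page comes from shared_buffers.
    if table_pages < shared_buffers_pages / 4:
        return table_pages * buffer_page_cost

    # Case 2: medium-sized table (less than effective_cache_size),
    # assume every page comes from the OS cache.
    if table_pages < effective_cache_size_pages:
        return table_pages * os_cache_page_cost

    # Case 3: large table, assume the first effective_cache_size pages
    # are cached and the remainder is read from disk.
    cached_pages = effective_cache_size_pages
    disk_pages = table_pages - cached_pages
    return cached_pages * os_cache_page_cost + disk_pages * disk_page_cost
```

With these placeholder defaults the first two cases use the same per-page
cost, so the curve has no jump at the 1/4 shared_buffers boundary, and the
third case reduces to the second when the table is exactly
effective_cache_size pages. If the OS cache cost were set slightly higher
than the shared_buffers cost, this literal version would show a small step
at the first boundary.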

> A problem with this sort of thing, of course, is that it's really hard
> to test a proposed change broadly enough to be certain how it will
> play out in the real world.

That’s the problem we’re really interested in and trying to tackle.

For example, based on extensive experiments, I’m confident that my
modification of the cost model is effective in our environment, but I
can’t tell whether it would be beneficial or harmful in other environments.

I also think that the postgres community must have some (perhaps buried)
knowledge of how to judge the effectiveness of cost model modifications,
because someone must have considered that question at each commit.
I’m interested in that knowledge, and I would like to contribute to finding
a better way to improve the optimizer through cost model refinement.

Thanks.
Ryoji.

In response to

Re: Erroneous cost estimation for nested loop join at 2015-11-17 20:37:12 from Robert Haas

Responses

Re: Erroneous cost estimation for nested loop join at 2015-12-03 01:42:10 from Bruce Momjian
