Re: anti-join chosen even when slower than old plan

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: anti-join chosen even when slower than old plan
Date: 2010-11-10 22:43:43
Message-ID: 4CDACBBF020000250003755B@gw.wicourts.gov
Lists: pgsql-performance

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Wow. That's fascinating, and if you don't mind, I might mention
> this potential problem in a future talk at some point.

I don't mind at all.

> For example, in your case, it would be sufficient to estimate the
> amount of data that a given query is going to grovel through and
> then apply some heuristic to choose values for random_page_cost
> and seq_page_cost based on the ratio of that value to, I don't
> know, effective_cache_size.

That's where my day-dreams on the topic have been starting.
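
To make that concrete, here's a rough sketch of what such a
heuristic might look like -- purely illustrative standalone C, not
anything that exists in the planner, with made-up constants standing
in for the GUCs:

    #include <stdio.h>

    /* Illustrative stand-ins for the GUCs; not real planner variables. */
    static const double effective_cache_size_pages = 262144.0; /* ~2GB in 8kB pages */
    static const double seq_page_cost = 1.0;                   /* default setting */
    static const double random_page_cost = 4.0;                /* default setting */

    /*
     * Interpolate between seq_page_cost and random_page_cost according
     * to the fraction of the pages a query will touch that could
     * plausibly be cached.  This assumes a cached "random" fetch costs
     * about the same as a sequential one, itself a rough approximation.
     */
    static double
    adjusted_random_page_cost(double pages_to_grovel_through)
    {
        double cached_fraction =
            effective_cache_size_pages / pages_to_grovel_through;

        if (cached_fraction > 1.0)
            cached_fraction = 1.0;
        return cached_fraction * seq_page_cost
               + (1.0 - cached_fraction) * random_page_cost;
    }

    int
    main(void)
    {
        /* A scan that fits in cache gets a near-sequential cost ... */
        printf("%.2f\n", adjusted_random_page_cost(10000.0));
        /* ... one far larger than the cache stays near the default. */
        printf("%.2f\n", adjusted_random_page_cost(4000000.0));
        return 0;
    }

Of course, pages_to_grovel_through is exactly the number we can't
know until we've already picked a plan -- which is where you go next.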

> Unfortunately, to know how much data we're going to grovel
> through, we need to know the plan; and to decide on the right
> plan, we need to know how much data we're going to grovel through.

And that's where they've been ending.

The only half-sane answer I've thought of is to apply a different
cost to full-table or full-index scans based on the ratio of the
relation's size to effective_cache_size. A higher cost for such
scans is something I've already postulated might be worthwhile for
SSI, because of the increased risk of rw-conflicts, which could
ultimately contribute to serialization failures -- an attempt to
model, at least crudely, the costs associated with transaction
retry.
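
For what it's worth, the crude shape of that idea, again as purely
hypothetical standalone C (none of these names or constants come
from costsize.c or the SSI work):

    #include <stdio.h>

    static const double effective_cache_size_pages = 262144.0; /* ~2GB in 8kB pages */

    /*
     * Penalize full-table or full-index scans in proportion to how far
     * the relation overruns effective_cache_size, as a crude proxy both
     * for buffer-cache pressure and, under SSI, for the wider
     * rw-conflict footprint that raises the odds of serialization
     * failure and transaction retry.
     */
    static double
    full_scan_cost(double base_cost, double rel_pages)
    {
        double overrun = rel_pages / effective_cache_size_pages;

        if (overrun <= 1.0)
            return base_cost;       /* fits in cache: no penalty */
        return base_cost * overrun; /* penalty grows with the overrun */
    }

    int
    main(void)
    {
        printf("%.0f\n", full_scan_cost(1000.0, 100000.0));  /* cached: 1000 */
        printf("%.0f\n", full_scan_cost(1000.0, 2621440.0)); /* 10x cache: 10000 */
        return 0;
    }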

-Kevin
