Re: cost_rescan (was: match_unsorted_outer() vs. cost_nestloop())

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cost_rescan (was: match_unsorted_outer() vs. cost_nestloop())
Date: 2010-04-19 17:24:45
Message-ID: 27752.1271697885@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> One problem with the current implementation of cost_rescan() is that
> it ignores caching effects.

Well, that's intentional, per the head comment for the function.
We might want to extend it later but I'd like to get some field
experience with what it's trying to model now. I believe that it
is covering the first-order effects, and possible cache effects
would be second-order.

> It seems to be faster to rescan a
> materialize node than it is to rescan a seqscan of a table, even if
> there are no restriction clauses, presumably because you get to skip
> tuple visibility checks and maybe some other overhead, too.

Exactly. IIRC, tuplestore's on-disk representation is also more compact
(less header overhead, no dead tuples, etc) so the amount of I/O needed
will also be less, if you're doing any at all. But the code already
knows that scanning a tuplestore is cheaper than scanning a table ---
that doesn't seem to me to be relevant to the question of whether we
need to model cache effects in cost_rescan.

> Another potential problem is that materializing a whole-table seqscan
> to avoid repeating the tuple visibility checks may be a win in some
> strict sense, but there are externalities: it's also going to use a
> lot more memory/disk than just rescanning the table.

This is not specific to materialize, it is part of the generic problem
that we don't model the true costs of using work_mem in each of several
parts of a query. There have been discussions about how to fix that
before, but no particularly good ideas have emerged.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2010-04-19 18:17:56 Re: Re: [COMMITTERS] pgsql: Tune GetSnapshotData() during Hot Standby by avoiding loop
Previous Message Tom Lane 2010-04-19 17:01:04 Re: Standalone backends run StartupXLOG in an incorrect environment