Re: SeqScan costs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SeqScan costs
Date: 2008-08-13 03:22:16
Message-ID: 8070.1218597736@sss.pgh.pa.us
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> On Tue, 2008-08-12 at 15:46 -0400, Tom Lane wrote:
>>> This is only going to matter for a table of 1 block (or at least very
>>> few blocks), and for such a table it's highly likely that it's in RAM
>>> anyway. So I'm unconvinced that the proposed change represents a
>>> better model of reality.

> I think the first block of a sequential scan is clearly a random access. If
> that doesn't represent reality well then perhaps we need to tackle both
> problems together.

The point I was trying to make (evidently not too well) is that fooling
around with fundamental aspects of the cost models is not something that
should be done without any evidence. We've spent ten years getting the
system to behave reasonably well with the current models, and it's quite
possible that changing them to be "more accurate" according to a
five-minute analysis is going to make things markedly worse overall.

I'm not necessarily opposed to making this change --- it does sound
kinda plausible --- but I want to see some hard evidence that it does
more good than harm before we put it in.
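
For concreteness, the proposed change amounts to roughly the following I/O
charge (a sketch only, not the actual costsize.c code; the function name and
standalone form are just for illustration):

#include <stdio.h>

/*
 * Sketch of the proposed seqscan I/O cost: charge the first block at
 * random_page_cost (the initial seek) and the remaining blocks at
 * seq_page_cost.  The current model charges pages * seq_page_cost.
 */
static double
proposed_seqscan_io_cost(double pages, double random_page_cost,
                         double seq_page_cost)
{
    if (pages <= 0)
        return 0.0;
    return random_page_cost + (pages - 1.0) * seq_page_cost;
}

int
main(void)
{
    /*
     * With the default seq_page_cost = 1.0 and random_page_cost = 4.0,
     * a 1-block table goes from 1.0 to 4.0, while a 1000-block table
     * only moves from 1000.0 to 1003.0 -- only tiny tables are affected.
     */
    printf("1 page:     %.1f\n", proposed_seqscan_io_cost(1, 4.0, 1.0));
    printf("1000 pages: %.1f\n", proposed_seqscan_io_cost(1000, 4.0, 1.0));
    return 0;
}

So the whole question is whether 1-block tables really are modeled better
that way, which is exactly what needs evidence.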

> People lower random_page_cost because we're not doing a good job estimating
> how much of a table is in cache.

Agreed, the elephant in the room is that we lack enough data to model
caching effects with any degree of realism.

regards, tom lane
