Quick Links

Re: More tablescanning fun

From:	"Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: More tablescanning fun
Date:	2003-04-25 14:38:01
Message-ID:	20030425093800.C66185@flake.decibel.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On Fri, Apr 25, 2003 at 01:23:10AM -0400, Tom Lane wrote:
> I believe the endpoints s*T and PF*random_cost, I think, but the curve
> between them is anyone's guess. It's also quite possible that the
> correlation stat that we currently compute is inadequate to model what's
> going on.

In this case, the interpolation can't be at fault, because correlation
is 1 (unless the interpolation is backwards, but that doesn't appear to
be the case).

One possibility is that IndexSelectivity isn't taking
most_common_(vals|freqs) into account.

Looking at this from an idea case, most (or all) of this query should be
retrieved by simply incrementing through both the index and the tuples
at the same time. We should end up pulling 0.7% of the index and raw
pages combined. Analyze thinks that using the index will be about 50%
more expensive, though. (3258557 v. 2274866)

A thought that comes to mind here is that it would be incredible if
pgsql could take metrics of how long things actually take on a live
system and incorporate them... basically learning as it goes. A first
step in this case would be to keep tabs on how close real page-read
counts come to what the optimizer predicted, and storing that for later
analysis. This would make it easier for you to verify your linear
correlation assumption, for example (it'd also make it easier to
validate the PF formula).

> >> No. It's not apparent to me how you could do that without abandoning
> >> MVCC, which we're not likely to do.
>
> > Hmm... does MVCC mandate inserts go at the end?
>
> Anywhere that there's free space. The point is that you can't promise
> updates will fit on the same page as the original tuple. So whatever
> desirable physical ordering you may have started with will surely
> degrade over time.

Yes, updates are the tricky part to clustered indexes, and MVCC might
make it harder. What Sybase 11 (which only supports page locking) does
is see if the update moves the tuple off it's current page. If it
doesn't, it just shuffles the page around as needed and goes on with
business. If it needs to move, it grabs (and locks) the page it needs to
move to, inserts it on that page (possibly incurring a page split), and
deletes it from the old page. My guess is that with MVCC, you can't
simply delete the old tuple... you'd have to leave some kind of 'bread
crumb' behind for older transactions to see (though, I guess this would
already have to be happening somehow).

The reason to do this in this case is well worth it though... we end up
with one table (simplifies code) that should essentially act as if it
was multiple (5 in this case) tables, so performance should still be
very good.

> > On the other hand, it might be possible to get the advantages of a
> > clustered index without doing a *true* clustered index. The real point
> > is to be able to use indexes; I've heard things like 'if you need to
> > access more than 10% of a table then using an index would be
> > disasterous', and that's not good... that number should really be over
> > 50% for most reasonable ratios of fields indexed to fields in table (of
> > course field size plays a factor).
>
> If you have to read 50% of a table, you certainly should be doing a
> linear scan. There will be hardly any pages you can skip (unless the
> table is improbably well clustered), and the extra I/O needed to read
> the index will buy you nothing.

Yes, and it's that 'improbably well clustered' case that I have here. :)
But even if you're only 25% clustered, I think you'll still see a huge
gain on a very large table, especially if the index tuples are
substantially smaller than the raw tuples (which they normally should
be).
--
Jim C. Nasby (aka Decibel!) jim(at)nasby(dot)net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

In response to

Re: More tablescanning fun at 2003-04-25 05:23:10 from Tom Lane

Responses

Re: More tablescanning fun at 2003-04-30 14:14:46 from Manfred Koizar

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Jim C. Nasby	2003-04-25 15:09:05	Re: Important speed difference between a query and a function with the same query
Previous Message	Hannu Krosing	2003-04-25 07:01:31	Re: Important speed difference between a query and a