Re: question about index cost estimates

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Hoffmann <jeff(at)propertykey(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: question about index cost estimates
Date: 2000-05-18 02:35:33
Message-ID: 21125.958617333@sss.pgh.pa.us
Lists: pgsql-hackers

Jeff Hoffmann <jeff(at)propertykey(dot)com> writes:
> I understand that, but still, if you have a tuple that's, say, 144 bytes,
> you're going to have a higher chance of revisiting the same page than if
> the tuple is 4K.

Are you? If you are pulling ten tuples from a thousand-page relation,
seems to me the odds of revisiting any one page are somewhere around
0.01 (too lazy to do the exact combinatorics right now). Doesn't really
matter what the average tuple size is --- or if you prefer, we already
accounted for that because we are looking at target tuples divided by
total pages rather than target tuples divided by total tuples.
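
To put a number on the hand-waving, here is a small standalone sketch
(mine, not part of the original exchange) using the standard
uniform-placement approximation: fetching k random tuples from an N-page
relation touches about N * (1 - (1 - 1/N)^k) distinct pages. For k = 10
and N = 1000 the expected number of revisits works out to roughly 0.045,
the same ballpark as the 0.01 guess, and tuple size never enters into it.

    /* Sketch only: uniform-placement estimate of page revisits. */
    #include <stdio.h>
    #include <math.h>

    int
    main(void)
    {
        double  N = 1000.0;     /* pages in the relation */
        double  k = 10.0;       /* tuples fetched at random */
        double  distinct = N * (1.0 - pow(1.0 - 1.0 / N, k));

        printf("expected distinct pages: %.3f\n", distinct);      /* ~9.955 */
        printf("expected revisits:       %.3f\n", k - distinct);  /* ~0.045 */
        return 0;
    }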

> Maybe my question should be about the slope of the curve. It turns out
> that, playing around with my queries, when I was selecting about 10% of
> the records the page fetch estimate was pretty accurate, although when
> the selectivity falls significantly below that, the page fetch estimate
> stays quite a bit higher than the actual count.

Hmm. I would think that the behavior for very low selectivity ought to
be pretty close to the model: it's hard to see how it can be anything
but one page fetch per selected tuple, unless you are seeing strong
clustering effects. I do *not* claim that the slope of the curve is
necessarily right for higher selectivities though... that needs to be
looked at.
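
For what it's worth, tabulating the same uniform model sketched above over
a range of selectivities shows the kind of curve in question: essentially
one page fetch per selected tuple at very low selectivity, bending over
toward "every page fetched once" as selectivity climbs. (Again, this
illustrates the generic model only, not necessarily the exact formula the
current cost estimator uses; the table sizes are hypothetical.)

    /* Illustration: uniform-model page fetches across selectivities. */
    #include <stdio.h>
    #include <math.h>

    int
    main(void)
    {
        double  pages = 1000.0;     /* hypothetical relation size */
        double  tuples = 100000.0;
        double  sel;

        for (sel = 0.0001; sel <= 0.2; sel *= 2.0)
        {
            double  k = tuples * sel;
            double  fetched = pages * (1.0 - pow(1.0 - 1.0 / pages, k));

            printf("sel %.4f: %6.0f tuples -> %7.1f page fetches (%.2f per tuple)\n",
                   sel, k, fetched, fetched / k);
        }
        return 0;
    }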

> Part of what I'm doing is looking at the executor stats for each query.
> Can you explain what the shared blocks, local blocks, and direct blocks
> mean?

The filesystem blocks in/out are from the kernel getrusage() call.
On my box, at least, these seem to mean physical I/O operations
initiated on behalf of the process --- AFAICT, touching a page that
is already in kernel disk buffers does not increment the filesystem
blocks count.
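
As a rough illustration of where those numbers come from (my sketch, not
the actual statistics code): getrusage() reports them in the ru_inblock
and ru_oublock fields, which on most systems count only real block I/O
performed on the process's behalf.

    #include <stdio.h>
    #include <sys/resource.h>

    int
    main(void)
    {
        struct rusage ru;

        /* RUSAGE_SELF: resource usage for the calling process only */
        if (getrusage(RUSAGE_SELF, &ru) == 0)
            printf("filesystem blocks in: %ld, out: %ld\n",
                   ru.ru_inblock, ru.ru_oublock);
        return 0;
    }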

The other numbers are from PrintBufferUsage() in bufmgr.c. It looks
like the shared block counts are the number of block read and write
requests made to the kernel by bufmgr.c, and the hit rate indicates
the fraction of buffer fetch requests made to bufmgr.c that were
satisfied in Postgres' own buffer area (ie, without a kernel request).
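
In other words, the hit rate is just the fraction of bufmgr fetch requests
that never had to go to the kernel. A toy version of the arithmetic (my
reading of it, not a verbatim excerpt from PrintBufferUsage, and with
made-up counts) would be:

    #include <stdio.h>

    int
    main(void)
    {
        long    buffer_fetches = 10000; /* hypothetical: fetches asked of bufmgr.c */
        long    kernel_reads = 800;     /* hypothetical: reads passed to the kernel */
        double  hit_rate = (double) (buffer_fetches - kernel_reads) / buffer_fetches;

        printf("shared blocks read: %ld, buffer hit rate: %.1f%%\n",
               kernel_reads, hit_rate * 100.0);
        return 0;
    }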

The local block counts are the same numbers for non-shared relations,
which basically means tables created in the current transaction.
They'll probably be zero in most cases of practical interest.

The "direct" counts seem to be broken at the moment --- I can't find
any code that increments them. It looks like the intent was to count
block I/O operations on temporary files (sorttemp and suchlike).
This is not of interest for pure indexscans...

> Are the shared blocks shared amongst all of the backends, or does each
> have its own pool?

Shared among all --- that's what the shared memory block is (mostly)
for...

> I'm getting a 90%+ buffer hit rate on index scans, but I'm assuming
> that's because I'm the only one doing anything on this machine right now,
> and that would go down with more processes.

It also suggests that your test table isn't much bigger than the buffer
cache ;-) --- or at least the part of it that you're touching isn't.

> I'm still taking everything with a grain of salt until I can explain it,
> though.

Good man. I don't think anyone but me has looked at this stuff in a
long while, and it could use review.

regards, tom lane
