Re: Further open item (Was: Status of 7.2)

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: "Tille, Andreas" <TilleA(at)rki(dot)de>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Subject: Re: Further open item (Was: Status of 7.2)
Date: 2001-11-19 13:54:01
Message-ID: 3BF90EF9.7040608@tm.ee
Lists: pgsql-hackers

Tille, Andreas wrote:

>On Fri, 16 Nov 2001, Bruce Momjian wrote:
>
>>I personally would like to have index scans that look up heap rows
>>record the heap expired status into the index entry via one bit of
>>storage. This will not _prevent_ checking the heap but it will prevent
>>heap lookups for index entries that have been expired for a long time.
>>However, with the new vacuum, and perhaps autovacuum coming soon, there
>>may be little need for this optimization.
>>
>>The underlying problem the user is seeing is how to _know_ an index
>>tuple is valid without checking the heap,
>>
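The hint-bit idea quoted above could be sketched roughly as follows. This is an illustrative Python model only, not PostgreSQL internals; all names (IndexEntry, known_dead, index_scan) are invented for the sketch:

```python
# Sketch of the "expired" hint bit: an index scan that visits the heap
# and finds a tuple dead records that fact in the index entry, so later
# scans can skip the heap fetch for that entry entirely.

class IndexEntry:
    def __init__(self, key, heap_tid):
        self.key = key
        self.heap_tid = heap_tid
        self.known_dead = False   # the single extra bit of storage

def index_scan(entries, key, heap_tuple_is_visible):
    """Yield heap TIDs matching `key`, setting the hint bit on dead entries."""
    for e in entries:
        if e.key != key:
            continue
        if e.known_dead:          # long-expired: no heap lookup needed
            continue
        if heap_tuple_is_visible(e.heap_tid):
            yield e.heap_tid
        else:
            e.known_dead = True   # remember the death for future scans

# Usage: a heap where TID 2 is an expired tuple.
entries = [IndexEntry("a", 1), IndexEntry("a", 2)]
visible = lambda tid: tid != 2
assert list(index_scan(entries, "a", visible)) == [1]
assert entries[1].known_dead      # a repeat scan now skips the heap
```

Note the bit only helps on the second and later scans; the first scan still pays for the heap visit, which is why the heap check itself is not prevented.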
I'd propose a memory-only (or heavily cached) structure of tuple death
transaction ids for all transactions since the oldest live trx. When
that oldest transaction finishes, the tombstone marks for all tuples
deleted between the old and the new oldest are moved to the relevant
indexes (or the index keys are deleted) by concurrent vacuum or a
similar process.
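A minimal sketch of that tombstone structure, assuming the horizon is just an xid compared with `<` (the class and method names are invented for illustration, not anything in PostgreSQL):

```python
# In-memory tombstone tracker: record the deleting xid for each tuple
# killed since the oldest live transaction; when the horizon advances,
# flush every tombstone no live transaction can still see to the index.

class TombstoneTracker:
    def __init__(self):
        self.pending = []            # (death_xid, tuple_id) pairs

    def record_death(self, xid, tuple_id):
        self.pending.append((xid, tuple_id))

    def advance_horizon(self, new_oldest_live_xid, mark_dead_in_index):
        """Flush tombstones older than the new horizon; keep the rest."""
        flushed, still_pending = [], []
        for xid, tid in self.pending:
            if xid < new_oldest_live_xid:
                mark_dead_in_index(tid)   # done by concurrent vacuum etc.
                flushed.append(tid)
            else:
                still_pending.append((xid, tid))
        self.pending = still_pending
        return flushed

# Usage: deaths at xids 100 and 105; the horizon moves from 90 to 103.
tracker = TombstoneTracker()
tracker.record_death(100, "t1")
tracker.record_death(105, "t2")
assert tracker.advance_horizon(103, lambda tid: None) == ["t1"]
assert tracker.pending == [(105, "t2")]
```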

We could even try to use the index itself as that structure by favoring
changed index pages when making caching decisions. It is much safer to
cache indexes than it is to cache data pages: for indexes we only need
to detect (by keeping info in WAL, for example) that an index is broken,
not what it contained, since it can always be rebuilt after a crash.
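The caching preference above could look something like this; the scoring function and the fixed bonus are purely illustrative assumptions, not a proposal for the actual buffer manager:

```python
# Eviction policy sketch: plain LRU, but index pages get a bonus that
# keeps them cached longer, since a lost index is detectable (via WAL
# info) and rebuildable, while lost data pages are not.

def eviction_score(page):
    """Lower score = better eviction candidate."""
    score = page["last_used"]          # plain LRU component
    if page["kind"] == "index":
        score += 10                    # favor keeping index pages cached
    return score

def choose_victim(pages):
    return min(pages, key=eviction_score)

pages = [
    {"id": "data-1",  "kind": "data",  "last_used": 5},
    {"id": "index-1", "kind": "index", "last_used": 1},
]
# Despite being least recently used, the index page survives.
assert choose_victim(pages)["id"] == "data-1"
```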

The problem with using an index for this is _which_ index to use when
there are many per table. Perhaps a good choice would be the PRIMARY KEY.

OTOH, keeping this info in an index rather than in a dedicated structure
makes the amount of data needing to be cached considerably bigger, and
thus the whole operation more expensive.

>> and I don't see how to do that
>>unless we start storing the transaction id in the index tuple, and that
>>requires extra storage.
>>
>For my special case I think doubling main memory is about the same
>price as a MS SQL server license. I can't say which further problems
>might occur.
>
Then you must have really huge amounts of memory already ;)

------------------
Hannu
