HOT patch - version 15

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: HOT patch - version 15
Date: 2007-09-04 16:31:20
Message-ID: 2e78013d0709040931j25f9d964n41c3d78951037bcc@mail.gmail.com
Lists: pgsql-patches

On 9/1/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> I see this the other way around: if it doesn't work on system catalogs,
> it probably doesn't work, period. I'm not in favor of treating the
> catalogs differently.

Please see the revised patches attached. The combo patch applies cleanly
over current HEAD, and the incremental patch contains the changes since
the last patch. The patch passes all but three regression tests; the
failures are cosmetic in nature.

Although I am comfortable with the patch, I wanted to send this out today
because tomorrow is my last working day before I go on leave for the rest
of the week. I shall try to respond to mail and work a bit during that
period, but I won't be as active as usual. I can fix any minor leftover
things tomorrow, though.

The patch now supports HOT updates on system catalogs. This requires us
to wait for the deleting/inserting transaction if we encounter a
DELETE_IN_PROGRESS/INSERT_IN_PROGRESS tuple while building an index.
I was worried that this could add a new deadlock scenario, but I believe
system catalog reindexing is not very common, so maybe we can live
with that.
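The wait decision during the index build could look roughly like the
sketch below. This is a self-contained model, not the actual index-build
code; the enum and function names are invented for illustration:

```c
#include <stdbool.h>

/* Simplified tuple states, in the style of HeapTupleSatisfiesVacuum results */
typedef enum
{
    TUPLE_LIVE,
    TUPLE_DEAD,
    TUPLE_RECENTLY_DEAD,
    TUPLE_INSERT_IN_PROGRESS,
    TUPLE_DELETE_IN_PROGRESS
} TupleState;

/*
 * For a system catalog index build we cannot simply skip in-progress
 * tuples: the inserting/deleting transaction may commit after the index
 * is built, leaving the index inconsistent.  So for catalogs we block
 * until that transaction finishes and then re-examine the tuple.
 */
bool
must_wait_for_xact(TupleState state, bool is_system_catalog)
{
    if (!is_system_catalog)
        return false;           /* non-catalog relations keep the old behaviour */

    return (state == TUPLE_INSERT_IN_PROGRESS ||
            state == TUPLE_DELETE_IN_PROGRESS);
}
```

The deadlock risk mentioned above comes from this wait: the index builder
sleeps on another backend's transaction while possibly holding locks that
backend wants.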

Another thing that worries me is the handling of RECENTLY_DEAD tuples
while rebuilding system catalog indexes. For non-system relations, we
don't make the new index available to queries if any RECENTLY_DEAD
tuples were skipped while building the index. We could do the same for
system catalogs, though it may need more work to make that happen.
But since system catalogs are always scanned with SnapshotNow, do we
need to worry about RECENTLY_DEAD tuples at all? ISTM that we should
never return RECENTLY_DEAD tuples under SnapshotNow. Does this sound OK?
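The intuition can be made concrete with a toy visibility model (purely
illustrative struct and function; the real logic lives in
HeapTupleSatisfiesNow): a RECENTLY_DEAD tuple by definition has a
committed deleter, and a committed deleter makes a tuple invisible under
SnapshotNow, so a SnapshotNow-only consumer can never see one.

```c
#include <stdbool.h>

typedef unsigned int TransactionId;
#define InvalidTransactionId 0

/* Minimal model of the tuple header fields relevant to visibility */
typedef struct
{
    TransactionId xmin;           /* inserting transaction */
    TransactionId xmax;           /* deleting transaction, or invalid */
    bool          xmin_committed;
    bool          xmax_committed;
} MiniTuple;

/*
 * SnapshotNow semantics, much simplified: visible iff the inserter
 * committed and the deleter (if any) has NOT committed.  A tuple whose
 * deleter committed (the RECENTLY_DEAD case) always fails the second
 * test, regardless of whether older snapshots could still see it.
 */
bool
visible_under_snapshot_now(const MiniTuple *tup)
{
    if (!tup->xmin_committed)
        return false;
    if (tup->xmax != InvalidTransactionId && tup->xmax_committed)
        return false;
    return true;
}
```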

The changes to the heap_update() signature are reverted. The caller can
instead check whether the new tuple satisfies HeapTupleIsHeapOnly, which
signifies that the update was a HOT update. This has also helped limit
the changes needed to support system catalogs, because there are several
invocations of simple_heap_update() followed by CatalogUpdateIndexes().
We can now check for a heap-only tuple in CatalogUpdateIndexes() and
skip the index inserts for HOT-updated tuples.
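A rough model of that check: HeapTupleIsHeapOnly tests an infomask flag
on the tuple header, and CatalogUpdateIndexes can bail out when it is
set. The struct, bit value, and function names below are invented for
illustration, not the real definitions:

```c
#include <stdbool.h>

typedef unsigned short uint16;

#define HEAP_ONLY_TUPLE 0x8000  /* illustrative flag bit, not the real value */

typedef struct
{
    uint16 t_infomask2;
} MiniTupleHeader;

bool
tuple_is_heap_only(const MiniTupleHeader *htup)
{
    return (htup->t_infomask2 & HEAP_ONLY_TUPLE) != 0;
}

/*
 * Model of the CatalogUpdateIndexes decision after the patch: if
 * simple_heap_update performed a HOT update, the new tuple carries the
 * heap-only flag and no index entries are needed, because all indexed
 * columns are unchanged and the existing index entries still reach the
 * new tuple through the HOT chain.
 */
bool
needs_index_inserts(const MiniTupleHeader *newtup)
{
    return !tuple_is_heap_only(newtup);
}
```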

As per Tom's suggestion, the relcache is now responsible for building
the list of HOT attributes. This needs to happen only once unless there
are schema changes. Also, we now use just a single bitmap to track both
normal and system attributes, offset by FirstLowInvalidHeapAttributeNumber.
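The offset trick works because system attributes have negative attribute
numbers while user attributes have positive ones; subtracting
FirstLowInvalidHeapAttributeNumber maps both ranges onto non-negative
bitmap positions. A minimal sketch (the constant's value here is
illustrative, not authoritative):

```c
/*
 * System attributes (xmin, ctid, ...) use negative attribute numbers;
 * user columns start at 1.  Shifting by the lowest invalid attribute
 * number folds both ranges into one zero-based bitmap index space, so
 * a single bitmap can describe every indexed column.
 */
#define FirstLowInvalidHeapAttributeNumber (-8)   /* illustrative value */

int
attno_to_bitmap_member(int attno)
{
    return attno - FirstLowInvalidHeapAttributeNumber;
}
```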

As per the discussion, page pruning and fragmentation repair are now
clubbed together. In the SELECT path, we conditionally try for an
exclusive lock, and if the exclusive lock is also a cleanup lock, we
prune and defragment the page. As the patch stands, during an index
fetch we try to prune only the chain originating at that tuple. I am
wondering if we should change that and prune all the tuple chains on
the page?
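The conditional-lock logic above can be sketched as follows. This is a
toy model of the buffer state, not the real buffer manager; the point is
that a SELECT never blocks, and pruning happens only when the exclusive
lock also qualifies as a cleanup lock (no other backend holds a pin):

```c
#include <stdbool.h>

typedef struct
{
    bool locked_exclusively;    /* someone already holds the exclusive lock */
    int  pin_count;             /* pins held across all backends, incl. ours */
} MiniBuffer;

/* Conditional attempt: give up immediately rather than block a SELECT */
bool
try_cleanup_lock(MiniBuffer *buf)
{
    if (buf->locked_exclusively)
        return false;           /* lock unavailable: skip pruning this time */
    if (buf->pin_count > 1)
        return false;           /* exclusive lock possible, but not cleanup */
    buf->locked_exclusively = true;
    return true;
}

/* Prune-and-defrag runs only when the cleanup lock was obtained */
bool
maybe_prune_page(MiniBuffer *buf)
{
    if (!try_cleanup_lock(buf))
        return false;
    /* ... prune HOT chains and repair fragmentation here ... */
    buf->locked_exclusively = false;
    return true;
}
```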

I haven't removed the avgFSM logic used to time the pruning of the
page, but we can remove it if we feel it's not worth the complexity.
Maybe a simple test of free space as a fraction of BLCKSZ would suffice.
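The simpler heuristic could be as small as the check below; the 10%
fraction is purely a hypothetical value for illustration:

```c
#define BLCKSZ 8192

/*
 * Hypothetical replacement for the avgFSM timing logic: prune when the
 * free space on the page drops below a fixed fraction of the block
 * size.  One tunable constant instead of tracked averages.
 */
int
page_needs_pruning(int free_space)
{
    return free_space < BLCKSZ / 10;
}
```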

MaxHeapTuplesPerPage is reverted back to the original value.

I could have removed hot_update from xl_heap_update, but that would
have complicated the heap_xlog_update() code, so I have left it
unchanged for now. If we still feel that's worth doing, I will make
that change.

IsPlanValidForThisTransaction() now uses PlannerGlobal, as per Tom's
suggestion, thus making it reentrant. I am not sure I have got it
completely right, though.

Apart from the changes based on recent feedback, I have also added the
previously discussed changes to track dead_space in the relation and
trigger autovacuum when the percentage of dead_space goes beyond a
threshold. vac_update_scale is still a percentage of dead_space
relative to the relation size. vac_update_threshold is currently a
number of blocks, but we need to discuss and finalize these changes.
One good thing about doing it this way is that the autoanalyze trigger
will be based on the total number of HOT and non-HOT updates (along
with inserts/deletes).
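Put together, the proposed trigger condition might look like this
sketch. The parameter names mirror the mail; the structure of the
formula (base threshold in blocks plus a scale fraction of relation
size, in the style of the existing autovacuum knobs) and the example
values are my assumptions, not the patch's final form:

```c
#define BLCKSZ 8192

/*
 * Sketch of the proposed dead-space trigger: vacuum when the tracked
 * dead space exceeds a base allowance (vac_update_threshold, counted
 * in blocks) plus a fraction (vac_update_scale) of the relation size.
 */
int
autovacuum_needed(double dead_space_bytes,
                  double rel_size_bytes,
                  int vac_update_threshold_blocks,  /* e.g. 50 blocks */
                  double vac_update_scale)          /* e.g. 0.2 = 20% */
{
    double limit = (double) vac_update_threshold_blocks * BLCKSZ
                 + vac_update_scale * rel_size_bytes;

    return dead_space_bytes > limit;
}
```

Because dead_space grows with both HOT and non-HOT updates, the same
counter naturally feeds the autoanalyze decision mentioned above.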

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
PG83-HOT-incremental-v15.0.patch.gz application/x-gzip 21.1 KB
PG83-HOT-combo-v15.0.patch.gz application/x-gzip 56.0 KB
