Re: WIP patch: reducing overhead for repeat de-TOASTing

From: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: WIP patch: reducing overhead for repeat de-TOASTing
Date: 2008-07-01 11:15:17
Message-ID: 486A11C5.1040809@siriusit.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> Attached is a worked-out patch for the approach proposed here:
> http://archives.postgresql.org/pgsql-hackers/2008-06/msg00777.php
> namely, that cache management for de-TOASTed datums is handled
> by TupleTableSlots.
>
> To avoid premature detoasting of values that we might never need, the
> patch introduces a concept of an "indirect TOAST pointer", which has
> the same 0x80 or 0x01 header as an external TOAST pointer, but can
> be told apart by having a different length byte. Within that we have
> * pointer to original toasted field within the Slot's tuple
> * pointer to the owning Slot
> * pointer to decompressed copy, or NULL if not decompressed yet
> Some fairly straightforward extensions to the TupleTableSlot code,
> heaptuple.c, and tuptoaster.c make it all go.
>
> My original thoughts had included turning FREE_IF_COPY() into a no-op,
> but on investigation that seems impractical. One case that still
> depends on that pfree is where we have palloc'd a 4-byte-header copy
> of a short-header datum to support code that needs properly aligned
> datum content. The solution adopted in the patch is to arrange for
> pfree() applied to a cacheable detoasted object to be a no-op, whereas
> it still works normally for non-cached detoasted objects. We do this
> by inserting a dummy chunk header that points to a dummy memory context
> whose pfree support method does nothing. I think this part of the patch
> would be required for any toast caching method, not just this one.
>
> What I like about this patch is that it's a fairly small-footprint
> change, it doesn't add much overhead, and it covers caching of
> decompression for in-line-compressed datums as well as the out-of-line
> case.
>
> One thing I really *don't* like about it is that it requires everyplace
> that copies Datums to know about indirect pointers: in general, the copy
> must be a copy of the original toasted Datum, not of the indirect
> pointer, else we have indirect pointers that can outlive their owning
> TupleTableSlot (or at least outlive its current tuple cycle). There
> only seem to be about half a dozen such places in the current code,
> but still it seems a rather fragile coding rule.
>
> After playing with it for a little bit, I'm not convinced that it buys
> enough performance win to be worth applying --- the restriction of cache
> lifespan to one tuple cycle of a TupleTableSlot is awfully restrictive.
> (For example, sorts that involve toasted sort keys continue to suck,
> because the tuples being sorted aren't in Slots.) It would probably
> fix the specific case that the PostGIS hackers were complaining of,
> but I think we need something more.
>
> Still, I wanted to get it into the archives because the idea of indirect
> toast pointers might be useful for something else.

Hi Tom,

Thanks very much for supplying the WIP patch. In the interest of testing
and feedback, I've just downloaded PostgreSQL CVS head and applied your
patch, compiled up a slightly modified version of PostGIS (without
RECHECKs on the GiST opclass) and loaded in the same dataset.

Unfortunately I have to report back that with your WIP patch applied,
timings seem to have become several orders of magnitude *worse*:

pg84(at)zeno:~$ psql -d postgis
psql (8.4devel)
Type "help" for help.

postgis=# explain analyze select count(*) from geography where centroid
&& (select the_geom from geography where id=69495);

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=7100.28..7100.29 rows=1 width=0) (actual
time=18238.932..18238.934 rows=1 loops=1)
InitPlan
-> Seq Scan on geography (cost=0.00..7092.00 rows=1 width=4387)
(actual time=27.472..69.223 rows=1 loops=1)
Filter: (id = 69495::numeric)
-> Index Scan using geography_centroid_idx on geography
(cost=0.00..8.27 rows=1 width=0) (actual time=118.371..18196.041
rows=32880 loops=1)
Index Cond: (centroid && $0)
Total runtime: 18239.918 ms
(7 rows)

In fact, even sequential scans seem to have gone up by several orders of
magnitude:

postgis=# set enable_indexscan = 'f';
SET
postgis=# set enable_bitmapscan = 'f';
SET
postgis=# explain analyze select count(*) from geography where centroid
&& (select the_geom from geography where id=69495);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Aggregate (cost=14184.01..14184.01 rows=1 width=0) (actual
time=9117.022..9117.024 rows=1 loops=1)
InitPlan
-> Seq Scan on geography (cost=0.00..7092.00 rows=1 width=4387)
(actual time=23.780..54.362 rows=1 loops=1)
Filter: (id = 69495::numeric)
-> Seq Scan on geography (cost=0.00..7092.00 rows=1 width=0)
(actual time=55.016..9073.084 rows=32880 loops=1)
Filter: (centroid && $0)
Total runtime: 9117.174 ms
(7 rows)

ATB,

Mark.

--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2008-07-01 11:49:47 Re: [HACKERS] odd output in restore mode
Previous Message Heikki Linnakangas 2008-07-01 10:44:22 Re: [HACKERS] odd output in restore mode