Re: WIP patch: reducing overhead for repeat de-TOASTing

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch: reducing overhead for repeat de-TOASTing
Date: 2008-07-01 18:43:50
Message-ID: 17434.1214937830@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

[ back on-list ]

Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk> writes:
> Thanks very much for supplying the WIP patch. In the interest of testing
> and feedback, I've just downloaded PostgreSQL CVS head and applied your
> patch, compiled up a slightly modified version of PostGIS (without
> RECHECKs on the GiST opclass) and loaded in the same dataset.
> Unfortunately I have to report back that with your WIP patch applied,
> timings seem to have become several orders of magnitude *worse*:

OK, I've reproduced the test case locally. I believe that when you
say "worse", you mean "worse than 8.3", right? And you did tell me
offlist that you were testing with --enable-cassert. CVS HEAD has
very substantially greater cassert overhead because of the
randomize_memory addition --- oprofile output for this test looks like

samples % image name symbol name
1239580 78.7721 postgres randomize_mem
143544 9.1218 libc-2.7.so memcpy
48039 3.0528 libc-2.7.so memset
13838 0.8794 postgres LWLockAcquire
12176 0.7738 postgres index_getnext
11697 0.7433 postgres LWLockRelease
10406 0.6613 postgres hash_search_with_hash_value
4739 0.3012 postgres toast_fetch_datum
4099 0.2605 postgres _bt_checkkeys
3905 0.2482 postgres AllocSetAlloc
3751 0.2384 postgres PinBuffer
3545 0.2253 postgres UnpinBuffer

I'm inclined to think that we'd better turn that off by default,
since it's not looking like it's catching anything new.

However, even with cassert turned off, the de-TOAST patch isn't
helping your test case; though it does help if I turn the query
into a join. Here's what I get for

8.3 HEAD HEAD + patch

original query [1] 4575 4669 4613
original + force_2d [2] 196 195 201
converted to join [3] 4603 4667 209
original, indexscans off 2335 2391 2426
join, indexscans off 2350 2373 124

[1] select count(*) from geography where centroid && (select the_geom from geography where id=69495);

[2] select count(*) from geography where centroid && (select force_2d(the_geom) from geography where id=69495);

[3] select count(*) from geography g1, geography g2 where g1.centroid && g2.the_geom and g2.id=69495;

All times in msec, median of 3 trials using psql \timing (times seem
to be reproducible within about 1%, if data is already cached).
Default parameters all around.

The join form of the query produces the results I expected, with
g2.the_geom coming from the outer side of a nestloop join and getting
detoasted only once. After some digging I found the reason why the
original query isn't getting any benefit: it's copying the_geom up
from an InitPlan, and nodeSubplan.c does that like this:

/*
* We need to copy the subplan's tuple into our own context, in case
* any of the params are pass-by-ref type --- the pointers stored in
* the param structs will point at this copied tuple! node->curTuple
* keeps track of the copied tuple for eventual freeing.
*/
if (node->curTuple)
heap_freetuple(node->curTuple);
node->curTuple = ExecCopySlotTuple(slot);

/*
* Now set all the setParam params from the columns of the tuple
*/
foreach(l, subplan->setParam)
{
int paramid = lfirst_int(l);
ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);

prm->execPlan = NULL;
prm->value = heap_getattr(node->curTuple, i, tdesc,
&(prm->isnull));
i++;
}

So the Param is pointing at a bare toasted Datum, not an indirect
pointer in a Slot, and there's no way to avoid detoasting each time
the Param is referenced.

It would be simple enough to fix nodeSubplan.c to copy the data into
an upper-level Slot rather than a bare tuple. But this makes me wonder
how many other places are like this. In the past there wasn't that much
benefit to pulling data from a Slot instead of a bare tuple, so I'm
afraid we might have a number of similar gotchas we'd have to track
down.

The other thing that worries me is that the force_2d query measurements
suggest a runtime penalty of two or three percent for the patch, which
is higher than I was hoping for. But that isn't that much more than
the noise level in this test.

On the whole I'm still feeling pretty discouraged about this patch ...

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shane Ambler 2008-07-01 18:46:56 Re: A new take on the foot-gun meme
Previous Message Bruce Momjian 2008-07-01 18:30:26 Re: Access to localized_str_tolower()