Skip site navigation (1) Skip section navigation (2)

GIST and TOAST

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: GIST and TOAST
Date: 2007-03-02 16:23:20
Message-ID: 87irdjr4w7.fsf@stark.xeocode.com (view raw or flat)
Thread:
Lists: pgsql-hackers
I have a problem getting packed varlenas to work with GIST indexes. Namely,
the GIST code doesn't call pg_detoast_datum() enough. Instead of using the
DatumGetFoo() macros it uses DatumGetPointer() and calls PG_DETOAST_DATUM only
when it thinks it'll be necessary.

I've temporarily made the packed varlena code ignore user defined data types.
This isn't very satisfactory though since it means domains over things like
text don't get packed.

And in any case even with this in place it doesn't fix all the problems. The
Postgres regression tests pass but I can still trigger problems because the
GIST code doesn't reliably call detoast even on user data types they're given.

I think these are actual bugs. If you happened to provide a large enough datum
to the gist code it would cause the same problem I'm seeing. The packed
varlena patch just makes it easier to trigger.

I've fixed the problems I'm seeing with the following patch to _int_gist.c.
But I'm not sure it's correct. What I'm afraid of is that I'm not sure where
these functions are being called from and whether they expect to be leaking
memory. If they're expected to not leak memory then they're now leaking the
detoasted datum and that's a problem.

I'm also wondering if there aren't similar problems in the dozens of other
gist indexing modules...

I wouldn't mind a second pair of eyes on the _int_gist.c changes if there's
someone who can tell me whether any of these functions require a
PG_FREE_IF_COPY or equivalent.


Index: _int_gist.c
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/contrib/intarray/_int_gist.c,v
retrieving revision 1.16
diff -u -r1.16 _int_gist.c
--- _int_gist.c	4 Oct 2006 00:29:45 -0000	1.16
+++ _int_gist.c	2 Mar 2007 16:09:26 -0000
@@ -1,6 +1,6 @@
 #include "_int.h"
 
-#define GETENTRY(vec,pos) ((ArrayType *) DatumGetPointer((vec)->vector[(pos)].key))
+#define GETENTRY(vec,pos) (DatumGetArrayTypeP((vec)->vector[(pos)].key))
 
 /*
 ** GiST support methods
@@ -39,7 +39,7 @@
 	if (strategy == BooleanSearchStrategy)
 	{
 		retval = execconsistent((QUERYTYPE *) query,
-								(ArrayType *) DatumGetPointer(entry->key),
+								DatumGetArrayTypeP(entry->key),
 								GIST_LEAF(entry));
 
 		pfree(query);
@@ -58,7 +58,7 @@
 	switch (strategy)
 	{
 		case RTOverlapStrategyNumber:
-			retval = inner_int_overlap((ArrayType *) DatumGetPointer(entry->key),
+			retval = inner_int_overlap(DatumGetArrayTypeP(entry->key),
 									   query);
 			break;
 		case RTSameStrategyNumber:
@@ -70,21 +70,21 @@
 									PointerGetDatum(&retval)
 					);
 			else
-				retval = inner_int_contains((ArrayType *) DatumGetPointer(entry->key),
+				retval = inner_int_contains(DatumGetArrayTypeP(entry->key),
 											query);
 			break;
 		case RTContainsStrategyNumber:
 		case RTOldContainsStrategyNumber:
-			retval = inner_int_contains((ArrayType *) DatumGetPointer(entry->key),
+			retval = inner_int_contains(DatumGetArrayTypeP(entry->key),
 										query);
 			break;
 		case RTContainedByStrategyNumber:
 		case RTOldContainedByStrategyNumber:
 			if (GIST_LEAF(entry))
 				retval = inner_int_contains(query,
-								  (ArrayType *) DatumGetPointer(entry->key));
+								  DatumGetArrayTypeP(entry->key));
 			else
-				retval = inner_int_overlap((ArrayType *) DatumGetPointer(entry->key),
+				retval = inner_int_overlap(DatumGetArrayTypeP(entry->key),
 										   query);
 			break;
 		default:
@@ -282,10 +282,10 @@
 	float		tmp1,
 				tmp2;
 
-	ud = inner_int_union((ArrayType *) DatumGetPointer(origentry->key),
-						 (ArrayType *) DatumGetPointer(newentry->key));
+	ud = inner_int_union(DatumGetArrayTypeP(origentry->key),
+						 DatumGetArrayTypeP(newentry->key));
 	rt__int_size(ud, &tmp1);
-	rt__int_size((ArrayType *) DatumGetPointer(origentry->key), &tmp2);
+	rt__int_size(DatumGetArrayTypeP(origentry->key), &tmp2);
 	*result = tmp1 - tmp2;
 	pfree(ud);
 
@@ -297,8 +297,8 @@
 Datum
 g_int_same(PG_FUNCTION_ARGS)
 {
-	ArrayType  *a = (ArrayType *) PointerGetDatum(PG_GETARG_POINTER(0));
-	ArrayType  *b = (ArrayType *) PointerGetDatum(PG_GETARG_POINTER(1));
+	ArrayType  *a = PG_GETARG_ARRAYTYPE_P(0);
+	ArrayType  *b = PG_GETARG_ARRAYTYPE_P(1);
 	bool	   *result = (bool *) PG_GETARG_POINTER(2);
 	int4		n = ARRNELEMS(a);
 	int4	   *da,


-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Responses

pgsql-hackers by date

Next:From: Pavan DeolaseeDate: 2007-03-02 17:04:50
Subject: Re: HOT - preliminary results
Previous:From: Florian G. PflugDate: 2007-03-02 16:02:28
Subject: Re: UPSERT

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group