Re: Making strxfrm() blobs in indexes work

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Making strxfrm() blobs in indexes work
Date: 2014-02-12 23:58:04
Message-ID: CAM3SWZQxUnM+wHf6ngjcSqda-03jhpSbOb4+ESWXEZ7uZMZn5w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 12, 2014 at 3:30 PM, Martijn van Oosterhout
<kleptog(at)svana(dot)org> wrote:
> (A bit late to the party). This idea has come up before and the most
> annoying thing is that braindead strxfrm api. Namely, to strxfrm a
> large string you need to strxfrm it completely even if you only want
> the first 8 bytes.

I think that in general strxfrm() must have that property. However,
that does not obligate us to store the entire string, provided that we
don't trust blob comparisons that indicate equality. I believe that we
cannot generally trust equality even with a non-truncated blob, so we
might as well take advantage of that. This is particularly true given
that we're doing this with inner pages, whose keys are already
naturally dissimilar, so we'll mostly get away with truncation
provided there is a reliable tie-breaker.
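
To make the "don't trust equality" rule concrete, here is a minimal
sketch in Python rather than backend C. It is only an illustration
under stated assumptions, not a proposed implementation: PREFIX_LEN,
truncated_blob() and compare() are hypothetical names, the truncation
is counted in characters of Python's wide-character transform rather
than in bytes, and the final code point comparison merely stands in
for whatever reliable tie-breaker would really be used.

```python
import locale

# Hypothetical truncation length, echoing "the first 8 bytes" above.
PREFIX_LEN = 8


def truncated_blob(s, n=PREFIX_LEN):
    """Transform the whole string, then keep only a prefix of the result.

    strxfrm() still has to process all of s; only what we store shrinks.
    """
    return locale.strxfrm(s)[:n]


def compare(a, b, n=PREFIX_LEN):
    """Order a and b under the current LC_COLLATE setting.

    Returns a negative number, zero, or a positive number.
    """
    ka, kb = truncated_blob(a, n), truncated_blob(b, n)
    if ka != kb:
        # Unequal truncated blobs order the same way the full blobs would.
        return -1 if ka < kb else 1
    # Equal blobs prove nothing, so fall back to an authoritative comparison.
    c = locale.strcoll(a, b)
    if c != 0:
        return c
    # Assumed tie-breaker: plain code point comparison of the originals.
    return (a > b) - (a < b)
```

An inequality between truncated blobs can be trusted because blobs
are compared lexicographically, so the first differing position decides
the result whether or not the tail was stored. Only the "equal" answer
needs the fallback, and with dissimilar keys that should be the
uncommon path.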

Besides all this, I'm not particularly worried about the cost of
calling strxfrm() if that only has to occur when a page split occurs,
as we insert a downlink into the parent page. The big picture here is
that we can exploit the properties of inner pages to do the smallest
amount of work for the largest amount of benefit.

--
Peter Geoghegan
