Re: 7.4 beta 1 getting out of swap

From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bertrand Petit <elrond(at)phoe(dot)frmug(dot)org>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: 7.4 beta 1 getting out of swap
Date: 2003-08-15 02:46:20
Message-ID: 3F3C497C.1050200@joeconway.com
Lists: pgsql-hackers pgsql-performance

Tom Lane wrote:
> Bertrand Petit <elrond(at)phoe(dot)frmug(dot)org> writes:
>> And I just got another one, much simpler, that failed the same
>>way with the same data set:
>>UPDATE rimdb_atitles SET aka_title=convert(byte_title,charset,'UTF8');
>
> [ where rimdb_atitles has an index on column "attribs varchar[]" ]
>
> Uh-huh. Actually, any large insert or update on that table will run out
> of memory, I bet. The problem appears to be due to the newly-added
> support for indexing array columns --- array_cmp() leaks memory, which
> is verboten for index support operators.

Ugh.

> I can think of a number of ways we might attack this, but none seem
> especially attractive ---
>
> 1. Have the index AMs create and switch into a special memory context
> for each call, rather than running in the main execution context.
> I am not sure this is workable at all, since the AMs tend to think they
> can create data structures that will live across calls (for example a
> btree lookup stack). It'd be the most general solution, if we could
> make it work.

This seems like a risky change at this point.
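For what it's worth, the idea behind #1 can be sketched outside the backend. This is only an illustration under stated assumptions: CallArena, arena_alloc, arena_reset, leaky_cmp and cmp_in_call_context are hypothetical names, and the fixed-size bump allocator stands in for a real memory context. It shows why resetting per call would contain a leaky comparison function, and also why anything the AM wants to keep across calls could not live there.

```c
#include <assert.h>
#include <stddef.h>

#define CALL_ARENA_SIZE 4096

/* Hypothetical stand-in for a short-lived, per-call memory context. */
typedef struct CallArena
{
    size_t        used;                  /* bytes handed out so far */
    unsigned char buf[CALL_ARENA_SIZE];  /* backing storage */
} CallArena;

/* Bump-allocate from the arena; returns NULL when the arena is full. */
static void *
arena_alloc(CallArena *arena, size_t size)
{
    void *p;

    size = (size + 7) & ~(size_t) 7;     /* round up to 8 bytes */
    if (size > CALL_ARENA_SIZE - arena->used)
        return NULL;
    p = arena->buf + arena->used;
    arena->used += size;
    return p;
}

/* Release everything allocated since the last reset. */
static void
arena_reset(CallArena *arena)
{
    arena->used = 0;
}

/* Hypothetical comparison function that never frees its scratch space. */
static int
leaky_cmp(CallArena *arena, int a, int b)
{
    int *scratch = arena_alloc(arena, 64 * sizeof(int));

    assert(scratch != NULL);
    scratch[0] = a;
    scratch[1] = b;
    return (scratch[0] > scratch[1]) - (scratch[0] < scratch[1]);
}

/* The caller resets the arena after every call, so the leak cannot grow. */
static int
cmp_in_call_context(CallArena *arena, int a, int b)
{
    int result = leaky_cmp(arena, a, b);

    arena_reset(arena);
    return result;
}
```

Without the reset in cmp_in_call_context, this arena would be exhausted after 16 calls. With it, usage stays flat no matter how many rows are processed.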

> 2. Modify the index AMs so that the comparison function FmgrInfo is
> preserved across a whole query. I think this requires changes to the
> index AM API (index_insert for instance has no provision for sharing
> data across multiple calls). Messy, and would likely mean an initdb.
> It would probably be the fastest answer though, since lookups wouldn't
> need to be done more than once per query.

This seems like a fairly big change this late in the game too.

> 3. Set up a long-lived cache internal to the array functions that can
> translate element type OID to the needed lookup data, and won't leak
> memory across repeated calls. This is not the fastest or most general
> solution, but it seems the most localized and safest fix.
>

I think I like #3 the best, but maybe that's because it's the one I
think I understand the best ;-)

It seems to me that #3 is the least risky, and even if it isn't the best
possible performance, this is the initial implementation of indexes on
arrays, so it isn't like we're taking away something. Maybe solution #2
is better held as a performance enhancement for 7.5.
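To make sure I understand #3, here is a rough standalone sketch of the kind of cache I have in mind. It is not backend code and not necessarily how the fix would end up looking: ElemTypeInfo, get_elem_type_info and lookup_elem_type are hypothetical names, the Oid typedef is a local stand-in, and the values filled in by lookup_elem_type are made up in place of a real catalog lookup. The point is only that the storage is fixed in size and filled in place, so repeated calls cannot leak.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;       /* local stand-in for the backend's Oid */

/* Hypothetical per-element-type data the array functions need. */
typedef struct ElemTypeInfo
{
    Oid  element_type;          /* cache key; 0 means "slot unused" */
    int  typlen;
    bool typbyval;
    char typalign;
} ElemTypeInfo;

#define ELEM_CACHE_SLOTS 8

/* Long-lived, fixed-size cache: it never grows, so it cannot leak. */
static ElemTypeInfo elem_cache[ELEM_CACHE_SLOTS];
static int next_victim = 0;     /* round-robin replacement position */
static int catalog_lookups = 0; /* counts simulated catalog lookups */

/* Stand-in for the expensive lookup; the values here are made up. */
static void
lookup_elem_type(Oid element_type, ElemTypeInfo *info)
{
    catalog_lookups++;
    info->element_type = element_type;
    info->typlen = -1;
    info->typbyval = false;
    info->typalign = 'i';
}

/* Return cached data for element_type, filling a slot on a miss. */
static const ElemTypeInfo *
get_elem_type_info(Oid element_type)
{
    ElemTypeInfo *slot;
    int           i;

    assert(element_type != 0);

    for (i = 0; i < ELEM_CACHE_SLOTS; i++)
    {
        if (elem_cache[i].element_type == element_type)
            return &elem_cache[i];      /* hit: no lookup, no allocation */
    }

    /* Miss: overwrite the next slot in place instead of allocating. */
    slot = &elem_cache[next_victim];
    next_victim = (next_victim + 1) % ELEM_CACHE_SLOTS;
    lookup_elem_type(element_type, slot);
    return slot;
}
```

A comparison function would call get_elem_type_info() with the array's element type on every invocation. After the first call for a given type the lookup is skipped, and memory use stays constant however many rows the UPDATE touches. The returned pointer is only good until the slot is reused, so callers should not hold on to it across calls.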

Do you want me to take a shot at this since I created the mess?

Joe
