Re: tuplestore API problem

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: tuplestore API problem
Date: 2009-03-27 02:43:00
Message-ID: e08cc0400903261943w38b479acl45868b6253ab625a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2009/3/27 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> By chance I discovered that this query in the regression tests
>
> SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;
>
> stops working if work_mem is small enough: it either dumps core or
> delivers wrong answers depending on platform.
>
> After some tracing I found out the reason.  ExecWindowAgg() does this:
>
>    if (!tuplestore_gettupleslot(winstate->buffer, true,
>                                 winstate->ss.ss_ScanTupleSlot))
>        elog(ERROR, "unexpected end of tuplestore");
>
> and then goes off and calls the window functions (ntile() here), and
> expects the ScanTupleSlot to still be valid afterwards.  However,
> ntile() forces us to read to the end of the input to find out the number
> of rows.  If work_mem is small enough, that means the tuplestore is
> forced into dump-to-disk mode, which means it releases all its in-memory
> tuples.  And guess what: the ScanTupleSlot is pointing at one of those,
> it doesn't have its own copy of the tuple.  So we wind up trying to read
> from a trashed bit of memory.
>
> A brute-force solution is to change tuplestore_gettupleslot() so that it
> always copies the tuple, but this would be wasted cycles for most uses
> of tuplestores.  I'm thinking of changing tuplestore_gettupleslot's API
> to add a bool parameter specifying whether the caller wants to force
> a copy.
>
> Comments, better ideas?

Is this tuplestore API problem? ISTM this is window function's
problem. I think my early code was holding heaptuple instead of
tupleslot for the current row. At a glance, the issue appears in only
current row in window function, which fetches row and uses it later
after storing following rows in some cases. So a brute-force solution
might be that ExecWindowAgg() copies the current row from tuplestore
instead of pointing directly to inside tuplestore memory, not changing
tuplestore API.

Regards,

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2009-03-27 02:48:01 Re: typedefs for indent
Previous Message Andrew Dunstan 2009-03-27 02:42:45 Re: typedefs for indent