From: | Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: tuplestore API problem |
Date: | 2009-03-28 06:31:17 |
Message-ID: | e08cc0400903272331q4e3ab3a7m60e05d605c7c6944@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
2009/3/28 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> writes:
>> 2009/3/27 Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>:
>>> 2009/3/27 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>>> A brute-force solution is to change tuplestore_gettupleslot() so that it
>>>> always copies the tuple, but this would be wasted cycles for most uses
>>>> of tuplestores. I'm thinking of changing tuplestore_gettupleslot's API
>>>> to add a bool parameter specifying whether the caller wants to force
>>>> a copy.
>
>> Here's the patch. Hope there are no more on the same reason. It seems
>> that we'd need to implement something like garbage collector in
>> tuplestore, marking and tracing each row references, if the complete
>> solution is required.
>
> I don't like this; I'm planning to go with the aforementioned API
> change instead. The way you have it guarantees an extra copy cycle
> even when tuplestore is already making a copy internally; and it doesn't
> help if we find similar problems elsewhere. (While I'm making the
> API change I'll take a close look at each call site to see if it has
> any similar risk.)
You're right. It kills performance even after dumptuples(). Thinking
more, I found the cause is only around dumptuples(). If you can trace
TupleTableSlots that points to memtuples inside tuplestore, you can
materialize them just before WRITETUP() in dumptuples().
So I tried pass EState.es_tupleTables to tuplestore_begin_heap() to
trace those TupleTableSlots. Note that if you pass NULL the behavior
is as before so nothing's broken. Regression passes but I'm not quite
sure EState.es_tupleTable is only place that holds TupleTableSlots
passed to tuplestore...
I know you propose should_copy boolean parameter would be added to
tuplestore_gettupleslot(). That always adds overhead even if
tuplestore *doesn't* dump tuples. That case doesn't need copy tuples,
I guess.
Regards,
--
Hitoshi Harada
Attachment | Content-Type | Size |
---|---|---|
tuplestore_apichange.20090328.patch | application/octet-stream | 12.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2009-03-28 07:20:30 | Re: Should SET ROLE inherit config params? |
Previous Message | Euler Taveira de Oliveira | 2009-03-28 05:25:52 | Re: psql \d* and system objects |