Re: pgsql_fdw, FDW for PostgreSQL server

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql_fdw, FDW for PostgreSQL server
Date: 2012-02-16 17:38:01
Message-ID: CADyhKSUMW2+pHp8MSG3E-xNCE1GmEoyzq-V=XABe1W6kRM8siw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2012年2月16日13:41 Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>:
> Kaigai-san,
>
> Thanks for the review. Attached patches are revised version, though
> only fdw_helper_v5.patch is unchanged.
>
> (2012/02/16 0:09), Kohei KaiGai wrote:
>> [memory context of tuple store]
>> It calls tuplestore_begin_heap() under the memory context of
>> festate->scan_cxt at pgsqlBeginForeignScan.
>
> Yes, it's because tuplestore uses a context which was current when
> tuplestore_begin_heap was called. I want to use per-scan context for
> tuplestore, to keep its content tuples alive through the scan.
>
>> On the other hand, tuplestore_gettupleslot() is called under the
>> memory context of festate->tuples.
>
> Yes, result tuples to be returned to executor should be allocated in
> per-scan context and live until next IterateForeignScan (or
> EndForeignScan), because such tuple will be released via ExecClearTuple
> in next IterateForeignScan call. If we don't switch context to per-scan
> context, result tuple is allocated in per-tuple context and cause
> double-free and server crash.
>
>> I could not find a callback functions being invoked on errors,
>> so I doubt the memory objects acquired within tuplestore_begin_heap()
>> shall be leaked, even though it is my suggestion to create a sub-context
>> under the existing one.
>
> How do you confirmed that no callback function is invoked on errors? I
> think that memory objects acquired within tuplestore_begin_heap (I guess
> you mean excluding stored tuples, right?) are released during cleanup of
> aborted transaction. I tested that by adding elog(ERROR) to the tail of
> store_result() for intentional error, and execute large query 100 times
> in a session. I saw VIRT value (via top command) comes down to constant
> level after every query.
>
Oops, I overlooked the point where MessageContext and its children get
reset. However, as its name, I don't believe it was right usage of memory
context.
As the latest version doing, es_query_cxt is the right way to acquire
memory object with per-query duration.

>> In my opinion, it is a good choice to use es_query_cxt of the supplied EState.
>> What does prevent to apply this per-query memory context?
>
> Ah, I've confused context management of pgsql_fdw... I fixed pgsql_fdw
> to create per-scan context as a child of es_query_cxt in
> BeginForeignScan, and use it for tuplestore of the scan. So, tuplestore
> and its contents are released correctly at EndForeignScan, or cleanup of
> aborted transaction in error case.
>
I believe it is right direction.

>> You mention about PGresult being malloc()'ed. However, it seems to me
>> fetch_result() and store_result() once copy the contents on malloc()'ed
>> area to the palloc()'ed area, and PQresult is released on an error using
>> PG_TRY() ... PG_CATCH() block.
>
> During thinking about this comment, I found double-free bug of PGresult
> in execute_query, thanks :)
>
Unfortunately, I found the strange behavior around this code.
I doubt an interaction between longjmp and compiler optimization,
but it is not certain right now.

I'd like to push this patch to committer reviews after this problem got closed.

Right now, I don't have comments on this patch any more.

> But, sorry, I'm not sure what the concern you show here is. The reason
> why copying tuples from malloc'ed area to palloc'ed area is to release
> PGresult before returning from the IterateForeingScan call. The reason
> why using PG_TRY block is to sure that PGresult is released before jump
> back to upstream in error case.
>
>> [Minor comments]
>> Please set NULL to "sql" variable at begin_remote_tx().
>> Compiler raises a warnning due to references of uninitialized variable,
>> even though the code path never run.
>
> Fixed. BTW, just out of curiosity, which compiler do you use? My
> compiler ,gcc (GCC) 4.6.0 20110603 (Red Hat 4.6.0-10) on Fedora 15,
> doesn't warn it.
>
I uses Fedora 16, and GCC 4.6.2.

[kaigai(at)iwashi pgsql_fdw]$ gcc --version
gcc (GCC) 4.6.2 20111027 (Red Hat 4.6.2-1)

It is not a matter related to compiler version, but common manner in
PostgreSQL code. You can likely found source code comments
like "/* keep compiler quiet */"

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dimitri Fontaine 2012-02-16 17:42:26 Re: Command Triggers
Previous Message Dan Scales 2012-02-16 17:18:23 possible new option for wal_sync_method