Re: pgsql_fdw, FDW for PostgreSQL server

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: pgsql_fdw, FDW for PostgreSQL server
Date: 2011-10-31 12:24:09
Message-ID: CAFj8pRAxoo8c0K8wx6cS9XPZecyUcLGhmcaNTVFsEJdEoMjSCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2011/10/31 Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>:
> (2011/10/30 11:34), Shigeru Hanada wrote:
>> 2011/10/30 Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> I think we have to.  Even if we estimate that a given scan will return
>>> only a few rows, what happens if we're wrong?  We don't want to blow out
>>> memory on the local server by retrieving gigabytes in one go.
>>
>> Oh, I overlooked the possibility of wrong estimation.  Old PostgreSQL uses
>> 1000 as default estimation, so big table which has not been analyzed may
>> crashes the backend.
>>
>> To ensure the data retrieving safe, we need to get actual amount of result,
>> maybe by executing SELECT COUNT(*) in planning phase.  It sounds too heavy
>> to do for every scan, and it still lacks actual width.
>>
>> One possible idea is to change default value of min_cursur_rows option to 0
>> so that pgsql_fdw uses CURSOR by default, but it seems not enough.  I'll
>> drop simple SELECT mode from first version of pgsql_fdw for safety.
>
> I removed simple SELECT mode from pgsql_fdw, and consequently also
> removed min_cursor_rows FDW option.  This fix avoids possible memory
> exhaustion due to wrong estimation gotten from remote side.
>
> Once libpq has had capability to retrieve arbitrary number of rows from
> remote portal at a time without server-side cursor in future, then we
> will be able to revive simple SELECT.  Then it's enough safe even if we
> don't have actual data size, but (maybe) faster than cursor mode because
> we can reduce # of SQL commands.  Though of course proof of performance
> advantage should be shown before such development.

If you need a less SQL commands, then you can increase fetch_count
parameter - default 1000 is maybe too small, maybe 10000 lines as
default (not more).

For more complex queries can be interesting to set a cursor_tuple_fraction

Pavel

>
> --
> Shigeru Hanada
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-10-31 12:44:16 Re: So, is COUNT(*) fast now?
Previous Message Marcin Mańk 2011-10-31 10:54:34 Re: Thoughts on "SELECT * EXCLUDING (...) FROM ..."?