Re: [POC] FETCH limited by bytes.

From: Corey Huinker <corey.huinker@gmail.com>
To: Vladimir Sitnikov <sitnikov.vladimir@gmail.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>, Pgsql Hackers <pgsql-hackers@postgresql.org>, mkellycs@gmail.com, ashutosh.bapat@enterprisedb.com, Dave Cramer <pg@fastcrypt.com>
Subject: Re: [POC] FETCH limited by bytes.
Date: 2015-12-27 01:45:21
Message-ID: CADkLM=fqQYDHsHeR3pid=P1VpPBVmPH9=vZ+pQb_7KZmWLhztA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Dec 26, 2015 at 6:16 AM, Vladimir Sitnikov
<sitnikov.vladimir@gmail.com> wrote:

> > Have you got numbers showing any actual performance win for postgres_fdw?
>
> For JDBC purposes, it would be nice to be able to ask the backend to
> stop fetching once it has produced more than X MiB of response data.
> For a small table (4 int4 fields), a decent fetchSize (~1000) makes
> result processing 7 times faster than a fetchSize of 50 rows (14 ms
> -> 2 ms for 2000 rows).
> Here are the measurements: [1] and [2].
>
> Note: the backend would not need to honor the given "max fetch bytes"
> limit precisely; it would be enough to stop after roughly that much
> data has been sent.
> The whole point of a limited fetch size is to avoid running out of
> memory on the client side.
> I do not think developers care how many rows are fetched at once. If
> they do, they should use "LIMIT x" SQL syntax instead.
>
> Do you have a suggestion for such a use case?
>
> For fixed-size data types, the JDBC driver can estimate a "max sane
> fetch size"; however:
> 1) Learning the data types requires a round trip, so the first fetch
> must be conservative, and small queries would be penalized.
> 2) For variable-length types there is no way to estimate a "sane
> number of rows" except by using the average row size of the data
> received so far. That is unreliable, especially if the first rows
> contain nulls and subsequent ones contain non-empty strings.
>
> [1]: https://github.com/pgjdbc/pgjdbc/issues/292#issue-82595473
> [2]: https://github.com/pgjdbc/pgjdbc/issues/292#issuecomment-107019387
>
> Vladimir
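
For anyone following along: the driver-side knob Vladimir describes is
java.sql.Statement#setFetchSize(), and at the SQL level it amounts to
the same cursor loop postgres_fdw already runs against the remote
server (where the batch size is currently hardcoded at 100 rows). A
rough sketch, with illustrative table and cursor names:

    BEGIN;
    -- declare a cursor for the query instead of running it outright
    DECLARE c CURSOR FOR SELECT a, b, c, d FROM small_tab;
    -- pull rows in fixed-size batches; each FETCH is a round trip,
    -- which is why fetchSize=50 is so much slower than fetchSize~1000
    FETCH 1000 FROM c;
    FETCH 1000 FROM c;  -- repeat until fewer than 1000 rows return
    CLOSE c;
    COMMIT;

A byte-based limit would replace the fixed row count above with "as
many rows as fit in X MiB", which nothing in that loop can express
today.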

I believe Kyotaro proposed something like that, wherein the FDW would
adapt to the amount of memory available and fetch the number of rows
that, by its estimation, would fit in that memory. I don't know the
current status of that patch.

This patch is a far less complicated solution: it puts the burden on
the DBA to estimate, from the average row size, approximately how many
rows will fit in memory, and to set the per-table option accordingly.
If the rows later grow too heavy to fit in the space allotted, the
fetch size can be lowered for that table as needed.
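
Concretely, assuming the fetch_size option spelling from the patch
(server and table names here are hypothetical), the DBA-side tuning
would look something like:

    -- default for every foreign table on this server
    ALTER SERVER pgserver OPTIONS (ADD fetch_size '10000');
    -- narrower setting for one table known to have wide rows
    ALTER FOREIGN TABLE remote_tab OPTIONS (ADD fetch_size '100');

If the rows later turn out heavier than expected, only the second
statement needs reissuing, with SET in place of ADD.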
