Re: ECPG FETCH readahead

From: Noah Misch <noah(at)leadboat(dot)com>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Meskes <meskes(at)postgresql(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: ECPG FETCH readahead
Date: 2012-03-29 17:03:41
Message-ID: 20120329170341.GA4142@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 29, 2012 at 12:59:40PM +0200, Boszormenyi Zoltan wrote:
> 2012-03-29 02:43 keltez?ssel, Noah Misch ?rta:
>> On Sat, Mar 24, 2012 at 10:49:07AM +0100, Boszormenyi Zoltan wrote:

>>> + to<literal>READAHEAD number</literal>. Explicit<literal>READAHEAD number</literal> or
>>> +<literal>NO READAHEAD</literal> turns cursor readahead on (with<literal>number</literal>
>>> + as the size of the readahead window) or off for the specified cursor,
>>> + respectively. For cursors using a non-0 readahead window size is 256 rows,
>> The number 256 is no longer present in your implementation.
>
> Indeed. Oversight, that part of the sentence is deleted.
>
>>
>>> + the window size may be modified by setting the<literal>ECPGFETCHSZ</literal>
>>> + environment variable to a different value.
>> I had in mind that DECLARE statements adorned with READAHEAD syntax would
>> always behave precisely as written, independent of ECPGFETCHSZ or "ecpg -R".
>> Unadorned DECLARE statements would use ECPGFETCHSZ if set, then the value
>> passed to "ecpg -R" if provided, and finally a value of 1 (no readahead) in
>> the absence of both ECPGFETCHSZ and "ecpg -R". Did you do it differently for
>> a particular reason? I don't particularly object to what you've implemented,
>> but I'd be curious to hear your thinking.
>
> What I had in mind was:
>
> NO READAHEAD == READAHEAD 0. This will translate into 1 tuple sized
> readahead window that cannot be modified by ECPGFETCHSZ.
>
> READAHEAD 1 also means uncached by default but ECPGFETCHSZ may
> modify the readahead window size.

To me, it feels too magical to make READAHEAD 1 and READAHEAD 0 differ only in
whether ECPGFETCHSZ can override. I'm willing to go with whatever consensus
arises on this topic, though.

> This part is policy that can be debated and modified accordingly,
> it doesn't affect the internals of the caching code.

Agreed.

>>> +</para>
>>> +
>>> +<para>
>>> +<command>UPDATE</command> or<command>DELETE</command> with the
>>> +<literal>WHERE CURRENT OF</literal> clause, cursor readahead may or may not
>>> + actually improve performance, as<command>MOVE</command> statement has to be
>>> + sent to the backend before the DML statement to ensure correct cursor position
>>> + in the backend.
>>> +</para>
>> This sentence seems to be missing a word near its beginning.
>
> Sounds like a corner case to me, but I am not a native english speaker.
> Which one sounds better:
>
> UPDATE or DELETE with the WHERE CURRENT OF clause...
>
> or
>
> UPDATE or DELETE statements with the WHERE CURRENT OF clause...
> ?

Here is the simplest change to make the original grammatical:

Given an UPDATE or DELETE with the WHERE CURRENT OF clause, ...

I might write the whole thing this way:

To execute an UPDATE or DELETE statement bearing the WHERE CURRENT OF
clause, ECPG may first execute an implicit MOVE to synchronize the server
cursor position with the local cursor position. This can reduce or
eliminate the performance benefit of readahead for affected cursors.

>> one of the new sections about readahead should somehow reference the hazard
>> around volatile functions.
>
> Done.

I don't see the mention in your latest patch. You do mention it for the
sqlerrd[2] compatibility stuff.

>>> +<varlistentry>
>>> +<term><literal>OPEN RETURNS LAST ROW POSITION</literal></term>
>>> +<term><literal>OPEN RETURNS 0 FOR LAST ROW POSITION</literal></term>
>>> +<listitem>
>>> +<para>
>>> + When the cursor is opened, it's possible to discover the size of the result set
>>> + using<command>MOVE ALL</command> which traverses the result set and move
>>> + the cursor back to the beginning using<command>MOVE ABSOLUTE 0</command>.
>>> + The size of the result set is returned in sqlca.sqlerrd[2].
>>> +</para>
>>> +
>>> +<para>
>>> + This slows down opening the cursor from the application point of view
>>> + but may also have side effects. If the cursor query contains volatile function
>>> + calls with side effects, they will be evaluated twice because of traversing
>>> + the result set this way during<command>OPEN</command>.
>>> +</para>
>>> +
>>> +<para>
>>> + The default is not to discover.
>>> +</para>
>> I mildly oppose having syntax to enable this per-cursor. Readahead is a
>> generally handy feature that I might use in new programs, but this feature is
>> more of a hack for compatibility with some old Informix version. For new
>> code, authors should do their own MOVEs instead of using this option.
>>
>> The patch also adds an undocumented ECPGOPENRETURNSRESULTSETSIZE environment
>> variable to control this. I see no use for such a thing, because a program's
>> dependency on sqlerrd[2] is fixed at build time. If a program does not
>> reference sqlerrd[2] for a cursor, there's no benefit from choosing at runtime
>> to populate that value anyway. If a program does reference it, any option
>> needed to have it set correctly had better be set at build time and apply to
>> every run of the final program.
>>
>> IOW, I suggest having only the "ecpg"-time option to enable this behavior.
>> Thoughts?
>
> I wanted to make it available for non-Informix mode and a way to
> disable it if the command line option enables it.

I can understand the desire to have a way to disable it. Perhaps you have a
bunch of old Informix code, so you enable the global option and later discover
some expensive cursors that don't use sqlerrd[2]. It would be nice to locally
suppress the behavior for those cursors.

Still, we're looking at dedicated ECPG syntax, quite visible even to folks
with no interest in Informix. We have eschewed littering our syntax with
compatibility aids, and I like it that way. IMO, an option to the "ecpg"
preprocessor is an acceptable level of muddle to provide this aid. A DECLARE
CURSOR syntax extension goes too far.

Thanks,
nm

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniele Varrazzo 2012-03-29 17:30:22 Re: Potential reference miscounts and segfaults in plpython.c
Previous Message Robert Haas 2012-03-29 16:59:51 Re: Command Triggers patch v18