Re: Limiting setting of hint bits by read-only queries; vacuum_delay

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Limiting setting of hint bits by read-only queries; vacuum_delay
Date: 2013-03-26 12:30:02
Message-ID: CA+U5nM+nh-TfrnpdLDMKTH0hhTexjM_mRTTyZ6JLF5wDZ1YHwg@mail.gmail.com
Lists: pgsql-hackers

On 26 March 2013 11:33, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Mar 26, 2013 at 5:27 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On 26 March 2013 01:35, Greg Stark <stark(at)mit(dot)edu> wrote:
>>> On Tue, Mar 26, 2013 at 12:00 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>>> I'll bet you all a beer at PgCon 2014 that this remains unresolved at
>>>> that point.
>>>
>>> Are you saying you're only interested in working on it now? That after
>>> 9.3 is released you won't be interested in working on it any more?
>>>
>>> As you said we've been eyeing this particular logic since 2004, why
>>> did it suddenly become more urgent now? Why didn't you work on it 9
>>> months ago at the beginning of the release cycle?
>>
>> I'm not sure why your comments are so confrontational here, but I
>> don't think it helps much. I'm happy to buy you a beer too.
>>
>> As I explained clearly in my first post, this idea came about trying
>> to improve on the negative aspects of the checksum patch. People were
>> working on ideas 9 months ago to resolve this, but they have come to
>> nothing. I regret that; Merlin and others have worked hard to find a
>> way: Respect to them.
>>
>> My suggestion is to implement a feature that takes 1 day to write and
>> needs little testing to show it works.
>
> Any patch in this area isn't likely to take much testing to establish
> whether it improves some particular case. The problem is what happens
> to all of the other cases - and I don't believe that part needs little
> testing, hence the objections (with which I agree) to doing anything
> about this now.
>
> If we want to change something in this area, we might consider
> resurrecting the patch I worked on for this last year, which had, I
> believe, a fairly similar mechanism of operation to what you're
> proposing, and some other nice properties as well:
>
> http://www.postgresql.org/message-id/AANLkTik5QzR8wTs0MqCWwmNp-qHGrdKY5Av5aOB7W4Dp@mail.gmail.com
> http://www.postgresql.org/message-id/AANLkTimGKaG7wdu-x77GNV2Gh6_Qo5Ss1u5b6Q1MsPUy@mail.gmail.com
>
> ...but I think the main reason why that never went anywhere is because
> we never really had any confidence that the upsides were worth the
> downsides. Fundamentally, postponing hint bit setting (or hint bit
> I/O) increases the total amount of work done by the system. You still
> end up writing the hint bits eventually, and in the meantime you do
> more CLOG lookups. Now, as a compensating benefit, you can spread the
> work of writing the hint-bit updated pages out over a longer period of
> time, so that no single query carries too much of the burden of
> getting the bits set. The worst-case-latency vs. aggregate-throughput
> tradeoff is one with a long history and I think it's appropriate to
> view this problem through that lens also.

I hadn't realised so many patches existed that were similar. Hackers
is bigger these days.

Reviewing the patch, I'd say the problem is that it is basically
implementing a new automatic heuristic. We simply don't have any
evidence that any new heuristic will work for all cases, so we do
nothing.

Whether we apply my patch, yours or Merlin's, my main thought now is
that we need a user parameter to control it so it can be adjusted
according to need and not touched at all if there is no problem.

My washing machine has a wonderful feature "15 min wash" and it works
great for the times I know I need it; but in general, the auto wash
mode works fine since often you don't care that it takes 90 minutes.
It's much easier to see that the additional user option is beneficial,
but much harder to start arguing that the default wash cycle should be
85 or 92 minutes. It'd be great if the washing machine could work out
that I need my clothes quickly and that on-this-day-only I don't care
about the thoroughness of the wash, but it can't. I don't think the
washing machine engineers are idiots for not being able to work that
out, but if they only offered a single option because they thought
they knew better than me, I'd be less than impressed.

In the same way, we need some way to say "big queries shouldn't do
cleanup", even if autovacuum ends up doing more I/O over time (though
in fact I doubt this is the case; detailed argument in another post).
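
To make the idea of a user-controlled limit concrete, here is a toy
model of it. It is only a sketch under stated assumptions: it is not
PostgreSQL code and not any of the patches discussed in this thread,
and the names scan, hint_limit and UNLIMITED are hypothetical. It
assumes every page visited needs hint bits set, and it counts how many
pages one read-only scan dirties versus how many it leaves for a later
scan or for VACUUM.

```python
from dataclasses import dataclass

# Hypothetical sentinel: no cap, i.e. behaviour is left untouched.
UNLIMITED = -1


@dataclass
class ScanStats:
    pages_dirtied: int = 0  # pages this scan dirtied by setting hint bits
    pages_skipped: int = 0  # pages left unhinted for a later scan or VACUUM


def scan(pages_needing_hints: int, hint_limit: int = UNLIMITED) -> ScanStats:
    """Simulate one read-only scan over pages that all lack hint bits.

    hint_limit is a hypothetical per-scan cap on the number of pages
    the scan may dirty by setting hint bits.
    """
    stats = ScanStats()
    for _ in range(pages_needing_hints):
        if hint_limit == UNLIMITED or stats.pages_dirtied < hint_limit:
            stats.pages_dirtied += 1   # scan pays the cost of dirtying the page
        else:
            stats.pages_skipped += 1   # cost is deferred to someone else
    return stats


if __name__ == "__main__":
    print(scan(1000))                 # default: the scan sets every hint
    print(scan(1000, hint_limit=32))  # capped: most of the work is deferred
```

With the default, nothing changes for users who have no problem; with
a cap, a big query stops dirtying pages once it reaches the limit, and
the skipped pages represent work deferred rather than work avoided.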

So please, let's go with a simple solution now that allows users to say
what they want.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
