Re: Default setting for enable_hashagg_disk

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-06-22 20:23:58
Message-ID: 129c8f1739957589584d9525e1d8782eae3f412f.camel@j-davis.com
Lists: pgsql-docs pgsql-hackers

On Mon, 2020-06-22 at 15:28 -0400, Robert Haas wrote:
> The weirdness is the problem here, at least for me. Generally, I
> don't like GUCs of the form give_me_the_old_strange_behavior=true

I agree with all of that in general.

> I don't think it necessarily implies that either. I do however have
> some concerns about people using the GUC as a crutch.

Another way of looking at it is that the weird behavior is already
there in v12, so there are already users relying on this weird behavior
as a crutch for some other planner mistake. The question is whether we
want to:

(a) take the weird behavior away now as a consequence of implementing
disk-based HashAgg; or
(b) support the weird behavior forever; or
(c) introduce a GUC now to help transition away from the weird behavior.

The danger with (c) is that it gives users more time to become more
reliant on the weird behavior; and worse, a GUC could be seen as an
endorsement of the weird behavior rather than a path to eliminating it.
So we could intend to do (c) and end up with (b). We can mitigate this
with documentation warnings, perhaps.

> I am slightly worried that this is going to have hard-to-fix problems
> and that we'll be stuck with the GUC for that reason.

Without the GUC, it's basically a normal cost-based decision, with all
of the good and bad that comes with that.

> Now if that is the case, is
> removing the GUC any better? Maybe not. These decisions are hard, and
> I am not trying to pretend like I have all the answers.

I agree that there is no easy answer.

My philosophy here is: if a user does experience a plan regression due
to my change, would it be reasonable to tell them that we don't have
any escape hatch or transition period at all? That would be a tough
sell for such a common plan type.

Regards,
Jeff Davis
