From: | James Coleman <jtc331(at)gmail(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <jdavis(at)postgresql(dot)org> |
Subject: | Re: pg13dev: explain partial, parallel hashagg, and memory use |
Date: | 2020-08-05 02:01:18 |
Message-ID: | CAAaqYe-sg7cHgayWwKWZtSyFr5LQEiMExiqmjeHUOKXxHKxWjQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Aug 4, 2020 at 9:44 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Wed, 5 Aug 2020 at 13:21, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> >
> > I'm testing with a customer's data on pg13dev and got output for which Peak
> > Memory doesn't look right/useful. I reproduced it on 565f16902.
>
> Likely the sanity of those results depends on whether you think that
> the Memory Usage reported outside of the workers is meant to be the
> sum of all processes or the memory usage for the leader backend.
>
> All that's going on here is that the Parallel Append is using some
> parallel safe paths and giving one to each worker. The 2 workers take
> the first 2 subpaths and the leader takes the third. The memory usage
> reported helps confirm that's the case.
>
> Can you explain what you'd want to see changed about this? Or do you
> want to see the non-parallel worker memory be the sum of all workers?
> Sort does not seem to do that, so I'm not sure if we should consider
> hash agg as an exception to that.

I've always found the way we report parallel workers in EXPLAIN quite
confusing. I realize it matches the actual implementation model (the
leader is often also "another worker"), but I think the natural
expectation from a user perspective would be that you'd show as
workers all backends (including the leader) that did work, and then
aggregate into a summary line (where the leader is displayed now).

In the current output there's nothing really to hint to the user that
the model is leader + workers and that the "summary" line is really
the leader. If I were to design this from scratch, I'd want to propose
doing what I said above (summary aggregate line + treat leader as a
worker line, likely with a "leader" tag), but that seems like a big
change to make now. On the other hand, perhaps designating what looks
like a summary line as the "leader" or some such would help clear up
the confusion? Perhaps it could also say "Participating" or
"Non-participating"?

James