Re: Correct the documentation for work_mem

From: Gurjeet Singh <gurjeet(at)singh(dot)im>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Correct the documentation for work_mem
Date: 2023-04-21 17:39:45
Message-ID: CABwTF4XAHt7efd=8bhsgsh-vjEtSXBnSLrOUe6TxAcRrPFqHbQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 21, 2023 at 10:15 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> writes:
> > On 21.04.23 16:28, Imseih (AWS), Sami wrote:
> >> I suggest a small doc fix:
> >> “Note that for a complex query, several sort or hash operations might be
> >> running simultaneously;”
>
> > Here is a discussion of these terms:
> > https://takuti.me/note/parallel-vs-concurrent/
>
> > I think "concurrently" is the correct word here.
>
> Probably, but it'd do little to remove the confusion Sami is on about,

+1.

When discussing this internally, Sami's proposal was in fact to use
the word 'concurrently'. But given that, when it comes to computers and
programming, it's common for people to not understand the subtle
difference between the two terms, we thought it best to use neither
of them, and instead use a word not usually associated with
programming and algorithms.

Aside: Another pair of words I see regularly used interchangeably,
when in fact they mean different things: precise vs. accurate.

> especially since the next sentence uses "concurrently" to describe the
> other case. I think we need a more thorough rewording, perhaps like
>
> - Note that for a complex query, several sort or hash operations might be
> - running in parallel; each operation will generally be allowed
> + Note that a complex query may include several sort or hash
> + operations; each such operation will generally be allowed

This wording doesn't seem to bring out the fact that there could be
more than one work_mem consumer running (in-progress) at the same
time. The reader could mistake it to mean that the hashes and sorts in
a complex query happen one after the other.

+ Note that a complex query may include several sort and hash operations, and
+ more than one of these operations may be in progress simultaneously at any
+ given time; each such operation will generally be allowed

I believe the phrase "several sort _and_ hash" describes the possible
composition of a complex query better than "several sort _or_ hash"
does.
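
To make the distinction concrete, here is a rough sketch in Python. It
is only an illustration of the two readings, not how the executor
accounts for memory; the function names and the 4MB work_mem value are
assumptions made up for this example.

```python
# Illustration only: compares the two ways a reader might interpret the
# docs. All names and values here are hypothetical.

WORK_MEM_KB = 4096  # assumed work_mem setting of 4MB


def peak_if_one_after_another(op_limits_kb):
    """Peak memory if each sort/hash finished before the next started."""
    return max(op_limits_kb, default=0)


def peak_if_in_progress_together(op_limits_kb):
    """Upper bound on peak memory if all operations overlap in time."""
    return sum(op_limits_kb)


# A complex query with three sort operations, each allowed up to work_mem.
ops = [WORK_MEM_KB] * 3
sequential_peak = peak_if_one_after_another(ops)
overlapping_peak = peak_if_in_progress_together(ops)
```

With the first reading the query never needs more than one work_mem;
with the second it may need several times that, which is the point the
docs should get across.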

> I also find this wording a bit further down to be poor:
>
> Hash-based operations are generally more sensitive to memory
> availability than equivalent sort-based operations. The
> memory available for hash tables is computed by multiplying
> <varname>work_mem</varname> by
> <varname>hash_mem_multiplier</varname>. This makes it
>
> I think "available" is not le mot juste, and it's also unclear from
> this whether we're speaking of the per-hash-table limit or some
> (nonexistent) overall limit. How about
>
> - memory available for hash tables is computed by multiplying
> + memory limit for a hash table is computed by multiplying

+1
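
For what it's worth, the per-hash-table reading can be sketched as
below. This is plain arithmetic written in Python, not PostgreSQL code;
the function name and the sample settings (work_mem of 4MB,
hash_mem_multiplier of 2.0) are assumptions for illustration.

```python
# Illustration only: the limit applies to each hash table separately;
# there is no overall limit across the whole query.

def hash_table_limit_kb(work_mem_kb: int, hash_mem_multiplier: float) -> int:
    """Memory limit for a single hash table, in kB."""
    return int(work_mem_kb * hash_mem_multiplier)


# Assumed settings: work_mem = 4MB, hash_mem_multiplier = 2.0
per_table_kb = hash_table_limit_kb(4096, 2.0)

# Two hash tables in progress at once are each held to that limit, so
# together they may use up to twice as much.
two_tables_kb = 2 * per_table_kb
```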

Best regards,
Gurjeet https://Gurje.et
Postgres Contributors Team, http://aws.amazon.com
