Re: postgres documentation - proposed improvement/clarification

From: "Graeme B(dot) Bell" <grb(at)skogoglandskap(dot)no>
To: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: postgres documentation - proposed improvement/clarification
Date: 2015-06-04 09:02:08
Message-ID: DA2D80A5-4717-48A0-9BB8-3E59C3452A00@skogoglandskap.no
Lists: pgsql-bugs

>>
>> permitted values (in kB): -1 (auto-tuning) and 32-65536.
> ===> what does '32-65536' mean? I know what it means, but if someone is very stressed and looking at it for the first time, it looks like nonsense!

Indeed, this is an interesting special case that I used to draw out this problem.
The documentation mentions that values below 32 are treated as 32.
But I don't think we should try to squeeze the entire documentation into that line.

In this particular example though, saying 0-65536 might be considered misleading since some of those values change into 32.
32-65536 is confusing in a slightly different sense, because 0-31 are actually valid possibilities, but they definitely don't do what you'd expect.
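
To make the difference between "accepted" and "effective" values concrete, here is a minimal sketch of the behaviour as the wal_buffers documentation describes it. It works in kB, as this thread does. The function name, the parameter names, and the 16MB segment default are assumptions for illustration only; this is not PostgreSQL's actual implementation.

```python
def effective_wal_buffers_kb(setting_kb, shared_buffers_kb, wal_segment_kb=16 * 1024):
    """Return the size (in kB) that a wal_buffers setting would result in.

    Hypothetical helper following the documented rules:
      * -1 selects 1/32nd of shared_buffers, but not less than 64kB
        and not more than one WAL segment (assumed 16MB here).
      * any other value below 32kB is treated as 32kB.
    """
    if setting_kb < -1:
        # Assumption: anything below -1 is simply rejected in this sketch.
        raise ValueError("setting must be -1 or a non-negative number of kB")
    if setting_kb == -1:
        auto_kb = shared_buffers_kb // 32
        return max(64, min(auto_kb, wal_segment_kb))
    # 0-31 are accepted, but silently become 32.
    return max(32, setting_kb)


# shared_buffers = 128MB, default setting: auto-tuned to 4MB
print(effective_wal_buffers_kb(-1, 128 * 1024))
# an accepted but not 'sensible' value: 8 is raised to 32
print(effective_wal_buffers_kb(8, 128 * 1024))
```

The second call is the case that makes "0-65536" misleading: the input is accepted, but the value in effect is not the one that was typed.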

The underlying issue is:

a) should you list the complete range of inputs that postgresql will accept as a permitted value?
b) or should you list the complete range of 'sensible' inputs that postgresql will accept as a permitted value?

Perhaps you are right and (a) is an easier choice to maintain.

Graeme.

On 02 Jun 2015, at 21:19, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz> wrote:

> On 02/06/15 23:58, Graeme B. Bell wrote:
>> Hi everyone
>>
>> The documentation for postgres is generally great, but I noticed a problem today while using the doc webpages to reply to a user on the pgsql-performance mailing list.
>>
>> The problem relates to how default settings are communicated in the documentation. Keep in mind that not all postgresql admins have English as their first language, so it should not be necessary to guess from the phrasing of a paragraph about what is set by default. Also, many people keep around older config files, or don't have a vanilla postgresql.conf file handy to check for reference (and you never know, someone might have modified your vanilla reference .conf...). So the documentation is for many the primary reference and it should be clear exactly what postgres does in the absence of actively chosen configuration settings. It should also be clear what those settings can be, and how they should be entered.
>>
>> Take a look at this page, as an example: http://www.postgresql.org/docs/9.4/static/runtime-config-wal.html
>>
>> Thoughts:
>>
>> 1. Default values are not always specified for each setting, but should be.
>> Example: documentation for fsync (boolean) doesn't have the default specified.
>>
>>
>> 2. Default values are not specified in a consistent place or style in the text.
>> Examples: take a look at
>> wal_level (enum)
>> full_page_writes (boolean)
>> wal_buffers (integer)
>>
>>
>> 3. Information about default values is sometimes mixed into longer sentences on another topic. This isn't a big problem but it makes it harder to spot the default value in the paragraph.
>> Example:
>> wal_buffers (integer)
>>
>>
>> 4. Default values are sometimes documented in a slightly different style or format to their actual use in the config file. For example, integers like 5 are given as text 'five'. This isn't a big problem but it makes it harder to find the default value in the paragraph; you're looking for an integer in the text, but the number is written as a string. It might be better to break the writing convention of putting some numbers as text in English. This is a document explaining what to type into the config file. Examples or defaults should always be valid cases if copied directly into the config file.
>>
>> Example:
>> commit_delay (integer)
>> "The default commit_delay is zero (no delay)" (actual commit_delay default is '0', of course, not 'zero')
>> vs.
>> checkpoint_completion_target (floating point)
>> "The default is 0.5."
>>
>>
>> 5. Where the type is specified as 'boolean', the normal & default values are not 'true/false' or '1/0', as would be expected for a boolean typed parameter. Yes, I know on/off is also boolean, but I bet if you surveyed 100 programmers and asked them about likely default values for a boolean setting, few would say 'on' in reply. It actually makes me wonder if this is better described to users as a 2-value enum type.
>> Example:
>> full_page_writes (boolean)
>> "The default is on."
>>
>>
>> 6. The present method of documenting the datatype alongside the name isn't actually that helpful for most people reading the documentation. How many readers are helped by knowing that wal_sync_method is an (enum) as the first thing they read about it?
>>
>>
>> 7. Default units? And should units be included in the setting value?
>>
>> Look at this example. Can anyone tell me, using *only* reference to this parameter documentation, if the parameter can be set to "8", "8kB", "8KB" or "8MB" in the config file?
>> Again, using only this documentation, can you tell for certain whether '8' will be interpreted as bytes, as kB, or as a configuration error?
>>
>> =====
>> wal_buffers (integer)
>> The amount of shared memory used for WAL data that has not yet been written to disk. The default setting of -1 selects a size equal to 1/32nd (about 3%) of shared_buffers, but not less than 64kB nor more than the size of one WAL segment, typically 16MB. This value can be set manually if the automatic choice is too large or too small, but any positive value less than 32kB will be treated as 32kB. This parameter can only be set at server start.
>>
>> The contents of the WAL buffers are written out to disk at every transaction commit, so extremely large values are unlikely to provide a significant benefit. However, setting this value to at least a few megabytes can improve write performance on a busy server where many clients are committing at once. The auto-tuning selected by the default setting of -1 should give reasonable results in most cases.
>> =====
>>
>>
>>
>> Proposed solutions.
>>
>> Perhaps it might be worth extending or replacing the type information in the header, by including info about the default, possibly replacing the type info at that part of the document.
>>
>>
>> e.g. How about this style?
>>
>> synchronous_commit (default: on)
>>
>> Specifies whether transaction commit will wait for WAL records to be...
>>
>>
>> or this style?
>>
>> synchronous_commit (enum, default: on)
>>
>> Specifies whether transaction commit will wait for WAL records to be...
>>
>>
>> or this?
>>
>> synchronous_commit (enum)
>> permitted values: on, remote_write, local, off
>> default: on
>>
>>
>> wal_buffers (integer)
>> permitted values (in kB): -1 (auto-tuning) and 32-65536.
> ===> what does '32-65536' mean? I know what it means, but if someone is very stressed and looking at it for the first time, it looks like nonsense!
>> default: -1
>>
>>
>>
>>
>> In most cases, this information is there in the paragraph somewhere, but presenting the config option in this way would make it easier to refer to without needing to parse and understand the entire description to understand the default and permitted settings.
>>
>> This would make it easier for people to quickly check how their server is set up if the config file a) lacks the setting, b) may have been modified in the past, or c) may have been retained from a previous version of postgres with different defaults.
>>
>> It also means that we don't need e.g. duplicate specification of default values in the text description - e.g. take a look at wal_buffers (integer), which specifies it twice.
>>
>> Thoughts?
>>
>> Graeme Bell
>>
>>
>>
>>
>>
>>
> I suggest that boolean values should use either true or false, consistently.
>
>
> Cheers,
> Gavin
