Re: Should we put command options in alphabetical order in the doc?

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should we put command options in alphabetical order in the doc?
Date: 2023-04-19 21:33:47
Message-ID: CAAKRu_aN7O4XVt+fqMqA6eQHa4P_ZJZQvEiapgrRVWZegzx-4w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 19, 2023 at 2:39 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Wed, Apr 19, 2023 at 3:04 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > > While I'm certain that nobody will agree with me on every little
> > > detail, I have to imagine that most would find my preferred ordering
> > > quite understandable and unsurprising, at a high level -- this is not
> > > a hopelessly idiosyncratic ranking, that could just as easily have
> > > been generated by a PRNG. People may not easily agree that "apples are
> > > more important than oranges, or vice-versa", but what does it matter?
> > > I've really only put each option into buckets of items with *roughly*
> > > the same importance. All of the details beyond that don't matter to
> > > me, at all.
> >
> > I agree with you that roughly bucketing items is a good approach.
> > Within each bucket we can then sort alphabetically.
>
> I think of these buckets as working at a logarithmic scale. The FULL,
> FREEZE, VERBOSE, and ANALYZE options are multiple orders of magnitude
> more important than most of the other options, and maybe one order of
> magnitude more important than the PARALLEL, TRUNCATE, and
> INDEX_CLEANUP options. With differences that big, you have a structure
> that generalizes across all users quite well. This doesn't seem
> particularly subjective.

I actually favor query/command order followed by alphabetical order for
most of the commands David included in his patch.

Of course the parameter argument types, like boolean and integer, should
be grouped together separate from the main parameters. David fit this
into the alphabetical paradigm by doing uppercase alphabetical followed
by lowercase alphabetical. There are some specific cases where I think
this isn't working quite as intended in his patch. I've called those out
in my command-by-command code review below.

I actually think we should consider having a single location which
defines all argument types for all SQL command parameters. Then we
wouldn't need to define them for each command. We could simply link to
the definition from the synopsis. That would clean up these lists quite
a bit. Perhaps there is some variation from command to command in the
actual definitions, though (I haven't checked). I would be happy to try
and write this patch if folks are interested in the idea.

As for alphabetical ordering vs importance ordering: while I do think
that if a user does not know what parameter they are looking for, an
alphabetical ordering is unhelpful, I also think the primary issue with
grouping them by "importance" is that it is difficult to maintain. Doing
so requires a discussion of importance for every new option added. That
seems like an annoying bit of overhead to give ourselves. Having a
subjective ordering seems worse than having a rule-based ordering. I
think command/query order followed by alphabetical order is a reasonable
rule-based ordering.

I went and took a look at some of the other SQL commands' documentation
and noticed that they are all pretty different (for good reason).

ALTER ROLE parameters [1], for example, have a seemingly meaningless
order except for the fact that there are pairs of parameters. SUPERUSER
and NOSUPERUSER, INHERIT and NOINHERIT, etc. It might be a bit odd for
these to follow an absolute alphabetical ordering rule.

Many of the CREATE type SQL commands don't really have this problem
because there are only one or two options within each section of the
command and otherwise the order the parameters must appear in the query
dictates their order [2].

Others, like EXPLAIN [3], for example, obviously benefit from an
alphabetical ordering of parameters -- which David has done in the
patch. I think most of the commands that David has patched here are
good candidates for alphabetical ordering.

Below I've reviewed each command in the patch specifically:

For ANALYZE, I think this looks good in its new alphabetized form.
Though table_name is alphabetically last for the lower case parameters
and thus doesn't pose an issue, if it were alphabetically earlier, I
would still favor putting it at the end to maintain a query order then
alphabetical order ordering.

For CLUSTER, I think alphabetical order isn't working well. I think we
should maintain query order followed by alphabetical order. Even though
table_name is optional, in the event that it is included, it would
precede index_name. So, perhaps the order should be VERBOSE, boolean,
table_name, index_name -- which pretty much cancels out alphabetizing.

For COPY, I think the new ordering of COPY has some issues. table_name
is no longer first even though for COPY FROM it is required before the
other parameters. I think this is confusing. Perhaps the options should
be after the other parameters are defined. I think having the options
alphabetized at the end of the others would be nice. So, my suggested
ordering is table_name, column_name, filename, PROGRAM, STDIN, STDOUT,
then the WITH options alphabetically, WHERE, and then the parameter
argument types alphabetically. The last one (where to put the parameter
argument types) I'm not so sure about.

EXPLAIN looks good to me as is.

For REINDEX, I would again suggest a query ordering followed by
alphabetical ordering. CONCURRENTLY, TABLESPACE, VERBOSE, DATABASE,
INDEX, SCHEMA, SYSTEM, TABLE, name, then all of the parameter argument
types alphabetically. (Also, you can put CONCURRENTLY in two different
places in the REINDEX command?)

For VACUUM, I'd perhaps suggest the options in alphabetical order
followed by table_name and then column_name and then putting the
parameter argument types at the end alphabetically.

Of course, we could decide VACUUM is special and group its options by
importance because this is especially helpful for users. I think that
there are other SQL commands whose options' importance is not
particularly worth debating.

I do think we should consider deprecating and dropping documentation of
the options that are supported without parentheses (relevant to commands
like ANALYZE, CLUSTER, VACUUM, and others). It is fine if we keep the
code to make ANALYZE VERBOSE work, but I don't think it is useful to
keep that documented. That is not a concern of this patch, however.

- Melanie

[1] https://www.postgresql.org/docs/devel/sql-alterrole.html
[2] https://www.postgresql.org/docs/devel/sql-createindex.html
[3] https://www.postgresql.org/docs/devel/sql-explain.html

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2023-04-19 21:44:21 Re: [PATCH] Allow Postgres to pick an unused port to listen
Previous Message Andres Freund 2023-04-19 19:20:51 Re: Wrong results from Parallel Hash Full Join