Re: Add A Glossary

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Jürgen Purtz <juergen(at)purtz(dot)de>, pgsql-hackers(at)postgresql(dot)org, Pg Docs <pgsql-docs(at)lists(dot)postgresql(dot)org>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Roger Harkavy <rogerharkavy(at)gmail(dot)com>
Subject: Re: Add A Glossary
Date: 2020-05-20 11:38:28
Message-ID: ffda7f2726ed86fa4199a6499b6307c0f8ccefc7.camel@cybertec.at
Lists: pgsql-docs pgsql-hackers

On Wed, 2020-05-20 at 13:17 +0200, Jürgen Purtz wrote:
> > FWIW, I feel somewhat like Alvaro on that point; I use those terms synonymously,
> > perhaps distinguishing between a "started cluster" and a "stopped cluster".
> > After all, "cluster" refers to "a cluster of databases", which are there, regardless
> > if you start the server or not.
> >
> > The term "cluster" is unfortunate, because to most people it suggests a group of
> > machines, so the term "instance" is better, but that ship has sailed long ago.
> >
> > The static part of a cluster to me is the "data directory".
>
> cluster/instance: The different nature (static/dynamic) of what I
> call "cluster" and "instance" as well as the existence of the two
> commands "initdb — create a new PostgreSQL database cluster" and
> "pg_ctl — initialize, start, stop, or control a PostgreSQL server"
> confirms me in my opinion that we need two different terms for
> them.

I think that the "pg_ctl" example does not apply:
It does not talk about starting the cluster, but about starting the server process,
that is "server" in the way I understand it.

> There are situations where we need a single term for both of
> them. "Instance and its data directory" or "Instance and its
> cluster" are too wordy. In many cases we use "database server" or
> "server" in this sense. Imo "Server" is too short and ambiguous.
> "database server", the plural form "databases server", or the new
> term "cluster server", which is more accurate, would be ok for me.
> (Similar to "server", the term "cluster" is also used in many
> different contexts - but only outside of the PG world; within our
> context "cluster" is not ambiguous.)

That does not feel right to me.

"cluster server", ouch. "databases server", ouch as well.

I never felt the term "cluster" was unclear in these contexts.
Sometimes it means "data directory", sometimes it is used for "server process",
but I think few people would think one could connect to a data directory
or create a process in a directory (initdb).

I think clarity is a Good Thing, but it can be overdone.

> > > server/host: We need a term to describe the underlying hardware respectively
> > > the virtual machine or container, where PG is running. I suggest to use both
> > > *server* and *host*. In computer science, both have their eligibility and are
> > > widely used. Everybody understands *client/server architecture* or *host* in
> > > TCP/IP configuration. We cannot change such matter of course. I suggest to
> > > use both depending on the context, but with the same meaning: "real hardware,
> > > a container, or a virtual machine".
> >
> > On this I have a strong opinion because of my Unix mindset.
> > "machine" and "host" are synonyms, and it doesn't matter to the database if they
> > are virtualized or not. You can always disambiguate by adding "virtual" or "physical".
> >
> > A "server" is a piece of software that responds to client requests, never a machine.
> > In my book, this is purely Windows jargon. The term "client-server architecture"
> > that you quote emphasized that.
> >
> > Perhaps "machine" would be the preferable term, because "host" is more prone to
> > misunderstandings (except in a networking context).
>
> server/host: I agree that we are not interested in the question
> whether there is real hardware or any virtualization container. We
> are even not interested in the operating system. Our primary
> concern is the existence of a port of the Internet Protocol. But
> is the term "server" appropriate to name an IP-port? Additionally,
> "server" is used for other meanings: a) the previously mentioned
> "database server" b) a (virtual) machine: "server-side", "... the
> file ... loaded by the server ..." c) binaries "... the server
> must be built with SSL support ..." d) whenever it seems to be
> appropriate: "standby server", "... the server parses query ...",
> "server configuration", "server process".

You are most thorough :^)

> Because of its ambiguous usage, the definition of "server" must
> clarify the allowed meanings. What's about:
>
> server: Depending on the context, the term *server* denotes:
>
> An IP-port which is offered by any OS. ?????

A port is a server? No way.

> A - possibly virtualized - machine

It might be good to disambiguate that, but I don't think that the PostgreSQL
documentation should use the word "server" to mean "machine".

> An abbreviation for the slightly longer term
> "database(s)/cluster server" ??? this will support the
> readability, but not the clarity ???

"Server" is short for "database server" and is a set of processes that listen
for and handle incoming database client requests.

I think that covers all the meanings you quoted from the documentation,
except c), where it is used as shorthand for "server executable".

Yours,
Laurenz Albe
