Re: libpq compression

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Daniil Zakhlystov <usernamedt(at)yandex-team(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, "o(dot)bartunov(at)postgrespro(dot)ru" <o(dot)bartunov(at)postgrespro(dot)ru>
Subject: Re: libpq compression
Date: 2020-11-05 18:07:34
Message-ID: CAEze2WjTzoUPgyjGkvzDe8yXAtBngNBEfb4i4C3F3Eb73WhDLw@mail.gmail.com
Lists: pgsql-hackers

On Thu, 5 Nov 2020 at 17:01, Konstantin Knizhnik
<k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>
> Sorry, I do not understand your point.
> This view reports network traffic from the server's side.
> But the client's traffic information is a "mirror" of this statistic: server_tx=client_rx and vice versa.
>
> Yes, the first few bytes exchanged by client and server during the handshake are not compressed.
> But they are correctly counted as "raw bytes". And certainly these few bytes cannot have any influence on
> the measured average compression ratio (the main goal of using this network traffic statistic, from my point of view).

As I understand it, the current metrics are as follows:

Server
 |<- |<- Xx_raw_bytes
 |  Compression
 |   |<- Xx_compressed_bytes
Client connection
 |
Network

From the view's name, 'pg_stat_network_traffic', to me 'Xx_raw_bytes'
would indicate the number of bytes sent/received over the client
connection (e.g. measured between the Client connection and Network
part, or between the Server/Client connection and Compression/Client
connection sections), because that is my natural understanding of
'raw tx network traffic'. This is why I proposed 'logical' instead
of 'raw', as 'raw' is quite apparently understood differently when
interpreted by different people, whereas 'logical' already implies
that the value is an application logic-determined value (e.g. size
before compression).

The current name implies a 'network' viewpoint when observing this
view, not the 'server'/'backend' viewpoint you describe. If the
'server'/'backend' viewpoint is the desired default viewpoint, then
I suggest renaming the view to `pg_stat_network_compression`, as
that moves the focus to the compression used, and subsequently
clarifies `raw` as the raw application command data.

If instead the name `pg_stat_network_traffic` is kept, I suggest
changing the metrics collected to the following scheme:

Server
 |<- |<- Xx_logical_bytes
 |  Compression
 |   |<- Xx_compressed_bytes (?)
 |<- |<- Xx_raw_bytes
Client connection
 |
Network

This way, `raw` in the context of 'network_traffic' means
"sent-over-the-connection"-data, and 'logical' is 'application logic'
-data (as I'd expect from both a network and an application point of
view). 'Xx_compressed_bytes' is a nice addition, but not strictly
necessary, as you can subtract raw from logical to derive the bytes
saved by compression.
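
To make the accounting concrete, here is a rough sketch of how the
three counters in the scheme above could relate to each other. It is
an illustration only, not the patch's implementation: the counter
names follow the proposal, `TxCounters`, `send` and `bytes_saved` are
hypothetical helpers, and zlib merely stands in for whatever
algorithm the connection negotiates.

```python
import zlib
from dataclasses import dataclass


@dataclass
class TxCounters:
    """Hypothetical per-connection counters, named after the proposed scheme."""
    tx_logical_bytes: int = 0     # application data, measured before compression
    tx_compressed_bytes: int = 0  # output of the compressor only
    tx_raw_bytes: int = 0         # everything written to the client connection


def send(counters, payload, compressor=None):
    """Account for one outgoing message and return the bytes put on the wire."""
    counters.tx_logical_bytes += len(payload)
    if compressor is None:
        # Assumption: some messages (e.g. the handshake) bypass compression.
        wire = payload
    else:
        wire = compressor.compress(payload) + compressor.flush(zlib.Z_SYNC_FLUSH)
        counters.tx_compressed_bytes += len(wire)
    counters.tx_raw_bytes += len(wire)
    return wire


def bytes_saved(counters):
    """Bytes saved by compression, derived from logical and raw alone."""
    return counters.tx_logical_bytes - counters.tx_raw_bytes


counters = TxCounters()
send(counters, b"startup packet")                         # uncompressed
send(counters, b"SELECT 1;" * 1000, zlib.compressobj())   # compressed
print(counters, bytes_saved(counters))
```

In this sketch the uncompressed handshake bytes show up in both the
logical and the raw counter, so they cancel out of the difference,
and `Xx_compressed_bytes` is not needed to derive the savings.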
