Re: libpq compression

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Daniil Zakhlystov <usernamedt(at)yandex-team(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, "o(dot)bartunov(at)postgrespro(dot)ru" <o(dot)bartunov(at)postgrespro(dot)ru>
Subject: Re: libpq compression
Date: 2020-11-06 06:18:48
Message-ID: e1ab60c4-894e-04e2-b42d-3bb4e48a8256@postgrespro.ru
Lists: pgsql-hackers

On 05.11.2020 21:07, Matthias van de Meent wrote:
> On Thu, 5 Nov 2020 at 17:01, Konstantin Knizhnik
> <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>> Sorry, I do not understand your point.
>> This view reports network traffic from the server's side.
>> But the client's traffic information is a "mirror" of this statistic: server_tx=client_rx and vice versa.
>>
>> Yes, the first few bytes exchanged by client and server during the handshake are not compressed.
>> But they are correctly counted as "raw bytes". And certainly these few bytes cannot have any influence on
>> the measured average compression ratio (the main goal of using this network traffic statistic, from my point of view).
> As I understand it, the current metrics are as follows:
>
> Server
> |<- |<- Xx_raw_bytes
> | Compression
> | |<- Xx_compressed_bytes
> Client connection
> |
> Network
>
> From the view's name 'pg_stat_network_traffic', to me 'Xx_raw_bytes'
> would indicate the amount of bytes sent/received over the client
> connection (e.g. measured between the Client connection and Network
> part, or between the Server/Client connection and Compression/Client
> connection sections), because that is my natural understanding of
> 'raw tx network traffic'. This is why I proposed 'logical' instead
> of 'raw', as 'raw' is quite apparently understood differently when
> interpreted by different people, whereas 'logical' already implies
> that the value is an application logic-determined value (e.g. size
> before compression).
>
> The current name implies a 'network' viewpoint when observing this
> view, not the 'server'/'backend' viewpoint you describe. If the
> 'server'/'backend' viewpoint is the desired default viewpoint, then
> I suggest to rename the view to `pg_stat_network_compression`, as
> that moves the focus to the compression used, and subsequently
> clarifies `raw` as the raw application command data.
>
> If instead the name `pg_stat_network_traffic` is kept, I suggest
> changing the metrics collected to the following scheme:
>
> Server
> |<- |<- Xx_logical_bytes
> | Compression
> | |<- Xx_compressed_bytes (?)
> |<- |<- Xx_raw_bytes
> Client connection
> |
> Network
>
> This way, `raw` in the context of 'network_traffic' means
> "sent-over-the-connection"-data, and 'logical' is 'application logic'
> -data (as I'd expect from both a network and an application point of
> view). 'Xx_compressed_bytes' is a nice addition, but not strictly
> necessary, as you can subtract raw from logical to derive the bytes
> saved by compression.
Sorry, but "raw" in this context means "not transformed", i.e. not
compressed.
I did not use the term "uncompressed", because it implies that there are
"compressed" bytes, which is not true if compression is not used.
So "raw" bytes are not the bytes which we send through the network. Quite
the opposite: the application writes "raw" (uncompressed) data,
it is compressed, and then the compressed bytes are sent.
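
As an illustration of this data flow, here is a minimal sketch in Python
with zlib. It is not the patch code: the TrafficCounters class, the send()
helper and the rule that the compressed counter stays at zero when
compression is off are all assumptions made for the example. Only the
raw/compressed naming comes from the discussion above.

```python
import zlib


class TrafficCounters:
    """Hypothetical per-connection counters (illustrative names only)."""

    def __init__(self):
        self.tx_raw_bytes = 0         # bytes written by the application
        self.tx_compressed_bytes = 0  # bytes produced by the compressor


def send(payload, counters, compressor=None):
    """Return the bytes that would be put on the wire, updating counters."""
    counters.tx_raw_bytes += len(payload)
    if compressor is None:
        # Assumption: without compression there are no "compressed" bytes,
        # so only the raw counter moves.
        return payload
    # Z_SYNC_FLUSH makes all input so far decodable by the receiver.
    wire = compressor.compress(payload) + compressor.flush(zlib.Z_SYNC_FLUSH)
    counters.tx_compressed_bytes += len(wire)
    return wire


payload = b"SELECT 1;" * 1000
counters = TrafficCounters()
wire = send(payload, counters, zlib.compressobj())

# Average compression ratio: raw bytes divided by compressed bytes.
ratio = counters.tx_raw_bytes / counters.tx_compressed_bytes
```

In this sketch "raw" is counted before the compressor and "compressed" is
what leaves it, so the ratio of the two is the average compression ratio
mentioned earlier in the thread.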

Maybe I am wrong, but the term "logical" is much more confusing and
overloaded than "raw".
Especially taking into account that it is widely used in Postgres for
logical replication.
The antonym of "logical" is "physical", i.e. something materialized.
But in the case of data exchanged between client and server, which one can
be named physical, and which one logical?
Have you ever heard of the logical size of a file (assuming that it may
contain holes or be compressed by the file system)?
In zfs it is called "apparent" size.

Also, I do not understand from your picture why Xx_compressed_bytes may be
different from Xx_raw_bytes.
