Re: libpq compression

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Даниил Захлыстов <usernamedt(at)yandex-team(dot)ru>
Subject: Re: libpq compression
Date: 2020-10-29 15:11:36
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28.10.2020 22:58, Alvaro Herrera wrote:
> On 2020-Oct-26, Konstantin Knizhnik wrote:
>> + while (bufptr < bufend || zpq_buffered(PqStream) != 0) /* has more data to flush or unsent data in internal compression buffer */
>> {
>> - int r;
>> -
>> - r = secure_write(MyProcPort, bufptr, bufend - bufptr);
>> -
>> - if (r <= 0)
>> + int r;
>> + size_t processed = 0;
>> + size_t available = bufend - bufptr;
>> + r = PqStream
>> + ? zpq_write(PqStream, bufptr, available, &processed)
>> + : secure_write(MyProcPort, bufptr, available);
>> + bufptr += processed;
>> + PqSendStart += processed;
> This bit is surprising to me. I thought the whole zpq_write() thing
> should be hidden inside secure_write, so internal_flush would continue
> to call just secure_write; and it is that routine's responsibility to
> call zpq_write or be_tls_write or secure_raw_write etc according to
> compile-time options and socket state.
Sorry, may be it is not the nicest way of coding.
Ideally, we should use "decorator design pattern" here where each layer
(compression, TLS,...) is implemented as separate decorator class.
This is how io streams are implemented in java and many other SDKs.
But it requires  too much redesign of Postgres code.

Also from my point of view the core of the problem is that in Postgres
there are two almost independent implementation of networking code for
IMHO it is better to have some SAL (system abstraction layer) which
hides OS specific stuff from rest of the system and which can be shared
by backend and frontend.

In any case I failed to implement it in different way.
The basic requirements was:
1. zpq code should be used both by backend and frontent.
2. decompressor may need to perform multiple reads from underlying layer
to fill its buffer and be able to produce some output.
3. minimize changes in postgres code
4. be able to use zpq library in some other tools (like pooler)

This is why zpq_create function is given tx/rx functions to pefrom IO
secure_write is such tx function for backend (and pqsecure_write for

Actually the name of this function secure_write assumes that it deals
only with TLS, not with compression.
Certainly it is possible to rename it or better introduce some other
functions, i.e. stream_read which will perform this checks.
But please notice that it is not enough to perform all checks in one
functions as you suggested.  It should be really pipe each component of
which is doing its own job:
encryption, compression....

As for me I prefer to have in this place indirect function calls.
But there are several reasons (at least different prototypes of the
which makes me choose the current way.

In any case: there are many different ways of doing the same task.
And different people have own opinion about best/right way of doing it.
Definitely there are some objective criteria: encapsulation, lack of
code duplication,
readability of code,...

I tried to find the best approach base on my own preferences and
requirements described above.
May be I am wrong but then I want to be convinced that suggested
alternative is better.
From my point of view calling compressor from function named
secure_read is not right solution...

Konstantin Knizhnik
Postgres Professional:
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Borisov 2020-10-29 15:14:50 Re: POC: GROUP BY optimization
Previous Message Dmitry Dolgov 2020-10-29 14:50:25 Re: POC: GROUP BY optimization