Re: Support load balancing in libpq

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support load balancing in libpq
Date: 2022-07-05 12:42:14
Message-ID: CALj2ACXywtN=EhvD_Qi1CxqniwwA4YT0pTz+VKeZ3bLAt2+Lvw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 10, 2022 at 10:01 PM Jelte Fennema
<Jelte(dot)Fennema(at)microsoft(dot)com> wrote:
>
> Load balancing connections across multiple read replicas is a pretty
> common way of scaling out read queries. There are two main ways of doing
> so, both with their own advantages and disadvantages:
> 1. Load balancing at the client level
> 2. Load balancing by connecting to an intermediary load balancer
>
> Option 1 has been supported by JDBC (Java) for 8 years and Npgsql (C#)
> merged support about a year ago. This patch adds the same functionality
> to libpq. The way it's implemented is the same as the implementation of
> JDBC, and contains two levels of load balancing:
> 1. The given hosts are randomly shuffled, before resolving them
> one-by-one.
> 2. Once a host its addresses get resolved, those addresses are shuffled,
> before trying to connect to them one-by-one.

Thanks for the patch. +1 for the general idea of redirecting connections.
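
To check that I read the two levels quoted above correctly, here is a
rough sketch in Python. It is not the patch's C code: connection_order,
resolve and FAKE_DNS are names made up for illustration, and the
resolver is a stand-in for real DNS lookups.

```python
import random


def connection_order(hosts, resolve, rng=None):
    """Return (host, address) pairs in the order a client would try them.

    `resolve` is a hypothetical callable mapping a host name to its
    addresses; it stands in for a real DNS lookup.
    """
    if rng is None:
        rng = random.Random()

    shuffled_hosts = list(hosts)
    rng.shuffle(shuffled_hosts)           # level 1: shuffle the given hosts

    order = []
    for host in shuffled_hosts:           # hosts are resolved one by one
        addrs = list(resolve(host))
        rng.shuffle(addrs)                # level 2: shuffle this host's addresses
        order.extend((host, addr) for addr in addrs)
    return order


# Made-up DNS data, only for the example.
FAKE_DNS = {
    "replica1": ["10.0.0.1", "10.0.0.2"],
    "replica2": ["10.0.1.1"],
    "replica3": ["10.0.2.1", "10.0.2.2", "10.0.2.3"],
}

for host, addr in connection_order(FAKE_DNS, FAKE_DNS.__getitem__):
    print(host, addr)
```

Note that, under this reading, all addresses of one host are tried
before moving on to the next host; the shuffle never interleaves
addresses of different hosts.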

For reference, here is a previous attempt by Satyanarayana Narlapuram
on this topic [1]; it also has a patch set.

IMO, rebalancing of the load must be based on parameters (as also
suggested by Aleksander Alekseev in this thread) such as whether the
queries are read-only or read-write, CPU/IO/memory utilization of the
primary/standby, network distance, etc. We may not have to go the extra
mile to determine all of these parameters dynamically at query
authentication time, but we can let users provide a list of standby
hosts based on "some" priority (Satya's thread [1] attempts to do
this, in a way, with users specifying the hosts via the pg_hba.conf
file). If required, randomization in choosing the hosts can be optional.

Also, IMO, the solution must have a fallback mechanism if the
standby/chosen host isn't reachable.
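
To make the priority list, the optional randomization and the fallback
concrete, here is a minimal sketch in Python. Everything in it is an
assumption for illustration: pick_host, the (priority, host) tuples and
the try_connect callable are hypothetical and are not part of libpq or
of either patch.

```python
import random
from itertools import groupby


def pick_host(candidates, try_connect, randomize=False, rng=None):
    """Return the first reachable host, or None if none is reachable.

    candidates:  iterable of (priority, host); lower values are tried first.
    try_connect: hypothetical callable returning True if the host accepts
                 a connection.
    randomize:   if True, shuffle hosts that share the same priority.
    """
    if rng is None:
        rng = random.Random()

    # sorted() is stable, so hosts keep their given order within a priority.
    ordered = sorted(candidates, key=lambda c: c[0])
    for _, group in groupby(ordered, key=lambda c: c[0]):
        tier = [host for _, host in group]
        if randomize:
            rng.shuffle(tier)
        for host in tier:
            if try_connect(host):
                return host
        # Nothing reachable at this priority: fall back to the next one.
    return None


# Made-up host names; the two preferred standbys are treated as down.
candidates = [
    (1, "standby-near-1"),
    (1, "standby-near-2"),
    (2, "standby-far"),
    (3, "primary"),
]
down = {"standby-near-1", "standby-near-2"}
chosen = pick_host(candidates, lambda host: host not in down)
print(chosen)
```

With both priority-1 standbys unreachable, the sketch falls back to
"standby-far"; with randomize=True only the order inside a priority
level changes, never the order between levels.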

A few thoughts on the patch:
1) How are we determining whether the submitted query is read-only or
a write?
2) What happens for explicit transactions? The queries belonging to the
same txn get executed on the same host, right? How are we guaranteeing
this?
3) Wouldn't it be good to provide a way to test the patch?

[1] https://www.postgresql.org/message-id/flat/CY1PR21MB00246DE1F9E9C58455A78A37915C0%40CY1PR21MB0024.namprd21.prod.outlook.com

Regards,
Bharath Rupireddy.
