Re: [PATCH] pgbench: add multiconnect option

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: David Christensen <david(dot)christensen(at)crunchydata(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pgbench: add multiconnect option
Date: 2021-08-28 09:01:50
Message-ID: alpine.DEB.2.22.394.2108281055020.3654177@pseudo
Lists: pgsql-hackers

Hello David,

>>> round-robin and random make sense. I am wondering how round-robin
>>> would work with -C, though? Would you just reuse the same connection
>>> string as the one chosen at the starting point.
>> Well, not necessarily, but this is debatable.
> My expectation for such a behavior would be that it would reconnect to
> a random connstring each time, otherwise what's the point of using
> this with -C? If we needed to forbid some option combinations that is
> also an option.

Yep. ISTM that it should follow the connection policy/strategy, whatever
it is.
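
As a rough illustration of "following the policy" on every reconnection, here is a minimal sketch. It is not taken from the patch: `ConnPolicy`, `ConnChooser` and `choose_conninfo` are hypothetical names, and the idea is only that with -C the chooser would be consulted once per transaction instead of once at startup.

```c
#include <stdlib.h>

/* Hypothetical types and names for illustration; not from the patch. */
typedef enum { POLICY_ROUND_ROBIN, POLICY_RANDOM } ConnPolicy;

typedef struct
{
    ConnPolicy   policy;      /* how the next connection string is picked */
    int          nconninfos;  /* number of connection strings, must be > 0 */
    const char **conninfos;   /* the connection strings themselves */
    int          next;        /* round-robin cursor */
} ConnChooser;

/*
 * Return the connection string to use for the next (re)connection.
 * Assumption: with -C this is called for each transaction, so the
 * policy applies to every reconnect and not only to the first one.
 */
const char *
choose_conninfo(ConnChooser *cc)
{
    int idx;

    if (cc->policy == POLICY_RANDOM)
    {
        /* slight modulo bias, acceptable for a sketch */
        idx = rand() % cc->nconninfos;
    }
    else
    {
        idx = cc->next;
        cc->next = (cc->next + 1) % cc->nconninfos;
    }
    return cc->conninfos[idx];
}
```

With three connection strings and the round-robin policy, successive calls return the first, second, third, and then the first string again; the random policy draws independently on each call.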

>>>> I was thinking of allowing a list of conninfo strings with
>>>> repeated options, eg --conninfo "foo" --conninfo "bla"…
>>> That was my first thought when reading the subject of this thread:
>>> create a list of connection strings and pass one of them to
>>> doConnect() to grab the properties looked for. That's a bit confusing
>>> though as pgbench does not support directly connection strings,
>> They are supported because libpq silently assumes that "dbname" can be a
>> full connection string.
>>> and we should be careful to keep fallback_application_name intact.
>> Hmmm. See attached patch, ISTM that it does the right thing.
> I guess the multiple --conninfo approach is fine; I personally liked
> having the list come from a file, as you could benchmark different
> groups/clusters based on a file, much easier than constructing
> multiple pgbench invocations depending. I can see an argument for
> both approaches. The PGSERVICEFILE was an idea I'd had to store
> easily indexed groups of connection information in a way that I didn't
> need to know all the details, could easily parse, and could later pass
> in the ENV so libpq could just pull out the information.

The attached version does work with the service file if the user provides
"service=whatever" on the command line. The main difference is that it
sticks to the libpq policy to use an explicit connection string or list of
connection strings.

Also, note that the patch I sent dropped the --conninfo option.
Connections are simply the last arguments to pgbench.

> I'll see if I can take a look at your latest patch.


> I was also wondering about how we should handle `pgbench -i` with
> multiple connection strings; currently it would only initialize with the
> first DSN it gets, but it probably makes sense to run initialize against
> all of the databases (or at least attempt to).

I tend to disagree on this one. Pgbench's whole expectation is to run
against "one" system, which might be composed of several nodes because of
replication. I do not think that it is desirable to jump to "several
fully independent databases".

> Maybe this is one argument for the multiple --conninfo handling, since
> you could explicitly pass the databases you want. (Not that it is hard
> to just loop over connection info and `pgbench -i` with ENV, or any
> other number of ways to accomplish the same thing.)


