Re: [PATCH] pgbench: add multiconnect option

From: David Christensen <david(dot)christensen(at)crunchydata(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pgbench: add multiconnect option
Date: 2021-08-27 17:29:24
Message-ID: CAOxo6X+hKRLHnUswWdUrtpZnJs=6vwc3-0i+o+-f2eFPMHAFsA@mail.gmail.com
Lists: pgsql-hackers

> >> Good. I was thinking of adding such capability, possibly for handling
> >> connection errors and reconnecting…
> >
> > round-robin and random make sense. I am wondering how round-robin
> > would work with -C, though? Would you just reuse the same connection
> > string as the one chosen at the starting point?
>
> Well, not necessarily, but this is debatable.

My expectation would be that it reconnects to a randomly chosen
connstring each time; otherwise, what's the point of using this with
-C? If we need to, forbidding some option combinations is also a
possibility.
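
To make that concrete, here is a rough Python sketch of the selection
logic I have in mind. It is not code from either patch: the class name,
the `policy` values, and `next_conninfo()` are all made up for
illustration. The idea is only that, with -C, every reconnect asks for a
fresh connection string instead of reusing the first one chosen.

```python
import random


class ConninfoPicker:
    """Chooses which connection string a client uses on each (re)connect.

    Hypothetical helper for illustration; not taken from either patch.
    """

    def __init__(self, conninfos, policy="round-robin", seed=None):
        if not conninfos:
            raise ValueError("at least one connection string is required")
        if policy not in ("round-robin", "random"):
            raise ValueError(f"unknown policy: {policy}")
        self.conninfos = list(conninfos)
        self.policy = policy
        self.rng = random.Random(seed)
        self.next_index = 0

    def next_conninfo(self):
        """Return the connection string for the next connection attempt."""
        if self.policy == "random":
            return self.rng.choice(self.conninfos)
        # round-robin: advance through the list, wrapping at the end
        conninfo = self.conninfos[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.conninfos)
        return conninfo
```

With -C, a client would call `next_conninfo()` before each transaction's
connect, so even round-robin would move on to the next entry rather than
sticking with the starting one.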

> >> I was thinking of allowing a list of conninfo strings with
> >> repeated options, eg --conninfo "foo" --conninfo "bla"…
> >
> > That was my first thought when reading the subject of this thread:
> > create a list of connection strings and pass one of them to
> > doConnect() to grab the properties looked for. That's a bit confusing
> > though as pgbench does not directly support connection strings,
>
> They are supported because libpq silently assumes that "dbname" can be a
> full connection string.
>
> > and we should be careful to keep fallback_application_name intact.
>
> Hmmm. See attached patch, ISTM that it does the right thing.

I guess the multiple --conninfo approach is fine; I personally liked
having the list come from a file, as you could benchmark different
groups/clusters just by switching files, which is much easier than
constructing a different pgbench invocation for each group. I can see
an argument for both approaches. The PGSERVICEFILE was an idea I'd had
to store easily indexed groups of connection information in a way that
I didn't need to know all the details, could easily parse, and could
later pass in the ENV so libpq could just pull out the information.

> >> Your approach using PGSERVICEFILE also makes sense!
> >
> > I am not sure that's actually needed here, as it is possible to pass
> > down a service name within a connection string. I think that you'd
> > better let libpq do all the work related to a service file, if
> > specified. pgbench does not need to know any of that.
>
> Yes, this is an inconvenience of this approach: part of the libpq
> machinery is more or less replicated in pgbench, which is quite
> annoying, and less powerful.

Only a small fraction of it is reproduced here, just enough to pull out
the named sections; no other parsing should be done though.
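
As a rough illustration of how little parsing that is, here is a Python
sketch (not the patch's code; the function names are hypothetical). It
only collects the `[name]` section headers from the service file text
and turns each into a "service=name" connection string, leaving the
key/value lines for libpq to interpret.

```python
import re

# A section header is a line such as "[db1]"; anything else is left for libpq.
SECTION_RE = re.compile(r"^\[([^\]]+)\]\s*$")


def service_names(service_file_text):
    """Return the section names of a pg_service.conf-style file, in file order."""
    names = []
    for line in service_file_text.splitlines():
        match = SECTION_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def service_conninfos(service_file_text):
    """Turn each section name into a "service=..." connection string."""
    return [f"service={name}" for name in service_names(service_file_text)]
```

For a file containing sections [db1] and [db2], this yields
"service=db1" and "service=db2"; hosts, ports and so on are never looked
at on the pgbench side.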

> Attached my work-in-progress version, with a few open issues (eg probably
> not thread safe), but comments about the provided feature are welcome.
>
> I borrowed the "strategy" option, renamed policy, from the initial patch.
> Pgbench just accepts several connection strings as parameters, eg:
>
> pgbench ... "service=db1" "service=db2" "service=db3"
>
> The next stage is to map scripts to connection types and connections
> to connection types, so that pgbench could run W transactions against a
> primary and R transactions against a hot standby, for instance. I have
> some design for that, but nothing is implemented.
>
> There is also the combination with the error handling patch to consider:
> if a connection fails, a connection to a replica could be issued instead.

I'll see if I can take a look at your latest patch. I was also
wondering about how we should handle `pgbench -i` with multiple
connection strings; currently it would only initialize with the first
DSN it gets, but it probably makes sense to run the initialization
against all of the databases (or at least attempt to). Maybe this is
one argument for the multiple --conninfo handling, since you could
explicitly pass the databases you want. (Not that it is hard to just
loop over the connection info and run `pgbench -i` with ENV, or any
number of other ways to accomplish the same thing.)
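
For what it's worth, the loop I mean is something like the following
Python sketch. The helper name and its parameters are made up, and it
assumes each target is given as a connection string in the dbname
position (as in the "service=db1" example quoted above), with
PGSERVICEFILE set in the environment so libpq can find the service
file. It only builds the invocations; it does not run anything.

```python
import os


def init_commands(conninfos, service_file=None, base_env=None):
    """Build one `pgbench -i` invocation (argv, env) per connection string.

    Nothing is executed here; the caller decides how to run the commands.
    Hypothetical helper for illustration only.
    """
    env = dict(os.environ if base_env is None else base_env)
    if service_file is not None:
        # libpq reads PGSERVICEFILE to locate the service file
        env["PGSERVICEFILE"] = service_file
    return [(["pgbench", "-i", conninfo], dict(env)) for conninfo in conninfos]
```

Each (argv, env) pair could then be handed to `subprocess.run(argv,
env=env)`, continuing past failures if we only want to "attempt" every
database.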

Best,

David
