Re: [PATCH] pg_dump: lock tables in batches

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] pg_dump: lock tables in batches
Date: 2022-12-07 17:44:39
Message-ID: 20221207174439.ii2stmiv45aghbnw@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-12-07 12:28:03 -0500, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2022-12-07 10:44:33 -0500, Tom Lane wrote:
> >> I have a strong sense of deja vu here. I'm pretty sure I experimented
> >> with this idea last year and gave up on it. I don't recall exactly
> >> why, but either it didn't show any meaningful performance improvement
> >> for me or there was some actual downside (that I'm not remembering
> >> right now).
>
> > IIRC the case we were looking at around 989596152 were CPU bound workloads,
> > rather than latency bound workloads. It'd not be surprising to have cases
> > where batching LOCKs helps latency, but not CPU bound.
>
> Yeah, perhaps. Anyway my main point is that I don't want to just assume
> this is a win; I want to see some actual performance tests.

FWIW, one can simulate network latency with 'netem' on linux. Works even for
'lo'.

ping -c 3 -n localhost

64 bytes from ::1: icmp_seq=1 ttl=64 time=0.035 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.049 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.043 ms

tc qdisc add dev lo root netem delay 10ms

64 bytes from ::1: icmp_seq=1 ttl=64 time=20.1 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=20.2 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=20.2 ms

tc qdisc delete dev lo root netem

64 bytes from ::1: icmp_seq=1 ttl=64 time=0.036 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.047 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.050 ms

> > I wonder if "manual" batching is the best answer. Alexander, have you
> > considered using libpq level pipelining?
>
> I'd be a bit nervous about how well that works with older servers.

I don't think there should be any problem - E.g. pgjdbc has been using
pipelining for ages.

Not sure if it's the right answer, just to be clear. I suspect that eventually
we're going to need to have a special "acquire pg_dump locks" function that is
cheaper than retail lock acquisition and perhaps deals more gracefully with
exceeding max_locks_per_transaction. Which would presumably not be pipelined.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2022-12-07 17:46:26 Re: Error-safe user functions
Previous Message Andres Freund 2022-12-07 17:34:27 Re: Error-safe user functions