Re: [PATCH] pg_dump: lock tables in batches

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: [PATCH] pg_dump: lock tables in batches
Date:	2022-12-07 17:44:39
Message-ID:	20221207174439.ii2stmiv45aghbnw@awork3.anarazel.de
Lists:	pgsql-hackers

Hi,

On 2022-12-07 12:28:03 -0500, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2022-12-07 10:44:33 -0500, Tom Lane wrote:
> >> I have a strong sense of deja vu here. I'm pretty sure I experimented
> >> with this idea last year and gave up on it. I don't recall exactly
> >> why, but either it didn't show any meaningful performance improvement
> >> for me or there was some actual downside (that I'm not remembering
> >> right now).
>
> > IIRC the cases we were looking at around 989596152 were CPU-bound workloads,
> > rather than latency-bound workloads. It'd not be surprising to have cases
> > where batching LOCKs helps latency-bound workloads, but not CPU-bound ones.
>
> Yeah, perhaps. Anyway my main point is that I don't want to just assume
> this is a win; I want to see some actual performance tests.

FWIW, one can simulate network latency with 'netem' on Linux. It works even for
'lo'.

ping -c 3 -n localhost

64 bytes from ::1: icmp_seq=1 ttl=64 time=0.035 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.049 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.043 ms

tc qdisc add dev lo root netem delay 10ms

64 bytes from ::1: icmp_seq=1 ttl=64 time=20.1 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=20.2 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=20.2 ms

tc qdisc delete dev lo root netem

64 bytes from ::1: icmp_seq=1 ttl=64 time=0.036 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.047 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.050 ms
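
The ping output above shows a round trip of roughly 20 ms with a 10ms netem
delay, since both the request and the reply cross the delayed 'lo' device. A
rough way to see why that matters for a latency-bound dump is to count round
trips. The Python sketch below is only an illustration: the function names, the
10000-table count and the batch size of 100 are assumptions made up for the
example, not numbers from this thread, and the model ignores server-side CPU
time entirely.

```python
def lock_round_trips(n_tables: int, batch_size: int = 1) -> int:
    """Round trips needed if each request locks up to batch_size tables."""
    if n_tables < 0 or batch_size < 1:
        raise ValueError("n_tables must be >= 0 and batch_size >= 1")
    # Integer ceiling division: a partially filled last batch still costs a trip.
    return -(-n_tables // batch_size)


def latency_cost_seconds(n_tables: int, rtt_ms: float, batch_size: int = 1) -> float:
    """Time spent purely waiting on the network, in seconds."""
    return lock_round_trips(n_tables, batch_size) * rtt_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical schema with 10000 tables, 20 ms round trip as in the ping above.
    print(latency_cost_seconds(10000, 20.0))                  # one LOCK per trip
    print(latency_cost_seconds(10000, 20.0, batch_size=100))  # 100 tables per trip
```

Under these assumptions the unbatched case waits on 10000 round trips and the
batched case on 100, which is the kind of difference a netem-based test should
be able to confirm or refute.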

> > I wonder if "manual" batching is the best answer. Aleksander, have you
> > considered using libpq level pipelining?
>
> I'd be a bit nervous about how well that works with older servers.

I don't think there should be any problem - e.g. pgjdbc has been using
pipelining for ages.

Not sure if it's the right answer, just to be clear. I suspect that eventually
we're going to need to have a special "acquire pg_dump locks" function that is
cheaper than retail lock acquisition and perhaps deals more gracefully with
exceeding max_locks_per_transaction. Which would presumably not be pipelined.
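
For comparison with the options discussed above, one form of "manual" batching
is to name several tables in a single LOCK TABLE statement, which the LOCK
syntax allows. The Python sketch below only builds such statements as strings.
It is an illustration, not necessarily what the patch in this thread does, and
the helper names and the simplified identifier quoting are assumptions for the
example.

```python
def quote_ident(name: str) -> str:
    """Simplified identifier quoting: always quote, double embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def batched_lock_statements(tables, batch_size):
    """Build LOCK TABLE statements covering up to batch_size tables each.

    tables is a list of (schema, table) pairs. pg_dump takes ACCESS SHARE
    locks, so that mode is used here.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    statements = []
    for start in range(0, len(tables), batch_size):
        chunk = tables[start:start + batch_size]
        names = ", ".join(
            f"{quote_ident(schema)}.{quote_ident(table)}" for schema, table in chunk
        )
        statements.append(f"LOCK TABLE {names} IN ACCESS SHARE MODE")
    return statements


if __name__ == "__main__":
    demo = [("public", "a"), ("public", "b"), ("public", "c")]
    for stmt in batched_lock_statements(demo, batch_size=2):
        print(stmt)
```

Each statement still acquires one lock per table, so this reduces round trips
but does nothing about the per-lock cost or max_locks_per_transaction.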

Greetings,

Andres Freund

In response to

Re: [PATCH] pg_dump: lock tables in batches at 2022-12-07 17:28:03 from Tom Lane
