Re: Test to dump and restore objects left behind by regression

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Test to dump and restore objects left behind by regression
Date: 2024-02-22 09:16:50
Message-ID: b0635739-39f0-4a29-9127-f62aa570a2d8@eisentraut.org
Lists: pgsql-hackers

On 22.02.24 02:01, Michael Paquier wrote:
> On Wed, Feb 21, 2024 at 12:18:45PM +0530, Ashutosh Bapat wrote:
>> Even with 1 and 2 the test is useful to detect dump/restore anomalies.
>> I think we should improve 3, but I don't have a good and simpler
>> solution. I didn't find any way to compare two given clusters in our
>> TAP test framework. Building it will be a lot of work. Not sure if
>> it's worth it.
>
> + my $rc =
> + system($ENV{PG_REGRESS}
> + . " $extra_opts "
> + . "--dlpath=\"$dlpath\" "
> + . "--bindir= "
> + . "--host="
> + . $node->host . " "
> + . "--port="
> + . $node->port . " "
> + . "--schedule=$srcdir/src/test/regress/parallel_schedule "
> + . "--max-concurrent-tests=20 "
> + . "--inputdir=\"$inputdir\" "
> + . "--outputdir=\"$outputdir\"");
>
> I am not sure that it is a good idea to add a full regression test
> cycle while we have already 027_stream_regress.pl that would be enough
> to test some dump scenarios. These are very expensive and easy to
> notice even with a high level of parallelization of the tests.

The problem is, we don't really have any end-to-end coverage of

dump
restore
dump again
compare the two dumps

with a database with lots of interesting objects in it.

Note that each of these steps could fail.
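
To make the four steps concrete, here is a minimal sketch of such a
round trip. It is not the code from the patch (that is a Perl TAP test);
it is a Python illustration under these assumptions: pg_dump, createdb
and psql are on PATH, connection settings come from the usual PGHOST and
PGPORT environment variables, and the names run_checked,
dump_restore_roundtrip, dump1.sql and dump2.sql are hypothetical. Every
command is run with a failure check, because each step could fail.

```python
import subprocess
from pathlib import Path


def run_checked(cmd):
    """Run one command; raise CalledProcessError if it exits non-zero."""
    subprocess.run(cmd, check=True)


def dump_restore_roundtrip(src_db, scratch_db, workdir, run=run_checked):
    """Dump src_db, restore it into scratch_db, dump again, compare.

    Returns True when the two plain-format dumps are byte-identical.
    The `run` parameter exists only so the command runner can be
    swapped out (for example, in a dry run without a server).
    """
    workdir = Path(workdir)
    first = workdir / "dump1.sql"    # hypothetical file names
    second = workdir / "dump2.sql"

    # 1. dump
    run(["pg_dump", "--no-sync", "-f", str(first), src_db])
    # 2. restore into a fresh database, stopping at the first SQL error
    run(["createdb", scratch_db])
    run(["psql", "-X", "-q", "-v", "ON_ERROR_STOP=1",
         "-f", str(first), scratch_db])
    # 3. dump again
    run(["pg_dump", "--no-sync", "-f", str(second), scratch_db])
    # 4. compare the two dumps
    return first.read_bytes() == second.read_bytes()
```

A byte-for-byte comparison is the strictest possible check; a real test
may need to filter out differences that are expected and harmless.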

We have somewhat relied on the pg_upgrade test to provide this testing,
but we have recently discovered that the dumps in binary-upgrade mode
are different enough to not test the normal dumps well.

Yes, this test is a bit expensive. We could save some time by doing the
first dump at the end of the normal regress test and having the pg_dump
test reuse it, but that would make the regress test run a bit longer.
Is that a better tradeoff?

I have done some timing tests:

master:

pg_dump check: 22s
pg_dump check -j8: 8s
check-world -j8: 2min44s

patched:

pg_dump check: 34s
pg_dump check -j8: 13s
check-world -j8: 2min46s

So overall it doesn't seem that bad.
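
For reference, the overhead implied by the numbers above can be worked
out with a short script. This is only arithmetic on the timings quoted
in this message; the helper names to_seconds and overhead are
hypothetical.

```python
def to_seconds(text):
    """Convert a duration such as '2min44s' or '22s' to seconds."""
    minutes, _, rest = text.rpartition("min")
    return int(minutes or 0) * 60 + int(rest.rstrip("s"))


# (master, patched) timings as quoted above
TIMINGS = {
    "pg_dump check": ("22s", "34s"),
    "pg_dump check -j8": ("8s", "13s"),
    "check-world -j8": ("2min44s", "2min46s"),
}


def overhead(timings):
    """Return {name: (extra seconds, extra percent relative to master)}."""
    result = {}
    for name, (before, after) in timings.items():
        b, a = to_seconds(before), to_seconds(after)
        result[name] = (a - b, 100.0 * (a - b) / b)
    return result


for name, (delta, pct) in overhead(TIMINGS).items():
    print(f"{name}: +{delta}s ({pct:.1f}%)")
```

This prints +12s (54.5%) for pg_dump check, +5s (62.5%) for pg_dump
check -j8, and +2s (1.2%) for check-world -j8.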
