buildfarm xversion diff

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: buildfarm xversion diff
Date: 2021-04-15 15:37:22
Message-ID: 20210415153722.GL6091@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Forking https://www.postgresql.org/message-id/20210328231433.GI15100@telsasoft.com

I gave suggestion how to reduce the "lines of diff" metric almost to nothing,
allowing a very small "fudge factor", and which I think makes this a pretty
good metric rather than a passable one.

Thoughts ?

On Sun, Mar 28, 2021 at 06:14:33PM -0500, Justin Pryzby wrote:
> On Sun, Mar 28, 2021 at 04:48:29PM -0400, Andrew Dunstan wrote:
> > Nothing is hidden here - the diffs are reported, see for example
> > <https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2021-03-28%2015%3A37%3A07&stg=xversion-upgrade-REL9_4_STABLE-HEAD>
> > What we're comparing here is target pg_dumpall against the original
> > source vs target pg_dumpall against the upgraded source.
>
> The command being run is:
>
> https://github.com/PGBuildFarm/client-code/blob/master/PGBuild/Modules/TestUpgradeXversion.pm#L610
> system( "diff -I '^-- ' -u $upgrade_loc/origin-$oversion.sql "
> . "$upgrade_loc/converted-$oversion-to-$this_branch.sql "
> . "> $upgrade_loc/dumpdiff-$oversion 2>&1");
> ...
> my $difflines = `wc -l < $upgrade_loc/dumpdiff-$oversion`;
>
> where -I means: --ignore-matching-lines=RE
>
> I think wc -l should actually be grep -c '^[-+]'
> otherwise context lines count for as much as diff lines.
> You could write that with diff -U0 |wc -l, except the context is useful to
> humans.
>
> With some more effort, the number of lines of diff can be very small, allowing
> a smaller fudge factor.
>
> For upgrade from v10:
> time make -C src/bin/pg_upgrade check oldsrc=`pwd`/10 oldbindir=`pwd`/10/tmp_install/usr/local/pgsql/bin
>
> $ diff -u src/bin/pg_upgrade/tmp_check/dump1.sql src/bin/pg_upgrade/tmp_check/dump2.sql |wc -l
> 622
>
> Without context:
> $ diff -u src/bin/pg_upgrade/tmp_check/dump1.sql src/bin/pg_upgrade/tmp_check/dump2.sql |grep -c '^[-+]'
> 142
>
> Without comments:
> $ diff -I '^-- ' -u src/bin/pg_upgrade/tmp_check/dump1.sql src/bin/pg_upgrade/tmp_check/dump2.sql |grep -c '^[-+]'
> 130
>
> Without SET default stuff:
> diff -I '^$' -I "SET default_table_access_method = heap;" -I "^SET default_toast_compression = 'pglz';$" -I '^-- ' -u /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump1.sql /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump2.sql |less |grep -c '^[-+]'
> 117
>
> Without trigger function call noise:
> diff -I "^CREATE TRIGGER [_[:alnum:]]\+ .* FOR EACH \(ROW\|STATEMENT\) EXECUTE \(PROCEDURE\|FUNCTION\)" -I '^$' -I "SET default_table_access_method = heap;" -I "^SET default_toast_compression = 'pglz';$" -I '^-- ' -u /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump1.sql /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump2.sql |grep -c '^[-+]'
> 11
>
> Maybe it's important not to totally ignore that, and instead perhaps clean up
> the known/accepted changes like s/FUNCTION/PROCEDURE/:
>
> </home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump2.sql sed '/^CREATE TRIGGER/s/FUNCTION/PROCEDURE/' |diff -I '^$' -I "SET default_table_access_method = heap;" -I "^SET default_toast_compression = 'pglz';$" -I '^-- ' -u /home/pryzbyj/src/postgres/src/bin/pg_upgrade/tmp_check/dump1.sql - |grep -c '^[-+]'
> 11
>
> It seems weird that we don't quote "heap" but we quote tablespaces and not
> toast compression methods.

Browse pgsql-hackers by date

  From Date Subject
Next Message tsunakawa.takay@fujitsu.com 2021-04-15 16:02:45 RE: Schema variables - new implementation for Postgres 15
Previous Message Tom Lane 2021-04-15 15:10:31 Re: Commit ab596105b55 - BRIN minmax-multi indexes