Re: Windows regress fails (latest HEAD)

From: Russell Foster <russell(dot)foster(dot)coding(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Windows regress fails (latest HEAD)
Date: 2020-11-10 21:22:37
Message-ID: CA+VXQb+_SybU=+paTZ07t7vamTsO+x=_HLEzNW5vTP7_GTXKig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 10, 2020 at 1:52 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
>
>
>
> On 11/10/20 7:15 PM, Tom Lane wrote:
> > Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
> >> That's unlikely, I think. The regression tests are constructed so that
> >> the estimates are stable. It's more likely this is some difference in
> >> rounding behavior, for example.
> >
> > The reported delta is in the actual row count, not an estimate.
> > How could that be subject to roundoff issues? And there's no
> > delta in the Append's inputs, so this seems like it's a flat-out
> > miss of a row count in EXPLAIN ANALYZE.
> >
>
> My bad. I've not noticed it's EXPLAIN ANALYZE (COSTS OFF) so I thought
> it's estimates. You're right this can't be a roundoff error.
>
> >> I wonder which msvc builds are used on the machines that fail/pass
> >> the tests, and if the compiler flags are the same.
> >
> > Yeah, this might be a fruitful way to figure out "what's different
> > from the buildfarm".
> >
>
> Yeah. Also Russell claims to have two "same" machines out of which one
> works and the other one fails.

Never claimed they were the same, but they are both Windows x64. Here
are some more details:

Test Passes:
VM machine (Build Server)
Microsoft Windows 10 Pro
Version 10.0.18363 Build 18363
Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29112 for x64

Compile args:
C:\Program Files (x86)\Microsoft Visual
Studio\2019\Enterprise\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\CL.exe
/c /Isrc/include /Isrc/include/port/win32
/Isrc/include/port/win32_msvc /Iexternals/zlib
/Iexternals/openssl\include /Isrc/backend /Zi /nologo /W3 /WX-
/diagnostics:column /Ox /D WIN32 /D _WINDOWS /D __WINDOWS__ /D
__WIN32__ /D EXEC_BACKEND /D WIN32_STACK_RLIMIT=4194304 /D
_CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D BUILDING_DLL
/D _MBCS /GF /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope
/Zc:inline /Fo".\Release\postgres\\" /Fd".\Release\postgres\vc142.pdb"
/Gd /TC /wd4018 /wd4244 /wd4273 /wd4102 /wd4090 /wd4267 /FC
/errorReport:queue /MP src/backend/catalog/partition.c

Test Fails:
Laptop machine (Development)
Microsoft Windows 10 Enterprise
Version 10.0.19041 Build 19041
Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29112 for x64

Compile args:
C:\Program Files (x86)\Microsoft Visual
Studio\2019\Enterprise\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\CL.exe
/c /Isrc/include /Isrc/include/port/win32
/Isrc/include/port/win32_msvc /Iexternals/zlib
/Iexternals/openssl\include /Isrc/backend /Zi /nologo /W3 /WX-
/diagnostics:column /Ox /D WIN32 /D _WINDOWS /D __WINDOWS__ /D
__WIN32__ /D EXEC_BACKEND /D WIN32_STACK_RLIMIT=4194304 /D
_CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D BUILDING_DLL
/D _MBCS /GF /Gm- /EHsc /MD /GS /fp:precise /Zc:wchar_t /Zc:forScope
/Zc:inline /Fo".\Release\postgres\\" /Fd".\Release\postgres\vc142.pdb"
/Gd /TC /wd4018 /wd4244 /wd4273 /wd4102 /wd4090 /wd4267 /FC
/errorReport:queue /MP src/backend/catalog/partition.c

>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Compiler versions are the same and the build args seem the same. Also,
tried with the latest REL_12_STABLE and REL_13_STABLE, and they both
fail on my dev machine as well.

I'm not going to pretend to know why the explain should be equal in
this case, but I wonder what difference it would make if the output
for both is equal and/or correct? From parttion_prune.out on the
failing machine, line 2810:

-- Multiple partitions
insert into tbl1 values (1001), (1010), (1011);
explain (analyze, costs off, summary off, timing off)
select * from tbl1 inner join tprt on tbl1.col1 > tprt.col1;
QUERY PLAN
--------------------------------------------------------------------------
Nested Loop (actual rows=23 loops=1)
-> Seq Scan on tbl1 (actual rows=5 loops=1)
-> Append (actual rows=5 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=5)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt2_idx on tprt_2 (actual rows=3 loops=4)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt3_idx on tprt_3 (actual rows=1 loops=2)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 < tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 < tbl1.col1)
(15 rows)

explain (analyze, costs off, summary off, timing off)
select * from tbl1 inner join tprt on tbl1.col1 = tprt.col1;
QUERY PLAN
--------------------------------------------------------------------------
Nested Loop (actual rows=3 loops=1)
-> Seq Scan on tbl1 (actual rows=5 loops=1)
-> Append (actual rows=0 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt3_idx on tprt_3 (actual rows=0 loops=3)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt4_idx on tprt_4 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt5_idx on tprt_5 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt6_idx on tprt_6 (never executed)
Index Cond: (col1 = tbl1.col1)
(15 rows)

select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 > tprt.col1
order by tbl1.col1, tprt.col1;
col1 | col1
------+------
501 | 10
501 | 20
505 | 10
505 | 20
505 | 501
505 | 502
1001 | 10
1001 | 20
1001 | 501
1001 | 502
1001 | 505
1010 | 10
1010 | 20
1010 | 501
1010 | 502
1010 | 505
1010 | 1001
1011 | 10
1011 | 20
1011 | 501
1011 | 502
1011 | 505
1011 | 1001
(23 rows)

select tbl1.col1, tprt.col1 from tbl1
inner join tprt on tbl1.col1 = tprt.col1
order by tbl1.col1, tprt.col1;
col1 | col1
------+------
501 | 501
505 | 505
1001 | 1001
(3 rows)

Here the selects return the same values as the expected.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2020-11-10 21:58:26 Re: Additional Chapter for Tutorial
Previous Message Ranier Vilela 2020-11-10 21:19:21 Re: Windows regress fails (latest HEAD)