neqjoinsel versus "refresh materialized view concurrently"

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: neqjoinsel versus "refresh materialized view concurrently"
Date: 2018-03-13 19:07:59
Message-ID: CAMkU=1z-JoGymHneGHar1cru4F1XDfHqJDzxP_CtK5cL3DOfmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The following commit has caused a devastating performance regression
in concurrent refresh of MV:

commit 7ca25b7de6aefa5537e0dbe56541bc41c0464f97
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Wed Nov 29 22:00:29 2017 -0500

Fix neqjoinsel's behavior for semi/anti join cases.

The below reproduction goes from taking about 1 second to refresh, to
taking an amount of time I don't have the patience to measure.

drop table foobar2 cascade;
create table foobar2 as select * from generate_series(1,200000);
create materialized view foobar3 as select * from foobar2;
create unique index on foobar3 (generate_series );
analyze foobar3;
refresh materialized view CONCURRENTLY foobar3 ;

When I interrupt the refresh, I get a message including this line:

CONTEXT: SQL statement "SELECT newdata FROM pg_temp_3.pg_temp_16420
newdata WHERE newdata IS NOT NULL AND EXISTS (SELECT * FROM
pg_temp_3.pg_temp_16420 newdata2 WHERE newdata2 IS NOT NULL AND newdata2
OPERATOR(pg_catalog.*=) newdata AND newdata2.ctid OPERATOR(pg_catalog.<>)
newdata.ctid) LIMIT 1"

So I makes sense that the commit in question could have caused a change in
the execution plan. Because these are temp tables, I can't easily get my
hands on them to investigate further.

Cheers,

Jeff

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-03-13 19:12:59 Re: JIT compiling with LLVM v11
Previous Message Jeevan Ladhe 2018-03-13 18:51:16 Re: Fix error in ECPG while connection handling