Re: REFRESH MATERIALIZED VIEW CONCURRENTLY interaction with ORDER BY

From: Awad Mackie <awad(at)zyper(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: REFRESH MATERIALIZED VIEW CONCURRENTLY interaction with ORDER BY
Date: 2019-07-24 17:22:35
Message-ID: CAJOOww=KN2xDzLb7tVf8jHj4K7a20XkH8GWQ7AmJfkp11JS3qg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs

Sorry, realised I could have elaborated a bit more on this one. Running
against PG11.4 provided by Docker Hub this time.

I've added a single update statement which just reverses the last change in
the original SQL:

UPDATE test SET col = 4 WHERE col = 14;

Running a concurrent update I get:

REFRESH MATERIALIZED VIEW CONCURRENTLY m_test;
SELECT * FROM m_test;
col
-----
1
2
5
6
7
8
9
10
13
4
(10 rows)

But then if I go ahead and do a non-concurrent refresh with no further data
changes:

REFRESH MATERIALIZED VIEW m_test;
SELECT * FROM m_test;
col
-----
1
2
4
5
6
7
8
9
10
13
(10 rows)

which is what I was expecting.

On Wed, 24 Apr 2019 at 12:23, Awad Mackie <awad(at)zyper(dot)com> wrote:

> I've added 5 more statements to the attached SQL to demonstrate the
> problem in the last two SELECTs. Ran it against PG 11.2 provided by Docker
> Hub this time and the problem is still visible.
>
> On Thu, 11 Apr 2019 at 21:39, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
>> On Thu, Apr 4, 2019 at 02:54:29PM +0000, PG Doc comments form wrote:
>> > The following documentation comment has been logged on the website:
>> >
>> > Page:
>> https://www.postgresql.org/docs/10/sql-refreshmaterializedview.html
>> > Description:
>> >
>> > On 10.7 we hit a case where the backing query had an order by clause
>> and a
>> > concurrent refresh updated it differently than a normal refresh. This
>> is a
>> > bit of an odd corner case given that up till that point, views would
>> respect
>> > the ordering.
>> >
>> > It's not explicit that CONCURRENTLY populates the data any differently
>> than
>> > normal, specifically, in a way that means the ordering in the backing
>> query
>> > is not always maintained, depending on the order of updates.
>> >
>> > Reading through the comment at
>> >
>> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/matview.c;h=2aac63296bfee535af3ea660c617b265d7ec8042;hb=HEAD#l548
>> > I can see the logic in that, but it could use an explicit mention in the
>> > CONCURRENTLY section. Not sure if there's any plan for changing the
>> > behaviour either.
>> >
>> > Also the existing sentence "If you want the data to be ordered upon
>> > generation, you must use an ORDER BY clause in the backing query." sort
>> of
>> > implies that the ORDER BY will be respected.
>>
>> I ran the attached SQL file on PG 10.7 and PG head and got output that
>> honored the ORDER BY. Is the test wrong? Is something else needed to
>> see the ordering fail. Can you provide an example of the failure? If I
>> remove the ORDER BY from the materialized view, I do get randomly
>> ordered rows.
>>
>> --
>> Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
>> EnterpriseDB http://enterprisedb.com
>>
>> + As you are, so once was I. As I am, so you will be. +
>> + Ancient Roman grave inscription +
>>
>
>
> --
> Awad Mackie
> Lead Backend Engineer
>
>
>

--
*Awad Mackie*
Lead Backend Engineer, Zyper

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message Peter Eisentraut 2019-07-24 19:59:36 Re: initdb recommendations
Previous Message Andrew Dunstan 2019-07-24 14:00:34 Re: initdb recommendations