From: | Önder Kalacı <onderkalaci(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | vignesh C <vignesh21(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Date: | 2023-03-10 09:54:48 |
Message-ID: | CACawEhUu6S8E4Oo7+s5iaq=yLRZJb6uOZeEQSGJj-7NVkDzSaw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Amit, all
>
> Some of the ideas I can think of are as follows:
>
> 1. Combine "SUBSCRIPTION USES INDEX WITH MULTIPLE ROWS AND COLUMNS"
> and "SUBSCRIPTION USES INDEX WITH DROPPED COLUMNS" such that after
> verifying updates and deletes of the first test, we can drop some of
> the columns on both publisher and subscriber, then use alter
> subscription ... refresh publication command and then do the steps of
> the second test. Note that we don't add tables after initial setup,
> only changing schema.
>
Done with an important caveat. I think this reorganization of the test
helped
us to find one edge case regarding dropped columns.
I realized that the dropped columns also get into the tuples_equal()
function. And,
the remote sends NULL to for the dropped columns(i.e., remoteslot), but
index_getnext_slot() (or table_scan_getnextslot) indeed fills the dropped
columns on the outslot. So, the dropped columns are not NULL in the outslot.
This triggers tuples_equal to fail. To fix that, I improved the tuples_equal
such that it skips the dropped columns.
I also spend quite a bit of time understanding how/why this impacts
HEAD. See steps below on HEAD, where REPLICA IDENTITY FULL
fails to replica the data properly:
-- pub
CREATE TABLE test (drop_1 jsonb, x int, drop_2 numeric, y text, drop_3
timestamptz);
ALTER TABLE test REPLICA IDENTITY FULL;
INSERT INTO test SELECT NULL, i, i, (i)::text, now() FROM
generate_series(0,1)i;
CREATE PUBLICATION pub FOR ALL TABLES;
-- sub
CREATE TABLE test (drop_1 jsonb, x int, drop_2 numeric, y text, drop_3
timestamptz);
CREATE SUBSCRIPTION sub CONNECTION 'host=localhost port=5432
dbname=postgres' PUBLICATION pub;
-- show that before dropping the columns, the data in the source and
-- target are deleted properly
DELETE FROM test WHERE x = 0;
-- both on the source and target
SELECT count(*) FROM test WHERE x = 0;
┌───────┐
│ count │
├───────┤
│ 0 │
└───────┘
(1 row)
-- drop columns on both the the source
ALTER TABLE test DROP COLUMN drop_1;
ALTER TABLE test DROP COLUMN drop_2;
ALTER TABLE test DROP COLUMN drop_3;
-- drop columns on both the the target
ALTER TABLE test DROP COLUMN drop_1;
ALTER TABLE test DROP COLUMN drop_2;
ALTER TABLE test DROP COLUMN drop_3;
-- on the target
ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
-- after dropping the columns
DELETE FROM test WHERE x = 1;
-- source
SELECT count(*) FROM test WHERE x = 1;
┌───────┐
│ count │
├───────┤
│ 0 │
└───────┘
(1 row)
*-- target, OOPS wrong result!!!!*SELECT count(*) FROM test WHERE x = 1;
┌───────┐
│ count │
├───────┤
│ 1 │
└───────┘
(1 row)
Should we have a separate patch for the tuples_equal change so that we
might want to backport? Attached as v40_0001 on the patch.
Note that I need to have that commit as 0001 so that 0002 patch
passes the tests.
> 2. We can also combine "Some NULL values" and "PUBLICATION LACKS THE
> COLUMN ON THE SUBS INDEX" as both use the same schema. After the first
> test, we need to drop the existing index and create a new index on the
> subscriber node.
>
done
>
> 3. General comment
> +# updates 200 rows
> +$node_publisher->safe_psql('postgres',
> + "UPDATE test_replica_id_full SET x = x + 1 WHERE x IN (5, 6);");
>
> I think here you are updating 20 rows not 200. So, the comment seems
> wrong to me.
>
I think I have fixed that in an earlier version because I cannot see this
comment anymore.
>
> Please think more and see if we can combine some other tests like
> "Unique index that is not primary key or replica identity" and the
> test we will have after comment#2 above.
>
>
I'll look for more opportunities and reply to the thread. I wanted to send
this mail so that you can have a look at (1) earlier.
Thanks,
Onder
Attachment | Content-Type | Size |
---|---|---|
v40_0001_Skip-dropped_columns_for_tuples_equal.patch | application/octet-stream | 1.2 KB |
v40_0002_use_index_on_subs_when_pub_rep_ident_full.patch | application/octet-stream | 55.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2023-03-10 09:59:12 | Re: pgsql: Use ICU by default at initdb time. |
Previous Message | Peter Eisentraut | 2023-03-10 09:54:14 | Re: pgsql: Allow tailoring of ICU locales with custom rules |