[PROPOSAL] Use SnapshotAny in get_actual_variable_range

From: Dmitriy Sarafannikov <dsarafannikov(at)yandex(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Cc: root(at)simply(dot)name, Хомик Кирилл <khomikki(at)yandex-team(dot)ru>
Subject: [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Date: 2017-04-27 08:08:30
Message-ID: 05C72CF7-B5F6-4DB9-8A09-5AC897653113@yandex.ru
Lists: pgsql-hackers

Hi hackers,

I'd like to propose searching for the min and max values in an index with SnapshotAny in the get_actual_variable_range function.
The current implementation scans the index with SnapshotDirty, which accepts uncommitted rows and rejects dead rows.
In the code there is a comment about this:
/*
* In principle, we should scan the index with our current
* active snapshot, which is the best approximation we've got
* to what the query will see when executed. But that won't
* be exact if a new snap is taken before running the query,
* and it can be very expensive if a lot of uncommitted rows
* exist at the end of the index (because we'll laboriously
* fetch each one and reject it). What seems like a good
* compromise is to use SnapshotDirty. That will accept
* uncommitted rows, and thus avoid fetching multiple heap
* tuples in this scenario. On the other hand, it will reject
* known-dead rows, and thus not give a bogus answer when the
* extreme value has been deleted; that case motivates not
* using SnapshotAny here.
*/
But if we delete many rows from the beginning or the end of the index, the scan becomes
very expensive too, because we will fetch each dead row and reject it, one heap access at a time.
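
For reference, the relevant code looks roughly like this (condensed from
get_actual_variable_range() in src/backend/utils/adt/selfuncs.c as of 9.6;
setup and datum extraction abridged):

SnapshotData SnapshotDirty;
IndexScanDesc index_scan;
HeapTuple tup;

InitDirtySnapshot(SnapshotDirty);

index_scan = index_beginscan(heapRel, indexRel, &SnapshotDirty, 1, 0);
index_rescan(index_scan, scankeys, 1, NULL, 0);

/*
 * index_getnext() fetches each heap tuple and tests its visibility
 * against the snapshot.  With SnapshotDirty, every dead tuple at the
 * end of the index costs one heap fetch before being rejected, so
 * this single call can take a long time after a bulk DELETE.
 */
if ((tup = index_getnext(index_scan, indexscandir)) != NULL)
{
    /* Extract the index column values from the heap tuple */
    ExecStoreTuple(tup, slot, InvalidBuffer, false);
    /* ... form the min/max datum from the slot ... */
}

index_endscan(index_scan);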

The following sequence can be used to reproduce the issue:
psql -c "DROP DATABASE test_polygon";
psql -c "CREATE DATABASE test_polygon";
psql test_polygon -c "CREATE EXTENSION postgis";
psql test_polygon -f /tmp/data.sql;
psql test_polygon -c "ANALYZE";

# \d polygon_table
                                   Table "public.polygon_table"
  Column   |           Type           |                         Modifiers
-----------+--------------------------+------------------------------------------------------------
 id        | integer                  | not null default nextval('polygon_table_id_seq'::regclass)
 time      | timestamp with time zone | not null
 poly      | geometry(Polygon,4326)   | not null
 second_id | integer                  | not null
Indexes:
    "polygon_table_pkey" PRIMARY KEY, btree (id)
    "polygon_table_b179ed4a" btree (second_id)
    "polygon_table_poly_id" gist (poly)
    "polygon_table_time" btree ("time")
Foreign-key constraints:
    "second_table_id" FOREIGN KEY (second_id) REFERENCES second_table(id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED

1st session:
pgbench test_polygon -P 1 -R 6000 -c 24 -j 8 -T 1000 -n -f /tmp/bad_request

2nd session:
psql test_polygon -c "DELETE FROM polygon_table WHERE time <= '2017-03-30 16:00:00+03'"

After the delete we have many dead rows at the beginning of the index "polygon_table_time" (on "time").

pgbench output:
progress: 1.0 s, 6023.8 tps, lat 1.170 ms stddev 1.022, lag 0.157 ms
progress: 2.0 s, 6023.8 tps, lat 1.045 ms stddev 0.182, lag 0.076 ms
progress: 3.0 s, 5957.0 tps, lat 1.046 ms stddev 0.176, lag 0.071 ms
progress: 4.0 s, 6066.9 tps, lat 1.061 ms stddev 0.184, lag 0.072 ms
progress: 5.0 s, 6178.1 tps, lat 1.060 ms stddev 0.189, lag 0.076 ms
progress: 6.0 s, 6079.0 tps, lat 1.075 ms stddev 0.195, lag 0.075 ms
progress: 7.0 s, 6246.0 tps, lat 1.069 ms stddev 0.194, lag 0.076 ms
progress: 8.0 s, 6046.0 tps, lat 1.050 ms stddev 0.181, lag 0.073 ms
progress: 9.0 s, 1255.0 tps, lat 79.114 ms stddev 189.686, lag 63.194 ms
progress: 10.0 s, 4696.0 tps, lat 1015.294 ms stddev 36.291, lag 1009.983 ms
progress: 11.0 s, 6031.0 tps, lat 1001.354 ms stddev 59.379, lag 997.375 ms
progress: 12.0 s, 6013.0 tps, lat 961.725 ms stddev 104.536, lag 957.736 ms
progress: 13.0 s, 6098.0 tps, lat 936.516 ms stddev 140.039, lag 932.580 ms
progress: 14.0 s, 6032.0 tps, lat 935.867 ms stddev 137.761, lag 931.892 ms
progress: 15.0 s, 5975.0 tps, lat 950.911 ms stddev 153.438, lag 946.895 ms
progress: 16.0 s, 6044.0 tps, lat 953.380 ms stddev 146.601, lag 949.413 ms
progress: 17.0 s, 6105.0 tps, lat 956.524 ms stddev 134.940, lag 952.593 ms
progress: 18.0 s, 6097.0 tps, lat 950.913 ms stddev 135.902, lag 946.980 ms
progress: 19.0 s, 6004.9 tps, lat 933.010 ms stddev 142.037, lag 929.014 ms
progress: 20.0 s, 6078.1 tps, lat 920.415 ms stddev 157.117, lag 916.469 ms
progress: 21.0 s, 5402.0 tps, lat 945.490 ms stddev 145.262, lag 941.048 ms
progress: 22.0 s, 5226.0 tps, lat 1082.013 ms stddev 141.718, lag 1077.423 ms
progress: 23.0 s, 12794.1 tps, lat 479.046 ms stddev 434.510, lag 478.106 ms
progress: 24.0 s, 5914.8 tps, lat 0.604 ms stddev 0.075, lag 0.067 ms
progress: 25.0 s, 5994.0 tps, lat 0.596 ms stddev 0.071, lag 0.066 ms
progress: 26.0 s, 6126.9 tps, lat 0.598 ms stddev 0.072, lag 0.067 ms
progress: 27.0 s, 6076.2 tps, lat 0.601 ms stddev 0.072, lag 0.068 ms
progress: 28.0 s, 6035.0 tps, lat 0.608 ms stddev 0.077, lag 0.068 ms

After the delete (at 9s), latency increases dramatically, up to ~1000 ms, until autovacuum comes and performs index cleanup (at 23s).
From EXPLAIN ANALYZE we can see that planning time has increased significantly:
# explain (analyze, verbose, timing, buffers) SELECT * FROM polygon_table polygon INNER JOIN second_table second ON (polygon.second_id = second.id) WHERE ST_Intersects(poly, ST_SetSrid(ST_MakePoint(52.3433914, 58.7438431), 4326));
                                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.56..16.86 rows=1 width=160) (actual time=0.100..0.248 rows=4 loops=1)
   Output: polygon.id, polygon."time", polygon.poly, polygon.second_id, second.id
   Buffers: shared hit=49
   ->  Index Scan using polygon_table_poly_id on public.polygon_table polygon  (cost=0.29..8.55 rows=1 width=156) (actual time=0.081..0.220 rows=4 loops=1)
         Output: polygon.id, polygon."time", polygon.poly, polygon.second_id
         Index Cond: (polygon.poly && '0101000020E6100000245DD83FF42B4A4079ED2D40365F4D40'::geometry)
         Filter: _st_intersects(polygon.poly, '0101000020E6100000245DD83FF42B4A4079ED2D40365F4D40'::geometry)
         Rows Removed by Filter: 6
         Buffers: shared hit=37
   ->  Index Only Scan using second_table_pkey on public.second_table second  (cost=0.28..8.29 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=4)
         Output: second.id
         Index Cond: (second.id = polygon.second_id)
         Heap Fetches: 4
         Buffers: shared hit=12
 Planning time: 115.122 ms
 Execution time: 0.422 ms
(16 rows)

Time: 116.926 ms

# explain (analyze, verbose, timing, buffers) SELECT * FROM polygon_table polygon INNER JOIN second_table second ON (polygon.second_id = second.id) WHERE ST_Intersects(poly, ST_SetSrid(ST_MakePoint(52.3433914, 58.7438431), 4326));
                                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.56..16.86 rows=1 width=160) (actual time=0.059..0.373 rows=46 loops=1)
   Output: polygon.id, polygon."time", polygon.poly, polygon.second_id, second.id
   Buffers: shared hit=170
   ->  Index Scan using polygon_table_poly_id on public.polygon_table polygon  (cost=0.29..8.55 rows=1 width=156) (actual time=0.045..0.269 rows=46 loops=1)
         Output: polygon.id, polygon."time", polygon.poly, polygon.second_id
         Index Cond: (polygon.poly && '0101000020E6100000245DD83FF42B4A4079ED2D40365F4D40'::geometry)
         Filter: _st_intersects(polygon.poly, '0101000020E6100000245DD83FF42B4A4079ED2D40365F4D40'::geometry)
         Rows Removed by Filter: 44
         Buffers: shared hit=32
   ->  Index Only Scan using second_table_pkey on public.second_table second  (cost=0.28..8.29 rows=1 width=4) (actual time=0.001..0.002 rows=1 loops=46)
         Output: second.id
         Index Cond: (second.id = polygon.second_id)
         Heap Fetches: 46
         Buffers: shared hit=138
 Planning time: 6.139 ms
 Execution time: 0.482 ms
(16 rows)

Time: 7.722 ms
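
For completeness: as far as I can tell from reading the source, planning
reaches this code through selectivity estimation, along these paths
(a sketch of my understanding, not a verified trace):

mergejoinscansel()                      /* join clause second_id = id */
  -> get_variable_range()
    -> get_actual_variable_range()

scalarineqsel()
  -> ineq_histogram_selectivity()       /* probe value beyond histogram bounds */
    -> get_actual_variable_range()

Note that rows deleted by "time" are dead in every index on polygon_table,
not only in polygon_table_time, which is why a query that never touches
the "time" column still pays this planning-time cost.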

Initially, the function used the active snapshot from GetActiveSnapshot(). But in commit fccebe421d0c410e6378fb281419442c84759213
this behavior was "weakened" to SnapshotDirty (I suppose for a similar reason).
Was there a particular reason for allowing the planner to see uncommitted rows, but forbidding it access to the dead ones?
A simple patch that uses SnapshotAny is attached (the idea is sketched below); the comments in the code are not updated yet.
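
The change boils down to something like this (a sketch against the current
code; not necessarily byte-identical to the attached patch):

-    InitDirtySnapshot(SnapshotDirty);
-    index_scan = index_beginscan(heapRel, indexRel, &SnapshotDirty, 1, 0);
+    /*
+     * SnapshotAny accepts every tuple, dead or alive, so the first
+     * index entry is returned immediately instead of fetching and
+     * rejecting each dead tuple in turn.
+     */
+    index_scan = index_beginscan(heapRel, indexRel, SnapshotAny, 1, 0);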

data.sql file: https://yadi.sk/d/GtBW6Hhu3HQ4CA

Regards,
Dmitriy Sarafannikov
