Index Scan Backward + check additional condition before heap access

From: "Markus Bertheau" <mbertheau(dot)pg(at)googlemail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Index Scan Backward + check additional condition before heap access
Date: 2008-02-07 14:51:57
Message-ID: 684362e10802070651o1cf56a79w32fb792949b1bec3@mail.gmail.com
Lists: pgsql-performance

Hi,

(PostgreSQL 8.3)

I'm trying to optimize one of the most frequently used queries in our system:

(Full minimized pastable schema and data below.)

create table feeds_users (
user_id int references users(id) not null,
feed_id int references feeds(id) not null,
unique(user_id, feed_id)
);

create table items (
id serial primary key,
feed_id int references feeds(id) not null,
title text,
pub_date timestamp
);

create index items_feed_id_idx on items(feed_id);
create index items_pub_date_idx on items(pub_date);
create index items_pub_date_feed_id_idx on items(pub_date, feed_id);
create index feeds_users_feed_id on feeds_users(feed_id);

-- Query variant 1:

EXPLAIN ANALYZE SELECT i.* FROM items i WHERE feed_id IN (
SELECT f.feed_id FROM feeds_users f WHERE f.user_id = ?)
ORDER BY pub_date DESC
LIMIT 20 OFFSET 100;

-- Query variant 2:
EXPLAIN ANALYZE SELECT i.* FROM items i
JOIN feeds_users f ON i.feed_id = f.feed_id AND f.user_id = ?
ORDER BY pub_date DESC
LIMIT 20 OFFSET 100;

The table items contains 700000 rows, feeds_users 99000, feeds 10000 and
users 1000. The number of feeds per user is distributed logarithmically, i.e.
there are many users with no or few feeds and some users with many feeds.

In reality, 99% of the rows in items are being inserted in pub_date order,
and the correlation of user_id in feeds_users is not 1 (it is 1 with the
test data).

I need this query to work blisteringly fast for the common case, and at
least not too slow for extreme cases. Extreme cases are:
* no feeds for a user
* very few feeds for a user, with the top 20 items spread over >10% of
the items table
* a normal number of feeds for a user, but a big offset (1000, 10000 or
100000). The maximum offset could be capped in the application if needed,
but a solution without that would be preferred (a keyset-style alternative
is sketched below).

The common case is that the needed rows are within the first (by pub_date
desc) <1% of items.
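
As an aside on the big-offset case: if keeping pagination state in the
application is an option, a keyset-style variant that continues from the last
pub_date already shown would avoid large offsets entirely. This is only an
untested sketch, not one of the variants benchmarked below; since pub_date is
not unique it would also need a tie-breaker column such as id:

-- Untested keyset-style sketch: continue from the pub_date of the last
-- item on the previous page instead of using a large OFFSET.
SELECT i.* FROM items i
WHERE i.feed_id IN (SELECT f.feed_id FROM feeds_users f WHERE f.user_id = ?)
AND i.pub_date < ? -- pub_date of the last row of the previous page
ORDER BY i.pub_date DESC
LIMIT 20;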

I ran some tests of both query variants on a Pentium M 1.6 GHz notebook with
1280 MB RAM, shared_buffers = 32MB, temp_buffers 8MB, work_mem 8MB. Three
different user_ids were used for testing; the needed rows for each user are
either 1) nonexistent, 2) spread over 18% of the table, or 3) spread over
0.064% of the table. I also tried statistics targets of 10 and 100 for the
two columns in feeds_users. Two query variants were used, one with an inner
join and one with IN. In all, I got 4 different plans. Results:

no.  stat    user_id  item rows      result rows  query    plan  time
     target           scanned        w/o limit    variant
---  ------  -------  -------------  -----------  -------  ----  --------
  1  10      3        700000         0            in       1     20000 ms
  2                                               join     2     15000 ms
  3          49       46855 (18%)    630          in       1     2300 ms
  4                                               join     2     2300 ms
  5          109      448 (0.064%)   206780       in       1     6 ms
  6                                               join     2     9 ms
  7  100     3        700000         0            in       3     0.2 ms
  8                                               join     2     16500 ms
  9          49       46855 (18%)    630          in       4     10 ms
 10                                               join     2     2300 ms
 11          109      448 (0.064%)   206780       in       1     6 ms
 12                                               join     2     9 ms

Plans below. Now the questions:

Do the differences in characteristics between the test data and the real data
somehow invalidate these numbers?

I observe that the query variant with IN is faster in all cases. What is the
difference between them that leads to plans being chosen that differ so much
performance-wise?
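
So far I have only looked at the plans the planner picks on its own. To see
what the rejected alternative for the join variant would cost, I suppose one
could temporarily disable individual plan node types inside a transaction;
this is just a diagnostic sketch (which GUC to turn off depends on which node
is to be suppressed), not a proposed fix:

-- Diagnostic sketch only: steer the planner away from its first choice to
-- compare the estimated and actual cost of the alternative plan.
BEGIN;
SET LOCAL enable_indexscan = off; -- or enable_nestloop / enable_sort
EXPLAIN ANALYZE SELECT i.* FROM items i
JOIN feeds_users f ON i.feed_id = f.feed_id AND f.user_id = 49
ORDER BY pub_date DESC
LIMIT 20 OFFSET 100;
ROLLBACK;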

Can I somehow trigger the items_pub_date_feed_id_idx to be used? ISTM that
scanning by that index in pub_date desc order and using that same index to
test for a needed feed_id would be faster than accessing the heap for each
tuple.
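
One (untested) formulation I could imagine trying is to collapse the user's
feed list into an array first, so that only a scalar feed_id condition remains
in the outer query; whether 8.3 can actually check that condition against
items_pub_date_feed_id_idx during a backward scan is exactly what I am unsure
about:

-- Untested sketch: pre-collapse the user's feeds into an array, hoping the
-- (pub_date, feed_id) index becomes more attractive for the outer scan.
SELECT i.* FROM items i
WHERE i.feed_id = ANY (ARRAY(SELECT f.feed_id FROM feeds_users f
                             WHERE f.user_id = ?))
ORDER BY i.pub_date DESC
LIMIT 20 OFFSET 100;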

With a statistics target of 100, a different, very much faster plan was chosen
(test no. 9 vs. no. 3). How is the statistics target to be determined such
that the faster plan is chosen? Am I going to have to keep increasing the
statistics target as one or more tables receive more rows?
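
One way to see whether the larger sample actually captures the skewed
distribution is to look at the gathered statistics directly (a sketch using
the standard pg_stats view):

-- Inspect the statistics collected for feeds_users after ANALYZE;
-- n_distinct and most_common_freqs show how well the skew is captured.
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'feeds_users' AND attname IN ('user_id', 'feed_id');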

Thanks

Markus

Plans:
1 (for no. 3)
 Limit  (cost=1304.78..1565.74 rows=20 width=27) (actual time=2121.866..2377.740 rows=20 loops=1)
   ->  Nested Loop IN Join  (cost=0.00..57984.39 rows=4444 width=27) (actual time=9.856..2377.421 rows=120 loops=1)
         ->  Index Scan Backward using items_pub_date_idx on items i  (cost=0.00..37484.20 rows=700071 width=27) (actual time=0.131..1152.933 rows=127337 loops=1)
         ->  Index Scan using feeds_users_user_id_key on feeds_users f  (cost=0.00..0.29 rows=1 width=4) (actual time=0.006..0.006 rows=0 loops=127337)
               Index Cond: ((f.user_id = 49) AND (f.feed_id = i.feed_id))
 Total runtime: 2377.899 ms
(6 rows)

2 (for no. 4)
 Limit  (cost=542.78..651.33 rows=20 width=27) (actual time=2133.759..2393.259 rows=20 loops=1)
   ->  Nested Loop  (cost=0.00..249705.41 rows=46005 width=27) (actual time=24.087..2392.950 rows=120 loops=1)
         ->  Index Scan Backward using items_pub_date_idx on items i  (cost=0.00..37484.20 rows=700071 width=27) (actual time=0.067..1171.572 rows=127337 loops=1)
         ->  Index Scan using feeds_users_user_id_key on feeds_users f  (cost=0.00..0.29 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=127337)
               Index Cond: ((f.user_id = 49) AND (f.feed_id = i.feed_id))
 Total runtime: 2393.392 ms

3 (for no. 7)
 Limit  (cost=2227.06..2227.11 rows=20 width=27) (actual time=0.052..0.052 rows=0 loops=1)
   ->  Sort  (cost=2226.81..2229.43 rows=1048 width=27) (actual time=0.047..0.047 rows=0 loops=1)
         Sort Key: i.pub_date
         Sort Method:  quicksort  Memory: 17kB
         ->  Nested Loop  (cost=236.45..2185.37 rows=1048 width=27) (actual time=0.036..0.036 rows=0 loops=1)
               ->  HashAggregate  (cost=231.72..231.87 rows=15 width=4) (actual time=0.032..0.032 rows=0 loops=1)
                     ->  Index Scan using feeds_users_user_id_key on feeds_users f  (cost=0.00..231.35 rows=148 width=4) (actual time=0.027..0.027 rows=0 loops=1)
                           Index Cond: (user_id = 3)
               ->  Bitmap Heap Scan on items i  (cost=4.73..129.30 rows=75 width=27) (never executed)
                     Recheck Cond: (i.feed_id = f.feed_id)
                     ->  Bitmap Index Scan on items_feed_id_idx  (cost=0.00..4.71 rows=75 width=0) (never executed)
                           Index Cond: (i.feed_id = f.feed_id)
 Total runtime: 0.136 ms

4 (for no. 9)
 Limit  (cost=2227.06..2227.11 rows=20 width=27) (actual time=8.806..8.906 rows=20 loops=1)
   ->  Sort  (cost=2226.81..2229.43 rows=1048 width=27) (actual time=8.456..8.662 rows=120 loops=1)
         Sort Key: i.pub_date
         Sort Method:  top-N heapsort  Memory: 25kB
         ->  Nested Loop  (cost=236.45..2185.37 rows=1048 width=27) (actual time=0.225..6.142 rows=630 loops=1)
               ->  HashAggregate  (cost=231.72..231.87 rows=15 width=4) (actual time=0.104..0.126 rows=9 loops=1)
                     ->  Index Scan using feeds_users_user_id_key on feeds_users f  (cost=0.00..231.35 rows=148 width=4) (actual time=0.037..0.062 rows=9 loops=1)
                           Index Cond: (user_id = 49)
               ->  Bitmap Heap Scan on items i  (cost=4.73..129.30 rows=75 width=27) (actual time=0.076..0.369 rows=70 loops=9)
                     Recheck Cond: (i.feed_id = f.feed_id)
                     ->  Bitmap Index Scan on items_feed_id_idx  (cost=0.00..4.71 rows=75 width=0) (actual time=0.046..0.046 rows=70 loops=9)
                           Index Cond: (i.feed_id = f.feed_id)
 Total runtime: 9.061 ms

-- Full pastable schema and data generation:

create table feeds (
id serial primary key,
title text
);

create table users (
id serial primary key,
name text
);

create table feeds_users (
user_id int references users(id) not null,
feed_id int references feeds(id) not null,
unique(user_id, feed_id)
);

create table items (
id serial primary key,
feed_id int references feeds(id) not null,
title text,
pub_date timestamp
);

insert into users (name) select 'User ' || i::text as name from
generate_series(1, 1000) as i;
insert into feeds (title) select 'Feed ' || i::text as name from
generate_series(1, 10000) as i;

insert into feeds_users (user_id, feed_id)
select
--(i / 100) + 1 as user_id,
floor(log(i)/log(1.1))+1 as user_id,
((i + floor(log(i)/log(1.1)))::int % 10000) + 1 as feed_id
from generate_series(1, 99000) as i;

insert into items (feed_id, title, pub_date)
select
((i * 17) % 10000) + 1 as feed_id,
'Item ' || i::text as title,
'12/12/2006'::timestamp
+ cast(((i * 547) % 12343)::text || ' hours' as interval)
+ cast((random()*60)::numeric(6,3)::text || ' minutes' as interval)
as pub_date
from
generate_series(1, 700000) as i;

create index items_feed_id_idx on items(feed_id);
create index items_pub_date_idx on items(pub_date);
create index items_pub_date_feed_id_idx on items(pub_date, feed_id);
create index feeds_users_feed_id on feeds_users(feed_id);

analyze;

-- later
alter table feeds_users alter column feed_id set statistics 100;
alter table feeds_users alter column user_id set statistics 100;
analyze;
