Re: 225 times slower

From: Pailloncy Jean-Gérard <pailloncy(at)ifrance(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: 225 times slower
Date: 2004-04-22 18:46:51
Message-ID: 79BD69C1-948D-11D8-ACE1-000A95DE2550@ifrance.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> The planner is guessing that scanning in rec_id order will produce a
> matching row fairly quickly (sooner than selecting all the matching
> rows
> and sorting them would do). It's wrong in this case, but I'm not sure
> it could do better without very detailed cross-column statistics.

> Am I
> right to guess that the rows that match the WHERE clause are not evenly
> distributed in the rec_id order, but rather there are no such rows till
> you get well up in the ordering?

I must agree that the data are not evenly distributed....

For table url:
count 271.395
min rec_id 1
max rec_id 3.386.962

dps=> select * from url where crc32=419903683;
count 852
min rec_id 264.374
max rec_id 2.392.046

I do
dps=> select ctid, rec_id from url where crc32=419903683 order by
crc32,rec_id;
And then in a text edit extract the "page_id" from ctid
and there is 409 distinct pages for the 852 rows.
There is 4592 pages for the tables url.

dps=> select (rec_id/25), count(*) from url where crc32=419903683 group
by rec_id/25 having count(*)>4 order by count(*) desc;
?column? | count
----------+-------
30289 | 25
11875 | 24
11874 | 24
11876 | 24
28154 | 23
26164 | 21
26163 | 21
55736 | 21
40410 | 20
47459 | 20
30290 | 20
28152 | 20
26162 | 19
30291 | 19
37226 | 19
60357 | 18
28150 | 18
12723 | 17
40413 | 17
40412 | 16
33167 | 15
40415 | 15
12961 | 15
40414 | 15
28151 | 14
63961 | 14
26165 | 13
11873 | 13
63960 | 12
37225 | 12
37224 | 12
20088 | 11
30288 | 11
91450 | 11
20087 | 11
26892 | 10
47458 | 10
40411 | 10
91451 | 10
12722 | 10
28153 | 9
43488 | 9
60358 | 7
60356 | 7
11877 | 7
33168 | 6
91448 | 6
26161 | 6
40409 | 5
28155 | 5
28318 | 5
30292 | 5
26891 | 5
95666 | 5
(54 rows)

An other question, with VACUUM VERBOSE ANALYZE, I see:
> INFO: "url": removed 568107 row versions in 4592 pages
> DETAIL: CPU 0.51s/1.17u sec elapsed 174.74 sec.
And I run pg_autovacuum.
Does the big number (568107) of removed row indicates I should set a
higher max_fsm_pages ?

> grep fsm /var/pgsql/postgresql.conf
max_fsm_pages = 60000 # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 200 # min 100, ~50 bytes each

dps=> VACUUM VERBOSE ANALYSE url;
INFO: vacuuming "public.url"
INFO: index "url_crc" now contains 211851 row versions in 218 pages
DETAIL: 129292 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/1.38u sec elapsed 5.71 sec.
INFO: index "url_seed" now contains 272286 row versions in 644 pages
DETAIL: 568107 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.10s/2.96u sec elapsed 13.06 sec.
INFO: index "url_referrer" now contains 272292 row versions in 603
pages
DETAIL: 568107 index row versions were removed.
4 index pages have been deleted, 0 are currently reusable.
CPU 0.10s/2.98u sec elapsed 22.30 sec.
INFO: index "url_next_index_time" now contains 272292 row versions in
684 pages
DETAIL: 568107 index row versions were removed.
42 index pages have been deleted, 0 are currently reusable.
CPU 0.07s/1.80u sec elapsed 9.50 sec.
INFO: index "url_status" now contains 272298 row versions in 638 pages
DETAIL: 568107 index row versions were removed.
12 index pages have been deleted, 0 are currently reusable.
CPU 0.03s/2.18u sec elapsed 13.66 sec.
INFO: index "url_bad_since_time" now contains 272317 row versions in
611 pages
DETAIL: 568107 index row versions were removed.
4 index pages have been deleted, 0 are currently reusable.
CPU 0.07s/2.40u sec elapsed 10.99 sec.
INFO: index "url_hops" now contains 272317 row versions in 637 pages
DETAIL: 568107 index row versions were removed.
5 index pages have been deleted, 0 are currently reusable.
CPU 0.04s/2.24u sec elapsed 12.46 sec.
INFO: index "url_siteid" now contains 272321 row versions in 653 pages
DETAIL: 568107 index row versions were removed.
13 index pages have been deleted, 0 are currently reusable.
CPU 0.14s/2.05u sec elapsed 11.63 sec.
INFO: index "url_serverid" now contains 272321 row versions in 654
pages
DETAIL: 568107 index row versions were removed.
8 index pages have been deleted, 0 are currently reusable.
CPU 0.10s/2.27u sec elapsed 11.45 sec.
INFO: index "url_url" now contains 272065 row versions in 1892 pages
DETAIL: 193884 index row versions were removed.
5 index pages have been deleted, 0 are currently reusable.
CPU 0.39s/1.50u sec elapsed 36.99 sec.
INFO: index "url_last_mod_time" now contains 272071 row versions in
317 pages
DETAIL: 193884 index row versions were removed.
7 index pages have been deleted, 0 are currently reusable.
CPU 0.03s/1.38u sec elapsed 5.61 sec.
INFO: index "url_pkey" now contains 272086 row versions in 328 pages
DETAIL: 193884 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.05s/1.60u sec elapsed 60.64 sec.
INFO: "url": removed 568107 row versions in 4592 pages
DETAIL: CPU 0.51s/1.17u sec elapsed 174.74 sec.
INFO: "url": found 568107 removable, 272027 nonremovable row versions
in 4614 pages
DETAIL: 402 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 1.98s/26.08u sec elapsed 466.27 sec.
INFO: vacuuming "pg_toast.pg_toast_137628026"
INFO: index "pg_toast_137628026_index" now contains 0 row versions in
1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.06 sec.
INFO: "pg_toast_137628026": found 0 removable, 0 nonremovable row
versions in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.07 sec.
INFO: analyzing "public.url"
INFO: "url": 4624 pages, 150000 rows sampled, 577419 estimated total
rows
VACUUM

Cordialement,
Jean-Gérard Pailloncy

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2004-04-22 19:22:51 Re: [pgsql-advocacy] MySQL vs PG TPC-H benchmarks
Previous Message Bruce Momjian 2004-04-22 18:31:22 Re: Wierd context-switching issue on Xeon patch for 7.4.1