Skip site navigation (1) Skip section navigation (2)

query planner and scanning methods

From: Colin Copeland <copelco(at)caktusgroup(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: "team(at)caktusgroup(dot)com s" <team(at)caktusgroup(dot)com>
Subject: query planner and scanning methods
Date: 2008-09-23 21:22:28
Message-ID: 8559E6C3-47ED-40E4-99E6-C56B322310DA@caktusgroup.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hello,

I'm running into performance issues with various queries on a  
PostgreSQL database (of books).  I'm having trouble understanding the  
thinking behind the query planner in this scenario:
http://dpaste.com/hold/80101/
(also attached at bottom of email)

Relation sizes:
dimension_books: 1998766 rows
dimension_library_books: 10397943 rows
Version: PostgreSQL 8.3.3 on x86_64-pc-linux-gnu, compiled by GCC cc  
(GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)

Why does the query planner change when adding OFFSET?  Is there a way  
to force it to use the first plan?  The second plan is relatively  
slower than the first.  I've run ANALYZE recently and played around  
with different sets of indexes, but I believe my knowledge here is  
limited.

count() is equally as slow (SELECT count(DISTINCT  
"dimension_book"."call")...).  Eventually I want to paginate the  
results, kind of like the PostgreSQL Archive search:

Results 1-20 of more than 1000.
Searching in 706,529 pages took 0.13221 seconds.
Result pages: 1 2 3 4 5 6 7 8 9 10 11 ... Next

I assume it implements something something along these lines?

Thanks,
   colin

/******************************************************
                            Table "public.dimension_library_books"
    Column   |  Type   |                              Modifiers
------------+--------- 
+----------------------------------------------------------------------
  id         | integer | not null default  
nextval('dimension_library_books_id_seq'::regclass)
  book_id    | integer | not null
  library_id | integer | not null
Indexes:
     "dimension_library_books_pkey" PRIMARY KEY, btree (id)
     "dimension_library_books_book_id" btree (book_id)
     "dimension_library_books_library_id" btree (library_id)
Foreign-key constraints:
     "dimension_library_books_book_id_fkey" FOREIGN KEY (book_id)  
REFERENCES dimension_book(id) DEFERRABLE INITIALLY DEFERRED
     "dimension_library_books_library_id_fkey" FOREIGN KEY  
(library_id) REFERENCES dimension_library(id) DEFERRABLE INITIALLY  
DEFERRED

                                   Table "public.dimension_book"
   Column  |          Type          |                          Modifiers
----------+------------------------ 
+-------------------------------------------------------------
  id       | integer                | not null default  
nextval('dimension_book_id_seq'::regclass)
  acno     | character varying(255) |
  title    | character varying(255) |
  allusage | double precision       |
  dousage  | double precision       |
  comusage | double precision       |
  year     | integer                |
  language | character varying(255) |
  bclass   | character varying(255) |
  call     | character varying(255) | not null
Indexes:
     "dimension_book_pkey" PRIMARY KEY, btree (id)
     "call_idx" btree (call)
******************************************************/

dimension=# EXPLAIN ANALYZE
SELECT   DISTINCT ON ("dimension_book"."call")
          "dimension_book"."title"
FROM     "dimension_book"
          INNER JOIN "dimension_library_books"
            ON ("dimension_book"."id" =  
"dimension_library_books"."book_id")
WHERE    ("dimension_book"."call" >= 'PA0000'
           AND "dimension_library_books"."library_id" IN (12,15,20))
ORDER BY "dimension_book"."call" ASC
LIMIT 10;
                                                                                   QUERY 
  PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..19141.37 rows=10 width=105) (actual  
time=0.349..1.874 rows=10 loops=1)
    ->  Unique  (cost=0.00..15389657.66 rows=8040 width=105) (actual  
time=0.348..1.865 rows=10 loops=1)
          ->  Nested Loop  (cost=0.00..15389443.94 rows=85489  
width=105) (actual time=0.344..1.832 rows=14 loops=1)
                ->  Index Scan using call_idx on dimension_book   
(cost=0.00..311156.04 rows=806644 width=105) (actual time=0.118..0.452  
rows=133 loops=1)
                      Index Cond: ((call)::text >= 'PA0000'::text)
                ->  Index Scan using dimension_library_books_book_id  
on dimension_library_books  (cost=0.00..18.61 rows=7 width=4) (actual  
time=0.009..0.009 rows=0 loops=133)
                      Index Cond: (dimension_library_books.book_id =  
dimension_book.id)
                      Filter: (dimension_library_books.library_id =  
ANY ('{12,15,20}'::integer[]))
  Total runtime: 1.947 ms
(9 rows)

Time: 3.157 ms

dimension=# EXPLAIN ANALYZE
SELECT   DISTINCT ON ("dimension_book"."call")
          "dimension_book"."title"
FROM     "dimension_book"
          INNER JOIN "dimension_library_books"
            ON ("dimension_book"."id" =  
"dimension_library_books"."book_id")
WHERE    ("dimension_book"."call" >= 'PA0000'
           AND "dimension_library_books"."library_id" IN (12,15,20))
ORDER BY "dimension_book"."call" ASC
LIMIT 10 OFFSET 100;
                                                                                     QUERY 
  PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=137122.20..137122.73 rows=10 width=105) (actual  
time=3428.164..3428.180 rows=10 loops=1)
    ->  Unique  (cost=137116.88..137544.33 rows=8040 width=105)  
(actual time=3427.981..3428.159 rows=110 loops=1)
          ->  Sort  (cost=137116.88..137330.60 rows=85489 width=105)  
(actual time=3427.978..3428.039 rows=212 loops=1)
                Sort Key: dimension_book.call
                Sort Method:  quicksort  Memory: 34844kB
                ->  Hash Join  (cost=71699.90..133790.78 rows=85489  
width=105) (actual time=1676.993..2624.015 rows=167419 loops=1)
                      Hash Cond: (dimension_library_books.book_id =  
dimension_book.id)
                      ->  Bitmap Heap Scan on dimension_library_books   
(cost=3951.25..63069.35 rows=211789 width=4) (actual  
time=112.627..581.554 rows=426156 loops=1)
                            Recheck Cond: (library_id = ANY  
('{12,15,20}'::integer[]))
                            ->  Bitmap Index Scan on  
dimension_library_books_library_id  (cost=0.00..3898.30 rows=211789  
width=0) (actual time=95.030..95.030 rows=426156 loops=1)
                                  Index Cond: (library_id = ANY  
('{12,15,20}'::integer[]))
                      ->  Hash  (cost=57665.60..57665.60 rows=806644  
width=105) (actual time=1484.803..1484.803 rows=799876 loops=1)
                            ->  Seq Scan on dimension_book   
(cost=0.00..57665.60 rows=806644 width=105) (actual  
time=37.391..1028.518 rows=799876 loops=1)
                                  Filter: ((call)::text >=  
'PA0000'::text)
  Total runtime: 3446.154 ms
(15 rows)

Time: 3447.396 ms


-- 
Colin Copeland
Caktus Consulting Group, LLC
P.O. Box 1454
Carrboro, NC 27510
(919) 951-0052
http://www.caktusgroup.com


Responses

pgsql-performance by date

Next:From: Richard BroersmaDate: 2008-09-23 22:07:50
Subject: Re: query planner and scanning methods
Previous:From: Simon RiggsDate: 2008-09-23 19:01:15
Subject: Re: Choosing a filesystem

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group