Skip site navigation (1) Skip section navigation (2)

Query optimizer 8.0.1 (and 8.0)

From: pgsql(at)mohawksoft(dot)com
To: pgsql-hackers(at)postgresql(dot)org
Subject: Query optimizer 8.0.1 (and 8.0)
Date: 2005-02-04 22:10:18
Message-ID: 16398.24.91.171.78.1107555018.squirrel@mail.mohawksoft.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Here's one:
I have the USA census TIGER database loaded, the WHOLE THING, the whole
country. It isn't the biggest database, but it is about 40G before
indexes. Every table is over a few million rows. I can quite honestly say,
a sequential scan is almost never the right thing to do.

Info below.

Here is the query:
select * from rt1, rt2 where rt1.tlid = rt2.tlid and (zipr = 2186 or
zipl=2186);

Well, when you look at the explain, obviously it looks like the sequential
scan is better, but trust me, it takes a good few minutes for the query to
respond with sequential scan enabled. I've run analyze and everything.

I suspect that analyze only samples a very small amount of the database
and gets the wrong idea about it. Is there a way to force analyze to
sample more  rows?

tiger=# analyze verbose rt2 ;
INFO:  analyzing "public.rt2"
INFO:  "rt2": scanned 3000 of 1139825 pages, containing 84160 live rows
and 0 dead rows; 3000 rows in sample, 31975891 estimated total rows
ANALYZE
tiger=# analyze verbose rt1 ;
INFO:  analyzing "public.rt1"
INFO:  "rt1": scanned 3000 of 1527360 pages, containing 90951 live rows
and 0 dead rows; 3000 rows in sample, 46304973 estimated total rows
ANALYZE



tiger=# explain select * from rt1, rt2 where rt1.tlid = rt2.tlid and (zipr
= 2186 or zipl=2186);
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Hash Join  (cost=121978.81..3996190.89 rows=21118 width=520)
   Hash Cond: ("outer".tlid = "inner".tlid)
   ->  Seq Scan on rt2  (cost=0.00..1459291.36 rows=31946636 width=218)
   ->  Hash  (cost=120662.36..120662.36 rows=30579 width=302)
         ->  Index Scan using rt1_zipr, rt1_zipl on rt1 
(cost=0.00..120662.36 rows=30579 width=302)
               Index Cond: ((zipr = 2186) OR (zipl = 2186))
(6 rows)

tiger=# set enable_seqscan=no;
SET
tiger=# explain select * from rt1, rt2 where rt1.tlid = rt2.tlid and (zipr
= 2186 or zipl=2186);
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..46868256.00 rows=21118 width=520)
   ->  Index Scan using rt1_zipr, rt1_zipl on rt1  (cost=0.00..120662.36
rows=30579 width=302)
         Index Cond: ((zipr = 2186) OR (zipl = 2186))
   ->  Index Scan using rt2_tlid on rt2  (cost=0.00..1523.80 rows=396
width=218)
         Index Cond: ("outer".tlid = rt2.tlid)




            Table "public.rt1"
  Column   |       Type        | Modifiers
-----------+-------------------+-----------
 tlid      | integer           |
 side1     | character varying |
 source    | character varying |
 fedirp    | character varying |
 fename    | character varying |
 fetype    | character varying |
 fedirs    | character varying |
 cfcc      | character varying |
 fraddl    | character varying |
 toaddl    | character varying |
 fraddr    | character varying |
 toaddr    | character varying |
 friaddl   | character varying |
 toiaddl   | character varying |
 friaddr   | character varying |
 toiaddr   | character varying |
 zipl      | integer           |
 zipr      | integer           |
 aianhhfpl | character varying |
 aianhhfpr | character varying |
 aihhtlil  | character varying |
 aihhtlir  | character varying |
 statel    | character varying |
 stater    | character varying |
 countyl   | character varying |
 countyr   | character varying |
 cousubl   | character varying |
 cousubr   | character varying |
 submcdl   | character varying |
 submcdr   | character varying |
 placel    | character varying |
 placer    | character varying |
 tractl    | character varying |
 tractr    | character varying |
 blockl    | character varying |
 blockr    | character varying |
 frlong    | numeric           |
 frlat     | numeric           |
 tolong    | numeric           |
 tolat     | numeric           |
Indexes:
    "rt1_fename" btree (fename)
    "rt1_frlat" btree (frlat)
    "rt1_frlong" btree (frlong)
    "rt1_tlid" btree (tlid)
    "rt1_tolat" btree (tolat)
    "rt1_tolong" btree (tolong)
    "rt1_zipl" btree (zipl)
    "rt1_zipr" btree (zipr)

      Table "public.rt2"
 Column |  Type   | Modifiers
--------+---------+-----------
 tlid   | integer |
 rtsq   | integer |
 long1  | numeric |
 lat1   | numeric |
 long2  | numeric |
 lat2   | numeric |
 long3  | numeric |
 lat3   | numeric |
 long4  | numeric |
 lat4   | numeric |
 long5  | numeric |
 lat5   | numeric |
 long6  | numeric |
 lat6   | numeric |
 long7  | numeric |
 lat7   | numeric |
 long8  | numeric |
 lat8   | numeric |
 long9  | numeric |
 lat9   | numeric |
 long10 | numeric |
 lat10  | numeric |
Indexes:
    "rt2_tlid" btree (tlid)



Responses

pgsql-hackers by date

Next:From: Philip WarnerDate: 2005-02-04 22:14:16
Subject: Re: Escaping the ARC patent
Previous:From: Tom LaneDate: 2005-02-04 22:02:07
Subject: Re: Escaping the ARC patent

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group