Skip site navigation (1) Skip section navigation (2)

Trouble with LEFT JOIN using VIEWS.

From: Piotr Gasidło <quaker(at)barbara(dot)eu(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Trouble with LEFT JOIN using VIEWS.
Date: 2007-12-07 10:55:08
Message-ID: 4759268C.6090508@barbara.eu.org (view raw or flat)
Thread:
Lists: pgsql-performance
Hello,

I've just hit problem, that is unusual for me.

quaker=> \d sites
                                 Table "public.sites"
   Column   |       Type        |                     Modifiers 

-----------+-------------------+----------------------------------------------------
  id        | integer           | not null default 
nextval('sites_id_seq'::regclass)
  site_name | character varying | not null
  user_id   | integer           | not null
  extra     | integer           |
Indexes:
     "sites_pkey" PRIMARY KEY, btree (id)
     "sites_site_name_key_unique" UNIQUE, btree (site_name text_pattern_ops)
     "sites_user_id_key" btree (user_id)

quaker=> \d users
                                 Table "public.users"
   Column   |       Type        |                     Modifiers 

-----------+-------------------+----------------------------------------------------
  id        | integer           | not null default 
nextval('users_id_seq'::regclass)
  user_name | character varying | not null
  extra     | integer           |
Indexes:
     "users_pkey" PRIMARY KEY, btree (id)

Both tables filled with 100k records of random data. In users id is in 
range from 1..100k, same in sites. In sites user_id is random, range 
from 1..150k.

I've created views:

quaker=> \d users_secure
         View "public.users_secure"
   Column   |       Type        | Modifiers
-----------+-------------------+-----------
  id        | integer           |
  user_name | character varying |
View definition:
  SELECT users.id, users.user_name
    FROM users;

quaker=> \d users_secure_with_has_extra
  View "public.users_secure_with_has_extra"
   Column   |       Type        | Modifiers
-----------+-------------------+-----------
  id        | integer           |
  user_name | character varying |
  has_extra | boolean           |
View definition:
  SELECT users.id, users.user_name, users.extra IS NOT NULL AS has_extra
    FROM users;

Now, when I do simple query to find all data for sites matching 
site_name like 'H3bh%' (there are at least one record in sites matching 
this condition).

quaker=> explain analyze select s.site_name,u.user_name from 
sites_secure s left join users_secure_with_has_extra u on u.id = 
s.user_id where site_name like 'H3bh%' order by site_name limit 10;
 
QUERY PLAN 

-----------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=3897.02..3897.03 rows=2 width=44) (actual 
time=430.326..430.331 rows=1 loops=1)
    ->  Sort  (cost=3897.02..3897.03 rows=2 width=44) (actual 
time=430.321..430.323 rows=1 loops=1)
          Sort Key: sites.site_name
          ->  Nested Loop Left Join  (cost=0.00..3897.01 rows=2 
width=44) (actual time=290.103..430.301 rows=1 loops=1)
                Join Filter: ("inner".id = "outer".user_id)
                ->  Index Scan using sites_site_name_key_unique on sites 
  (cost=0.00..6.01 rows=1 width=16) (actual time=0.044..0.054 rows=1 
loops=1)
                      Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) 
AND ((site_name)::text ~<~ 'H3bi'::text))
                      Filter: ((site_name)::text ~~ 'H3bh%'::text)
                ->  Seq Scan on users  (cost=0.00..1641.00 rows=100000 
width=20) (actual time=0.007..245.406 rows=100000 loops=1)
  Total runtime: 430.432 ms
(10 rows)

When I resign from LEFT JOIN users_secure_with_has_extra, and put JOIN 
instead I've got:

quaker=> explain analyze select s.site_name,u.user_name from 
sites_secure s join users_secure_with_has_extra u on u.id = s.user_id 
where site_name like 'H3bh%' order by site_name limit 10;
 
QUERY PLAN 

-----------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=9.05..9.06 rows=1 width=24) (actual time=0.112..0.118 
rows=1 loops=1)
    ->  Sort  (cost=9.05..9.06 rows=1 width=24) (actual 
time=0.106..0.108 rows=1 loops=1)
          Sort Key: sites.site_name
          ->  Nested Loop  (cost=0.00..9.04 rows=1 width=24) (actual 
time=0.073..0.088 rows=1 loops=1)
                ->  Index Scan using sites_site_name_key_unique on sites 
  (cost=0.00..6.01 rows=1 width=16) (actual time=0.044..0.050 rows=1 
loops=1)
                      Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) 
AND ((site_name)::text ~<~ 'H3bi'::text))
                      Filter: ((site_name)::text ~~ 'H3bh%'::text)
                ->  Index Scan using users_pkey on users 
(cost=0.00..3.02 rows=1 width=16) (actual time=0.019..0.023 rows=1 loops=1)
                      Index Cond: (users.id = "outer".user_id)
  Total runtime: 0.216 ms
(10 rows)

As explain shows PostgreSQL is using index scan on users, instead of seq 
scan like in example above.

Now. When I use view with no has_extra field (important: field is a 
simple function on extra field) I get expectable results. Both using 
indexes.

quaker=> explain analyze select s.site_name,u.user_name from 
sites_secure s left join users_secure u on u.id = s.user_id where 
site_name like 'H3bh%' order by site_name limit 10;
 
QUERY PLAN 

-----------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=9.05..9.06 rows=1 width=24) (actual time=0.111..0.117 
rows=1 loops=1)
    ->  Sort  (cost=9.05..9.06 rows=1 width=24) (actual 
time=0.105..0.107 rows=1 loops=1)
          Sort Key: sites.site_name
          ->  Nested Loop Left Join  (cost=0.00..9.04 rows=1 width=24) 
(actual time=0.072..0.087 rows=1 loops=1)
                ->  Index Scan using sites_site_name_key_unique on sites 
  (cost=0.00..6.01 rows=1 width=16) (actual time=0.043..0.049 rows=1 
loops=1)
                      Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) 
AND ((site_name)::text ~<~ 'H3bi'::text))
                      Filter: ((site_name)::text ~~ 'H3bh%'::text)
                ->  Index Scan using users_pkey on users 
(cost=0.00..3.02 rows=1 width=16) (actual time=0.019..0.022 rows=1 loops=1)
                      Index Cond: (users.id = "outer".user_id)
  Total runtime: 0.216 ms
(10 rows)

quaker=> explain analyze select s.site_name,u.user_name from 
sites_secure s join users_secure u on u.id = s.user_id where site_name 
like 'H3bh%' order by site_name limit 10;
 
QUERY PLAN 

-----------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=9.05..9.06 rows=1 width=24) (actual time=0.109..0.115 
rows=1 loops=1)
    ->  Sort  (cost=9.05..9.06 rows=1 width=24) (actual 
time=0.104..0.106 rows=1 loops=1)
          Sort Key: sites.site_name
          ->  Nested Loop  (cost=0.00..9.04 rows=1 width=24) (actual 
time=0.071..0.086 rows=1 loops=1)
                ->  Index Scan using sites_site_name_key_unique on sites 
  (cost=0.00..6.01 rows=1 width=16) (actual time=0.042..0.048 rows=1 
loops=1)
                      Index Cond: (((site_name)::text ~>=~ 'H3bh'::text) 
AND ((site_name)::text ~<~ 'H3bi'::text))
                      Filter: ((site_name)::text ~~ 'H3bh%'::text)
                ->  Index Scan using users_pkey on users 
(cost=0.00..3.02 rows=1 width=16) (actual time=0.018..0.021 rows=1 loops=1)
                      Index Cond: (users.id = "outer".user_id)
  Total runtime: 0.214 ms
(10 rows)

Why?

quaker=> select version();
                                                         version 

-----------------------------------------------------------------------------------------------------------------------
  PostgreSQL 8.1.10 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.3 
20070831 (prerelease) (Ubuntu 4.1.2-16ubuntu1)
(1 row)



Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2007-12-07 15:36:04
Subject: Re: Trouble with LEFT JOIN using VIEWS.
Previous:From: Guillaume CottenceauDate: 2007-12-07 10:50:19
Subject: Cost-Based Vacuum Delay tuning

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group