Skip site navigation (1) Skip section navigation (2)

Re: Subquery in a JOIN not getting restricted?

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Jay Levitt <jay(dot)levitt(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Subquery in a JOIN not getting restricted?
Date: 2011-11-12 09:28:45
Message-ID: 4EBE3C4D.9010208@archidevsys.co.nz (view raw or flat)
Thread:
Lists: pgsql-performance
On 10/11/11 09:39, Jay Levitt wrote:
> Kevin Grittner wrote:
>> Jay Levitt<jay(dot)levitt(at)gmail(dot)com>  wrote:
>>
>>> I don't get why the GROUP BY in this subquery forces it to scan
>>> the entire users table (seq scan here, index scan on a larger
>>> table) when there's only one row in users that can match:
>
>> Are you sure there's a plan significantly faster than 1.3 ms?
>
> Yep!  Watch this:
>
> drop schema if exists jaytest cascade;
> create schema jaytest;
> set search_path to jaytest;
>
> create table questions (
>   id int not null primary key,
>   user_id int not null
> );
> insert into questions
>   select generate_series(1,1100), (random()*2000000)::int;
>
> create table users (
>   id int not null primary key
> );
> insert into users select generate_series(1, 2000000);
>
> vacuum freeze analyze;
>
> explain analyze
> select questions.id
> from questions
> join (
>   select u.id
>   from users as u
>   group by u.id
> ) as s
> on s.id = questions.user_id
> where questions.id = 1;
>
> -----------------------
>  Merge Join  (cost=8.28..90833.02 rows=1818 width=4) (actual 
> time=888.787..888.790 rows=1 loops=1)
>    Merge Cond: (u.id = questions.user_id)
>    ->  Group  (cost=0.00..65797.47 rows=2000000 width=4) (actual 
> time=0.017..735.509 rows=1747305 loops=1)
>          ->  Index Scan using users_pkey on users u  
> (cost=0.00..60797.47 rows=2000000 width=4) (actual time=0.015..331.990 
> rows=1747305 loops=1)
>    ->  Materialize  (cost=8.28..8.29 rows=1 width=8) (actual 
> time=0.013..0.015 rows=1 loops=1)
>          ->  Sort  (cost=8.28..8.28 rows=1 width=8) (actual 
> time=0.012..0.013 rows=1 loops=1)
>                Sort Key: questions.user_id
>                Sort Method:  quicksort  Memory: 25kB
>                ->  Index Scan using questions_pkey on questions 
> (cost=0.00..8.27 rows=1 width=8) (actual time=0.006..0.006 rows=1 
> loops=1)
>                      Index Cond: (id = 1)
>  Total runtime: 888.832 ms
> (11 rows)
>
> explain analyze
> select questions.id
> from questions
> join (
>   select u.id
>   from users as u
> ) as s
> on s.id = questions.user_id
> where questions.id = 1;
>
> -----------------------
>  Nested Loop  (cost=0.00..16.77 rows=1 width=4) (actual 
> time=0.019..0.021 rows=1 loops=1)
>    ->  Index Scan using questions_pkey on questions  (cost=0.00..8.27 
> rows=1 width=8) (actual time=0.008..0.009 rows=1 loops=1)
>          Index Cond: (id = 1)
>    ->  Index Scan using users_pkey on users u  (cost=0.00..8.49 rows=1 
> width=4) (actual time=0.007..0.007 rows=1 loops=1)
>          Index Cond: (u.id = questions.user_id)
>  Total runtime: 0.045 ms
> (6 rows)
>
>> That said, there might be some room for an optimization which pushes
>> that test into the query with the "group by" clause.  I don't know
>> if there's a problem with that which I'm missing, the construct was
>> judged to be too rare to be worth the cost of testing for it, or
>> it's just that nobody has yet gotten to it.
>
> Anyone have more insights on whether this is hard to optimize or 
> simply not-yet-optimized?  And if the latter, where might I start 
> looking? (Not that you -really- want me to submit a patch; my C has 
> regressed to the "try an ampersand.  OK, try an asterisk." level...)
>
> Jay
>
Minor note:

'PRIMARY KEY' gives you a 'NOT NULL'  constraint for free.

In response to

pgsql-performance by date

Next:From: Pavel StehuleDate: 2011-11-13 08:57:31
Subject: Re: Heavy contgnous load
Previous:From: Greg SmithDate: 2011-11-12 06:31:16
Subject: Re: WAL partition filling up after high WAL activity

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group