Skip site navigation (1) Skip section navigation (2)

NULLS LAST performance

From: Mathieu De Zutter <mathieu(at)dezutter(dot)org>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: NULLS LAST performance
Date: 2011-02-23 19:27:11
Message-ID: AANLkTimo4nrZ3_5wgMRniMoosv8eT4dZA+Buz_HCxEr_@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi all,

Running PostgreSQL 8.4.7 (backport package from Debian Lenny).

I have some queries that are based on views, and an engine adds a few
clauses (like NULLS LAST). One of these queries has a performance problem.

The simplified form is this:

shs=# explain analyze select * from performance e JOIN part v ON
v.performance_id = e.id order by e.creation_date desc limit 10;

 QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..4.25 rows=10 width=312) (actual time=0.078..0.147
rows=10 loops=1)
   ->  Nested Loop  (cost=0.00..62180.28 rows=146294 width=312) (actual
time=0.078..0.145 rows=10 loops=1)
         ->  Index Scan Backward using performance_create_idx on performance
e  (cost=0.00..12049.21 rows=145379 width=247) (actual time=0.051..0.087
rows=10 loops=1)
         ->  Index Scan using part_performance_idx on part v
 (cost=0.00..0.33 rows=1 width=65) (actual time=0.005..0.005 rows=1
loops=10)
               Index Cond: (v.performance_id = e.id)
 Total runtime: 0.205 ms

creation_date is declared as NOT NULL, and since it's a inner join,
creation_date can never be null in this query. I'd think that if I add NULLS
LAST, it wouldn't have any effect.

However, the query with NULLS LAST (as generated by the engine):

shs=# explain analyze select * from performance e JOIN part v ON
v.performance_id = e.id order by e.creation_date desc nulls last limit 10;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=25773.76..25773.79 rows=10 width=312) (actual
time=492.959..492.963 rows=10 loops=1)
   ->  Sort  (cost=25773.76..26139.50 rows=146294 width=312) (actual
time=492.958..492.962 rows=10 loops=1)
         Sort Key: e.creation_date
         Sort Method:  top-N heapsort  Memory: 27kB
         ->  Merge Join  (cost=1.27..22612.40 rows=146294 width=312) (actual
time=0.064..367.160 rows=146294 loops=1)
               Merge Cond: (e.id = v.performance_id)
               ->  Index Scan using performance_pkey on performance e
 (cost=0.00..11989.20 rows=145379 width=247) (actual time=0.035..160.838
rows=145379 loops=1)
               ->  Index Scan using part_performance_idx on part v
 (cost=0.00..8432.35 rows=146294 width=65) (actual time=0.025..91.084
rows=146294 loops=1)
 Total runtime: 493.062 ms

Both tables have around 150k rows as you can read from the last plan.

Table performance:

                                      Table "public.performance"
     Column      |           Type           |
 Modifiers
-----------------+--------------------------+----------------------------------------------------------
 created_by      | integer                  | not null
 creation_date   | timestamp with time zone | not null
 comments        | text                     |
 owned_by        | integer                  | not null
 id              | integer                  | not null default
nextval('performance_id_seq'::regclass)
 title           | text                     |
 title_          | text                     |
 performer_id    | integer                  |
 first_medium_id | integer                  |
 vperf_id        | integer                  |
 perf_date       | partial_date             |
 bonustrack      | boolean                  | not null default false
 type_id         | integer                  | not null
 instrumental    | boolean                  | not null default false
 init_rev_level  | smallint                 | not null default 1
 curr_rev_level  | smallint                 | not null default 1
 revision_date   | timestamp with time zone |
 revised_by      | integer                  |
 object_type     | text                     | not null default
'performance'::text
 editor_note     | text                     |
 active          | boolean                  | not null default true
Indexes:
    "performance_pkey" PRIMARY KEY, btree (id)
    "performance_create_idx" btree (creation_date)
    "performance_medium_idx" btree (first_medium_id)
    "performance_own_idx" btree (owned_by)
    "performance_performer_idx" btree (performer_id)

Table part:

                                      Table "public.part"
     Column     |           Type           |                     Modifiers

----------------+--------------------------+---------------------------------------------------
 created_by     | integer                  | not null
 creation_date  | timestamp with time zone |
 comments       | text                     |
 owned_by       | integer                  | not null
 id             | integer                  | not null default
nextval('part_id_seq'::regclass)
 work_id        | integer                  | not null
 performance_id | integer                  | not null
Indexes:
    "part_pkey" PRIMARY KEY, btree (id)
    "part_own_idx" btree (owned_by)
    "part_performance_idx" btree (performance_id)
    "part_work_idx" btree (work_id)

Please advise!

Thanks.

Kind regards,

Mathieu

Responses

pgsql-performance by date

Next:From: Merlin MoncureDate: 2011-02-23 19:48:03
Subject: Re: NULLS LAST performance
Previous:From: Benjamin KrajmalnikDate: 2011-02-23 19:12:42
Subject: Unused indices

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group