NULLS LAST performance

From: Mathieu De Zutter <mathieu(at)dezutter(dot)org>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: NULLS LAST performance
Date: 2011-02-23 19:27:11
Message-ID: AANLkTimo4nrZ3_5wgMRniMoosv8eT4dZA+Buz_HCxEr_@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi all,

Running PostgreSQL 8.4.7 (backport package from Debian Lenny).

I have some queries that are based on views, and an engine adds a few
clauses (like NULLS LAST). One of these queries has a performance problem.

The simplified form is this:

shs=# explain analyze select * from performance e JOIN part v ON
v.performance_id = e.id order by e.creation_date desc limit 10;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..4.25 rows=10 width=312) (actual time=0.078..0.147
rows=10 loops=1)
-> Nested Loop (cost=0.00..62180.28 rows=146294 width=312) (actual
time=0.078..0.145 rows=10 loops=1)
-> Index Scan Backward using performance_create_idx on performance
e (cost=0.00..12049.21 rows=145379 width=247) (actual time=0.051..0.087
rows=10 loops=1)
-> Index Scan using part_performance_idx on part v
(cost=0.00..0.33 rows=1 width=65) (actual time=0.005..0.005 rows=1
loops=10)
Index Cond: (v.performance_id = e.id)
Total runtime: 0.205 ms

creation_date is declared as NOT NULL, and since it's a inner join,
creation_date can never be null in this query. I'd think that if I add NULLS
LAST, it wouldn't have any effect.

However, the query with NULLS LAST (as generated by the engine):

shs=# explain analyze select * from performance e JOIN part v ON
v.performance_id = e.id order by e.creation_date desc nulls last limit 10;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=25773.76..25773.79 rows=10 width=312) (actual
time=492.959..492.963 rows=10 loops=1)
-> Sort (cost=25773.76..26139.50 rows=146294 width=312) (actual
time=492.958..492.962 rows=10 loops=1)
Sort Key: e.creation_date
Sort Method: top-N heapsort Memory: 27kB
-> Merge Join (cost=1.27..22612.40 rows=146294 width=312) (actual
time=0.064..367.160 rows=146294 loops=1)
Merge Cond: (e.id = v.performance_id)
-> Index Scan using performance_pkey on performance e
(cost=0.00..11989.20 rows=145379 width=247) (actual time=0.035..160.838
rows=145379 loops=1)
-> Index Scan using part_performance_idx on part v
(cost=0.00..8432.35 rows=146294 width=65) (actual time=0.025..91.084
rows=146294 loops=1)
Total runtime: 493.062 ms

Both tables have around 150k rows as you can read from the last plan.

Table performance:

Table "public.performance"
Column | Type |
Modifiers
-----------------+--------------------------+----------------------------------------------------------
created_by | integer | not null
creation_date | timestamp with time zone | not null
comments | text |
owned_by | integer | not null
id | integer | not null default
nextval('performance_id_seq'::regclass)
title | text |
title_ | text |
performer_id | integer |
first_medium_id | integer |
vperf_id | integer |
perf_date | partial_date |
bonustrack | boolean | not null default false
type_id | integer | not null
instrumental | boolean | not null default false
init_rev_level | smallint | not null default 1
curr_rev_level | smallint | not null default 1
revision_date | timestamp with time zone |
revised_by | integer |
object_type | text | not null default
'performance'::text
editor_note | text |
active | boolean | not null default true
Indexes:
"performance_pkey" PRIMARY KEY, btree (id)
"performance_create_idx" btree (creation_date)
"performance_medium_idx" btree (first_medium_id)
"performance_own_idx" btree (owned_by)
"performance_performer_idx" btree (performer_id)

Table part:

Table "public.part"
Column | Type | Modifiers

----------------+--------------------------+---------------------------------------------------
created_by | integer | not null
creation_date | timestamp with time zone |
comments | text |
owned_by | integer | not null
id | integer | not null default
nextval('part_id_seq'::regclass)
work_id | integer | not null
performance_id | integer | not null
Indexes:
"part_pkey" PRIMARY KEY, btree (id)
"part_own_idx" btree (owned_by)
"part_performance_idx" btree (performance_id)
"part_work_idx" btree (work_id)

Please advise!

Thanks.

Kind regards,

Mathieu

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2011-02-23 19:48:03 Re: NULLS LAST performance
Previous Message Benjamin Krajmalnik 2011-02-23 19:12:42 Unused indices