Re: Performance impact of NULLs and variable length fields

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Postgresql General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance impact of NULLs and variable length fields
Date: 2001-07-22 02:37:50
Message-ID: 20010722123750.A16356@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Well, Tom pointed out that without the query itself this was fairly
meaningless. So here it is:

select c.id, sum(d.cost)
from c, l, d
where c.id = l.cid
and l.clid = d.clid
and c.accman = '2500'
and billid is null
group by c.id;

EXPLAIN output:
Aggregate (cost=51479.55..51507.07 rows=550 width=48) (actual time=26192.07..26222.07 rows=156 loops=1)
-> Group (cost=51479.55..51493.31 rows=5503 width=48) (actual time=26165.81..26186.14 rows=2029 loops=156)
-> Sort (cost=51479.55..51479.55 rows=5503 width=48) (actual time=26164.90..26167.77 rows=2029 loops=1)
-> Hash Join (cost=676.79..51137.62 rows=5503 width=48) (actual time=3703.82..26122.23 rows=2029 loops=1)
-> Seq Scan on d (cost=0.00..50150.07 rows=32262 width=12) (actual time=67.35..24921.42 rows=41045 loops=1)
-> Hash (cost=671.31..671.31 rows=2193 width=36) (actual time=594.55..594.55 rows=0 loops=1)
-> Merge Join (cost=260.50..671.31 rows=2193 width=36) (actual time=251.95..551.40 rows=1330 loops=1)
-> Index Scan using l_cid on l (cost=0.00..364.36 rows=6308 width=20) (actual time=33.95..340.52 rows=6307 loops=1)
-> Sort (cost=260.50..260.50 rows=1305 width=16) (actual time=142.85..145.34 rows=1337 loops=1)
-> Seq Scan on c (cost=0.00..192.94 rows=1305 width=16) (actual time=14.99..103.40 rows=1314 loops=1)

d has over 1.4 million rows, l about 10,000 and c about 2,000.

NULL columns and variable length fields used extensivly.

On Sat, Jul 21, 2001 at 05:25:38PM +1000, Martijn van Oosterhout wrote:
> Sometime ago somebody asked if it made a difference adn it was suggested
> that the effect was probably marginal. I ran a profiler over postgres doing
> a large query and these are the top 10 functions:
>
> % cumulative self self total
> time seconds seconds calls ms/call ms/call name
> 16.04 0.51 0.51 1676772 0.00 0.00 nocachegetattr
> 11.95 0.89 0.38 1427403 0.00 0.00 heapgettup
> 10.06 1.21 0.32 2955372 0.00 0.00 LockBuffer
> 6.92 1.43 0.22 3406475 0.00 0.00 ExecEvalExpr
> 5.03 1.59 0.16 1617018 0.00 0.00 AllocSetReset
> 4.40 1.73 0.14 1427403 0.00 0.00 heap_getnext
> 4.09 1.86 0.13 1844339 0.00 0.00 ExecEvalVar
> 4.09 1.99 0.13 1441330 0.00 0.00 ReleaseBuffer
> 4.09 2.12 0.13 1427064 0.00 0.00 SeqNext
> 3.46 2.23 0.11 1552338 0.00 0.00 ExecQual

--
Martijn van Oosterhout <kleptog(at)svana(dot)org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Mark kirkwood 2001-07-22 03:06:52 How Postgresql Compares... Count(*) and others
Previous Message Tom Lane 2001-07-21 22:21:42 Re: COPY failure