Skip site navigation (1) Skip section navigation (2)

Re: BUG #3667: Job scheduling with Greenplum fails

From: "Roberts, Jon" <Jon(dot)Roberts(at)asurion(dot)com>
To: 'Andrew Sullivan' <ajs(at)crankycanuck(dot)ca>
Cc: 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #3667: Job scheduling with Greenplum fails
Date: 2007-10-11 15:42:51
Message-ID: 15362F202C62EA4590F5F3E5FA15021E052866AB@nasappexc04.asurion.loc (view raw or flat)
Thread:
Lists: pgsql-bugs
Like I said in the email, I think the best solution is to put the code in a
function.  However, after I inserted a total of 25 jobs, the difference is
more noticeable and my sql is better in terms of cost and total time.

vacuum analyze pgagent.pga_job;
vacuum analyze pgagent.pga_jobclass;
vacuum analyze pgagent.pga_jobagent;

Original SQL:
explain analyze
SELECT *, 
	   (SELECT jlgstatus FROM pgagent.pga_joblog jl WHERE jl.jlgjobid =
j.jobid ORDER BY jlgid DESC LIMIT 1) AS joblastresult 
	   FROM pgagent.pga_job j
         JOIN pgagent.pga_jobclass cl ON cl.jclid=jobjclid
         LEFT OUTER JOIN pgagent.pga_jobagent ag ON ag.jagpid=jobagentid
--       + restriction +
        ORDER BY jobname
"Sort  (cost=291.31..291.37 rows=25 width=127) (actual time=2.125..2.181
rows=25 loops=1)"
"  Sort Key: j.jobname"
"  ->  Hash Join  (cost=2.13..290.73 rows=25 width=127) (actual
time=0.204..1.823 rows=25 loops=1)"
"        Hash Cond: (j.jobjclid = cl.jclid)"
"        ->  Hash Left Join  (cost=1.02..4.38 rows=25 width=105) (actual
time=0.061..0.264 rows=25 loops=1)"
"              Hash Cond: (j.jobagentid = ag.jagpid)"
"              ->  Seq Scan on pga_job j  (cost=0.00..3.25 rows=25 width=63)
(actual time=0.009..0.074 rows=25 loops=1)"
"              ->  Hash  (cost=1.01..1.01 rows=1 width=42) (actual
time=0.021..0.021 rows=1 loops=1)"
"                    ->  Seq Scan on pga_jobagent ag  (cost=0.00..1.01
rows=1 width=42) (actual time=0.005..0.008 rows=1 loops=1)"
"        ->  Hash  (cost=1.05..1.05 rows=5 width=22) (actual
time=0.050..0.050 rows=5 loops=1)"
"              ->  Seq Scan on pga_jobclass cl  (cost=0.00..1.05 rows=5
width=22) (actual time=0.005..0.026 rows=5 loops=1)"
"        SubPlan"
"          ->  Limit  (cost=0.00..11.40 rows=1 width=9) (actual
time=0.045..0.045 rows=0 loops=25)"
"                ->  Index Scan Backward using pga_joblog_pkey on pga_joblog
jl  (cost=0.00..68.38 rows=6 width=9) (actual time=0.036..0.036 rows=0
loops=25)"
"                      Filter: (jlgjobid = $0)"
"Total runtime: 2.436 ms"


My SQL:
explain analyze
select j.*, cl.*, ag.*, sub3.jlgstatus
   from pgagent.pga_job j join 
        pgagent.pga_jobclass cl on cl.jclid=jobjclid left outer join 
        pgagent.pga_jobagent ag on ag.jagpid=jobagentid 
        join (select j2.jlgstatus, 
                     sub.jlgjobid
                from pgagent.pga_joblog j2 join 
                     (select jl.jlgjobid, 
                             max(jl.jlgid) as max_jlgid 
                        from pgagent.pga_joblog jl
                       group by jl.jlgjobid) sub
                     on sub.jlgjobid = j2.jlgjobid and 
                        sub.max_jlgid = j2.jlgid) sub3
        on sub3.jlgjobid = j.jobid
--       + restriction +
order by jobname

"Sort  (cost=68.35..68.36 rows=1 width=132) (actual time=1.026..1.033 rows=2
loops=1)"
"  Sort Key: j.jobname"
"  ->  Nested Loop  (cost=36.69..68.34 rows=1 width=132) (actual
time=0.877..0.961 rows=2 loops=1)"
"        ->  Nested Loop Left Join  (cost=36.69..67.58 rows=1 width=110)
(actual time=0.838..0.882 rows=2 loops=1)"
"              ->  Hash Join  (cost=36.69..66.82 rows=1 width=68) (actual
time=0.810..0.830 rows=2 loops=1)"
"                    Hash Cond: ((j2.jlgjobid = j.jobid) AND (j2.jlgid =
sub.max_jlgid))"
"                    ->  Seq Scan on pga_joblog j2  (cost=0.00..21.50
rows=1150 width=13) (actual time=0.024..0.130 rows=44 loops=1)"
"                    ->  Hash  (cost=36.31..36.31 rows=25 width=71) (actual
time=0.542..0.542 rows=2 loops=1)"
"                          ->  Hash Join  (cost=30.81..36.31 rows=25
width=71) (actual time=0.506..0.523 rows=2 loops=1)"
"                                Hash Cond: (sub.jlgjobid = j.jobid)"
"                                ->  HashAggregate  (cost=27.25..29.75
rows=200 width=8) (actual time=0.270..0.275 rows=2 loops=1)"
"                                      ->  Seq Scan on pga_joblog jl
(cost=0.00..21.50 rows=1150 width=8) (actual time=0.013..0.116 rows=44
loops=1)"
"                                ->  Hash  (cost=3.25..3.25 rows=25
width=63) (actual time=0.206..0.206 rows=25 loops=1)"
"                                      ->  Seq Scan on pga_job j
(cost=0.00..3.25 rows=25 width=63) (actual time=0.011..0.082 rows=25
loops=1)"
"              ->  Index Scan using pga_jobagent_pkey on pga_jobagent ag
(cost=0.00..0.75 rows=1 width=42) (actual time=0.006..0.006 rows=0 loops=2)"
"                    Index Cond: (ag.jagpid = j.jobagentid)"
"        ->  Index Scan using pga_jobclass_pkey on pga_jobclass cl
(cost=0.00..0.75 rows=1 width=22) (actual time=0.016..0.020 rows=1 loops=2)"
"              Index Cond: (cl.jclid = j.jobjclid)"
"Total runtime: 1.406 ms"


I think this trend will continue as more and more jobs are inserted.


BOOYA!

Jon
-----Original Message-----
From: Andrew Sullivan [mailto:ajs(at)crankycanuck(dot)ca] 
Sent: Thursday, October 11, 2007 10:14 AM
To: Roberts, Jon
Cc: 'Tom Lane'; pgsql-bugs(at)postgresql(dot)org
Subject: Re: [BUGS] BUG #3667: Job scheduling with Greenplum fails

On Thu, Oct 11, 2007 at 07:31:44AM -0500, Roberts, Jon wrote:

> The cost is significantly lower but the total runtime is higher.  

Um, so you want developers to change the thing so that it performs
more slowly, but has a prettier estimate of how much work it's going
to do?  That seems like a poor optimisation to me.

A

-- 
Andrew Sullivan  | ajs(at)crankycanuck(dot)ca
The whole tendency of modern prose is away from concreteness.
		--George Orwell

Responses

pgsql-bugs by date

Next:From: Greg Sabino MullaneDate: 2007-10-11 22:06:45
Subject: Deferred FK / PK deletion problems
Previous:From: Andrew SullivanDate: 2007-10-11 15:14:01
Subject: Re: BUG #3667: Job scheduling with Greenplum fails

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group