Re: High CPU Usage - PostgreSQL 7.3

From: "Neil Hepworth" <nhepworth(at)gmail(dot)com>
To: "Jeff Frost" <jeff(at)frostconsultingllc(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: High CPU Usage - PostgreSQL 7.3
Date: 2006-07-12 03:48:40
Message-ID: 5a6f767a0607112048w2d492056y7796aa28a9aa3ea9@mail.gmail.com
Lists: pgsql-performance

Yes, it was the same DB, so yes, 8.1 gives roughly a fourfold improvement
(assuming the hardware and OS differences aren't significant - I'd expect
the Linux version to be faster, if anything); which certainly ain't bad! :)

Good idea on running vacuumdb -a -v on the laptop; I re-imported the database
and then ran it - output below:

INFO: free space map contains 949 pages in 537 relations
DETAIL: A total of 9024 page slots are in use (including overhead).
9024 page slots are required to track all free space.
Current limits are: 20000 page slots, 1000 relations, using 186 KB.
VACUUM
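Those numbers can be sanity-checked directly: the FSM is adequate so long as the page slots required and the relations tracked both stay under the configured limits. A quick sketch (a hypothetical helper, not anything shipped with postgres) applying that check to the output above:

```python
# Hypothetical helper: compares the numbers reported by "vacuumdb -a -v"
# (slots required, relations tracked) against the configured FSM limits
# (max_fsm_pages, max_fsm_relations).
def fsm_ok(slots_needed, relations_used, max_fsm_pages, max_fsm_relations):
    """Return True when the configured free space map can track all free space."""
    return slots_needed <= max_fsm_pages and relations_used <= max_fsm_relations

# Numbers from the VACUUM output above: 9024 slots needed, 537 relations,
# against limits of 20000 page slots and 1000 relations.
print(fsm_ok(9024, 537, 20000, 1000))  # True -> FSM settings look fine here
```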

I am about to start testing Scott's suggestion now (thanks Scott - wasn't
ignoring you, just didn't have time yesterday), and I'll get back with the
results.

Before I posted the problem to this list I was focusing more on the settings
in postgresql.conf than on optimising the query, as I thought this might be a
general problem - for all my DB updates/queries - with the way the planner was
optimising queries; maybe it was assuming CPU cost was too cheap? Do you think
I was off track in my initial thinking? Optimising these queries is
certainly beneficial, but I don't want postgres to hog the CPU for any
extended period (other apps also run on the server), so I was wondering
whether the general config settings could be tuned to always prevent this
(regardless of how poorly written my queries are :)?
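For what it's worth, the planner cost knobs I had in mind live in postgresql.conf; the fragment below just illustrates which settings exist (the values shown are the shipped 7.3 defaults, not recommendations - raising the cpu_* costs relative to page costs tells the planner that CPU work is more expensive):

```ini
# postgresql.conf -- planner cost settings (7.3-era names; shipped defaults)
random_page_cost = 4         # cost of a non-sequential disk page fetch
cpu_tuple_cost = 0.01        # cost of processing one row
cpu_index_tuple_cost = 0.001 # cost of processing one index entry
cpu_operator_cost = 0.0025   # cost of one operator/function evaluation
```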

Neil

On 12/07/06, Jeff Frost <jeff(at)frostconsultingllc(dot)com> wrote:
>
> On Wed, 12 Jul 2006, Neil Hepworth wrote:
>
> > I am using version PostgreSQL 7.3.10 (RPM:
> > postgresql73-rhel21-7.3.10-2). Unfortunately vacuumdb -a -v does not
> > give the FSM info at the end (need a newer version of postgres for
> > that). Running the same queries on 8.1 reduces the time taken to
> > about 16 minutes, though I didn't run the test on the same hardware or
> > OS as I want to keep my test server as close to production as
> > possible, so I ran the 8.1 server on my Windows laptop (2GHz Centrino
> > Duo with 2GB of RAM, yes the laptop is brand new :).
>
> Well, looks like you're at least fairly up to date, but there is a fix in
> 7.3.11 that you might want to get by upgrading to 7.3.15:
>
> * Fix race condition in transaction log management
> There was a narrow window in which an I/O operation could be
> initiated for the wrong page, leading to an Assert failure or data
> corruption.
>
> It also appears that you can run autovacuum with 7.3 (I thought maybe it only
> went back as far as 7.4).
>
> So, is the 16 minutes on your laptop with 8.1 for windows vs 1hr on the server
> for the whole set of loops?  If so, 4x isn't a bad improvement. :-)  So,
> assuming you dumped/loaded the same DB onto your laptop's postgresql server,
> what does the vacuumdb -a -v say on the laptop?  Perhaps we can use it to see
> if your fsm settings are ok.
>
> BTW, did you see Scott's posting here:
>
> http://archives.postgresql.org/pgsql-performance/2006-07/msg00091.php
>
> Since we didn't hear from you for a while, I thought perhaps Scott had hit on
> the fix.  Have you tried that yet?  It certainly would help the planner out.
>
> You might also want to turn on autovacuum and see if that helps.
>
> What's your disk subsystem like? In fact, what's the entire DB server
> hardware like?
>
> >
> > I run through a loop, executing the following or similar queries 8
> > times (well actually 12 but the last 4 don't do anything) - Jeff I've
> > attached complete outputs as files.
> >
> > A debug output further below (numbers after each method call name,
> > above each SQL statement, are times to run that statement in
> > milliseconds; the times on the lines "" are cumulative). So the total
> > for one loop is 515 seconds (multiply by 8 and that gets me to over an
> > hour); it is actually the deletes that take the most time: 179 seconds
> > and 185 seconds each loop.
> >
> > ----------------------------------------------------
> >
> > CREATE TABLE fttemp670743219 AS SELECT * FROM ftone LIMIT 0
> > INSERT INTO fttemp670743219 ( epId, start, direction, classid,
> > consolidation, cnt ) SELECT epId, TO_TIMESTAMP(start, 'YYYY-MM-DD
> > HH24:00:00.0')::timestamp AS start, direction, classid, 60 AS
> > consolidation, SUM(cnt) AS cnt FROM ftone WHERE consolidation = 0 AND
> > start < TO_TIMESTAMP('2006-07-11 14:04:34.156433+1000', 'YYYY-MM-DD
> > HH24:00:00.0')::timestamp GROUP BY epId, direction,
> > TO_TIMESTAMP(start, 'YYYY-MM-DD HH24:00:00.0')::timestamp, classid
> > DELETE FROM ONLY ftone WHERE ftone.epId = fttemp670743219.epId AND
> > ftone.direction = fttemp670743219.direction AND ftone.start =
> > fttemp670743219.start AND ftone.consolidation =
> > fttemp670743219.consolidation AND ftone.classid =
> > fttemp670743219.classid
> > INSERT INTO ftone ( epId, start, consolidation, direction, classid,
> > cnt ) SELECT epId, start, consolidation, direction, classid, cnt FROM
> > fttemp670743219
> > DROP TABLE fttemp670743219
> > DELETE FROM ftone WHERE consolidation = 0 AND start <
> > TO_TIMESTAMP((TO_TIMESTAMP('2006-07-11 14:04:34.156433+1000',
> > 'YYYY-MM-DD 00:00:00.0')::timestamp - INTERVAL '10080 MINUTE'),
> > 'YYYY-MM-DD 00:00:00.0')::timestamp
> >
> > ----------------------------------------------------
> >
> > ftone: 0:
> > createConsolidatedInTemporary: 188:
> > CREATE TABLE fttemp678233382 AS SELECT * FROM ftone LIMIT 0
> > createConsolidatedInTemporary: 76783:
> > INSERT INTO fttemp678233382 ( epPairdefnid, start, direction, classid,
> > consolidation, cnt ) SELECT epPairdefnid, TO_TIMESTAMP(start,
> > 'YYYY-MM-DD HH24:00:00.0')::timestamp AS start, direction, classid, 60
> > AS consolidation, SUM(cnt) AS cnt FROM ftone WHERE consolidation = 0
> > AND start < TO_TIMESTAMP('2006-07-12 11:02:13.865444+1000',
> > 'YYYY-MM-DD HH24:00:00.0')::timestamp GROUP BY epPairdefnid,
> > direction, TO_TIMESTAMP(start, 'YYYY-MM-DD HH24:00:00.0')::timestamp,
> > classid
> > replaceConsolidatedInMainTable: 179178:
> > DELETE FROM ONLY ftone WHERE ftone.epPairdefnid =
> > fttemp678233382.epPairdefnid AND ftone.direction =
> > fttemp678233382.direction AND ftone.start = fttemp678233382.start AND
> > ftone.consolidation = fttemp678233382.consolidation AND ftone.classid
> > = fttemp678233382.classid
> > replaceConsolidatedInMainTable: 61705:
> > INSERT INTO ftone ( epPairdefnid, start, consolidation, direction,
> > classid, cnt ) SELECT epPairdefnid, start, consolidation, direction,
> > classid, cnt FROM fttemp678233382
> > consolidate: 2656:
> > DROP TABLE fttemp678233382
> > MAIN LOOP TOTAL consolidate: 320526
> > deleteOlderThan: 184616:
> > DELETE FROM ftone WHERE consolidation = 0 AND start <
> > TO_TIMESTAMP((TO_TIMESTAMP('2006-07-12 11:02:13.865444+1000',
> > 'YYYY-MM-DD 00:00:00.0')::timestamp - INTERVAL '10080 MINUTE'),
> > 'YYYY-MM-DD 00:00:00.0')::timestamp
> > MAIN LOOP TOTAL deleteExpiredData: 505142
> > MAIN LOOP TOTAL generateStatistics: 515611
> >
> > ----------------------------------------------------
> >
> > Thanks again,
> > Neil
> >
> >
> > On 11/07/06, Jeff Frost <jeff(at)frostconsultingllc(dot)com> wrote:
> >>
> >>
> >> On Mon, 10 Jul 2006, Neil Hepworth wrote:
> >>
> >> > I should also explain that I run through these queries on multiple
> >> > tables and with some slightly different parameters for the
> >> > "consolidation" so I run through those 3 queries (or similar) 9 times
> >> > and this takes a total of about 2 hours, with high CPU usage. And I
> >> > am running the queries from a remote Java application (using JDBC),
> >> > the client is using postgresql-8.0-311.jdbc3.jar.  The explain analyse
> >> > results I have provided below are from running via pgAdmin, not the
> >> > Java app (I did a vacuum analyse of the db before running them):
> >> >
> >> >
> >>
> >> Neil, did you ever answer which version of 7.3 this is?
> >>
> >> BTW, you mentioned that this takes 2 hours, but even looping over this 9
> >> times seems like it would only take 9 minutes (55 seconds for the SELECT
> >> and 4 seconds for the DELETE = 59 seconds times 9).  Perhaps you should
> >> post the explain analyze for the actual query that takes so long as the
> >> planner output will likely be quite different.
> >>
> >> One thing I noticed is that the planner seems quite incorrect about the
> >> number of rows it expects in the SELECT.  If you ran vacuum analyze before
> >> this, perhaps your fsm settings are incorrect?  What does vacuumdb -a -v
> >> output at the end?  I'm looking for something that looks like this:
> >>
> >> INFO:  free space map: 109 relations, 204 pages stored; 1792 total pages
> >> needed
> >> DETAIL:  Allocated FSM size: 1000 relations + 20000 pages = 182 kB shared
> >> memory.
> >>
> >> I see your fsm settings are non-default, so it's also possible I'm not
> >> used to reading 7.3's explain analyze output. :-)
> >>
> >> Also, what does vmstat output look like while the query is running?
> >> Perhaps
> >> you're running into some context switching problems. It would be
> >> interesting
> >> to know how the query runs on 8.1.x just to know if we're chasing an
> >> optimization that's fixed already in a later version.
> >>
> >>
> >> > Subquery Scan "*SELECT*" (cost=59690.11..62038.38 rows=23483
> >> > width=16) (actual time=16861.73..36473.12 rows=560094 loops=1)
> >> > -> Aggregate (cost=59690.11..62038.38 rows=23483 width=16) (actual
> >> > time=16861.72..34243.63 rows=560094 loops=1)
> >> > -> Group (cost=59690.11..61451.32 rows=234827 width=16)
> >> > (actual time=16861.62..20920.12 rows=709461 loops=1)
> >> > -> Sort (cost=59690.11..60277.18 rows=234827 width=16)
> >> > (actual time=16861.62..18081.07 rows=709461 loops=1)
> >> > Sort Key: eppairdefnid, "start"
> >> >             -> Seq Scan on ftone (cost=0.00..36446.66
> >> > rows=234827 width=16) (actual time=0.45..10320.91 rows=709461 loops=1)
> >> >                   Filter: ((consolidation = 60) AND ("start" <
> >> > (to_timestamp('2006-07-10 18:43:27.391103+1000'::text,
> >> > 'YYYY-MM-DD HH24:00:00.0'::text))::timestamp without time zone))
> >> > Total runtime: 55378.68 msec
> >>
> >> > *** For the delete ***:
> >> >
> >> > Hash Join (cost=0.00..30020.31 rows=425 width=14) (actual
> >> > time=3767.47..3767.47 rows=0 loops=1)
> >> > Hash Cond: ("outer".eppairdefnid = "inner".eppairdefnid)
> >> > -> Seq Scan on ftone (cost=0.00..23583.33 rows=1286333 width=10)
> >> > (actual time=0.04..2299.94 rows=1286333 loops=1)
> >> > -> Hash (cost=0.00..0.00 rows=1 width=4) (actual
> >> > time=206.01..206.01 rows=0 loops=1)
> >> > -> Seq Scan on fttemp1600384653 (cost=0.00..0.00 rows=1
> >> > width=4) (actual time=206.00..206.00 rows=0 loops=1)
> >> > Total runtime: 3767.52 msec
> >>
> >> --
> >> Jeff Frost, Owner <jeff(at)frostconsultingllc(dot)com>
> >> Frost Consulting, LLC http://www.frostconsultingllc.com/
> >> Phone: 650-780-7908 FAX: 650-649-1954
> >>
> >
>
> --
> Jeff Frost, Owner <jeff(at)frostconsultingllc(dot)com>
> Frost Consulting, LLC http://www.frostconsultingllc.com/
> Phone: 650-780-7908 FAX: 650-649-1954
>
