Re: Slow query on OS X box

From: "Patrick Hatcher" <PHatcher(at)macys(dot)com>
To: "Rod Taylor <rbt" <rbt(at)rbt(dot)ca>
Cc: Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Slow query on OS X box
Date: 2003-01-22 20:49:49
Message-ID: OFD9C08201.63194703-ON88256CB6.006EA449@fds.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Sorry I'm being really dense today. I didn't even notice the 3.2 million
row being returned. :(

To answer your question, no, all fields would not have data. The data we
receive is from a Web log file. It's parsed and then uploaded to this
table.

I guess the bigger issue is that when trying to do aggregates, grouping by
the wizard field, it takes just as long.

Ex:
mdc_oz=# explain analyze select wizard,count(wizard) from search_log where
sdate
between '2002-12-01' and '2003-01-15' group by wizard;
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1083300.43..1112411.55 rows=388148 width=10) (actual
time=229503.85..302617.75 rows=14 loops=1)
-> Group (cost=1083300.43..1102707.84 rows=3881482 width=10) (actual
time=229503.60..286014.83 rows=3717161 loops=1)
-> Sort (cost=1083300.43..1093004.14 rows=3881482 width=10)
(actual time=229503.57..248415.81 rows=3717161 loops=1)
Sort Key: wizard
-> Seq Scan on search_log (cost=0.00..575217.57
rows=3881482 width=10) (actual time=91235.76..157559.58 rows=3717161
loops=1)
Filter: ((sdate >= '2002-12-01'::date) AND (sdate
<= '2003-01-15'::date))
Total runtime: 302712.48 msec
(7 rows)

Thanks again for the help
Patrick Hatcher


Rod Taylor
<rbt(at)rbt(dot)ca> To: Patrick Hatcher <PHatcher(at)macys(dot)com>
cc: Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
01/22/2003 Subject: Re: [PERFORM] Slow query on OS X box
12:02 PM

On Wed, 2003-01-22 at 13:26, Patrick Hatcher wrote:
> I have a table that contains over 13 million rows. This query takes an
> extremely long time to return. I've vacuum full, analyzed, and
re-indexed
> the table. Still the results are the same. Any ideas?

Yeah, you're pulling out 3.2 million rows from (possibly) a wide table
bytewise. Do all of those fields actually have data? Thats always
going to take a while -- and I find it hard to believe you're actually
doing something with all of those rows that runs regularly.

If every one of those rows was maxed out (ignoring the text field at the
end) you could have ~ 15GB of data to pull out. Without knowing the
type of data actually in the table, I'm going to bet it's a harddrive
limitation.

The index on 'wizard' is next to useless as at least 1/4 of the data in
the table is under the same key. You might try a partial index on
'wizard' (skip the value 'Keyword'). It won't help this query, but
it'll help ones looking for values other than 'Keyword'.

Anyway, you might try a CURSOR. Fetch rows out 5000 at a time, do some
work with them, then grab some more. This -- more or less -- will allow
you to process the rows received while awaiting the remaining lines to
be processed by the database. Depending on what you're doing with them
it'll give a chance for the diskdrive to catch up. If the kernels smart
it'll read ahead of the scan. This doesn't remove read time, but hides
it while you're transferring the data out (from the db to your client)
or processing it.

> mdc_oz=# explain analyze select wizard from search_log where wizard
> ='Keyword' and sdate between '2002-12-01' and '2003-01-15';
> QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------

> Seq Scan on search_log (cost=0.00..609015.34 rows=3305729 width=10)
> (actual time=99833.83..162951.25 rows=3280573 loops=1)
> Filter: ((wizard = 'Keyword'::character varying) AND (sdate >
> = '2002-12-01'::date) AND (sdate <= '2003-01-15'::date))
> Total runtime: 174713.25 msec
> (3 rows)
--
Rod Taylor <rbt(at)rbt(dot)ca>

PGP Key: http://www.rbt.ca/rbtpub.asc
(See attached file: signature.asc)

Attachment Content-Type Size
signature.asc application/octet-stream 194 bytes

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tomasz Myrta 2003-01-22 20:58:54 Re: Slow query on OS X box
Previous Message Ron Johnson 2003-01-22 20:35:22 Re: Slow query on OS X box