Small Queries Really Fast, Large Queries Really Slow...

From: Tom Wilcox <hungrytom(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Small Queries Really Fast, Large Queries Really Slow...
Date: 2010-06-24 10:45:25
Message-ID: AANLkTikDkh7JsXoG_xBm36lD4W75tcqwRYC2p_sZ8Qn_@mail.gmail.com
Lists: pgsql-performance

Hi again!

I have finally got my Ubuntu VirtualBox VM running PostgreSQL with PL/Python
and am now looking at performance.

So here's the scenario:

We have a great big table:

cse=# \d nlpg.match_data
Table "nlpg.match_data"
Column | Type |
Modifiers
-------------------+----------+--------------------------------------------------------------------------
premise_id | integer |
usrn | bigint |
org | text |
sao | text |
level | text |
pao | text |
name | text |
street | text |
town | text |
postcode | text |
match_data_id | integer | not null default
nextval('nlpg.match_data_match_data_id_seq1'::regclass)
addr_str | text |
tssearch_name | tsvector |
tssearch_street | tsvector |
tssearch_addr_str | tsvector |
Indexes:
"match_data_pkey1" PRIMARY KEY, btree (match_data_id)
"index_match_data_mid" btree (match_data_id)
"index_match_data_pid" btree (premise_id)
"index_match_data_tssearch_addr_str" gin (tssearch_addr_str)
"index_match_data_tssearch_name" gin (tssearch_name)
"index_match_data_tssearch_street" gin (tssearch_street)
"index_match_data_usrn" btree (usrn)

KEY NOTE: nlpg.match_data has approximately 27,000,000 rows.

Running this query:

EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id <
1000000;

I get this:

"Index Scan using match_data_pkey1 on match_data (cost=0.00..1452207.14
rows=1913756 width=302) (actual time=23.448..61559.652 rows=999999 loops=1)"
" Index Cond: (match_data_id < 1000000)"
"Total runtime: 403855.675 ms"

I copied a chunk of the table like this:

CREATE TABLE nlpg.md_copy AS SELECT * FROM nlpg.match_data WHERE
match_data_id < 1000000;
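
(One caveat I should mention: as far as I understand it, CREATE TABLE ... AS copies only the data, not the indexes, so the copy has none of the index-maintenance overhead during an update. That may make the comparison below less fair than it looks.)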

Then I ran the same query on the smaller copy:

EXPLAIN ANALYZE UPDATE nlpg.md_copy SET org = org WHERE match_data_id <
1000000;

And got this:

"Seq Scan on md_copy (cost=0.00..96935.99 rows=999899 width=301) (actual
time=26.745..33944.923 rows=999999 loops=1)"
" Filter: (match_data_id < 1000000)"
"Total runtime: 57169.419 ms"

As you can see, this is much faster per row on the smaller copy: roughly 403,856 ms / 999,999 rows ≈ 0.40 ms per row on the full table, versus 57,169 ms / 999,999 rows ≈ 0.06 ms per row on the copy, so about seven times faster. I then tried running the same first query with ten times the number of rows:

EXPLAIN ANALYZE UPDATE nlpg.match_data SET org = org WHERE match_data_id <
10000000;

This takes a massive amount of time (it is still running as I write this), and the run time is clearly growing non-linearly compared with the previous query.

EXPLAIN UPDATE nlpg.match_data SET org = org WHERE match_data_id < 10000000;
"Seq Scan on match_data (cost=0.00..3980053.11 rows=19172782 width=302)"
" Filter: (match_data_id < 10000000)"

Any suggestions on what I can do to speed things up? I presume that if I turn off sequential scans the planner will fall back to the index scan. Is there anything else I can try?
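
In case it helps, here is how I was planning to test that idea, using the enable_seqscan planner setting for the current session only (as I understand it, this doesn't forbid sequential scans outright, it just makes the planner strongly prefer any alternative):

-- Discourage sequential scans for this session, then check the plan:
SET enable_seqscan = off;
EXPLAIN UPDATE nlpg.match_data SET org = org WHERE match_data_id < 10000000;
-- And restore the default afterwards:
RESET enable_seqscan;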

Cheers,
Tom
