Re: query performance

From: pepone(dot)onrez <pepone(dot)onrez(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: query performance
Date: 2008-01-14 04:59:17
Message-ID: 198501d60801132059g67e92564n5a103ff4c18da962@mail.gmail.com
Lists: pgsql-general

Thanks Alex

I tested your solution and it is really much faster.

Nested Loop  (cost=1743.31..2044.58 rows=50 width=908) (actual time=101.695..106.178 rows=50 loops=1)
  ->  HashAggregate  (cost=1743.31..1743.31 rows=50 width=108) (actual time=101.509..101.567 rows=50 loops=1)
        ->  Subquery Scan "IN_subquery"  (cost=1741.60..1743.19 rows=50 width=108) (actual time=101.327..101.456 rows=50 loops=1)
              ->  Limit  (cost=1741.60..1742.69 rows=50 width=108) (actual time=101.313..101.383 rows=50 loops=1)
                    ->  Seq Scan on overview  (cost=0.00..3283.07 rows=150807 width=108) (actual time=0.036..72.249 rows=80050 loops=1)
  ->  Index Scan using i_documentcontent_id on t_documentcontent  (cost=0.00..6.01 rows=1 width=908) (actual time=0.083..0.085 rows=1 loops=50)
        Index Cond: ((t_documentcontent._id)::text = ("outer"._id)::text)
Total runtime: 106.323 ms

I now need to see what triggers I need to add, and test the insertions.
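Presumably they will be something along these lines (an untested sketch: it
assumes the id-only table is called overview, as in the plan above, and it
ignores updates to _id):

-- Untested sketch: keep the narrow "overview" table in sync with
-- t_documentcontent on insert and delete.
CREATE OR REPLACE FUNCTION overview_sync() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO overview (_id) VALUES (NEW._id);
    ELSIF TG_OP = 'DELETE' THEN
        DELETE FROM overview WHERE _id = OLD._id;
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_documentcontent_sync_overview
    AFTER INSERT OR DELETE ON t_documentcontent
    FOR EACH ROW EXECUTE PROCEDURE overview_sync();
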
Thanks again

On Jan 14, 2008 5:54 AM, Alex Turner <armtuk(at)gmail(dot)com> wrote:

> Here is a table I threw together to demonstrate the approximate speed of a
> materialized view in this case:
>
> trend=# explain analyze select property_id from overview order by
> property_id limit 50 offset 50000;
>                                                            QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=19112.75..19112.88 rows=50 width=8) (actual time=446.048..446.125 rows=50 loops=1)
>    ->  Sort  (cost=18987.75..19400.49 rows=165094 width=8) (actual time=384.788..424.433 rows=50050 loops=1)
>          Sort Key: property_id
>          ->  Seq Scan on overview  (cost=0.00..2501.94 rows=165094 width=8) (actual time=0.012..88.691 rows=173409 loops=1)
>  Total runtime: 447.578 ms
> (5 rows)
>
> trend=# select count(*) from overview;
> count
> --------
> 173409
> (1 row)
>
> trend=#
>
> It's not great - but it's better than 47 seconds (the machine I'm running
> it on is far from big iron, so these results should be fairly typical for
> any modern x86 box; also, this materialized view is almost certainly in RAM,
> and therefore IO speed is irrelevant).
>
> Tom Lane has already suggested another approach, whereby you order your
> results, then select the next page from the set where the id is greater
> than the last id of the previous page:
>
> select id from overview order by id limit 50;
>
> x = get row['id'] // for row 1
> do something
> x=get row['id'] // for row 2
> do something
> ...
> x=get row['id'] // for row 50
>
> select id from overview where id > x order by id limit 50;
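>
> In SQL terms (an illustrative sketch - 12345 simply stands in for whatever
> the last id of the previous page turned out to be):
>
> -- page 1: scan forward from the start
> select id from overview order by id limit 50;
> -- suppose the last id it returned was 12345; page 2 is then:
> select id from overview where id > 12345 order by id limit 50;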
>
> The order by is relevant and in fact imperative, because you must order
> your results somehow; otherwise your pagination will produce different
> results each time you try it, since database updates will affect the order
> rows come back in when there is no order by clause.
>
> Let me say that again to be clear: the order rows come back in when you
> don't specify an order by can change! So pulling rows without an order by
> is a REALLY bad idea. This will break your pagination if a database update
> happens between someone viewing a page and hitting next to view the next
> page.
>
> Alex
>
>
> On Jan 13, 2008 11:43 PM, Alex Turner <armtuk(at)gmail(dot)com> wrote:
>
> > If you have to access the data this way (with no where clause at all -
> > which sometimes you do) then I have already provided a solution that will
> > work reasonably well. If you create what is essentially a materialized view
> > of just the id field, the sequence scan will return far fewer pages than
> > when you do it on the main table. Then you join it to the indexed main
> > table, and page in just the rows you need. Voila - much faster result. Of
> > course, we haven't really talked about how that will affect insert and
> > delete speed if you trigger them up, but you haven't really talked about
> > any requirements there.
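> >
> > Concretely, the shape of it is something like this (a sketch only; the
> > table and column names are taken from the examples elsewhere in this
> > thread, and the index name is made up):
> >
> > -- narrow side table acting as a materialized view of just the ids
> > create table overview as select _id from t_documentcontent;
> > create index i_overview_id on overview (_id);
> >
> > -- page through the narrow table, then join back for the wide rows
> > select t_documentcontent.*
> > from t_documentcontent
> > where _id in (select _id from overview order by _id limit 50 offset 50000);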
> >
> > Alex
> >
> >
> > On Jan 13, 2008 11:27 PM, pepone. onrez <pepone(dot)onrez(at)gmail(dot)com> wrote:
> >
> > > Sorry Alex, I forgot to mention that I had seqscan off in my last test.
> > >
> > > Now I have set seqscan on and indexscan on, and added ORDER BY _id.
> > >
> > > The table has an index on the _id field.
> > >
> > > CREATE INDEX i_documentcontent_document
> > > ON t_documentcontent
> > > USING btree
> > > (_document);
> > >
> > > The database was recently VACUUM ANALYZEd, but not VACUUM FULLed.
> > >
> > > Here are the explain outputs for two different queries when I use a large OFFSET:
> > >
> > > EXPLAIN ANALYZE SELECT
> > > t_documentcontent._id AS _id
> > > FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 50000
> > >
> > > "Limit (cost=137068.24..137068.36 rows=50 width=58) (actual time=
> > > 41119.702..41119.792 rows=50 loops=1)"
> > > " -> Sort (cost=136943.24..137320.26 rows=150807 width=58) (actual
> > > time=41064.802..41100.424 rows=50050 loops=1)"
> > > " Sort Key: _id"
> > > " -> Seq Scan on t_documentcontent (cost= 0.00..110772.07rows=150807 width=58) (actual time=
> > > 106.679..33267.194 rows=150807 loops=1)"
> > > "Total runtime: 41120.015 ms"
> > >
> > > EXPLAIN ANALYZE SELECT
> > > t_documentcontent._id AS _id
> > > FROM t_documentcontent ORDER BY _id LIMIT 50 OFFSET 10000
> > >
> > > "Limit (cost=39839.37..40038.56 rows=50 width=58) (actual time=
> > > 1172.969..1194.228 rows=50 loops=1)"
> > > " -> Index Scan using i_documentcontent_id on t_documentcontent
> > > (cost=0.00..600805.54 rows=150807 width=58) (actual time=
> > > 0.077..1189.688 rows=10050 loops=1)"
> > > "Total runtime: 1194.316 ms"
> > >
> > > Tom,
> > > I'm using a uuid for the _id field, which is the primary key, so adding
> > > a WHERE id > ? doesn't apply.
> > > The cursor approach is also not suitable for some of my queries.
> > >
> > > I use this query to paginate the contents of a filesystem with lots of
> > > documents; avoiding OFFSET is not always possible.
> >
> >
> >
>
