Re: performance problem aftrer update from 7.1 to 7.4.2

From: "Development - multi(dot)art(dot)studio" <dev(dot)null(at)multiartstudio(dot)com>
To: "Development - multi(dot)art(dot)studio" <dev(dot)null(at)multiartstudio(dot)com>
Cc: Richard Huxton <dev(at)archonet(dot)com>, pgsql-general(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: performance problem aftrer update from 7.1 to 7.4.2
Date: 2004-04-22 00:44:10
Message-ID: 4087155A.7070204@multiartstudio.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

hello again,

i did some mistakes, site-id was wrong in my simple select querys.
(interesting that no matches would take double time on 7.4)

here are the 'right' results, showing to me postgres 7.4 is slightly
slower with simple querys (but 1000-times faster with more complex
querys ;-)

with production 7.1:

mcms09=> select count(id) from newsletter where aktiv=1 and
site_id='m200384-000';
count
-------
845
(1 row)

---some users where working on it and added two newsletter since last
dump....

mcms09=> explain select * from newsletter where aktiv=1 and
site_id='m200384-000' order by id desc,date desc;
NOTICE: QUERY PLAN:
Sort (cost=123.78..123.78 rows=841 width=84)
-> Seq Scan on newsletter (cost=0.00..82.95 rows=841 width=84)
EXPLAIN

and 7.4 test:

mcms=# select count(id) from newsletter where aktiv=1 and
site_id='m200384-000';
count
-------
843
(1 row)

mcms=# explain select * from newsletter where aktiv=1 and
site_id='m200384-000' order by id desc,date desc;
QUERY PLAN
---------------------------------------------------------------------------
Sort (cost=124.81..126.91 rows=841 width=598)
Sort Key: id, date
-> Seq Scan on newsletter (cost=0.00..83.95 rows=841 width=598)
Filter: ((aktiv = 1) AND ((site_id)::text = 'm200384-000'::text))
(4 rows)

so in 7.4 seq-scan is about '1' (ms or what?) slower than 7.1

really thats not a very big table, my tests will go on!
if any idea, please let me know.

thanks in advance.
volker

Development - multi.art.studio wrote:

> hello,
>
>>Richard Huxton wrote:
>>What I suggest:
>>1. Compare the two postgresql.conf files and any other config settings and
>>make sure you know what differences there are and why.
>>2. Identify what queries seem to be the cause of the problem, and pick one you
>>think is a good example.
>>3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
>>4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
>>want to make sure the data is cached, run it three times and use the last
>>one.
>>While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
>>another terminal.
>>5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
>>7.4, same as before, verify that the same results are returned. Same as
>>before for the caching.
>>Trace using vmstat for this too.
>
>
>
> thanks for the great hints, last days i was trying and trying......
> restoring databases using a long time.
> but i was surprised on the results.
> ok, im not finished because i can stop the production dbs only late in
> the night after telling some people...but here are some 'heavy' results
>
> i dumped out my database with pgdump from 7.4 as before , one dump
> with structure only, one with inserts
> after that, doing a 'vacuum full' on 7.4 and stopping new 7.4 database.
>
> i created a test-query
> (it will show the entries in table newsletter_send (historical datas)
> with links to valid newsletter and addresses always using a specific
> site-id ordered by send-date, address-id and internal-id
>
> on old production 7.1:
>
> Query:
>
> mcms09=> select * from newsletter_send where site_id='m200384-000' and
> newsletter_id in (select id from newsletter where aktiv=1 and
> site_id='m200384-000') and newsletter_adr_id in (select id from
> newsletter_address where site_id='m200384-000' and aktiv=1) and sent
> is not null order by sent desc,newsletter_adr_id desc,id desc;
> Cancel request sent
> ERROR: Query was cancelled.
>
> ------takes too long time, about 3 minutes without displaying a result ;-)
>
> Explain:
>
> mcms09=> explain select * from newsletter_send where
> site_id='m200384-000' and newsletter_id in (select id from newsletter
> where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in
> (select id from newsletter_address where site_id='m200384-000' and
> aktiv=1) and sent is not null order by sent desc,newsletter_adr_id
> desc,id desc;
> NOTICE: QUERY PLAN:
>
> Sort (cost=4224132.30..4224132.30 rows=29063 width=54)
> -> Index Scan using newsletter_sent_site_id_idx on newsletter_send
> (cost=0.00..4221402.52 rows=29063 width=54)
> SubPlan
> -> Seq Scan on newsletter_address (cost=0.00..13.61
> rows=133 width=4)
> -> Seq Scan on newsletter (cost=0.00..82.95 rows=841 width=4)
> EXPLAIN
>
> -------i do this querys about three or five times to be sure its cached
>
> now i started 7.4 again and test it with fresh 7.4.2 restored database:
>
> Query-Explain:
>
> mcms=# explain select * from newsletter_send where
> site_id='m200384-000' and newsletter_id in (select id from newsletter
> where aktiv=1 and site_id='m200384-000') and newsletter_adr_id in
> (select id from newsletter_address where site_id='m200384-000' and
> aktiv=1) and sent is not null order by sent desc,newsletter_adr_id
> desc,id desc;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------
> Sort (cost=6205.83..6275.90 rows=28026 width=76)
> Sort Key: newsletter_send.sent, newsletter_send.newsletter_adr_id,
> newsletter_send.id
> -> Hash IN Join (cost=100.00..3409.13 rows=28026 width=76)
> Hash Cond: ("outer".newsletter_id = "inner".id)
> -> Hash IN Join (cost=13.94..2887.87 rows=30991 width=76)
> Hash Cond: ("outer".newsletter_adr_id = "inner".id)
> -> Seq Scan on newsletter_send (cost=0.00..2128.29
> rows=87145 width=76)
> Filter: (((site_id)::text = 'm200384-000'::text)
> AND (sent IS NOT NULL))
> -> Hash (cost=13.61..13.61 rows=133 width=4)
> -> Seq Scan on newsletter_address
> (cost=0.00..13.61 rows=133 width=4)
> Filter: (((site_id)::text =
> 'm200384-000'::text) AND (aktiv = 1))
> -> Hash (cost=83.95..83.95 rows=841 width=4)
> -> Seq Scan on newsletter (cost=0.00..83.95 rows=841
> width=4)
> Filter: ((aktiv = 1) AND ((site_id)::text =
> 'm200384-000'::text))
> (14 rows)
>
>
> ....Heavy! its about 1000-times faster i think, and now postgres
> speeds up like a rocket.
> i dont know why this happens......, but i did another query, because
> my idea is there are problems in sorting the table in a very simple
> query....hmmmm, if im not totally wrong ;-) tell me *g
> ________________________________
>
> ok here are some more tests
>
> Query 7.1:
> mcms09=> explain select * from newsletter where aktiv=1 and
> site_id='m200384_000' order by id desc,date desc;
> NOTICE: QUERY PLAN:
> Sort (cost=9.29..9.29 rows=8 width=84)
> -> Index Scan using newsletter_site_id_date_idx on newsletter
> (cost=0.00..9.17 rows=8 width=84)
> EXPLAIN
>
> Query 7.4.2:
> mcms=# explain select * from newsletter where aktiv=1 and
> site_id='m200384_000' order by id desc,date desc;
> QUERY PLAN
> ---------------------------------------------------------------------------------------------------
> Sort (cost=18.67..18.70 rows=11 width=598)
> Sort Key: id, date
> -> Index Scan using site_id_newsletter_key on newsletter
> (cost=0.00..18.48 rows=11 width=598)
> Index Cond: ((site_id)::text = 'm200384_000'::text)
> Filter: (aktiv = 1)
> (5 rows)
>
> now it costs double time to query the table.
> Explain show me it sorts the table up by id and date, after that it
> scans and uses filter.......is this the order postgres works? maybe
> sorting the table without filters before querying it costs a lot of
> performance, only result should be sorted.
> hmmm, i will start it with higher debuglevel and try to get out.
>
>>Post the query SQL and both EXPLAIN ANALYSE outputs along with table
>>definitions and row counts for each table involved and we can see exactly
>>where the problem is. If you think this is too much info, you can post it on
>>the web instead, that's fine.
>>If you're using the vmstat info, you could post that too, otherwise just keep
>>it safe in case it's needed.
>
>
>
>
>thanks in advance, i updated http://www.erdtrabant.de/index.php?p=&l=de&i=60500
>
>yours sincerely
>volker
>
>
>
>
> Richard Huxton wrote:
>
>>On Saturday 17 April 2004 13:23, Development - multi.art.studio wrote:
>>
>>
>>>hello everyone,
>>>
>>>Richard Huxton wrote:
>>>
>>>
>>>>It's not using the date index because it's using the id index - there's
>>>>only 10 matches, so that looks like a good choice to me. It takes less
>>>>than 1ms, so I'm not sure this is a good example of a problem.
>>>>
>>>>
>>>thanks all !
>>>but this doesnt help me a lot, i dont know what what to do? im not an
>>>expert. ok, i could use another site_id and it will grow up and gives 500
>>>or more rows back from the select clause. this takes a lot of more time.
>>>would this help?
>>>
>>>
>>
>>If that is the problem, that's what you'll need to post.
>>
>>
>>
>>>for me it looks as pg742 is not as performant as 7.1 ? and takes a lot of
>>>more time, and doesnt look for the correct index? but why this? why does it
>>>take longer than in 7.1? any ideas? what can i do?
>>>
>>>
>>
>>It's not something people are generally seeing. In most cases performance is
>>the same or slightly better. For some queries it can be a lot better.
>>
>>What I suggest:
>>1. Compare the two postgresql.conf files and any other config settings and
>>make sure you know what differences there are and why.
>>2. Identify what queries seem to be the cause of the problem, and pick one you
>>think is a good example.
>>3. VACCUM FULL and ANALYZE both 7.1 and 7.4 databases.
>>4. Stop the 7.4 server and run EXPLAIN ANALYSE (from psql) for 7.1. If you
>>want to make sure the data is cached, run it three times and use the last
>>one.
>>While you're running the explain, execute "vmstat 1 > vmstat.trace-7.1.txt" in
>>another terminal.
>>5. Stop the 7.1 server, restart 7.4 and run EXPLAIN ANALYSE (from psql) for
>>7.4, same as before, verify that the same results are returned. Same as
>>before for the caching.
>>Trace using vmstat for this too.
>>
>>Post the query SQL and both EXPLAIN ANALYSE outputs along with table
>>definitions and row counts for each table involved and we can see exactly
>>where the problem is. If you think this is too much info, you can post it on
>>the web instead, that's fine.
>>If you're using the vmstat info, you could post that too, otherwise just keep
>>it safe in case it's needed.
>>
>>
>>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Marc G. Fournier 2004-04-22 00:55:08 Re: [OT] Tom's/Marc's spam filters?
Previous Message Uwe C. Schroeder 2004-04-22 00:16:37 Re: ident authentication problem