H800 + md1200 Performance problem

From: Cesar Martin <cmartinp(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: H800 + md1200 Performance problem
Date: 2012-04-03 12:20:47
Message-ID: CAMAsR=4acKwYjow-VWQ0xH0+WjXEtVJWxP0A0y4vx=oyr5fhpQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello there,

I am having performance problem with new DELL server. Actually I have this
two servers

Server A (old - production)
-----------------
2xCPU Six-Core AMD Opteron 2439 SE
64GB RAM
Raid controller Perc6 512MB cache NV
- 2 HD 146GB SAS 15Krpm RAID1 (SO Centos 5.4 y pg_xlog) (XFS no barriers)
- 6 HD 300GB SAS 15Krpm RAID10 (DB Postgres 8.3.9) (XFS no barriers)

Server B (new)
------------------
2xCPU 16 Core AMD Opteron 6282 SE
64GB RAM
Raid controller H700 1GB cache NV
- 2HD 74GB SAS 15Krpm RAID1 stripe 16k (SO Centos 6.2)
- 4HD 146GB SAS 15Krpm RAID10 stripe 16k XFS (pg_xlog) (ext4 bs 4096, no
barriers)
Raid controller H800 1GB cache nv
- MD1200 12HD 300GB SAS 15Krpm RAID10 stripe 256k (DB Postgres 8.3.18)
(ext4 bs 4096, stride 64, stripe-width 384, no barriers)

Postgres DB is the same in both servers. This DB has 170GB size with some
tables partitioned by date with a trigger. In both shared_buffers,
checkpoint_segments... settings are similar because RAM is similar.

I supposed that, new server had to be faster than old, because have more
disk in RAID10 and two RAID controllers with more cache memory, but really
I'm not obtaining the expected results

For example this query:

EXPLAIN ANALYZE SELECT c.id AS c__id, c.fk_news_id AS c__fk_news_id,
c.fk_news_group_id AS c__fk_news_group_id, c.fk_company_id AS
c__fk_company_id, c.import_date AS c__import_date, c.highlight AS
c__highlight, c.status AS c__status, c.ord AS c__ord, c.news_date AS
c__news_date, c.fk_media_id AS c__fk_media_id, c.title AS c__title,
c.search_title_idx AS c__search_title_idx, c.stored AS c__stored, c.tono AS
c__tono, c.media_type AS c__media_type, c.fk_editions_news_id AS
c__fk_editions_news_id, c.dossier_selected AS c__dossier_selected,
c.update_stats AS c__update_stats, c.url_news AS c__url_news, c.url_image
AS c__url_image, m.id AS m__id, m.name AS m__name, m.media_type AS
m__media_type, m.media_code AS m__media_code, m.fk_data_source_id AS
m__fk_data_source_id, m.language_iso AS m__language_iso, m.country_iso AS
m__country_iso, m.region_iso AS m__region_iso, m.subregion_iso AS
m__subregion_iso, m.media_code_temp AS m__media_code_temp, m.url AS m__url,
m.current_rank AS m__current_rank, m.typologyid AS m__typologyid,
m.fk_platform_id AS m__fk_platform_id, m.page_views_per_day AS
m__page_views_per_day, m.audience AS m__audience, m.last_stats_update AS
m__last_stats_update, n.id AS n__id, n.fk_media_id AS n__fk_media_id,
n.fk_news_media_id AS n__fk_news_media_id, n.fk_data_source_id AS
n__fk_data_source_id, n.news_code AS n__news_code, n.title AS n__title,
n.searchfull_idx AS n__searchfull_idx, n.news_date AS n__news_date,
n.economical_value AS n__economical_value, n.audience AS n__audience,
n.media_type AS n__media_type, n.url_news AS n__url_news, n.url_news_old AS
n__url_news_old, n.url_image AS n__url_image, n.typologyid AS
n__typologyid, n.author AS n__author, n.fk_platform_id AS
n__fk_platform_id, n2.id AS n2__id, n2.name AS n2__name, n3.id AS n3__id,
n3.name AS n3__name, f.id AS f__id, f.name AS f__name, n4.id AS n4__id,
n4.opentext AS n4__opentext, i.id AS i__id, i.name AS i__name, i.ord AS
i__ord, i2.id AS i2__id, i2.name AS i2__name FROM company_news_internet c LEFT
JOIN media_internet m ON c.fk_media_id = m.id AND m.media_type = 4
LEFT JOINnews_internet n ON c.fk_news_id =
n.id AND n.media_type = 4 LEFT JOIN news_media_internet n2 ON
n.fk_news_media_id = n2.id AND n2.media_type = 4 LEFT
JOINnews_group_internet n3 ON c.fk_news_group_id =
n3.id AND n3.media_type = 4 LEFT JOIN feed_internet f ON n3.fk_feed_id =
f.id LEFT JOIN news_text_internet n4 ON c.fk_news_id = n4.fk_news_id AND
n4.media_type = 4 LEFT JOIN internet_typology i ON n.typologyid = i.id LEFT
JOIN internet_media_platform i2 ON n.fk_platform_id = i2.id
WHERE(c.fk_company_id = '16073' AND c.status <> '-3' AND n3.fk_feed_id
= '30693'
AND n3.status = '1' AND f.fk_company_id = '16073') AND n.typologyid IN
('6', '7', '1', '2', '3', '5', '4') AND c.id > '49764393' AND c.news_date
>= '2012-04-02'::timestamp - INTERVAL '4 months' AND n.news_date >=
'2012-04-02'::timestamp - INTERVAL '4 months' AND c.fk_news_group_id IN
('43475') AND (c.media_type = 4) ORDER BY c.news_date DESC, c.id DESC LIMIT
200

Takes about 20 second in server A but in new server B takes 150 seconds...
In EXPLAIN I have noticed that sequential scan on table
news_internet_201112 takes 2s:
-> Seq Scan on news_internet_201112 n (cost=0.00..119749.12
rows=1406528 width=535) (actual time=0.046..2186.379 rows=1844831 loops=1)
Filter: ((news_date >= '2011-12-02 00:00:00'::timestamp without
time zone) AND (media_type = 4) AND (typologyid = ANY
('{6,7,1,2,3,5,4}'::integer[])))

While in Server B, takes 11s:
-> Seq Scan on news_internet_201112 n (cost=0.00..119520.12
rows=1405093 width=482) (actual time=0.177..11783.621 rows=1844831 loops=1)
Filter: ((news_date >= '2011-12-02 00:00:00'::timestamp without
time zone) AND (media_type = 4) AND (typologyid = ANY
('{6,7,1,2,3,5,4}'::integer[])))

Is notorious that, while in server A, execution time vary only few second
when I execute the same query repeated times, in server B, execution time
fluctuates between 30 and 150 second despite the server dont have any
client.

In other example, when I query entire table, running twice the same query:
Server 1
------------
EXPLAIN ANALYZE SELECT * from company_news_internet_201111 ;
QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on company_news_internet_201111 (cost=0.00..457010.37
rows=6731337 width=318) (actual time=0.042..19665.155 rows=6731337 loops=1)
Total runtime: 20391.555 ms
-
EXPLAIN ANALYZE SELECT * from company_news_internet_201111 ;
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on company_news_internet_201111 (cost=0.00..457010.37
rows=6731337 width=318) (actual time=0.012..2171.181 rows=6731337 loops=1)
Total runtime: 2831.028 ms

Server 2
------------
EXPLAIN ANALYZE SELECT * from company_news_internet_201111 ;
QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on company_news_internet_201111 (cost=0.00..369577.79
rows=6765779 width=323) (actual time=0.110..10010.443 rows=6765779 loops=1)
Total runtime: 11552.818 ms
-
EXPLAIN ANALYZE SELECT * from company_news_internet_201111 ;
QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on company_news_internet_201111 (cost=0.00..369577.79
rows=6765779 width=323) (actual time=0.023..8173.801 rows=6765779 loops=1)
Total runtime: 12939.717 ms

It seems that Server B don cache the table¿?¿?

I'm lost, I had tested different file systems, like XFS, stripe sizes...
but I not have had results

Any ideas that could be happen?

Thanks a lot!!

--
César Martín Pérez
cmartinp(at)gmail(dot)com

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mike DelNegro 2012-04-03 12:37:42 Re: H800 + md1200 Performance problem
Previous Message Jeff Janes 2012-04-02 16:51:39 Re: TCP Overhead on Local Loopback