
Re: spikes in pgbench read-only results

From:	Merlin Moncure <mmoncure(at)gmail(dot)com>
To:	Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: spikes in pgbench read-only results
Date:	2012-01-23 17:03:17
Message-ID:	CAHyXU0yzsu9ZjwNFdnLvoRzFF481gzBU+D0T4vfyvjij=LjhwA@mail.gmail.com
Lists:	pgsql-performance

2012/1/22 Tomas Vondra <tv(at)fuzzy(dot)cz>:
> Hi,
>
> I'm working on a benchmark that demonstrates the effects of moving
> tables or indexes to separate devices (SSD and HDD), and one thing that
> really caught my eye are spikes in the tps charts. See this:
>
> http://www.fuzzy.cz/tmp/data-indexes/indexes.html
>
> The first one is a database with data on an SSD and indexes on 7.2k HDD.
> Around 2:00, the performance significantly grows (over 4k tps) and then
> falls to about 500 tps (which is maintained for the remainder of the
> benchmark).
>
> I've seen similar spikes on HDD (both data and indexes on the same
> device) - that's the second chart. The difference is not that huge, but
> the spike at around 6:00 is noticeable.
>
> Interestingly, by separating the data and indexes to two 7.2k drives,
> the spike disappears - that's the third chart.
>
> Any ideas why this happens? Is this a pgbench-only anomaly that does not
> happen in real-world scenarios?
>
> My theory is that it's related to the strategy that chooses what to keep
> in shared_buffers (or page cache), and that somehow does not work too
> well in this case.

ISTM the spike comes from luck: for a while, all reads are served from
either the SSD or RAM. Neither the o/s nor pg is smart enough to
concentrate its buffering on the spinning disk, so you are going to
see some anomalies coming from the read patterns of the test -- you
are mainly measuring the share of iops that gets sent to data vs index.
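
A rough way to see why that share matters so much is a back-of-the-envelope
model, sketched below in Python. Everything in it is an assumption for
illustration: the function name, the latency figures (about 8 ms per random
HDD read, 0.1 ms per SSD read, 0.2 ms of CPU per transaction) and the
single-client setup are made up, not measured on the benchmark machine.

```python
def estimated_tps(hdd_miss_fraction, ssd_miss_fraction,
                  hdd_latency_s=0.008, ssd_latency_s=0.0001,
                  cpu_time_s=0.0002, clients=1):
    """Estimate read-only transactions per second for a simple lookup.

    hdd_miss_fraction: share of transactions that need one physical read
        from the spinning disk (e.g. an index page not in cache).
    ssd_miss_fraction: share of transactions that need one physical read
        from the SSD (e.g. a heap page not in cache).
    The latencies and CPU time are illustrative assumptions.
    """
    per_txn_s = (cpu_time_s
                 + hdd_miss_fraction * hdd_latency_s
                 + ssd_miss_fraction * ssd_latency_s)
    return clients / per_txn_s


if __name__ == "__main__":
    # Sweep the share of transactions that touch the HDD while every
    # transaction still reads one page from the SSD.
    for hdd_share in (0.0, 0.05, 0.25, 1.0):
        tps = estimated_tps(hdd_share, ssd_miss_fraction=1.0)
        print(f"HDD share {hdd_share:4.2f}: ~{tps:7.0f} tps")
```

In this toy model throughput is dominated by the HDD term as soon as even a
modest share of transactions has to wait on the spinning disk, so a phase in
which the HDD-resident pages happen to be cached would show up as a spike.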

I bet if the index and data were both moved to the ssd you'd see no
pronounced spike, just as is the case when both are on hdd.

For huge, high-traffic, mostly-read workloads that are not cost-bound
on storage, ssd is a no-brainer.

merlin

In response to

spikes in pgbench read-only results at 2012-01-22 23:17:18 from Tomas Vondra

Responses

Re: spikes in pgbench read-only results at 2012-01-23 17:39:14 from Tomas Vondra
