Re: Performance

From:	Claudio Freire <klaussfreire(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Tomas Vondra <tv(at)fuzzy(dot)cz>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Performance
Date:	2011-04-27 21:01:46
Message-ID:	BANLkTimkNdOzx19uCkqrQaUQ4wEz+vSqnA@mail.gmail.com
Lists:	pgsql-performance

On Wed, Apr 27, 2011 at 10:27 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> What if the user is using an SSD or ramdisk?
>
> Admittedly, in many cases, we could probably get somewhat useful
> numbers this way. But I think it would be pretty expensive.
> gettimeofday() is one of the reasons why running EXPLAIN ANALYZE on a
> query is significantly slower than just running it normally. I bet if
> we put such calls around every read() and write(), it would cause a
> BIG slowdown for workloads that don't fit in shared_buffers.

I've just been reading an ACM article about something closely related
to that.

The article was about cache-conscious scheduling. Mostly memory cache,
but disk cache isn't that different. There is a lot of work, real,
serious work, on characterizing cache contention, and the article
showed how a simplified version of the cache reuse profile model
behaves under various workloads.

The simplified model used only cache miss rates, and it performed
even better than the more complex model - they went on to analyze
why.

Long story short, there is indeed a lot of literature on the
subject, and there are a lot of formal and experimental results. One
of those models has to be embodied in a patch, and tested - that's
about it.

The patch may be simple, the testing not so much. I know that.

What tools do we have to do that testing? There are lots, and all
of them imply a lot of work. Is that work worth the trouble? Because
if it is... why not do it?

I would propose a step in the right direction: a patch to compute and
log periodic estimates of the main I/O tunables: random_page_cost,
seq_page_cost and effective_cache_size. Maybe per-tablespace.
Evaluate the performance impact, and work from there.
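
To make the idea concrete, here is a rough sketch of what one such
periodic estimate could look like. It is only an illustration in
Python, not a patch: the ReadSample record, its sequential and
cache_hit flags, and the estimate_tunables() function are all
hypothetical names, and it assumes we could somehow collect timed
page reads in the first place. It derives random_page_cost as the
ratio of mean random to mean sequential read time (relative to
seq_page_cost), and reports a cache hit ratio, which is at best a
crude proxy and not effective_cache_size itself.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Optional


@dataclass
class ReadSample:
    """One timed page read (hypothetical record, not a PostgreSQL structure)."""
    sequential: bool  # True if the page directly followed the previous one read
    cache_hit: bool   # True if the read was served from some cache (assumed known)
    seconds: float    # elapsed wall-clock time for the read


def estimate_tunables(samples: Iterable[ReadSample],
                      seq_page_cost: float = 1.0) -> Optional[dict]:
    """Estimate random_page_cost relative to seq_page_cost from timed reads.

    Only cache misses are used for the cost ratio, so that cached reads
    do not drag both averages toward zero. Returns None when there is
    not enough data for either kind of read.
    """
    samples = list(samples)
    seq = [s.seconds for s in samples if s.sequential and not s.cache_hit]
    rnd = [s.seconds for s in samples if not s.sequential and not s.cache_hit]
    if not seq or not rnd:
        return None
    hits = sum(1 for s in samples if s.cache_hit)
    return {
        "random_page_cost": seq_page_cost * mean(rnd) / mean(seq),
        "cache_hit_ratio": hits / len(samples),
    }
```

Logging the returned dictionary once per interval would be enough to
see how stable the numbers are before anyone thinks about feeding
them to the optimizer.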

Simply using those values as input to the optimizer probably won't
work, for several reasons: DBAs will want a way to tune the
optimizer, the system may not be stable enough, and even with
accurate estimates for those values, the optimizer may not perform as
expected. I mean, right now those values are tunables, not real
metrics, so perhaps the optimizer won't respond well to real values.

But having the ability to measure them without a serious performance
impact is a step in the right direction, right?

In response to

Re: Performance at 2011-04-27 20:27:48 from Robert Haas

Responses

Re: Performance at 2011-04-29 23:03:23 from Robert Haas
