Re: Scalability in postgres

From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, david(at)lang(dot)hm, Scott Carey <scott(at)richrelevance(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Dimitri <dimitrik(dot)fr(at)gmail(dot)com>, Flavio Henrique Araque Gurgel <flavio(at)4linux(dot)com(dot)br>, Fabrix <fabrixio1(at)gmail(dot)com>, James Mansion <james(at)mansionfamily(dot)plus(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Scalability in postgres
Date: 2009-06-05 05:56:08
Message-ID: 4A28B378.90006@mark.mielke.cc
Lists: pgsql-performance
Greg Smith wrote:
> This thread reminds me of Jignesh's "Proposal of tunable fix for 
> scalability of 8.4" thread from March, except with only a fraction of 
> the real-world detail.  There are multiple high-profile locks causing 
> scalability concerns at quadruple digit high user counts in the 
> PostgreSQL code base, finding them is easy.  Shoot, I know exactly 
> where a couple are, and I didn't have to think about it at all--just 
> talked with Jignesh a couple of times, led me right to them.  Fixing 
> them without causing regressions in low client count cases, now that's 
> the hard part.  No amount of theoretical discussion advances that any 
> until you're at least staring at a very specific locking problem 
> you've already characterized extensively via profiling.  And even 
> then, profiling trumps theory every time.  This is why I stay out of 
> these discussions and work on boring benchmark tools instead.

I disagree that profiling trumps theory every time. Profiling is useful 
for identifying places where the existing architecture exhibits the best 
and worst behaviour. It doesn't tell you whether a different 
architecture (even a slightly different architecture) would work better 
or worse. It might help identify architecture problems. It does not 
provide you with architectural solutions.

I think it would be more correct to say that prototyping trumps theory. 
That is, if somebody has a theory, and they invest time into a 
proof-of-concept patch, and post actual results to show you that "by 
changing this code over here to that, I get an N% improvement when using 
thousands of connections, at no measurable cost for the single 
connection case", these results will be far more compelling than theory.

Still, it has to involve theory, as not everybody has the time to run 
off and prototype every wild idea. Discussion can determine whether an 
idea has enough merit to be worth investing in a prototype.

I think several valuable theories have been discussed, many of which 
directly apply to the domain that PostgreSQL fits within. The question 
isn't about how valuable these theories are - they ARE valuable. The 
question is how much support from the team can be gathered to bring 
about change, and how willing the team is to accept or invest in 
architectural changes that might take PostgreSQL to the next level. The 
real problem here is the words "invest" and "might". That is, people are 
not going to invest in a "might" - people need to be convinced, and for 
people who don't have a problem today, the motivation to make the 
investment is far weaker.

In my case, all I have to offer you is theory at this time. I don't have 
the time to work on PostgreSQL, and I have not invested the time to 
learn the internals of PostgreSQL well enough to comfortably and 
effectively make changes to implement a theory I might have. I want to 
get there - but there are so many other projects and ideas to pursue, 
and I only have a few hours a day to divide among them.

You can tell me "sorry, your contribution of theory isn't welcome". In 
fact, that looks like exactly what you have done. :-)

If the general community agrees with you, I'll stop my contributions of 
theories. :-)

I think, though, that some of the PostgreSQL architecture is "old 
theory". I have this silly idea that PostgreSQL could one day be better 
than Oracle (in terms of features and performance - PostgreSQL already 
beats Oracle on cost :-) ). It won't get there without some significant 
changes. In just the last few years, I have watched some pretty major 
changes go into PostgreSQL that significantly improved its 
performance and feature set. Many of these changes might 
have started with profiling - but the real change came from applied 
theory, not from profiling. Bitmap index scans are an example of this. 
Profiling tells you what - that large joins involving OR are slow? It 
takes theory to answer "why" and "so, what do we do about it?"

Cheers,
mark

-- 
Mark Mielke <mark(at)mielke(dot)cc>

