Re: Anyone working on better transaction locking?

From: cbbrowne(at)cbbrowne(dot)com
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Anyone working on better transaction locking?
Date: 2003-04-12 15:00:51
Message-ID: 20030412150051.A7DC559A20@cbbrowne.com
Lists: pgsql-hackers

Scott Marlowe wrote:
> On Wed, 9 Apr 2003, Ron Peacetree wrote:
>
> > "Andrew Sullivan" <andrew(at)libertyrms(dot)info> wrote in message
> > news:20030409170926(dot)GH2255(at)libertyrms(dot)info(dot)(dot)(dot)
> > > On Wed, Apr 09, 2003 at 05:41:06AM +0000, Ron Peacetree wrote:
> > > Nonsense. You explicitly made the MVCC comparison with Oracle, and
> > > are asking for a "better" locking mechanism without providing any
> > > evidence that PostgreSQL's is bad.
> > >
> > Just because someone else's is "better" does not mean PostgreSQL's is
> > "bad", and I've never said such. As I've said, I'll get back to Tom
> > and the list on this.
>
> But you didn't identify HOW it was better. I think that's the point
> being made.

Oh, but he presented such detailed statistics to prove his case, didn't you
see it? :-)

> > > > Please see my posts with regards to ...
> > >
> > > I think your other posts were similar to the one which started this
> > > thread: full of mighty big pronouncements which turned out to depend
> > > on a bunch of not-so-tenable assumptions.
> > >
> > Hmmm. Well, I don't think of algorithm analysis by the likes of
> > Knuth, Sedgewick, Gonnet, and Baeza-Yates as being "not so tenable
> > assumptions", but YMMV. As for "mighty pronouncements", that also
> > seems a bit misleading since we are talking about quantifiable
> > programming and computer science issues, not unquantifiable things
> > like politics.
>
> But the real truth is revealed when the rubber hits the pavement.
> Remember that Linus Torvalds was roundly criticized for his choice of a
> monolithic development model for his kernel, and was literally told that
> his choice would restrict it to "toy" status and that no commercial OS could
> scale with a monolithic kernel.

Indeed. I have the books from all of the above (when I studied databases
under Gonnet, Baeza-Yates was his TA...). And I have seen enough cases of the
conglomeration of multiple algorithms not behaving the way a blind read of
their books might suggest to refuse to blindly assume that things are so
simple.

In the /real/ world, the dictates of flushing buffers to help ensure
robustness can combine with having enough memory to virtually eliminate read
I/O to substantially change the results from some simplistic O(f(n)) analysis.

Which is NOT to say that computational complexity is unimportant; what it
indicates is that theoretical results are merely theoretical, and may
represent only a small part of what happens in practice. The nonsense about radix
sorts was a wonderful example; it would likely only be useful with PostgreSQL
if you had some fantastical amount of memory that might not actually be able
to be constructed within the confines of our solar system.
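A rough back-of-envelope sketch of that memory point. It assumes (my reading, not something stated in the thread) that the proposal amounted to a single-pass distribution with one counter per possible key value; the function names are hypothetical and only illustrate the arithmetic.

```python
# Back-of-envelope arithmetic only; not a sort implementation.
# Assumption: "radix sort" is read as a single-pass distribution
# that keeps one counter per possible key value.

def single_pass_buckets(key_bits: int) -> int:
    """Buckets needed when every distinct key value gets its own bucket."""
    return 2 ** key_bits

def table_bytes(key_bits: int, counter_bytes: int = 8) -> int:
    """Size of the counter table for a single-pass distribution."""
    return single_pass_buckets(key_bits) * counter_bytes

def passes_needed(key_bits: int, digit_bits: int) -> int:
    """Passes for a multi-pass variant that handles digit_bits per pass."""
    return -(-key_bits // digit_bits)  # ceiling division

if __name__ == "__main__":
    for bits in (16, 32, 64):
        print(bits, "bit keys:", table_bytes(bits), "bytes of counters")
    print("64-bit keys, 8-bit digits:", passes_needed(64, 8), "passes")
```

A multi-pass variant with small digits keeps the table tiny (256 counters for 8-bit digits) but pays for it with extra passes over the data, which is exactly the kind of cost that a bare O(f(n)) figure hides once I/O is involved.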

> There's no shortage of people with good ideas, just people with the skills
> to implement those good ideas. If you've got a patch to apply that's been
> tested to show something is faster EVERYONE here wants to see it.
>
> If you've got a theory, no matter how well backed up by academic research,
> it's still just a theory. Until someone writes the code to implement it,
> the gains are theoretical, and many things that MIGHT help don't because
> of the real world issues underlying your database, like I/O bandwidth or
> CPU <-> memory bandwidth.

An unfortunate thing (to my mind) is that *genuinely novel* operating system
research has pretty much disappeared. All we see, these days, are rehashes of
VMS, MVS, and Unix, along with some reimplementations of P-Code under monikers
like "JVM", ".NET" or "Parrot."

There's good reason for it; if you build something that differs from Unix by
much more than 5%, then you'll be left with the *enormous* projects
of creating completely new infrastructure for compilers, data persistence
("novel" would mean, to my mind, concepts different from files), program
editors, and such. But if it's 95% the same as Unix, then Emacs, GCC, CVS,
PostgreSQL, and all sorts of "tool chain" are available to you.

What is unfortunate is that it would be nice to try out some things that are
Very Different. Unfortunately, it might take five years of slogging through
recreating compilers and editors in order to get in about 6 months of "solid
novel work."

Of course, if you don't plan to lift your finger to help make any of it
happen, it's easy enough to "armchair quarterback" and suggest that someone
else do all sorts of would-be "neat things."

> > > I'm sorry to be so cranky about this, but I get tired of having to
> > > defend one of my employer's core technologies from accusations based
> > > on half-truths and "everybody knows" assumptions. For instance,
> > >
> > Again, "accusations" is a bit strong. I thought the discussion was
> > about the technical merits and costs of various features and various
> > ways to implement them, particularly when this product must compete
> > for installed base with other solutions. Being coldly realistic about
> > what a product's strengths and weaknesses are is, again, just good
> > business. Sun Tzu's comment about knowing the enemy and yourself
> > seems appropriate here...

> No, you're wrong. Postgresql doesn't have to compete. It doesn't have to
> win. It doesn't need a marketing department. All those things are nice,
> and I'm glad if it does them, but it doesn't HAVE TO. Postgresql has to
> work. It does that well.

Having a bit more of a "marketing department" might be a nice thing; it could
make it easier for people that would like to deploy PG to get the idea past
the higher-ups that have a hard time listening to things that *don't* come
from that department.

> > > > I'll mention thread support in passing,
> > >
> > > there's actually a FAQ item about thread support, because in the
> > > opinion of those who have looked at it, the cost is just not worth
> > > the benefit. If you have evidence to the contrary (specific
> > > evidence, please, for this application), and have already read all
> > > the previous discussion of the topic, perhaps people would be
> > > interested in opening that debate again (though I have my doubts).
> > >
> > Zeus had a performance ceiling roughly 3x that of Apache when Zeus
> > supported threading as well as pre-forking and Apache only supported
> > pre-forking. The Apache folks now support both. DB2, Oracle, and SQL
> > Server all use threads. Etc, etc.
>
> Yes, and if you configured your apache server to have 20 or 30 spare
> servers, in the real world, it was nearly neck and neck to Zeus, but since
> Zeus cost like $3,000 a copy, it is still cheaper to just overwhelm it
> with more servers running apache than to use zeus.

All quite entertaining. Andrew was perhaps trolling just a little bit there;
our resident "algorithm expert" was certainly easily sucked into leaping down
the path-too-much-trod. Just as with choices of sorting algorithms, it's easy
enough for there to be more to things than whatever the latest academic
propaganda about threading is.

The VITAL point to be made about threading is that there is a tradeoff, and
it's not the one that "armchair-quarterbacks-that-don't-write-code" likely
think of.

--> Hand #1: Implementing a threaded model would require a lot of work, and
the *ACTUAL* expected benefits are unknown.

--> Hand #2: So far, other *easier* optimizations have been providing
significant speedups, requiring much less effort.

At some point in time, "doing threading" might become the strategy expected
to reap the most rewards for the least amount of programmer effort. Until that
time, it's not worth worrying about it.

> > That's an awful lot of very bright programmers and some serious $$
> > voting that threads are worth it.
>
> For THAT application. For what a web server does, threads can be very
> useful, even useful enough to put up with the problems created by running
> threads on multiple threading libs on different OSes.
>
> Let me ask you, if Zeus scrams and crashes out, and it's installed
> properly so it just comes right back up, how much data can you lose?
>
> If Postgresql scrams and crashes out, how much data can you lose?

There's another possibility, namely that the "voting" may not have anything to
do with threading being "best." Instead, it may be a road to allow the
largest software houses, that can afford to have enough programmers that can
"do threading," to crush smaller competitors. After all, threading offers
daunting new opportunities for deadlocks, data overruns, and crashes; if only
those with the most, best thread programmers can compete, that discourages
others from even /trying/ to compete.
--
output = ("cbbrowne" "@ntlug.org")
http://www3.sympatico.ca/cbbrowne/sgml.html
"I visited a company that was doing programming in BASIC in Panama
City and I asked them if they resented that the BASIC keywords were in
English. The answer was: ``Do you resent that the keywords for
control of actions in music are in Italian?''" -- Kent M Pitman
