Re: Dreaming About Redesigning SQL

From: Lauri Pietarinen <lauri(dot)pietarinen(at)atbusiness(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dreaming About Redesigning SQL
Date: 2003-10-26 23:06:27
Message-ID: bnhk4n$i3t$1@nyytiset.pp.htv.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Anthony W. Youngman wrote:

>In article <bn72o3$as$1(at)nyytiset(dot)pp(dot)htv(dot)fi>, Lauri Pietarinen <lauri.pie
>tarinen(at)atbusiness(dot)com> writes
>
>
>>Anthony W. Youngman wrote:
>>
>>
>>
>>>In article <bn4cca$dj0$1(at)nyytiset(dot)pp(dot)htv(dot)fi>, Lauri Pietarinen
>>><lauri(dot)pietarinen(at)atbusiness(dot)com> writes
>>>
>>>
>>>
>>>
>>>>Anthony W. Youngman wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>Well, if it is normalised, how easy is it for you to change the
>>>>customer_id of an order? Anyway,
>>>>
>>>>
>>>>
>>>>
>>>Incredibly easy. Just update the "customer_id" field of the invoice
>>>record. A single change to a single "row"
>>>
>>>
>>>
>>And I presume the system will automatically move all related stuff
>>(order details etc.) into
>>the same block as the new customer? How long will that take? What if
>>there is no room for it there?
>>
>>
>
>Well, I'd view an order as an entity. As such, I would give it its own
>FILE, and your question doesn't make sense.
>
But then your formula for disk head movements does not make sense either!

>But if the system did move
>the stuff, it would be four disk accesses - read/write to delete the old
>entry, read/write to save the new. As for "enough room" - well - it'll
>fall over if we have a "disk full" (or it might not).
>
"Not enough room" here means not enought room in the block of the
customer (from which you
were supposed to get all data in one read, or disk head movement). That
would mean that your
order information would be moved perhaps to another block and result in
an extra head movement,
or am I right?

>>>
>>>
>>>If we're indexed on order
>>>detail. If Hammer appears in N invoices, then T = (1+N) * ST * 1.05 for
>>>hammers, and the same for all the other products.
>>>
>>>Theory favours us, in that if a product appears X times in one invoice,
>>>that's one read for us and X for you, but hardware will probably help
>>>you more than us (that is, assuming thrashing cuts in) in that you stand
>>>a marginally higher chance of getting multiple instances of a product in
>>>any given read.
>>>
>>>
>>>
>>So for each product you get T = (1+N) * ST * 1.05.
>>
>>Now, for our SQL-DBMS, presuming that we build indexes for detail and
>>product:
>>
>>order_detail(product_id, qty, unit_price) = 20 bytes/row
>>product(product_id, product_name) = 50 bytes/row
>>
>>With 2 disk reads I would get
>>8K/20 = 400 order detail rows and
>>8K/50 = 160 product rows
>>
>>Since all rows are in product_id order, no need for random disk reads so
>>T = 1 + N/400 + P/160 (N=number of details, P=number of products)
>>for ALL products and details.
>>
>>And, because of sequential prefetch, we probably would not have to wait
>>for I/O's at all.
>>
>>Really, however you calculate it, it is an order of magnitude less
>>than your alternative.
>>
>>And please don't tell me that using indexes is not fair or not in the
>>spirit of the
>>relational model ;-)
>>
>>
>
>Well, it does result in data being stored multiple times ;-)
>
What on earth is wrong with that? Do you know how much 160GB of disk
cost's today?
I could ask: does your system work in, say 4KB? That's how much memory
the first
computer I used (a Wang 2000) had. Probably it would not work at
all. In the 50's
they did amazing things with hardly any compilers and very little
memory. I am referring
to Whirlwind. See http://www.cedmagic.com/history/whirlwind-computer.html.
Could you have done that with MV? My point? Why are we discussing
restrictions
to memory and CPU speed of the 70's and 80's? If an SQL DBMS uses more
memory
and disk, and it is available, why complain about *that*. Im not
impying that you
cannot complain about other matters, e.g. ease of development etc. and
you might
even be right. Be it as it is, I am not trying to make you abandon
your MV database.

>And while it maybe doesn't affect the result that much, you wanted the
>value? Where has that come from?
>
From e.g.
select p.product_id, product_name, sum(qty*unit_price)
from product, order_detail od
where p.product_id = od.product_id
group by p.product_id, product_name

This is the SQL statement that will result in

1 + N/400 + P/160 disk reads
(if rows not found in cache)

>What if the price changed half way
>through the period you're calculating?
>
Which price? The price that has already been paid by customer?

> :-) You've failed to answer your
>own question, so maybe I could match you ...
>
How have I failed?

>>>>>>And: what if I was just reading customer-data. Would the same formula
>>>>>>apply (= (2+N)*ST*1.05)?
>>>>>>
>>>>>>
>>>>>>
>>>>>Nope. If I understand you correctly, you want attributes that belong to
>>>>>the entity "customer", not the entity "invoice". T = ST * 1.05. (By the
>>>>>way, billing and/or invoice address (for example) are invoice
>>>>>attributes, not company attributes.)
>>>>>
>>>>>
>>>>>
>>>>No, I want you to give me a list of all your customers. How many disk
>>>>reads?
>>>>
>>>>
>>>>
>>>T = N * 1.05 where N is the number of customers. What do you want to
>>>know about those customers? Address? Phone number*s*? Anything else?
>>>That's *all* at no extra cost.
>>>
>>>
>>>
>>Well, no thanks. I just wanted their names this time.
>>The relational alternative, with an index on customer_name, would be
>>again an order
>>of magnitune less disk reads.
>>
>>
>>
>Well, if you let me use an index here, I'm sorry, GAME OVER! The best
>you can do would be a photo finish.
>

>Assuming an overhead of, say, 4 bytes per index entry, the entire index
>would be
>
>Size = 4 * N + sigma(name_length) + sigma(key_length)
>
>Okay, I've probably got some padding there as well, but so will you. And
>note I didn't say N * field_length, I said sigma(name_length). ;-)
>
What is sigma(key_length)?

>I notice that at no point have you asked where this strange 1.05 keeps
>coming from. That's why I keep hammering performance ... okay, maybe
>I've lost this "order detail" but it's why I keep hammering my
>confidence in general.
>
Sorry, I lost you here?

>Given that your data is all ordered optimally for
>answering this "detail" request, what's it going to cost you in time or
>disk space to answer the request "please recreate invoice X for me"?
>
Well, first of all, the fact that I optimized the "detail" request does
not cost me anything regarding
the other queries. It will impose a slight cost on updates, however
that would be hardly noticable
execpt for mass updates (update batch jobs) that change the column value.

>
>MV stores data efficently - look at how little space the index took :-)
>
>It accesses data efficiently - that 1.05 is actually the "how many
>places do I need to look" value for the database to respond to a
>userland request, given a known primary key or index value. Okay - that
>means we push back at the programmer some of the management of data
>access, but why should that be solely the response of the dbms? If it
>makes sense for the app to do it, then it should ... why should the dbms
>have to guess at how to optimise a request if the app has all the
>necessary information at its fingertips?
>
1) Your database might change over time and say a table that originally
had only a few rows
could suddenty grow considerably. Now an optimiser would insulate you
from these changes
or in the worst case all that would need to be done would be to create
an index (and, yes, check
that the DBMS starts using it).

2) You might have a product that runs in a number of sites: large ones
and small
ones. Now you would not have to reoptimise the programs for each type site.

3) Complex SQL-queries do quite a lot of things and it might not be very
obvious for
the programmer how to optimise best.

4) depending on input from user (say, a search screen) the optimal
access path may be different. An optimiser
could generate a different path depending on this input.

>But surely, your requirement for grabbing data across multiple invoices
>is statistically unusual.
>
You mean the product department would not be interested in seeing how
their products have
been selling, and to whom?

>And I benefit just as much as you from any ram
>being available to cache :-) although I wouldn't benefit so much from
>prefetch. The probability is that consecutive requests for data are
>either "can I know something else about the entity I've just looked at",
>or "can I access another entity at random".
>
However, if you think of all data relating to a customer, that could
amount to, say, 300KB, if
he had a long history. Do you think it is a good idea to pull all that
into memory just in case
the user want's to see his history for all 10 recent years? And there
is lot's of different kinds
of information related to customers. Would the user want to see
everything? Isn't it more
probable that a spesific user want's a certain *view* of the customer?

>In the former case, if you've stored it in another table, it's another
>request from the app to the dbms. With MV, it all came in the first
>request. In the latter case, this is where my 1.05 factor cuts in - bear
>in mind even for a simple btree file, this factor is only 1 for a
>1-level root only file - it goes up to 1.5 when the root bucket splits
>and keeps rising from there :-)
>
>So as an engineer, here I am appealing to stats :-) But this is the real
>world, and no stats? no real world! Because we have no optimiser, it
>encourages the programmer to optimise - I've heard various people say
>that if you want a SQL-using app to run fast you mustn't use views -
>forcing the programmer to interest themselves in the db in a manner that
>relational says they shouldn't.
>
You are correct. But those result mostly from irregularities of SQL (so
I have been told). They result
from the fact that SQL does *not* follow the theory, but instead was
changed to be more "practical".

The story of views is not quite so simple. In some cases views are very
useful and can be used safely.
In other situations you might get ugly surprises.

Take the view

create view california_customers as
select cust_id, cust_name
from customer
where state = 'CA'

The query
select cust_id, from cust_name
from california_customers
where cust_name like 'A%'

Will be transformed (under the covers) into
select cust_id, cust_name
from customer
where state = 'CA' and
cust_name like 'A%'

But in other cases the view will first be materialised into
a temporary table and the rest of the query would be
evaluated on that temp table.

Understanding when a DBMS knows how to do what
is not simple, and, hence, you are quite correct in
your observation.

> We're not interested in being able to
>improve the speed at which the db can find data to respond to an app
>request - with an access factor of 1.05 (actually, it's nearer 1.02 or
>1.03) we consider any effort there to be a waste of time ...
>
But isn't it better to have NO disk reads than one? I thought disk I/O
was rather expensive? With
that mentality you will always be disk bound.

>Basically, the only way you can beat us in the real world is to throw
>hardware at the problem - and like I said with linux and macro/micro
>kernels, we can do the same :-)
>
Well, please do!

>>>>>>>But as I understand relational theory, such a question is completely
>>>>>>>outside the scope of the theory. Seeing as it tries to divorce the
>>>>>>>database logic from the practical implementation ...
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>The theory, indeed, does not say anything about buffer pools, but by
>>>>>>
>>>>>>
>>>>decoupling
>>>>
>>>>
>>>>>>logic
>>>>>>
>>>>>>
>>>>>>from implementation we leave the implementor (DBMS) to do as it feels fit
>>>>>
>>>>>
>>to
>>
>>
>>>>do.
>>>>
>>>>
>>>>
>>>>>>As DBMS technology advances, we get faster systems without having to
>>>>>>
>>>>>>
>>change
>>
>>
>>>>our
>>>>
>>>>
>>>>>>programs.
>>>>>>
>>>>>>
>
>Can you improve on what I've just done? Is any improvement POSSIBLE?
>
The whole trick is to have as much stuff in memory as possible, and
preferably the *right* stuff. Even
if we have a small memory the most used rows will remain in memory and
hence minimise the need
for disk I/O. And writing to disk is nearly always asyncronous so it
will not affect response time.
"The only good I/O is a dead I/O" ;-)

>
>
>>>>>But with MV, if our database is too large for current technology, we
>>>>>kick the shit out of relational for speed ...
>>>>>
>>>>>
>>>>>
>>What is "too large"?
>>
>>
>>
>>>>>Don't forget. You've already said that, if nothing is cached, my average
>>>>>case exceeds your best. And my case is *already* assuming that the
>>>>>system is seriously stressed and struggling ...
>>>>>
>>>>>
>>>>>
>>It does?
>>
>>
>
>Yes. I'll only be in trouble if I'm so short of ram that my working set
>gets forced into swap ...
>
What if your customer has accumulated, over the years, say 1000 orders?
Would you want to pollute
your cache with all those orders? Note that this is a problem that you
will first accounter after the
system has been running for quite a long time. In MV, what would you
do in a situation like this?

>>>>>>When we design databases we can decouple logical planning from performance
>>>>>>considerations, which, you must agree, are two separate issues.
>>>>>>
>>>>>>
>>>>>>
>>>Yes. BUT what's the point of having a database that is logically
>>>perfect, and who's performance is slow to the point of being unusable?
>>>
>>>Don't forget - in practice MultiValue ends up with a database that is
>>>*inherently* optimised such that it almost invariably outperforms an
>>>equivalent SQL database, AND we don't normally have DBAs to help us
>>>achieve that nirvana ...
>>>
>>>
>>>
>>Frankly, it may well be that PICK systems run faster and cheaper than
>>relational ones, but certainly
>>not for the reasons you state.
>>
>>
>>
>Well, could you optimise that index any more?
>
Which index?

>>>>>I can't find the post now :-( but is Christopher reading this? You know
>>>>>I compared that relational system on a twin Xeon 800, to an MV system
>>>>>running on a P90? Christopher made the (reasonable in the circumstances)
>>>>>assumption that the relational consultants must be crap, and the MV guy
>>>>>a guru. Actually, I'd come to exactly the OPPOSITE conclusion. My MV
>>>>>experience tells me that MV query was probably thrown together, by an
>>>>>average programmer, in 30 seconds. On the other hand, those SQL
>>>>>consultants had an axe to grind and a point to prove. They couldn't
>>>>>afford to let this "old fashioned" system beat them. That SQL query
>>>>>would have been optimised to within an inch of its life over weeks.
>>>>>Don't forget how proud they were to beat this MV system! Yet with
>>>>>hardware that was so much more powerful and a query that was heavily
>>>>>optimised, they had great difficulty beating a query that was thrown
>>>>>together in seconds by an average MV guy (or even just a luser!).
>>>>>
>>>>>Don't forget. I said I am a database *engineer*. Engineers believe in
>>>>>elegance, they believe in beauty. And when I look at relational, all I
>>>>>see is the theorists pleading "power", "hardware", "brute force", to get
>>>>>them out of trouble.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>You said that logical planning and performance are separate issues. And
>>>I wouldn't expect you to address the above example in a discussion of
>>>relational, because performance is irrelevant to relational.
>>>
>>>
>>>
>>I would have to know a lot more details to address it properly.
>>Performance is irrelevant to the model.
>>It's like E=mc**2. Nice theory and it actually works. But to get
>>performance out of it
>>(=exploding bomb) you have to solve lots of practical details. However,
>>without the theory
>>you could experiment for a milloin years without being able to build an
>>atom bomb.
>>
>>
>>
>>>But surely, the fact that I am SUPREMELY CONFIDENT that I can get
>>>superior performance from inferior hardware should give you pause for
>>>thought that maybe, just maybe, the relational model is flawed from an
>>>engineer's or scientist's viewpoint?
>>>
>>>
>>>
>>That's OK with me. But the most you can claim is that todays
>>IMPLEMENTATIONS are flawed,
>>and you would be 100% correct. How would you go and prove that the model
>>is flawed?
>>You should prove that a relational DBMS could not POSSIBLY be efficient.
>>
>>
>
>Well, if the relational people insist on divorcing theory from
>implementation, it's hard to see how they can prove it is efficient.
>While that is exactly what I'm trying to prove for MV. Whether
>relational is efficient or not is irrelevant, if I can prove MV is
>efficient and you can't prove the same for relational.
>
Well, I know of a lot of large banks, insurance companies etc... that
are using SQL DBMS'es
and I think they are running just fine. Amazon uses Oracle. Would you
say that their performance
is adequate? And I have first hand witnessed and built lot's of fast
systems that use SQL DBMS'es.

>If that results in running SQL over MV then we've won, I think :-) We
>can do that already ...
>
Not really, because it's the SQL that is the "relational" part (well,
it's not purely relational). So
the funny thing is that, what ever lies below the surface (=whatever
engine we are using) relational
get's all the credit!! Unfair, isn't it? As long as it LOOKS like its
relational to the user, it does not really matter
what happens under the hood.

>>>From the mathematician's (or logician's) viewpoint I agree it's
>>
>>
>>>flawless. But that's true of plenty of broken scientific theories...
>>>
>>>
>>>
>>Could you give me some other examples?
>>
>>
>
>Euclidean Geometry - just look at the equatorial event horizon of a
>black hole.
>Newtons laws of motion - just look at Mercury's orbit.
>Quantum mechanics - just look at a black hole.
>Relativity - just look at quantum mechanics :-) or Schrodinger's cat.
>
>Actually, it's probably true of pretty much all of theoretical physics
>since the start of last century ... in each case the only thing wrong
>with the theory is that reality just doesn't happen to agree ...
>
Are you suggesting that Newtons theories are totally useless and irrelevant?

kindest regards,
Lauri Pietarinen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Lauri Pietarinen 2003-10-26 23:11:49 Re: Dreaming About Redesigning SQL
Previous Message Anthony W. Youngman 2003-10-26 23:04:02 Re: Dreaming About Redesigning SQL