Fwd: Joins and links

From: Leon <leon(at)udmnet(dot)ru>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Fwd: Joins and links
Date: 1999-07-05 15:24:52
Message-ID: 6850.990705@udmnet.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello hackers!

I posted this message to general mailing list, and was told
that hackers' list is more appropriate place to post this
message to. What will you say about it?

This is a forwarded message
From: Leon <leon(at)udmnet(dot)ru>
To: pgsql-general <pgsql-general(at)postgreSQL(dot)org>
Date: Monday, July 05, 1999, 5:46:36 PM
Subject: Joins and links

===8<==============Original message text===============
Hello all!

You probably remember me - recently I complained about speed
of joins in Postgres. After a short investigation the way was
found in which Postgres's optimizer can do the right job. It
was constructive discussion. Now I want to tell you what could
make Postgres better and faster. And what will make us (our
development group) happy. Maybe I am bothering someone, if
I do - tell me that.

Let me begin.

First of all, some accounting reports need to be delivered
very fast - within minutes or so. And what's bad is that
quite a few of these reports are quite time-consuming and search
intensive. In particular, internals of these reports include
a lot of joins on tables.

Secondly, almost all of accounting information naturally
fits into network data model, which can be implemented very
efficiently.

This stuff described here is not accounting-specific, it
can be found in every database which uses master-detail
tables and other such types of relations.

So. How is join being performed in such cases? Although I am
not an expert, I can imagine the way: first there is an (index)
scan on first table, and then an (index) scan on the second.
It is the best way, reality could be much worse as we have seen.

How can we radically improve performance in such cases? There
is a simple and quite obvious way. (For you not to think that
I am hallucinating I will tell you that there exist some
real servers that offer such features I am talking about)
We should make a real reference in one table to another! That
means there could be special data type called, say, "link",
which is a physical record number in the foreign table.

Queries could look like this:

table1:
a int4
b link (->table2)

table2:
c int4
recnum (system auxiliary field, really a record number in the table)

select * from table2 where table1.a > 5 and table1.b = table2.recnum

Such joins can fly really fast, as practice shows :)
Just consider: the thing table1.b = table2.recnum is a READY-MADE
join, so server doesn't have to build anything on top of that. It
can simply perform lookup through link, and since it is a physical
record number, this is done with the efficiency of C pointers! Thus
performance gain is ENORMOUS.

And it simplifies the optimizer, because it doesn't have to decide
anything about whether to use indices and such like. The join is
performed always the same way, and it is the best way.

This feature, being implemented, could bring Postgres ahead
of most commercial servers, so proving creative abilities of
free software community. Let us make a step in the future!

Best regards,
Leon

===8<===========End of original message text===========

Best regards,
Leon mailto:leon(at)udmnet(dot)ru

Browse pgsql-hackers by date

  From Date Subject
Next Message Maarten Boekhold 1999-07-05 15:37:56 Re: [GENERAL] Joins and links
Previous Message Bruce Momjian 1999-07-05 15:10:06 Re: [GENERAL] Joins and links