Attn PG gurus / coders : New approach for ORDER BY ? (was: Re: Show stopper ?)

From: Jan Vicherek <honza(at)ied(dot)com>
To: The Hermit Hacker <scrappy(at)hub(dot)org>
Cc: Dustin Sallings <dustin(at)spy(dot)net>, pgsql-questions(at)postgreSQL(dot)org, pgsql-hackers(at)postgreSQL(dot)org
Subject: Attn PG gurus / coders : New approach for ORDER BY ? (was: Re: Show stopper ?)
Date: 1998-01-24 03:41:50
Message-ID: Pine.LNX.3.96.980123213848.497I-100000@ann.ied.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Fri, 23 Jan 1998, The Hermit Hacker wrote:

> On Fri, 23 Jan 1998, Jan Vicherek wrote:
>
> > On Fri, 23 Jan 1998, The Hermit Hacker wrote:
> >
> > > I'm curious, but can the "Big (commercial) Boys" do this? If so,
> > > can you please provide an example of which and how?
> >
> > Hmm, well, the one we are switching from does this ;-) (Informix 3.3
> > ALL-II C interface). It's not SQL, tho.
>
> Okay, well, ummm...now you are comparing apples to oranges
> then...if you wanted a non-SQL engine to replace Informix, PostgreSQL
> isn't what you are looking for :(

Hmm, I thought that PostgreSQL would have two interfaces :

an SQL interface (doing all the optimizing, joining and the complicated
stuff) and
a low(er)-level C/C++/whatever interface which would allow to use less
of the intelligence of PostgreSQL and more of it's retireval system,
including indexing.

> > > all I've seen is ... and several examples on how to accomplish it using
> > > PostgreSQL, but that's it...
> >
> > Wait, have you seen here an example that accomplishes this which
> > wouldn't need the whole table copied and wouldn't lock the table against
> > updates ?
>
> First off, *nothing* you are going to be able to accomplish in
> *any* SQL engine is going to prevent locking the table against
> updates...

Sorry, I wasn't precise again : I mean I can't stop other people from
updating a table just because one guy is out there "browsing" through it.

> > > Why not put a sequence field on the table so that you can do:
> > > select * from table where rowid = n; -or-
> > > select * from table where rowid = n - 1; -or-
> > > select * from table where rowid = n + 1; -or-
> > > select * from table where rowid >= n and rowid <= n+x;
> > >
> > > And create the index on rowid?
> >
> > Because I also need to be able to INSERT rows. That would require
> > renumeration of half the table (remember, it's 40MB, 400,000 rows) every
> > time I do an INSERT.
>
> Okay, you are confusing INSERT then...INSERT in SQL just adds a
> row, that's it...it doesn't perform any "sorting" on it...that's what the
> ORDER BY command does...

The INSERT causes all indices on that table to be updated.
If I used a field in the table to serve as a pseudo-index (user-level
index, not system-level index), *then* that would require "manual"
renumeration of rowid field in half the table every time I do an INSERT.

> ...but, I now understand what *you* mean by INSERT...

good :)

> > I *still* think that there *has to* be a way to find a value that is
> > immediatelly next to one I have. This seems like such a primitive
> > operation. Even the backend must be doing it on some operations, it would
> > seem.
>
> Not possible...

Remeber that I have an index on the field that I'm looking for the
"next" value.
Is it not possible in such case ?
Even by calling some lower-level C function to give me the next value ?

> INSERT into a table doesn't "merge" the record
> between its lower/higher bounds...it just adds it to the very end of the
> table. And an index doesn't "sort" the data either...that is what the
> ORDER BY clause is for...

I guess I wasn't precise enough. I was aware of these two facts when I
was writing the previous msgs.
Index doesn't "sort". Index gets sorted. But pseudo-index doesn't even
get sorted, I would have to sort it "manually" after every INSERT I would
make.

> > Maybe even in SQL. Maybe something like (I'm not an SQL expert) : "SELECT
> > IndexField from MyTable where InxdexField > 'my_current_value' and
> > IndexField < ("all IndexFields that are bigger than the IndexField
> > searched for")
>
> From your sample above, is your first SQL call going to pull out
> all 40MB of data in one select statement, then your second 40MB minus the
> first X records?

Hmm, I could have hoped that the optimizer would recognize what I'm
trying to do and would only pull out that one record, but I guess not.
Never mind this attempt then.

> What you want to do is:
>
> begin
> declare mycursor cursor for select * from pg-user order by <somefield>;
> move $forward in FOO;
> fetch $retrieve in FOO;
> close foo;
> end;

(I guess you meant "close mycursor", not "close foo".)

I've just tried this on half-the-size table, and it took 70 seconds :-(

meray=> begin;
BEGIN
meray=> declare mycursor cursor for select * from master order by mpartno;
SELECT
meray=> fetch forward 1 in mycursor;
close mycursor;
end;
mpartno |mlistprc|mwarcomp|mcost|msupercd|mbasecd|mflag|mpackqty|mdescrip
-------------------+--------+--------+-----+--------+-------+-----+--------+----------------
F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG
(1 row)

meray=> close mycursor;
CLOSE
meray=> end;
END
meray=> \d masterind

Table = masterind
+----------------------------------+----------------------------------+-------+
| Field | Type | Length|
+----------------------------------+----------------------------------+-------+
| mpartno | (bp)char | 19 |
+----------------------------------+----------------------------------+-------+

Since the ORDER BY field (mpartno) has an index on it, why wouldn't
Postgres just take that index and go with it one-by-one ?! Such approach
wouldn't take 70 seconds on idle P200+ 64MB, but would take 1/2 a second.

Is such approach pricipally/fundamentally wrong, or is this correct
approach, and postgres just needs some more code written for this thing to
work ? (I.e. would other things break if this was implemented ?)

-- whereas single select (which uses index) takes half a second.
meray=> select * from master where mpartno = 'F 0 AVZK ';
mpartno |mlistprc|mwarcomp|mcost|msupercd|mbasecd|mflag|mpackqty|mdescrip
-------------------+--------+--------+-----+--------+-------+-----+--------+----------------
F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG
F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG
(2 rows)
meray=>

Now I want do see the row that is in the masterind after the first, in
this case it contains the same data.

If we can get "this new approach" to work,

> Basically, take your table, move to the $forward record in it,
> grab the next $retrieve records and then release the table.

But this creates 70-second lock on the table every time I want to see
next record :-(

> > Important : I'm not looking for a "pure SQL" solution. I'm writing a C
> > emulation library, so if it can be achieved via a call to a C Postgres
> > function, it would be great.
>
> You'd be better off looking at something like GDBM (which, by the
> way, also creates a lock against updates while another is reading the
> database)...unless I'm missing something, you aren't looking at doing
> anything that *requires* an SQL engine :(

This looks like a sad suggestion to me. I'll kick and scream just that
I wouldn't have to go with Velocis RAIMA's lower-level C interface. (I'm
just staring into its manual to figure out the sequence of C calls I would
have to make to accomplish this.)

Thanx to all,

Jan

-- Gospel of Jesus is the saving power of God for all who believe --
Jan Vicherek ## To some, nothing is impossible. ## www.ied.com/~honza
>>> Free Software Union President ... www.fslu.org <<<
Interactive Electronic Design Inc. -#- PGP: finger honza(at)ied(dot)com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 1998-01-24 03:50:30 Re: [HACKERS] Show stopper ? (was: Re: "cruising" or "browsing" through tables using an index / ordering)
Previous Message The Hermit Hacker 1998-01-24 01:32:03 Re: [HACKERS] Show stopper ? (was: Re: "cruising" or "browsing" through tables using an index / ordering)