Quick Links

Re: Attn PG gurus / coders : New approach for ORDER BY ? (was: Re: Show stopper ?)

From:	The Hermit Hacker <scrappy(at)hub(dot)org>
To:	Jan Vicherek <honza(at)ied(dot)com>
Cc:	Dustin Sallings <dustin(at)spy(dot)net>, pgsql-questions(at)postgreSQL(dot)org, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: Attn PG gurus / coders : New approach for ORDER BY ? (was: Re: Show stopper ?)
Date:	1998-01-24 04:28:27
Message-ID:	Pine.BSF.3.96.980123235442.28536k-100000@thelab.hub.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, 23 Jan 1998, Jan Vicherek wrote:

> an SQL interface (doing all the optimizing, joining and the complicated
> stuff) and
> a low(er)-level C/C++/whatever interface which would allow to use less
> of the intelligence of PostgreSQL and more of it's retireval system,
> including indexing.

*scratch head* indexing != sorting...indexing doesn't produce
sorted data...indexing provides a means to very quickly retrieve
*segments* of data from a large table of data...

> > First off, *nothing* you are going to be able to accomplish in
> > *any* SQL engine is going to prevent locking the table against
> > updates...
>
> Sorry, I wasn't precise again : I mean I can't stop other people from
> updating a table just because one guy is out there "browsing" through it.

The new code that Bruce put in *should* provide for better
schedualing of the locks on a table. Bruce can explain it alot better
then me, but the way I understand it, before, if someone had a READ lock
on the table and someone came in looking for a WRITE lock, they got held
out. If, while the READ lock was still in effect, someone else came in to
READ the table, they got in, effectively pushing the WRITE lock further
down the queue. With Bruce's patches, that WRITE lock will prevent
another READ from grabbing a "lock" until after the WRITE finishes. As I
haven't had a chance to look at it yet, I'm curious as to how often
someone sees a deadlock with the new code.

I'm running an radiusd accounting system on v6.2.1 that has one
table in it that is:

postgres(at)zeus> ls -lt rad*
-rw------- 1 postgres wheel 2678784 Jan 23 23:01 radlog_uniq_id
-rw------- 1 postgres wheel 1564672 Jan 23 23:01 radlog_stop
-rw------- 1 postgres wheel 2031616 Jan 23 23:01 radlog_start
-rw------- 1 postgres wheel 6406144 Jan 23 23:01 radlog
-rw------- 1 postgres wheel 1572864 Jan 23 23:01 radlog_userid
-rw------- 1 postgres wheel 79519744 Jan 23 22:51 radhist
-rw------- 1 postgres wheel 16818176 Jan 23 22:51 radhist_userid
-rw------- 1 postgres wheel 16793600 Jan 23 22:51 radhist_uniq_id
-rw------- 1 postgres wheel 11927552 Jan 23 22:51 radhist_stop
-rw------- 1 postgres wheel 11919360 Jan 23 22:51 radhist_start

Those are just two tables in it. The only problem that I've yet
to have with it, as far as "deadlocks" are concerned (deadlocks mean that
one lock is in place while a write lock is trying to happen) is during a
vacuum...and that's running on an old 486 with one hard drive and *just
upgraded to* 64Meg of RAM.

Oh...and I'd say a good 75% of the activities on those above
tables is multiple simultaneous updates...and the only thing that disrupts
that is the vacuum process...

Vadim...what is that humongous database you are running? And what
are you running it on?

> > > I *still* think that there *has to* be a way to find a value that is
> > > immediatelly next to one I have. This seems like such a primitive
> > > operation. Even the backend must be doing it on some operations, it would
> > > seem.
> >
> > Not possible...
>
> Remeber that I have an index on the field that I'm looking for the
> "next" value.
> Is it not possible in such case ?

No, as the index isn't "ordered"...there is no logical next/prev
record in an index

> Even by calling some lower-level C function to give me the next value ?

there are no lower-level C functions...for that, you'd be looking
at something like GDBM...

> Index doesn't "sort". Index gets sorted. But pseudo-index doesn't even
> get sorted, I would have to sort it "manually" after every INSERT I would
> make.

Index gets sorted by...?

> I've just tried this on half-the-size table, and it took 70 seconds :-(
>
> meray=> begin;
> BEGIN
> meray=> declare mycursor cursor for select * from master order by mpartno;
> SELECT
> meray=> fetch forward 1 in mycursor;
> close mycursor;
> end;
> mpartno |mlistprc|mwarcomp|mcost|msupercd|mbasecd|mflag|mpackqty|mdescrip
> -------------------+--------+--------+-----+--------+-------+-----+--------+----------------
> F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG
> (1 row)
>
> meray=> close mycursor;
> CLOSE
> meray=> end;
> END
> meray=> \d masterind
>
> Table = masterind
> +----------------------------------+----------------------------------+-------+
> | Field | Type | Length|
> +----------------------------------+----------------------------------+-------+
> | mpartno | (bp)char | 19 |
> +----------------------------------+----------------------------------+-------+
>
> Since the ORDER BY field (mpartno) has an index on it, why wouldn't
> Postgres just take that index and go with it one-by-one ?! Such approach

An index isn't sorted. An index is only used with a WHERE clause
(Bruce, I'm not misrepresenting that, am I?). A 'select * from table'
will only do a Sequential Scan through the table for the data it needs.

acctng=> select count(start) from radhist;
count
------
563258
(1 row)

acctng=> explain select * from radhist;
NOTICE:QUERY PLAN:

Seq Scan on radhist (cost=27022.38 size=537648 width=72)

EXPLAIN
acctng=> explain select * from radhist where userid = 'scrappy';
NOTICE:QUERY PLAN:

Index Scan on radhist (cost=2.05 size=1 width=72)

EXPLAIN
acctng=>

Damn, I wasn't thinking. Okay...an index *is* sorted, in a sense.
You index on *one* field of a table (with one exception that I know of,
but I'm not getting into that here). The index only makes use of that
if/when you do a WHERE clause on that field.

acctng=> explain select * from radhist where port = 1;
NOTICE:QUERY PLAN:

Seq Scan on radhist (cost=27022.38 size=1 width=72)

EXPLAIN

Notice above it does a Seq Scan? I don't have an index on port,
since I'll never do a WHERE "search" on it.

> wouldn't take 70 seconds on idle P200+ 64MB, but would take 1/2 a second.

How are you starting up postmaster? Take a look at the man page
for postgres, which has an option:

-S Specifies the amount of memory to be used by internal
sorts before using disk files for sorting. This
value is specified in 1k bytes, and defaults to 512.

By default, very little memory is "reserved" for sorting (ORDER
BY)...try increasing that and see if your performance improves any. My
accounting server has it set for 10240 (10Meg), since it too had 64Meg of
RAM and nobody expect as an accounting/RDBMS server. I could probably
increase it past that, not sure...

> Is such approach pricipally/fundamentally wrong, or is this correct
> approach, and postgres just needs some more code written for this thing to
> work ? (I.e. would other things break if this was implemented ?)

See my above explanation...Bruce or Vadim could probably explain
it better, as they are more solid in the internals themselves, but you are
misunderstanding what an index is and what it is used for :(

> -- whereas single select (which uses index) takes half a second.
> meray=> select * from master where mpartno = 'F 0 AVZK ';
> mpartno |mlistprc|mwarcomp|mcost|msupercd|mbasecd|mflag|mpackqty|mdescrip
> -------------------+--------+--------+-----+--------+-------+-----+--------+----------------
> F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG
> F 0 AVZK | 581| 418| 311|S | | 6| 0|SPARK PLUG
> (2 rows)
> meray=>

Assuming you have an index built on mpartno, then yes, this would
use the index because (and only because) of the WHERE clause. Try using
'explain' at the beginning of your select statements...it will explain
*what* it is going to do, but not actually do it.

> Now I want do see the row that is in the masterind after the first, in
> this case it contains the same data.
>
> If we can get "this new approach" to work,

You can't, unless there is some logical "greater/less then" for
mpartno, such that you could do something like (syntax will be wrong, its
the concept that i'm showing):

begin;
declare cursor mycursor for select where mpartno >= 'F 0 AVZK '
ORDER by mpartno;
fetch forward 1 from mycursor;
close mycursor;
end;

That would speed up the search (should?) because now you are finding, in
the index, all mpartno's that come after the one your stipulate, and then
you are grabbing the first one off the stack...I don't guarantee it will
be faster, because it also depends on how much data is returned.

Just thought of it...after you load this table, you did do a 'vacuum
analyze' on it, right? Just to get the stats up dated?

> > Basically, take your table, move to the $forward record in it,
> > grab the next $retrieve records and then release the table.
>
> But this creates 70-second lock on the table every time I want to see
> next record :-(

Most of that 70-seconds, I am guess, was spent in the ORDER
by...try increasing -S to as much as you dare and see if that helps any...

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org

In response to

Attn PG gurus / coders : New approach for ORDER BY ? (was: Re: Show stopper ?) at 1998-01-24 03:41:50 from Jan Vicherek

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	The Hermit Hacker	1998-01-24 04:32:25	Re: [HACKERS] Show stopper ? (was: Re: "cruising" or "browsing" through tables using an index / ordering)
Previous Message	Jan Vicherek	1998-01-24 04:26:31	Re: Attn PG gurus / coders : New approach for ORDER BY ?