Re: Access to current database from C-language function

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Achim Domma <domma(at)procoders(dot)net>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Access to current database from C-language function
Date: 2011-08-01 12:01:29
Message-ID: BB534F1B-18E7-488A-8F22-C5871E36E9C9@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Aug1, 2011, at 13:23 , Achim Domma wrote:
> I have tables which store two integer IDs and a floating point rank. So the table MyTable might have these columns:
>
> EntityID -> int
> PropertyID -> int
> Rank -> float
>
> My algorithm needs to retrieve EntityID-Rank-Pairs for some given PropertyIDs. So I basically want to execute a "select EntityID, Rank from MyTable where PropertyID=123 oder by Rank desc". But I need to execute multiple of those statements and I don't want to load all the data into memory, but rather iterate over the results step by step, stopping at certain thresholds.
>
> My algorithm is somewhat nested and contains logic which I cannot express in SQL,

Are you aware that postgres supports recursive subselects via the SQL standard's WITH RECURSIVE notation? I've found that WITH RECURSIVE lets you express a lot of things nicely which are otherwise very difficult, or even impossible, to express in SQL. Whether or not WITH RECURSIVE helps depends on whether your algorithm can be stated as a kind of closure process - i.e., a process where you continue to compute new rows from existing ones until no more new rows can found.

> but retrieving those pairs is the most basic operation I would need.
> Are cursors and option too?

It certainly seems so. Rows are computed and returned on demand when fetched through a cursor - opening the cursor does *not*, in general, load the whole result set into memory. There are statements where parts of the result are materialized before the first row is returned, though - statements which require a sorting step, for example. But in your case, an index on (PropertyID, Rank DESC) should avoid the need for explicit sorting, and instead allow the executor to read the rows in the desired output order. Thus, a cursor for your SQL statements should have a small startup cost and an equally small (and constant) cost per row afterwards.

In fact, for your SQL statement, any hand-crafted code that you might come up with will end up being very similar to what the executor does if you simply us a cursor.

> Is there a limitation for the number of open cursors? One call might open 100 cursors or so.

I don't know of any hard limit, and a couple of 100 cursors doesn't sounds unreasonable. I suggest you simply try it out.

best regards,
Florian Pflug

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-08-01 12:12:42 Re: lazy vxid locks, v3
Previous Message Simon Riggs 2011-08-01 11:35:26 Re: Hot standby and GiST page splits (was Re: WIP: Fast GiST index build)