Re: Planning without reason.

From: Tzahi Fadida <Tzahi(dot)ML(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Planning without reason.
Date: 2006-06-23 14:12:14
Message-ID: 200606231712.14671.Tzahi.ML@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday 23 June 2006 16:14, Martijn van Oosterhout wrote:
> On Fri, Jun 23, 2006 at 03:57:19PM +0300, Tzahi Fadida wrote:
> > R contains indices but not on all attributes or not on
> > all ordered subset of keys.
> >
> > Query example:
> > (SELECT * FROM R
> > WHERE a=3, b=6,. ...)
> > UNION
> > (SELECT * FROM R
> > WHERE b=5, d=2,. ...)
> > UNION
> > ....
> > And lots of unions.
>
> Do you need UNION, or do you actually mean UNION ALL?

I am using UNION ALL, but it doesn't matter
i am actually doing:
(SELECT * FROM R
WHERE a=3, b=6,. ... LIMIT 1)
UNION ALL
(SELECT * FROM R
WHERE b=5, d=2,. ... LIMIT 1)
UNION ALL
....

with LIMIT 1.
My initial reasoning was to avoid extra sorts but i guess that the planner
just doesn't get the LIMIT 1. I see now that UNION should be better
for the planner to undestand (not performance wise).
However, UNION alone, doesn't seem to cut it.
Following is an example. t7 has 2 attributes and a non-unique index on one
attribute. here is a printout:
explain analyze (select * from t7 where a4=113 LIMIT 1) UNION (select * from
t7 where a2=139 LIMIT 1);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=23.18..23.19 rows=2 width=8) (actual time=0.149..0.165 rows=1
loops=1)
-> Sort (cost=23.18..23.18 rows=2 width=8) (actual time=0.142..0.148
rows=2 loops=1)
Sort Key: a4, a2
-> Append (cost=0.00..23.17 rows=2 width=8) (actual
time=0.052..0.106 rows=2 loops=1)
-> Limit (cost=0.00..5.65 rows=1 width=8) (actual
time=0.046..0.049 rows=1 loops=1)
-> Index Scan using indext7 on t7 (cost=0.00..5.65
rows=1 width=8) (actual time=0.038..0.038 rows=1 loops=1)
Index Cond: (a4 = 113)
-> Limit (cost=0.00..17.50 rows=1 width=8) (actual
time=0.035..0.038 rows=1 loops=1)
-> Seq Scan on t7 (cost=0.00..17.50 rows=1 width=8)
(actual time=0.029..0.029 rows=1 loops=1)
Filter: (a2 = 139)
Total runtime: 0.334 ms
(11 rows)

>
> Also, couldn't you just do:
>
> SELECT * FROM R
> WHERE (a=3, b=6, ...)
> OR (b=5, d=2, ...)
> etc

No, a filtering action is not enough since my goal is to only use indices
when retrieving single tuples each time thus, if i will use OR i cannot
control the number of tuples returned by each Or clause.

>
> > I am currently just writing the query as a string and open a cursor.
> > Is there a simple way to use Datums instead of converting the attributes
> > to strings to create a plan for SPI.
> > 10x.
>
> I imagine SPI_prepare() and SPI_execp() would be used for this.

I am already using SPI_prepare but it uses a query of the form of a char
string, which i need to prepare and is quite long. I.e. if i have 100 tuples
i wish to retrieve it can be very wasteful to prepare the string in memory
and use SPI_prepare to prepare and later execute it.
better to use directly the datums (which i already have deformed from
previous operations).

>
> Have a nice day,

--
Regards,
        Tzahi.
--
Tzahi Fadida
Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info
WARNING TO SPAMMERS:  see at
http://members.lycos.co.uk/my2nis/spamwarning.html

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Page 2006-06-23 14:14:45 Re: [CORE] GPL Source and Copyright Questions
Previous Message Jonah H. Harris 2006-06-23 14:10:55 Re: Full Disjunction