Re: Source Code Help Needed

From: Vikram Kalsi <vikramkalsi(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Source Code Help Needed
Date: 2005-06-03 07:25:11
Message-ID: ed5f0fd7050603002551a4b897@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom, Thanks a ton again, and, here's another problem that has me really
puzzled-

I'm starting with a fresh install of pgsql-8.0.1, and make 3 changes-

1.) src/include/nodes/relation.h, Add a new Variable, hutz_idx_benefit to
IndexOptInfo

typedef struct IndexOptInfo
{..............
/* Per IndexScan benefit, More than 1 indexscan maybe used for 1 tablescan
ex. w/ OR */
Cost hutz_idx_benefit;
..............} IndexOptInfo;

2.) src/backend/optimizer/path/costsize.c, cost_index(), assign value to
index->hutz_idx_benefit

run_cost += indexTotalCost - indexStartupCost;
index->hutz_idx_benefit = run_cost;
elog(NOTICE,"cost_index():index->indexoid=%u index->hutz_idx_benefit=%.2f",
index->indexoid, index->hutz_idx_benefit);

3.) src/backend/optimizer/path/orindxpath.c, best_or_subclause_indexes(),
Read the value(s) of index->indexoid and index->hutz_idx_benefit

/* Gather info for each OR subclause */
foreach(slist, subclauses)
{...................
infos = lappend(infos, best_indexinfo);
...................}

/* DEBUG */
ListCell *l;
int count=0;
foreach(l, infos)
{
IndexOptInfo *index = (IndexOptInfo *) lfirst(l);
elog(NOTICE,"best_or_subclause_indexes():infos c=%i: indexoid=%u
hutz_idx_benefit=%.2f", count, index->indexoid, index->hutz_idx_benefit);
count++;
}
...................
pathnode->indexinfo = infos; /* indexinfo' is a list of IndexOptInfo nodes,
one per scan to be performed */

So, basically I have added a new variable alongside indexoid which is the
run_cost of one of the index scans if there are multiple index scans such as
in the case of OR subclauses for 1 table.
Now, I do a complete build and run two queries with OR subclauses as
follows-

tpcd=# select s_suppkey from supplier where (s_suppkey>125 and
s_suppkey<128) or (s_suppkey>175 and s_suppkey<185) or (s_suppkey>200 and
s_suppkey<215);
NOTICE: cost_index():index->indexoid=186970 index->hutz_idx_benefit=2.02
NOTICE: cost_index():index->indexoid=186970 index->hutz_idx_benefit=2.06
NOTICE: cost_index():index->indexoid=186970 index->hutz_idx_benefit=2.09
NOTICE: best_or_subclause_indexes():infos c=0: indexoid=186970
hutz_idx_benefit=2.09
NOTICE: best_or_subclause_indexes():infos c=1: indexoid=186970
hutz_idx_benefit=2.09
NOTICE: best_or_subclause_indexes():infos c=2: indexoid=186970
hutz_idx_benefit=2.09

On the second occasion, I change the order of the OR subclauses...

tpcd=# select s_suppkey from supplier where (s_suppkey>200 and
s_suppkey<215) or (s_suppkey>175 and s_suppkey<185) or (s_suppkey>125 and
s_suppkey<128);
NOTICE: cost_index():index->indexoid=186970 index->hutz_idx_benefit=2.09
NOTICE: cost_index():index->indexoid=186970 index->hutz_idx_benefit=2.06
NOTICE: cost_index():index->indexoid=186970 index->hutz_idx_benefit=2.02
NOTICE: best_or_subclause_indexes():infos c=0: indexoid=186970
hutz_idx_benefit=2.02
NOTICE: best_or_subclause_indexes():infos c=1: indexoid=186970
hutz_idx_benefit=2.02
NOTICE: best_or_subclause_indexes():infos c=2: indexoid=186970
hutz_idx_benefit=2.02

From the output, it can be seen that when I try to read the value(s), the
last value is stored in all the positions of the List "infos" which is later
assigned to "(IndexPath) pathnode->indexinfo" which is a List of
"IndexOptInfo" nodes, one per scan to be performed. Actually, it seems all
the pointers in the List "indexinfo" or "infos" are pointing to the same
object.

So,
Ques 1) Is my assumption correct that IndexPath->indexinfo should contain
all distinct IndexOptInfo structs with one for each of the scans to be
performed? If not, then why do we have multiple pointers to the same object?

(Ques 2) How can this be fixed? Is this a bug or something else?

(Ques 3) Is this a problem in other areas as well, for example the following
query doesn't give the expected values as well-
select s_suppkey, c_custkey from supplier, customer where s_suppkey>125 and
s_suppkey<128 and c_custkey>125 and c_custkey<135 and c_custkey=s_suppkey;

I appreciate all the help of this group,
Thanks,

On 5/25/05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Vikram Kalsi <vikramkalsi(at)gmail(dot)com> writes:
> > So, I suppose that during the query planning and optimization stage,
> > the value of the original variables in the plan are somehow copied to
> > the plan which is finally returned inside pg_plan_query().
>
> Look in createplan.c --- there are a couple places in there you need to
> fix.
>
> regards, tom lane
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2005-06-03 07:58:10 Re: Tablespaces
Previous Message Hans-Jürgen Schönig 2005-06-03 06:41:29 Re: Tablespaces