Re: About "Our CLUSTER implementation is pessimal" patch

From: Leonardo F <m_lists(at)yahoo(dot)it>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: About "Our CLUSTER implementation is pessimal" patch
Date: 2010-01-20 17:48:00
Message-ID: 925824.32998.qm@web29018.mail.ird.yahoo.com
Lists: pgsql-hackers

> I read the thread "Our CLUSTER implementation is pessimal"
> http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php .
>
> I would like to try/integrate that patch as we use CLUSTER a lot on our system.
>
> I was going to try to add the proper cost_index/cost_sort calls to decide which
> "path" should be executed, as in:
>
> http://archives.postgresql.org/pgsql-hackers/2008-09/msg00517.php

I think I have something up and running that checks whether a table scan + sort is
expected to be faster than an index scan for a given CLUSTER operation.

The way I did it is (I guess...) wrong: I built by hand the structures needed by
get_relation_info, create_seqscan_path, create_index_path and cost_sort.

It was, obviously, a trial-and-error approach: I filled in struct members whenever
a function call crashed... and I bet I didn't cover all the corner cases.
Is there a better way of doing it?

Leonardo

(this is called in copy_heap_data to decide which path to choose:)

static bool use_index_scan(Oid tableOid, Oid indexOid)
{
    RelOptInfo *rel;
    PlannerInfo *root;
    Query *query;
    PlannerGlobal *glob;
    Path *seqAndSortPath;
    IndexPath *indexPath;
    RangeTblEntry *rte;

    /* fake a single base relation at range table index 1 */
    rel = makeNode(RelOptInfo);
    rel->reloptkind = RELOPT_BASEREL;
    rel->relid = 1;
    rel->rtekind = RTE_RELATION;

    /* needed by get_relation_info */
    glob = makeNode(PlannerGlobal);

    /* needed by get_relation_info: */
    query = makeNode(Query);
    query->resultRelation = 0;

    root = makeNode(PlannerInfo);

    root->parse = query;
    root->glob = glob;

    /* fills in rel->pages, rel->tuples and rel->indexlist */
    get_relation_info(root, tableOid, false, rel);
    seqAndSortPath = create_seqscan_path(NULL, rel);

    /* CLUSTER reads the whole table, so every tuple is returned */
    rel->rows = rel->tuples;

    rte = makeNode(RangeTblEntry);
    rte->rtekind = RTE_RELATION;
    rte->relid = tableOid;

    /* slot 0 is unused; slot 1 holds our only range table entry */
    root->simple_rel_array_size = 2;
    root->simple_rte_array = (RangeTblEntry **)
        palloc0(root->simple_rel_array_size * sizeof(RangeTblEntry *));
    root->simple_rte_array[1] = rte;

    root->total_table_pages = rel->pages;

    /*
     * Note: indexOid is not consulted yet; the first index in
     * rel->indexlist is the one that gets costed.
     */
    indexPath = create_index_path(root,
                                  (IndexOptInfo *) linitial(rel->indexlist),
                                  NULL, NULL, ForwardScanDirection, NULL);

    /* add the sort cost on top of the sequential scan cost */
    cost_sort(seqAndSortPath, root, NULL, seqAndSortPath->total_cost,
              rel->tuples, rel->width, -1);

    return indexPath->path.total_cost < seqAndSortPath->total_cost;
}
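
To make the comparison itself easier to follow, here is a minimal standalone
sketch of a toy cost model. It does not call the planner at all: the toy_*
names, the floor-of-log2 approximation and the simplified formulas are
assumptions made purely for illustration. Only the four cost constants mirror
PostgreSQL's default settings (seq_page_cost, random_page_cost,
cpu_tuple_cost, cpu_operator_cost).

```c
#include <assert.h>
#include <stdbool.h>

/* Defaults of the corresponding PostgreSQL settings. */
static const double SEQ_PAGE_COST = 1.0;
static const double RANDOM_PAGE_COST = 4.0;
static const double CPU_TUPLE_COST = 0.01;
static const double CPU_OPERATOR_COST = 0.0025;

/* Hypothetical helper: floor(log2(n)) for n >= 1, avoiding libm. */
static double toy_log2(double n)
{
    double result = 0.0;

    while (n >= 2.0)
    {
        n /= 2.0;
        result += 1.0;
    }
    return result;
}

/* Read every page sequentially and process every tuple once. */
static double toy_seqscan_cost(double pages, double tuples)
{
    assert(pages >= 0.0 && tuples >= 0.0);
    return pages * SEQ_PAGE_COST + tuples * CPU_TUPLE_COST;
}

/* In-memory sort: roughly N * log2(N) comparisons, two operators each. */
static double toy_sort_cost(double tuples)
{
    assert(tuples >= 0.0);
    return 2.0 * CPU_OPERATOR_COST * tuples * toy_log2(tuples);
}

/*
 * Full index scan: interpolate I/O between the best case (heap already in
 * index order, pages read sequentially) and the worst case (one random
 * page fetch per tuple), weighted by the squared correlation.
 */
static double toy_indexscan_cost(double pages, double tuples, double correlation)
{
    double min_io = pages * SEQ_PAGE_COST;
    double max_io = tuples * RANDOM_PAGE_COST;
    double csquared = correlation * correlation;

    assert(correlation >= -1.0 && correlation <= 1.0);
    return max_io + csquared * (min_io - max_io) + tuples * CPU_TUPLE_COST;
}

/* Same decision rule as use_index_scan: pick the cheaper total cost. */
static bool toy_use_index_scan(double pages, double tuples, double correlation)
{
    double seq_and_sort = toy_seqscan_cost(pages, tuples) + toy_sort_cost(tuples);

    return toy_indexscan_cost(pages, tuples, correlation) < seq_and_sort;
}
```

With 1000 pages and 100000 tuples, this toy model prefers the index scan when
the table is already perfectly ordered (correlation 1.0) and the scan + sort
when the order is random (correlation 0.0), which is the trade-off the real
cost functions are meant to capture.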
