| From: | Teodor Sigaev <teodor(at)sigaev(dot)ru> | 
|---|---|
| To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> | 
| Cc: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: POC, WIP: OR-clause support for indexes | 
| Date: | 2016-03-17 17:19:55 | 
| Message-ID: | 56EAE73B.3010603@sigaev.ru | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
> I also wonder whether the patch should add explanation of OR-clauses
> handling into the READMEs in src/backend/access/*
Oops, will add shortly.
>
> The patch would probably benefit from transforming it into a patch
> series - one patch for the infrastructure shared by all the indexes,
> then one patch per index type. That should make it easier to review, and
> I seriously doubt we'd want to commit this in one huge chunk anyway.
Ok, will do it.
> 1) fields in BrinOpaque are not following the naming convention (all the
> existing fields start with bo_)
fixed
>
> 2) there's plenty of places violating the usual code style (e.g. for
> single-command if branches) - not a big deal for WIP patch, but needs to
> get fixed eventually
hope, fixed
>
> 3) I wonder whether we really need both SK_OR and SK_AND, considering
> they are mutually exclusive. Why not to assume SK_AND by default, and
> only use SK_OR? If we really need them, perhaps an assert making sure
> they are not set at the same time would be appropriate.
In short: possible ambiguity and increasing stack machine complexity.
Let we have follow expression in reversed polish notation (letters represent a 
condtion, | - OR, & - AND logical operation, ANDs are omitted):
a b c |
Is it ((a & b)| c) or (a & (b | c)) ?
Also, using both SK_ makes code more readable.
> 4) scanGetItem is a prime example of the "badly needs comments" issue,
> particularly because the previous version of the function actually had
> quite a lot of them while the new function has none.
Will add soon
>
> 5) scanGetItem() may end up using uninitialized 'cmp' - it only gets
> initialized when (!leftFinished && !rightFinished), but then gets used
> when either part of the condition evaluates to true. Probably should be
>
>      if (!leftFinished || !rightFinished)
>          cmp = ...
fixed
>
> 6) the code in nodeIndexscan.c should not include call to abort()
>
>      {
>          abort();
>          elog(ERROR, "unsupported indexqual type: %d",
>              (int) nodeTag(clause));
>      }
fixed, just forgot to remove
>
> 7) I find it rather ugly that the paths are built by converting BitmapOr
> paths. Firstly, it means indexes without amgetbitmap can't benefit from
> this change. Maybe that's reasonable limitation, though?
I based on following thoughts:
1 code which tries to find OR-index path will be very similar to existing
   generate_or_bitmap code. Obviously, it should not be duplicated.
2 all existsing indexes have amgetbitmap method, only a few don't. amgetbitmap
   interface is simpler. Anyway, I can add an option for generate_or_bitmap
   to use any index, but, in current state it will just repeat all work.
>
> But more importantly, this design already has a bunch of unintended
> consequences. For example, the current code completely ignores
> enable_indexscan setting, because it merely copies the costs from the
> bitmap path.
I'd like to add separate enable_indexorscan
> That's pretty dubious, I guess. So this code probably needs to be made
> aware of enable_indexscan - right now it entirely ignores startup_cost
> in convert_bitmap_path_to_index_clause(). But of course if there are
> multiple IndexPaths, the  enable_indexscan=off will be included multiple
> times.
>
> 9) This already breaks estimation for some reason. Consider this
...
> So the OR-clause is estimated to match 0 rows, less than each of the
> clauses independently. Needless to say that without the patch this works
> just fine.
fixed
>
> 10) Also, this already breaks some regression tests, apparently because
> it changes how 'width' is computed.
fixed too
> So I think this way of building the index path from a BitmapOr path is
> pretty much a dead-end.
I don't think so because separate code path to support OR-clause in index will 
significanlty duplicate BitmapOr generator.
Will send next version as soon as possible. Thank you for your attention!
-- 
Teodor Sigaev                                   E-mail: teodor(at)sigaev(dot)ru
                                                    WWW: http://www.sigaev.ru/
| Attachment | Content-Type | Size | 
|---|---|---|
| index_or-3.patch.gz | application/x-gzip | 20.2 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | David Steele | 2016-03-17 17:22:25 | Re: Re: Add generate_series(date,date) and generate_series(date,date,integer) | 
| Previous Message | Robert Haas | 2016-03-17 17:17:44 | Re: Re: Add generate_series(date,date) and generate_series(date,date,integer) |