Order-preserving function transforms and EquivalenceClass

From: Mat Arye <mat(at)timescaledb(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Order-preserving function transforms and EquivalenceClass
Date: 2017-03-23 16:33:46
Message-ID: CADsUR0Ccq3tMrcQwYd3c+o0yEL4mX761z-nkG1jos4iuZRNemQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I am on a team developing an open-source extension for time-series data
storage in PostgreSQL (https://github.com/timescaledb/timescaledb).

We are trying to extend/hook into the planner so that it understands that
date_trunc('minute', time) has the same ordering as time (or rather that a
sort ordering on the latter is always a valid sort ordering on the former).
But this question really applies to any order-preserving transform such as
(time+1) vs (time).

Let's assume we can detect such cases. How do we tell the planner about it.

I see two ways of doing this:

(1) Having an EquivalenceClass with two members - the "time" and
"date_trunc" member. Then the pathkey for the sort would reference this
member and understand the equivalence. I think this is a pretty clean way
of doing things. But this equivalence between "time" and "date_trunc" only
applies to sorts. It should not apply to joins (for which EquivalenceClass
is also used) because these values are not actually equivalent. They are
only equivalent for sorting purposes. I know an EquivalenceClass has
a ec_has_volatile field which marks the EquivalenceClass as only for sorts
and not for joins. But is it an incorrect hack to set this for cases where
the EquivalenceMembers are not volatile? Is there some other way as marking
an EquivalenceClass as only for sorts that I am missing? Another wrinkle is
that while a date_trunc sort can be safely represented by a time sort the
reverse isn't true. Has there been any discussion on supporting such cases
in EquivalenceClasses? I'd be happy to submit a patch to the core if people
think this is worthwhile.

(2) I can create new EquivalenceClass objects and pathkeys and use planner
hooks to substitute the appropriate pathkeys for doing things like finding
indexes etc. I have a prototype where this works, but I just think approach
1 is much cleaner.

Any thoughts would be greatly appreciated. I am pretty new to the planner
codebase and just want to avoid any obvious pitfalls.

Thanks,
Matvey Arye

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Teodor Sigaev 2017-03-23 16:39:41 Re: Review: GIN non-intrusive vacuum of posting tree
Previous Message Pierre Ducroquet 2017-03-23 16:26:03 Re: Other formats in pset like markdown, rst, mediawiki