[RFC] What would be difficult to make data models pluggable for making PostgreSQL a multi-model database?

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: [RFC] What would be difficult to make data models pluggable for making PostgreSQL a multi-model database?
Date: 2017-08-19 14:29:25
Message-ID: FC76AD8AE7C94E88BF94A37A2B28945E@tunaPC
Lists: pgsql-hackers

Hello,

Please forgive me for asking such a stupid and rough question.

I'm thinking of making PostgreSQL a multi-model database by supporting
data models other than the current relational model. A data model
consists of a query language (e.g. SQL for relational model, Cypher
for graph model), a parser and analyzer to transform a query into a
query tree, a planner to transform the query tree into an execution
plan, an executor, and a storage engine.
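
To make that decomposition concrete, here is a minimal sketch in Python of a data model as a bundle of stages that a query flows through. Everything in it (the DataModel class, its field names, the toy "GET key" language) is a hypothetical illustration, not an existing PostgreSQL interface, and a real implementation would be C functions in a shared library.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class DataModel:
    """Hypothetical bundle of the stages named above; not a PostgreSQL API."""
    name: str
    parse: Callable[[str], Any]                 # query text -> raw parse tree
    analyze: Callable[[Any], Any]               # raw parse tree -> query tree
    plan: Callable[[Any], Any]                  # query tree -> execution plan
    execute: Callable[[Any, Any], List[tuple]]  # (plan, storage) -> rows
    storage: Any                                # stand-in for a storage engine

    def run(self, query: str) -> List[tuple]:
        # Each stage only sees the output of the previous one.
        raw_tree = self.parse(query)
        query_tree = self.analyze(raw_tree)
        exec_plan = self.plan(query_tree)
        return self.execute(exec_plan, self.storage)


# A toy key-value "data model" whose only statement is: GET <key>
toy = DataModel(
    name="toy_kv",
    parse=lambda q: q.split(),
    analyze=lambda t: {"op": t[0].upper(), "key": t[1]},
    plan=lambda qt: ("lookup", qt["key"]),
    execute=lambda p, store: [(p[1], store[p[1]])] if p[1] in store else [],
    storage={"a": 1, "b": 2},
)

print(toy.run("get a"))  # [('a', 1)]
```

The point of the sketch is only that the stages communicate through opaque intermediate values, which is the property the pluggable design would need from the real parser, planner, and executor.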

To encourage the development of new data models, I want to make data
models pluggable. The rough sketch is:

1) A data model developer implements the parser, analyzer,
transformer, planner, executor, and storage engine functions in a
shared library.

2) The DBA registers the data model.

CREATE QUERY LANGUAGE Cypher (
PARSER = <parser function>
);

CREATE DATA MODEL graph (
QUERY LANGUAGE = Cypher,
ANALYZER = <analyzer function>,
TRANSFORMER = <transformer function>,
PLANNER = <planner function>,
EXECUTOR = <executor function>,
STORAGE ENGINE = <storage engine function>
);

CREATE REGION cypher_graph (
QUERY LANGUAGE = Cypher,
DATA MODEL = graph
);

The region is just a combination of a query language and a data model,
much like a locale is a combination of a language and a country. This
is because there may be multiple popular query languages for a data
model.
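
As a rough illustration of how the three kinds of objects relate, the following Python sketch models a catalog in which a region is valid only if its query language is one the data model accepts. The Catalog class and its methods are hypothetical, and letting one data model list several languages is an assumption drawn from the locale analogy (the CREATE DATA MODEL statement above names only one). Gremlin appears purely as an example of a second graph query language.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Catalog:
    """Hypothetical in-memory stand-in for the system catalogs."""
    languages: Set[str] = field(default_factory=set)
    models: Dict[str, Set[str]] = field(default_factory=dict)          # model -> accepted languages
    regions: Dict[str, Tuple[str, str]] = field(default_factory=dict)  # region -> (language, model)

    def create_query_language(self, name: str) -> None:
        self.languages.add(name)

    def create_data_model(self, name: str, languages: Set[str]) -> None:
        unknown = languages - self.languages
        if unknown:
            raise ValueError(f"unregistered query languages: {sorted(unknown)}")
        self.models[name] = set(languages)

    def create_region(self, name: str, language: str, model: str) -> None:
        # A region pairs a language with a data model that understands it.
        if language not in self.models.get(model, set()):
            raise ValueError(f"{model!r} does not accept language {language!r}")
        self.regions[name] = (language, model)


catalog = Catalog()
catalog.create_query_language("Cypher")
catalog.create_query_language("Gremlin")
catalog.create_data_model("graph", {"Cypher", "Gremlin"})
catalog.create_region("cypher_graph", "Cypher", "graph")
catalog.create_region("gremlin_graph", "Gremlin", "graph")
```

Under these assumptions, two regions can share the graph data model while differing only in the language a session speaks by default.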

3) The application connects to the database, specifying a desired
region. The specified region's query language becomes the default
query language for the session.

The application can use the data of multiple data models in one query
by specifying another region and its query via in_region(). For
example, the following query combines the relational restaurant table
and a social graph to get the five Chinese restaurants in Tokyo that
are most popular among friends of John and their friends.

SELECT r.name, g.num_likers
FROM restaurant r,
cast_region(
in_region('cypher_graph',
'MATCH (:Person {name:"John"})-[:IS_FRIEND_OF*1..2]-(friend),
(friend)-[:LIKES]->(restaurant:Restaurant)
RETURN restaurant.name, count(*)'),
'relational', 'g', '(name text, num_likers int)')
WHERE r.name = g.name AND
r.city = 'Tokyo' AND r.cuisine = 'chinese'
ORDER BY g.num_likers DESC LIMIT 5;
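
The intended semantics of the query above can be sketched in Python as follows. This is only an illustration under stated assumptions: in_region() and cast_region() are simplified here (cast_region takes a list of column names and converters instead of the 'relational', 'g', and column-definition arguments), the graph executor is a stub that ignores the query text, and all restaurant names and counts are made-up sample data.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: region name -> function that runs a query string
# in that region and returns untyped rows.
REGION_EXECUTORS: Dict[str, Callable[[str], List[tuple]]] = {}


def in_region(region: str, query: str) -> List[tuple]:
    """Run the query with the named region's language and data model."""
    return REGION_EXECUTORS[region](query)


def cast_region(rows: List[tuple], columns: List[Tuple[str, type]]) -> List[dict]:
    """Give foreign rows relational column names and types."""
    return [
        {name: conv(value) for (name, conv), value in zip(columns, row)}
        for row in rows
    ]


# Stub graph engine: returns canned (restaurant name, like count) rows.
REGION_EXECUTORS["cypher_graph"] = lambda query: [
    ("Dragon Gate", "7"), ("Lotus", "3"), ("Sakura Sushi", "9"), ("Golden Wok", "5"),
]

restaurant = [
    {"name": "Dragon Gate", "city": "Tokyo", "cuisine": "chinese"},
    {"name": "Lotus", "city": "Tokyo", "cuisine": "chinese"},
    {"name": "Sakura Sushi", "city": "Tokyo", "cuisine": "japanese"},
    {"name": "Golden Wok", "city": "Osaka", "cuisine": "chinese"},
]

g = cast_region(
    in_region("cypher_graph", "MATCH ... RETURN restaurant.name, count(*)"),
    [("name", str), ("num_likers", int)],
)
likers = {row["name"]: row["num_likers"] for row in g}

# Join on name, apply the relational filters, then ORDER BY ... DESC LIMIT 5.
result = sorted(
    (
        (r["name"], likers[r["name"]])
        for r in restaurant
        if r["name"] in likers and r["city"] == "Tokyo" and r["cuisine"] == "chinese"
    ),
    key=lambda t: t[1],
    reverse=True,
)[:5]

print(result)  # [('Dragon Gate', 7), ('Lotus', 3)]
```

The sketch materializes the graph result before joining; whether the real planner could push filters across the region boundary is one of the open design questions.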

What do you think would be difficult about making data models
pluggable, especially when it comes to plugging in the parser, planner,
executor, etc.? One possible concern is that various PostgreSQL
components may depend too heavily on the data model being relational,
making that tight coupling difficult to untangle.

Regards
MauMau
