Re: Proposal: http2 wire format

From: Damir Simunic <damir(dot)simunic(at)wa-research(dot)ch>
To: Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal: http2 wire format
Date: 2018-03-26 13:05:25
Message-ID: A7ADB162-DD2F-439E-93AE-420CFEC3EDC1@wa-research.ch
Lists: pgsql-hackers

> On 26 Mar 2018, at 11:06, Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com> wrote:
>
> Hi,
>
> >If anyone finds the idea of Postgres speaking http2 appealing
>
> HTTP/2 sounds interesting.
> What do you think of https://grpc.io/ ?
>
> Have you evaluated it?
> It does sound like a ready RPC on top of HTTP/2 with support for lots of languages.
>
> The idea of reimplementing the protocol for multiple languages from scratch does not sound too appealing.

This proposal takes the stance that having an HTTP2 wire protocol in place will enable wide experimentation with, and implementation of, many new features and content types, but it is not concerned with the specifics of those.

---
Let me illustrate with an example how it would look if we already had HTTP2 as proposed.

Let’s say you have a building automation device on your network that happens to speak grpc, and you decide to store the topics it publishes in Postgres.

Your grpc-speaking device might connect to Postgres and issue a request like this:

HEADERS (flags = END_HEADERS)
:method = POST
:scheme = http
:path = /CreateTopic
pg-database = Publisher
content-type = application/grpc+proto
grpc-encoding = gzip
authorization = Bearer y235.wef315yfh138vh31hv93hv8h3v

DATA (flags = END_STREAM)
<Length-Prefixed Message>

(This example is from the grpc.io homepage; the uppercase HEADERS and DATA are frame names from the HTTP2 specification.)
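
To make the `<Length-Prefixed Message>` placeholder concrete, here is a minimal Python sketch of that framing as I understand it from the grpc protocol description: a one-byte compressed flag, a four-byte big-endian length, then the message bytes. The function names are made up for illustration and nothing here is part of the proposal itself.

```python
import struct


def encode_length_prefixed(message: bytes, compressed: bool = False) -> bytes:
    """Wrap one serialized message in grpc-style length-prefixed framing."""
    # 1-byte compressed flag, 4-byte big-endian length, then the payload.
    return struct.pack(">BI", 1 if compressed else 0, len(message)) + message


def decode_length_prefixed(buf: bytes):
    """Return (compressed, message, remaining_bytes) for the first message in buf."""
    if len(buf) < 5:
        raise ValueError("need at least 5 bytes of prefix")
    flag, length = struct.unpack(">BI", buf[:5])
    if flag not in (0, 1):
        raise ValueError("invalid compressed flag")
    if len(buf) < 5 + length:
        raise ValueError("truncated message")
    return bool(flag), buf[5:5 + length], buf[5 + length:]
```

Whether the payload is actually compressed (and with which algorithm) is announced by the `grpc-encoding` header, which is exactly the kind of metadata the server would hand to a plugin.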

Postgres would take care of TLS negotiation, unpack the frames, decompress the headers (:method, :path, etc. are transferred compressed using a lookup table), copy the payload into memory, and make it all available to the backend. If this were the first request on the connection, it would start the backend for you as well.
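
As a rough illustration of what "unpack the frames" means, every HTTP2 frame starts with a fixed 9-byte header: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The sketch below parses just that header in Python; the function name and the returned tuple shape are my own choices, and only the two frame types from the example are named.

```python
# Frame types and flags used in the example request (values from the HTTP2 spec).
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS"}
END_STREAM = 0x1
END_HEADERS = 0x4


def parse_frame_header(buf: bytes):
    """Parse the 9-byte HTTP2 frame header at the start of buf.

    Returns (payload_length, type_name, flags, stream_id).
    """
    if len(buf) < 9:
        raise ValueError("frame header is 9 bytes")
    length = int.from_bytes(buf[0:3], "big")      # 24-bit payload length
    ftype = buf[3]                                # 8-bit frame type
    flags = buf[4]                                # 8-bit flags
    # Mask off the reserved high bit to get the 31-bit stream identifier.
    stream_id = int.from_bytes(buf[5:9], "big") & 0x7FFFFFFF
    type_name = FRAME_TYPES.get(ftype, f"UNKNOWN(0x{ftype:x})")
    return length, type_name, flags, stream_id
```

A real implementation would of course sit in C inside the server and handle all frame types, padding, and continuation frames; this only shows how little is needed to locate the headers and the payload.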

Postgres doesn’t know about grpc, so it would simply return “415 Unsupported Media Type” to your client and close the stream (but not the connection). Still connected and authenticated, the device could retry the request with `content-type: application/json`, and if you had programmed a function that accepts json, the request would go through. (Let’s imagine we have some mechanism for associating functions with requests and content types, perhaps through function attributes in the catalog.)
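
The imagined mechanism for associating functions with requests and content types could look roughly like the following Python sketch. Everything in it is hypothetical (the registry, the `register` decorator, the `dispatch` function, the json payload shape); it only shows the lookup by path and content type. For the "known path, unknown content type" case it uses 415 Unsupported Media Type, which is the status HTTP defines for a request body format the server cannot process.

```python
import json

# Hypothetical registry: (path, content_type) -> handler(body) -> (status, body)
HANDLERS = {}


def register(path, content_type):
    """Associate a function with a request path and a content type."""
    def wrap(fn):
        HANDLERS[(path, content_type)] = fn
        return fn
    return wrap


@register("/CreateTopic", "application/json")
def create_topic_json(body: bytes):
    # Assumed payload shape: {"name": "<topic name>"}.
    topic = json.loads(body)
    return 200, json.dumps({"created": topic["name"]}).encode()


def dispatch(path, content_type, body):
    """Pick a handler for the request, or answer with an error status."""
    handler = HANDLERS.get((path, content_type))
    if handler is not None:
        return handler(body)
    if any(p == path for p, _ in HANDLERS):
        return 415, b""  # path exists, but not for this content type
    return 404, b""      # nothing registered for this path at all
```

With only the json handler registered, a request for `/CreateTopic` with `application/grpc+proto` gets the error status, and the retry with `application/json` goes through.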

Say that someone else took the time to program a plugin that knows how to talk grpc. Then the server would call that plugin for you, validate and insert the data in the right table, and return 200 OK or 204 or whatever is appropriate according to grpc protocol semantics.

Obviously, someone has to implement a bunch of new code on the server side to ungzip the payload, interpret the content of the protobuf message, and take action. But that someone doesn’t need to worry about how to get at the metadata such as compression type, payload format, etc. They just plug into the server at the right level, read the data and metadata from memory, and then call into SPI to do the work. This is similar to how application servers work today. (Or Postgres itself, for that matter, except that it speaks FEBE and there’s no content type negotiation.)

The same goes for the ‘authorization’ header. Postgres does not support Bearer token authorization today. But maybe you’ll be able to define a function that knows how to deal with the token, and somehow signal to Postgres that you want it to call this function when it sees such a header. Or maybe someone wrote a plugin that does that, and you configure your server to use it.

Then, when a client connects to Postgres with the above request, Postgres would start the backend and call the function/plugin for you to decide whether to authorize the request. (As a side note, subsequent requests within the same connection would have this header compressed on the wire; that’s also an HTTP2 feature.)
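
A pluggable authorization hook of that kind might be sketched as below. This is purely illustrative Python: the plugin registry, `register_auth`, `authorize`, the token table, and the `publisher` role name are all assumptions, not an existing Postgres interface. The idea is only that the server splits the header into scheme and credentials and hands the credentials to whatever was registered for that scheme.

```python
import hmac

# Hypothetical registry: lowercase auth scheme -> plugin(credentials) -> role or None
AUTH_PLUGINS = {}


def register_auth(scheme, plugin):
    AUTH_PLUGINS[scheme.lower()] = plugin


def authorize(headers):
    """Return the role granted by the matching plugin, or None to reject."""
    value = headers.get("authorization", "")
    scheme, _, credentials = value.partition(" ")
    plugin = AUTH_PLUGINS.get(scheme.lower())
    credentials = credentials.strip()
    if plugin is None or not credentials:
        return None  # unknown scheme or missing credentials
    return plugin(credentials)


# Toy Bearer plugin with a made-up token table; a real one would verify a
# signature or ask an identity provider.
KNOWN_TOKENS = {"y235.wef315yfh138vh31hv93hv8h3v": "publisher"}


def bearer_plugin(token):
    for known, role in KNOWN_TOKENS.items():
        if hmac.compare_digest(known, token):  # constant-time comparison
            return role
    return None


register_auth("Bearer", bearer_plugin)
```

The request from the example would then be authorized as the (assumed) `publisher` role, while a header with a scheme nobody registered would be rejected.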

---

That’s only one of many possible scenarios. In this specific scenario, the benefit is that Postgres gives you content negotiation built in and will talk to any HTTP2-conforming client. Like you said, you don’t want to reimplement the protocol over and over.

But whether that content is grpc or something else, that's for a future discussion.

The current focus is really on getting the framing and extensibility into core. Admittedly, I haven’t yet figured out how to code all the details, but it’s becoming clearer to me how this will work architecturally. Now it’s about putting lots of elbow grease into understanding the source, coding in C, and addressing all the issues needed to make sure the new protocol supports 100% of existing v3 use cases.

Beyond v3 use cases, top of mind for me are improvements like those you discuss under “Binary transfer” in your “v4 wanted features” doc (and most of the other items you mention).

Damir

>
> Vladimir
