Re: Is there a good way to handle sum types (or tagged unions) in PostgreSQL?

From: Dominique Devienne <ddevienne(at)gmail(dot)com>
To: Victor Nordam Suadicani <v(dot)n(dot)suadicani(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Is there a good way to handle sum types (or tagged unions) in PostgreSQL?
Date: 2023-05-19 10:44:59
Message-ID: CAFCRh-_5vmC3oUOzNjtHKf_YpViq0D6AeJiHYUoh9jBaqRdX9A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, May 18, 2023 at 2:28 PM Victor Nordam Suadicani <
v(dot)n(dot)suadicani(at)gmail(dot)com> wrote:

> Is there any nice way to handle sum types (aka tagged unions) in a
> PostgreSQL database? [...]
>
A third method would be to save all fields of all variants into a single
> table, with all fields being nullable.
>
So you'd have a nullable text field, nullable integer and nullable double
> precision field.
>

Yes, we do that. That's IMHO the only sane way to do it.
And if some of those alternatives are FKs (relations), that's the only
choice.

You'd then need an additional tag field to indicate which variant of the
> union is used
>

No, you don't need it. That's implicit from the NULL'ability of the
alternative columns.
If you want, you can have it as a generated column, thus read-only.
Worse, having it as an explicit column would make it denormalized, and
possibly out of sync.

> and you'd have to write check constraints for each variant to ensure that
> all the fields in that variant are not null and all the fields not in that
> variant *are* null.
>

Yes indeed.

> This *almost* works, but has two major problems:
>
> 1. It wastes space. In Rust, an enum is only as big as its largest
> variant. Using this method, a table row would be as big as the sum of all
> the variants.
>

Not really, or not to a point it matters that much.
I don't know about the actual physical bytes on disk for PostgreSQL, but as
an example in SQLite,
all columns have *at least* 1 "header" byte per value, and NULL values (and
0 and 1) have no "payload" bytes.
In PostgreSQL (which is more "rigidly typed" as DRH would say :)) you may
waste space for primitive types,
but not for text and bytea, which is where it really matters IMHO.

> 2. Querying the data is very cumbersome, [...].
>

Sure, it's cumbersome. But I don't get your point here. NULL handling is
part of SQL.
And sum-types (variants) implemented via exclusive NULL'ness is just a
special case.
You "dispatch" to the proper column on writes. You read all alternatives
and assign the one (if any) NOT NULL to the variant.

> Both of these problems get bigger and bigger as you add more variants - it
> doesn't scale well.
>

ORMs cannot magically resolve the impedence mismatch between SQL and
OO-based or sum-type based type systems a la Rust (and co).
If you need SQL, you need to design for SQL for the get go. Not shoehorn
your Rust data model into SQL.

My $0.02.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message sujay kadam 2023-05-19 11:47:47 How to connect with PostgreSQL Database with SSL using Certificates and Key from client Eclipse in Java
Previous Message FOUTE K. Jaurès 2023-05-19 09:04:37 Re: PostgreSQL 13 - Logical Replication - ERROR: could not receive data from WAL stream: SSL SYSCALL error: EOF detected