| From: | Mark Dilger <hornschnorter(at)gmail(dot)com> |
|---|---|
| To: | Andrew Kane <andrew(at)chartkick(dot)com> |
| Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Subject: | Re: A space-efficient, user-friendly way to store categorical data |
| Date: | 2018-02-13 03:01:33 |
| Message-ID: | A838A06F-5481-49BF-A4F1-23A17F7779DB@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On Feb 12, 2018, at 5:08 PM, Andrew Kane <andrew(at)chartkick(dot)com> wrote:
>
> Thanks everyone for the feedback. The current enum implementation requires you to create a new type and add labels outside a transaction prior to an insert.
>
> -- on table creation
> CREATE TYPE city AS ENUM ();
> CREATE TABLE "users" ("city" city);
>
> -- on insert
> ALTER TYPE city ADD VALUE IF NOT EXISTS 'Chicago';
> BEGIN;
> INSERT INTO "users" ("city") VALUES ('Chicago');
> COMMIT;
>
> What would be ideal:
>
> -- on table creation
> CREATE TABLE "users" ("city" dynamic_enum);
>
> -- on insert
> BEGIN;
> INSERT INTO "users" ("city") VALUES ('Chicago');
> COMMIT;
>
> Since enums have a fixed number of labels, this type of feature may be better off as a property you could add to text columns (as Thomas mentions). This would avoid issues with hitting the max number of labels.
In your proposed feature, what happens if I create two tables:
CREATE TABLE myusers (city dynamic_enum);
CREATE TABLE yourusers (city dynamic_enum);
Do you imagine that myusers and yourusers are referring to the
same enum or to two different enums? Are the enums stored in
a new table within pg_catalog, or are they stored in something akin
to a toast table? If you insert billions of rows into a table, but only
have 30 distinct values, can you quickly query for all 30 distinct enum
values, or would you have to walk billions of rows to find them all?
mark
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Mark Dilger | 2018-02-13 03:11:25 | Re: A space-efficient, user-friendly way to store categorical data |
| Previous Message | Masahiko Sawada | 2018-02-13 02:58:35 | Typo in origin.c |