[RFC] SLIM Data Type - Compact JSON Alternative (17-62% smaller)

From: Marco Matteucci <marco(dot)matteucci(dot)1972(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: [RFC] SLIM Data Type - Compact JSON Alternative (17-62% smaller)
Date: 2026-01-10 16:28:03
Message-ID: CALOXXf5iPrp3wu88T_vzQ6aH0M2T5kYm-wySjhrP8PFKw6pqJQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

I'd like to propose a new data type for PostgreSQL: SLIM (Structured
Lightweight Interchange Markup), a compact alternative to JSONB that
achieves 17-62% storage reduction while remaining human-readable.

== The Problem ==

JSONB is great, but has inherent inefficiencies:
- Redundant field names in arrays of objects
- Verbose syntax (quotes, colons, brackets)
- Boolean representation uses 4-5 bytes each
- No format-level compression

For databases with millions of JSON documents, this adds up.

== SLIM Format ==

JSON:
{"users":[{"id":1,"name":"Alice","active":true},{"id":2,"name":"Bob","active":false}]}
SLIM: {users:|2|id#,name,active?|1,Alice,T|2,Bob,F}

Savings: 45%

Key optimizations:
- Table format for arrays: schema once, data in rows
- Type markers: # (number), ? (boolean), ! (null)
- Compact booleans: T/F instead of true/false
- No quotes for simple strings

== Real-World Benchmarks ==

Tested on a production TimescaleDB database (74.9 million rows, 74 GB):

Format | Sample | JSON | SLIM | Savings
--------------------------|-----------|---------|---------|--------
Object (token_metrics) | 100K rows | 16 MB | 13 MB | 18.7%
Object (wallet_activity) | 100K rows | 20 MB | 16 MB | 17.0%
Table format (arrays) | 10K rows | 580 KB | 222 KB | 61.7%

Extrapolated on 74 GB: 12.5-45.9 GB savings depending on data structure.

== Implementation ==

I've built a working PostgreSQL extension (pg_slim) that provides:
- Native SLIM data type with TOAST support
- Implicit casts to/from JSONB and TEXT
- Full operator support: ->, ->>, @>, <@, ?, ?|, ?&
- B-tree and hash indexing
- Comparison operators for sorting

Example usage:

CREATE TABLE documents (id SERIAL PRIMARY KEY, data SLIM);
INSERT INTO documents (data) VALUES ('{name:Alice,age:#30}'::slim);
INSERT INTO documents (data) VALUES
(slim_encode('{"name":"Bob"}'::jsonb));
SELECT data->>'name' FROM documents; -- Works like JSONB
SELECT * FROM documents WHERE data @> '{name:Alice}'::slim;

The extension compiles and passes tests on PostgreSQL 15.

== Questions for Discussion ==

1. Is there community interest in a SLIM type?
2. Should this remain an extension or be considered for contrib/core?
3. Which index types should be prioritized (GIN support is planned)?
4. Any concerns about the format design?

== Links ==

- Extension: https://github.com/matteuccimarco/pg-slim
- Full RFC:
https://github.com/matteuccimarco/pg-slim/blob/main/RFC-SLIM-TYPE.md
- SLIM spec: https://github.com/matteuccimarco/slim-protocol-core
- Benchmarks:
https://github.com/matteuccimarco/pg-slim/blob/main/BENCHMARK-RESULTS.md

Looking forward to your feedback.

Best regards,
Marco Matteucci

--
Marco Matteucci
+39 340 7063047
Vuoi un appuntamento con me? Clicca qui
<https://calendar.app.google/SaCBAJfiJywVfQJb7>

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2026-01-10 16:37:31 Re: Maybe BF "timedout" failures are the client script's fault?
Previous Message Japin Li 2026-01-10 16:20:19 Re: amcheck: support for GiST