| From: | Chris Papademetrious <Christopher(dot)Papademetrious(at)synopsys(dot)com> |
|---|---|
| To: | "pgsql-novice(at)lists(dot)postgresql(dot)org" <pgsql-novice(at)lists(dot)postgresql(dot)org> |
| Subject: | is there a way to automate deduplication of strings? |
| Date: | 2025-12-27 12:36:20 |
| Message-ID: | DM4PR12MB603953767048EE1B8A39283ADDB1A@DM4PR12MB6039.namprd12.prod.outlook.com |
| Lists: | pgsql-novice |
Hello everyone! First time poster here.
I have a question about deduplicating text strings stored in a database. I am aware of the pattern of creating a separate table for unique values and then referencing those values by key. However, this adds transactional complexity to storage and retrieval, and values that are no longer referenced must be cleaned up over time. That complexity also grows with the number of primary-table columns that use this indirection.
I would only use this for (1) seldom-referenced columns that (2) have a high rate of duplication and (3) have an average string length that makes deduplication worthwhile.
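For concreteness, the lookup-table pattern described above might be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for Postgres so that it runs without a server, and the table names (`strings`, `items`), the column `notes_id`, and the helper functions are all hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE strings (              -- one row per distinct string
    id    INTEGER PRIMARY KEY,
    value TEXT NOT NULL UNIQUE
);
CREATE TABLE items (                -- primary table; stores only the key
    id       INTEGER PRIMARY KEY,
    notes_id INTEGER NOT NULL REFERENCES strings(id)
);
""")

def intern_string(conn, value):
    """Return the key for value, inserting it first if it is new."""
    # SQLite syntax; in Postgres the rough equivalent is
    # INSERT ... ON CONFLICT DO NOTHING.
    conn.execute("INSERT OR IGNORE INTO strings (value) VALUES (?)", (value,))
    row = conn.execute(
        "SELECT id FROM strings WHERE value = ?", (value,)
    ).fetchone()
    return row[0]

def add_item(conn, notes):
    """Storage: intern the string and insert the row in one transaction."""
    with conn:
        string_id = intern_string(conn, notes)
        cur = conn.execute(
            "INSERT INTO items (notes_id) VALUES (?)", (string_id,)
        )
        return cur.lastrowid

def get_notes(conn, item_id):
    """Retrieval: join back to the lookup table to recover the text."""
    row = conn.execute(
        "SELECT s.value FROM items i JOIN strings s ON s.id = i.notes_id "
        "WHERE i.id = ?",
        (item_id,),
    ).fetchone()
    return row[0] if row else None

def delete_item(conn, item_id):
    """Cleanup: delete the row, then drop strings nothing references."""
    with conn:
        conn.execute("DELETE FROM items WHERE id = ?", (item_id,))
        conn.execute(
            "DELETE FROM strings WHERE id NOT IN (SELECT notes_id FROM items)"
        )
```

Each of the three helpers corresponds to one piece of the overhead mentioned above (storage, retrieval, cleanup), and each would need to be repeated or generalized for every additional column that uses the indirection.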
Are there any native or extension-based methods to simplify this in Postgres? I searched and came up empty, but maybe I'm not searching with the right terms.
Thanks!
* Chris