Re: [GENERAL] 4B row limit for CLOB tables

From: Matthew Kelly <mkelly(at)tripadvisor(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, José Luis Tallón <jltallon(at)adv-solutions(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [GENERAL] 4B row limit for CLOB tables
Date: 2015-02-05 06:53:11
Message-ID: A4F793D3-76D5-49EB-9121-BDB867955A11@tripadvisor.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

> That's assuming that toasting is evenly spread between tables. In my experience, that's not a great bet...

Time to create a test:
SELECT chunk_id::bigint/100000 as id_range, count(*), count(*)/(100000::float) density FROM (SELECT chunk_id FROM pg_toast.pg_toast_39000165 WHERE chunk_id <
100000000 AND chunk_seq = 0) f GROUP BY id_range ORDER BY id_range;

The machine in question was restored in parallel in Sept 2013 as part of an upgrade from 8.4. It has about 2000 tables, so its definitely not dominated by a couple tables. Progress towards oid wrap around is about 25.6%.

With minimal effort, I found 2 bad examples, and I’m sure I can easily find more. I attached the results for those two.

There were runs of 1,100,000+ and 600,000+ chunk_ids where more than 99% of the chunk_id are taken. After restore completion, oid densities averaged less than 20 per 100,000 and 400 per 100,000 respectively. The only reasons those runs seem to be so short is because the tables were much smaller back then. I expect that next time I dump restore (necessary for upgrading OS versions due to the collation issue), I’m going to have runs closer to 20,0000,000.

> ... this "fix" would actually make things enormously worse. With the
> single counter feeding all tables, you at least have a reasonable
> probability that there are not enormously long runs of consecutive OIDs in
> any one toast table. With a sequence per table, you are nearly guaranteed
> that there are such runs, because inserts into other tables don't create a
> break.

It makes each toast table independent (and far less likely to wrap) . It would wrap when the sum(mods on THIS toast table) > 2^32. Right now the function looks like:

sum(mods on ALL toast tables in cluster) + sum(created normal tables in cluster * k) + sum(created temp tables in cluster * k) + [...] > 2^32,
WHERE k average number of ids consumed for pg_class, pg_type, etc...

In the case of an insert only table (which is a common use case for partitions), the id would only wrap when the TOAST table was “full”. On the other hand currently, it would wrap into its pg_restored section when the combined oid consuming operations on the cluster surpassed 4 billion.

That being said, I’m certainly not attached to that solution. My real argument is that although its not a problem today, we are only about 5 years from it being a problem for large installs and the first time you’ll hear about it is after someone has a 5 minute production outage on a database thats been taking traffic for 2 years.

- Matt K.

Attachment Content-Type Size
oid_densities_1.txt text/plain 2.3 KB
oid_densities_2.txt text/plain 18.8 KB

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Sterfield 2015-02-05 08:54:36 DB encoding, locale and indexes
Previous Message Elijah Zupancic 2015-02-05 02:01:31 Re: pg_dump search path issue

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2015-02-05 08:50:05 Re: [REVIEW] Re: Compression of full-page-writes
Previous Message Peter Geoghegan 2015-02-05 04:39:02 Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0