Re: Replication vs. float timestamps is a disaster

From: Chapman Flack <chap(at)anastigmatix(dot)net>
To: Omar Kilani <omar(dot)kilani(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication vs. float timestamps is a disaster
Date: 2017-09-07 02:55:26
Message-ID: 59B0B51E.5020406@anastigmatix.net
Lists: pgsql-hackers

On 09/06/17 18:33, Omar Kilani wrote:

> Is there anything people using float datetimes can do that isn't a
> pg_dumpall | pg_restore to do a less painful update?
>
> We have several TB of data still using float datetimes and I'm trying
> to figure out how we can move to 10 (currently on 9.6.x) without
> massive amounts of $ in duplicated hardware or downtime.

I ran into an analogous issue with the endianness of PL/Java-defined
datatypes, and ended up devising a procedure [1] that treats the type one
way on output and another way on input. The last step is an UPDATE query
that just assigns the column to itself through a function that is
essentially the identity, but forces the output-input conversion (and
doesn't look like a simple identity function to the query optimizer, which
might otherwise optimize the whole update away).
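
A rough sketch in SQL of what I mean (every name here is invented, and in
your case the round trip would go through the timestamp I/O functions
rather than PL/Java's):

    -- round-trips the value through its text representation;
    -- marked VOLATILE so the planner won't treat it as a no-op
    CREATE FUNCTION rewrite_ts(timestamptz) RETURNS timestamptz
      LANGUAGE sql VOLATILE
      AS 'SELECT $1::text::timestamptz';

    UPDATE mytable SET ts = rewrite_ts(ts);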

While the details of that were specific to PL/Java and byte order,
something similar might work in your case if you can afford some period,
either just before or just after migration, when your normal workload
is blocked, long enough to run such updates on your several TB of data.

Or if that's too big a chunk of downtime, you might be able to add
a column in advance, of some user-defined type the same width as an
integer datetime, populate that in advance, migrate, then do a quick
catalog switcheroo of the column names and types. I've never done that,
so someone else might have comments on its feasibility or the best way
of doing it.
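
If it is feasible at all, I imagine the switcheroo would be direct surgery
on pg_attribute, something like the following (names invented, completely
untested, superuser-only, and the kind of thing to rehearse on a throwaway
copy of the database first):

    BEGIN;
    -- give the shadow column the real timestamp type ...
    UPDATE pg_attribute SET atttypid = 'timestamptz'::regtype
     WHERE attrelid = 'mytable'::regclass AND attname = 'ts_shadow';
    -- ... then swap the column names
    UPDATE pg_attribute SET attname = 'ts_old'
     WHERE attrelid = 'mytable'::regclass AND attname = 'ts';
    UPDATE pg_attribute SET attname = 'ts'
     WHERE attrelid = 'mytable'::regclass AND attname = 'ts_shadow';
    COMMIT;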

> the exact crash, but the datetime values were either too small or too
> large to fit into the integer datetimes field -- I can retry this if

That does seem important to get the details on. If the integer datetime
column can't represent your values (and the too-large or too-small ones
genuinely need to be represented), then some schema change might be necessary.
Or, if the outlying values were sentinel values of some kind (I wonder,
how else would you even have datetimes outside of the int64 range?),
maybe they can be replaced with others that fit.
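
If they are sentinels, finding and replacing them could be as simple as
this (table and column names invented; the bounds are the documented
limits for integer timestamps, and whether 'infinity' is an acceptable
replacement depends on your application):

    -- integer timestamps cover 4714-11-24 BC .. 294276 AD
    UPDATE mytable SET ts = 'infinity'
     WHERE ts > '294276-12-31'::timestamptz;
    UPDATE mytable SET ts = '-infinity'
     WHERE ts < '4714-11-24 BC'::timestamptz;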

-Chap

[1]: https://tada.github.io/pljava/use/byteordermigrate.html
