Re: ALTER TABLE ADD COLUMN fast default

From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE ADD COLUMN fast default
Date: 2018-02-27 03:59:44
Message-ID: CAA8=A7_pNEZoiX3iSbqKC7QeV0N8Un32dEYH633cxH_t3i+NXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 21, 2018 at 7:48 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Hi,

Comments interspersed.

>
> On 2018-02-20 17:03:07 +1030, Andrew Dunstan wrote:
>> + <row>
>> + <entry><structfield>attmissingval</structfield></entry>
>> + <entry><type>bytea</type></entry>
>> + <entry></entry>
>> + <entry>
>> + This column has a binary representation of the value used when the
>> + column is entirely missing from the row, as happens when the column is
>> + added after the row is created. The value is only used when
>> + <structname>atthasmissing</structname> is true.
>> + </entry>
>> + </row>
>
> Hm. Why is this a bytea, and why is that a good idea? In pg_statistics,
> which has to deal with something similar-ish, we deal with the ambiguity
> of the storage by storing anyarrays, which then can display the contents
> properly thanks to the embedded oid. Now it'd be a bit sad to store a
> one element array in the table. But that seems better than storing a
> bytea that's typeless and can't be viewed reasonably. The best solution
> would be to create a variadic any type that can actually be stored, but
> I see why you'd not want to go there necessarily.

The one-element array idea is a slight hack, but it's not that bad,
and I agree a good deal better than using a bytea. I've made that
change.

>
> But also, a bit higher level, is it a good idea to store this
> information directly into pg_attribute? That table is already one of the
> hotter system catalog tables, and very frequently the largest. If we
> suddenly end up storing a lot of default values in there, possibly ones
> that aren't ever actually used because all the default values are long
> since actually stored in the table, we'll spend more time doing cache
> lookups.
>
> I'm somewhat tempted to say that the default data should be in a
> separate catalog table. Or possibly use pg_attrdef.
>

This was debated quite some time ago. See for example
<https://www.postgresql.org/message-id/22999.1475770835%40sss.pgh.pa.us>
The consensus then seemed to be to store them in pg_attribute. If not,
I think we'd need a new catalog - pg_attrdef doesn't seem right. They
would have to go on separate rows, and we'd be storing values rather
than expressions, so it would get somewhat ugly.

>>
>> +/*
>> + * Return the missing value of an attribute, or NULL if there isn't one.
>> + */
>> +static Datum
>> +getmissingattr(TupleDesc tupleDesc,
>> + int attnum, bool *isnull)
>> +{
>> + int missingnum;
>> + Form_pg_attribute att;
>> + AttrMissing *attrmiss;
>> +
>> + Assert(attnum <= tupleDesc->natts);
>> + Assert(attnum > 0);
>> +
>> + att = TupleDescAttr(tupleDesc, attnum - 1);
>> +
>> + if (att->atthasmissing)
>> + {
>> + Assert(tupleDesc->constr);
>> + Assert(tupleDesc->constr->num_missing > 0);
>> +
>> + attrmiss = tupleDesc->constr->missing;
>> +
>> + /*
>> + * Look through the tupledesc's attribute missing values
>> + * for the one that corresponds to this attribute.
>> + */
>> + for (missingnum = tupleDesc->constr->num_missing - 1;
>> + missingnum >= 0; missingnum--)
>> + {
>> + if (attrmiss[missingnum].amnum == attnum)
>> + {
>> + if (attrmiss[missingnum].ammissingNull)
>> + {
>> + *isnull = true;
>> + return (Datum) 0;
>> + }
>> + else
>> + {
>> + *isnull = false;
>> + return attrmiss[missingnum].ammissing;
>> + }
>> + }
>> + }
>
> I think requiring this is a bad idea. If, and only if, there's default
> attributes with missing values we should build an array that can be
> directly indexed by attribute number, accessing the missing value. What
> you're doing here is essentially O(n^2) afaict, with n being the number
> of columns missing values.
>
>

Done

>> +/*
>> + * Fill in missing values for a TupleTableSlot
>> + */
>> +static void
>> +slot_getmissingattrs(TupleTableSlot *slot, int startAttNum, int lastAttNum)
>> +{
>> + AttrMissing *attrmiss;
>> + int missingnum;
>> + int missattnum;
>> + int i;
>> +
>> + /* initialize all the missing attributes to null */
>> + for (missattnum = lastAttNum - 1; missattnum >= startAttNum; missattnum--)
>> + {
>> + slot->tts_values[missattnum] = PointerGetDatum(NULL);
>> + slot->tts_isnull[missattnum] = true;
>> + }
>
> Any reason not to memset this in one (well two) go with memset? Also,
> how come you're using PointerGetDatum()? Normally we just use (Datum) 0,
> imo it's confusing to use PointerGetDatum(), as it implies things that
> are quite possibly not the case.
>
>

rewritten.

>> +/*
>> + * Fill in one attribute, either a data value or a bit in the null bitmask
>> + */
>
> Imo this isn't sufficient documentation to explain what this function
> does. Even if you just add "per-attribute helper for heap_fill_tuple
> and other routines building tuples" or such, I think it'd be clearer.

Changed that way.

>
>> +static inline void
>> +fill_val(Form_pg_attribute att,
>> + bits8 **bit,
>> + int *bitmask,
>> + char **dataP,
>> + uint16 *infomask,
>> + Datum datum,
>> + bool isnull)
>> +{
>> + Size data_length;
>> + char *data = *dataP;
>> +
>> + /*
>> + * If we're building a null bitmap, set the appropriate bit for the
>> + * current column value here.
>> + */
>> + if (bit != NULL)
>> + {
>> + if (*bitmask != HIGHBIT)
>> + *bitmask <<= 1;
>> + else
>> + {
>> + *bit += 1;
>> + **bit = 0x0;
>> + *bitmask = 1;
>> + }
>> +
>> + if (isnull)
>> + {
>> + *infomask |= HEAP_HASNULL;
>> + return;
>> + }
>> +
>> + **bit |= *bitmask;
>> + }
>> +
>> + Assert(att);
>
> Why isn't this asserted at the beginning?
>
>

I don't think it's worth very much anyway, so I just removed it.

>
>> @@ -293,10 +408,18 @@ heap_fill_tuple(TupleDesc tupleDesc,
>> * ----------------
>> */
>> bool
>> -heap_attisnull(HeapTuple tup, int attnum)
>> +heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
>> {
>> + /*
>> + * We allow a NULL tupledesc for relations not expected to have
>> + * missing values, such as catalog relations and indexes.
>> + */
>> + Assert(!tupleDesc || attnum <= tupleDesc->natts);
>
> This seems fairly likely to hide bugs.
>

How? There are plenty of cases where we simply expect quite reasonably
that there won't be any missing attributes, and we don't need a
tupleDesc at all in such cases.

>
>> if (attnum > (int) HeapTupleHeaderGetNatts(tup->t_data))
>> - return true;
>> + {
>> + return !(tupleDesc &&
>> + TupleDescAttr(tupleDesc, attnum - 1)->atthasmissing);
>
> Brrr.
>
>

rewritten.

>> +/*
>> + * Expand a tuple with missing values, or NULLs. The source tuple must have
>> + * less attributes than the required number. Only one of targetHeapTuple
>> + * and targetMinimalTuple may be supplied. The other argument must be NULL.
>> + */
>
> I can't quite make head or tails of this comment "with missing values, or NULLs"?
>
>

Rewritten.

>> +static void
>> +expand_tuple(HeapTuple *targetHeapTuple,
>> + MinimalTuple *targetMinimalTuple,
>> + HeapTuple sourceTuple,
>> + TupleDesc tupleDesc)
>> +{
>> + AttrMissing *attrmiss = NULL;
>> + int attnum;
>> + int missingnum;
>> + int firstmissingnum = 0;
>> + int num_missing = 0;
>> + bool hasNulls = HeapTupleHasNulls(sourceTuple);
>> + HeapTupleHeader targetTHeader;
>> + HeapTupleHeader sourceTHeader = sourceTuple->t_data;
>> + int sourceNatts = HeapTupleHeaderGetNatts(sourceTHeader);
>> + int natts = tupleDesc->natts;
>> + int sourceIndicatorLen;
>> + int targetIndicatorLen;
>> + Size sourceDataLen = sourceTuple->t_len - sourceTHeader->t_hoff;
>> + Size targetDataLen;
>> + Size len;
>> + int hoff;
>> + bits8 *nullBits = NULL;
>> + int bitMask = 0;
>> + char *targetData;
>> + uint16 *infoMask;
>> +
>> + Assert((targetHeapTuple && !targetMinimalTuple)
>> + || (!targetHeapTuple && targetMinimalTuple));
>> +
>> + Assert(sourceNatts < natts);
>> +
>> + sourceIndicatorLen = (hasNulls ? BITMAPLEN(sourceNatts) : 0);
>
> Maybe I'm just grumpy right now (did response work on 2016 taxes just
> now, brrrrr), but I don't find "indicatorLen" a descriptive term.
>
>

changed

>> + targetDataLen = sourceDataLen;
>> +
>> + if (tupleDesc->constr &&
>> + tupleDesc->constr->num_missing)
>> + {
>> + /*
>> + * If there are missing values we want to put them into the tuple.
>> + * Before that we have to compute the extra length for the values
>> + * array and the variable length data.
>> + */
>> + num_missing = tupleDesc->constr->num_missing;
>> + attrmiss = tupleDesc->constr->missing;
>
> You're accessing a hell of a lot of expensive stuff outside outside of
> this if block. E.g. HeapTupleHeaderGetNatts specifically is the prime
> source of cache misses in a lot of workloads. Now a smart compiler might
> be able to optimize this away, but...
>
>
>> + /*
>> + * Find the first item in attrmiss for which we don't have a value in
>> + * the source (i.e. where its amnum is > sourceNatts). We can ignore
>> + * all the missing entries before that.
>> + */
>> + for (firstmissingnum = 0;
>> + firstmissingnum < num_missing
>> + && attrmiss[firstmissingnum].amnum <= sourceNatts;
>> + firstmissingnum++)
>> + { /* empty */
>> + }
>> +
>> + /*
>> + * If there are no more missing values everything else must be
>> + * NULL
>> + */
>> + if (firstmissingnum >= num_missing)
>> + {
>> + hasNulls = true;
>> + }
>> +
>> + /*
>> + * Now walk the missing attributes one by one. If there is a
>> + * missing value make space for it. Otherwise, it's going to be NULL.
>> + */
>> + for (attnum = sourceNatts + 1;
>> + attnum <= natts;
>> + attnum++)
>> + {
>> + Form_pg_attribute att = TupleDescAttr(tupleDesc, attnum - 1);
>> +
>> + for (missingnum = firstmissingnum; missingnum < num_missing; missingnum++)
>> + {
>> +
>> + /*
>> + * If there is a missing value for this attribute calculate
>> + * the required space if it's not NULL.
>> + */
>> + if (attrmiss[missingnum].amnum == attnum)
>> + {
>> + if (attrmiss[missingnum].ammissingNull)
>> + hasNulls = true;
>> + else
>> + {
>> + targetDataLen = att_align_datum(targetDataLen,
>> + att->attalign,
>> + att->attlen,
>> + attrmiss[missingnum].ammissing);
>> +
>> + targetDataLen = att_addlength_pointer(targetDataLen,
>> + att->attlen,
>> + attrmiss[missingnum].ammissing);
>> + }
>> + break;
>> + }
>> + }
>> + if (missingnum > num_missing)
>> + {
>> + /* there is no missing value for this attribute */
>> + hasNulls = true;
>> + }
>> + }
>> + } /* end if have missing values */
>> + else
>> + {
>> + /*
>> + * If there are no missing values at all then NULLS must be allowed,
>> + * since some of the attributes are known to be absent.
>> + */
>> + hasNulls = true;
>
> Why are we doing anything at all in this branch? ISTM we're reforming
> the tuple for absolutely no gain here, just making it larger by
> including additionall null bits?
>
>
>> + /*
>> + * Allocate and zero the space needed. Note that the tuple body and
>> + * HeapTupleData management structure are allocated in one chunk.
>> + */
>> + if (targetHeapTuple)
>> + {
>> + len += offsetof(HeapTupleHeaderData, t_bits);
>> + hoff = len = MAXALIGN(len); /* align user data safely */
>> + len += targetDataLen;
>
> So len isn't the length of the tuple, but the null bitmap length? That
> seems, uh, non-obvious.
>
>
>> + *targetHeapTuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
>> + (*targetHeapTuple)->t_data
>> + = targetTHeader
>> + = (HeapTupleHeader) ((char *) *targetHeapTuple + HEAPTUPLESIZE);
>
> Yuck.
>
>
>> + (*targetHeapTuple)->t_len = len;
>> + (*targetHeapTuple)->t_tableOid = sourceTuple->t_tableOid;
>> + ItemPointerSetInvalid(&((*targetHeapTuple)->t_self));
>
> Seems like a lot of this code would be more readable by storing
> *targetHeapTuple in a local var.
>
>
>> + targetTHeader->t_infomask = sourceTHeader->t_infomask;
>> + targetTHeader->t_hoff = hoff;
>> + HeapTupleHeaderSetNatts(targetTHeader, natts);
>> + HeapTupleHeaderSetDatumLength(targetTHeader, len);
>> + HeapTupleHeaderSetTypeId(targetTHeader, tupleDesc->tdtypeid);
>> + HeapTupleHeaderSetTypMod(targetTHeader, tupleDesc->tdtypmod);
>> + /* We also make sure that t_ctid is invalid unless explicitly set */
>> + ItemPointerSetInvalid(&(targetTHeader->t_ctid));
>
> This seems fairly expensive, why aren't we just memcpy'ing the old
> header into the new one?
>
>
>
>
>> + if (targetIndicatorLen > 0)
>> + {
>> + if (sourceIndicatorLen > 0)
>> + {
>> + /* if bitmap pre-existed copy in - all is set */
>
> can't parse.
>
>> + memcpy(nullBits,
>> + ((char *) sourceTHeader)
>> + + offsetof(HeapTupleHeaderData, t_bits),
>> + sourceIndicatorLen);
>> + nullBits += sourceIndicatorLen - 1;
>> + }
>> + else
>> + {
>> + sourceIndicatorLen = BITMAPLEN(sourceNatts);
>> + /* Set NOT NULL for all existing attributes */
>> + memset(nullBits, 0xff, sourceIndicatorLen);
>> +
>> + nullBits += sourceIndicatorLen - 1;
>> +
>> + if (sourceNatts & 0x07)
>> + {
>> + /* build the mask (inverted!) */
>> + bitMask = 0xff << (sourceNatts & 0x07);
>> + /* Voila */
>> + *nullBits = ~bitMask;
>> + }
>
> It's late, I'm tired. But how can this be correct? Also, this code is
> massively under-commented.
>
>

I'm going to revisit expand_tuple from scratch.

>
>> +
>> +/*
>> + * Fill in the missing values for an ordinary HeapTuple
>> + */
>> +MinimalTuple
>> +minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleDesc)
>> +{
>> + MinimalTuple minimalTuple;
>> +
>> + expand_tuple(NULL, &minimalTuple, sourceTuple, tupleDesc);
>> + return minimalTuple;
>> +}
>> +
>> +/*
>> + * Fill in the missing values for a minimal HeapTuple
>> + */
>> +HeapTuple
>> +heap_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleDesc)
>> +{
>> + HeapTuple heapTuple;
>> +
>> + expand_tuple(&heapTuple, NULL, sourceTuple, tupleDesc);
>> + return heapTuple;
>> +}
>
> Comments appear to be swapped.
>
>

Fixed.

>
>> /* ----------------
>> * heap_copy_tuple_as_datum
>> *
>> @@ -1012,13 +1422,10 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
>>
>> /*
>> * If tuple doesn't have all the atts indicated by tupleDesc, read the
>> - * rest as null
>> + * rest as null, or a missing value if there is one.
>> */
>
> "a missing value" isn't there quite possibly multiple ones?
>
>

Rewritten

>> /*
>> @@ -1192,8 +1599,7 @@ slot_getattr(TupleTableSlot *slot, int attnum, bool *isnull)
>> tup = tuple->t_data;
>> if (attnum > HeapTupleHeaderGetNatts(tup))
>> {
>> - *isnull = true;
>> - return (Datum) 0;
>> + return getmissingattr(slot->tts_tupleDescriptor, attnum, isnull);
>> }
>
> This turned a ~4 instruction case into considerably more. I've recently
> removed one of the major slot_getattr sources (hashagg comparisons), but
> we still call this super frequently for hashagg's hash computations. I
> suggest micro-benchmarking that case.
>
>

I'll look at it. I have rewritten getmissingattr along the line you've
suggested, so it should be more efficient, and most compilers will
inline it.

>
>> Oid
>> StoreAttrDefault(Relation rel, AttrNumber attnum,
>> - Node *expr, bool is_internal)
>> + Node *expr, bool is_internal, bool add_column_mode)
>> {
>> char *adbin;
>> char *adsrc;
>> @@ -1939,9 +1945,20 @@ StoreAttrDefault(Relation rel, AttrNumber attnum,
>> Relation attrrel;
>> HeapTuple atttup;
>> Form_pg_attribute attStruct;
>> + Form_pg_attribute defAttStruct;
>> Oid attrdefOid;
>> ObjectAddress colobject,
>> defobject;
>> + ExprState *exprState;
>> + Expr *expr2 = (Expr *) expr;
>> + EState *estate = NULL;
>> + ExprContext *econtext;
>> + char *missingBuf = NULL;
>> + Datum valuesAtt[Natts_pg_attribute];
>> + bool nullsAtt[Natts_pg_attribute];
>> + bool replacesAtt[Natts_pg_attribute];
>> + Datum missingval = (Datum) 0;
>> + bool missingIsNull = true;
>>
>> /*
>> * Flatten expression to string form for storage.
>> @@ -1956,6 +1973,44 @@ StoreAttrDefault(Relation rel, AttrNumber attnum,
>> RelationGetRelid(rel)),
>> false, false);
>>
>> + /*
>> + * Compute the missing value
>> + */
>> + expr2 = expression_planner(expr2);
>> +
>> + exprState = ExecInitExpr(expr2, NULL);
>
> This should instead use ExecPrepareExpr()
>

Fixed

>
>> + if (add_column_mode)
>> + {
>> + missingval = ExecEvalExpr(exprState, econtext,
>> + &missingIsNull);
>> +
>> + defAttStruct = TupleDescAttr(rel->rd_att, attnum - 1);
>> +
>> + if (missingIsNull)
>> + {
>> + missingval = PointerGetDatum(NULL);
>
> (Datum) 0;
>
>
>> + }
>> + else if (defAttStruct->attbyval)
>> + {
>> + missingBuf = palloc(VARHDRSZ + sizeof(Datum));
>> + memcpy(VARDATA(missingBuf), &missingval, sizeof(Datum));
>> + SET_VARSIZE(missingBuf, VARHDRSZ + sizeof(Datum));
>> + missingval = PointerGetDatum(missingBuf);
>> + }
>> + else if (defAttStruct->attlen >= 0)
>> + {
>> + missingBuf = palloc(VARHDRSZ + defAttStruct->attlen);
>> + memcpy(VARDATA(missingBuf), DatumGetPointer(missingval),
>> + defAttStruct->attlen);
>> + SET_VARSIZE(missingBuf,
>> + VARHDRSZ + defAttStruct->attlen);
>> + missingval = PointerGetDatum(missingBuf);
>> + }
>
> What if we get an extended datum back in the varlena case? Even if we
> wouldn't change the storage from bytea to something else, you really
> would need to force detoasting here.
>

This is now rewritten.

>
>> + }
>> +
>> /*
>> * Make the pg_attrdef entry.
>> */
>> @@ -1996,7 +2051,22 @@ StoreAttrDefault(Relation rel, AttrNumber attnum,
>> attStruct = (Form_pg_attribute) GETSTRUCT(atttup);
>> if (!attStruct->atthasdef)
>> {
>> - attStruct->atthasdef = true;
>> + MemSet(valuesAtt, 0, sizeof(valuesAtt));
>> + MemSet(nullsAtt, false, sizeof(nullsAtt));
>> + MemSet(replacesAtt, false, sizeof(replacesAtt));
>> + valuesAtt[Anum_pg_attribute_atthasdef - 1] = true;
>> + replacesAtt[Anum_pg_attribute_atthasdef - 1] = true;
>> + if (add_column_mode)
>> + {
>> + valuesAtt[Anum_pg_attribute_atthasmissing - 1] = true;
>> + replacesAtt[Anum_pg_attribute_atthasmissing - 1] = true;
>> + valuesAtt[Anum_pg_attribute_attmissingval - 1] = missingval;
>> + replacesAtt[Anum_pg_attribute_attmissingval - 1] = true;
>> + nullsAtt[Anum_pg_attribute_attmissingval - 1] = missingIsNull;
>> + }
>> + atttup = heap_modify_tuple(atttup, RelationGetDescr(attrrel),
>> + valuesAtt, nullsAtt, replacesAtt);
>> +
>
> This seems weird. We've above formed a tuple, now we just modify it
> again?
>
>

I'll take a look at making it more straightforward. But It's not as
simple as the atthasdef case, since attmissing is part of the variable
portion of pg_attribute.

>> @@ -2296,7 +2377,14 @@ AddRelationNewConstraints(Relation rel,
>> (IsA(expr, Const) &&((Const *) expr)->constisnull))
>> continue;
>>
>> - defOid = StoreAttrDefault(rel, colDef->attnum, expr, is_internal);
>> + /* If the default is volatile we cannot use a missing value */
>> + if (contain_volatile_functions((Node *) expr))
>> + {
>> + colDef->missingMode = false;
>> + }
>> + defOid = StoreAttrDefault(rel, colDef->attnum, expr, is_internal,
>> + colDef->missingMode);
>
> Doesn't this have to be immutable rather than just stable? I mean just
> storing the value of now() doesn't seem like it'd do the trick?
>
>

I don't see why. We've stated explicitly in the docs that the existing
rows get the value of the default expression as evaluated at the time
the column is created. I don't think there is any need to restrict
that to an immutable expression.

>
>> @@ -663,7 +671,23 @@ ExecFetchSlotTuple(TupleTableSlot *slot)
>> * If we have a regular physical tuple then just return it.
>> */
>> if (TTS_HAS_PHYSICAL_TUPLE(slot))
>> - return slot->tts_tuple;
>> + {
>> + if (HeapTupleHeaderGetNatts(slot->tts_tuple->t_data) <
>> + slot->tts_tupleDescriptor->natts)
>> + {
>> + MemoryContext oldContext = MemoryContextSwitchTo(slot->tts_mcxt);
>> +
>> + slot->tts_tuple = heap_expand_tuple(slot->tts_tuple,
>> + slot->tts_tupleDescriptor);
>> + slot->tts_shouldFree = true;
>> + MemoryContextSwitchTo(oldContext);
>> + return slot->tts_tuple;
>> + }
>> + else
>> + {
>> + return slot->tts_tuple;
>> + }
>> + }
>
> I'm extremely dubious about this. HeapTupleHeaderGetNatts() commonly
> causes cachemisses and therefore causes slowdowns when unnecessarily
> done. Additionally here we're expanding the tuple to include columns
> that aren't even likely to be accessed. This is a fairly hot codepath,
> and made hugely more expensive by this.

Will look further.

>
>
>> @@ -508,6 +508,8 @@ tlist_matches_tupdesc(PlanState *ps, List *tlist, Index varno, TupleDesc tupdesc
>> return false; /* out of order */
>> if (att_tup->attisdropped)
>> return false; /* table contains dropped columns */
>> + if (att_tup->atthasmissing)
>> + return false; /* table contains cols with missing values */
>
> Hm, so incrementally building a table definitions with defaults, even if
> there aren't ever any rows, or if there's a subsequent table rewrite,
> will cause slowdowns. If so, shouldn't a rewrite remove all
> atthasmissing values?
>
>

Yes, I think so. Working on it. OTOH, I'm somewhat mystified by having
had to make this change. Still, it looks like dropping a column has
the same effect.

>
>
>> @@ -493,7 +494,10 @@ RelationBuildTupleDesc(Relation relation)
>
>> @@ -561,10 +568,73 @@ RelationBuildTupleDesc(Relation relation)
>> MemoryContextAllocZero(CacheMemoryContext,
>> relation->rd_rel->relnatts *
>> sizeof(AttrDefault));
>> - attrdef[ndef].adnum = attp->attnum;
>> + attrdef[ndef].adnum = attnum;
>> attrdef[ndef].adbin = NULL;
>> +
>> ndef++;
>> }
>> +
>> + /* Likewise for a missing value */
>> + if (attp->atthasmissing)
>> + {
>
> As I've written somewhere above, I don't think it's a good idea to do
> this preemptively. Tupledescs are very commonly copied, and the
> missing attributes are quite likely never used. IOW, we should just
> remember the attmissingattr value and fill in the corresponding
> attmissingval on-demand.
>
>

Defaults are frequently not used either, yet this function fills those
in regardless. I don't see why we need to treat missing values
differently.

>> + Datum missingval;
>> + bool missingNull;
>> +
>> + if (attrmiss == NULL)
>> + attrmiss = (AttrMissing *)
>> + MemoryContextAllocZero(CacheMemoryContext,
>> + relation->rd_rel->relnatts *
>> + sizeof(AttrMissing));
>> +
>> + /* Do we have a missing value? */
>> + missingval = SysCacheGetAttr(ATTNUM, pg_attribute_tuple,
>> + Anum_pg_attribute_attmissingval,
>> + &missingNull);
>
> Shouldn't this be a normal heap_getattr rather than a SysCacheGetAttr
> access?
>
> * SysCacheGetAttr
> *
> * Given a tuple previously fetched by SearchSysCache(),
> * extract a specific attribute.
>
>

Fixed.

>> @@ -620,7 +691,19 @@ RelationBuildTupleDesc(Relation relation)
>> AttrDefaultFetch(relation);
>> }
>> else
>> + {
>> constr->num_defval = 0;
>> + }
>> +
>> + if (nmissing > 0)
>> + {
>> + if (nmissing < relation->rd_rel->relnatts)
>> + constr->missing = (AttrMissing *)
>> + repalloc(attrmiss, nmissing * sizeof(AttrMissing));
>> + else
>> + constr->missing = attrmiss;
>> + }
>> + constr->num_missing = nmissing;
>
> Am I understand correctly that you're *downsizing* the allocation with
> repalloc here? That seems likely to commonly lead to some wastage...
>

This code is gone. But note the similar code nearby that is already there.

>> @@ -4059,10 +4142,6 @@ AttrDefaultFetch(Relation relation)
>>
>> systable_endscan(adscan);
>> heap_close(adrel, AccessShareLock);
>> -
>> - if (found != ndef)
>> - elog(WARNING, "%d attrdef record(s) missing for rel %s",
>> - ndef - found, RelationGetRelationName(relation));
>> }
>
> Hm, it's not obvious why this is a good thing?
>
>
>
>> diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
>> index 415efbab97..ed30f98e33 100644
>> --- a/src/include/access/tupdesc.h
>> +++ b/src/include/access/tupdesc.h
>> @@ -14,6 +14,7 @@
>> #ifndef TUPDESC_H
>> #define TUPDESC_H
>>
>> +#include "postgres.h"
>> #include "access/attnum.h"
>
> postgres.h should never be included in headers, why was it added here?
>
>

Fixed.

> This doesn't seem ready yet.
>

I hope we're a lot closer now.

I'm working on rejigging expand_tuple and on clearing the missing
attributes on a table rewrite.

Attached is the current state of things, with the fixes indicated
above, as well as some things pointed out by David Rowley.

None of this seems to have had much effect on performance.

Here's what oprofile reports from a run of pgbench with
create-alter.sql and the query "select sum(c1000) from t":

Profiling through timer interrupt
samples % image name symbol name
22584 28.5982 postgres ExecInterpExpr
11950 15.1323 postgres slot_getmissingattrs
5471 6.9279 postgres AllocSetAlloc
3018 3.8217 postgres TupleDescInitEntry
2042 2.5858 postgres MemoryContextAllocZeroAligned
1807 2.2882 postgres SearchCatCache1
1638 2.0742 postgres expression_tree_walker

That's very different from what we see on the master branch. The time
spent in slot_getmissingattrs is perhaps not unexpected, but I don't
(at least yet) understand why we're getting so much time spent in
ExecInterpExpr, which doesn't show up at all when the same benchmark
is run on master.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
fast_default-v12.patch application/octet-stream 101.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tsunakawa, Takayuki 2018-02-27 05:15:29 RE: [bug fix] Cascaded standby cannot start after a clean shutdown
Previous Message Tatsuo Ishii 2018-02-27 03:27:29 Re: TODO item for broken \s with libedit seems fixed