From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Making jsonb_agg() faster |
Date: | 2025-08-27 01:22:17 |
Message-ID: | 2613D418-67E0-4DD8-BDA6-AB1BB04DB1A2@gmail.com |
Lists: | pgsql-hackers |
>> On Aug 23, 2025, at 03:11, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>>
>> v2-0001 takes care of that, and also adopts your suggestion in [1]
>> about not using two calls of pushJsonbValueScalar where one would do.
>> I also did a bit more micro-optimization in appendKey, appendValue,
>> appendElement to avoid redundant copying, because perf testing showed
>> that appendElement is still a hot-spot for jsonb_agg. Patches 0002
>> and 0003 are unchanged.
>>
>
>
A few more suggestions for pushJsonbValue():
+ /* If an object or array is pushed, recursively push its contents */
+ if (jbval->type == jbvObject)
{
pushJsonbValue(pstate, WJB_BEGIN_OBJECT, NULL);
for (i = 0; i < jbval->val.object.nPairs; i++)
@@ -581,32 +607,29 @@ pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq,
pushJsonbValue(pstate, WJB_KEY, &jbval->val.object.pairs[i].key);
pushJsonbValue(pstate, WJB_VALUE, &jbval->val.object.pairs[i].value);
}
-
- return pushJsonbValue(pstate, WJB_END_OBJECT, NULL);
+ pushJsonbValue(pstate, WJB_END_OBJECT, NULL);
+ return;
}
To push WJB_BEGIN_OBJECT and WJB_END_OBJECT, we can call pushJsonbValueScalar() directly, because once inside pushJsonbValue() these tokens will just hit the (seq != WJB_ELEM && seq != WJB_VALUE) check. Calling pushJsonbValueScalar() directly saves one level of recursion.
- if (jbval && (seq == WJB_ELEM || seq == WJB_VALUE) && jbval->type == jbvArray)
+ if (jbval->type == jbvArray)
{
pushJsonbValue(pstate, WJB_BEGIN_ARRAY, NULL);
for (i = 0; i < jbval->val.array.nElems; i++)
{
pushJsonbValue(pstate, WJB_ELEM, &jbval->val.array.elems[i]);
}
-
- return pushJsonbValue(pstate, WJB_END_ARRAY, NULL);
+ pushJsonbValue(pstate, WJB_END_ARRAY, NULL);
+ return;
}
Same thing for pushing WJB_BEGIN_ARRAY and WJB_END_ARRAY.
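To make the saving concrete, here is a toy model of the idea, not the PostgreSQL code. All names (demo_push, demo_push_scalar, DemoValue, DemoCounters, the DEMO_* tokens) are hypothetical, and it assumes the wrapper only forwards non-element tokens to the scalar routine. It counts how many times the wrapper is entered when the begin/end tokens go through the wrapper versus straight to the scalar routine:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tokens, loosely modeled on WJB_BEGIN_ARRAY / WJB_ELEM / WJB_END_ARRAY. */
typedef enum { DEMO_BEGIN_ARRAY, DEMO_ELEM, DEMO_END_ARRAY } DemoToken;

/* Hypothetical value: either a scalar or an array of values. */
typedef struct DemoValue
{
    int         is_array;
    int         scalar;
    size_t      nElems;
    const struct DemoValue *elems;
} DemoValue;

/* Call counters, used only to make the difference observable. */
typedef struct DemoCounters
{
    int         wrapper_calls;
    int         scalar_calls;
} DemoCounters;

/* Stand-in for the non-recursive routine that actually records a token. */
static void
demo_push_scalar(DemoCounters *c, DemoToken seq, const DemoValue *v)
{
    (void) seq;
    (void) v;
    c->scalar_calls++;
}

/*
 * Stand-in for the recursive wrapper.  With direct = 0 the begin/end tokens
 * re-enter the wrapper; with direct = 1 they go straight to the scalar
 * routine, which is the change suggested above.
 */
static void
demo_push(DemoCounters *c, DemoToken seq, const DemoValue *v, int direct)
{
    size_t      i;

    c->wrapper_calls++;

    /* Non-element tokens are only ever forwarded. */
    if (seq != DEMO_ELEM)
    {
        demo_push_scalar(c, seq, v);
        return;
    }

    assert(v != NULL);
    if (v->is_array)
    {
        if (direct)
            demo_push_scalar(c, DEMO_BEGIN_ARRAY, NULL);
        else
            demo_push(c, DEMO_BEGIN_ARRAY, NULL, direct);

        for (i = 0; i < v->nElems; i++)
            demo_push(c, DEMO_ELEM, &v->elems[i], direct);

        if (direct)
            demo_push_scalar(c, DEMO_END_ARRAY, NULL);
        else
            demo_push(c, DEMO_END_ARRAY, NULL, direct);
        return;
    }

    demo_push_scalar(c, seq, v);
}
```

In this model, each nested array or object saves two wrapper entries, while the number of tokens that reach the scalar routine stays the same.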
And for pushJsonbValueScalar():
- (*pstate)->size = 4;
+ ppstate->size = 4; /* initial guess at array size */
Can we do lazy allocation here? Start with size = 0 and allocate memory only when the first element is pushed. That way, we won’t allocate any memory for empty objects and arrays.
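A minimal sketch of what I mean by lazy allocation, written as standalone C rather than against the patch. DemoArrayState, demo_begin_array and demo_append_elem are hypothetical names, plain realloc() stands in for the backend's memory routines, and the first-allocation size of 4 just mirrors the initial guess in the hunk above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an array-building parse state. */
typedef struct DemoArrayState
{
    int        *elems;      /* NULL until the first element is pushed */
    size_t      nElems;     /* number of elements stored so far */
    size_t      size;       /* allocated capacity; 0 means nothing allocated */
} DemoArrayState;

/* Begin an array without allocating anything. */
static void
demo_begin_array(DemoArrayState *st)
{
    st->elems = NULL;
    st->nElems = 0;
    st->size = 0;
}

/*
 * Append one element.  Memory is allocated on the first push and doubled
 * whenever the array is full.  Returns 0 on success, -1 if out of memory.
 */
static int
demo_append_elem(DemoArrayState *st, int value)
{
    assert(st != NULL);

    if (st->nElems >= st->size)
    {
        size_t      newsize = (st->size == 0) ? 4 : st->size * 2;
        int        *newelems = realloc(st->elems, newsize * sizeof(int));

        if (newelems == NULL)
            return -1;
        st->elems = newelems;
        st->size = newsize;
    }

    st->elems[st->nElems++] = value;
    return 0;
}
```

An empty array never touches the allocator here; the cost is that the existing "array is full" check on each append also has to cover the size = 0 case.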
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/