From: Cory Nemelka <cnemelka(at)gmail(dot)com>
To: Geoff Winkless <pgsqladmin(at)geoff(dot)dj>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Processing very large TEXT columns (300MB+) using C/libpq
Date: 2017-10-20 15:54:51
Message-ID: CAMe5Gn0T7OtneYReFCvpSqhaTSGvO0Ujf+RjKVX_eeaRRgEPQg@mail.gmail.com
Lists: pgsql-admin
I'll take out all the code that isn't directly related to reading the data
and see if that helps. That was the next step I intended anyway.
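
Stripped down to just the read, a minimal sketch of what I'm testing looks
like this (connection string, table, and column names are placeholders):

    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        /* connection string is a placeholder */
        PGconn *conn = PQconnectdb("dbname=test");
        if (PQstatus(conn) != CONNECTION_OK) {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        /* table and column names are placeholders */
        PGresult *res = PQexec(conn, "SELECT bigcol FROM bigtable WHERE id = 1");
        if (PQresultStatus(res) != PGRES_TUPLES_OK) {
            fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
            PQclear(res);
            PQfinish(conn);
            return 1;
        }

        /* PQgetvalue returns a pointer into the result set; PQgetlength
         * gives the value's size up front, so the loop never has to call
         * strlen() on a 300MB string. */
        const char *val = PQgetvalue(res, 0, 0);
        int len = PQgetlength(res, 0, 0);

        long count = 0;
        for (int i = 0; i < len; i++)
            if (val[i] == 'x')      /* stand-in for real per-character work */
                count++;

        printf("length=%d, count=%ld\n", len, count);
        PQclear(res);
        PQfinish(conn);
        return 0;
    }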
Thank you for the reply.
--cnemelka
On Fri, Oct 20, 2017 at 9:43 AM, Geoff Winkless <pgsqladmin(at)geoff(dot)dj> wrote:
> It's probably worth removing the iterating code Just In Case.
>
> Apologies for egg-suck-education, but I assume you're not doing something
> silly like
>
> for (i=0; i < strlen(bigtextstring); i++) {
> ....
> }
>
> I know it sounds stupid, but you'd be amazed how often that crops up.
> For small strings it doesn't matter, but for large strings it's
> catastrophic: strlen() rescans the entire string on every iteration,
> so the loop becomes O(n^2).
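>
> The usual fix is to compute the length once before the loop (or, with
> libpq, to use PQgetlength(), which returns the value's length without
> scanning for a NUL). A minimal sketch:
>
> size_t len = strlen(bigtextstring); /* computed once, not per iteration */
> for (size_t i = 0; i < len; i++) {
> ....
> }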
>
> Geoff
>
> On 20 October 2017 at 16:16, Cory Nemelka <cnemelka(at)gmail(dot)com> wrote:
>
>> All I am doing is iterating through the characters, so I know it isn't
>> my code.
>>
>> --cnemelka
>>
>> On Fri, Oct 20, 2017 at 9:14 AM, Cory Nemelka <cnemelka(at)gmail(dot)com> wrote:
>>
>>> Yes, but I should be able to read them much faster. The psql client can
>>> display an 11MB column in a little over a minute, while in C, using the
>>> libpq library, it takes over an hour.
>>>
>>> Anyone have experience with the same issue who can help me resolve it?
>>>
>>> --cnemelka
>>>
>>> On Thu, Oct 19, 2017 at 5:20 PM, Aldo Sarmiento <aldo(at)bigpurpledot(dot)com>
>>> wrote:
>>>
>>>> I believe large columns get stored out of line in a TOAST table. Max
>>>> page size is 8k, so a value that size will span many chunks per row
>>>> that need to be reassembled:
>>>> https://www.postgresql.org/docs/9.5/static/storage-toast.html
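>>>>
>>>> One way to gauge how much of this is TOAST overhead is to compare the
>>>> stored (possibly compressed) size with the logical size; a sketch,
>>>> with placeholder table and column names:
>>>>
>>>> PGresult *r = PQexec(conn,
>>>>     "SELECT pg_column_size(bigcol), octet_length(bigcol) "
>>>>     "  FROM bigtable WHERE id = 1");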
>>>>
>>>> *Aldo Sarmiento*
>>>> President & CTO
>>>>
>>>>
>>>> 8687 Research Dr, Irvine, CA 92618
>>>> *O*: (949) 223-0900 - *F: *(949) 727-4265
>>>> aldo(at)bigpurpledot(dot)com | www.bigpurpledot.com
>>>>
>>>> On Thu, Oct 19, 2017 at 2:03 PM, Cory Nemelka <cnemelka(at)gmail(dot)com>
>>>> wrote:
>>>>
>>>>> I have been getting very poor performance using libpq to process very
>>>>> large TEXT columns (300MB+). I suspect it is I/O related but can't be sure.
>>>>>
>>>>> Has anyone had experience with the same issue who can help me resolve it?
>>>>>
>>>>> --cnemelka
>>>>>
>>>>
>>>>
>>>
>>
>