Re: No easy way to join discussion in existing thread when not subscribed

From:	Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To:	Amir Rohan <amir(dot)rohan(at)mail(dot)com>
Cc:	Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL www <pgsql-www(at)postgresql(dot)org>, magnus(at)hagander(dot)net, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject:	Re: No easy way to join discussion in existing thread when not subscribed
Date:	2015-09-30 06:53:28
Message-ID:	560B86E8.4020600@kaltenbrunner.cc
Lists:	pgsql-www

On 09/30/2015 03:27 AM, Amir Rohan wrote:
> On 09/29/2015 10:51 PM, Stefan Kaltenbrunner wrote:
>> On 09/29/2015 09:34 PM, Amir Rohan wrote:
>>
>> for most accesses to the archives the string for the basic auth reply
>> quotes the "archives" and "password" strings with ' - see
>
> Fixed.

I think you missed at least one spot in the code you added and also at
least one occurrence in existing code.

>
>> we have a number of current issues where data in the archives gets
>> mangled/corrupted we are looking into. We are currently working on some
>> infrastructure to "test" parsing fixes across all the messages in the
>> archives to get a better understanding of what kind of effect a change has.
>> For this specific message I'm curious how you found it though?
>>
>
> I made a prototype before looking at the repo, using
> python's 'mailbox' parser module, and some asserts failed
> when some messages parsed out as lacking Message-ID. I had
> also read the mbox spec in order to write the patch, and
> put the two together.

ah - nice effort!
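
For illustration, the kind of check described in the quoted paragraph can be sketched with Python's standard `mailbox` module. This is not Amir's prototype; the function names, the demo messages and the `example.invalid` addresses are all made up for the example.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def find_missing_message_ids(mbox_path):
    """Return the mailbox keys of messages that have no Message-ID header."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return [key for key, msg in box.items() if not msg.get("Message-ID")]
    finally:
        box.close()


def build_demo_mbox(path):
    """Write a two-message mbox; the second message lacks a Message-ID."""
    box = mailbox.mbox(path)
    try:
        for subject, msgid in (("first", "<1@example.invalid>"), ("second", None)):
            msg = EmailMessage()
            msg["From"] = "someone@example.invalid"
            msg["Subject"] = subject
            if msgid:
                msg["Message-ID"] = msgid
            msg.set_content("body\n")
            box.add(msg)
    finally:
        box.close()  # flushes pending writes before closing


def demo():
    """Build the demo mbox in a temporary directory and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.mbox")
        build_demo_mbox(path)
        return find_missing_message_ids(path)
```

A message whose "From " separator line was mangled will typically show up in such a scan as a message without headers, which is one way a parsing problem in an archive could surface.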

>
>>>> <...>
>>>> Have you done any (approximate) measurements on what the additional
>>>> in-memory overhead in both pg (to build the response) and in django is
>>>> compared to the resulting mbox?
>>>>
>>>>> Amir Wrote:
>>>>> <some napkins and mitigations>
>> My concern mostly stems from operational
>> experience (on the sysadmin team) that some operations on the archives
>> currently are fairly CPU- and memory-intensive, causing issues
>> with availability, and we would not want to add more vectors that can
>> cause that.
>>
>
> You're right to be concerned, I raised the issue myself to begin with.
> We can solve any particular problem, but how to optimize depends too
> much on particulars I don't have.
>
> If you have both cpu and memory shortage, we could trade storage.
> You already serve monthly mboxes, having per thread mboxes which are
> updated in batch (say hourly) could be manageable, and that code
> is practically written already. Serving static is as cheap as it gets
> on both cpu and memory.

yeah that is what I was thinking - though I don't think we want hourly.
We went a long way to actually get the current system to be "almost
instant" in terms of having the archives in sync with the lists (at least
for the basic stuff). What I was thinking is doing the mbox creation
during the import - we already serialize the process (on the MTA/LDA
side) there to have only one message imported at a time, so there is
way less risk of overwhelming the box.
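
A minimal sketch of that idea, assuming one mbox file per thread on disk and exactly one importer running at a time (as described above for the MTA/LDA side). The directory layout, the `thread_id` naming and the function names are hypothetical, not taken from the archives code.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def append_to_thread_mbox(archive_dir, thread_id, msg):
    """Append one message to <archive_dir>/<thread_id>.mbox and return the path.

    Assumes the caller is the only writer, so no file locking is done here.
    thread_id is assumed to be a safe file name.
    """
    path = os.path.join(archive_dir, f"{thread_id}.mbox")
    box = mailbox.mbox(path)  # the file is created on first use
    try:
        box.add(msg)  # appended at the end of the file
    finally:
        box.close()  # flushes and closes
    return path


def demo():
    """Import three messages into two threads and count messages per file."""
    with tempfile.TemporaryDirectory() as archive_dir:
        for thread_id, subject in (("t1", "a"), ("t1", "b"), ("t2", "c")):
            msg = EmailMessage()
            msg["Subject"] = subject
            msg["Message-ID"] = f"<{subject}@example.invalid>"
            msg.set_content("body\n")
            append_to_thread_mbox(archive_dir, thread_id, msg)
        counts = {}
        for name in sorted(os.listdir(archive_dir)):
            box = mailbox.mbox(os.path.join(archive_dir, name), create=False)
            counts[name] = len(box)
            box.close()
        return counts
```

The resulting files could then be served statically, so a download request would not need to build the mbox from the database.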

>
> But for now, see attached patch, which adds a tweakable for setting a
> cap on the max size of the response. It still gets everything
> from the database at once, so it may not be of much help except
> perhaps as a metric for you to easily monitor.
>
> There's also an EJECT button that turns all thread mbox requests into
> 403, so you can just throw this in production and flip the switch
> if a problem appears. Also fixes the quoting in the message.

thanks for the updated patch - will take a look and see whether I can
find out what the worst case is in the archives later today.
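
For readers without the patch at hand, the two safeguards described in the quoted text (a size cap and a switch that turns thread mbox requests into 403) could look roughly like the sketch below. It is not the attached patch: the setting names, the function and the 503 status used for an oversized thread are assumptions made for the example; only the 403 for the switch comes from the message above.

```python
THREAD_MBOX_ENABLED = True           # hypothetical kill switch
THREAD_MBOX_MAX_BYTES = 1024 * 1024  # hypothetical cap on the response size


def thread_mbox_response(raw_messages,
                         enabled=THREAD_MBOX_ENABLED,
                         max_bytes=THREAD_MBOX_MAX_BYTES):
    """Return (status, body) for a thread mbox request.

    raw_messages is an iterable of already serialized messages (bytes).
    """
    if not enabled:
        return 403, b""  # switch flipped: refuse all thread mbox requests
    total = 0
    parts = []
    for raw in raw_messages:
        total += len(raw)
        if total > max_bytes:
            return 503, b""  # assumed status for "thread too large"
        parts.append(raw)
    return 200, b"".join(parts)
```

As noted in the quoted text, a cap like this does not lower memory use if all messages were already fetched from the database; it mainly bounds what is sent and gives a number to monitor.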

Stefan

In response to

Re: No easy way to join discussion in existing thread when not subscribed at 2015-09-30 01:27:45 from Amir Rohan
