Re: BUG #16363: Memory increases for each table accessed until connection is closed

From: Eduardo Barreto Alexandre <sezdocs(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16363: Memory increases for each table accessed until connection is closed
Date: 2020-04-16 14:18:04
Message-ID: CAG6JkqN7zUOaNdAsEru91UQYk=ov-rWHxZCSOm=MnwXyf+hsyg@mail.gmail.com
Lists: pgsql-bugs

Sending this reply again, since it seems it didn't go through to the
mailing list.

Hello Tom, first of all, thanks for the response!

Sorry for not including a lot of the needed information; I will try to
rectify that in this message.

So, first, here is what I'm trying to do. In my backend I store financial
data (candlesticks and trades) in Postgres + TimescaleDB, with each
exchange in its own database and each market/timeframe combination in its
own table. For example, given an `ExchangeA` with 1 market (EUR/USD) and 4
timeframes, I would get these 5 tables:
candles_eur_usd_one_minute
candles_eur_usd_five_minutes
candles_eur_usd_third_minutes
candles_eur_usd_one_hour
trades_eur_usd

Normally my backend has, per database (exchange), ~750 markets and
~16 timeframes, so 750 * 16 = 12000 tables.
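To make the layout concrete, the naming scheme can be sketched like this (an illustrative Python sketch, not my actual Elixir code; the market and timeframe lists are stand-ins):

```python
# Sketch of the per-market table-naming scheme described above.
markets = ["eur_usd"]                       # ~750 markets in production
timeframes = ["one_minute", "five_minutes",
              "third_minutes", "one_hour"]  # ~16 timeframes in production

def table_names(markets, timeframes):
    """Return the candle and trade table names for every market."""
    names = [f"candles_{m}_{tf}" for m in markets for tf in timeframes]
    names += [f"trades_{m}" for m in markets]
    return names

print(table_names(markets, timeframes))
# With 1 market and 4 timeframes this yields the 5 tables listed above.
```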

I noticed that Postgres would continuously increase memory usage per
connection under my normal workload (inserting candles and trades, ~2000
rows per minute) until I got an OOM error. On a system with 16GB it takes
about 1 to 3 hours to hit the OOM. On my other system, which has 64GB, it
takes about 2 days, but eventually Postgres uses all the memory and crashes
too.

As for versions, I was using Postgres 11 with the latest TimescaleDB.
Since I was not sure whether this was a TimescaleDB issue, I first created
a minimal example that triggers the issue faster (the one I mentioned in my
first e-mail, which uses 700 tables) and then tested it without TimescaleDB
on both Postgres 11 and 12. I got the same result in both, which takes
TimescaleDB out of the equation.

As for the operating system, I'm using Ubuntu, but I also tested on my
Gentoo machine with the same results, and I also tested with Postgres
running in Docker.

For postgresql.conf, I first used one tuned by the TimescaleDB tuner tool,
but then switched to a default one to see if it would make a difference.

My backend is written in Elixir, so I'm using the Ecto library to connect
to Postgres. Ecto creates a pool of connections that stays alive as long as
my backend does. To make it easier to test the issue, I limited the pool to
1 connection so I can keep an eye on that connection process's memory
usage, etc.

> Yes, Postgres caches some information about every table your session
> has accessed, and no you don't have a lot of control over that, and
> no it isn't a bug.

> Having said that, the amount of memory consumed this way is in the
> vicinity of a few tens of KB per table, according to some simple
> experiments I just did. I couldn't get a process touching 700 simple
> tables to eat more than ~70MB of space. So either you are talking about
> tables with enormously complex schemas, or there's something else going
> on. (Please note that the SO post you are referencing describes an
> attempt to use circa 100000 tables/views in one session, not 700.)

Here are the details of one of the tables I'm using for my test (all of
them are identical except for the name):
test_dev=# \d+ trades_eur_usd
                                        Table "public.trades_eur_usd"
  Column   |            Type             | Collation | Nullable | Default | Storage | Stats target | Description
-----------+-----------------------------+-----------+----------+---------+---------+--------------+-------------
 id        | integer                     |           | not null |         | plain   |              |
 timestamp | timestamp without time zone |           | not null |         | plain   |              |
 amount    | numeric                     |           | not null |         | main    |              |
 price     | numeric                     |           | not null |         | main    |              |
Indexes:
    "trades_eur_usd_id_desc_index" btree (id DESC)
    "trades_eur_usd_timestamp_desc_index" btree ("timestamp" DESC)
Access method: heap

One thing that I forgot to explain better is how I'm touching those
tables. Basically, I created a little Elixir backend with Ecto that keeps
1 connection alive in the pool and bulk-inserts rows into one of the 700
tables. The steps are:
1) Choose one of the 700 tables at random;
2) Insert 13000 rows into that table in a single query;
3) Go back to step 1 and repeat.
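The loop above can be sketched like this (an illustrative Python sketch; the real test is Elixir/Ecto over a single pooled connection, and the table names and row values here are hypothetical stand-ins):

```python
import random

# Stand-ins for the 700 identical tables in my test database.
TABLES = [f"trades_market_{i}" for i in range(700)]
ROWS_PER_BATCH = 13000

def build_bulk_insert(table, rows):
    """Build one multi-row INSERT matching the trades_eur_usd schema."""
    values = ", ".join(
        f"({r['id']}, '{r['timestamp']}', {r['amount']}, {r['price']})"
        for r in rows
    )
    return f'INSERT INTO {table} (id, "timestamp", amount, price) VALUES {values}'

def one_iteration():
    table = random.choice(TABLES)          # step 1: pick a table at random
    rows = [{"id": i, "timestamp": "2020-04-16 00:00:00",
             "amount": 1.0, "price": 1.0} for i in range(ROWS_PER_BATCH)]
    sql = build_bulk_insert(table, rows)   # step 2: 13000 rows in one query
    return table, sql                      # step 3: the caller loops forever
```

Executing the generated statement against the database on every iteration is what makes the connection's memory climb; the key point is only that step 1 picks a *different* table each time.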

This will trigger the issue instantly and will make the memory usage of
that connection grow very fast as shown in the last e-mail.

One very important thing to note: if I change the above algorithm to do the
same thing but always choose the same table in step 1 (i.e., always
bulk-insert into 1 table instead of one of the 700), I do not trigger the
issue; memory usage does not go up at all. It only triggers if I do the
insertion into a different table each time.

> The memory consumption curve you showed looks suspiciously like you have
> circa 4GB shared buffers allocated and the reported number is mostly about
> how many of those buffers the process has touched so far. This is a
> common misunderstanding; "top" and similar tools tend to be really bad
> about accounting for shared memory usage in any useful way. In any case,
> the process you've shown us stopped growing its space consumption
> some time ago, so I wonder where the OOM complaint actually came from.

> The short answer is that you're probably barking up the wrong tree,
> but there's not enough information here to provide any useful guesses
> about which is the right tree.

I guess you are right; maybe the issue is with the shared buffers or some
other cache. I tried lowering every relevant option I found in
postgresql.conf to its minimum value, but I still got the same OOM result.
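For reference, the kind of minimal-memory settings I mean are along these lines (illustrative values near the documented minimums, not my exact file):

```
# postgresql.conf fragment (illustrative)
shared_buffers = 128kB        # default is 128MB
work_mem = 64kB
maintenance_work_mem = 1MB
temp_buffers = 800kB
```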

Thanks a lot for the help, and feel free to ask for more information!
