My company is currently hitting a problem with mysql/innodb having really slow insert performance (we're seeing ~1K rows/second). My boss wants to go and spend a bunch of money on the Tokutek backend. I'd rather we save the money and go to postgres instead. We're not heavily invested in mysql at this point (fixing our queries to switch from mysql to postgres would take about five minutes). But my boss wants to see some benchmarks. I've googled around for a while, but all the benchmarks I've found commit one or more "fatal flaws", which render the benchmark pointless at best: 1) Comparing Postgres to MyISAM. Transactions are not an option for us, so it doesn't matter if MyISAM is a hundred times faster. I want to compare Postgres to InnoDB (bonus points for Postgres vr.s Tokutek). 2) Using the default configurations. Be serious- is there any one who cares the least about performance who uses the default configuration? 3) Using old versions of Postgres. I'd like the survey to at least use the 8.x series, bonus points for it being 9.x. 4) Not using COPY for inserts. We would, of course, be using the copy command for inserts. Here's the thing. I have personally seen postgres 8.1 insert 30K rows/second, in to a real table, on crappy hardware (single slow IDE drive, old crappy hardware). I would be shocked if I can't improve on the InnoDB numbers by at least an order of magnitude. I'm whipping together a personal benchmark to show this. But I need a "professional looking" benchmark, with pretty charts and graphs and etc., to back me up. Help? Brian
On 10/29/2010 02:58 PM, Brian Hurt wrote: > > My company is currently hitting a problem with mysql/innodb having > really slow insert performance (we're seeing ~1K rows/second). My boss > wants to go and spend a bunch of money on the Tokutek backend. I'd > rather we save the money and go to postgres instead. We're not heavily > invested in mysql at this point (fixing our queries to switch from mysql > to postgres would take about five minutes). But my boss wants to see > some benchmarks. hmm 1k/s sounds very slow for MySQL/innodb if you are batching your inserts (either multi-value inserts or larger transactions) - is there anything special to that data(very wide, enourmous number of indexes etc)? > > I've googled around for a while, but all the benchmarks I've found > commit one or more "fatal flaws", which render the benchmark pointless > at best: > > 1) Comparing Postgres to MyISAM. Transactions are not an option for us, > so it doesn't matter if MyISAM is a hundred times faster. I want to > compare Postgres to InnoDB (bonus points for Postgres vr.s Tokutek). never seen a comparison against Tokutek so you might have to benchmark for yourself. > > 2) Using the default configurations. Be serious- is there any one who > cares the least about performance who uses the default configuration? > > 3) Using old versions of Postgres. I'd like the survey to at least use > the 8.x series, bonus points for it being 9.x. > > 4) Not using COPY for inserts. We would, of course, be using the copy > command for inserts. > > Here's the thing. I have personally seen postgres 8.1 insert 30K > rows/second, in to a real table, on crappy hardware (single slow IDE > drive, old crappy hardware). I would be shocked if I can't improve on > the InnoDB numbers by at least an order of magnitude. I'm whipping > together a personal benchmark to show this. But I need a "professional > looking" benchmark, with pretty charts and graphs and etc., to back me up. Not sure what you would consider "professional" - but I did some testing back in the 8.4 days here: http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html - not sure if that actually matches your workload(but I guess you could easily test yourself if it is that simple to convert your app). > > Help? I don't think advocacy is actually the right list to ask maybe you would get a wider audience on -performance or -general. Stefan
On Fri, Oct 29, 2010 at 5:58 AM, Brian Hurt <bhurt(at)spnz(dot)org> wrote: > > My company is currently hitting a problem with mysql/innodb having really > slow insert performance (we're seeing ~1K rows/second). My boss wants to go > and spend a bunch of money on the Tokutek backend. I'd rather we save the > money and go to postgres instead. We're not heavily invested in mysql at > this point (fixing our queries to switch from mysql to postgres would take > about five minutes). But my boss wants to see some benchmarks. > > I've googled around for a while, but all the benchmarks I've found commit > one or more "fatal flaws", which render the benchmark pointless at best: > > 1) Comparing Postgres to MyISAM. Transactions are not an option for us, so > it doesn't matter if MyISAM is a hundred times faster. I want to compare > Postgres to InnoDB (bonus points for Postgres vr.s Tokutek). > > 2) Using the default configurations. Be serious- is there any one who cares > the least about performance who uses the default configuration? > > 3) Using old versions of Postgres. I'd like the survey to at least use the > 8.x series, bonus points for it being 9.x. > > 4) Not using COPY for inserts. We would, of course, be using the copy > command for inserts. > > Here's the thing. I have personally seen postgres 8.1 insert 30K > rows/second, in to a real table, on crappy hardware (single slow IDE drive, > old crappy hardware). I would be shocked if I can't improve on the InnoDB > numbers by at least an order of magnitude. I'm whipping together a personal > benchmark to show this. But I need a "professional looking" benchmark, with > pretty charts and graphs and etc., to back me up. > > Help? > > Brian If you post your conf I can give you pointers on InnoDB performance. Please also post a hardware profile (including ram and IO hardware). One way or another this is the right first step for a comparison with PG. As for tokutek, I think they are probably the wave of the future. I have suggested that they create a pg product. -- Rob Wultsch wultsch(at)gmail(dot)com
Brian,
> My company is currently hitting a problem with mysql/innodb having
> really slow insert performance (we're seeing ~1K rows/second).
As others have said, there's clearly something wrong with your setup
and/or application which would likely affect PostgreSQL as well. I can
name several ways in which PostgreSQL is better than MySQL/InnoDB, but
insert performance isn't one of them.
>My boss
> wants to go and spend a bunch of money on the Tokutek backend.
I'm sure that Tokutek would be thrilled to have a customer (they're very
new). Again, though, insert performance isn't Tokutek's specialty; the
arguments for using them is that (a) they're not owned by oracle, and
(b) the "fractal indexes", which are cool.
> I've googled around for a while, but all the benchmarks I've found
> commit one or more "fatal flaws", which render the benchmark pointless
> at best:
Doing real benchmarks is an involved, expensive process, and both DBMSes
are fast-moving targets which need to be benchmarked every year. I've
talked with Percona about doing some MySQL vs. PostgreSQL head-to-heads,
but in the absence of specific funding that's unlikely to be completed.
Besides, we both really want to do head-to-heads with MSSQL & Oracle,
not each other.
> Here's the thing. I have personally seen postgres 8.1 insert 30K
> rows/second, in to a real table, on crappy hardware (single slow IDE
> drive, old crappy hardware). I would be shocked if I can't improve on
> the InnoDB numbers by at least an order of magnitude. I'm whipping
> together a personal benchmark to show this. But I need a "professional
> looking" benchmark, with pretty charts and graphs and etc., to back me up.
Don't have anything like that, I'm afraid. Last real benchmark was
SpecJAppserver from 2007 (which at the time was 90% of comparable Oracle
performance), which has long been superceded.
However, given that what you want to know about is insert performance,
it seems like it would be easy enough to mock up your own comparison.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Brian, Rob pointed me to a benchmark of MySQL vs. Tokutek which is interesting to your case: http://www.mysqlperformanceblog.com/2009/04/28/detailed-review-of-tokutek-storage-engine/ It's also interesting to me because apparently InnoDB *does* have an issue with large numbers of inserts to an already-large table, which we don't have (I don't think we do, anyway). -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
On Fri, 2010-10-29 at 13:53 -0700, Josh Berkus wrote: > Brian, > > Rob pointed me to a benchmark of MySQL vs. Tokutek which is interesting > to your case: > > http://www.mysqlperformanceblog.com/2009/04/28/detailed-review-of-tokutek-storage-engine/ > > It's also interesting to me because apparently InnoDB *does* have an > issue with large numbers of inserts to an already-large table, which we > don't have (I don't think we do, anyway). I thought InnoDB still had P/K insert/update issues. I could be cracked (would have to dig up the article) JD > > -- > -- Josh Berkus > PostgreSQL Experts Inc. > http://www.pgexperts.com > -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
For the record, the table we're having trouble inserting into is ~100 rows with ~50 indexes on it. E.F Codd is spinning in his grave. The reason they went with this design (instead of one that has two tables, each with 3-6 columns, and about that many indexes) is that "joins are slow". Which they may be on Mysql, I don't know. But this is (unfortunately) a different battle. Brian On Fri, 29 Oct 2010, Josh Berkus wrote: > Brian, > > Rob pointed me to a benchmark of MySQL vs. Tokutek which is interesting > to your case: > > http://www.mysqlperformanceblog.com/2009/04/28/detailed-review-of-tokutek-storage-engine/ > > It's also interesting to me because apparently InnoDB *does* have an > issue with large numbers of inserts to an already-large table, which we > don't have (I don't think we do, anyway). > > -- > -- Josh Berkus > PostgreSQL Experts Inc. > http://www.pgexperts.com > > -- > Sent via pgsql-advocacy mailing list (pgsql-advocacy(at)postgresql(dot)org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-advocacy >
On 10/29/2010 11:37 PM, Brian Hurt wrote: > > For the record, the table we're having trouble inserting into is ~100 > rows with ~50 indexes on it. E.F Codd is spinning in his grave. The > reason they went with this design (instead of one that has two tables, > each with 3-6 columns, and about that many indexes) is that "joins are > slow". Which they may be on Mysql, I don't know. But this is > (unfortunately) a different battle. is that really only 100 rows or are you actually talking about columns? if the later you will have a very hard time getting reasonable bulk/mass loading performance in most databases (and also pg) - a table that wide and with a that ridiculous number of indexes is just bound to be slow. Now I actually think that the figures you are getting from innodb are fairly reasonable... Stefan
On Sat, 30 Oct 2010 15:57:39 +0200, Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> wrote: > On 10/29/2010 11:37 PM, Brian Hurt wrote: >> >> For the record, the table we're having trouble inserting into is ~100 >> rows with ~50 indexes on it. E.F Codd is spinning in his grave. The >> reason they went with this design (instead of one that has two tables, >> each with 3-6 columns, and about that many indexes) is that "joins are >> slow". Which they may be on Mysql, I don't know. But this is >> (unfortunately) a different battle. As opposed to the slowness you are having now? /me smacks your developers for you JD -- PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org Consulting, Development, Support, Training 503-667-4564 - http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
On 10/29/10 2:37 PM, Brian Hurt wrote:
>
> For the record, the table we're having trouble inserting into is ~100
> rows with ~50 indexes on it. E.F Codd is spinning in his grave. The
> reason they went with this design (instead of one that has two tables,
> each with 3-6 columns, and about that many indexes) is that "joins are
> slow". Which they may be on Mysql, I don't know. But this is
> (unfortunately) a different battle.
Not sure that that'll be any better on Postgres then. Few DBMSes
optimize for "stupid design". Maybe they should go with a "NoSQL"
database.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
On Fri, Oct 29, 2010 at 1:25 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote: > Brian, > > > My company is currently hitting a problem with mysql/innodb having >> really slow insert performance (we're seeing ~1K rows/second). >> > > As others have said, there's clearly something wrong with your setup and/or > application which would likely affect PostgreSQL as well. I can name several > ways in which PostgreSQL is better than MySQL/InnoDB, but insert performance > isn't one of them. > > > My boss >> wants to go and spend a bunch of money on the Tokutek backend. >> > > I'm sure that Tokutek would be thrilled to have a customer (they're very > new). Again, though, insert performance isn't Tokutek's specialty; the > arguments for using them is that (a) they're not owned by oracle, <snip> That seems like a really weak argument, unless Tokutek has come out with a stand-alone product I'm not aware of. At best you'll still need to deal with Oracle to get the MySQL bits, at worse Tokutek's technology is as good as it sounds and they are not an acquisition target for Oracle. (Assuming you don't want to deal with Oracle, which you're reasoning above implies. Many people won't care about that though). > > > I've googled around for a while, but all the benchmarks I've found > >> commit one or more "fatal flaws", which render the benchmark pointless >> at best: >> > > Doing real benchmarks is an involved, expensive process, and both DBMSes > are fast-moving targets which need to be benchmarked every year. I've > talked with Percona about doing some MySQL vs. PostgreSQL head-to-heads, but > in the absence of specific funding that's unlikely to be completed. > Besides, we both really want to do head-to-heads with MSSQL & Oracle, not > each other. > > Actually in this case, "real" benchmarks should be pretty easy. You already have a running application that you have numbers on. I'd say swap in a copy of xtradb, a copy of tokutek's tech, and a copy of Postgres (after all, it's only "5 minutes" to change the queries around), and benchmark all three. Realistically it would probably take a dedicated week to do it, but you're probably making a 3 year decision, so this bit of time up front seems like a good way to go. (And don't forget to publish your results afterwards). Robert Treat play: http://www.xzilla.net work: http://www.omniti.com/is/hiring
On Sat, 30 Oct 2010, Stefan Kaltenbrunner wrote: > On 10/29/2010 11:37 PM, Brian Hurt wrote: >> >> For the record, the table we're having trouble inserting into is ~100 >> rows with ~50 indexes on it. E.F Codd is spinning in his grave. The >> reason they went with this design (instead of one that has two tables, >> each with 3-6 columns, and about that many indexes) is that "joins are >> slow". Which they may be on Mysql, I don't know. But this is >> (unfortunately) a different battle. > > is that really only 100 rows or are you actually talking about columns? Bleh, I meant columns. 100 rows is nothing. > if > the later you will have a very hard time getting reasonable bulk/mass loading > performance in most databases (and also pg) - a table that wide and with a > that ridiculous number of indexes is just bound to be slow. Now I actually > think that the figures you are getting from innodb are fairly reasonable... > > > Stefan > Brian
The "insert buffer" in InnoDB accelerates this workload. It buffers changes in a special b-tree to avoid disk IO during secondary index maintenance. For my workloads the special b-tree is able to capture multiple changes to blocks and is likely reduce the IO requirements for the application. Even without that benefit it allows the server to absorb workload spikes as the disk reads for secondary index maintenance are deferred. This is done for inserts in MySQL 5.1 and for inserts, updates and deletes in MySQL 5.5. This won't allow InnoDB to match TokuDB in performance, but it should provide much better throughput than you would expect from an engine that does update in place. http://www.google.com/search?hl=en&q=insert+buffer+innodb On Sat, Oct 30, 2010 at 4:44 PM, Brian Hurt <bhurt(at)spnz(dot)org> wrote: > > > On Sat, 30 Oct 2010, Stefan Kaltenbrunner wrote: > >> On 10/29/2010 11:37 PM, Brian Hurt wrote: >>> >>> For the record, the table we're having trouble inserting into is ~100 >>> rows with ~50 indexes on it. E.F Codd is spinning in his grave. The >>> reason they went with this design (instead of one that has two tables, >>> each with 3-6 columns, and about that many indexes) is that "joins are >>> slow". Which they may be on Mysql, I don't know. But this is >>> (unfortunately) a different battle. >> >> is that really only 100 rows or are you actually talking about columns? > > Bleh, I meant columns. > > 100 rows is nothing. > >> if the later you will have a very hard time getting reasonable bulk/mass >> loading performance in most databases (and also pg) - a table that wide and >> with a that ridiculous number of indexes is just bound to be slow. Now I >> actually think that the figures you are getting from innodb are fairly >> reasonable... >> >> >> Stefan >> > > Brian > > > -- > Sent via pgsql-advocacy mailing list (pgsql-advocacy(at)postgresql(dot)org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-advocacy > -- Mark Callaghan mdcallag(at)gmail(dot)com