PostgreSQL vs MySQL in ZFS (Was: Re: [zfs-discuss] Large disk system 240 6T drives)

Discussion:

Durval Menezes

2015-02-19 13:49:20 UTC

Hello Gordan and Uncle,

(changing the subject field as we are really talking about other matters
than the OP now)

My own experience with MySQL is that it's really simple and intuitive and
"just works" until the exact point it suddenly fails in such a spectacular
way (either by starting to show extremely poor performance, or lack of
stability) that you get badly hosed and has to delve "emergency all-nighter
mode" into really obscure and not-simple-at-all customization and
optimization just to dig yourself out of the hole it has lead you into.

PostgreSQL is more complex and really needs you to stop and plan ahead and
think about you are doing (specially if you go into replication/cluster
country), but IME it's much more reliable and fast and stable afterwards.

As I'd rather pay my dues upfront than be surprised later, I stay with
PostgreSQL as much as I can.

Just my $0.02, your mileage may vary, no affiliations and all other
standard disclaimers apply...

Cheers,
--
Durval.

Much as I like PostgreSQL for it's tendency toward "doing the right

thing", there is distinctly plausible chance that the MySQL solution you
mention could have been adjusted to deliver similar performance levels.

Bacula database. Hundreds of millions of entries....
To unsubscribe from this group and stop receiving emails from it, send an

To unsubscribe from this group and stop receiving emails from it, send an email to zfs-discuss+***@zfsonlinux.org.

Gordan Bobic

2015-02-19 14:09:28 UTC

Permalink

Post by Durval Menezes
Hello Gordan and Uncle,
(changing the subject field as we are really talking about other matters
than the OP now)
My own experience with MySQL is that it's really simple and intuitive and
"just works" until the exact point it suddenly fails in such a spectacular
way (either by starting to show extremely poor performance, or lack of
stability) that you get badly hosed and has to delve "emergency all-nighter
mode" into really obscure and not-simple-at-all customization and
optimization just to dig yourself out of the hole it has lead you into.

I'm not saying that there are no pitfalls in MySQL, but they are
predictable if you know what you are doing. I am not aware of a database
free of such issues - they merely differ.

If you are seeing a situation where the performance falls off a cliff with
little prior warning, you have either reached a point where your indexes
(and especially primary keys) no longer fit into the buffer pool, or your
combination of queries and indexes is poor and is resulting full table
scans. This is often ignorable until the data grows to the point where it
no longer fits into the buffer pool and you start grinding onto spinning
rust.

There are very few database performance issues (on any database) that
cannot be solved using a combination of:
1) Appropriate indexing
2) Use of materialized views (with the appropriate changes to related
queries)
3) Avoiding the usual suspects among performance killers (e.g. sub-selects
and views)

More importantly, although the exact dialect syntax may be different
between databases, conceptually the solution is likely to be similar.
Artefacts like "query runtime reduced by 90%" are typically attributable to
edge cases such as one SQL DB implementation handling a particular
sub-select incantation more sensibly than the other. In reality, both would
perform similarly (and much faster on top) if the said construct were
re-written as a JOIN, or if that is not possible, using a materialized
view. Sometimes even using a temporary able can lead to a massively faster
solution.

If anybody wants to continue this discussion, maybe we should take this
thread off-list as the ZFS related content has disappeared.

To unsubscribe from this group and stop receiving emails from it, send an email to zfs-discuss+***@zfsonlinux.org.

Durval Menezes

2015-02-19 14:26:54 UTC

Permalink

Hi Gordan,

Post by Gordan Bobic

I'm not saying that there are no pitfalls in MySQL, but they are
predictable if you know what you are doing. I am not aware of a database
free of such issues - they merely differ.

Agreed, for experienced hands. My point is that MySQL often misleads the
novice with its apparent simplicity and "just-works-ness" up until the
point of serious failure, and PostgreSQL doesn't (or at least not as
much). I think this has to do with the PostgreSQL way of doing things
right from the beginning as you mentioned, instead of the "let's put
something up anyway right now and deal with the consequences later" way I
sense in most of MySQL.

Post by Gordan Bobic
If you are seeing a situation where the performance falls off a cliff with
little prior warning, you have either reached a point where your indexes
(and especially primary keys) no longer fit into the buffer pool, or your
combination of queries and indexes is poor and is resulting full table
scans. This is often ignorable until the data grows to the point where it
no longer fits into the buffer pool and you start grinding onto spinning
rust.
There are very few database performance issues (on any database) that
1) Appropriate indexing
2) Use of materialized views (with the appropriate changes to related
queries)
3) Avoiding the usual suspects among performance killers (e.g. sub-selects
and views)

All of this requires planning, which the "MySQL way" IMHO doesn't
encourage...

Post by Gordan Bobic
More importantly, although the exact dialect syntax may be different
between databases, conceptually the solution is likely to be similar.
Artefacts like "query runtime reduced by 90%" are typically attributable to
edge cases such as one SQL DB implementation handling a particular
sub-select incantation more sensibly than the other. In reality, both would
perform similarly (and much faster on top) if the said construct were
re-written as a JOIN, or if that is not possible, using a materialized
view. Sometimes even using a temporary able can lead to a massively faster
solution.

I can't very well visualize what you are saying (I'm mostly into OS and
networking, not much of a RDMBS guy), so I will take your word for it.

Post by Gordan Bobic
If anybody wants to continue this discussion, maybe we should take this
thread off-list as the ZFS related content has disappeared.

I was hoping you could add something ZFS-related, like the appropriateness
of using mirror vs raidz for insert/update performance, special L2ARC and
whatever tuning, etc...

Here I have quite a few Oracle RDBMS databases running on top of a ZFS
appliance NAS, and apart from the obvious optimizations (mirror instead of
raidz, block size tuning between the RDBMS and the pool/datasets,
compression, etc) didn't have to do much to make it perform.

Cheers,

--
Durval.

Post by Gordan Bobic
To unsubscribe from this group and stop receiving emails from it, send an

To unsubscribe from this group and stop receiving emails from it, send an email to zfs-discuss+***@zfsonlinux.org.

Gordan Bobic

2015-02-19 14:35:30 UTC

Permalink

Post by Durval Menezes
Hi Gordan,

Post by Gordan Bobic

Post by Durval Menezes
Hello Gordan and Uncle,
(changing the subject field as we are really talking about other matters
than the OP now)
My own experience with MySQL is that it's really simple and intuitive
and "just works" until the exact point it suddenly fails in such a
spectacular way (either by starting to show extremely poor performance, or
lack of stability) that you get badly hosed and has to delve "emergency
all-nighter mode" into really obscure and not-simple-at-all customization
and optimization just to dig yourself out of the hole it has lead you into.

I'm not saying that there are no pitfalls in MySQL, but they are
predictable if you know what you are doing. I am not aware of a database
free of such issues - they merely differ.

Post by Gordan Bobic
If you are seeing a situation where the performance falls off a cliff
with little prior warning, you have either reached a point where your
indexes (and especially primary keys) no longer fit into the buffer pool,
or your combination of queries and indexes is poor and is resulting full
table scans. This is often ignorable until the data grows to the point
where it no longer fits into the buffer pool and you start grinding onto
spinning rust.
There are very few database performance issues (on any database) that
1) Appropriate indexing
2) Use of materialized views (with the appropriate changes to related
queries)
3) Avoiding the usual suspects among performance killers (e.g.
sub-selects and views)

All of this requires planning, which the "MySQL way" IMHO doesn't
encourage...

Without planning and experienced hands, any such project will go poorly
anyway. The only sane way to consider the comparison is from the point of
skill equivalence.

Post by Durval Menezes

Post by Gordan Bobic
If anybody wants to continue this discussion, maybe we should take this
thread off-list as the ZFS related content has disappeared.

I was hoping you could add something ZFS-related, like the appropriateness
of using mirror vs raidz for insert/update performance, special L2ARC and
whatever tuning, etc...

I already shared all of the non-obvious wisdom on this subject earlier in
this thread. :)

Post by Durval Menezes
Here I have quite a few Oracle RDBMS databases running on top of a ZFS
appliance NAS, and apart from the obvious optimizations (mirror instead of
raidz, block size tuning between the RDBMS and the pool/datasets,
compression, etc) didn't have to do much to make it perform.

Forward planning, I'm afraid. Tuning ashift and recordsize, and making sure
everything is as well aligned as you can make it goes a long way.
Unfortunately, ashift changes and pool geometry aren't something you can
change retrospectively without a full rebuild+restore.

To unsubscribe from this group and stop receiving emails from it, send an email to zfs-discuss+***@zfsonlinux.org.