The actual summary of the article is "The design of Postgres means that updating existing rows is inefficient compared to MySQL".
Yes, there were some other points that were just extra annoyances for them, but clearly that point was the most important to them. It's what the header image and the first 60% of the article were about, and yet nobody in this thread seems to be engaging with it.
Is the design choice bad? They never said it was. It's just an engineering trade-off. It's very possible that most workloads benefit from this design. But if your workload involves updating lots of existing rows at large scale, then MySQL is going to be a better choice for you.
But it's only an issue if you rely on lots of transactions for data consistency. My point was that it sounds like they are relying on transactions too much, which is why they need a more "forgiving" database. That is the part I quoted.
Also, they didn't mention anything about autovacuum, which mostly solved the issue they are talking about.
The fact that they don't mention the vacuumer, and don't seem to know that Postgres supports statement-level replication, makes me wonder if they took a deep dive into the wrong part of the technology.
Both vacuuming and logical replication are discussed in the article. In particular, cleanup is easier with InnoDB since the old row versions all live in the undo log (rollback segment), whereas PostgreSQL needs to scan the whole table. pglogical is mentioned for people running PG 9.4+ as a way of doing minimal-downtime cross-version upgrades, which wasn't an option back with PG 9.2 unless you went with something like Slony-I or Londiste.
Agreed, they even admit Postgres is faster for querying, because a MySQL lookup through a secondary index requires two index lookups.
I like the immutable data design of Postgres; it is more robust and enables fast transactional DDL.
My design never deletes or alters a tuple, so all of the problems that Uber is obsessing about go away.
The question is: why do they need so many updates?
Is there a problem with their data capture?
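For anyone wondering what a "never delete or alter a tuple" design can look like, here is a minimal sketch. It is an illustration under assumptions, not the schema described above or anything Uber runs: the `trip_events` table, its columns, and the helper functions are all hypothetical names, and SQLite (via Python's standard `sqlite3` module) stands in for Postgres only so the example is self-contained. Each state change is recorded as a new row, and the current state is simply the latest row for a key.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical append-only table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE trip_events (
            event_id INTEGER PRIMARY KEY AUTOINCREMENT,
            trip_id  INTEGER NOT NULL,
            status   TEXT    NOT NULL
        )
        """
    )
    # Index chosen so that "latest event for a trip" is a short range scan.
    conn.execute(
        "CREATE INDEX trip_events_by_trip ON trip_events (trip_id, event_id)"
    )
    return conn


def record_status(conn: sqlite3.Connection, trip_id: int, status: str) -> None:
    """Record a state change as a new row; existing rows are never touched."""
    conn.execute(
        "INSERT INTO trip_events (trip_id, status) VALUES (?, ?)",
        (trip_id, status),
    )


def current_status(conn: sqlite3.Connection, trip_id: int):
    """Return the most recent status for a trip, or None if it has no events."""
    row = conn.execute(
        "SELECT status FROM trip_events "
        "WHERE trip_id = ? ORDER BY event_id DESC LIMIT 1",
        (trip_id,),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = open_store()
    record_status(conn, 1, "requested")
    record_status(conn, 1, "completed")
    print(current_status(conn, 1))
```

The trade-off is that the table keeps growing and every read has to pick the newest row, so this only suits workloads where keeping the full history is acceptable or wanted.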