
The actual summary of the article is "The design of Postgres means that updating existing rows is inefficient compared to MySQL".

Yes, there were some other points that were just extra annoyances for them, but clearly that point was the most important to them. It's what the header image and the first 60% of the article were about, and yet nobody in this thread seems to be engaging with it.

Is the design choice bad? They never said it was. It's just an engineering trade-off. It's very possible that most workloads benefit from this design. But if your workload involves updating lots of existing rows at large scale, then MySQL is going to be a better choice for you.



But it's only an issue if you rely on lots of transactions for data consistency. My point was that it sounds like they are leaning on transactions too much, which is why they need a more "forgiving" database; that's the part I quoted.

Also, they didn't mention anything about autovacuum, which mostly solves the issue they are talking about.

The lack of any mention of autovacuum, plus not seeming to know that Postgres supports statement-level replication, makes me wonder whether they took a deep dive into the wrong part of the technology.


None of what you said addresses the issue that I (and they) are talking about.

On Postgres, an update requires writing a new entry to every index on the table, because the new row version lives at a new physical location.

On MySQL, it only requires updating the indexes whose columns were touched by the update.

If you have a table with 10 indexes, then this means up to 10 extra physical writes to disk.
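A back-of-envelope sketch of that claim (the function names and numbers are mine, just modeling the argument): a non-HOT Postgres update fans out to every index regardless of which columns changed, while InnoDB's secondary indexes point at the primary key and so only indexes covering a changed column need rewriting.

```python
# Hypothetical model of index write amplification for a single UPDATE.

def postgres_index_writes(num_indexes: int) -> int:
    # Non-HOT update: the row moves to a new physical location (new ctid),
    # so every index needs a new entry pointing at it.
    return num_indexes

def innodb_index_writes(changed_indexed_cols: int) -> int:
    # Secondary indexes reference the primary key, which didn't change,
    # so only indexes on modified columns are rewritten.
    return changed_indexed_cols

# A table with 10 indexes where the update touches 1 indexed column:
print(postgres_index_writes(10))  # 10
print(innodb_index_writes(1))     # 1
```

This ignores WAL/redo-log traffic and page-level effects; it only models the index fan-out the parent comment is describing.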


As far as I know, not every update rewrites the indexes in PG. It has single-page cleanup and the HOT (heap-only tuple) update optimization.

Please refer to around page 64 of https://momjian.us/main/writings/pgsql/mvcc.pdf.
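A rough sketch of the HOT eligibility rule as I understand it (the function and its parameters are illustrative, not Postgres internals): if the UPDATE touches no indexed column and the new tuple version fits on the same heap page, no index entries are written at all; otherwise the update fans out to every index.

```python
# Simplified model of when a Postgres UPDATE can be HOT (heap-only tuple).

def index_writes(num_indexes: int, changed_cols: set,
                 indexed_cols: set, page_has_space: bool) -> int:
    # HOT applies when no indexed column changed and the new tuple
    # fits on the same heap page as the old one.
    hot = page_has_space and not (changed_cols & indexed_cols)
    return 0 if hot else num_indexes

# Updating an unindexed column with free space on the page: zero index writes.
print(index_writes(10, {"last_seen"}, {"id", "email"}, True))  # 0
# Updating an indexed column: full fan-out to all 10 indexes.
print(index_writes(10, {"email"}, {"id", "email"}, True))      # 10
```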


10 indexes on one table seems a bit much. It sounds like a table that hasn't been normalized.


Though that is quite proper for a data warehouse.


A data warehouse should use far fewer transactions, though.


Both vacuuming and logical replication are discussed in the article. In particular, vacuuming is easier with InnoDB since the changed records all exist in the redo log, whereas PostgreSQL needs to scan the whole table. pglogical is mentioned as a way for people running PG 9.4+ to do minimal-downtime cross-version upgrades, which wasn't an option back on PG 9.2 unless you went with something like slony1 or londiste.


> vacuuming is easier with InnoDB ... PostgreSQL needs to scan the whole table

There have been quite a few improvements to VACUUM in 9.6 [1], including avoiding full-table scans.

[1] https://www.postgresql.org/docs/9.6/static/release-9-6.html#...


Agreed; they even admit Postgres is faster for querying because MySQL requires two index lookups. I like the immutable data design of Postgres: it's more robust and enables fast transactional DDL. My design never deletes or alters a tuple, so all of these problems Uber is obsessing over go away. The question is: why do they need so many updates? Is there a problem with their data capture?
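A minimal sketch of what an append-only ("never UPDATE or DELETE") design looks like, using an in-memory stand-in for a table (the key and column names are made up for illustration): every change inserts a new version row, and reads take the highest version per key, so the update-amplification problem never arises.

```python
# Toy append-only store: changes are INSERTs of new versions, never UPDATEs.
from collections import defaultdict

rows = defaultdict(list)  # key -> list of (version, value)

def write(key, value):
    # Append a new version; existing tuples are never altered.
    rows[key].append((len(rows[key]) + 1, value))

def read(key):
    # The latest version wins.
    return max(rows[key])[1]

write("user:1", {"email": "a@example.com"})
write("user:1", {"email": "b@example.com"})
print(read("user:1"))  # {'email': 'b@example.com'}
```

In a real schema this would be an INSERT with a version or timestamp column plus a query for the latest version, at the cost of extra storage and a compaction story of your own.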



