Thursday, March 11, 2010

Digg loves Cassandra

What's Wrong with MySQL?

Our primary motivation for moving away from MySQL is the increasing difficulty of building a high performance, write intensive, application on a data set that is growing quickly, with no end in sight. This growth has forced us into horizontal and vertical partitioning strategies that have eliminated most of the value of a relational database, while still incurring all the overhead.

Relational database technology can be a blunt instrument and we're motivated to find a tool that matches our specific needs closely. Our domain area, news, doesn't exact strict consistency requirements, so (according to Brewer's theorem) relaxing this allows gains in availability and partition tolerance (i.e. operations completing, even in degraded system states). We're confident that our engineers can implement application level consistency controls much more efficiently than MySQL does generically.

As our system grows, it's important for us to span multiple data centers for redundancy and network performance and to add capacity or replace failed nodes with no downtime. We plan to continue using commodity hardware, and to continue assuming that it will fail regularly. All of this is increasingly difficult with MySQL.

An interesting blog post from Digg's VP of Engineering, briefly describing their needs and chosen solution.

Nothing new for those who follow the NoSQL space, since Digg has been contributing a decent amount of development time to Cassandra and sharing its experience with developers.

Another good thing is that Digg decided to open source everything they do and will do with Cassandra, making it a more solid product.
