Posted on 07/07/2011 8:55:49 PM PDT by TenthAmendmentChampion
According to database pioneer Michael Stonebraker, Facebook is operating a huge, complex MySQL implementation equivalent to "a fate worse than death," and the only way out is to bite the bullet and rewrite everything.
Not that it's necessarily Facebook's fault, though. Stonebraker says the social network's predicament is all too common among web startups that start small and grow to epic proportions.
During an interview this week, Stonebraker explained to me that Facebook has split its MySQL database into 4,000 shards in order to handle the site's massive data volume, and is running 9,000 instances of memcached in order to keep up with the number of transactions the database must serve. I'm checking with Facebook to verify the accuracy of those numbers, but Facebook's history with MySQL is no mystery.
The oft-quoted statistic from 2008 is that the site had 1,800 servers dedicated to MySQL and 805 servers dedicated to memcached, although multiple MySQL shards and memcached instances can run on a single server. Facebook even maintains a "MySQL at Facebook" page dedicated to updating readers on the progress of its extensive work to make the database scale along with the site...
(Excerpt) Read more at gigaom.com ...
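For anyone who hasn't run a setup like that, here's a minimal sketch of how hash-based sharding plus a memcached look-aside cache typically works. This is plain Python; the 4,000 shard count is the figure from the interview, while the function names and connection handling are purely illustrative, not Facebook's actual code.

import hashlib

NUM_SHARDS = 4000  # shard count cited in the interview; purely illustrative here

def shard_for(user_id):
    # Map a user id to one of NUM_SHARDS MySQL shards (hypothetical scheme).
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def get_user(user_id, cache, shards):
    # Cache-aside read: try memcached first, fall back to the owning shard.
    key = "user:%d" % user_id
    row = cache.get(key)                 # e.g. a python-memcached client
    if row is None:
        db = shards[shard_for(user_id)]  # connection to the shard owning this user
        row = db.fetch_user(user_id)     # hypothetical query helper
        cache.set(key, row, time=300)    # keep it hot for five minutes
    return row

The pain the article describes lives exactly in this layer: once the data is spread across thousands of shards, every cross-user query, schema change, and rebalancing job has to be handled in application code rather than by the database.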
BTTT
Thanks for the trip down memory lane.
I'll bet it is, but then it's fairly straightforward and relatively small. You have users, threads, posts, mails and the connections between them. Everything's text, no BLOBs. We also have maybe a few hundred thousand users at most, not several hundred million. We probably have tens of millions of records in the "posts" table, but that's not a problem with modern hardware.
Google probably runs the largest commercial database, but it's highly customized. The proprietary Linux modifications and the database itself were all designed for that workload. It's highly distributed by core design, and fault tolerant in that it doesn't care if some search data is lost, since it will be rebuilt on later crawls.
Does the company you work for understand that fixing this problem cannot wait until the system breaks? That it takes time to move stuff over to a more robust system?
I ask because I’ve been in a similar situation, and the company wouldn’t listen when I told them they needed to pay attention and start looking at upgrades now, rather than at the last minute.
Just a bit of warning: writing a check for Standard or Enterprise doesn't just mean you get to access more memory and CPUs and can have bigger databases. It's much better than in the MSDE days, but there are still other differences that can affect how you design your database. For example, off the top of my head, the full versions also give you indexed views, transparent data encryption (so you don't have to encrypt things yourself), table and index partitioning, and Database Mail. Luckily, you can now get full-text search in the Express edition too if you download the version with Advanced Services.
Still, it's a much smaller jump than moving from mySQL to a top-line system. FTR, I don't hate mySQL, since I've used it quite a bit on small systems I know won't get big. I just wouldn't trust it with the multiple terabytes I've managed on other systems.
BTW, I still have links for some gopherspace that is still out there and running.
Translation?
They have to demolish the building and put up another which looks exactly the same but which won’t spontaneously implode like the first one is about to.
Weird Al is a genius:
And postin' "Me too!" like some brain-dead AOL-er
I should do the world a favor and cap you like Old Yeller
You're just about as useless as jpegs to Helen Keller
Indeed, it does understand this, but it’s not of great concern, because the company will most likely be closing down before the end of the year due to eBay’s pricing changes. Unfortunately, we’ll also be taking several other companies down with us when we stop buying from them.
Okay, whip it out...
I had one with 20 million+ main records, but it was all text, so it was only a few hundred gigabytes. Tweaking the indexes (regular and full-text) for fast searching was the key there. Had another that used lots of BLOBs and ran to terabytes across over a million records. That one just sucked. I left before SQL Server 2008 became available with its ability to hold BLOBs outside the database files (FILESTREAM).
But then I have friends who have dealt with a lot more than that. One ran a mainframe with two of those robotic tape silos. He’s probably juggling petabytes by now. There’s always a bigger fish.
“Regular inner join, or do you plan to get kinky with a left outer join?”
It’s got 2 Americas and change worth of users. Somehow I don’t think mySQL is holding them back very much.
Wouldn't it be fun if we found out Facebook was running RAID 5 or 6?
LOL. I really like that.
Never used RAID 6. How does that differ from 5?
It's basically RAID 5 with a second, independent parity block per stripe, so the array can survive two simultaneous drive failures. That's supposed to eliminate the exposure to data loss while a RAID 5 array is rebuilding after losing a drive. Basically, with RAID 6 and a hot spare, odds are you'll always have at least one level of redundancy.
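To put rough numbers on the tradeoff, here's a quick back-of-the-envelope sketch in plain Python (the drive counts and sizes are made up for illustration):

def raid_summary(total_drives, drive_tb, level):
    # Usable capacity and failure tolerance for RAID 5 vs. RAID 6.
    parity = {5: 1, 6: 2}[level]   # RAID 5 spends one drive's worth on parity, RAID 6 two
    usable = (total_drives - parity) * drive_tb
    return usable, parity          # parity count == drives you can lose and keep running

for level in (5, 6):
    usable, tolerance = raid_summary(total_drives=8, drive_tb=2, level=level)
    print("RAID %d: %d TB usable, survives %d failed drive(s)" % (level, usable, tolerance))

# RAID 5: 14 TB usable, survives 1 failed drive(s)
# RAID 6: 12 TB usable, survives 2 failed drive(s)

The cost of the second parity block is one more drive's worth of capacity; the payoff is that a second failure (or an unrecoverable read error) during a rebuild doesn't take out the array.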
Bingo!
Your *ENTIRE* reply hit the nail on the head.
I’ve seen (more often heard than seen) this type of issue occur when forethought and growth are not engineered in at the beginning.
Wow.
Yet another person thinks like I do.
(I’m not fond of C++ either. Give me “C”. Pure and powerful K&R “C”.)