Feel free to disagree, but I think you need to improve your knowledge of Facebook's internal systems. There are several papers and presentations on infoq.com and highscability.com if you're so inclined. Both sites also have architecture information on other web-scale companies.
I'm sure at one time hundreds of gigs per day was impressive. A year ago, Facebook had roughly 130TB of logs per day. I'm sure that's gone up.
Once a controlled structure is achieved, housekeeping is pure execution. I think the FB problem is that it was started as a dorm-room operation, not by professional computer guys, and then never upgraded to deal with large amounts of data. I have two guys who spend all day, every day, looking at architecture and DB design. They are worth their weight in gold since they are true system gurus. FB seems to have missed that.
Probably didn't make my point as clear as I could have. In data management, the foundation is everything and worth the initial pain. We start a project today with a very large customer that is still trying to run a relatively complex operation on Access, of all things. Our biggest obstacle is the IT department that wrote the application years ago and has been nursing it ever since. Should be real interesting in a painful sort of way!