Posted on 10/05/2004 9:24:10 PM PDT by John Robinson
It's always something new with a complex system. We have frontends, backends and databases, not to mention ancillary services like DNS, mail, and internal gadgets. Something is bound to goober up.
A few months ago we were hitting the limits of our database environment. I added hardware and all was good... well, too good. The backends couldn't keep up, so I added hardware (just this weekend!), and all was good... until tonight, when things were once again too good. This time the frontend went on strike, overwhelmed.
A few years ago, when we last looked at the scalability of our site, we made a choice: to save bandwidth, we spend extra CPU cycles compressing server responses. We achieve roughly a 60% savings.
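To make that trade concrete, here is a minimal Python sketch of compressing a response body before it goes out. It assumes gzip-style compression via the standard `gzip` module (the post does not say which compressor the site uses), and both the `compress_response` helper and the sample page are made up for illustration. The ratio on this repetitive sample will not match the roughly 60% quoted above.

```python
import gzip

def compress_response(body: bytes, level: int = 6) -> bytes:
    """Spend CPU to shrink a response body (hypothetical helper)."""
    return gzip.compress(body, compresslevel=level)

def savings(original: bytes, compressed: bytes) -> float:
    """Fraction of bytes saved; 0.6 would mean 60% fewer bytes on the wire."""
    return 1.0 - len(compressed) / len(original)

# Made-up sample page; real forum pages are far less repetitive than this.
page = b"<html><body>" + b"<p>Thanks for getting us back up!</p>" * 200 + b"</body></html>"
packed = compress_response(page)
print(f"{len(page)} bytes -> {len(packed)} bytes, saved {savings(page, packed):.0%}")
```

A higher `level` saves a little more bandwidth at the cost of more CPU per request, which is exactly the toll described here.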
Unfortunately, that takes a dear toll on our two little 933 MHz CPUs running the frontend. That toll looks something like this:
<code type="unix geek">
8:15pm up 2 days, 14:48, 11 users, load average: 214.69, 160.02, 84.25
317 processes: 306 sleeping, 10 running, 1 zombie, 0 stopped
CPU0 states: 12.1% user, 69.0% system, 0.0% nice, 18.0% idle
CPU1 states: 12.0% user, 63.0% system, 0.0% nice, 24.0% idle
Mem: 2064712K av, 2050876K used, 13836K free, 0K shrd, 46816K buff
Swap: 2040244K av, 22900K used, 2017344K free 223376K cached</code>
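For anyone not used to reading this output, the number to watch is the load average. On Linux it roughly counts processes that are running, waiting for a CPU, or stuck in uninterruptible I/O, so a value near 1.0 per CPU already means the box is busy. A small Python sketch of that arithmetic, using only the header line above (the `parse_load_averages` helper is made up for illustration):

```python
import re

TOP_HEADER = "8:15pm up 2 days, 14:48, 11 users, load average: 214.69, 160.02, 84.25"

def parse_load_averages(line):
    """Pull the 1, 5 and 15 minute load averages out of an uptime/top header line."""
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", line)
    if match is None:
        raise ValueError("no load average found")
    return tuple(float(x) for x in match.groups())

CPUS = 2  # the frontend in the post has two CPUs
one, five, fifteen = parse_load_averages(TOP_HEADER)
print(f"1-minute load per CPU: {one / CPUS:.1f}")
```

The 1-minute figure being well above the 15-minute figure also shows the pile-up was still growing when the snapshot was taken.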
Translation:
So... I added more hardware! That explains the second IP address (209.157.64.201; the first is 209.157.64.200). In a few days your ISPs will have our updated DNS, and will automagically select one of the two frontends when you visit www.freerepublic.com.
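As a rough picture of how two addresses spread the load, here is a small Python simulation. It assumes each visitor ends up on one of the two addresses more or less at random, which is a simplification: real resolvers and browsers differ (some rotate the list, some just take the first entry). The `pick_frontend` helper is hypothetical.

```python
import random
from collections import Counter

FRONTENDS = ["209.157.64.200", "209.157.64.201"]  # the two addresses from the post

def pick_frontend(addresses, rng):
    """Mimic a client that picks one of several A records at random."""
    return rng.choice(addresses)

rng = random.Random(2004)  # fixed seed so the run is repeatable
hits = Counter(pick_frontend(FRONTENDS, rng) for _ in range(10_000))
for address, count in sorted(hits.items()):
    print(address, count)
```

Under that assumption each frontend sees about half the visitors, with no load balancer in front of them.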
The new frontend is a dual 1.4 GHz whopper. It, along with its older partner, will have no trouble slinging compressed pages now, and saving roughly 2 grand a month. Oh, and when I said I added more hardware, well, I actually reassigned an older backend to frontend duty. I thought I might have to rearrange machines while I tune the system, so I made it easy ((cough)) to do.
We're running fine now. The peak load I saw was 130 requests per second. I figure we were probably doing 160-180 per second during the debate. No way to know for sure; the fire burned up our logs.
As for the new hardware, I know many people have been asking about it, and how the install went last weekend. I just haven't yet had the time to write what I wanted to write.
In summary, we added three Dell ((cough)) PowerEdge 1750 servers, each with dual 2.8 GHz Intel Xeon processors and 1 GB of RAM. I was really impressed with the Dell machines out of the box; they're mean-looking boxes and have more features than the barebones Supermicro kits I used before. Of course, the rails were too short for my rack and there was no table space to lay them. What else can a guy do but rebuild a rack on a Saturday night/Sunday morning? Ah, but that's for another story to tell.
I have no clue of what you just said .. but thank you for getting us back up and running
MySQL, it's been good to us and has a choice of storage engines, each with different features. We're using the MyISAM storage engine for everything right now, but it's starting to show signs of contention (MyISAM is really good at heavy writing or heavy reading, but not both), so I'll probably start migrating some tables to InnoDB, which handles contention better (and does transactions too).
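For the curious, MySQL switches a table's storage engine with an `ALTER TABLE ... ENGINE=...` statement, which rebuilds the table and so can take a while on big ones. A minimal Python sketch that just generates those statements follows; the table names are hypothetical, not the site's real schema, and `engine_migration_sql` is a made-up helper.

```python
def engine_migration_sql(tables, engine="InnoDB"):
    """Build one ALTER TABLE statement per table (names are quoted with backticks)."""
    statements = []
    for table in tables:
        if "`" in table:
            raise ValueError(f"unsafe table name: {table!r}")
        statements.append(f"ALTER TABLE `{table}` ENGINE={engine};")
    return statements

# Hypothetical table names, for illustration only.
for sql in engine_migration_sql(["posts", "replies"]):
    print(sql)
```

Migrating a few tables at a time, as described above, keeps each rebuild short and makes it easy to back out.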
Oh, the DB is around 30 GB, with another 7 GB of compressed archives which are flat files (actually just HTML.) Those are the older /forum/ URLs you might see from time to time. 2001 and earlier.
Let's have another fund raiser.........last one was too fast !
I understand, a hugh moose bit your sister in the shower.
I was real tempted to install XP and Doom III on one, but I'd still need to acquire the PCI-X vid card and Doom III. Plus I really didn't have the time; if I had, I probably wouldn't have given the machine up. :-)
so, the external clastoid mastoid hyberchronifiar, dismachifnegated the hypostatic ekenosinator, thereby vitiatimating the vortocuticlastical shmagtrofinator...right?
:o)
This is kinda scary I actually understood what you said.
Thanks for being so speedy on the repair.
Thanks for the info. A 30gb db. The biggest single db of the many sybase servers I work with is a mission-critical one about 45 gb, but we have bigger ones in Oracle, which I don't work with. Interesting you have MySQL for a large db and for such a heavily hit application (FR). I'm surprised but impressed. I have to do some research on MySQL dbs.
Wow! That is some load. I have only seen that on an overloaded multinode Nagios monitoring system with 7,000+ checks. But this shows the power of Unix/Unix-like OSes. They might get overloaded but they don't go down. Throw in a little more hardware and a Foundry switch to balance it out and you are golden.
No. What's really scary is that I understood what tame was saying. : )
Thanks for getting it fixed in short order (and offering alternate server addresses in the meantime).
:o)
:) jabberwocky
That looks fun. I have no experience with AMD but have been meaning to give them a try.
My goal is to memcache everything, to hit the database for as little as possible. Right now the two frontends have 2 GB of memory each, with a 1 GB memcache on each. The backends have 1 GB of memory each, with a piddly 256 MB memcache on each. The database machines have 3 GB of memory all to themselves.
The only disks that matter are the DB's; they're 15K RPM 74 GB SCSIs for the data, one on each server. I have another 15K RPM 36 GB SCSI for the O/S and log files and some of the tables (just to balance load.)
Oh, and this guy (PDF) is my hero.
The system is capable of syncing across the Internet. In fact, I sync the database over an SSH-forwarded port, for backup and development. However, we rely on a master write database, which means any child node would need to communicate with the master database to store anything. Site redundancy isn't yet an option. I can address that with a complete rewrite of the database and software, including some of the concepts we know and love, which is something I both look forward to doing and dread.
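To show why a single master write database gets in the way of a second site, here is a rough Python sketch of query routing under that design. It is only an illustration: the post does not describe how the software actually routes queries, and the `Router` class and the host names `master-db` and `replica-db` are made up.

```python
class Router:
    """Send writes to the single master, reads to the local copy (sketch only)."""
    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")

    def __init__(self, master, local):
        self.master = master
        self.local = local

    def target_for(self, sql):
        """Pick a database by looking at the first word of the statement."""
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        return self.master if verb in self.WRITE_VERBS else self.local

router = Router(master="master-db", local="replica-db")
print(router.target_for("SELECT * FROM posts"))
print(router.target_for("INSERT INTO posts VALUES (1)"))
```

Reads can stay local, but every post or reply still has to cross the Internet to the master, so a remote site goes read-only whenever that link is down.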
Thanks for that warning.
Speaking of cooling, those Dells have 7 screaming fans. They really blow. And they're REALLY LOUD! Multiplied by 3. They sat behind me for the duration of their configuration, about 10 days. You will have to speak louder, I'm now partially deaf.
(And I thought my home file server was noisy with 4 HD coolers and misc fans.)
Sure could hear the quiet when the breakers blew. (The breakers blew twice here at home when I was compiling Gentoo on all three boxes. My 650 watt UPS powered them along for literally 5 seconds before it gave up in disgust. LOL, that really sucked!)
That might do it.
Thanks!
Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.