What can I say? What I want to do involves clusters, and it's not a web app. How do you propose to handle four servers running at heavy load (pushing lots of data and probably hogging the processor), plus the other three servers and client, plus host OS, on four cores?
It could technically work, and would probably work acceptably on extremely small test datasets, maybe a few tens of megabytes. That would get me through early development. But I need to test and optimize this for realistic datasets, like hundreds of gigabytes. I don't want to wait all day.
I’m not proposing that you do it any differently, just reiterating that eight 2.8GHz cores is, in fact, more horsepower than most people, even most developers, can use. If you can use it all, more power to you.