Maybe scale out across separate dedicated server machines, like the Big Data solutions do (Hadoop and so on)? Connect multiple machines with a fast fiber-optic backplane.
SANs, after all, partition loads in exactly this fashion with LUNs, right? That’s why they can handle more throughput, right?
As far as adjusting for contention, why not use something that partitions the writes based on some natural key in the data?
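To make the partitioning idea concrete, here is a minimal sketch in Python. It assumes each record carries a natural key and that there is a fixed number of write targets (disks, LUNs, or servers). The field name `customer_id`, the partition count, and the function names are all made up for illustration; they don't come from any particular product.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumption: number of independent write targets


def partition_for(natural_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a natural key to a partition index using a stable hash.

    hashlib is used instead of the built-in hash() because hash() on
    strings is randomized per process, so it would not route the same
    key to the same partition across restarts.
    """
    digest = hashlib.sha256(natural_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def route_writes(records, key_field="customer_id", num_partitions=NUM_PARTITIONS):
    """Group records by target partition (key_field is a hypothetical name).

    Each resulting group could then be handed to its own writer, so
    writers for different partitions do not contend with each other.
    """
    buckets = defaultdict(list)
    for record in records:
        index = partition_for(str(record[key_field]), num_partitions)
        buckets[index].append(record)
    return dict(buckets)


if __name__ == "__main__":
    sample = [{"customer_id": f"C{i:03d}", "amount": i} for i in range(10)]
    for index, group in sorted(route_writes(sample).items()):
        print(index, [r["customer_id"] for r in group])
```

The same key always lands on the same partition, which keeps related writes together. The catch is skew: if a few key values dominate the data, one partition still ends up as the hot spot, so the choice of key matters.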
I think the bottom line is that there is no "one size fits all" answer to the question.