You might be able to adjust for contention, allocating more space to write buffers when the disk queue length increases but memory is available.
Maybe scale out and have a separate dedicated server machines, like the BigData solutions? (Hadoop and so on...) Connect multiple machines with a fast fiber optic back plane.
SANs, after all, partition loads in exactly this fashion with LUNs, right? That’s why they can handle more throughput, right?
As far as adjusting for contention, why not use something that partitions the writes based on some natural key in the data?