It would be different in the sense that you would first need to associate a particular sound with a particular key on a keyboard. Then you'd have to piece it all together to make sense of the number and/or letter combinations. It would be like breaking a code.
i.e., A = sound pattern x, B = sound pattern y, and so on.
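Here is a rough sketch of that code-breaking idea, just to make it concrete. It assumes each keystroke sound has already been boiled down to a short list of numbers (a "feature vector"). That step is not shown, and the numbers below are made up. The function names (`train_key_profiles`, `guess_key`, `decode`) are hypothetical, and nearest-average matching is only one simple way to do the lookup.

```python
import math
from collections import defaultdict

def train_key_profiles(samples):
    """Average the feature vectors recorded for each key.

    samples: list of (key, feature_vector) pairs. The feature vectors are
    assumed to come from some audio analysis step that is not shown here.
    """
    grouped = defaultdict(list)
    for key, vec in samples:
        grouped[key].append(vec)
    return {
        key: [sum(column) / len(vecs) for column in zip(*vecs)]
        for key, vecs in grouped.items()
    }

def guess_key(profiles, vec):
    """Return the key whose averaged sound pattern is closest to vec."""
    return min(profiles, key=lambda k: math.dist(profiles[k], vec))

def decode(profiles, recording):
    """Turn a sequence of keystroke feature vectors into a string of keys."""
    return "".join(guess_key(profiles, vec) for vec in recording)

# Made-up training data: two recorded sounds per key.
training = [
    ("a", [0.9, 0.1]), ("a", [1.1, 0.3]),
    ("b", [0.1, 0.9]), ("b", [0.3, 1.1]),
    ("c", [2.0, 2.0]), ("c", [2.2, 1.8]),
]
profiles = train_key_profiles(training)
print(decode(profiles, [[2.1, 1.9], [1.0, 0.2], [0.2, 1.0]]))  # prints "cab"
```

With real recordings the patterns for different keys would overlap a lot more than in this toy data, so you would still need the "piece it all together" step to sort out the ambiguous guesses.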
Get 100 people to type the same document on 100 different keyboards and map the audio to the words.
The bigger the sample size, the better the recognition. Then adjust for typing speed.
That way you are mapping the sound of the whole word, not the sound of each individual key.
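A minimal sketch of the whole-word approach, under some big assumptions: each typed word is represented only by the time gaps (in seconds) between its keystrokes, and "adjust for speed" is done by scaling those gaps so they add up to 1. A real system would use much richer audio features. The function names and all the numbers are made up for illustration.

```python
import math
from collections import defaultdict

def normalize_speed(gaps):
    """Scale the gaps between keystrokes so they sum to 1.

    This removes overall typing speed, so a fast and a slow typist with
    the same rhythm end up with the same pattern.
    """
    total = sum(gaps)
    if total == 0:
        return list(gaps)
    return [g / total for g in gaps]

def build_word_templates(recordings):
    """Average the speed-adjusted patterns of every typist, per word.

    recordings: list of (word, gaps) pairs collected from many typists.
    """
    grouped = defaultdict(list)
    for word, gaps in recordings:
        grouped[word].append(normalize_speed(gaps))
    return {
        word: [sum(column) / len(rows) for column in zip(*rows)]
        for word, rows in grouped.items()
    }

def guess_word(templates, gaps):
    """Return the known word whose pattern is closest, or None if no
    known word has the same number of keystrokes."""
    probe = normalize_speed(gaps)
    candidates = {w: t for w, t in templates.items() if len(t) == len(probe)}
    if not candidates:
        return None
    return min(candidates, key=lambda w: math.dist(candidates[w], probe))

# Made-up recordings: two typists per word, typing at different speeds.
recordings = [
    ("the", [0.10, 0.30]), ("the", [0.05, 0.15]),
    ("and", [0.30, 0.10]), ("and", [0.60, 0.20]),
    ("cat", [0.20, 0.20]), ("cat", [0.10, 0.10]),
]
templates = build_word_templates(recordings)
print(guess_word(templates, [0.20, 0.60]))  # slow typist, prints "the"
```

More typists per word means the averaged template is less thrown off by any one person's habits, which is the point about sample size.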