This is becoming very obvious.
A few weeks ago I asked ChatGPT how many seconds of voice sample are needed to construct a model good enough to fool most Voice ID schemes. The answer that came back was “3 to 5 seconds,” although there are countermeasures that can be taken to make voice faking more difficult. So, the obvious conclusion is that you can’t talk to anyone, and you’ve got to use a synthesized voice on your answering message as this unfolds. Your very simple “Not here, leave a message at the beep” could become part of an identity theft.
So, yesterday I had another idea. Someone will start licensing the vocal tract models of popular actors and singers. Want ChatGPT (or one of the others) to talk to you like your favorite hot actor/actress? There will be a voice model available for that. Actors and singers will have their vocal tract parameters copyrighted (why give them away for free?).
There are already fake AI bands; Rick Beato talked about this:
https://www.youtube.com/watch?v=3Nlb-m_vKYM
“So, the obvious conclusion is that you can’t talk to anyone”
Or around any phones, period. The mics are always on, so any phone that can hear you will be recording and collecting, and those databases, connected to your device identification, can be hacked...