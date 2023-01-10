

VALL-E Microsoft AI Model

Microsoft researchers on Thursday announced the company’s new text-to-speech AI model, called VALL-E. It can simulate a person’s voice, and it needs only a three-second audio sample to do so.

Once it captures a specific voice, it can synthesize audio of that person saying anything. Unlike other AI tools, VALL-E can also replicate the speaker’s tone and emotions.

The AI model could be used for high-quality text-to-speech apps. It could also be used for speech editing. For instance, a recording of a person could be altered by editing its text transcript, so that the audio has the person saying something they never actually said.

Microsoft shared voice samples, and they range in quality. Some sound natural, while others sound robotic.

AI tends to get better over time, so generated recordings will likely become more convincing in the future.

Impressive But…

The technology is impressive, but it raises several ethical concerns. These will only grow as AI becomes more powerful and models like VALL-E become more convincing.

For example, if an attacker creates a voice that closely mimics that of a person, they could use it to impersonate that person and deceive others, such as by making fraudulent phone calls or recording fake audio messages.

This could have serious consequences, particularly if the person whose voice is being imitated is a high-profile individual or holds a position of authority.

Another concern is the potential for invading someone’s privacy. For example, if an AI system is trained on an individual’s voice without their consent, it could be used to create a replica of their voice without their knowledge or approval.

Additionally, as the technology becomes more accessible, it could be used for harassment or bullying.

For example, someone could use a voice-cloning AI to impersonate a person and make threatening or abusive phone calls.

Lastly, the use of voice cloning to manipulate public opinion or push a political agenda is a major concern. An AI-generated voice could be used to impersonate political figures and deliver false or misleading statements to the public, which could have significant real-world implications.

Overall, the ability to clone a person’s voice creates new opportunities for deception and manipulation, and it raises questions about consent and privacy. It is important that we consider these ethical concerns as the technology develops and becomes more widely used.

Will Voice Actors Lose Work?

VALL-E is not yet available. However, it’s possible that this technology could replace voice actors. Recently, we reported on Apple’s announcement of a feature that uses AI to read audiobooks. If this becomes mainstream, some voice actors could lose their jobs.

Then again, AI is unlikely to fully replace the human touch, creativity, and artistry that professional voice actors bring.

The voice acting industry also involves other important work, such as recording and post-processing, directing, and selecting the appropriate voice for each context.

