Imagine a world where even the walls have ears, capturing every whispered secret or heartfelt confession. Now picture those very walls not only listening but also echoing your voice with eerie precision. Voice cloning turns this surreal concept into reality, creating a sonic twin that captures every nuance of your speech, whatever its pitch or cadence. Yet with this power comes a web of legal and ethical challenges, as the technology dances between innovation and accountability. Around the globe, public figures such as Drake, President Joe Biden, and Emma Watson have fallen victim to the misuse of AI. More recently, artificial intelligence made its debut in the Lok Sabha polls, where the voices of Prime Minister Narendra Modi and Rahul Gandhi were cloned, a reflection of its unethical use. This article navigates the murky legal waters of voice cloning while spotlighting the elusive ethical debate. From the genesis of voice cloning to its far-reaching consequences, it paints a panoramic view of this technological landscape.
Artificial Intelligence & Voice Cloning
AI is evolving at a rapid pace, and the technology behind it is outpacing lawmakers' ability to govern it. AI has not yet been conclusively defined, but the Cambridge Dictionary offers a working definition: “the use of computer programs that have some of the qualities of the human mind, such as the ability to understand language, recognize pictures, and learn from experience”. Generative AI tools such as OpenAI’s Voice Engine, Microsoft’s VALL-E, Google’s TTS, and IBM’s Watson Text to Speech use methods such as deep learning to create artificial voices. Deep learning goes beyond traditional neural networks by stacking many layers of computing units, known as neurons, into a single network. This layered approach loosely mirrors how our brains process information. What sets it apart is its ability to correct itself during training: the network adjusts its internal parameters based on its own errors, without step-by-step human intervention, which has enabled breakthroughs in image and audio processing. The development of deep learning allowed Text-to-Speech and Speech-to-Speech systems to mimic the human voice with great precision by processing audio recordings. These systems meticulously mirror human vocal qualities, achieving a strikingly human-like result. The output is often so accurate that listeners struggle to distinguish the artificial voice from a real human speaker.
Can Voice Cloning Be Copyrighted?
AI-generated voice suffers from the “black-box” dilemma, which makes it difficult even for the programmer to anticipate the output. It is therefore extremely difficult to trace the originality of the voice, identify its author, and establish its fixation. It is equally difficult to establish the relationship between the author and the user of the copyright in order to ascertain royalty obligations. Thus, because listeners struggle to tell a cloned voice from the original, voice cloning raises significant questions about its legality and about whether its output can be copyrighted at all.
Indian copyright law protects original works of authorship fixed in a tangible medium of expression. As held in R.G. Anand v. M/s Delux Films & Ors., only an “original” work can be copyrighted, not ideas or discoveries. This raises the question of whether an AI-generated voice can be copyrighted. To qualify, an AI-generated voice would have to satisfy the three essentials of copyright law: (1) originality, (2) work of an author, and (3) fixation in a tangible medium. In Midler v. Ford Motor Co., the American court declined to extend copyright law to the imitation of a voice, reasoning that a voice is not “fixed”. Further, in Butler v. Target Corp., the American court held that while the lyrics to a song are copyrightable, the underlying voice is not.
Indian courts have followed a similar approach. They have ruled that while establishing copyright in a voice is difficult under the Copyright Act, 1957, the voice and other personality rights can still be protected against misuse for commercial gain. Recently, the Delhi High Court granted an omnibus ex parte injunction restraining the concerned entities, and the world at large, from using the voice, image, and persona of Anil Kapoor for any commercial gain without his consent.
Thus, it is safe to conclude that while copyrighting a “voice” is difficult, individuals have the right to claim protection of their personality rights and privacy against any misuse without their consent.
The Ethical Debate
AI-generated voice raises significant questions about its ethical use. Recently, voice cloning has been used by scammers and cybercriminals who have violated existing copyrights, especially in the media and entertainment industry, by mimicking the voices of singers and actors. The advent of this technology also raises serious concerns about the privacy of individuals, reviving the debate between innovation and vulnerability. AI voice cloning is a complex process that requires careful data collection and preprocessing, and misuse of that process can have legal consequences. Further, voice recordings originally obtained for lawful reasons might later be used for illicit purposes, including fraud or even blackmail. Thus, this clumsy legal-tech landscape calls for legal safeguards against the unethical use of voice cloning, while protecting the interests of individuals against misappropriation and misuse.
Conclusion
Both in India and worldwide, the legal landscape surrounding artificial intelligence, especially voice cloning, remains ambiguous. Current legislation lacks specificity on intellectual property rights in voices, creating opportunities for misuse such as “deepfakes.” The only way to control this technological marvel, therefore, is to create regulation that, first, governs the technique and, second, promotes the bona fide use of AI. Further, rules must be framed to allow an AI-generated voice to be copyrighted only with the consent of the person whose voice is mimicked.
Authors: Saurojit Barua