Unveiling Maya: Sesame's Groundbreaking AI Model and the Future of Voice Assistants
TechCrunch, 17 hours ago

Summary:

  • Sesame releases the CSM-1B model, powering the realistic voice assistant Maya.

  • The model features 1 billion parameters and is licensed under Apache 2.0 for commercial use.

  • Generates RVQ audio codes, a discrete-token audio encoding also used in Google's SoundStream.

  • Lacks significant safeguards, relying on an honor system to prevent misuse.

  • Maya's technology allows for natural speech patterns and can be interrupted during conversation.

Sesame's New AI Model

Sesame, an AI startup, has released CSM-1B, the base model behind its strikingly realistic voice assistant, Maya. The model has 1 billion parameters and is licensed under Apache 2.0, which permits commercial use with few restrictions.

What is CSM-1B?

CSM-1B generates RVQ audio codes from text and audio inputs. RVQ, or residual vector quantization, is a technique for encoding audio into discrete tokens: a stack of codebooks quantizes the signal in stages, with each stage encoding the error left over by the previous one. It is used in several recent AI audio technologies, including Google's SoundStream and Meta's EnCodec.
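
To make residual vector quantization more concrete, here is a minimal Python sketch. It is a toy illustration only, not Sesame's codec or the SoundStream implementation. The function names `rvq_encode` and `rvq_decode` and the hand-picked codebooks are hypothetical. Real codecs learn their codebooks during training and apply them to encoder outputs for every audio frame.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Quantize one vector with a stack of codebooks, one code per stage."""
    residual = np.asarray(x, dtype=np.float64)
    codes = []
    for codebook in codebooks:
        # Pick the codeword closest to what is still unexplained.
        distances = np.sum((codebook - residual) ** 2, axis=1)
        index = int(np.argmin(distances))
        codes.append(index)
        # The next stage only has to encode the remaining error.
        residual = residual - codebook[index]
    return codes

def rvq_decode(codes, codebooks):
    """Reconstruct the vector by summing the selected codewords."""
    return sum(codebook[i] for codebook, i in zip(codebooks, codes))

# Hypothetical codebooks, coarse to fine (a real codec would learn these).
codebooks = [
    np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]),
    np.array([[0.0, 0.0], [0.5, 0.5], [-0.5, 0.5], [0.5, -0.5]]),
    np.array([[0.0, 0.0], [0.25, 0.0], [-0.25, 0.0], [0.0, -0.25]]),
]

frame = np.array([1.3, -0.6])           # stand-in for one audio frame embedding
codes = rvq_encode(frame, codebooks)    # [3, 1, 2]
approx = rvq_decode(codes, codebooks)   # array([ 1.25, -0.5 ])
```

The three small integers in `codes` are the discrete tokens. Each added stage refines the reconstruction, which is why a model can represent audio as short sequences of code indices rather than raw waveform samples.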

Technical Backbone

CSM-1B uses a model from Meta's Llama family as its backbone, paired with an audio decoder component. A fine-tuned variant of CSM powers Maya. The base model can produce a variety of voices but has not been tuned for any single voice.

Limitations and Concerns

Sesame has not disclosed the training data used for CSM-1B. The model also lacks significant safeguards: Sesame relies on an honor system to deter developers from using it to mimic voices without consent or to create misleading content.

Real-World Testing

In a recent demo, cloning a voice took less than a minute, after which generating speech on sensitive topics was easy. The result has raised concerns, echoing warnings from Consumer Reports about the lack of safeguards in popular AI voice cloning tools.

The Vision Behind Sesame

Founded by Brendan Iribe, co-founder of Oculus, Sesame has gained attention for its lifelike assistant technology. Maya and Sesame's other assistant, Miles, speak with natural disfluencies and can be interrupted mid-speech, much as a human conversation partner can.

Sesame is also working on AI glasses designed for all-day wear, equipped with its proprietary models. The company has secured funding from notable investors such as Andreessen Horowitz, Spark Capital, and Matrix Partners.
