Anyone Can Make Beautiful Music with #ProjectKazoo

Adobe Research Scientist Zeyu Jin transforms voice recording into realistic, organic instrumental music on stage at Adobe MAX 2018.

by Adobe Communications Team

posted on 12-12-2018

Music makes us feel things, from the depths of sorrow to the euphoria of great joy. It stirs us to action, comforts our pain and becomes an intimate backdrop to our life experiences. We all have music inside of us, and Zeyu Jin, a research scientist at Adobe, is on a mission to let it out.

He developed #ProjectKazoo, an Adobe sneak technology that allows anyone to compose and create realistic, natural-sounding instrumental music simply by using their own voice, with no musical training or vocal talent required. If you can imagine it, you can create it without having to learn how to play an instrument, read music, or even sing on key. “My goal is to allow anyone to create music easily by singing small passages into a microphone,” Zeyu says.

The core functionality of #ProjectKazoo uses artificial intelligence (AI) powered by Adobe Sensei to generate instrumental sounds from recorded voice, but several technologies are brought together to make it easier to use.

First, #ProjectKazoo uses standard voice recording and auto-tune technology to correct for pitch. You simply hit record and sing a few bars of music. The magic then comes from the way #ProjectKazoo transforms the original voice recording into different instrumental sounds, such as a violin, a cello, or even a saxophone. The result is a much more organic, natural sound than a typical musical synthesizer produces.
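For readers curious about what "correcting for pitch" means in practice, here is a minimal sketch in Python. It is not Adobe's implementation: it assumes the simplest form of pitch correction, snapping a sung frequency to the nearest note of the standard 12-tone scale, and the A4 = 440 Hz reference and the function name are illustrative assumptions.

```python
import math

# Assumed reference tuning: A4 = 440 Hz (a common convention, not from the article).
A4_HZ = 440.0


def snap_to_semitone(freq_hz: float) -> float:
    """Return the equal-tempered note frequency closest to freq_hz."""
    # Distance from A4 measured in semitones, rounded to the nearest whole step.
    semitones_from_a4 = round(12 * math.log2(freq_hz / A4_HZ))
    # Convert the whole-semitone offset back to a frequency in Hz.
    return A4_HZ * 2 ** (semitones_from_a4 / 12)


if __name__ == "__main__":
    # A slightly sharp A4 and a slightly flat middle C, both pulled onto the scale.
    for sung in (445.0, 260.0):
        print(f"{sung:.1f} Hz -> {snap_to_semitone(sung):.2f} Hz")
```

A real auto-tune system would also need to detect the pitch of the voice and resynthesize the audio at the corrected pitch; this sketch covers only the "which note did you mean" step.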

“We’re not just matching a synthetic note to the pitch of your voice,” explains Zeyu. “We’re using machine learning to take snippets from an actual violin performance (for example) and piecing them together to follow and approximate the pitch of your voice. Essentially, we’re using parts of someone else’s performance to generate your own.” That means it’s possible to do more than generate notes. It’s possible to generate unique instrumental sounds, such as a violin speaking or laughing (as shown on stage at MAX).
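The idea of piecing together snippets of a real performance to follow a voice can be illustrated with a toy sketch in Python. This is not how #ProjectKazoo works internally: Zeyu describes a machine learning approach, while the sketch below swaps in a plain nearest-pitch lookup. The Snippet type, the function names, and the sample pitches are all hypothetical.

```python
import math
from typing import List, NamedTuple


class Snippet(NamedTuple):
    """A hypothetical slice of a recorded instrumental performance."""
    name: str        # illustrative label for the slice
    pitch_hz: float  # pitch measured for that slice


def semitone_distance(a_hz: float, b_hz: float) -> float:
    """Absolute pitch difference in semitones (pitch is perceived logarithmically)."""
    return abs(12 * math.log2(a_hz / b_hz))


def pick_snippets(target_contour_hz: List[float], library: List[Snippet]) -> List[Snippet]:
    """For each frame of the sung pitch contour, choose the closest-pitched snippet."""
    return [
        min(library, key=lambda s: semitone_distance(s.pitch_hz, target))
        for target in target_contour_hz
    ]


if __name__ == "__main__":
    violin = [Snippet("a3", 220.0), Snippet("a4", 440.0), Snippet("e5", 659.25)]
    sung_contour = [230.0, 450.0, 600.0]  # made-up voice pitches, one per frame
    print([s.name for s in pick_snippets(sung_contour, violin)])
```

A practical system would also have to smooth the joins between snippets and account for timing and loudness, which is where a learned model can do far better than a simple lookup like this one.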

Zeyu Jin, Adobe research scientist, shows comedian, actress and writer Tiffany Haddish how #ProjectKazoo can generate cello notes that mimic her voice.

For Zeyu, #ProjectKazoo is a unique expression of personal and professional research interests. Long before he turned to computer science and coding, he was composing music. “I started learning piano when I was four years old and became interested in composition as a teen. I’ve written about 320 compositions and actually produced two albums,” he explains.

So how does a young composer become a computer scientist? As a young adult, Zeyu had an opportunity to attend one of the most prestigious universities in China. They didn’t offer a degree in music, so Zeyu turned to computer science. “I thought of it as a compromise at the time. But it ended up being the best choice—I found computer science to be fascinating and equivalently beautiful. Coding, in many ways, is composition,” he notes.

Given his background, it’s no surprise that although #ProjectKazoo makes it easy for anyone to compose instrumental music, Zeyu envisions more professional applications in the future. “When I was a composer, I often found it difficult to create the right sound using a synthesizer or other MIDI device. And of course, no one composer can learn to play each and every instrument with fidelity. Software like #ProjectKazoo can make it easy for professional musicians and composers to explore and iterate as well. Deep learning has the potential to create new forms of experience and spontaneous performance as we improve the technology and make it more accessible,” he says.

This story is part of a series that will give you a closer look at the people and technology that were showcased as part of Adobe Sneaks. Read other Peek Behind the Sneaks stories here.
