Wired In: Paris Smaragdis

Each week, staff writer PAUL WOOD chats with a high-tech difference-maker. This week, meet PARIS SMARAGDIS, an associate professor in the CS and ECE departments at the University of Illinois. His primary research interests revolve around making machines that can listen. He has worked on signal processing, machine learning and statistics as they relate to artificial perception, and in particular computational audition. He also loves working on anything related to audio. The bulk of his audio work centers on source separation and on machine learning approaches to traditional signal processing problems.

You have a music degree. Is that what got you into machine listening at the MIT Media Lab?

My first degree was in electronic music, and I was always very much interested in sound. The hot subject at the time was automatically generating sounds in software and tricking listeners into thinking they were real instruments. It was a time, very much like today with artificial intelligence, when people were concerned that the performing musician would be replaced by machines (it didn't happen). But while working on that, I came to increasingly appreciate the sophistication of the human hearing system, and I experimented with making software that tried to understand sound, as opposed to generating it. Professor Barry Vercoe, who was heading the machine listening group at MIT, had a similar trajectory (initially a composer, but by then working on making machines understand music), so it felt like a natural place for me to go to grad school. Thankfully, he did too!

How do you teach computers how to hear?

Like many similar fields in computational perception (computer vision, olfaction, etc.), we have gone through multiple transformations. In the early days, we relied on the physiology and psychology literature to hand-construct systems that encapsulated all that knowledge. I was part of a younger generation that was more attracted to data-driven methods, like the ones fueling the recent deep-learning revolution. Instead of laying out rules, we present our systems with lots of sounds and the actions they can take in response, and allow them to figure out on their own how to behave when hearing sounds. This lets us sidestep a very important stumbling block: the fact that we have no idea how to define the act of hearing! Unlike computer vision, where most people can articulate why a cat looks different from a violin, when it comes to listening we do not have that much clarity (can you articulate exactly how a cat sounds different from a violin?). This forced us early on, as a community, to be a bit more abstract in our thinking and to rely more on example-driven methods as opposed to reasoning about things.
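To make the rule-based-versus-data-driven contrast concrete, here is a toy sketch (not any system from Smaragdis's lab; all signals, class names and parameters are synthetic stand-ins) of example-driven listening: rather than writing rules for what each sound "is," we average the spectra of labeled examples and classify new sounds by spectral similarity.

```python
import numpy as np

# Toy sketch of example-driven listening: instead of hand-writing rules
# for how each class sounds, we learn an average spectrum from labeled
# examples and classify new sounds by nearest spectral template.
# Everything here is synthetic, standing in for real recordings.

rng = np.random.default_rng(1)
sr, n_fft = 8000, 256  # sample rate and analysis frame length

def spectrum(signal):
    """Average magnitude spectrum over non-overlapping short frames."""
    frames = signal[: len(signal) // n_fft * n_fft].reshape(-1, n_fft)
    return np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0)

def make_sound(f0, seconds=1.0):
    """Synthetic harmonic tone plus noise, a stand-in for a recording."""
    t = np.arange(int(sr * seconds)) / sr
    tone = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in (1, 2, 3))
    return tone + 0.1 * rng.normal(size=t.size)

# "Training": one average spectral template per labeled class.
classes = {"low_hum": 110.0, "high_whine": 880.0}
templates = {
    name: np.mean([spectrum(make_sound(f0)) for _ in range(5)], axis=0)
    for name, f0 in classes.items()
}

def classify(signal):
    """Label a new sound by its closest learned spectral template."""
    s = spectrum(signal)
    return min(templates, key=lambda name: np.linalg.norm(s - templates[name]))
```

A call like `classify(make_sound(110.0))` picks the class whose learned spectrum is closest, without anyone ever having written a rule describing that sound.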

What are the challenges?

For small-scale tasks this works well; e.g., it's not that hard to make systems that listen for car accidents and alert 911, smart sensors that look for abnormal breathing, or smart radios that constantly scan the airwaves for your favorite music. On the other hand, we are still far from making sophisticated listening systems that can comment on the performance nuances of a world-class violinist.

Machine listening systems that you developed are widely deployed at traffic intersections in Japan, to automatically detect traffic accidents. Some of your technology appears in Adobe's Premiere and Audition products. Do you have patents on these?

Yes, plenty. Before I became an academic I spent a few years working in corporate research, and I still closely collaborate with many colleagues in the industry. This has resulted in more than 30 US patents (and some more internationally), but more importantly it was a great way to make products from some of the research that we do here and to introduce it to the outside world.

Do you have a start-up, or do you plan to do one?

Maybe one day I'll have the time for it!

What made you come to Urbana?

My wife got hired here as a professor, and after spending some years apart we decided that something had to be done one way or the other. Given that UIUC has had a long history when it comes to research on sound, it felt natural for me to come here.

What have you done in your work with source separation and signal processing, and what does this do for us in our daily lives?

Source separation is the task of extracting a single sound source from a mixture of many, e.g. isolating your partner's voice from a loud cocktail-party recording. This problem is at the heart of machine listening, since we mostly encounter sounds in mixtures and are usually interested in focusing on one sound at a time. I'm mostly known for introducing a few of the popular machine-learning approaches to solving this problem. Some of these methods have been put to wide use: helping cell phones and game consoles pick up your voice and not the noisy background, making acoustic medical devices that can focus on a fetal heartbeat and not the mother's, making smarter hearing aids that help people focus on their friends and not on other distractions, extracting only the dialogue from old movies so the music and sound effects can be replaced with newer recordings, and so on. From a more abstract viewpoint, these methods are applicable to other types of signals (e.g. communications or chemical analyses), and they are used in those settings as well.
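For readers curious what a machine-learning approach to separation can look like, below is a minimal sketch of supervised non-negative matrix factorization (NMF) applied to spectrograms, one family of methods Smaragdis helped popularize. This is not his actual implementation: the data is synthetic, and the update rules are the standard multiplicative ones. A magnitude spectrogram V is factored as V ≈ WH, where columns of W are spectral templates and H holds their activations over time; templates learned from each source in isolation are then used to pull that source out of a mixture.

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf(V, rank, iters=200, W=None):
    """Multiplicative-update NMF of V ≈ W @ H. If W is given, it is
    held fixed and only the activations H are learned."""
    fixed_W = W is not None
    if W is None:
        W = rng.random((V.shape[0], rank)) + 1e-3
    H = rng.random((W.shape[1], V.shape[1])) + 1e-3
    eps = 1e-9
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ (W @ H) + eps)
        if not fixed_W:
            W *= (V @ H.T) / ((W @ H) @ H.T + eps)
    return W, H

# Two synthetic "sources", each built from its own spectral templates.
W1, W2 = (np.abs(rng.normal(size=(64, 2))) for _ in range(2))
H1, H2 = (np.abs(rng.normal(size=(2, 100))) for _ in range(2))
V1, V2 = W1 @ H1, W2 @ H2
mix = V1 + V2                      # the observed mixture spectrogram

# Supervised separation: learn templates from isolated examples of each
# source, then explain the mixture with the concatenated templates.
Wt1, _ = nmf(V1, rank=2)
Wt2, _ = nmf(V2, rank=2)
W_all = np.hstack([Wt1, Wt2])
_, H = nmf(mix, rank=4, W=W_all)   # only activations are learned here

# Wiener-style masking assigns each source its share of the mixture.
recon1, recon2 = Wt1 @ H[:2], Wt2 @ H[2:]
total = recon1 + recon2 + 1e-9
est1 = mix * (recon1 / total)      # estimate of source 1
est2 = mix * (recon2 / total)      # estimate of source 2
```

By construction the two estimates are non-negative and add back up to the mixture; in a real system the magnitudes would come from a short-time Fourier transform of audio, and the estimated spectrograms would be inverted back to waveforms.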

What's your best advice for someone who's starting up in technology?

I believe that you need to be passionate about what you want to do with technology, and not just passionate about technology itself. The field of computing is sufficiently mature by now that it should primarily be used to improve our lives. That means students in technology-oriented departments should be motivated to solve real problems that impact real people.

What's in the future for your research?

A lot of our lab's work is driven by students; my job is to inspire them, help them develop a vision, and then set them off on a productive course. This means that I do not have a lot of control over where our research heads. And if I did know what lies ahead, it wouldn't really be research!

Do you have a favorite thing to follow on social media, or an app you really love?

I have a soft spot for weather apps. I grew up in a place with very predictable weather, and after moving here I felt that I needed help to minimize weather surprises!

On Facebook I follow ... just good friends to keep up with the news.

Book or Kindle? What are you reading right now?

Is iPad an option? I don't think I have accepted it yet, but I do most of my reading on screen and not on paper. Unfortunately, most of the things I read lately are technical publications.

Do you have any wearable electronics?

I got an Apple Watch recently (I'm hard at work trying to make it understand sounds), and I've been fascinated with hearables technology.

Do you have an entrepreneur hero?

Not really, I usually like the innovators behind the scenes.