Research on Learning Speech Representations Intern

Date: Dec 31, 2020

Location: Barcelona, ES

Company: Dolby Laboratories, Inc.

At Dolby, science meets art, and high tech means more than computer code. As a member of the Dolby team, you’ll see and hear the results of your work everywhere, from movie theaters to smartphones. We continue to make a difference in how people create, deliver, and enjoy entertainment worldwide.


To do that, we need the absolute best talent. Are you the one? 


We’re big enough to give you all the resources you need, and small enough that you can make a real difference and earn recognition for your work. Would you like to work with us on the latest entertainment technologies and deliver spectacular experiences?

Dolby Applied AI is currently looking for two interns who are willing to push the state of the art in audio processing and synthesis research, and to unveil the secrets of deep unsupervised learning. As an intern at Dolby Laboratories, you will have the opportunity to do impactful research in a dynamic and collaborative environment, with ample resources, working closely with world-class researchers who focus on the most recent progress in signal processing and machine/deep learning.




We want candidates to have a clear understanding of deep learning architectures and techniques: convolutional and recurrent networks should already be trivial concepts for you, and you should regularly play around with more complex ideas and developments such as VAEs, GANs, and seq2seq models. Candidates should also have a background centered on speaker or speech recognition, synthesis, or enhancement. Additional experience with unsupervised and self-supervised learning methods will be very valuable.


Main responsibilities:

  • Invent/design improvements for general representations of speech, possibly learnt through unsupervised/self-supervised techniques.
  • Implement an extensive evaluation pipeline featuring several downstream tasks, to show the improvements of such representations.
  • Run this evaluation and compare the results with the state of the art (possibly reimplementing some of those methods).


Required skills:

  • Profound deep learning knowledge.
  • PyTorch.
  • Knowledge of existing speech recognition approaches, as well as speaker classification or synthesis/enhancement tasks, including evaluation measures and protocols.
  • Python & Linux.



  • Working towards an MSc or PhD degree in Computer Science or a related field.
  • Must be available to work remotely full-time/part-time for 3/6 months between January 2021 and July 2021.
  • We will prioritize candidates pursuing a PhD and located in the EU time zone.


Dolby Hiring Entity:
Avenida Diagonal,
Planta 10 08018 Barcelona,