Research Intern - Audio AI

Date: Jul 2, 2025

Location: Beijing, Chaoyang District, CN

Company: Dolby Laboratories, Inc.

Join the leader in entertainment innovation and help us design the future. At Dolby, science meets art, and high tech means more than computer code. As a member of the Dolby team, you’ll see and hear the results of your work everywhere, from movie theaters to smartphones. We continue to revolutionize how people create, deliver, and enjoy entertainment worldwide. To do that, we need the absolute best talent. We’re big enough to give you all the resources you need, and small enough so you can make a real difference and earn recognition for your work. We offer a collegial culture, challenging projects, and excellent compensation and benefits, not to mention a Flex Work approach that is truly flexible to support where, when, and how you do your best work.

 

The Advanced Technology Group (ATG) is the research and technology arm of Dolby Labs. It has multiple competencies that innovate technologies in audio, video, AR/VR, gaming, music, and movies. Many areas of expertise related to computer science and electrical engineering are highly relevant to our research, such as AI/ML, computer vision, image processing, algorithms, digital signal processing, audio engineering, data science & analytics, distributed systems, cloud, edge & mobile computing, natural language processing, knowledge engineering and management, social network analysis, computer graphics, image & signal compression, computer networking, and IoT.

 

The Dolby ATG Beijing team is looking for a talented, self-motivated Research Intern dedicated to researching deep learning algorithms for speech and audio processing. You will investigate various models and transfer the learned knowledge to the in-house deep learning models.

This position will be in the Dolby Beijing office.

 

Essential Job Functions

  • Work with researchers to design and implement novel deep learning algorithms for speech/audio processing projects, such as speech/audio enhancement, separation, generation, etc.
  • Work with researchers on the full cycle of research with the goal of pushing the state of the art.
  • Extensive literature reading, creative thinking, hands-on development, experiment design, result analysis and patent/academic paper writing.

 

Education, Skills, Abilities, and Experience Required

Desired Qualifications:

  • Candidates working towards a PhD degree in the field of deep learning for speech and audio processing.
  • Strong hands-on experience in one or more research areas in deep learning for speech/audio processing, such as enhancement, separation, generation, etc.
  • Experience with deep learning models, such as audio or multi-modal language models, Transformers, Diffusion Models, etc.
  • Strong coding capability with deep learning tools, such as PyTorch.
  • Demonstrable experience programming in Python.
  • Excellent analytical skills and ability to communicate complex information rapidly and efficiently.
  • Good written communication skills, in both Chinese and English.

 

Nice to have:

  • Publications in top conferences/journals (such as ICASSP, INTERSPEECH, NeurIPS, ICLR, ICML, etc.) are a big plus.

 
