Research Scientist Intern, Multimodal Speech Enhancement (PhD)

Employer

Job Description

Facebook Reality Labs (FRL) focuses on delivering Facebook's vision of connecting people through Virtual Reality (VR) and Augmented Reality (AR). The FRL team is driving the state of the art forward with breakthrough work in computer vision, speech, audio, virtual assistants, machine learning, mixed reality, graphics, displays, sensors, and new ways to map the human body, among many others.

The On-Device AI Research team in FRL brings together a world-class team of researchers working on a range of problems in AI Assistant and novel AR experiences. Our work is at the core of enabling future AR and VR products. We are seeking exceptional interns with a background in speech enhancement to join our team. We are specifically looking for candidates to work on topics such as face keypoint detection, face tracking, and speech enhancement.

In this position, you will work with cross-functional teams in FRL that specialize in speech enhancement solutions using audio and visual inputs and that can be deployed on devices under challenging computational constraints. You will collaborate with research mentors to understand the challenges and build state-of-the-art models to tackle them, and then work with the software/hardware team to deploy these solutions on-device.

Our internships are twelve (12) to sixteen (16) weeks long and have various start dates throughout the year.

Research Scientist Intern, Multimodal Speech Enhancement (PhD) Responsibilities:

  • Define, plan and execute cutting-edge deep learning research to advance speech enhancement solutions.
  • Collaborate with other research scientists and software engineers to develop innovative techniques for speech enhancement that can use multimodal inputs like vision and audio.
  • Develop novel deep learning techniques to achieve state-of-the-art performance within the constraints of on-device and real-time execution.
  • Collaborate with software and hardware engineers to develop tradeoff curves for performance vs. runtime resources and constraints.
  • Communicate experimental results and recommendations clearly, both within the group and to cross-functional groups.

Minimum Qualifications:

  • Currently has, or is in the process of obtaining, a PhD degree in Computer Science, Electrical Engineering, or a related field.
  • 1+ years of experience in developing speech enhancement solutions and/or accelerating machine learning solutions for audio-related tasks.
  • Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment.

Preferred Qualifications:

  • Intent to return to the degree program after the completion of the internship/co-op.
  • Publication track record in audio understanding, machine learning and/or computer vision conferences. Examples include CVPR, ICCV, ECCV, INTERSPEECH, ICASSP, NeurIPS, ICML, ICLR.
  • Interpersonal experience: cross-group and cross-culture collaboration.

Facebook is proud to be an Equal Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law.

Facebook is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.