AI DevSummit 2025 + DeveloperWeek Leadership 2025
Wednesday June 4, 2025 1:30pm - 1:55pm PDT
Naman Goyal, Google DeepMind, Machine Learning Engineer

Multimodal AI systems—capable of processing text, images, audio, and video simultaneously—present transformative opportunities for accessibility while introducing complex challenges related to bias and fairness. This presentation explores this duality through evidence-based analysis of current implementations and future directions.
For individuals with disabilities, multimodal AI creates unprecedented opportunities: visual recognition systems achieve high accuracy for common objects, real-time speech-to-text transcription operates with minimal error rates, and adaptive learning technologies significantly improve information retention for neurodivergent learners. However, these same systems exhibit concerning bias patterns: recruitment algorithms show substantial ranking disparities across demographics, speech recognition error rates vary considerably across accents, and gender bias in image datasets often traces back to problematic correlations between visual elements and their text annotations.
The presentation outlines a comprehensive framework for responsible development including: inclusive design principles (with evidence that disability consultants identify many more potential accessibility barriers), representative dataset curation (addressing the reality that images in computer vision datasets rarely include people with visible disabilities), rigorous testing methodologies (conventional sampling typically captures very few users with disabilities), and ethical governance considerations (most AI practitioners want clearer accessibility standards).
Through case studies including image description technologies (showing notable accuracy disparities between Western and non-Western cultural contexts), diverse speech recognition (where community-driven data collection reduced error rates for underrepresented accent groups), and emotion recognition systems (with higher error rates for non-Western expressions), the presentation provides practical insights for developing multimodal AI that enhances accessibility without reinforcing existing inequities.
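The disparity analyses the abstract describes (error rates that vary across accents, demographics, or cultural contexts) can be sketched as a simple per-group error-rate comparison. This is a minimal illustration, not material from the talk; the group labels and sample data below are hypothetical.

```python
from collections import defaultdict

def per_group_error_rates(predictions, labels, groups):
    """Compute the error rate within each demographic group.

    predictions, labels: parallel lists of model outputs and ground truth.
    groups: parallel list of group identifiers (e.g. accent labels).
    """
    errors = defaultdict(int)
    totals = defaultdict(int)
    for pred, gold, grp in zip(predictions, labels, groups):
        totals[grp] += 1
        if pred != gold:
            errors[grp] += 1
    return {grp: errors[grp] / totals[grp] for grp in totals}

def disparity_ratio(rates):
    """Ratio of worst to best group error rate; 1.0 means parity."""
    worst, best = max(rates.values()), min(rates.values())
    return worst / best if best > 0 else float("inf")

# Hypothetical example: word-level speech recognition outputs
# for speakers from two accent groups, "A" and "B".
preds = ["hello", "word", "cat", "dog", "tree", "son"]
golds = ["hello", "world", "cat", "dog", "tree", "sun"]
grps  = ["A", "A", "A", "B", "B", "B"]

rates = per_group_error_rates(preds, golds, grps)
```

Reporting the full per-group breakdown alongside the ratio matters in practice: an aggregate error rate can look acceptable while masking a group whose error rate is several times higher, which is exactly the pattern the case studies highlight.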



Speakers

Naman Goyal

Machine Learning Engineer, Google DeepMind
Naman Goyal is a distinguished Machine Learning Engineer and Researcher specializing in Large Language Models (LLMs), Computer Vision, Deep Learning, and Multimodal Learning. With a proven track record at leading technology companies including Google DeepMind, NVIDIA, Apple, and innovative...
VIRTUAL AI DevSummit Expo Stage
