SPRING 2024

Gen-AI for Multi-Agent Deformable Object Manipulation

Jonathan Ong*, Zitong Huang*, Rida Faraz, Vijay Kumaravelrajan, Siddharth Rudharaju, Anisha Chitta, Daniel Seita†
* Project Lead    † Project Advisor

Robots hold great potential for automating tasks across various environments, yet their adoption is limited by their current capabilities, especially in complex scenarios like caregiving. To enhance automation and accessibility, this project works to advance robotic manipulation of deformable objects, such as bread and clothing, that require multiple manipulators. In this work, we integrate various robotic platforms, including ALOHA and Baxter, into the MuJoCo simulation environment using Robosuite, and set up bimanual configurations to explore different arm controllers. To robustly assess robotic performance, we identify specific tasks involving deformable objects and continuously refine our evaluation protocols.

Harmful Brain Activity Classification through EEG Spectrogram Data

Jessica Fu*, Vayun Mathur, Ryan Nene, Brice Patchou
* Project Lead

Currently, electroencephalogram (EEG) monitoring relies heavily on manual analysis by specialized neurologists, making it time-consuming, expensive, and prone to error. By applying deep learning techniques to EEG analysis, the Harmful Brain Activity Project strives to improve the classification accuracy of harmful brain-activity patterns, such as seizures, and thereby support faster diagnosis and treatment for patients. By training convolutional neural networks (CNNs) in a specialized pipeline, we aim to create a highly effective model as a contribution toward AI applications in EEG analysis, positively impacting the preservation and advancement of human brain health.
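As a minimal sketch of the kind of classifier involved, the forward pass of a tiny CNN over an EEG spectrogram can be written in plain NumPy. The kernel and head weights here are random placeholders rather than trained parameters, and the real pipeline would use a deep learning framework; this only illustrates the conv → pool → softmax shape of the computation.

```python
import numpy as np

def conv2d(x, kernels):
    """Valid 2D convolution: x is (H, W), kernels is (K, kh, kw) -> (K, H', W')."""
    K, kh, kw = kernels.shape
    H, W = x.shape
    oh, ow = H - kh + 1, W - kw + 1
    out = np.zeros((K, oh, ow))
    for k in range(K):
        for i in range(oh):
            for j in range(ow):
                out[k, i, j] = np.sum(x[i:i + kh, j:j + kw] * kernels[k])
    return out

def classify_spectrogram(spec, kernels, W, b):
    """Conv -> ReLU -> global average pool -> linear head -> softmax."""
    feat = np.maximum(conv2d(spec, kernels), 0.0)   # ReLU activation
    pooled = feat.mean(axis=(1, 2))                 # one value per kernel
    logits = pooled @ W + b
    e = np.exp(logits - logits.max())               # numerically stable softmax
    return e / e.sum()
```

With a (64, 128) spectrogram, four 5×5 kernels, and a 4×6 head, this returns a probability vector over six candidate activity classes.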

CityLearn: Reducing Greenhouse Gas Emissions by Improving Energy Distribution via Time Series Forecasting

Sanjana Ilango*, Spencer Tran*, Vidur Mushran, Andrew Choi, Joanne Lee, Jimena Arce
* Project Lead

Buildings account for 30% of greenhouse gas emissions, making energy efficiency a critical focus for sustainability efforts. Distributed energy resources, such as domestic hot water systems that store energy and solar panels that generate it, play a key role in alleviating the strain on the electric grid. To optimize the management and allocation of these resources across multiple buildings, it is essential to develop accurate and reliable energy usage predictions. This project aims to create robust predictive models that will enhance energy efficiency and contribute to reducing the environmental impact of buildings.
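A minimal baseline for this kind of energy-usage forecasting is a linear autoregressive model fit by least squares: predict the next reading from the last few readings, then roll the model forward. This is only a sketch of the forecasting setup, not the project's actual model, and the lag count is an illustrative choice.

```python
import numpy as np

def fit_ar(series, lags):
    """Least-squares fit of y_t ≈ w · [y_{t-lags}, ..., y_{t-1}] + b."""
    X = np.stack([series[i:i + lags] for i in range(len(series) - lags)])
    y = series[lags:]
    Xb = np.hstack([X, np.ones((len(X), 1))])       # append a bias column
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef                                     # lag weights plus bias

def forecast(history, coef, steps):
    """Roll the fitted model forward, feeding predictions back as inputs."""
    lags = len(coef) - 1
    window = list(history[-lags:])
    preds = []
    for _ in range(steps):
        pred = float(np.dot(coef[:-1], window[-lags:]) + coef[-1])
        preds.append(pred)
        window.append(pred)
    return np.array(preds)
```

For hourly building load, a lag of 24 lets the model pick up the daily cycle; richer models would add weather and calendar features.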

Comparing Encoder-Decoder Architectures for Multimodal Hate Speech Detection in the Hateful Memes Dataset

Nathan Johnson*, Maohe (Mo) Jiang, Jonathan Aydin, Catherine Lu, Darius Mahjoob, Catherine He
* Project Lead

Multimodal hate speech detection presents additional challenges beyond unimodal detection, as subtle forms of hate speech often surface only when both text and images are analyzed together. This project examines the effectiveness and limitations of various pre-trained multimodal encoders in classifying content as hate speech. By comparing these models, we aim to identify the most viable approaches for accurately detecting hate speech in complex, multimodal contexts.
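One simple way such pre-trained encoders are combined is late fusion: concatenate the frozen text and image embeddings and score the pair with a small classification head. This is a sketch of that pattern only; the embedding dimension and weights below are placeholders, not any particular encoder's output.

```python
import numpy as np

def late_fusion_score(text_emb, image_emb, W, b):
    """Concatenate frozen encoder embeddings and score with a linear head."""
    fused = np.concatenate([text_emb, image_emb])   # simple late fusion
    logit = fused @ W + b
    return 1.0 / (1.0 + np.exp(-logit))             # P(hateful) via sigmoid
```

The comparison in the project amounts to swapping which encoders produce `text_emb` and `image_emb` (and whether fusion happens early, late, or inside the model) while measuring classification quality.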

Post-generation ASR Hypothesis Reranking Utilizing Visual Contexts

Youqi Huang*, Aryan Trehan*, Marcus Au, Stan Loosmore, Tommy Shu, Yirui Song
* Project Lead

The Automatic Speech Recognition (ASR) pipeline proposed in "Multimodal Speech Recognition for Language-Guided Embodied Agents" (Chang et al.) processes both unimodal (audio-only) and multimodal (audiovisual) data to generate multiple ranked transcription hypotheses for a given utterance. However, the model often fails to rank the hypothesis with the lowest Word Error Rate (WER) relative to the ground-truth transcript as the top choice. To address this issue, we propose a multimodal reranking pipeline that leverages the same visual cues used in the ASR process.
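The two pieces involved can be sketched as follows: WER as a word-level edit distance (used to evaluate which hypothesis is actually best), and a reranker that blends each hypothesis's ASR score with a visual-relevance score. The `visual_scores` input and the blending weight `alpha` are hypothetical stand-ins for whatever the visual model produces; this is not the paper's implementation.

```python
import numpy as np

def wer(ref, hyp):
    """Word error rate: word-level edit distance divided by reference length."""
    r, h = ref.split(), hyp.split()
    d = np.zeros((len(r) + 1, len(h) + 1), dtype=int)
    d[:, 0] = np.arange(len(r) + 1)
    d[0, :] = np.arange(len(h) + 1)
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = 0 if r[i - 1] == h[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,           # deletion
                          d[i, j - 1] + 1,           # insertion
                          d[i - 1, j - 1] + sub)     # substitution or match
    return d[-1, -1] / max(len(r), 1)

def rerank(hypotheses, asr_scores, visual_scores, alpha=0.5):
    """Pick the hypothesis maximizing a blend of ASR and visual scores."""
    combined = [(1 - alpha) * a + alpha * v
                for a, v in zip(asr_scores, visual_scores)]
    return hypotheses[int(np.argmax(combined))]
```

The goal is that after reranking, the selected hypothesis has a lower WER against the ground truth than the ASR model's original top choice.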

Gender Workplace Bias in Large Language Models

Rachita Jain*, Jessica Luna, Kailin Xia, Arjun Bedi, Malina Freeman
* Project Lead

As large language models (LLMs) like ChatGPT gain prominence, addressing the biases embedded within these models is of paramount importance, particularly regarding gender and gender roles in the workplace. Historically, certain occupations have been linked with specific genders, resulting in significant imbalances across various sectors. To bridge this gender gap, it is essential that AI technologies do not perpetuate these biases. This project leverages LLaMA, a widely adopted open-source LLM, and employs prompt engineering techniques to investigate methods for reducing workplace gender bias in model responses. The objective is to ensure that LLMs exhibit gender neutrality, particularly in contexts characterized by ambiguity or uncertainty.
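A small sketch of how such a prompt-engineering probe might be set up: build cloze-style prompts that leave a pronoun for the model to supply, then tally the pronouns in its completions. The template below is purely illustrative, not the project's actual prompt set, and the completions would come from LLaMA in practice.

```python
import re

def build_probes(occupations):
    """Cloze-style prompts that leave a pronoun for the model to supply."""
    return [f"The {occ} finished the shift and then" for occ in occupations]

def pronoun_counts(completions):
    """Tally the first third-person pronoun in each model completion."""
    counts = {"he": 0, "she": 0, "they": 0}
    for text in completions:
        m = re.search(r"\b(he|she|they)\b", text.lower())
        if m:
            counts[m.group(1)] += 1
    return counts
```

Skewed counts across occupations (e.g., "nurse" → mostly "she") would quantify the bias the project aims to reduce; gender-neutral behavior would show up as "they" or a balanced split.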

Navigating Climate-Induced Turbulence: Optimizing Aviation Emissions Through Neural Network-Based Turbulence Detection

Jayne Bottarini*, Jaiv Doshi*, Pratyush Jaishanker, Naina Panjwani, Sanya Verma, Lauren Sun, Jay Campanell, Sam Silva†
* Project Lead    † Project Advisor

Aviation currently accounts for over 3% of global carbon dioxide emissions, a share that will only grow as global travel increases [1], making emissions-aware flight-path optimization essential. As the Earth warms, however, atmospheric wind patterns change unpredictably, inducing a cycle of more turbulent air, suboptimal flight paths, and increased emissions of carbon dioxide and other pollutants. We aim to help break this cycle by employing generative and spatial machine learning techniques to downscale wind patterns, improving finer-scale wind speed predictions and informing emissions-based flight-path optimization.
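As a reference point for what downscaling means here, the simplest baseline is plain bilinear interpolation of a coarse wind-speed grid onto a finer one. A learned generative model would replace this with sharper, physically plausible fine-scale structure; this sketch only fixes the input/output shapes of the task.

```python
import numpy as np

def bilinear_downscale(coarse, factor):
    """Interpolate a coarse wind-speed grid onto a grid `factor` times finer."""
    H, W = coarse.shape
    ys = np.linspace(0, H - 1, H * factor)
    xs = np.linspace(0, W - 1, W * factor)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, H - 1)                   # clamp at the grid edge
    x1 = np.minimum(x0 + 1, W - 1)
    wy = (ys - y0)[:, None]                          # fractional row offsets
    wx = (xs - x0)[None, :]                          # fractional column offsets
    top = coarse[np.ix_(y0, x0)] * (1 - wx) + coarse[np.ix_(y0, x1)] * wx
    bot = coarse[np.ix_(y1, x0)] * (1 - wx) + coarse[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy
```

Evaluating a generative downscaler against this baseline (and against held-out fine-resolution reanalysis data) is the standard way to show it adds real fine-scale information.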

Computer Vision and Machine Learning on Optical Coherence Tomography for Middle Ear Pathology Detection

Claude Yoo*, Lucia Zhang*, Sana Jayaswal, Seena Pourzand, Irika Katiyar, Will Dolan, Brian Applegate†
* Project Lead    † Project Advisor

Current diagnostic methods for middle ear diseases in otology are primarily qualitative and limited to examining only the surface of the tympanic membrane (TM). Optical Coherence Tomography (OCT) offers a non-invasive, quantitative imaging technique that enables three-dimensional reconstruction of the TM and middle ear, providing more detailed information than traditional methods. However, manually interpreting OCT scans can be time-consuming and challenging, and while OCT-based disease detection models are well-established in retinal imaging and ophthalmology, their application in otology remains relatively unexplored. This project focuses on creating a multi-class machine learning model capable of identifying conditions such as retraction pockets, perforations, and cholesteatomas, and distinguishing them from healthy ear scans.

Performance-based Feature Sampling for Reducing Bias in Image Recognition Models

Aarav Monga*, Sonia Zhang*, Advik Unni, Shahzeb Lakhani, Rajakrishnan Somou, Antonio Ortega†
* Project Lead    † Project Advisor

The presence of unseen bias in machine learning models remains a significant barrier to achieving trusted AI. While class imbalance is often addressed through training set sampling methods, this project targets a more nuanced challenge: bias within specific groups of a class. Such biases can create harmful, spurious correlations—like associating certain demographics with particular roles—that undermine the accuracy and fairness of representation learning. Addressing these issues involves a two-step process: first, identifying and labeling the bias groups, and second, adaptively sampling from these groups during training. This approach aims to correct the hidden biases that complicate model training and ultimately enhance the reliability of AI systems.
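The second step, adaptive sampling, can be sketched as weighting each training example inversely to its group's current accuracy, so the sampler draws more often from groups the model is getting wrong. This is one plausible weighting scheme under the two-step framing above, not the project's specific method; the smoothing constant is an illustrative choice.

```python
import numpy as np

def group_sampling_weights(group_ids, correct, smoothing=0.01):
    """Per-example sampling weights inversely proportional to group accuracy."""
    weights = np.empty(len(group_ids), dtype=float)
    for g in np.unique(group_ids):
        mask = group_ids == g
        acc = correct[mask].mean()                  # current accuracy on group g
        weights[mask] = 1.0 / (acc + smoothing)     # struggling groups sample more
    return weights / weights.sum()                  # normalize to a distribution
```

The resulting distribution can be fed to a weighted sampler each epoch, with weights recomputed as per-group performance shifts during training.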

Indigenous Language Translation with Sparse Data (4.0)

Aryan Gulati*, Leslie Moreno*, Abhinav Gupta, Aditya Kumar, Jonathan May†
* Project Lead    † Project Advisor

Imperialism has led to the loss of many indigenous cultures and, with them, their languages. Based on the NeurIPS 2022 competition “Second AmericasNLP Competition: Speech-to-Text Translation for Indigenous Languages of the Americas,” this project aims to use machine translation (MT) and automatic speech recognition (ASR) approaches to develop a translator for endangered or extinct indigenous languages. Because data on these languages is sparse, this will involve finding and/or building an appropriately sized corpus and using it to train the MT and ASR models.

FALL 2023

Navigating Climate-Induced Turbulence: Optimizing Aviation Emissions Through Neural Network-Based Turbulence Detection

Jayne Bottarini*, Jaiv Doshi*, Pratyush Jaishanker, Naina Panjwani, Sanya Verma, Sam Silva†
* Project Lead    † Project Advisor

Aviation currently accounts for over 3% of global carbon dioxide emissions, a share that will only grow as global travel increases [1], making emissions-aware flight-path optimization essential. As the Earth warms, however, atmospheric wind patterns change unpredictably, inducing a cycle of more turbulent air, suboptimal flight paths, and increased emissions of carbon dioxide and other pollutants. We aim to help break this cycle by employing generative and spatial machine learning techniques to downscale wind patterns, improving finer-scale wind speed predictions and informing emissions-based flight-path optimization.

Computer Vision and Machine Learning on Optical Coherence Tomography for Middle Ear Pathology Detection

Claude Yoo*, Lucia Zhang*, Sana Jayaswal, Seena Pourzand, Irika Katiyar, Will Dolan, Brian Applegate†
* Project Lead    † Project Advisor

Current diagnostic methods for middle ear diseases in otology are primarily qualitative and limited to examining only the surface of the tympanic membrane (TM). Optical Coherence Tomography (OCT) offers a non-invasive, quantitative imaging technique that enables three-dimensional reconstruction of the TM and middle ear, providing more detailed information than traditional methods. However, manually interpreting OCT scans can be time-consuming and challenging, and while OCT-based disease detection models are well-established in retinal imaging and ophthalmology, their application in otology remains relatively unexplored. This project focuses on creating a multi-class machine learning model capable of identifying conditions such as retraction pockets, perforations, and cholesteatomas, and distinguishing them from healthy ear scans.

Performance-based Feature Sampling for Reducing Bias in Image Recognition Models

Aarav Monga*, Sonia Zhang*, Advik Unni, Shahzeb Lakhani, Rajakrishnan Somou, Antonio Ortega†
* Project Lead    † Project Advisor

The presence of unseen bias in machine learning models remains a significant barrier to achieving trusted AI. While class imbalance is often addressed through training set sampling methods, this project targets a more nuanced challenge: bias within specific groups of a class. Such biases can create harmful, spurious correlations—like associating certain demographics with particular roles—that undermine the accuracy and fairness of representation learning. Addressing these issues involves a two-step process: first, identifying and labeling the bias groups, and second, adaptively sampling from these groups during training. This approach aims to correct the hidden biases that complicate model training and ultimately enhance the reliability of AI systems.

Indigenous Language Translation with Sparse Data (3.0)

Aryan Gulati*, Leslie Moreno*, Abhinav Gupta, Aditya Kumar, Jonathan May†
* Project Lead    † Project Advisor

Imperialism has led to the loss of many indigenous cultures and, with them, their languages. Based on the NeurIPS 2022 competition “Second AmericasNLP Competition: Speech-to-Text Translation for Indigenous Languages of the Americas,” this project aims to use machine translation (MT) and automatic speech recognition (ASR) approaches to develop a translator for endangered or extinct indigenous languages. Because data on these languages is sparse, this will involve finding and/or building an appropriately sized corpus and using it to train the MT and ASR models.