Multimodal Object Detection Project
Overview
For the course Intro to Deep Learning at CMU, I completed a project re-implementing and analyzing the method presented in the paper Multimodal Object Detection by Channel Switching and Spatial Attention. The objective was to validate the paper's findings and explore enhancements to multimodal object detection, particularly in dimly lit environments. We implemented the proposed fusion pipeline from scratch in PyTorch, producing the first publicly available codebase for it. The pipeline integrates RGB and infrared (IR) data using two ResNet-50 backbones and a Faster R-CNN architecture. The key innovation is the Channel Switching and Spatial Attention (CSSA) module, which fuses the multimodal inputs at low computational cost. Experiments on the LLVIP dataset confirmed that multimodal fusion improves detection accuracy. We also explored hyperparameter tuning, data augmentation techniques, and parameter-sharing strategies to optimize performance. The repository for our codebase can be found on GitHub.
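To make the fusion idea concrete, here is a minimal PyTorch sketch of a CSSA-style block. It is not the code from our repository and it simplifies the paper's formulation. The class names (`ChannelSwitching`, `SpatialAttention`, `CSSASketch`), the `threshold` and `kernel_size` defaults, and the parameter-free averaging used in the spatial attention step are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class ChannelSwitching(nn.Module):
    """Swap low-importance channels between modalities (illustrative only)."""

    def __init__(self, kernel_size: int = 3, threshold: float = 0.5):
        super().__init__()
        self.threshold = threshold  # assumed cut-off, not taken from the source
        pad = kernel_size // 2
        # Lightweight 1D convolutions over the pooled channel descriptors
        self.conv_rgb = nn.Conv1d(1, 1, kernel_size, padding=pad, bias=False)
        self.conv_ir = nn.Conv1d(1, 1, kernel_size, padding=pad, bias=False)

    @staticmethod
    def _channel_weights(x: torch.Tensor, conv: nn.Conv1d) -> torch.Tensor:
        # (B, C, H, W) -> global average pool -> (B, 1, C) -> conv -> (B, C, 1, 1)
        pooled = x.mean(dim=(2, 3)).unsqueeze(1)
        return torch.sigmoid(conv(pooled)).squeeze(1)[:, :, None, None]

    def forward(self, rgb: torch.Tensor, ir: torch.Tensor):
        w_rgb = self._channel_weights(rgb, self.conv_rgb)
        w_ir = self._channel_weights(ir, self.conv_ir)
        # A channel whose weight falls below the threshold is replaced by
        # the same channel from the other modality.
        rgb_out = torch.where(w_rgb < self.threshold, ir, rgb)
        ir_out = torch.where(w_ir < self.threshold, rgb, ir)
        return rgb_out, ir_out


class SpatialAttention(nn.Module):
    """Fuse two feature maps with per-pixel softmax weights (illustrative only)."""

    def forward(self, rgb: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        stacked = torch.stack([rgb, ir], dim=1)        # (B, 2, C, H, W)
        avg = stacked.mean(dim=2, keepdim=True)        # (B, 2, 1, H, W)
        mx = stacked.amax(dim=2, keepdim=True)         # (B, 2, 1, H, W)
        w_avg = torch.softmax(avg, dim=1)              # weights over modalities
        w_max = torch.softmax(mx, dim=1)
        fused_avg = (w_avg * stacked).sum(dim=1)       # (B, C, H, W)
        fused_max = (w_max * stacked).sum(dim=1)       # (B, C, H, W)
        return 0.5 * (fused_avg + fused_max)


class CSSASketch(nn.Module):
    """Channel switching followed by spatial attention."""

    def __init__(self, kernel_size: int = 3, threshold: float = 0.5):
        super().__init__()
        self.switch = ChannelSwitching(kernel_size, threshold)
        self.attend = SpatialAttention()

    def forward(self, rgb: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        rgb, ir = self.switch(rgb, ir)
        return self.attend(rgb, ir)


# Stand-ins for same-level feature maps from the RGB and IR backbones
rgb_feat = torch.randn(2, 256, 32, 40)
ir_feat = torch.randn(2, 256, 32, 40)
fused = CSSASketch()(rgb_feat, ir_feat)
print(fused.shape)  # torch.Size([2, 256, 32, 40])
```

In this sketch the fused map has the same shape as each input, so a block like this could sit between the two backbones and a single-stream detection head. The only learned parameters are the two small 1D convolutions, which is one way to read the low-cost claim above.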
Video Summary
Below is a video of us summarizing the project. Check it out!
Paper
Below is the final paper for our project.