Events

Events and Talks at the Henry and Marilyn Taub Faculty of Computer Science

Multimodal Robustness to Input Failures for 3D Object Detection
Ron Alfia (M.Sc. Thesis Seminar)
Monday, 10.02.2025, 15:30
Taub 601 & Zoom
Advisors: Prof. Avi Mendelson & Dr. Chaim Baskin

In the age of abundant data, deep learning has emerged as a leading tool for predictive tasks, consistently setting new benchmarks in areas such as computer vision. One such task is 3D Object Detection (3DOD), where the goal is to estimate the locations of objects within a 3D space using inputs like RGB images and LiDAR point clouds. This task is crucial for applications in advanced driver-assistance systems, autonomous vehicles, and robotic navigation. Despite the promise of deep learning, these models are prone to learning shallow correlations, a challenge particularly evident in multimodal setups. Specifically, when multimodal 3DOD models rely on both images and point clouds, they tend to exhibit a significant bias towards the point cloud data, even though both modalities contain overlapping information. This modality bias renders such models vulnerable to failures in LiDAR data, a critical problem in real-world environments.
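As a rough illustration of how such a reliance gap might be measured, the sketch below compares detection quality when each modality is degraded in turn. The evaluation function, the way each modality is degraded, and the resulting bias score are assumptions made for illustration only, not the metric used in the talk.

```python
import numpy as np

def modality_bias(evaluate, images, lidar):
    """Hypothetical helper: estimates how unevenly a detector relies on its
    two input modalities by degrading each one in turn."""
    full = evaluate(images, lidar)                      # both modalities intact
    no_lidar = evaluate(images, np.empty((0, 4)))       # all LiDAR points dropped
    no_images = evaluate(np.zeros_like(images), lidar)  # images blacked out
    # An unbiased model would lose a similar amount of accuracy whichever
    # modality is removed; a large positive gap indicates a LiDAR bias.
    return (full - no_lidar) - (full - no_images)

# Toy usage with a dummy scorer that leans heavily on the point cloud.
dummy_eval = lambda img, pts: 0.2 * float(img.mean()) + 0.8 * float(len(pts) > 0)
images = np.random.rand(4, 3, 32, 32)
lidar = np.random.rand(2048, 4)                         # (x, y, z, intensity)
print(modality_bias(dummy_eval, images, lidar))         # > 0: biased towards LiDAR
```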

In this work, we define and quantify this modality bias and propose two novel debiasing techniques for multimodal 3DOD models. Additionally, we introduce a failure-estimation task for point clouds and demonstrate that a novel architecture solves it with high accuracy. Finally, we present an inference-only architecture that combines debiasing and failure estimation, resulting in a robust end-to-end multimodal model. This model not only performs well under normal conditions but also significantly improves accuracy in the presence of LiDAR failures, addressing a critical gap in robustness for multimodal 3DOD systems.
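The PyTorch sketch below shows one generic way an inference-time gate could combine a LiDAR-failure estimator with fused and camera-only prediction heads. The module names, feature sizes, and soft-gating rule are hypothetical and are not taken from the architecture presented in the talk.

```python
import torch
import torch.nn as nn

class RobustFusion(nn.Module):
    """Illustrative sketch only: gates between a fused (camera + LiDAR) head and
    a camera-only head, weighted by an estimated LiDAR-failure probability."""
    def __init__(self, feat_dim=256, num_outputs=7):  # e.g. 7 box parameters
        super().__init__()
        self.camera_encoder = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU())
        self.lidar_encoder = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU())
        self.fused_head = nn.Linear(2 * feat_dim, num_outputs)
        self.camera_head = nn.Linear(feat_dim, num_outputs)
        # Failure estimator: probability that the LiDAR input is corrupted.
        self.failure_estimator = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())

    def forward(self, cam_feat, lidar_feat):
        c = self.camera_encoder(cam_feat)
        l = self.lidar_encoder(lidar_feat)
        p_fail = self.failure_estimator(l)                  # (B, 1) in [0, 1]
        fused = self.fused_head(torch.cat([c, l], dim=-1))  # uses both modalities
        cam_only = self.camera_head(c)                      # camera-only fallback
        # Soft gating: lean on the camera-only head when LiDAR failure is likely.
        return (1 - p_fail) * fused + p_fail * cam_only

model = RobustFusion()
boxes = model(torch.randn(8, 256), torch.randn(8, 256))
print(boxes.shape)  # torch.Size([8, 7])
```

The soft gate keeps the model usable end to end at inference time: when the estimated failure probability is low the fused prediction dominates, and when LiDAR is judged unreliable the output falls back towards the camera-only head.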