Multimodal Machine Learning: How to approach?
Time: Wednesday, September 15, 3:45pm - 4:30pm CDT
Description: Multimodal machine learning is a setting in which different modalities, such as text, image, audio, and video, are used together to build a machine learning model. In this presentation, I will discuss the techniques used to build these models, the state of the art in multimodal approaches, and the caveats to consider. I will illustrate these approaches with two tasks: possession extraction and event outcome extraction. I will also go beyond the technical factors and discuss the ethical concerns that arise when building a multimodal system.