
Anindya Mondal
About
My research project
Integrating auxiliary information for representation learning for the natural world
Computer vision and machine learning technologies have made an impact across multiple disciplines by providing new tools for analysing the information depicted in images. These methodologies are mostly based on the supervised learning paradigm and hence need huge amounts of labelled data.
In many domains, such as ecology and conservation biology, aggregating large amounts of data is typically not the bottleneck. Rather, it is the subsequent labelling of that data that consumes vast amounts of money and time, a problem that is further compounded in fine-grained domains such as the natural world or medicine. However, these data often come with a good amount of auxiliary information, such as the time of day, the time of year and geolocation, and may also include additional information characterising the local habitat. Developing tools for automatically understanding these data and discovering the relationships among their different components is extremely important.
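As a minimal sketch of how such auxiliary metadata might be turned into model inputs, the function below encodes date, time of day and geolocation as a fixed-length vector, wrapping cyclical quantities with sin/cos (a common choice in geo-aware species classification models). The function name and feature choices are illustrative assumptions, not part of the project itself.

```python
import numpy as np

def encode_metadata(day_of_year: int, hour: float, lat: float, lon: float) -> np.ndarray:
    """Encode auxiliary metadata as a fixed-length feature vector.

    Cyclical quantities are wrapped with sin/cos so that distances stay
    meaningful across boundaries: day 365 sits next to day 1, 23:00 next
    to 00:00, and longitude 179 next to -179.
    """
    d = 2.0 * np.pi * day_of_year / 365.0   # time of year
    h = 2.0 * np.pi * hour / 24.0           # time of day
    lo = np.pi * lon / 180.0                # longitude wraps around the globe
    la = np.pi * lat / 180.0                # latitude scaled consistently
    return np.array(
        [np.sin(d), np.cos(d),
         np.sin(h), np.cos(h),
         np.sin(lo), np.cos(lo),
         np.sin(la), np.cos(la)],
        dtype=np.float32,
    )
```

A vector like this can be concatenated with, or used to condition, an image embedding; how best to fuse the two is exactly the kind of question the project asks.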
Recent progress in self-supervised learning has produced models capable of extracting rich representations from image collections without requiring any explicit label supervision. These models typically use aggressive image augmentation strategies to generate different "views" of an input image during training. The central question we would like to address through this project is how to make use of the available auxiliary information during robust visual representation learning.
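To make the "views" idea concrete, here is a minimal sketch of a two-view augmentation pipeline of the kind used by contrastive methods such as SimCLR, assuming torchvision; the exact transforms and parameters vary from method to method and are only indicative.

```python
from torchvision import transforms

# Aggressive augmentations, sampled independently for each view.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.2, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.GaussianBlur(kernel_size=23),
    transforms.ToTensor(),
])

class TwoViews:
    """Return two independently augmented views of the same input image."""

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, img):
        return self.transform(img), self.transform(img)

# Usage: view1, view2 = TwoViews(augment)(pil_image)
# A contrastive loss then pulls the two views' embeddings together.
```

Auxiliary information could plug into this setup in several ways, for example as an extra conditioning input to the encoder or as a signal for choosing which images count as positive pairs.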