Earth Observation through Large Vision Models

Track:
PyData: Deep Learning, NLP, CV
Type:
Talk (long session)
Level:
beginner
Room:
North Hall
Start:
16:05 on 11 July 2024
Duration:
45 minutes

Abstract

Ever wondered how location planning is done for city infrastructure? Or, when a disaster strikes, how we determine the likely affected areas and send reinforcements there? We need overhead imagery for that, which we mainly obtain from satellites. The European Space Agency has launched various satellites; however, the datasets they produce are huge and may contain multiple bands of the electromagnetic spectrum. Large AI models have great potential in this domain if they are developed to work well with such data. There are many pre-trained generative and large vision models on platforms like HuggingFace and Kaggle, but these models do not transfer well to a specialised domain like satellite imagery, hence the need to train or fine-tune them. In this talk, we will see where we can access open satellite datasets, how to fine-tune various vision models and multimodal models on them, and examine the following applications:

  • Zero-shot classification and object detection on satellite images from natural-language prompts, using multimodal models (see the sketches after this list).
  • Image-to-image translation on satellite imagery using generative vision models.
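To make the data-access step concrete, here is a minimal sketch of how open Sentinel-2 imagery could be queried from a public STAC catalogue with pystac-client. The endpoint (Element 84's Earth Search), bounding box, date range and cloud-cover filter are assumptions for illustration; the talk does not prescribe a particular catalogue or region.

    from pystac_client import Client

    # Illustrative assumptions: the STAC endpoint, bounding box, dates and
    # filters are examples, not values taken from the talk.
    catalog = Client.open("https://earth-search.aws.element84.com/v1")

    search = catalog.search(
        collections=["sentinel-2-l2a"],        # Sentinel-2 surface reflectance
        bbox=[4.6, 52.3, 5.1, 52.5],           # rough area of interest (lon/lat)
        datetime="2024-06-01/2024-06-30",
        query={"eo:cloud_cover": {"lt": 10}},  # keep mostly cloud-free scenes
        max_items=5,
    )

    for item in search.items():
        # each asset (e.g. "visual", "red", "nir") points to a cloud-optimised GeoTIFF
        print(item.id, item.properties.get("eo:cloud_cover"), item.assets["visual"].href)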
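And here is a minimal sketch of zero-shot classification of a satellite image chip with a multimodal model, using a generic CLIP checkpoint from HuggingFace. The checkpoint, image path and label prompts are illustrative assumptions; in practice a model fine-tuned on satellite imagery would replace the generic one.

    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    # Illustrative assumptions: checkpoint, image path and labels are examples,
    # not those used in the talk.
    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    image = Image.open("sentinel2_tile.png")  # hypothetical satellite image chip
    labels = [
        "a satellite image of farmland",
        "a satellite image of a city",
        "a satellite image of a forest",
        "a satellite image of open water",
    ]

    inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)

    # similarity between the image and each text prompt, normalised to probabilities
    probs = outputs.logits_per_image.softmax(dim=-1)[0]
    print(dict(zip(labels, probs.tolist())))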

The speaker

Mayank Khanduja

Mayank is a data scientist at Esri, where he develops AI models for satellite and remote-sensing datasets to gain a deeper understanding of our planet.