Earth Observation through Large Vision Models

PyData: Deep Learning, NLP, CV
Talk (long session)
North Hall
16:05 on 11 July 2024
45 minutes


Ever wondered how location planning for city infrastructure is done? Or how, when a disaster strikes, we determine the likely affected areas and send reinforcements there? Both require overhead imagery, which we obtain mainly from satellites. The European Space Agency has launched various satellites; however, the data they produce is huge and may contain multiple bands of the electromagnetic spectrum. Large AI models have great potential in this domain, provided they are developed to work well with this data. Many pre-trained generative and large vision models are available on platforms such as HuggingFace and Kaggle, but they do not transfer well to a specific domain like satellite imagery, hence the need to train or fine-tune them. In this talk, we will see where to access open satellite datasets, fine-tune various vision and multimodal models on them, and examine the following applications:

  • Perform zero-shot classification and object detection on satellite images from natural-language input using multimodal models.
  • Obtain high-resolution satellite imagery from lower-resolution imagery using a super-resolution model.
  • Identify what lies beneath the clouds in satellite imagery using a generative vision model.

The speaker

Mayank Khanduja

Mayank is a data scientist at Esri, where he develops AI models for satellite and remote sensing datasets for a deeper understanding of our planet.