Earth Observation through Large Vision Models

45 minutes


Ever wondered how locations are chosen when planning city infrastructure? Or how, when a disaster strikes, we determine which areas are likely to be affected and send reinforcements there? Both require overhead imagery, which we obtain mainly from satellites. The European Space Agency has launched a number of satellites; however, the data from these satellites is huge and may contain multiple bands of the electromagnetic spectrum. Large AI models have great potential in this domain if they are developed to work well with such data. Many pre-trained generative and large vision models are available on platforms like HuggingFace and Kaggle, but they do not transfer well to a specific domain like satellite data, hence the need to train or fine-tune them. This talk is a hands-on mini-tutorial in Python on how to access open satellite datasets, fine-tune various vision and multimodal models on them, and examine the following applications:

  • Identify what lies beneath the clouds in satellite imagery using a generative vision model.
  • Perform zero-shot object detection in satellite images from natural-language input using multimodal models.
  • Obtain high-resolution satellite imagery from lower-resolution imagery using a super-resolution model.

The speaker

Mayank Khanduja

I’m a data scientist at Esri, where I apply computer vision and AI techniques to develop groundbreaking models for satellite and remote sensing datasets. My work unlocks valuable insights from geospatial imagery for a deeper understanding of our planet.