Earth Observation through Large Vision Models

Abstract

Ever wondered how location planning is done when building city infrastructure? Or how, when a disaster strikes, we determine which areas are likely affected and send reinforcements there? Both require overhead imagery, which we mainly obtain from satellites. The European Space Agency has launched several satellites; however, the data they produce is huge and may span multiple bands of the electromagnetic spectrum. Large AI models have great potential in this domain, provided they are developed to work well with this kind of data. Many pre-trained generative and large vision models are available on platforms such as Hugging Face and Kaggle, but they do not transfer well to a specialized domain like satellite imagery, hence the need to train or fine-tune them. In this talk, we will see where to access open satellite datasets, how to fine-tune various vision and multimodal models on them, and examine the following applications:

Zero-shot classification and object detection on satellite images from natural-language input, using multimodal models.
Image-to-image translation on satellite imagery using generative vision models.
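
To make the first application more concrete, here is a minimal sketch of the scoring step behind CLIP-style zero-shot classification: the image and each candidate text prompt are mapped to embeddings, and the prompt most similar to the image wins. The embeddings below are made-up toy vectors standing in for the output of a real multimodal model, and the function name, prompts, and temperature value are illustrative assumptions rather than the models or settings used in the talk.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_classify(image_embedding, label_embeddings, temperature=100.0):
    """Return {prompt: probability} from cosine similarity followed by softmax.

    `temperature` is an assumed scaling factor; real models learn their own.
    """
    img = l2_normalize(image_embedding)
    logits = {}
    for label, emb in label_embeddings.items():
        txt = l2_normalize(emb)
        logits[label] = temperature * sum(a * b for a, b in zip(img, txt))

    # Numerically stable softmax over the candidate prompts.
    max_logit = max(logits.values())
    exps = {label: math.exp(v - max_logit) for label, v in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Toy embeddings (hypothetical): in practice these come from the text and
# image encoders of a multimodal model.
label_embeddings = {
    "a satellite image of a forest": [0.9, 0.1, 0.0],
    "a satellite image of a river": [0.1, 0.9, 0.1],
    "a satellite image of an airport": [0.0, 0.2, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

probs = zero_shot_classify(image_embedding, label_embeddings)
best_label = max(probs, key=probs.get)
print(best_label)
```

Because the class names are ordinary text prompts, new categories can be added without retraining, which is what makes the approach "zero-shot".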