NASA Unveils Galileo: Open-Source AI Model for Earth Observation
NASA has unveiled Galileo, an open-source, highly multimodal foundation model designed to process, analyze, and interpret diverse Earth observation (EO) data streams at scale. Developed with support from researchers at McGill University, NASA Harvest, Ai2, Carleton University, the University of British Columbia, the Vector Institute, and Arizona State University, Galileo aims to provide a unified, general-purpose solution for critical applications such as agricultural land mapping, disaster response, and environmental monitoring.
Unlike previous remote sensing models often limited to a single data type or scale, Galileo is engineered to flexibly fuse multiple sensing modalities. This enables it to recognize phenomena ranging from minute objects, such as fishing boats measuring just 1–2 pixels, to vast, slowly evolving features like glaciers.
Key Features and Architecture
Galileo is built upon a Vision Transformer (ViT)-based architecture, a type of neural network design specifically adapted to process a wide array of Earth observation data. This includes multispectral optical imagery (e.g., Sentinel-2), Synthetic Aperture Radar (SAR) data (e.g., Sentinel-1), elevation and slope data (e.g., NASA SRTM), weather and climate data (e.g., precipitation and temperature from ERA5), and various auxiliary maps like land cover, population density, and night-lights.
Its flexible input handling is facilitated by a sophisticated tokenization pipeline. This process breaks down diverse remote sensing inputs into standardized spatial patches, timesteps, and logical channel groups, allowing the model to process images, time series, and static tabular data within a single architectural configuration.
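As a rough illustration of this idea, the sketch below enumerates the tokens produced when a small image time series is split by timestep, spatial patch, and channel group. The patch size, band names, and band-to-group assignment are illustrative assumptions, not Galileo's actual configuration.

```python
from itertools import product

# Hypothetical channel groups; the band-to-group assignment is illustrative only.
CHANNEL_GROUPS = {
    "s2_rgb": ["B2", "B3", "B4"],
    "s2_nir": ["B8"],
    "s1_sar": ["VV", "VH"],
}


def tokenize(height, width, timesteps, patch_size, groups=CHANNEL_GROUPS):
    """List one token descriptor per (timestep, patch row, patch col, channel group)."""
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    return [
        {"t": t, "row": r, "col": c, "group": g, "bands": groups[g]}
        for t, r, c, g in product(range(timesteps), range(rows), range(cols), groups)
    ]


tokens = tokenize(height=8, width=8, timesteps=3, patch_size=4)
print(len(tokens))  # 3 timesteps * (2 * 2) patches * 3 groups = 36
```

Because every input ends up as the same kind of token, a static layer such as elevation could in principle be handled as a single-timestep case of the same scheme.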
A core innovation in Galileo is its self-supervised pretraining algorithm, which employs a dual-objective learning approach:
- Global objectives: These encourage the model to learn abstract representations over wide spatial or temporal contexts, ideal for identifying large-scale or slowly changing features such as glaciers or forest loss.
- Local objectives: These enhance the model's sensitivity to minute details, crucial for detecting small, fast-changing objects like boats or debris.
This combination of objectives, differing in their prediction targets and masking strategies, significantly enhances multi-scale feature representation. This design makes Galileo highly generalizable across various tasks and robust even when working with limited labeled data.
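The exact masking strategies are not detailed here. As a hypothetical illustration of how two strategies can differ, the sketch below contrasts unstructured masking, which hides individual tokens at random, with structured masking, which hides whole timesteps. The function names, the mask ratio, and the choice of timesteps as the structured unit are all assumptions.

```python
import random


def random_mask(num_timesteps, num_patches, ratio, seed=0):
    """Unstructured masking: hide individual (timestep, patch) tokens at random."""
    rng = random.Random(seed)
    tokens = [(t, p) for t in range(num_timesteps) for p in range(num_patches)]
    return set(rng.sample(tokens, int(len(tokens) * ratio)))


def timestep_mask(num_timesteps, num_patches, ratio, seed=0):
    """Structured masking: hide every patch of a few whole timesteps."""
    rng = random.Random(seed)
    hidden_steps = rng.sample(range(num_timesteps), int(num_timesteps * ratio))
    return {(t, p) for t in hidden_steps for p in range(num_patches)}


scattered = random_mask(num_timesteps=4, num_patches=16, ratio=0.5)
blocked = timestep_mask(num_timesteps=4, num_patches=16, ratio=0.5)
print(len(scattered), len(blocked))  # both hide 32 of the 64 tokens
```

Both masks hide the same number of tokens, but the structured one removes entire timesteps, so the hidden content cannot be filled in by copying from neighboring patches at the same timestep.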
Pretraining Dataset and Strategy
To ensure comprehensive semantic and geographic diversity, Galileo's pretraining dataset covers the entire globe. Samples were selected using a clustering approach to maximize both land cover variety and geographic spread. The dataset comprises over 127,000 spatiotemporally aligned samples, encompassing four categories and nine distinct remote sensing data types. Pretraining ran for 500 epochs on substantial computing resources, with an effective batch size of 512, data augmentations (flipping, rotation, and variable patch sizes), and the AdamW optimizer.
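A minimal sketch of two pieces of this recipe follows: flip and rotation augmentations on a small grid, and the arithmetic behind an effective batch size. The helper names are hypothetical, and the split of 512 into per-device batch, device count, and accumulation steps is one assumed configuration among many.

```python
import random


def hflip(grid):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in grid]


def rot90(grid):
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def augment(grid, rng):
    """Apply a random horizontal flip and a random number of quarter turns."""
    if rng.random() < 0.5:
        grid = hflip(grid)
    for _ in range(rng.randrange(4)):
        grid = rot90(grid)
    return grid


def effective_batch_size(per_device, devices, accumulation_steps):
    """Number of samples contributing to each optimizer step."""
    return per_device * devices * accumulation_steps


patch = [[1, 2], [3, 4]]
print(rot90(patch))  # [[3, 1], [4, 2]]
print(augment(patch, random.Random(0)))
# Assumed split: 64 samples per device, 4 devices, 2 accumulation steps.
print(effective_batch_size(per_device=64, devices=4, accumulation_steps=2))  # 512
```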
Benchmark Results
Galileo has been benchmarked on 11 diverse datasets and 15 downstream tasks, including image classification, pixel time series classification, and segmentation. The model demonstrated superior generalization, outperforming existing specialist models on public datasets such as EuroSAT, BigEarthNet, So2Sat, MADOS (marine debris), Sen1Floods11 (SAR flood mapping), and CropHarvest (multimodal crop classification).
Performance highlights for Galileo-Base (ViT-Base) include:
- Classification (Finetune): 97.7% top-1 accuracy on EuroSAT (with 100% training data), surpassing specialist models like CROMA (96.6%) and SatMAE (96.6%).
- Pixel Time Series: 84.2% accuracy on CropHarvest (Kenya), outperforming Presto and AnySat; 73.0% on BreizhCrops.
- Segmentation (mIoU): 67.6% on MADOS and 79.4% on PASTIS.
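The figures above rely on two standard metrics, top-1 accuracy and mean intersection over union (mIoU). The sketch below computes both on a made-up set of six predictions, included only to show how the numbers are derived. The data has no connection to the benchmark results.

```python
def top1_accuracy(predictions, labels):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def mean_iou(predictions, labels, num_classes):
    """Average per-class intersection over union across flattened pixels."""
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and y == c for p, y in zip(predictions, labels))
        union = sum(p == c or y == c for p, y in zip(predictions, labels))
        if union:  # skip classes absent from both predictions and labels
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy data: six pixels, three classes (invented for illustration).
preds = [0, 0, 1, 1, 2, 2]
truth = [0, 1, 1, 1, 2, 0]
print(round(top1_accuracy(preds, truth), 3))  # 4 of 6 correct -> 0.667
print(round(mean_iou(preds, truth, 3), 3))    # (1/3 + 2/3 + 1/2) / 3 -> 0.5
```

mIoU penalizes both missed and spurious pixels for every class, which is why it is usually lower than plain pixel accuracy on the same predictions.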
Across all benchmarks, Galileo emerged as the top overall performer, demonstrating greater flexibility than competitors specialized in either image or time-series data. Notably, smaller model variants (ViT-Nano, ViT-Tiny) also achieved competitive results, making Galileo viable for resource-constrained environments. Ablation studies further underscored the value of multimodality: removing any single input type during pretraining led to a measurable performance decline, even on benchmarks that do not use that input, which indicates that integrating diverse data benefits the model as a whole.
Open-Source and Real-World Impact
All of Galileo's code, model weights, and pretraining data are openly accessible on GitHub, promoting transparency and facilitating adoption by the global Earth observation community. The model is already supporting mission-critical NASA Harvest activities, including global crop type mapping, rapid disaster mapping (floods, wildfires), and marine pollution detection. Its ability to perform effectively with limited labeled data is particularly valuable in regions where ground truth information is scarce, directly supporting global food security and climate adaptation efforts.
Galileo's methodological and engineering advancements (multimodal inputs, multi-scale local-global feature learning, and large-scale, globally diverse pretraining) establish a new benchmark for generalist remote sensing AI. Its flexibility is poised to underpin practical deployments from environmental monitoring to climate resilience, providing reliable, high-quality maps and predictions across tasks and geographical areas. With its open-source nature and ongoing development, Galileo is expected to catalyze significant innovation in Earth system science, empowering practitioners worldwide.