Analyzing large-scale Data Cubes with user-defined algorithms: A cloud-native approach
Recent advances in cloud-based remote sensing platforms have revolutionized the routine analysis of remote sensing big data (RSBD). However, user-defined algorithms that are not pre-implemented in an RSBD platform remain difficult to reuse in RSBD applications, especially legacy algorithms written in specific programming languages and tied to specific libraries. In recent years, containerization, a core feature of cloud-native computing, has provided an effective way to port user-defined algorithms to the cloud environment. In this research, we present a novel approach to deploying user-defined remote sensing algorithms for large-scale analysis based on the Data Cube and cloud-native containerization. A processing model is introduced to organize remote sensing analysis workflows on the Data Cube. Because the Data Cube is homogeneous and remote sensing analysis is parallelizable, each workflow can be decomposed into multiple independent steps, and each step into parallelizable tasks. The Composite Container is then designed so that tasks can be processed with user-defined algorithms in the same way as with built-in algorithms. Finally, we introduce the Data Cube Resilient Distributed Dataset (DRDD) to implement workflows with Composite Containers following the MapReduce paradigm. The proposed approach was implemented on the Science Earth Platform and validated with two continental-scale land cover mapping experiments at resolutions of up to 10 m. Experimental results show that the proposed approach can effectively run remote sensing analysis with user-defined algorithms and performs well at continental scale.
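The decomposition described above can be illustrated with a minimal sketch. It is not the DRDD or Composite Container implementation; it only shows the general MapReduce pattern of applying a user-defined function to independent tiles of a cube and merging the partial results. All names (`make_cube`, `user_algorithm`, `run_workflow`) and the toy tile layout are assumptions made for illustration, and a thread pool stands in for containerized workers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def make_cube(n_tiles, tile_size):
    """Build a toy cube (hypothetical layout): a dict of tiles keyed by (row, col).

    Each tile is a tile_size x tile_size grid of class labels in {0, 1, 2}.
    """
    return {
        (r, c): [
            [(r + c + i + j) % 3 for j in range(tile_size)]
            for i in range(tile_size)
        ]
        for r in range(n_tiles)
        for c in range(n_tiles)
    }


def user_algorithm(tile):
    """Stand-in for a user-defined algorithm: count pixels per class in one tile."""
    return Counter(value for row in tile for value in row)


def run_workflow(cube, algorithm, workers=4):
    """Map the algorithm over tiles in parallel, then reduce the partial results."""
    # Map step: tiles are independent, so each one is a separate task.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(algorithm, cube.values()))
    # Reduce step: merge per-tile counts into a single summary.
    return reduce(lambda a, b: a + b, partials, Counter())


cube = make_cube(n_tiles=2, tile_size=3)
totals = run_workflow(cube, user_algorithm)
```

Because every tile has the same shape and is processed independently, the map step needs no coordination between tasks; only the reduce step combines results. In the approach described here, that per-tile function would presumably run inside a container rather than as an in-process Python call.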