5th International Symposium on Machine Learning & Big Data in Geoscience (5ISMLG)
10-13 May 2026, Hong Kong
SS4: Data-driven Site Characterization
Session Organizers:
- Jinsong Huang, The University of Newcastle (Huang@newcastle.edu.au)
- Jiawei Xie, The University of Newcastle (Jiawei.Xie@newcastle.edu.au)
- Shui-Hua Jiang, Nanchang University (sjiangaa@ncu.edu.cn)
Session Description:
Site characterization, involving stratigraphic delineation, subsurface profiling, and geotechnical parameter estimation, forms the cornerstone of safe and economical geotechnical design and construction practices. However, the inherent challenges of subsurface investigation—including data sparsity, strong spatial variability, multivariate complexity, uncertainty, and prohibitive sampling costs—render traditional deterministic approaches increasingly inadequate for capturing the complex heterogeneity of geological formations across varying spatial and temporal scales.
In response, data-driven methodologies have emerged as a transformative paradigm for subsurface characterization, offering new capabilities for extracting hidden patterns from limited observations, quantifying uncertainty in heterogeneous media, and enabling intelligent decision-making under data constraints. From classical geostatistical techniques such as kriging and sequential Gaussian simulation to more recent approaches such as Bayesian neural networks, Gaussian process regression, tree-based methods, graph-based learning, and physics-informed machine learning, data-driven methods provide a powerful toolkit for tackling the multifaceted challenges of modern site characterization.
This mini-symposium aims to bring together researchers, practitioners, and industry professionals to explore the latest advancements in data-driven site characterization, fostering collaboration and shaping future research directions in intelligent subsurface modeling. Contributions are welcome in areas including, but not limited to:
- Advanced data analytics, preprocessing, and multi-modal integration techniques for heterogeneous geotechnical datasets, such as multi-source data fusion (e.g., CPT, SPT, geophysics, remote sensing), outlier detection, and data quality assessment, to enable comprehensive site characterization
- Data-driven stratigraphic delineation using machine learning approaches for automated layer boundary identification, geological unit classification, and subsurface structure recognition from borehole and geophysical data
- Spatial interpolation and subsurface profiling through advanced geostatistical and machine learning methods (e.g., Gaussian processes, kriging variants, spatial deep learning) for continuous parameter estimation and uncertainty mapping
- Development and application of physics-informed or geology-constrained machine learning models that ensure physically plausible and geologically consistent subsurface characterization
- Active learning and optimal investigation design (e.g., Bayesian experimental design, value-of-information) for cost-effective sampling and monitoring campaigns
- Uncertainty quantification and risk assessment in data-driven site characterization, including confidence interval estimation and reliability analysis under sparse sampling conditions
- Benchmarking and validation studies that assess the subsurface characterization accuracy of data-driven models against traditional geostatistical methods and field measurements
- Case studies and practical implementations of data-driven approaches for foundation design, slope stability assessment, tunneling projects, and other geotechnical applications requiring robust site characterization