Techno Press

Geomechanics and Engineering
  Volume 42, Number 5, September 10, 2025, pages 321-332
DOI: https://doi.org/10.12989/gae.2025.42.5.321
 

 open access

Application of automated machine learning and clustering algorithm for data-driven site characterization: Predicting the soil-rock interface
Dongwoo Lim, Mijin Goo, Han-Saem Kim and Taeseo Ku

 
Abstract
    The development of underground spaces requires detailed insight into subsurface conditions, particularly the soil-rock interface, because this information is crucial for the effective design and safe construction of underground infrastructure. Traditional geotechnical site investigations rely mainly on direct drilling and sampling; however, these methods yield data only at specific investigation points, which limits how comprehensively they can capture ground conditions across an entire area. To address this limitation, various studies have aimed to predict unknown subsurface sections using existing borehole data. Conventional methods use geospatial interpolation, while machine learning has emerged as a strong alternative. With machine learning, selecting an appropriate model and tuning it properly are critical to achieving optimal performance. This study applies automated machine learning to predict soil-rock interfaces in unsampled regions using borehole data. AutoGluon is used as the machine learning framework to automate data preprocessing, model selection, hyperparameter tuning, and model ensembling. Approximately 20,000 boreholes from the Seoul metropolitan area were collected and used, and various digital maps were used to extract input variables. To capture non-linearity among the input variables, Uniform Manifold Approximation and Projection (UMAP) was employed to reduce the dimensionality of the dataset, and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) was implemented as the clustering algorithm. Compared with a model tuned using Bayesian optimization, AutoGluon exhibited superior predictive performance and lower errors. Although this study focuses on predicting the soil-rock interface, the methodology can be extended to the prediction of other geotechnical parameters.
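For reference, the conventional geospatial interpolation mentioned in the abstract can be sketched with inverse distance weighting (IDW). The abstract does not say which interpolation schemes the study considered, so IDW is an assumption chosen for illustration. The function name, the `power` parameter, and the borehole coordinates and depths below are all hypothetical and are not taken from the paper.

```python
import math


def idw_interface_depth(boreholes, x, y, power=2.0):
    """Estimate soil-rock interface depth at (x, y) by inverse distance weighting.

    boreholes: list of (x, y, depth) tuples (hypothetical illustrative data).
    power: distance exponent; 2.0 is a common default, assumed here.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for bx, by, depth in boreholes:
        dist = math.hypot(x - bx, y - by)
        if dist == 0.0:
            # Query point coincides with a borehole: return its observed depth.
            return depth
        weight = 1.0 / dist ** power
        weighted_sum += weight * depth
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one borehole is required")
    return weighted_sum / weight_total


# Hypothetical boreholes: (easting in m, northing in m, interface depth in m)
boreholes = [(0.0, 0.0, 10.0), (100.0, 0.0, 20.0), (0.0, 100.0, 14.0)]
estimate = idw_interface_depth(boreholes, 50.0, 0.0)
```

An estimate of this kind depends only on distance to nearby boreholes. The data-driven approach described in the abstract differs in that it can also use input variables extracted from digital maps.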
 
Key Words
    automated ML; clustering; data-driven; soil-rock interface; spatial prediction
 
Address
Dongwoo Lim: Department of Civil, Environmental and Plant Engineering, Konkuk University, 120 Neungdong-ro, Gwangjin-gu, Seoul, Republic of Korea, 05029
Mijin Goo and Taeseo Ku: Department of Civil and Environmental Engineering, Konkuk University, 120 Neungdong-ro, Gwangjin-gu, Seoul, Republic of Korea, 05029
Han-Saem Kim: Department of Civil and Environmental Engineering, Dongguk University, 30, Pildong-ro 1-gil, Jung-gu, Seoul, Republic of Korea, 04620
 
