MODELLING DISSOLVED OXYGEN IN INTENSIVE AQUACULTURE SYSTEMS: LINEAR REGRESSION vs RANDOM FOREST APPROACHES

Ahmad Yani, Asthervina W. Puspitasari, Hendra Poltak, Ahmad Fahrizal, Rusli Rusli, Ramadhona Saville

Abstract


Dissolved oxygen (DO) is a critical water quality parameter in intensive aquaculture systems because its fluctuations directly affect farmed fish. Accurate prediction of DO is challenging because physicochemical and biological variables interact in complex, often nonlinear ways. Despite growing interest in machine learning applications, comparative evaluations of traditional linear models and ensemble-based approaches in aquaculture contexts remain limited. This study aimed to analyse key variables associated with DO dynamics, compare the predictive performance of linear regression (LR) and random forest (RF) models, and identify dominant predictors relevant to aquaculture management. A publicly available aquaculture water quality dataset from Mendeley Data was analysed. Data were preprocessed by outlier removal and normalization, then split into training (70%) and test (30%) sets; model robustness was assessed using 5-fold cross-validation. DO concentrations ranged from 0.21 to 10.17 mg L⁻¹ (mean = 5.19 mg L⁻¹). Pearson correlation analysis showed positive associations between DO and ammonia (r = 0.60), biochemical oxygen demand (BOD; r = 0.55), and nitrite (r = 0.52), and negative associations with hydrogen sulphide (r = −0.55) and turbidity (r = −0.53). These relationships were interpreted as indirect, management-mediated effects rather than direct causation. The RF model slightly outperformed LR (R² = 0.515 vs. 0.470), suggesting a modest advantage for nonlinear modelling. Feature importance analysis identified ammonia, hydrogen sulphide, nitrite, and BOD as the dominant predictors. Although predictive accuracy remained moderate, the results highlight key drivers of DO variability and support the use of machine learning as a decision-support tool for smart aquaculture management.

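The workflow summarised in the abstract (normalization, a 70/30 train/test split, 5-fold cross-validation, Pearson correlations, and an LR versus RF comparison with feature importances) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses scikit-learn, substitutes synthetic data for the Mendeley dataset, assumes min-max scaling as the normalization step, runs cross-validation on the training portion only, and omits outlier removal. The feature names and the functions `make_synthetic` and `compare_models` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Hypothetical predictor names echoing the abstract; the values are synthetic.
FEATURES = ["ammonia", "h2s", "nitrite", "bod", "turbidity"]


def make_synthetic(n=400, seed=0):
    """Generate placeholder data (NOT the Mendeley dataset used in the study)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, len(FEATURES)))
    # Arbitrary nonlinear target, clipped at zero like a concentration.
    y = (5.0 + 2.0 * np.sin(4 * np.pi * X[:, 0])
         - 1.5 * X[:, 1] ** 2 + rng.normal(0.0, 0.5, n))
    return X, np.clip(y, 0.0, None)


def compare_models(X, y, seed=0):
    """Fit LR and RF on a 70/30 split; report test R^2 and 5-fold CV R^2."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.30, random_state=seed)
    models = {
        # Assumption: min-max scaling stands in for "normalization".
        "LR": make_pipeline(MinMaxScaler(), LinearRegression()),
        "RF": make_pipeline(
            MinMaxScaler(),
            RandomForestRegressor(n_estimators=200, random_state=seed)),
    }
    results = {}
    for name, model in models.items():
        # Assumption: cross-validation is run on the training portion only.
        cv_r2 = cross_val_score(model, X_tr, y_tr, cv=5, scoring="r2")
        model.fit(X_tr, y_tr)
        results[name] = {
            "test_r2": model.score(X_te, y_te),  # R^2 on the held-out 30%
            "cv_r2": cv_r2,
            "model": model,
        }
    return results


X, y = make_synthetic()
results = compare_models(X, y)

# Pearson correlation between each predictor and the target.
pearson_r = {f: float(np.corrcoef(X[:, j], y)[0, 1])
             for j, f in enumerate(FEATURES)}

# Impurity-based importances from the fitted forest (last pipeline step).
importances = results["RF"]["model"][-1].feature_importances_

for name, res in results.items():
    print(f"{name}: test R2 = {res['test_r2']:.3f}, "
          f"mean CV R2 = {res['cv_r2'].mean():.3f}")
for f, imp in sorted(zip(FEATURES, importances), key=lambda t: -t[1]):
    print(f"{f}: importance = {imp:.3f}, Pearson r = {pearson_r[f]:+.2f}")
```

Because the data here are synthetic, the printed scores say nothing about the study's reported R² values; the sketch only shows how the two models can be evaluated under an identical split and scoring rule.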


Keywords


dissolved oxygen; intensive aquaculture; linear regression; machine learning; random forest; water quality modelling

References


Abdel-Tawwab, M., Monier, M. N., Hoseinifar, S. H., & Faggio, C. (2019). Fish response to hypoxia stress: Growth, physiological, and immunological biomarkers. Fish Physiology and Biochemistry, 45(3), 997–1013. https://doi.org/10.1007/s10695-019-00614-9

Akhtar, N., Ishak, M. I. S., Bhawani, S. A., & Umar, K. (2021). Various natural and anthropogenic factors responsible for water quality degradation: A review. Water, 13(19), 2660. https://doi.org/10.3390/w13192660

Barnes, R. S. K., & Mann, K. H. (2009). Fundamentals of aquatic ecology. John Wiley & Sons.

Biau, G., & Scornet, E. (2016). A random forest guided tour. Test, 25(2), 197–227. https://doi.org/10.1007/s11749-016-0481-7

Boyd, C. E., & McNevin, A. A. (2020). Aerator energy use in shrimp farming and means for improvement. Journal of the World Aquaculture Society, 52(1), 6–29. https://doi.org/10.1111/jwas.12753

Boyd, C. E., & Tucker, C. S. (2012). Pond aquaculture water quality management. Springer.

Chakravarty, S. P., Sinha, A., Baishya, S., & Roy, P. (2022). Robust control of water quality in intensive aquaculture using multi‐variable quantitative feedback theory: From an Indian context. Asian Journal of Control, 25(4), 2790–2807. https://doi.org/10.1002/asjc.2979

Das, R., & Behera, D. (2008). Environmental science: Principles and practice. PHI Learning.

Greig, S., Sear, D., & Carling, P. (2007). A review of factors influencing the availability of dissolved oxygen to incubating salmonid embryos. Hydrological Processes, 21(3), 323–334. https://doi.org/10.1002/hyp.6188

He, J., Chu, A., Ryan, M. C., Valeo, C., & Zaitlin, B. (2011). Abiotic influences on dissolved oxygen in a riverine environment. Ecological Engineering, 37(11), 1804–1814. https://doi.org/10.1016/j.ecoleng.2011.06.022

Heddam, S., & Kisi, O. (2017). Extreme learning machines: A new approach for modeling dissolved oxygen concentration with and without water quality variables as predictors. Environmental Science and Pollution Research, 24(20), 16702–16724. https://doi.org/10.1007/s11356-017-9283-z

Hodson, T. O. (2022). Root mean square error (RMSE) or mean absolute error (MAE): When to use them or not. Geoscientific Model Development, 15(14), 5481–5487. https://doi.org/10.5194/gmd-15-5481-2022

Jeong, J., Awosile, B., Thakur, K. K., Stryhn, H., Boyce, B., & Vanderstichel, R. (2024). Longitudinal dissolved oxygen patterns in Atlantic salmon aquaculture sites in British Columbia, Canada. Frontiers in Marine Science, 10, 1289375. https://doi.org/10.3389/fmars.2023.1289375

Jongjaraunsuk, R., Taparhudee, W., & Suwannasing, P. (2024). Comparison of water quality prediction for red tilapia aquaculture in an outdoor recirculation system using deep learning and a hybrid model. Water, 16(6), 907. https://doi.org/10.3390/w16060907

Kuang, L., Shi, P., Hua, C., Chen, B., & Zhu, H. (2020). An enhanced extreme learning machine for dissolved oxygen prediction in wireless sensor networks. IEEE Access, 8, 198730–198739. https://doi.org/10.1109/ACCESS.2020.3033455

Kuhn, M., & Johnson, K. (2013). Applied predictive modeling. Springer.

Lee, S. Y., Jeong, D. Y., Choi, J., Jo, S. K., Park, D. H., & Kim, J. G. (2024). LSTM model to predict missing data of dissolved oxygen in land-based aquaculture farm. ETRI Journal, 46(6), 1047–1060. https://doi.org/10.4218/etrij.2023-0337

Liu, M., Lian, Q., Zhao, Y., Ni, M., Lou, J., & Yuan, J. (2023a). Treatment effects of pond aquaculture wastewater using a field-scale combined ecological treatment system and associated microbial characteristics. Aquaculture, 563, 739018. https://doi.org/10.1016/j.aquaculture.2022.739018

Liu, W., Liu, S., Hassan, S. G., Cao, Y., Xu, L., Feng, D., Cao, L., Chen, W., Chen, Y., Guo, J., Liu, T., & Zhang, H. (2023b). A novel hybrid model to predict dissolved oxygen for efficient water quality in intensive aquaculture. IEEE Access, 11, 29162–29174. https://doi.org/10.1109/ACCESS.2023.3260089

Nagamora, J., Courage, S., Angeles, H., Vertudes, R., Ken, J., Balangao, B., Halil, A., & Abdullah, S. (2022). An assessment of the control and monitoring functionalities of a developed small-scale aquaculture system. International Journal of Biosciences, 21(4), 89–100. https://doi.org/10.12692/ijb/21.4.89-100

Schäfer, N., Matoušek, J., Rebl, A., Stejskal, V., Brunner, R. M., Goldammer, T., Verleih, M., & Korytář, T. (2021). Effects of chronic hypoxia on the immune status of pikeperch (Sander lucioperca). Biology, 10(7), 649. https://doi.org/10.3390/biology10070649

Schonlau, M., & Zou, R. Y. (2020). The random forest algorithm for statistical learning. The Stata Journal, 20(1), 3–29. https://doi.org/10.1177/1536867X20909688

Tong, C., He, K., & Hu, H. (2024). Design and application of a new aeration device based on a recirculating aquaculture system. Applied Sciences, 14(8), 3401. https://doi.org/10.3390/app14083401

Veeramsetty, V. A., Rajeshwarrao, & Bernatin, T. (2024). Aquaculture – Water quality dataset (Version 1) [Data set]. Mendeley Data. https://doi.org/10.17632/y78ty2g293.1

Vo, T. T. E., Ko, H., Huh, J. H., & Kim, Y. (2021). Overview of smart aquaculture system: Focusing on applications of machine learning and computer vision. Electronics, 10(22), 2882. https://doi.org/10.3390/electronics10222882

Wang, K., Gopaluni, R. B., Chen, J., & Song, Z. (2018). Deep learning of complex batch process data and its application on quality prediction. IEEE Transactions on Industrial Informatics, 16(12), 7233–7242. https://doi.org/10.1109/TII.2018.2880968

Xu, W., Yang, C.-E., Luo, Y., Zhang, K., Chen, M., Jiang, S., Grossart, H. P., & Luo, Z.-H. (2023). Distinct response of total and active fungal communities and functions to seasonal changes in a semi-enclosed bay with mariculture (Dongshan Bay, southern China). Limnology and Oceanography, 68(5), 1048–1063. https://doi.org/10.1002/lno.12328

Yang, H., Sun, M., & Liu, S. (2023). A hybrid intelligence model for predicting dissolved oxygen in aquaculture water. Frontiers in Marine Science, 10, 1126556. https://doi.org/10.3389/fmars.2023.1126556

Yao, X., Zhang, G., Song, Y., & Chen, Y. (2023). Adaptive anti-disturbance control of dissolved oxygen in circulating water culture systems. Symmetry, 15(11), 2015. https://doi.org/10.3390/sym15112015

Zhang, X., Zhang, Y., Zhang, Q., Liu, P., Guo, R., Jin, S., Liu, J., Chen, L., Ma, Z., & Liu, Y. (2020). Evaluation and analysis of water quality of marine aquaculture area. International Journal of Environmental Research and Public Health, 17(4), 1446. https://doi.org/10.3390/ijerph17041446

Zhang, Y., Fitch, P., Vilas, M. P., & Thorburn, P. J. (2019). Applying multi-layer artificial neural network and mutual information to the prediction of trends in dissolved oxygen. Frontiers in Environmental Science, 7, 46. https://doi.org/10.3389/fenvs.2019.00046




DOI: http://dx.doi.org/10.15578/jra.20.4.2025.339-354