A Comparative Study of Random Forest, K-Nearest Neighbors, and XGBoost Models for Weather-Aware Smart Office Building Automation

Authors

  • Erwin Yonata, Universitas Sampoerna
  • Maya Anggun Beer, Universitas Sampoerna
  • Ni Nyoman Putri Shopia, Universitas Sampoerna
  • Emilia Loho, Universitas Sampoerna
  • Gilang Raka Rayuda Dewa, Universitas Sampoerna

DOI:

https://doi.org/10.52158/7925qh24

Keywords:

building, K-Nearest Neighbor, Random Forest, XGBoost, weather

Abstract

The intelligent control of lighting and HVAC systems plays a critical role in reducing energy consumption in smart buildings. However, many existing automation systems rely on static schedules that do not adapt to changing environmental conditions. Although machine learning has been widely applied to weather-based building automation, inconsistent feature selection, model configuration, and evaluation procedures limit the validity of comparative performance claims. This study develops and evaluates a machine-learning-based weather classification framework for smart building automation. The methodology follows a structured pipeline of data acquisition and preprocessing, model training and testing, hyperparameter tuning, and performance evaluation. A publicly available Weather Type Classification dataset is used; it consists of numerical weather parameters, which are encoded prior to training. Feature selection is applied to identify the most influential predictors. Three machine learning models (Random Forest, K-Nearest Neighbors, and XGBoost) are trained on an 80:20 stratified train-test split, with hyperparameters tuned by grid search. Model performance is evaluated using accuracy, precision, recall, F1 score, and a confusion matrix. Experimental results show that Random Forest achieves the highest accuracy at 97.50 percent, followed by XGBoost at 96.90 percent and K-Nearest Neighbors at 95.73 percent, with balanced performance across all weather categories. The findings indicate that ensemble-based classifiers are well suited to robust weather recognition. The classified weather outputs can be mapped directly to real-time control strategies for lighting and HVAC systems, enabling adaptive automation and improved energy efficiency in smart buildings.
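
The evaluation protocol named in the abstract (an 80:20 stratified split scored with accuracy, precision, recall, F1 score, and a confusion matrix) and the final step of mapping a predicted class to a control action can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of macro averaging, the four class labels, and every entry in `CONTROL_RULES` are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so each class keeps roughly the same share
    in the training and test parts (80:20 by default)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def confusion_matrix(y_true, y_pred, classes):
    """Count outcomes; rows are true classes, columns are predictions."""
    position = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(y_true, y_pred):
        matrix[position[truth]][position[guess]] += 1
    return matrix


def macro_scores(matrix):
    """Accuracy plus macro-averaged precision, recall, and F1
    (macro averaging is an assumption; the abstract does not specify)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = matrix[k][k]
        predicted = sum(matrix[r][k] for r in range(n))  # column sum
        actual = sum(matrix[k])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    return {
        "accuracy": sum(matrix[k][k] for k in range(n)) / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical lookup from a predicted weather class to control actions.
# Class names and actions are placeholders, not values from the paper.
CONTROL_RULES = {
    "Sunny": {"lighting": "dim", "hvac": "cooling"},
    "Cloudy": {"lighting": "medium", "hvac": "ventilation"},
    "Rainy": {"lighting": "bright", "hvac": "dehumidify"},
    "Snowy": {"lighting": "bright", "hvac": "heating"},
}


def control_for(weather_class):
    """Return the illustrative control action for a predicted class."""
    return CONTROL_RULES[weather_class]
```

In use, a trained classifier's test-set predictions would be passed to `confusion_matrix` together with the true labels, and each live prediction would be passed to `control_for`. Splitting per class, rather than shuffling the whole dataset once, is what keeps the class proportions comparable between the training and test parts.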

Published

2026-05-28

How to Cite

[1]
“A Comparative Study of Random Forest, K-Nearest Neighbors, and XGBoost Models for Weather-Aware Smart Office Building Automation”, J. Appl. Comput. Sci. Technol., vol. 7, no. 1, pp. 9–21, May 2026, doi: 10.52158/7925qh24.