Comparative Analysis of Hyperparameter Optimization Using Optuna and Hyperopt for Convolutional Neural Networks
Abstract
Hyperparameter selection governs the training of a neural network and has a significant impact on the quality and performance of the resulting model, an effect confirmed both theoretically and empirically by numerous studies. Searching for good values manually is labor-intensive. The most common search methods are grid search, random search, and sequential model-based optimization, which relies on a fast estimate of the objective function. All of these methods run into difficulties with convolutional neural networks, where the hyperparameter space is so large that even a reduced enumeration of possible combinations is computationally expensive. An alternative is to use automatic hyperparameter tuning tools that are compatible with machine learning frameworks and combine fast probabilistic estimates of the objective function with additional mechanisms. This paper compares the training performance of convolutional neural networks tuned with two such tools, the Python libraries Hyperopt and Optuna, in image classification applications. The libraries are shown to overcome the most important difficulties of hyperparameter optimization in such applications, including the high dimensionality of the search space and the sensitivity to the choice of tuned hyperparameters. Judged by accuracy and training loss, both libraries are effective, reaching a training accuracy above 99% and a loss below 0.03. The optimization algorithm implemented in Optuna performed slightly better than the one in Hyperopt, delivering high performance on both training and test data.
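To make the cost argument concrete, the following minimal sketch counts how many training runs a full grid search would need and contrasts that with a random search under a fixed budget. The search space, its values, and the `toy_loss` function are hypothetical stand-ins chosen for illustration; they are not taken from the paper, and `toy_loss` only imitates the step "train the CNN and return a validation loss".

```python
import math
import random

# Hypothetical CNN search space (illustrative values, not from the paper).
SEARCH_SPACE = {
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3, 1e-2],
    "batch_size": [16, 32, 64, 128],
    "num_conv_layers": [2, 3, 4],
    "num_filters": [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
    "dropout": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "optimizer": ["sgd", "adam", "rmsprop"],
}


def grid_size(space):
    """Number of configurations a full grid search would have to train."""
    return math.prod(len(values) for values in space.values())


def sample_config(space, rng):
    """Draw one configuration uniformly at random from the space."""
    return {name: rng.choice(values) for name, values in space.items()}


def toy_loss(config):
    """Synthetic stand-in for training a CNN and returning its loss."""
    lr_term = (math.log10(config["learning_rate"]) + 3) ** 2
    dropout_term = (config["dropout"] - 0.2) ** 2
    return lr_term + dropout_term


def random_search(space, objective, budget, seed=0):
    """Evaluate `budget` random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(budget):
        config = sample_config(space, rng)
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


if __name__ == "__main__":
    print(f"full grid: {grid_size(SEARCH_SPACE)} training runs")
    config, loss = random_search(SEARCH_SPACE, toy_loss, budget=50)
    print(f"random search, 50 runs: loss={loss:.4f}, config={config}")
```

Even this small space of seven hyperparameters yields 12,960 grid points, each of which would correspond to one full training run. Libraries such as Hyperopt and Optuna replace the uniform sampling in `random_search` with a model-guided choice of the next configuration, which this sketch does not attempt to reproduce.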

This work is licensed under a Creative Commons Attribution 4.0 International License.
