Automatic Evaluation of Recommendation Models
Abstract
The paper presents an overview of state-of-the-art algorithms used in recommender systems. We discuss the goal of collaborative filtering (CF) and the different approaches to it. Specifically, we cover Singular Value Decomposition (SVD), including optimizations, bias terms, time-sensitive SVD, and enhanced variants such as SVD++, as well as clustering approaches based on k-means. We also discuss deep learning methods applied to recommender systems, such as autoencoders and Restricted Boltzmann Machines. We then review metrics for evaluating algorithm quality, with special emphasis on classification quality metrics, since recommender systems are usually expected to deliver recommendations in a particular order. In addition, we propose a tool that automates the launch and evaluation of CF algorithms; it covers data pre-processing, metric selection, training launch, quality indicator checks, and analysis of the resulting data. The tool demonstrates the impact that parameter selection has on the quality of algorithm execution. We observed that classical matrix factorization algorithms can compete with newer deep learning methods, given correct tuning. We also demonstrate a significant time saving of automatic launch (the tool launches all the algorithms) over manual launch (a person launches each algorithm individually).
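To make the idea of automated launch and evaluation concrete, the following is a minimal sketch and not the tool described in the paper. It assumes a biased matrix factorization model trained by stochastic gradient descent, RMSE as the quality indicator, and a tiny made-up rating set; the function names (`train_biased_mf`, `rmse`, `evaluate_grid`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def train_biased_mf(train, n_users, n_items, n_factors, lr, reg, n_epochs, seed=0):
    """Fit r_ui ~ mu + b_u + b_i + p_u . q_i by SGD; returns a predict(u, i) function."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in train) / len(train)  # global mean rating
    bu = [0.0] * n_users                           # user biases
    bi = [0.0] * n_items                           # item biases
    p = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]

    def predict(u, i):
        return mu + bu[u] + bi[i] + sum(a * b for a, b in zip(p[u], q[i]))

    for _ in range(n_epochs):
        for u, i, r in train:
            err = r - predict(u, i)
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]  # keep old values for a simultaneous update
                p[u][f] += lr * (err * qif - reg * puf)
                q[i][f] += lr * (err * puf - reg * qif)
    return predict


def rmse(predict, ratings):
    """Root mean squared error of predict over (user, item, rating) triples."""
    return math.sqrt(sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings))


def evaluate_grid(train, test, n_users, n_items, grid):
    """Train one model per parameter set and return (test RMSE, params), best first."""
    results = []
    for params in grid:
        predict = train_biased_mf(train, n_users, n_items, **params)
        results.append((rmse(predict, test), params))
    return sorted(results, key=lambda pair: pair[0])


# Made-up (user, item, rating) triples, used only to exercise the loop.
train = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1), (2, 1, 5),
         (2, 3, 2), (3, 2, 2), (3, 3, 1), (0, 2, 1), (1, 1, 4)]
test = [(2, 0, 5), (3, 1, 3), (0, 3, 2)]

# Hypothetical parameter sets; a real run would cover a wider range.
grid = [
    {"n_factors": 2, "lr": 0.05, "reg": 0.02, "n_epochs": 200},
    {"n_factors": 4, "lr": 0.01, "reg": 0.10, "n_epochs": 200},
    {"n_factors": 2, "lr": 0.05, "reg": 0.50, "n_epochs": 50},
]

results = evaluate_grid(train, test, n_users=4, n_items=4, grid=grid)
for score, params in results:
    print(f"RMSE={score:.3f}  {params}")
```

A single call to `evaluate_grid` replaces launching each configuration by hand, which is the kind of saving the abstract refers to. On data this small the ranking of configurations is not meaningful; the sketch only shows the shape of the loop.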

This work is licensed under a Creative Commons Attribution 4.0 International License.