Methodology for Risk Assessment from Confidential Information Disclosure in Data Sources Using Data Mining

Abstract

Methods and tools for assessing the risk posed by confidential information appearing in sources that should not contain it remain poorly developed. Many commercial organizations collect information about their customers and store and process information about their own activities and the means by which they achieve financial results. The problem is that there is no single methodology for assessing the risk associated with storing confidential information in sources that should not contain such data, nor is there a system for assessing this type of risk on a regular basis. The purpose of the study is to test the hypotheses that regular monitoring of data sources to identify and protect confidential information is both possible and necessary, using the developed methodology for assessing the risk of disclosing confidential information. The novelty of the study lies in the authors' algorithm for assessing the risk from the dissemination of confidential information and in a mathematical model that provides a quantitative risk assessment, options for determining the probabilities of events, and a methodology for establishing and using a scale based on expert assessments. To achieve this goal, the study uses general scientific methods within comparative and statistical analysis, together with expert assessments and graphical interpretation of the results. The authors' modification of the three-factor risk assessment model and an adapted approach to achieving an acceptable level of risk from the disclosure of confidential information are presented as a solution to the problem. As a result of the analysis, the risk of disclosing confidential information was assessed, problem areas were identified using open sources of information as an example, and a riskiness scale for sources was defined.
The results again confirm the need to develop systems for assessing the level of risk from the disclosure of confidential information, along with methods and algorithmic approaches for detecting and preventing such disclosures.

Author Biographies

Anastasiia Igorevna Shabrova, PJSC "Sberbank of Russia"

Data Protection Architect, Cybersecurity Department

Aleksey Alekseevich Terenin, PJSC "Sberbank of Russia"

Managing Director, Cybersecurity Department, Cand.Sci. (Eng.)

Nikita Grigorievich Babak, PJSC "Sberbank of Russia"

Chief Data Protection Expert, Cybersecurity Department

Published
2022-10-24
How to Cite
SHABROVA, Anastasiia Igorevna; TERENIN, Aleksey Alekseevich; BABAK, Nikita Grigorievich. Methodology for Risk Assessment from Confidential Information Disclosure in Data Sources Using Data Mining. Modern Information Technologies and IT-Education, [S.l.], v. 18, n. 3, p. 666-679, oct. 2022. ISSN 2411-1473. Available at: <http://sitito.cs.msu.ru/index.php/SITITO/article/view/919>. Date accessed: 11 sep. 2025. doi: https://doi.org/10.25559/SITITO.18.202203.666-679.
Section
Theoretical and Practical Aspects of Cybersecurity