A Stochastic Model of Noises for Periodic Symbol Sequences

Abstract

The construction of almost periodic sequences is needed for the analysis of the cycles’ detection methods, identification of symbolic sequences’ features and sensitivity analysis. Two probabilistic models of noise are proposed for constructing almost periodic symbolic sequences. Models provide various types of noise in a periodic sequence, such as changing, adding and deleting characters. Thus, on the basis of a periodic symbolic sequence, an almost periodic sequence is constructed.
It is necessary to ensure a given level of noise in the constructed almost periodic sequence. The required level of simulated noise is guaranteed by a two-level model, in which the positions for simulated noise are determined at the first level, based on the discrete random variable, chosen by the researcher. At the second level, the necessary changes are made at the corresponding positions using a random variable that simulates the noise itself. The second model is based on simulated noise with probability (depending on the noise level) in each element of the sequence.
The observed noise level is calculated based on the Levenshtein distance between the original periodic sequence and the almost periodic sequence that we constructed. The observed noise level is always somewhat less than the level of the noise simulated, since when calculating the Levenshtein distance, a shorter way of obtaining a noisy sequence from a periodic one can be found than the one that is used in constructing an almost periodic sequence.
A comparison of the proposed models for the proximity of the noise level they provide to a given noise level is made. The computational experiment showed that the observed noise level is closer to that provided by the two-level model. The software implementation of these models will be used in future to study the algorithms for finding cycles in noisy periodic symbolic sequences.

Author Biographies

Galina Nikolaevna Zhukova, National Research University Higher School of Economics

Associate Professor of the School of Software Engineering, Faculty of Computer Science, Ph.D. (Phys.-Math.), Associate Professor

Yuri Gennadievich Smetanin, Federal Research Center Computer Science and Control of the Russian Academy of Sciences

Chief researcher of the Dorodnicyn Computing Centre of RAS, Dr. Sci. (Phys.-Math.)

Mikhail Vasilevich Ulyanov, V.A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences; Lomonosov Moscow State University

Leading scientist researcher; Professor of the Department of Algorithmic Languages, Faculty of Computational Mathematics and Cybernetics, Dr.Sci. (Engineering), Professor

References

[1] Ma Sh., Hellerstein J.L. Mining Partially Periodic Event Patterns with Unknown Periods. Proceedings 17th International Conference on Data Engineering. Heidelberg, Germany. 2001; 205-214.
(In Eng.) DOI: 10.1109/ICDE.2001.914829
[2] He Z., Wang X.S., Lee B.S., Ling A.C.H. Mining Partial Periodic Correlations in Time Series. Knowledge and Information Systems. 2008; 15(1):31-54. (In Eng.) DOI: 10.1007/s10115-006-0051-5
[3] Elfeky M.G., Aref W.G., Elmagarmid A.K. Periodicity Detection in Time Series Databases. IEEE Transactions on Knowledge and Data Engineering. 2005; 17(7):875-887. (In Eng.) DOI: 10.1109/TKDE.2005.114
[4] Kung-JiuanYang, Tzung-Pei Hong, Guo-Cheng Lan, Yuh-Min Chen. A Two-Phase Approach for Mining Weighted Partial Periodic Patterns. Engineering Applications of Artificial Intelligence. 2014; 30:225-234. (In Eng.) DOI: 10.1016/j.engappai.2014.01.004
[5] Ran He, Sen Yang, Jingyuan Yang, Jin Cao. Automated Mining of Approximate Periodicity on Numeric Data: A Statistical Approach. Proceedings of the 2nd International Conference on Compute and Data Analysis (ICCDA 2018). ACM, New York, NY, USA. 2018; 20-27. (In Eng.) DOI: 10.1145/3193077.3194509
[6] Yuan H., Qian Y., Bai M. Efficient Mining of Event Periodicity in Data Series. In: Li G., Yang J., Gama J., Natwichai J., Tong Y. (eds). Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science. Vol. 11446. Springer, Cham, 2019. (In Eng.) DOI: 10.1007/978-3-030-18576-3_8
[7] Kiran R.U., Kitsuregawa M., Reddy P.K. Efficient Discovery of Periodic-Frequent Patterns in Very Large Databases. Journal of Systems and Software. 2016; 112(C):110-121. (In Eng.) DOI: 10.1016/j.jss.2015.10.035
[8] Venkatesh J.N., Kiran R.U., Reddy P.K., Kitsuregawa M. Discovering Periodic-Frequent Patterns in Transactional Databases Using All-Confidence and Periodic-All-Confidence. Proceedings of the 27th International Conference on Database and Expert Systems Applications. Porto, Portugal, 2016. Part. I, LNCS 9827, pp. 55-70. (In Eng.) DOI: 10.1007/978-3-319-44403-1_4
[9] Patel M., Modi N. A Comprehensive Study on Periodicity Mining Algorithms. 2016 International Conference on Global Trends in Signal Processing, Information Computing and Communication (ICGTSPICC). Jalgaon. 2016; 567-575. (In Eng.) DOI: 10.1109/ICGTSPICC.2016.7955365
[10] Cao J., Drabeck L., He R. Statistical Network Behavior Based Threat Detection. 2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). Atlanta, GA. 2017; 420-425. (In Eng.) DOI: 10.1109/INFCOMW.2017.8116413
[11] Huang H., Liu F., Zha X., Xiong X., Ouyang T., Liu W. et al. Robust Bad Data Detection Method for Microgrid Using Improved ELM and DBSCAN Algorithm. Journal of Energy Engineering. 2018; 144(3):04018026. (In Eng.) DOI: 10.1061/(ASCE)EY.1943-7897.0000544
[12] Aydin B., Angryk R. Spatiotemporal Frequent Pattern Mining on Solar Data: Current Algorithms and Future Directions. 2015 IEEE International Conference on Data Mining Workshop (ICDMW). Atlantic City, NJ. 2015; 575-581. (In Eng.) DOI: 10.1109/ICDMW.2015.10
[13] Dong S., Liu S., Zhao Y. et al. An Innovative Model to Mine Asynchronous Periodic Pattern of Moving Objects. Multimedia Tools and Applications. 2019; 78(7):8943-8964. (In Eng.) DOI: 10.1007/s11042-018-6752-4
[14] Li D., Yan W., Li W., Ren Z. A Two-Tier Wind Power Time Series Model Considering Day-to-Day Weather Transition and Intraday Wind Power Fluctuations. IEEE Transactions on Power Systems. 2016; 31(6):4330-4339. (In Eng.) DOI: 10.1109/TPWRS.2016.2531739
[15] He Sh., Zhenxin Q. Research on the Periodical Behavior Discovery of Funds in Anti-money Laundering Investigation. ICMLC '19 Proceedings of the 2019 11th International Conference on Machine Learning and Computing.2019; 516-520. (In Eng.) DOI: 10.1145/3318299.3318356
[16] Gonzalez R.C., Woods R.E. Digital Image Processing. Addison-Wesley, Reading, MA., 1992. 716 pp. (In Eng.)
[17] Widrow B., Stearns S.D. Adaptive Signal Processing. Pearson, 1985. 496 pp. (In Eng.)
[18] Rasheed F., Alhajj R. STNR: A Suffix Tree Based Noise Resilient Algorithm for Periodicity Detection in Time Series Databases. Applied Intelligence. 2010; 32(3):267-278. (In Eng.) DOI: 10.1007/s10489-008-0144-9
[19] Bjørnstad O.N., Viboud C. Timing and Periodicity of Influenza Epidemics. Proceedings of the National Academy of Sciences. 2016; 113(46):12899-12901. (In Eng.) DOI: 10.1073/pnas.1616052113
[20] Chanda A.K. et al. An Efficient Approach to Mine Flexible Periodic Patterns in Time Series Databases. Engineering Applications of Artificial Intelligence. 2015; 44:46-63. (In Eng.) DOI: 10.1016/j.engappai.2015.04.014
[21] Korotkov E.V., Korotkova M.A. Developing New Mathematical Method for Search of the Time Series Periodicity with Deletions and Insertions. Journal of Physics: Conference Series. 2017; 788(1):012019. (In Eng.) DOI: 10.1088/1742-6596/788/1/012019.
[22] Lin J. et al. A Symbolic Representation of Time Series, with Implications for Streaming Algorithms. Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery. ACM, 2003, pp. 2-11. (In Eng.) DOI: 10.1145/882082.882086
[23] Smetanin Yu.G., Ulyanov M.V. Determining the Characteristics of Kolmogorov Complexity of Time Series: An Approach Based on Symbolic Descriptions. Business Informatics. 2013; 2(24):49-54. Available at: https://elibrary.ru/item.asp?id=19526577 (accessed 07.06.2019). (In Russ., abstract in Eng.)
[24] Levenshtein V.I. Binary Codes Capable of Correcting Deletions, Insertions, and Reversals. Soviet Physics Doklady. 1966; 10(8):707-710. Available at: https://nymity.ch/sybilhunting/pdf/Levenshtein1966a.pdf (accessed 07.06.2019). (In Eng.)
[25] Gusfield D. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997. 556 pp. (In Eng.)
Published
2019-07-25
How to Cite
ZHUKOVA, Galina Nikolaevna; SMETANIN, Yuri Gennadievich; ULYANOV, Mikhail Vasilevich. A Stochastic Model of Noises for Periodic Symbol Sequences. Modern Information Technologies and IT-Education, [S.l.], v. 15, n. 2, p. 431-440, july 2019. ISSN 2411-1473. Available at: <http://sitito.cs.msu.ru/index.php/SITITO/article/view/537>. Date accessed: 03 aug. 2025. doi: https://doi.org/10.25559/SITITO.15.201902.431-440.
Section
Research and development in the field of new IT and their applications