A Stochastic Model of Noises for Periodic Symbol Sequences
Abstract
The construction of almost periodic sequences is needed for the analysis of the cycles’ detection methods, identification of symbolic sequences’ features and sensitivity analysis. Two probabilistic models of noise are proposed for constructing almost periodic symbolic sequences. Models provide various types of noise in a periodic sequence, such as changing, adding and deleting characters. Thus, on the basis of a periodic symbolic sequence, an almost periodic sequence is constructed.
It is necessary to ensure a given level of noise in the constructed almost periodic sequence. The required level of simulated noise is guaranteed by a two-level model, in which the positions for simulated noise are determined at the first level, based on the discrete random variable, chosen by the researcher. At the second level, the necessary changes are made at the corresponding positions using a random variable that simulates the noise itself. The second model is based on simulated noise with probability (depending on the noise level) in each element of the sequence.
The observed noise level is calculated based on the Levenshtein distance between the original periodic sequence and the almost periodic sequence that we constructed. The observed noise level is always somewhat less than the level of the noise simulated, since when calculating the Levenshtein distance, a shorter way of obtaining a noisy sequence from a periodic one can be found than the one that is used in constructing an almost periodic sequence.
A comparison of the proposed models for the proximity of the noise level they provide to a given noise level is made. The computational experiment showed that the observed noise level is closer to that provided by the two-level model. The software implementation of these models will be used in future to study the algorithms for finding cycles in noisy periodic symbolic sequences.
References
(In Eng.) DOI: 10.1109/ICDE.2001.914829
[2] He Z., Wang X.S., Lee B.S., Ling A.C.H. Mining Partial Periodic Correlations in Time Series. Knowledge and Information Systems. 2008; 15(1):31-54. (In Eng.) DOI: 10.1007/s10115-006-0051-5
[3] Elfeky M.G., Aref W.G., Elmagarmid A.K. Periodicity Detection in Time Series Databases. IEEE Transactions on Knowledge and Data Engineering. 2005; 17(7):875-887. (In Eng.) DOI: 10.1109/TKDE.2005.114
[4] Kung-JiuanYang, Tzung-Pei Hong, Guo-Cheng Lan, Yuh-Min Chen. A Two-Phase Approach for Mining Weighted Partial Periodic Patterns. Engineering Applications of Artificial Intelligence. 2014; 30:225-234. (In Eng.) DOI: 10.1016/j.engappai.2014.01.004
[5] Ran He, Sen Yang, Jingyuan Yang, Jin Cao. Automated Mining of Approximate Periodicity on Numeric Data: A Statistical Approach. Proceedings of the 2nd International Conference on Compute and Data Analysis (ICCDA 2018). ACM, New York, NY, USA. 2018; 20-27. (In Eng.) DOI: 10.1145/3193077.3194509
[6] Yuan H., Qian Y., Bai M. Efficient Mining of Event Periodicity in Data Series. In: Li G., Yang J., Gama J., Natwichai J., Tong Y. (eds). Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science. Vol. 11446. Springer, Cham, 2019. (In Eng.) DOI: 10.1007/978-3-030-18576-3_8
[7] Kiran R.U., Kitsuregawa M., Reddy P.K. Efficient Discovery of Periodic-Frequent Patterns in Very Large Databases. Journal of Systems and Software. 2016; 112(C):110-121. (In Eng.) DOI: 10.1016/j.jss.2015.10.035
[8] Venkatesh J.N., Kiran R.U., Reddy P.K., Kitsuregawa M. Discovering Periodic-Frequent Patterns in Transactional Databases Using All-Confidence and Periodic-All-Confidence. Proceedings of the 27th International Conference on Database and Expert Systems Applications. Porto, Portugal, 2016. Part. I, LNCS 9827, pp. 55-70. (In Eng.) DOI: 10.1007/978-3-319-44403-1_4
[9] Patel M., Modi N. A Comprehensive Study on Periodicity Mining Algorithms. 2016 International Conference on Global Trends in Signal Processing, Information Computing and Communication (ICGTSPICC). Jalgaon. 2016; 567-575. (In Eng.) DOI: 10.1109/ICGTSPICC.2016.7955365
[10] Cao J., Drabeck L., He R. Statistical Network Behavior Based Threat Detection. 2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). Atlanta, GA. 2017; 420-425. (In Eng.) DOI: 10.1109/INFCOMW.2017.8116413
[11] Huang H., Liu F., Zha X., Xiong X., Ouyang T., Liu W. et al. Robust Bad Data Detection Method for Microgrid Using Improved ELM and DBSCAN Algorithm. Journal of Energy Engineering. 2018; 144(3):04018026. (In Eng.) DOI: 10.1061/(ASCE)EY.1943-7897.0000544
[12] Aydin B., Angryk R. Spatiotemporal Frequent Pattern Mining on Solar Data: Current Algorithms and Future Directions. 2015 IEEE International Conference on Data Mining Workshop (ICDMW). Atlantic City, NJ. 2015; 575-581. (In Eng.) DOI: 10.1109/ICDMW.2015.10
[13] Dong S., Liu S., Zhao Y. et al. An Innovative Model to Mine Asynchronous Periodic Pattern of Moving Objects. Multimedia Tools and Applications. 2019; 78(7):8943-8964. (In Eng.) DOI: 10.1007/s11042-018-6752-4
[14] Li D., Yan W., Li W., Ren Z. A Two-Tier Wind Power Time Series Model Considering Day-to-Day Weather Transition and Intraday Wind Power Fluctuations. IEEE Transactions on Power Systems. 2016; 31(6):4330-4339. (In Eng.) DOI: 10.1109/TPWRS.2016.2531739
[15] He Sh., Zhenxin Q. Research on the Periodical Behavior Discovery of Funds in Anti-money Laundering Investigation. ICMLC '19 Proceedings of the 2019 11th International Conference on Machine Learning and Computing.2019; 516-520. (In Eng.) DOI: 10.1145/3318299.3318356
[16] Gonzalez R.C., Woods R.E. Digital Image Processing. Addison-Wesley, Reading, MA., 1992. 716 pp. (In Eng.)
[17] Widrow B., Stearns S.D. Adaptive Signal Processing. Pearson, 1985. 496 pp. (In Eng.)
[18] Rasheed F., Alhajj R. STNR: A Suffix Tree Based Noise Resilient Algorithm for Periodicity Detection in Time Series Databases. Applied Intelligence. 2010; 32(3):267-278. (In Eng.) DOI: 10.1007/s10489-008-0144-9
[19] Bjørnstad O.N., Viboud C. Timing and Periodicity of Influenza Epidemics. Proceedings of the National Academy of Sciences. 2016; 113(46):12899-12901. (In Eng.) DOI: 10.1073/pnas.1616052113
[20] Chanda A.K. et al. An Efficient Approach to Mine Flexible Periodic Patterns in Time Series Databases. Engineering Applications of Artificial Intelligence. 2015; 44:46-63. (In Eng.) DOI: 10.1016/j.engappai.2015.04.014
[21] Korotkov E.V., Korotkova M.A. Developing New Mathematical Method for Search of the Time Series Periodicity with Deletions and Insertions. Journal of Physics: Conference Series. 2017; 788(1):012019. (In Eng.) DOI: 10.1088/1742-6596/788/1/012019.
[22] Lin J. et al. A Symbolic Representation of Time Series, with Implications for Streaming Algorithms. Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery. ACM, 2003, pp. 2-11. (In Eng.) DOI: 10.1145/882082.882086
[23] Smetanin Yu.G., Ulyanov M.V. Determining the Characteristics of Kolmogorov Complexity of Time Series: An Approach Based on Symbolic Descriptions. Business Informatics. 2013; 2(24):49-54. Available at: https://elibrary.ru/item.asp?id=19526577 (accessed 07.06.2019). (In Russ., abstract in Eng.)
[24] Levenshtein V.I. Binary Codes Capable of Correcting Deletions, Insertions, and Reversals. Soviet Physics Doklady. 1966; 10(8):707-710. Available at: https://nymity.ch/sybilhunting/pdf/Levenshtein1966a.pdf (accessed 07.06.2019). (In Eng.)
[25] Gusfield D. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997. 556 pp. (In Eng.)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Publication policy of the journal is based on traditional ethical principles of the Russian scientific periodicals and is built in terms of ethical norms of editors and publishers work stated in Code of Conduct and Best Practice Guidelines for Journal Editors and Code of Conduct for Journal Publishers, developed by the Committee on Publication Ethics (COPE). In the course of publishing editorial board of the journal is led by international rules for copyright protection, statutory regulations of the Russian Federation as well as international standards of publishing.
Authors publishing articles in this journal agree to the following: They retain copyright and grant the journal right of first publication of the work, which is automatically licensed under the Creative Commons Attribution License (CC BY license). Users can use, reuse and build upon the material published in this journal provided that such uses are fully attributed.