The Application of Data Transformations in the Calculation of a Composite Index of a System's Quality

Abstract

The paper examines the features of data used to calculate composite indexes of complex systems. Principal component analysis (PCA) gives an objective summary of a dataset but is sensitive to the quality of the data. One of the main criticisms of using multivariate analysis to calculate the weights of composite indexes is the ambiguous socio-economic interpretation of negative weight coefficients. The paper shows that known statistical characteristics of the data, such as the coefficient of asymmetry, the coefficient of variation, and the presence or absence of a normal distribution, do not allow anomalous variables to be identified. A variable is considered anomalous if its upper-range outliers neutralize all other values of the indicator. Such data affect the calculated weights and can be identified using heatmaps. A logarithmic transformation of anomalous variables eliminates the peculiarities of their distribution. The paper proposes an analytical criterion for identifying anomalous data; the criterion evaluates the signal-to-noise ratios of variables in a fixed range that does not contain zero. The justification for using the logarithmic transformation when assessing the quality of weakly formalized systems is demonstrated with the author's modification of PCA, applied to a study of the quality of life of the population of Russia's regions. The paper shows that logarithmic correction of anomalous variables eliminates the negative weight coefficients and redistributes the weights in a way that admits a more correct socio-economic interpretation.
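The effect of a logarithmic transformation on a variable dominated by an upper-range outlier can be illustrated with a minimal sketch. The indicator values below are invented for illustration, and min-max scaling is an assumed normalization; the sketch does not reproduce the author's criterion or the modified PCA, only the compression effect that the abstract describes.

```python
import numpy as np

def min_max(x):
    """Rescale values linearly to the interval [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Hypothetical indicator values for ten regions (made-up data);
# the last region is an extreme upper-range outlier.
raw = np.array([12, 15, 18, 20, 22, 25, 30, 35, 40, 5000], dtype=float)

# Scaling the raw values squeezes every region except the outlier close to 0.
scaled_raw = min_max(raw)

# All values are positive, so the natural logarithm is defined;
# scaling the logged values spreads the ordinary regions over a wider range.
scaled_log = min_max(np.log(raw))

raw_median = float(np.median(scaled_raw))
log_median = float(np.median(scaled_log))

print("median after scaling raw values:   ", round(raw_median, 4))
print("median after scaling logged values:", round(log_median, 4))
```

In this toy example the outlier pushes almost all scaled raw values toward zero, so differences between the ordinary regions nearly vanish; after the logarithm those differences remain visible, which is the sense in which the transformation removes the distributional peculiarity.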

Author Biography

Tatyana Valentinovna Zhgun, Yaroslav-the-Wise Novgorod State University

Associate Professor of the Department of Applied Mathematics and Computer Science, Institute of Electronic and Information Systems, Ph.D. (Phys.-Math.), Associate Professor

Published
2021-09-30
How to Cite
ZHGUN, Tatyana Valentinovna. The Application of Data Transformations in the Calculation of a Composite Index of a System's Quality. Modern Information Technologies and IT-Education, [S.l.], v. 17, n. 3, p. 550-563, sep. 2021. ISSN 2411-1473. Available at: <http://sitito.cs.msu.ru/index.php/SITITO/article/view/783>. Date accessed: 01 june 2025. doi: https://doi.org/10.25559/SITITO.17.202103.550-563.
Section
Research and development in the field of new IT and their applications