A Natural Way to Overcome Catastrophic Forgetting in Neural Networks

Alexey Anatolyevich Kutalev

doi:10.25559/SITITO.16.202002.331-337

Alexey Anatolyevich Kutalev, InfoWatch JSC, http://orcid.org/0000-0003-2695-792X

Abstract

The problem of catastrophic forgetting arose in connectionist neural network models, which have been actively studied since the second half of the 20th century. Numerous attempts were made and various approaches were proposed to solve this problem, but until very recently no significant progress had been achieved. A major breakthrough came in 2016, when a group of researchers at DeepMind proposed elastic weight consolidation (EWC), a method that successfully counteracts catastrophic forgetting. Unfortunately, although we know of cases where this method has been used in real-world tasks, it has not yet been widely adopted. In this work we propose alternative approaches to overcoming catastrophic forgetting, based on the total absolute signal that has passed through a connection in the neural network. These approaches demonstrate effectiveness similar to that of EWC at a substantially lower computational cost. They are simpler to implement and seem to us inherently closer to the processes that occur in the animal brain to preserve previously learned skills during subsequent learning. We hope that the ease of implementing these methods will lead to their wider adoption.

About the author

Alexey Anatolyevich Kutalev, InfoWatch JSC

Specialist, research programmer, Forecasting Department

References

[1] French R.M. Catastrophic forgetting in connectionist networks. Trends in Cognitive Sciences. 1999; 3(4):128-135. (In Eng.) DOI: https://doi.org/10.1016/S1364-6613(99)01294-2
[2] McCloskey M., Cohen N.J. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem. Psychology of Learning and Motivation. 1989; 24:109-165. (In Eng.) DOI: https://doi.org/10.1016/S0079-7421(08)60536-8
[3] McClelland J.L., McNaughton B.L., O’Reilly R.C. Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review. 1995; 102(3):419-457. (In Eng.) DOI: https://doi.org/10.1037/0033-295X.102.3.419
[4] Goodfellow I.J., Mirza M., Xiao D., Courville A.C., Bengio Y. An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks. arXiv:1312.6211. 2013. Available at: https://arxiv.org/abs/1312.6211 (accessed 04.09.2020). (In Eng.)
[5] Kirkpatrick J., Pascanu R., Rabinowitz N., Veness J., Desjardins G., Rusu A.A., Milan K., Quan J., Ramalho T., Grabska-Barwinska A., Hassabis D., Clopath C., Kumaran D., Hadsell R. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences. 2017; 114(13):3521-3526. (In Eng.) DOI: https://doi.org/10.1073/pnas.1611835114
[6] Huszár F. Note on the quadratic penalties in elastic weight consolidation. Proceedings of the National Academy of Sciences. 2018; 115(11):E2496-E2497. (In Eng.) DOI: https://doi.org/10.1073/pnas.1717042115
[7] Wake H., Lee P.R., Fields R.D. Control of Local Protein Synthesis and Initial Events in Myelination by Action Potentials. Science. 2011; 333(6049):1647-1651. (In Eng.) DOI: https://doi.org/10.1126/science.1206998
[8] Miller D.J., Duka T., Stimpson C.D., Schapiro S.J., Baze W.B., McArthur M.J., Fobbs A.J., Sousa A.M., Šestan N., Wildman D.E., Lipovich L., Kuzawa C.W., Hof P.R., Sherwood C.C. Prolonged myelination in human neocortical evolution. Proceedings of the National Academy of Sciences. 2012; 109(41):16480-16485. (In Eng.) DOI: https://doi.org/10.1073/pnas.1117943109
[9] Zacarias A., Alexandre L.A. SeNA-CNN: Overcoming Catastrophic Forgetting in Convolutional Neural Networks by Selective Network Augmentation. In: Pancioni L., Schwenker F., Trentin E. (ed.) Artificial Neural Networks in Pattern Recognition. ANNPR 2018. Lecture Notes in Computer Science. 2018; 11081:102-112. Springer, Cham. (In Eng.) DOI: https://doi.org/10.1007/978-3-319-99978-4_8
[10] Li H., Barnaghi P., Enshaeifar S., Ganz F. Continual Learning Using Task Conditional Neural Networks. arXiv:2005.05080. 2020. Available at: https://arxiv.org/abs/2005.05080 (accessed 04.09.2020). (In Eng.)
[11] Zenke F., Poole B., Ganguli S. Continual Learning Through Synaptic Intelligence. Proceedings of the 34th International Conference on Machine Learning. PMLR. 2017; 70:3987-3995. International Convention Centre, Sydney, Australia. Available at: http://proceedings.mlr.press/v70/zenke17a.html (accessed 04.09.2020). (In Eng.)
[12] Aljundi R., Babiloni F., Elhoseiny M., Rohrbach M., Tuytelaars T. Memory Aware Synapses: Learning What (not) to Forget. In: Ferrari V., Hebert M., Sminchisescu C., Weiss Y. (ed.) Computer Vision – ECCV 2018. ECCV 2018. Lecture Notes in Computer Science. 2018; 11207:144-161. Springer, Cham. (In Eng.) DOI: https://doi.org/10.1007/978-3-030-01219-9_9
[13] Thangarasa V., Miconi T., Taylor G.W. Enabling Continual Learning with Differentiable Hebbian Plasticity. In: 2020 International Joint Conference on Neural Networks (IJCNN). Glasgow, United Kingdom; 2020, p. 1-8. (In Eng.) DOI: https://doi.org/10.1109/IJCNN48605.2020.9206764
[14] Kumaran D., Hassabis D., McClelland J.L. What Learning Systems do Intelligent Agents Need? Complementary Learning Systems Theory Updated. Trends in Cognitive Sciences. 2016; 20(7):512-534. (In Eng.) DOI: https://doi.org/10.1016/j.tics.2016.05.004
[15] Miconi T., Stanley K.O., Clune J. Differentiable plasticity: training plastic neural networks with backpropagation. Proceedings of the 35th International Conference on Machine Learning. PMLR. 2018; 80:3559-3568. Available at: http://proceedings.mlr.press/v80/miconi18a.html (accessed 04.09.2020). (In Eng.)
[16] Zenke F., Gerstner W., Ganguli S. The temporal paradox of Hebbian learning and homeostatic plasticity. Current Opinion in Neurobiology. 2017; 43:166-176. (In Eng.) DOI: https://doi.org/10.1016/j.conb.2017.03.015
[17] Li Z., Hoiem D. Learning without Forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2018; 40(12):2935-2947. (In Eng.) DOI: https://doi.org/10.1109/TPAMI.2017.2773081
[18] Parisi G.I., Kemker R., Part J.L., Kanan C., Wermter S. Continual lifelong learning with neural networks: A review. Neural Networks. 2019; 113:54-71. (In Eng.) DOI: https://doi.org/10.1016/j.neunet.2019.01.012
[19] Masse N.Y., Grant G.D., Freedman D.J. Alleviating catastrophic forgetting using context-dependent gating and synaptic stabilization. Proceedings of the National Academy of Sciences. 2018; 115(44):E10467-E10475. (In Eng.) DOI: https://doi.org/10.1073/pnas.1803839115
[20] Mirzadeh S.I., Farajtabar M., Ghasemzadeh H. Dropout as an Implicit Gating Mechanism For Continual Learning. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA; 2020. p. 945-951. (In Eng.) DOI: https://doi.org/10.1109/CVPRW50498.2020.00124
[21] Soltoggio A., Stanley K.O., Risi S. Born to learn: The inspiration, progress, and future of evolved plastic artificial neural networks. Neural Networks. 2018; 108:48-67. (In Eng.) DOI: https://doi.org/10.1016/j.neunet.2018.07.013
[22] Song S., Miller K.D., Abbott L.F. Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nature Neuroscience. 2000; 3:919-926. (In Eng.) DOI: https://doi.org/10.1038/78829
[23] Lee K., Lee K., Shin J., Lee H. Overcoming Catastrophic Forgetting With Unlabeled Data in the Wild. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, Korea (South); 2019. p. 312-321. (In Eng.) DOI: https://doi.org/10.1109/ICCV.2019.00040
[24] Rostami M., Kolouri S., Pilly P.K. Complementary Learning for Overcoming Catastrophic Forgetting Using Experience Replay. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). IJCAI; 2019. p. 3339-3345. (In Eng.) DOI: https://doi.org/10.24963/ijcai.2019/463
[25] Schak M., Gepperth A. A Study on Catastrophic Forgetting in Deep LSTM Networks. In: Tetko I., Kůrková V., Karpov P., Theis F. (ed.) Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning. ICANN 2019. Lecture Notes in Computer Science. 2019; 11728:714-728. Springer, Cham. (In Eng.) DOI: https://doi.org/10.1007/978-3-030-30484-3_56
[26] Ribeiro J., Melo F.S., Dias J. Multi-task Learning and Catastrophic Forgetting in Continual Reinforcement Learning. In: Calvanese D., Iocchi L. (ed.) GCAI 2019. Proceedings of the 5th Global Conference on Artificial Intelligence. 2019; 65:163-175. (In Eng.) DOI: https://doi.org/10.29007/g7bg