Artificial Neural Networks
Publisher: NUST MISiS Publishing House (Izdatelsky Dom NITU «MISiS»)
Author: Kalitin Denis Vladimirovich
Year of publication: 2018
Number of pages: 88
Type of publication: Textbook
Level of education: Higher education, Master's degree
ISBN: 978-5-906953-04-9
The book covers the emergence of artificial neural networks, the biological model of the neuron, and various architectures of artificial neural networks. A large number of examples of the formalization of practical problems for subsequent solution with the help of artificial neural networks are provided. Intended for students in the field of study 09.04.01 "Computer Science and Engineering" on the master's program "Innovative software systems: Design, development and application".
No. 3052
MINISTRY OF EDUCATION AND SCIENCE OF THE RUSSIAN FEDERATION
FEDERAL STATE AUTONOMOUS EDUCATIONAL INSTITUTION OF HIGHER EDUCATION
«NATIONAL UNIVERSITY OF SCIENCE AND TECHNOLOGY "MISiS"»
INSTITUTE OF INFORMATION TECHNOLOGIES AND AUTOMATED CONTROL SYSTEMS
Department of Computer-Aided Design and Design
D.V. Kalitin
ARTIFICIAL NEURAL NETWORKS
Textbook
Recommended by the editorial and publishing council of the MISiS University
Moscow 2018
UDC 004.7
Reviewer: Cand. Sci. (Eng.), Assoc. Prof. I.V. Barannikova
Kalitin, D.V. Artificial Neural Networks: a textbook / D.V. Kalitin. Moscow: NUST MISiS Publishing House, 2018. 88 p. ISBN 978-5-906953-04-9.
This book considers the emergence of artificial neural networks, the biological model of the neuron, and various architectures of artificial neural networks. A large number of examples of the formalization of practical problems for subsequent solution with the help of artificial neural networks are given. Designed for students in the field of study 09.04.01 "Computer Science and Engineering" on the master's program "Innovative software systems: Design, development and application".
ISBN 978-5-906953-04-9
© D.V. Kalitin, 2018
© NUST «MISiS», 2018
TABLE OF CONTENTS
Introduction
HISTORY OF ARTIFICIAL NEURAL NETWORKS EMERGENCE AND DEVELOPMENT
    Historical milestones
    Introduction to artificial neural networks
    Parallel processing and feasibility of neural networks
    Place of neural networks among other methods for solving problems
BIOLOGICAL NEURONS
    Nervous impulse
    Saltatory mechanism of impulse propagation
    Synaptic transmission
    The procedure for synaptic transmission
ARTIFICIAL NEURAL NETWORKS
    Formal neuron
    Types of activation functions
    Restrictions of formal neuron
A MULTILAYER PERCEPTRON
    Problem solving algorithm using a multilayer perceptron
    Formalization of a problem
    Examples of problems formalization
    Choosing a number of neurons and layers
    Preparation of input and output data
    Training methods
    Training a single layer perceptron
    Training schedule
    Representation of a perceptron
    The problem of "exclusive or"
    Solution of the XOR problem
A MULTILAYER PERCEPTRON TRAINING
    Error backpropagation algorithm
    Further development of the algorithm
    Network paralysis
    Step length selection
    Local minimums
    Sensitivity to pattern presentation order
    Dynamical neuron addition
    Ability of neural networks to generalize
    Unsupervised learning
    Network with linear reward
KOHONEN NETWORKS
    Problem of classification
    Classification algorithms
    Architecture of Kohonen network
    Training of Kohonen layer
    Convex combination method
    Examples of training
    Modifications of training algorithm
    Network operating modes
    Using Kohonen networks for data compression
BACK-PROPAGATION NETWORK
    Grossberg layer
    Training of back-propagation network
GENETIC ALGORITHMS OF ARTIFICIAL NEURAL NETWORKS TRAINING
    Implementation of genetic algorithms to artificial neural networks training
    Positive features of genetic algorithms
    Negative features in artificial neural networks training
ADAPTIVE RESONANCE THEORY (ART)
    ART-1
    Architecture and operation
    Comparison layer
    Recognition layer
    ART network functioning
    Need for search
    Positive and negative features of ART
Bibliography
INTRODUCTION The theory of artificial neural networks originated more than half a century ago. It arose from the need to solve on computers those tasks that only a human could solve. The first researchers in the field of artificial neural networks were biologists and physicians, whose knowledge of biological neural networks allowed them to create the first models of artificial neural networks. Over half a century of development, this field of science has seen ups and downs. At present, many architectures of artificial neural networks have been developed, from simple to complex ones. However, even simple artificial neural networks show impressive results, and more advanced architectures make it possible to solve complex problems that previously only human reason could. Many tasks, however, remain unsolved. In this textbook we consider the basic architectures of artificial neural networks. For a deeper exploration of the material, it is suggested that the reader become acquainted with the works presented in the bibliography.
HISTORY OF ARTIFICIAL NEURAL NETWORKS EMERGENCE AND DEVELOPMENT The conceptual foundations of neuromathematics were laid in the early 1940s. In 1943, W. McCulloch and his follower W. Pitts formulated the basic concepts of the theory of brain activity. They obtained the following results: - a model of the neuron was developed as a primary processing element whose purpose is to compute a transition function of the scalar product of the input signal vector and the weight factor vector; - a design was proposed for a network formed from such elements and intended for performing logical or arithmetical operations; - the fundamental assumption was made that such a network is able to learn, recognize patterns, and generalize acquired information. Although neuromathematics has come a long way since that time, many of McCulloch's propositions remain valid to this day. In particular, despite the large number of existing neuron models, the McCulloch-Pitts principle of neuron functioning remains unchanged. The only shortcoming of their neuron model is the threshold type of the transition function. In the McCulloch-Pitts formalism, a neuron can be only in state 0 or 1, and the logic of transitions between these states is of the threshold type: each neuron computes the weighted sum of the states of all other neurons and compares it with a threshold to determine its own state. However, a threshold transition function can hardly provide sufficient flexibility of the neural network in learning or tuning for some kinds of problems. If the computed scalar product falls even slightly short of the specified threshold, the neuron forms no output signal at all; in other words, it fails to respond. This means that the intensity of the output signal (axon) is lost and, as a consequence, too small a value is formed on the weighted inputs of the next layer of neurons.
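To make the McCulloch-Pitts scheme concrete, here is a minimal sketch in Python (the weights and threshold are hypothetical values chosen for illustration, not taken from this textbook) of a threshold neuron that computes the scalar product of the input vector and the weight vector and fires only when the sum reaches the threshold:

# A minimal McCulloch-Pitts threshold neuron (illustrative sketch).
# Output is binary: 1 if the weighted sum of the inputs reaches the
# threshold, 0 otherwise - the all-or-nothing behavior criticized above.
def mp_neuron(inputs, weights, threshold):
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0

# With weights (1, 1) and threshold 2 the unit realizes logical AND:
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, '->', mp_neuron((x1, x2), (1, 1), 2))

Note how the discontinuity criticized in the text appears: an input whose weighted sum is 1.99 against a threshold of 2 produces exactly the same zero output as an all-zero input.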
Substantial advances in neurocybernetics came with the work of the American researcher Frank Rosenblatt (Cornell University), who proposed his own neural network model in 1958. Rosenblatt introduced the ability of synapses to be modified, which made the network able to learn. This model was named the perceptron. Initially, the perceptron was designed as a single-layer structure with a stiff threshold function of the processing element and binary or multivalued inputs. The first perceptrons were capable of recognizing some letters of the Roman alphabet. Subsequently, this perceptron model was essentially advanced. The perceptron was applied to problems of automatic classification, which generally reduce to distributing the attribute space among a given number of classes. For example, in the two-dimensional case, it is required to draw a line in the plane separating two domains. The perceptron can divide a plane or a space only by straight lines or planes, respectively. One serious shortcoming of the perceptron is that it is not always possible to find a combination of weight factors that enables a given perceptron to recognize a given ensemble of patterns. The cause of this shortcoming is that only a minor share of problems implies a straight boundary between domains; mostly the boundary is a rather complicated, open or closed curve. Therefore, if a single-layer perceptron, which can realize only a linear dividing surface, is used in an application requiring a nonlinear surface, patterns may be recognized incorrectly (this trouble is referred to as the linear indivisibility of the attribute space). The way out of this trouble is to use a multilayer perceptron, which can draw a polygonal line between the patterns under recognition. It should be noted that this is not the only difficulty that arises when dealing with perceptrons; perceptron learning methods are very poorly formalized as well. As a result, the use of perceptrons raised a number of questions, which initiated the development of "smart" neural networks and new applied methods that have also found application in many areas beyond neurocybernetics (e.g. the group method of data handling used to identify mathematical models).
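The linear-indivisibility problem described above is easy to reproduce in a few lines. The sketch below (an illustration under assumed details such as the learning rate and epoch count, not code from this textbook) trains a single-layer perceptron with the classic error-correction rule; it reaches zero errors on the linearly separable AND problem, but no weight combination lets it reach zero errors on XOR, whose two classes cannot be separated by a straight line:

# Single-layer perceptron with the error-correction learning rule
# (illustrative sketch). Returns the number of misclassified samples
# after training.
def train_perceptron(samples, epochs=100, lr=0.1):
    w = [0.0, 0.0]  # weights
    b = 0.0         # bias, i.e. a movable threshold
    for _ in range(epochs):
        for (x1, x2), target in samples:
            y = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - y                 # 0, +1 or -1
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return sum(target != (1 if w[0] * x1 + w[1] * x2 + b > 0 else 0)
               for (x1, x2), target in samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print('AND errors:', train_perceptron(AND))  # 0: linearly separable
print('XOR errors:', train_perceptron(XOR))  # > 0: no separating line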
In the 1970s, neural networks attracted much less attention, but research in this direction continued. A number of interesting designs, such as the cognitron and similar systems, were proposed, allowing patterns to be recognized regardless of image rotation or scaling. The cognitron was first proposed by the Japanese scientist K. Fukushima. A new round of neural network model development was associated with the works of many researchers, including Amari, Anderson, Carpenter, Kohonen, and especially Hopfield, as well as with the promising successes of optical technologies and the mature phase of VLSI development able to implement these new architectures. The modern mathematical approach to modeling neural computations was initiated by Hopfield's publications of 1982, in which a mathematical model of neural-network-based associative memory was formulated. The author showed that a single-layer neural network with "all-to-all" synapses is characterized by convergence to a single equilibrium point from a finite set of local minimums of an energy function that describes the whole structure of network relationships. This neural network dynamics was understood by other researchers too. However, only Hopfield and Tank showed how to construct the energy function for a specific optimization problem and how to use it to map the problem onto the neural network. This approach was subsequently advanced to solve other combinatorial optimization problems as well. The Hopfield approach is attractive in that the neural network can be programmed to solve a particular problem with no training iterations: the synapse weights are simply calculated from the energy function constructed especially for the given problem. The idea of neural network learning was first mentioned by Donald Hebb in his 1949 book "The Organization of Behavior". Before the book's publication it was commonly accepted that some kind of physical change must occur in a neural network for it to learn, but the nature of such changes remained unknown. Hebb assumed that the essential biological changes strengthening a neural network synapse are required only if the presynaptic and postsynaptic membranes are activated simultaneously. The essence of Hebb's idea is a distinct learning paradigm: although the details of weight-changing rules may differ, Hebb's basic postulate that synaptic strength between elements depends on the correlated activity of the associated elements was adopted in many later models of learning.
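To make Hopfield's energy-based memory and Hebb's postulate concrete, the following sketch (a simplified illustration with assumed details such as the +1/-1 state coding, not code from this textbook) stores patterns with a one-shot Hebbian outer-product rule, so that connections between co-active units are strengthened, and then recalls a stored pattern from a corrupted cue by asynchronous threshold updates that never increase the network energy:

import numpy as np

def hebbian_weights(patterns):
    # Hebb's postulate in matrix form: w[i, j] grows when units i and j
    # are active together across the stored patterns.
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for p in patterns:           # p is a vector of +1/-1 unit states
        w += np.outer(p, p)
    np.fill_diagonal(w, 0)       # no self-connections
    return w / n

def recall(w, state, steps=100):
    # Asynchronous updates; each flip cannot increase the energy
    # E = -0.5 * state @ w @ state, so the network settles into a
    # local minimum, ideally a stored pattern.
    state = state.copy()
    rng = np.random.default_rng(0)
    for _ in range(steps):
        i = rng.integers(len(state))
        state[i] = 1 if w[i] @ state >= 0 else -1
    return state

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
w = hebbian_weights(patterns)
cue = np.array([1, -1, 1, -1, 1, 1])  # first pattern with one bit flipped
print(recall(w, cue))                 # restores the first stored pattern

In the spirit of the remark above, the weights here are computed in one shot rather than learned through training iterations.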
Based on Hebb's idea, Edmonds and Minsky proposed their learning machine in 1951. Although Minsky was among the first to formulate the learning machine concept, real progress in the field of neuron-like network learning started with the work of Rosenblatt published in 1962. Rosenblatt proposed a class of simple neuron-like networks that he called perceptrons. The perceptron indeed represented a whole class of structures consisting of processing elements able to transfer signals and change the weights of their synapses. Rosenblatt's studies were focused primarily on brain simulation in an attempt to understand the mechanisms of brain activity and learning as well as cognitive processes. His works were continued and advanced by many scientists and engineers who designed a number of perceptron-based machines. They performed a mathematical analysis of the perceptron's computing power with the aim of determining the limits of its capabilities. However, interest in neural networks as computing means fell when the book by Minsky and Papert was issued. In this book, the authors stated that the exclusive-disjunction (XOR) operation is impracticable in single-layer neural networks and hence this direction is wrong. Nevertheless, Hopfield suggested a new type of neural network architecture in 1982 that later became known as Hopfield networks. Hopfield discovered a way to construct computing facilities based on neural networks. He also showed a way to implement neural-network-based associative memory and, later, a way to use neural networks to solve optimization problems such as, for instance, the traveling salesman problem. Another approach widely used in neural network paradigms was named the error back-propagation algorithm, or simply backpropagation. The conceptual basis of this approach was formulated by Werbos in 1974 and then independently by Rumelhart and McClelland in 1986. The latter authors, in their book "Parallel Distributed Processing: Explorations in the Microstructure of Cognition", opened wide prospects for neural-network-based approaches. The most recent advances in neural networks include the Boltzmann machine, competitive learning, and Kohonen self-organizing feature maps. The Boltzmann machine was developed by Hinton and Sejnowski based on thermodynamic models. Such models employ binary elements that occur only in states 0 or 1; these states are determined by a stochastic input function. Using the same assumptions that were applied to the Hopfield network, the authors managed to prove that the network converges to states that correspond to the inputs and the weight-specified inner restrictions as closely as possible. Unlike the above, the paradigm of competitive learning uses a layer of processing elements competing with each other. As a result, a neural network able to solve pattern recognition problems is formed. Such an approach was examined by Rumelhart and Zipser as well as by Grossberg. The Kohonen self-organizing feature map is a two-layer network able to deploy a topological map from any arbitrary initial point.
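The stochastic binary element of the Boltzmann machine mentioned above can be sketched as follows (an assumed standard formulation with a temperature parameter T, not code from this textbook): the unit turns on with a probability given by a sigmoid of its net input, so at high temperature it behaves almost randomly and at low temperature it approaches the deterministic threshold neuron.

import math
import random

def stochastic_unit(inputs, weights, T=1.0):
    # Net input, computed as in the deterministic neuron ...
    net = sum(w * x for w, x in zip(weights, inputs))
    # ... but the state 0/1 is drawn at random: the higher the net
    # input (and the lower the temperature T), the likelier state 1.
    p_on = 1.0 / (1.0 + math.exp(-net / T))
    return 1 if random.random() < p_on else 0

# At T -> 0 this tends to the threshold unit; at large T it is a coin flip.
print(sum(stochastic_unit((1, 1), (0.5, 0.5), T=1.0) for _ in range(1000)))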