Artificial Neural Networks


Artificial neural networks (ANNs) are learning algorithms implemented as computer programs or in hardware. An ANN is characterized by its architecture and its method of training. The architecture describes how the processing elements are connected and the direction in which signals travel between them. A processing element, or unit, is a node where input signals converge and are transformed into an output by a transfer (activation) function. Output values are usually multiplied by weights before they reach the next node. The purpose of training is to find values of these weights that are optimal according to some criterion. In supervised training, inputs are presented to the network and its outputs are compared with the desired (target) outputs; the weights are then adjusted to minimize an objective function such as the root mean square error. In unsupervised training, no target outputs are supplied, and the network finds its own optimal parameters from the inputs alone.
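
As a minimal sketch of a single processing element and of the root mean square error mentioned above, the Python fragment below may help. The function names, the sigmoid choice, and the optional bias term are illustrative assumptions and do not come from the text.

```python
import math

def sigmoid(x):
    # Maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(inputs, weights, bias=0.0, activation=sigmoid):
    # Weighted sum of the incoming signals (the bias is an assumption,
    # defaulting to zero), passed through the transfer function.
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(total)

def rmse(outputs, targets):
    # Root mean square error between network outputs and target outputs.
    n = len(outputs)
    return math.sqrt(sum((o - t) ** 2 for o, t in zip(outputs, targets)) / n)
```

Supervised training would then adjust `weights` so that `rmse` over the training records becomes as small as possible.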

Although there are several types of neural networks, a simple example is the multilayer perceptron, in which the units are arranged in layers. The first and last layers are the input and output layers, and the sets of units between them are called hidden layers. The transfer functions in the input and output layers can be identities, while those of the hidden layers are usually sigmoid or hyperbolic tangent functions, which map the weighted sum of the inputs to the range between zero and one or between minus one and plus one, respectively. Signals in such a network flow in one direction only, from inputs to outputs, which is why it is called a feedforward network. The network's output can also be fed back to its inputs, giving a recurrent network, which is useful for time series modeling. Typically, each hidden layer contains several processing elements, so the outputs are modeled as highly non-linear functions of the original inputs. It is this architecture of units that allows an ANN to be a universal approximator: it can recover an unknown mapping from the input space to the output space as long as it contains enough processing elements (White et al., 1992). The network can be trained with backpropagation (Rumelhart and McClelland, 1986), which seeks a minimum of the error function by gradient descent. After each presentation of the input records, the weights are adjusted in the direction that reduces the value of the error function.
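
The feedforward pass and the weight adjustment described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: one hidden layer of hyperbolic tangent units, a single identity output unit, bias terms, a fixed learning rate, and the squared error of each record as the quantity whose gradient is followed. None of the names or settings below are taken from the cited works.

```python
import math
import random

def init_network(n_in, n_hidden, seed=0):
    # Small random starting weights; the seed and the range are assumptions.
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                     for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }

def forward(net, x):
    # Hidden units: tanh of the weighted input sum. Output unit: identity.
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w_hidden"], net["b_hidden"])]
    output = sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]
    return hidden, output

def train_step(net, x, target, lr=0.1):
    # One gradient descent update on 0.5 * (output - target)**2 for one record.
    hidden, output = forward(net, x)
    err = output - target
    for j, h in enumerate(hidden):
        # Error propagated back through tanh, whose derivative is 1 - h**2.
        # Computed before w_out[j] is changed.
        delta_h = err * net["w_out"][j] * (1.0 - h * h)
        net["w_out"][j] -= lr * err * h
        for i, xi in enumerate(x):
            net["w_hidden"][j][i] -= lr * delta_h * xi
        net["b_hidden"][j] -= lr * delta_h
    net["b_out"] -= lr * err

def network_rmse(net, data):
    # Root mean square error of the network over (input, target) records.
    return math.sqrt(sum((forward(net, x)[1] - t) ** 2 for x, t in data)
                     / len(data))
```

Calling `train_step` once per record, repeatedly over the whole data set, gives the record-by-record weight adjustment described in the text; comparing `network_rmse` before and after training shows whether the error has been reduced.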
