Computer Science & Engineering Artificial Neural Networks Virtual Lab Experiments

Solution of optimization problems

Description of self-organizing feature map

Self-organizing map (SOM) was proposed by T. Kohonen [Kohonen, 1982a], and it provides a way of visualizing data. In particular, high dimensional data can be reduced to lower dimensions using SOM. The map also helps in grouping similar data together. Thus the map helps in visualizing the concept of clustering by grouping data with similar characteristics. A SOM attempts to cluster the training data by detecting similarity between data points/ feature vectors. The map does not require external supervision, and hence represents an unsupervised way of learning.

Architecture of SOM

In a self-organizing map, the input is represented by an \(M\)-dimensional feature vector. The output layer consists of \(N\) units, where the units are arranged in the form of a grid. Each unit is a neuron/node. Each dimension of the input feature vector is associated with a unit/neuron in the input layer. A weight vector is associated with a unit/node/neuron in the output layer, and the dimension of this weight vector is same as that of the input feature vector. The map operates in two modes, namely, training and mapping. The process of training helps to build the map using (a large number of) input feature vectors, which are also known as training examples. The process of mapping a new input feature vector automatically identifies the region in the output whose neighbourhood units have similar properties.

Figure 1: Architecture of SOM

Algorithm for learning

The training of SOM is based on the principle of competitive learning (Refer experiment 7, Artificial Neural Networks virtual labs). A training example is a set of inputs with some given number of nodes. Initially weights in the SOM network are assigned randomly. When a training example is presented to the network, Euclidean distances between the training feature vector and all the weight vectors are computed. The neuron in the output unit for which the distance is found minumum is said to fire. Which mean that the weights of this neuron and the neurons close to it in the SOM network are adjusted towards the input vector. The magnitude of the adjustment decreases with time and with distance of other neurons from the fired neuron.  A neighborhood function is defined to specify the neurons whose weight vectors are updated along with that of the fired neuron. The region of the neighborhood function  is broad initially, and the region shrinks gradually with successive iterations. This process is repeated for each input vector for a  number of iterations. The SOM associates the output nodes with groups or patterns in the input data set.

Let us consider an \(N\)-unit output layer and \(M\)-dimensional input feature vectors. The following sequence of steps is involved in the learning process, i.e., the process of updating the weights.

      1. Initialize the weights from \(M\) inputs to the \(N\) output units to small random values. Initialize the size of the neighbourhood region.

      2. Present a new input feature vector.

      3. Compute the distance between the input feature vector the weight vector associated with each neuron.

      4. Select the neuron k which minimizes the distance.

      5. Update the weight vector associated with neuron k, and also the weight vectors associated with neurons in the neighbourhood of neuron k.

      6. Repeat steps 2 through 5 for all inputs, several times.

Mapping using a test feature vector

During mapping again, there will be one winning neuron, the neuron whose weight vector lies closest to the input vector. This will be determined by calculating the Euclidean distance between input vector and all the weight vectors. And then the test feature vector will be said to fire the winning neuron. The updation of weights for rest of the cycles follow the same algorithm as above.