ANN Series 8 - Perceptron


In the previous blogs in this series, we saw how to handle regression with a simple ANN architecture. In this blog, we perform binary classification with a single-layer ANN, which is termed a perceptron. To update the weights, we use the perceptron learning rule rather than the gradient descent that was used for regression.

Perceptron

  • Binary classifier : Class = {1,-1}
  • Linear Classifier
  • Learns the weights and the threshold
  • Activation function
    • Step function: binary or sign
  • Input vector
    • x = ( x1, x2, ... , xn)
  • Weight vector associated with the input
    • w = ( w1, w2, ... , wn)
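The pieces listed above can be put together in a few lines. This is only a minimal sketch: the helper names (affine, sign_step, predict) and the example numbers are made up for illustration, and the threshold is folded into a bias term b, which is an assumption about how the threshold is represented.

```python
def affine(w, x, b=0.0):
    """Weighted sum of the inputs plus a bias: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign_step(a):
    """Sign activation: class 1 when a >= 0, otherwise class -1."""
    return 1 if a >= 0 else -1


def predict(w, x, b=0.0):
    """Class label in {1, -1} for the input vector x."""
    return sign_step(affine(w, x, b))


# Illustrative (made-up) weights and bias.
w = [0.5, -1.0]
b = 0.25

print(predict(w, [2.0, 0.5], b))  # 0.5*2.0 - 1.0*0.5 + 0.25 = 0.75, so class 1
print(predict(w, [0.0, 1.0], b))  # -1.0 + 0.25 = -0.75, so class -1
```

Treating an output of exactly 0 as class 1 is a convention chosen here; it matches the condition w.x ≥ 0 used for the class -1 case later in this post.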


 Perceptron Learning Algorithm

  • Step 1 : Initialize weights randomly
  • Step 2 : Take a training sample at random without replacement
  • Step 3 : Perform Affine transformation of input and weights
  • Step 4 : Pass output of affine transformation through activation function to find the class label
  • Step 5 : Update weights based on perceptron learning rule
  • Step 6 : Repeat from step 2 till all the inputs are correctly classified

Refer to the previous blog in this series for an explanation of the affine transformation.
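The six steps can be sketched roughly as follows. This is an illustrative sketch rather than a reference implementation: the function names, the AND-gate data, the max_epochs safety cap and the fixed random seed are all assumptions added for the example. The update used in Step 5 is w_new = w + t*x, which reduces to the w + x and w - x cases discussed later in this post.

```python
import random


def predict(w, b, x):
    """Steps 3 and 4: affine transformation followed by the sign activation."""
    a = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if a >= 0 else -1


def train_perceptron(samples, labels, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize weights (and bias) randomly.
    w = [rng.uniform(-1, 1) for _ in samples[0]]
    b = rng.uniform(-1, 1)
    for _ in range(max_epochs):  # safety cap, not part of the six steps
        # Step 2: visit the samples in random order, without replacement.
        order = list(range(len(samples)))
        rng.shuffle(order)
        mistakes = 0
        for i in order:
            x, t = samples[i], labels[i]
            if predict(w, b, x) != t:
                # Step 5: perceptron learning rule, applied only on a mistake.
                w = [wi + t * xi for wi, xi in zip(w, x)]
                b += t
                mistakes += 1
        # Step 6: stop once a full pass makes no mistakes.
        if mistakes == 0:
            break
    return w, b


# AND gate with labels in {1, -1}; this data is linearly separable.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
labels = [-1, -1, -1, 1]

w, b = train_perceptron(samples, labels)
print([predict(w, b, x) for x in samples])  # matches labels once training has converged
```

The learned weights depend on the random initialization and the sample order, so only the final predictions are worth comparing, not the weight values.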




Geometric Intuition behind the Perceptron Learning Algorithm

Perceptron Criterion:
Case 1 : No misclassification
No update

Case 2 : Misclassified x_n
Minimize -(w . x_n) * t_n, where t_n ∈ {1, -1} is the target class of x_n
On the decision boundary w.x = 0, so w is orthogonal to every point x that lies on the boundary: the cosine of the angle between them is 0, i.e. the angle is 90°. We want to update w, and with it the decision boundary, such that w moves:
i) closer to the misclassified example when the example is positive
ii) farther from the misclassified example when the example is negative
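To make the criterion concrete, here is a small sketch that sums -(w . x_n) * t_n over the misclassified samples and then applies one update, w_new = w + t_n * x_n. The function names and the three sample points are made up for illustration, and the bias is left out to keep the arithmetic short.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1


def perceptron_criterion(w, samples, targets):
    """Sum of -(w . x_n) * t_n over the misclassified samples only."""
    return sum(-dot(w, x) * t
               for x, t in zip(samples, targets)
               if predict(w, x) != t)


# Made-up data: two positive examples and one negative example.
samples = [[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0]]
targets = [1, 1, -1]

w = [1.0, -1.0]
print(perceptron_criterion(w, samples, targets))  # 2.0: first and third samples are wrong

# One update on the first (misclassified, positive) sample.
x, t = samples[0], targets[0]
w_new = [wi + t * xi for wi, xi in zip(w, x)]
print(w_new)                                          # [2.0, 1.0]
print(perceptron_criterion(w_new, samples, targets))  # 0: every sample is now correct
```

For the sample used in the update, the term -(w . x_n) * t_n drops by exactly x_n . x_n, because -((w + t_n*x_n) . x_n) * t_n = -(w . x_n) * t_n - x_n . x_n when t_n is 1 or -1. In this tiny example one update happens to fix every sample; in general an update can also change the terms of other samples.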

Intuitive Explanation

The weight vector can be thought of as pointing in the "ideal" direction where the sum of the weighted inputs gives the correct classification. Updating the weights based on misclassified examples gradually rotates and shifts this vector (and thus the decision boundary) towards a position where the classifications are as accurate as possible given the training data.

The bias update adjusts where the decision boundary sits in the feature space. Without a bias the boundary must pass through the origin; with it, the boundary's position can be fine-tuned, and the affine output can be non-zero even when the inputs are all zeros.

In essence, the perceptron learning algorithm iteratively adjusts the orientation and position of the decision boundary by updating the weights and bias, aiming to minimize the classification errors on the training set.

Geometric Interpretation

  • The weight vector can be thought of as normal to the decision boundary. Changing it rotates or tilts the boundary in the feature space.
  • The bias shifts the boundary closer or further away from the origin without rotating it.
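Both points can be checked numerically in two dimensions. The sketch below assumes the boundary is written as w . x + b = 0; the helper names and the numbers are chosen only for illustration.

```python
import math


def boundary_direction(w):
    """A vector that runs along the 2-D line w . x + b = 0."""
    return [-w[1], w[0]]


def distance_from_origin(w, b):
    """Perpendicular distance from the origin to the line w . x + b = 0."""
    return abs(b) / math.hypot(w[0], w[1])


w = [3.0, 4.0]
d = boundary_direction(w)

# w is normal to the boundary: its dot product with the direction is zero.
print(w[0] * d[0] + w[1] * d[1])      # 0.0

# Changing only the bias moves the boundary but keeps its direction.
print(distance_from_origin(w, 5.0))   # 1.0
print(distance_from_origin(w, 10.0))  # 2.0
```

The direction vector depends only on w, so a change in b slides the line parallel to itself, while a change in w turns it.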

Case 1 : If x ∈ class 1 and w.x < 0

     w_new = w + x

cos θ_new ∝ w_new.x
          ∝ (w + x).x
          ∝ (w.x + x.x)
          ∝ (cos θ + x.x)
   Since x.x > 0 for any non-zero x:
   cos θ_new > cos θ
       θ_new < θ

Here θ is the angle between w and x. With every update, w_new moves towards x, and the angle between the blue line and w decreases.
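A quick numerical check of this case, using made-up vectors w and x (assumptions chosen so that w.x < 0), computes the angle before and after the update:

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def angle_deg(u, v):
    """Angle between two non-zero vectors, in degrees."""
    c = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))  # clamp for safety


w = [-1.0, 1.0]
x = [2.0, 1.0]   # positive example with w.x = -1 < 0, so it is misclassified

w_new = [wi + xi for wi, xi in zip(w, x)]   # w_new = w + x

theta_old = angle_deg(w, x)      # about 108.4 degrees
theta_new = angle_deg(w_new, x)  # about 36.9 degrees
print(theta_old, theta_new)
```

The angle drops below 90 degrees here, so x ends up on the positive side of the boundary after a single update. That will not always happen in one step.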

Case 2 : If x ∈ class -1 and w.x ≥ 0

     w_new = w - x

cos θ_new ∝ w_new.x
          ∝ (w - x).x
          ∝ (w.x - x.x)
          ∝ (cos θ - x.x)
   Since x.x > 0 for any non-zero x:
   cos θ_new < cos θ
       θ_new > θ

With every update, w_new moves away from x, and the angle between the red line and w increases.
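Each update in this case lowers w.x by exactly x.x, so a negative example may need several updates before it lands on the correct side. The sketch below uses made-up vectors and no bias to count how many updates one such example needs:

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


w = [3.0, 0.0]
x = [1.0, 0.0]   # negative example with w.x = 3 >= 0, so it is misclassified

updates = 0
while dot(w, x) >= 0:                      # still classified as class 1
    w = [wi - xi for wi, xi in zip(w, x)]  # w_new = w - x
    updates += 1
    print(updates, w, dot(w, x))           # w.x falls by x.x = 1 each time

print(updates)  # 4
print(w)        # [-1.0, 0.0]
```

In the learning algorithm these repeated updates are spread over several passes through the training data, interleaved with updates from other samples, rather than applied back to back as in this loop.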

Limitations

  • Produces only a Boolean (two-class) output
  • A single-layer perceptron solves only linearly separable problems, because it can only find a linear decision boundary
  • Updates the weights and bias for every misclassified training sample, one sample at a time
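The linear-separability limitation shows up clearly on XOR. The sketch below is an illustration under stated assumptions: weights start at zero instead of random values, samples are visited in a fixed order, and the epoch count of 1000 is arbitrary. It trains the same rule on AND (separable) and XOR (not separable) and reports the number of mistakes in the last pass.

```python
def train(samples, labels, epochs=1000):
    """Perceptron rule with zero initialization; returns w, b and the
    number of mistakes made during the final pass over the data."""
    w = [0.0] * len(samples[0])
    b = 0.0
    mistakes = 0
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(samples, labels):
            a = sum(wi * xi for wi, xi in zip(w, x)) + b
            y = 1 if a >= 0 else -1
            if y != t:
                w = [wi + t * xi for wi, xi in zip(w, x)]
                b += t
                mistakes += 1
        if mistakes == 0:   # every sample classified correctly
            break
    return w, b, mistakes


inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
and_labels = [-1, -1, -1, 1]
xor_labels = [-1, 1, 1, -1]

_, _, and_mistakes = train(inputs, and_labels)
_, _, xor_mistakes = train(inputs, xor_labels)

print(and_mistakes)  # 0: a separating line exists and is found
print(xor_mistakes)  # at least 1: no single line separates XOR
```

Raising the number of epochs does not help for XOR, since no choice of w and b classifies all four points correctly.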

 



