Kernel Function and Kernel Trick
Parametric Models
They store only the learned parameters, not the entire training dataset.

Memory Based Methods
They store the entire training dataset, or a subset of it. They are fast to train, but prediction takes longer, e.g., k-NN.

Need for Kernels
Not all problems are linearly separable. However, when a feature vector is transformed to a higher-dimensional space, a non-linearly separable problem can become linearly separable. For example, consider a point (x, y). In the 2D space the data might be non-linearly separable, that is, no straight line can separate the classes. We can add a new dimension that is a combination of the existing dimensions, say z = x^2 + y^2. The transformed feature space is now (x, y, z), which has 3 dimensions. In this space, z can spread the data points out in such a way that a linear separator (a plane) exists between the classes. That is, we move from the original feature space to a transformed feature space. We cannot say in advance how many dimensions the transformed space will need.
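
The z = x^2 + y^2 transform above can be sketched on hypothetical data: two concentric rings in 2D, which no straight line can separate, but which the added coordinate z separates with a single plane. The radii and the threshold 2.25 are illustrative choices, not from the notes.

```python
import numpy as np

# Hypothetical concentric-ring data: class 0 inside (radius < 1),
# class 1 outside (radius between 2 and 3). Not linearly separable in (x, y).
rng = np.random.default_rng(0)
n = 200
radii = np.concatenate([rng.uniform(0.0, 1.0, n), rng.uniform(2.0, 3.0, n)])
angles = rng.uniform(0.0, 2.0 * np.pi, 2 * n)
x, y = radii * np.cos(angles), radii * np.sin(angles)
labels = np.array([0] * n + [1] * n)

# Added dimension: z = x^2 + y^2 (which equals radius^2).
# In (x, y, z), the plane z = 2.25 separates the two classes perfectly.
z = x**2 + y**2
print((z[labels == 0] < 2.25).all())  # True: inner ring lies below the plane
print((z[labels == 1] > 2.25).all())  # True: outer ring lies above the plane
```

Since z = x^2 + y^2 is exactly the squared radius, the inner ring has z < 1 and the outer ring has z > 4, so any plane z = c with 1 < c < 4 separates them.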