Showing posts from April, 2024

Kernel Function and Kernel Trick

  Parametric Models: They store only the parameters, not the entire training dataset.

  Memory-Based Methods: They store the entire training dataset or a subset of it. They are fast to train, but prediction takes longer. E.g. kNN.

  Need for Kernels: Not all problems are linearly separable. However, when a feature vector is transformed to a higher-dimensional space, a non-linearly separable problem can become linearly separable. For example, consider a point (x, y). In the 2D space the data might be non-linearly separable; that is, a straight line cannot separate the points. We can add a new dimension that is a combination of the existing dimensions, say z = x^2 + y^2. The transformed feature space (x, y, z) has 3 dimensions. In this space, z can spread the data points along x and y in such a way that a linear separator exists between them. That is, we move from the original feature space to a transformed feature space. We cannot always say in advance how many dimensions the transformed space needs.
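The z = x^2 + y^2 lift above can be sketched in a few lines. The data here is a hypothetical toy example (two concentric circles) chosen to illustrate the idea; after the lift, a plane such as z = 5 separates the classes even though no line separates them in 2D:

```python
import numpy as np

# Hypothetical toy data: points on an inner circle (class 0) and an
# outer circle (class 1) -- not separable by a straight line in 2D.
rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, size=20)
inner = np.c_[np.cos(angles[:10]), np.sin(angles[:10])]      # radius 1
outer = 3 * np.c_[np.cos(angles[10:]), np.sin(angles[10:])]  # radius 3

def lift(points):
    """Map (x, y) -> (x, y, z) with z = x^2 + y^2, as in the text."""
    z = points[:, 0] ** 2 + points[:, 1] ** 2
    return np.c_[points, z]

# After the lift, every inner point has z = 1 and every outer point
# has z = 9, so the plane z = 5 is a linear separator in 3D.
print(lift(inner)[:, 2])  # all 1.0
print(lift(outer)[:, 2])  # all 9.0
```

The lifted third coordinate depends only on the radius, which is exactly the quantity that distinguishes the two circles.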

Constructing Kernels

  Properties of Kernel Functions: To perform this transformation we need to choose proper kernel functions. Functions that satisfy the following properties make good kernel functions.

  1. Symmetry: K(x, y) = K(y, x). That is, the similarity measure between x and y is the same as the similarity measure between y and x, so the kernel matrix is a symmetric matrix.

  2. Positive Semidefiniteness: The kernel matrix formed must be positive semidefinite, i.e. all of its eigenvalues are non-negative.

  Mercer's Theorem: It states that a continuous, symmetric and positive semidefinite function is a valid kernel. It also states that the kernel function corresponds to a dot product (inner product in Euclidean space) in a higher-dimensional space.

  Constructing Common Kernel Functions: The following pdf shows how to construct kernels and check for valid kernels.
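Both properties can be checked numerically. The sketch below, using random data purely for illustration, builds the Gram (kernel) matrix for the Gaussian (RBF) kernel and verifies symmetry and positive semidefiniteness via its eigenvalues:

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))  # 6 arbitrary points in 2D

# Gram (kernel) matrix: K[i, j] = K(x_i, x_j)
K = np.array([[rbf_kernel(a, b) for b in X] for a in X])

print(np.allclose(K, K.T))                     # symmetry holds
print(np.linalg.eigvalsh(K).min() >= -1e-10)   # eigenvalues non-negative
```

Because the RBF kernel satisfies Mercer's conditions, every Gram matrix it produces passes both checks (the small tolerance absorbs floating-point rounding).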

Clustering - Agglomerative clustering

Agglomerative clustering is a bottom-up, hierarchical approach to clustering in which each data point starts in its own cluster and pairs of clusters are merged based on certain criteria until all points belong to one cluster, or until a stopping criterion is met. The similarity between clusters is typically measured using a distance metric, such as Euclidean distance, and different linkage methods define how this similarity is computed. Here are some common types of agglomerative clustering based on the linkage criteria: 1. Single Linkage (or Minimum Linkage): In single linkage clustering, the distance between two clusters is defined as the minimum distance between any point in one cluster and any point in the other.
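The merging loop described above can be sketched directly. This is a naive, illustrative implementation of single (minimum) linkage, with a tiny made-up dataset of two obvious groups; production code would use an optimized library routine instead:

```python
import numpy as np

def single_linkage(points, n_clusters):
    """Naive agglomerative clustering with single (minimum) linkage.

    Each point starts as its own cluster; the loop repeatedly merges
    the two clusters whose closest pair of points is nearest, until
    only n_clusters remain.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = (None, None, np.inf)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: minimum distance over all cross-pairs.
                d = min(np.linalg.norm(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if d < best[2]:
                    best = (a, b, d)
        a, b, _ = best
        clusters[a] += clusters[b]   # merge the closest pair of clusters
        del clusters[b]
    return clusters

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
print(single_linkage(pts, 2))  # [[0, 1], [2, 3]]
```

Stopping at a fixed number of clusters is one common criterion; another is cutting the hierarchy at a chosen distance threshold.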

Measures of Similarity and Dissimilarity

All the measures are as given in the pdf.
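As a quick illustration of a few standard measures (the specific vectors here are made up for the example), the following computes Euclidean distance, Manhattan distance, and cosine similarity between two vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.0, 1.0])

# Dissimilarity measures: larger means the points are farther apart.
euclidean = np.linalg.norm(a - b)        # sqrt(1 + 0 + 4) = sqrt(5)
manhattan = np.sum(np.abs(a - b))        # 1 + 0 + 2 = 3

# Similarity measure: cosine of the angle between the vectors,
# 1 for identical directions, 0 for orthogonal vectors.
cosine_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(euclidean, manhattan, cosine_sim)
```

Distances grow with dissimilarity while cosine similarity grows with similarity, which is why the two families are used in complementary ways.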