Monday, June 8, 2026
2026-06-08
I started a #blog for #cosine-similarity. The blog opens with a visual of the definition of the cos function. It then covers vectors and how the cos theta value is 1 when the vectors overlap (point in the same direction) and 0 when the vectors are orthogonal. This is essentially the dot product of the vectors divided by the product of their lengths. I then wrote a script that uses word embeddings from the gensim package to illustrate how learned word embeddings live in a latent space that encodes the semantic meaning of the words. For example, we can do things like Queen ≈ King - Male + Female in the latent space. However, the result is not exactly the Queen vector, so we use nearest neighbor search to find the closest word vector. Next work is to play with sentence embeddings and create the GitHub commit understanding examples.
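
A minimal sketch of the two ideas in this note, in plain Python. This is not the gensim script itself. The 2-D vectors in `toy_vectors` are made up for illustration (axis 0 is roughly "royalty", axis 1 roughly "gender"), and the helper names `cosine_similarity` and `nearest` are hypothetical. It shows that cosine similarity is the dot product divided by the product of the vector lengths, and that the analogy arithmetic lands near, but not exactly on, the expected word, so a nearest neighbor search is needed.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 2-D "embeddings" (assumption: real embeddings have hundreds of
# dimensions and are learned from text, not written by hand).
toy_vectors = {
    "king": [0.9, 0.8],
    "queen": [0.9, -0.8],
    "male": [0.1, 0.9],
    "female": [0.1, -0.9],
    "apple": [-0.7, 0.0],
}


def nearest(target, vectors, exclude=()):
    """Return the word whose vector has the highest cosine similarity to target."""
    candidates = {w: v for w, v in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine_similarity(target, candidates[w]))


# King - Male + Female, computed component by component.
target = [
    k - m + f
    for k, m, f in zip(toy_vectors["king"], toy_vectors["male"], toy_vectors["female"])
]

# The arithmetic result does not coincide with any stored vector,
# so we look up the closest one, skipping the words used in the query.
result = nearest(target, toy_vectors, exclude=("king", "male", "female"))
print(target, result)
```

With a trained gensim model, the same lookup would normally go through `KeyedVectors.most_similar(positive=[...], negative=[...])` rather than a hand-written search.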