- Published on
Embedding
- Authors

- Name
- Kai Kang
- Role
- Staff Software Engineer @ Meta · Solo App Builder
I created this embedding-based game :)
What are Embeddings, and why?

The language of modern neural networks is arrays of numbers, also called tensors or vectors. Embedding vectors are just "lists of numbers".
They have interesting traits, though. Words with similar meanings map to similar vectors. "Cat" and "dog" are related words, so their vectors are close to each other. Both are very different from "hamburger", so you can expect the embedding vector for "hamburger" to be far away from them.
We need them because, just as English, Chinese, and Japanese are our languages, vectors are the language of neural networks, and therefore of LLMs.
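To make "similar vectors" concrete, here is a minimal sketch that compares toy vectors with cosine similarity, one common way to measure closeness. The three-dimensional values below are hypothetical and hand-picked for illustration; real embeddings have hundreds or thousands of dimensions and come from a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, made up for this example (not from any real model).
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "hamburger": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
cat_burger = cosine_similarity(toy_embeddings["cat"], toy_embeddings["hamburger"])

print(f"cat vs dog:       {cat_dog:.3f}")
print(f"cat vs hamburger: {cat_burger:.3f}")
```

A score near 1 means the vectors point in almost the same direction. With these toy values, "cat" scores much higher against "dog" than against "hamburger".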
LLMs can do math on these embedding vectors, and this is the core reason they can output human-language words while only understanding tensors and vectors. The most classic example is "king" - "man" + "woman" ≈ "queen".
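The vector arithmetic described above can be sketched with the well-known word analogy king - man + woman ≈ queen. The two-dimensional vectors below are hypothetical and built by hand so that one axis loosely stands for "royalty" and the other for "gender"; real models learn such directions only approximately.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 2-D embeddings: axis 0 ~ "royalty", axis 1 ~ "gender".
toy = {
    "king": [0.9, 0.9],
    "queen": [0.9, -0.9],
    "man": [0.1, 0.9],
    "woman": [0.1, -0.9],
    "apple": [-0.8, 0.0],
}

# Element-wise king - man + woman
target = [k - m + w for k, m, w in zip(toy["king"], toy["man"], toy["woman"])]

# Search the remaining words for the one closest to the result.
candidates = [w for w in toy if w not in ("king", "man", "woman")]
best = max(candidates, key=lambda w: cosine_similarity(toy[w], target))
print(best)
```

Excluding the three input words from the candidates is a common convention for this kind of analogy search, since the result often stays close to one of its inputs.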

Embeddings into an LLM, how?

When a word like "cat" is fed into an LLM, it first gets "tokenized", a concept we will cover in another post. The word becomes a token (the mapping is not 1-to-1, since a word can be split into several tokens, but for simplicity we treat it as one token). This token is then translated into an embedding vector via a lookup table. The embedding lookup table is learned during LLM training.
initial_embedding = embedding_table[token]
contextual_embedding = ...
next_token = ...
More on this in an LLM note.
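As a rough illustration of the lookup step only, here is a minimal sketch with a toy vocabulary. The names `vocab` and `embed` and all the numbers are hypothetical; in a real LLM the table has one row per token in the tokenizer's vocabulary, and its values are learned during training.

```python
# Hypothetical toy vocabulary: one word maps to one token ID (a simplification).
vocab = {"cat": 0, "dog": 1, "hamburger": 2}

# Hypothetical embedding table: row i is the vector for token ID i.
embedding_table = [
    [0.9, 0.8, 0.1],  # token 0 -> "cat"
    [0.8, 0.9, 0.2],  # token 1 -> "dog"
    [0.1, 0.2, 0.9],  # token 2 -> "hamburger"
]

def embed(word):
    """Look up the initial (context-free) embedding for a single word."""
    token = vocab[word]            # word -> token ID
    return embedding_table[token]  # token ID -> row of the table

initial_embedding = embed("cat")
print(initial_embedding)
```

The lookup is just indexing into a table, so the same token always starts from the same vector, whatever sentence it appears in.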
How to get pre-trained embeddings
[1] OpenAI has an online API:
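from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The dog is playing in the park."
)

# One embedding per input; here there is a single input string
embedding = response.data[0].embedding
print(len(embedding))  # number of dimensions in the vector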
[2] The Sentence Transformers library is a standard offline Python solution:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

texts = [
    "A dog is running outside.",
    "A puppy is playing in the park.",
    "I want to eat a hamburger."
]

# Returns one row per input text
embeddings = model.encode(texts)
print(embeddings.shape)
[3] Hugging Face gives direct access to the models, which offers more flexibility:
from transformers import AutoTokenizer, AutoModel
import torch

model_name = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

text = "A dog is running outside."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Inference only, so gradient tracking is not needed
with torch.no_grad():
    outputs = model(**inputs)

# Shape: (batch, tokens, hidden_size)
token_embeddings = outputs.last_hidden_state
# Average over the token axis to get one vector per sentence
sentence_embedding = token_embeddings.mean(dim=1)
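One caveat about the plain mean above: it averages over every token position. That is fine for a single sentence, but when sentences of different lengths are batched with padding, the padding positions would be averaged in too. A common remedy is to weight the mean by the attention mask. Here is a minimal pure-Python sketch of that idea for one sequence; the function name `masked_mean_pool` and the numbers are hypothetical, and a real implementation would do the same thing on tensors.

```python
def masked_mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sequence, skipping positions where mask == 0."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]

# Two real tokens followed by one padding position (values are made up).
token_embeddings = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]

pooled = masked_mean_pool(token_embeddings, attention_mask)
naive = [sum(column) / len(column) for column in zip(*token_embeddings)]

print("masked mean:", pooled)
print("naive mean: ", naive)
```

The naive mean is pulled toward the padding vector, while the masked mean only reflects the real tokens.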
Applications
- Search (RAG)
  - RAG: embedding vector -> vector DB
- Recommendation Systems
  - User Embedding, Item Embedding
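Both applications above boil down to the same operation: given a query vector, find the stored vectors closest to it. Here is a minimal sketch with an in-memory dictionary standing in for a vector DB. The `index` contents, the document IDs, and the `top_k` helper are hypothetical; a real vector DB typically uses approximate nearest-neighbor search instead of scanning every entry.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    """Return the IDs of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical "vector DB": document ID -> made-up embedding.
index = {
    "doc_dog_park": [0.8, 0.9, 0.2],
    "doc_puppy": [0.9, 0.8, 0.1],
    "doc_burger": [0.1, 0.2, 0.9],
}

# Hypothetical query embedding, e.g. for a question about dogs.
query = [0.85, 0.85, 0.1]

results = top_k(query, index, k=2)
print(results)
```

For recommendations, the query would be a user embedding and the index would hold item embeddings; the ranking logic stays the same.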
An interesting paper on Food Embeddings shows that each food ingredient can have an embedding. An embedding space is not objective: it encodes the relationship you choose to train or retrieve on (for example, ingredients that are cooked together, or ingredients that substitute for each other).