from mixtrain import Embedding
Overview
Embedding represents a vector embedding for ML features or semantic search. It is not file-based — it holds the vector data directly.
Use it to return embeddings from models or mark dataset columns that contain vector values.
Constructor
Embedding(
    values: list[float],
    *,
    dimension: int | None = None,
    model: str | None = None,
)
| Parameter | Type | Description |
|---|---|---|
| values | list[float] | The embedding vector |
| dimension | int \| None | Optional vector dimension hint. Use len(values) if you want to set it explicitly. |
| model | str \| None | Name of the model that generated this embedding |
embedding = Embedding(
    values=[0.1, 0.2, 0.3, ...],  # remaining values elided
    dimension=1536,
    model="text-embedding-3-small",
)
Properties
| Property | Type | Description |
|---|---|---|
| values | list[float] | The embedding vector |
| dimension | int \| None | Optional vector dimension hint |
| model | str \| None | Source model name |
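Because dimension is documented as an optional hint, code that reads it may want a fallback for the case where it was never set. The sketch below shows one way to do that, using plain arguments in place of an Embedding instance. resolve_dimension is a hypothetical helper, not part of mixtrain, and the mismatch check is an assumption about what you might want, not documented library behavior.

```python
from typing import Optional, Sequence


def resolve_dimension(values: Sequence[float], dimension: Optional[int] = None) -> int:
    """Return the vector length, checking it against the optional hint.

    Hypothetical helper for illustration; not part of mixtrain.
    """
    actual = len(values)
    if dimension is not None and dimension != actual:
        # The hint disagrees with the data, so surface the problem early.
        raise ValueError(f"dimension hint {dimension} != len(values) {actual}")
    return actual


# With an Embedding instance you would pass embedding.values and embedding.dimension.
print(resolve_dimension([0.1, 0.2, 0.3]))               # 3
print(resolve_dimension([0.1, 0.2, 0.3], dimension=3))  # 3
```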
embedding = Embedding(values=[0.1, 0.2, 0.3])
print(embedding.values) # [0.1, 0.2, 0.3]
print(len(embedding.values)) # 3
Using Embedding
You can use Embedding in your models, datasets, workflows, and routines.
As output
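from mixtrain import MixModel, Embedding

class TextEmbedder(MixModel):
    def run(self, inputs=None):
        # self._embed is not defined here; it stands in for your own
        # embedding logic and should return a list[float].
        vector = self._embed(inputs["text"])
        return {
            "embedding": Embedding(
                values=vector,
                model="my-embedder-v1",
            )
        }
In datasets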
Use Embedding as a dataset column type when a column contains vectors:
from mixtrain import Dataset, Embedding
dataset = Dataset.from_file("data.parquet")
dataset.save(
    "search-data",
    column_types={
        "text_embedding": Embedding,
    },
)
From model result
result = model.run({"text": "Hello world"})
embedding = result.embedding
print(f"Dimension: {embedding.dimension}")
print(f"First 3 values: {embedding.values[:3]}")
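Once you have values, a common semantic-search step is to compare two vectors. The following is a minimal sketch using only the standard library, with plain lists standing in for embedding.values. cosine_similarity is a hypothetical helper, not a mixtrain function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Plain lists stand in for embedding.values from two model results.
query = [1.0, 0.0, 0.0]
doc_same_direction = [2.0, 0.0, 0.0]
doc_orthogonal = [0.0, 3.0, 0.0]
print(cosine_similarity(query, doc_same_direction))  # 1.0
print(cosine_similarity(query, doc_orthogonal))      # 0.0
```

Cosine similarity ignores vector length, so it is a reasonable default when the source model does not guarantee normalized output.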