from mixtrain import Tensor

Overview

Tensor represents an N-dimensional numeric array. Use it for ML features, robotics, and multimodal data, such as action sequences of shape (timesteps, action_dim), joint or proprioceptive state, depth maps, and point clouds. Values can be supplied as numpy arrays or nested lists. For 1-D vectors used for embeddings or similarity search, consider Embedding instead.

Constructor

Tensor(
    values,
    *,
    shape: list[int | None] | None = None,
    dtype: str | None = None,
    dim_names: list[str] | None = None,
    storage: "auto" | "inline" | "external" = "auto",
)
Parameter   Type                                      Description
values      numpy.ndarray or nested list of numbers   The array data
shape       list[int | None] | None                   Optional shape contract; None is a wildcard dimension
dtype       str | None                                Optional dtype name such as "float32" (auto-derived if omitted)
dim_names   list[str] | None                          Optional name for each axis, e.g. ["time", "joints"]
storage     "auto" | "inline" | "external"            Column/output storage policy; defaults to automatic routing

import numpy as np

action = Tensor(
    np.zeros((10, 7), dtype="float32"),
    dim_names=["time", "joints"],
)

See from_numpy() for a convenience constructor that captures shape and dtype from the array.
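
The shape parameter is described as a contract in which None acts as a wildcard dimension. A minimal numpy-only sketch of that matching rule follows. matches_shape is a hypothetical helper written for illustration; it is not part of mixtrain, and the library's actual validation may differ.

```python
from typing import Optional, Sequence

import numpy as np


def matches_shape(array: np.ndarray, contract: Sequence[Optional[int]]) -> bool:
    """Return True if array.shape satisfies contract; None matches any size.

    Hypothetical helper for illustration only, not a mixtrain API.
    """
    if array.ndim != len(contract):
        return False
    return all(want is None or want == got
               for want, got in zip(contract, array.shape))


action = np.zeros((10, 7), dtype="float32")
print(matches_shape(action, [None, 7]))   # True: any number of timesteps, 7 joints
print(matches_shape(action, [None, 6]))   # False: last axis is 7, not 6
print(matches_shape(action, [10]))        # False: rank mismatch
```

A contract such as [None, 7] would fit episodes of varying length, since only the joint dimension is pinned.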

storage="inline" requires a uniform shape and an Arrow-supported dtype. storage="external" always writes tensors to the external storage path. storage="auto" selects automatically between inline and external storage.
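
To make the inline requirement concrete, the sketch below checks whether a batch of arrays has a uniform shape and a single numeric dtype. is_uniform_batch is a hypothetical helper, not a mixtrain API. The set of dtypes that mixtrain treats as Arrow-supported is not stated here, so the check assumes plain boolean, integer, and floating-point dtypes. It illustrates only the stated precondition for inline storage, not how "auto" routes.

```python
import numpy as np


def is_uniform_batch(arrays) -> bool:
    """True if all arrays share one shape and one bool/int/uint/float dtype.

    Hypothetical helper, not a mixtrain API. The dtype test is an assumption
    standing in for "Arrow-supported".
    """
    if not arrays:
        return True
    first = arrays[0]
    if first.dtype.kind not in "biuf":  # bool, signed int, unsigned int, float
        return False
    return all(a.shape == first.shape and a.dtype == first.dtype for a in arrays)


fixed = [np.zeros((10, 7), dtype="float32") for _ in range(3)]
ragged = [np.zeros((10, 7), dtype="float32"), np.zeros((12, 7), dtype="float32")]

print(is_uniform_batch(fixed))    # True
print(is_uniform_batch(ragged))   # False: episode lengths differ
```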

Properties

Property    Type                   Description
values      array or nested list   The array data
shape       list[int] | None       Shape of the array
dtype       str | None             Element dtype name
dim_names   list[str] | None       Axis names

Methods

from_numpy()

Create a Tensor from a numpy array, capturing its shape and dtype.

Tensor.from_numpy(array: np.ndarray, *, dim_names: list[str] | None = None) -> Tensor
import numpy as np

action = Tensor.from_numpy(np.zeros((10, 7), dtype="float32"), dim_names=["time", "joints"])

to_numpy()

Return the values as a numpy array, applying the dtype/shape hints.

tensor.to_numpy() -> np.ndarray
t = Tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
arr = t.to_numpy()
print(arr.shape)   # (2, 3)
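
In plain numpy terms, applying a dtype hint and a shape hint to nested-list values amounts to a cast followed by a reshape. The sketch below illustrates that general idea under the assumption of a "float32" dtype hint and a (2, 3) shape hint; it is not mixtrain's implementation.

```python
import numpy as np

values = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# Cast to the hinted dtype, then reshape to the hinted shape.
arr = np.asarray(values, dtype="float32").reshape((2, 3))

print(arr.shape)   # (2, 3)
print(arr.dtype)   # float32
```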

Using Tensor

You can use Tensor in your models, datasets, workflows, and routines.

As input

A Tensor parameter is reconstructed for you from a numpy array, a nested list, or a dict:

from mixtrain import MixModel, Tensor

class ActionConsumer(MixModel):
    def run(self, action: Tensor):
        arr = action.to_numpy()   # numpy array, original shape and dtype
        ...

As output

from mixtrain import MixModel, Tensor

class PolicyModel(MixModel):
    def run(self, inputs=None):
        actions = self._predict(inputs["observation"])  # numpy array (T, 7)
        return {
            "action": Tensor.from_numpy(actions, dim_names=["time", "joints"])
        }

In datasets

Use Tensor as a dataset column type when a column contains N-dimensional arrays. Within a column, tensors should share the same shape and dtype:

from mixtrain import Dataset, Tensor

dataset = Dataset.from_file("episodes.parquet")
dataset.save(
    "robot-actions",
    column_types={
        "action": Tensor
    }
)

To attach named dimensions during in-memory ingestion, put Tensor values in the column. from_dict(), from_pylist(), and from_pandas() preserve the dtype and dimension names:

dataset = Dataset.from_dict({
    "action": [
        Tensor(actions[0], dim_names=["time", "joints"], dtype="float32"),
        Tensor(actions[1], dim_names=["time", "joints"], dtype="float32"),
    ]
})

For an existing nested-list column, provide a descriptor with with_column_types():

dataset = dataset.with_column_types({
    "action": Tensor([], dim_names=["time", "joints"], dtype="float32")
})

When you read a tensor column back with to_tensors(), each row is restored to its full N-dimensional shape:

tensors = Dataset("robot-actions").to_tensors()
print(tensors["action"].shape)   # (num_rows, T, 7)
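
The (num_rows, T, 7) result is the shape that stacking equally shaped rows produces, which is also why tensors within a column should share a shape. A numpy-only sketch of that stacking, independent of mixtrain and using made-up row data:

```python
import numpy as np

# Three rows, each an action sequence of shape (T, 7) with T = 10.
rows = [np.zeros((10, 7), dtype="float32") for _ in range(3)]

batch = np.stack(rows)          # adds a leading row axis
print(batch.shape)              # (3, 10, 7)

# Rows with different shapes cannot be stacked into one array.
ragged = rows + [np.zeros((12, 7), dtype="float32")]
try:
    np.stack(ragged)
except ValueError as exc:
    print("cannot stack:", exc)
```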

From model result

result = model.run({"observation": obs})

action = result.tensor
print(f"Shape: {action.shape}")
print(f"Dtype: {action.dtype}")
