O-Voxel Explained: How TRELLIS 2 Handles Hollow Structures & Thin Walls

May 6, 2026


What Is O-Voxel?

O-Voxel stands for Occupancy Voxel — a sparse octree-based 3D representation that TRELLIS 2 uses internally to encode geometry. Think of it as a 3D tree structure where each branch only exists where there's actual matter, skipping empty space entirely.

Unlike a traditional voxel grid (a uniform 3D grid of cubes), an octree subdivides space recursively: a parent voxel splits into 8 children only when needed. A hollow pipe only occupies voxels along its surface — the inside air takes zero storage. This is what makes O-Voxel fundamentally different from most other image-to-3D approaches.
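To make the recursive subdivision concrete, here is a minimal sketch of a sparse octree built over a hollow sphere. It is an illustration only, not TRELLIS 2's actual data structure: the names `Node`, `build_octree`, and `hollow_shell` are hypothetical, and the root size is assumed to be a power of two.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One octree cell: a cube that exists only if it contains occupied voxels."""
    origin: tuple          # minimum corner (x, y, z), in finest-level voxel units
    size: int              # edge length, in finest-level voxel units
    children: list = field(default_factory=list)


def build_octree(voxels, origin=(0, 0, 0), size=32):
    """Build a sparse octree; return None for any cube with no occupied voxel."""
    ox, oy, oz = origin
    inside = {
        (x, y, z) for (x, y, z) in voxels
        if ox <= x < ox + size and oy <= y < oy + size and oz <= z < oz + size
    }
    if not inside:
        return None                      # empty space: nothing is stored
    node = Node(origin, size)
    if size > 1:                         # split into 8 octants, keep non-empty ones
        half = size // 2
        for dx in (0, half):
            for dy in (0, half):
                for dz in (0, half):
                    child = build_octree(inside, (ox + dx, oy + dy, oz + dz), half)
                    if child is not None:
                        node.children.append(child)
    return node


def count_nodes(node):
    """Total number of stored nodes, internal and leaf."""
    return 1 + sum(count_nodes(c) for c in node.children)


def count_leaves(node):
    """Number of finest-level (size 1) cells that are stored."""
    if not node.children:
        return 1
    return sum(count_leaves(c) for c in node.children)


def hollow_shell(n, radius, thickness=1.0):
    """Voxels of an n^3 grid lying within thickness/2 of a sphere surface."""
    c = (n - 1) / 2
    return {
        (x, y, z)
        for x in range(n) for y in range(n) for z in range(n)
        if abs(math.dist((x, y, z), (c, c, c)) - radius) <= thickness / 2
    }


if __name__ == "__main__":
    shell = hollow_shell(32, 12.0)
    root = build_octree(shell, size=32)
    print("dense grid cells:", 32 ** 3)
    print("octree nodes:    ", count_nodes(root))
```

The interior of the sphere never produces a node, so the stored node count tracks the surface area rather than the enclosed volume.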

The "occupancy" part is critical: each voxel cell stores a probability of being filled or empty, not a binary on/off. This lets TRELLIS 2 model soft transitions, thin walls, and fuzzy surfaces with a single representation.


How O-Voxel Differs from Standard Approaches

Standard Mesh (Polygon Soup)

Traditional 3D models are collections of triangles (a mesh). Triangles are great for smooth surfaces but struggle with:

  • Hollow interiors — a mesh defines a surface, not a volume. The inside is implicit.
  • Complex topology — bad topology (crossing edges, inverted normals) breaks physics simulations and baking.
  • Thin structures — sub-millimeter features can cause rendering artifacts.

Standard Voxel Grids

Fixed-resolution voxel grids solve the topology problem but create new ones:

  • Memory explosion — a 512³ grid is 134 million voxels. A 1024³ grid is 1 billion.
  • No hollow savings — even empty regions consume storage.
  • Blobby edges — staircase artifacts at resolution boundaries.
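The memory figures above follow directly from cubing the resolution. The short sketch below reproduces the arithmetic; the bytes-per-voxel values are assumptions chosen for illustration (1 byte for a binary flag, 4 bytes for a float32 occupancy value).

```python
def dense_voxel_count(resolution: int) -> int:
    """Number of cells in a dense resolution^3 grid."""
    return resolution ** 3


def dense_memory_gib(resolution: int, bytes_per_voxel: int = 4) -> float:
    """Memory for a dense grid in GiB, assuming a fixed payload per voxel."""
    return dense_voxel_count(resolution) * bytes_per_voxel / 2 ** 30


if __name__ == "__main__":
    for res in (256, 512, 1024):
        print(
            f"{res}^3: {dense_voxel_count(res):,} voxels, "
            f"{dense_memory_gib(res, 1):.3f} GiB at 1 byte, "
            f"{dense_memory_gib(res, 4):.3f} GiB at 4 bytes"
        )
```

Doubling the resolution multiplies the cost by eight, whether or not the extra cells contain any geometry.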

O-Voxel's Advantage

O-Voxel combines the best of both worlds:

Property | Mesh | Voxel Grid | O-Voxel
Hollow structures
No topology issues
Sparse storage
Thin wall support⚠️
Adaptive resolution

The sparse octree means TRELLIS 2 only stores what matters. A lattice structure that would need a billion-voxel dense grid can be represented with a small fraction of that many O-Voxel nodes, because only cells near the surface are stored.


Why Non-Manifold Geometry Matters

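"Non-manifold" geometry is the formal term for shapes that break the rules a clean, watertight mesh is expected to follow, such as an edge shared by more than two faces or two surfaces that touch at a single vertex. Handling it well is exactly what makes TRELLIS 2 powerful.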

Real-World Non-Manifold Structures

These are structures that existing tools handle poorly:

Gothic Window Grating: Thin iron bars, open lattice, complex intersections. TripoSR generates solid blocks. Instant3D gets the outline but mashes interior details. TRELLIS 2: every bar, every rivet, crystal clear, and truly hollow.

Mesh Fabric / Chainmail: Interlocked rings where material and air interweave at the same scale. A mesh representation either smooths this into a solid or creates impossible geometry. O-Voxel naturally models both the metal rings and the air between them.

Thin-Wall Containers: A ceramic mug, a drinking glass, a helmet. These have walls that are thin relative to overall size, too thin for conventional approaches to resolve correctly. O-Voxel can represent the wall thickness as a few voxels at high resolution, without filling the entire interior.

Industrial Lattice Supports: 3D-printed brackets, aerospace supports, architectural meshes. These intentionally non-manifold structures are where TRELLIS 2's advantage is most dramatic.

Why Competitors Fail Here

Most image-to-3D models use a signed distance field (SDF) or occupancy field at a fixed resolution. These representations naturally smooth away thin features — they're designed for solid objects with smooth surfaces. O-Voxel's sparse octree can zoom into regions of interest at higher resolution without paying the memory cost everywhere.
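A one-dimensional toy example shows why a fixed resolution loses thin features. The sketch below assumes a simplified model in which a cell counts as occupied only if its center falls inside the wall; real SDF and occupancy networks behave more smoothly, but the resolution limit is the same in spirit. The function names and the wall dimensions are hypothetical.

```python
import math


def occupied_cells(wall_start: float, wall_thickness: float, resolution: int) -> int:
    """Count grid cells on [0, 1] whose center lies inside the wall."""
    wall_end = wall_start + wall_thickness
    count = 0
    for i in range(resolution):
        center = (i + 0.5) / resolution
        if wall_start <= center <= wall_end:
            count += 1
    return count


def min_safe_resolution(wall_thickness: float) -> int:
    """Smallest cell count whose spacing does not exceed the wall thickness.

    At this spacing, a wall away from the domain edges always contains
    at least one cell center, wherever it sits relative to the grid.
    """
    return math.ceil(1.0 / wall_thickness)


if __name__ == "__main__":
    # A wall 0.4% of the object's width, placed just past the midpoint.
    start, thickness = 0.5012, 0.004
    for res in (32, 64, 128, 256, 1024):
        print(f"resolution {res:4d}: {occupied_cells(start, thickness, res)} occupied cells")
    print("resolution needed to guarantee a hit:", min_safe_resolution(thickness))
```

On a dense grid, that higher resolution has to be paid for everywhere. An octree only needs it in the cells that the wall passes through.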


O-Voxel vs. Competitors

TripoSR

TripoSR uses a triplane representation + NeRF-style rendering. Its strength is speed (under 0.5 seconds on A100), but the triplane approach fundamentally can't represent thin structures — it's optimized for solid, compact objects. A chain-link fence or window grate? It'll give you a solid block or a mush of blended shapes.

Instant3D

Instant3D uses a hybrid approach with a NeRF backbone. Better at complex shapes than TripoSR, but still struggles with high-aspect-ratio features (very thin or very tall). The underlying representation smooths away fine lattice details.

TRELLIS 2 (O-Voxel)

O-Voxel was explicitly designed to handle non-manifold geometry. The octree representation can allocate high resolution wherever needed — thin walls get fine detail without blowing up the overall memory budget. When you need to generate a Gothic cathedral window, O-Voxel allocates enough depth to represent each bar independently.

Practical test: Same input image of decorative ironwork. TripoSR → solid block with vague outline. Instant3D → gets the silhouette, loses internal structure. TRELLIS 2 → every spindle and joint visible, truly hollow.


How It Works in Production

When you input an image to TRELLIS 2:

  1. Feature extraction — a vision encoder (based on DINOv2) extracts multi-scale features from your image.

  2. O-Voxel prediction — the model predicts O-Voxel occupancy at multiple octree levels, refining high-resolution details in regions of interest.

  3. Mesh extraction — a variant of Marching Cubes extracts a clean triangle mesh from the O-Voxel field.

  4. PBR material prediction — a separate decoder predicts base color, metallic, and roughness maps aligned to the extracted mesh.

The result is a game-ready mesh with physically-based materials — no baking artifacts, no topology nightmares, no manual cleanup required.
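To give a feel for step 3, the sketch below turns a sparse set of occupied voxels into surface faces. It deliberately uses boundary-face extraction (emit a quad wherever an occupied voxel borders an empty one), which is a much simpler stand-in for Marching Cubes and not the method TRELLIS 2 uses. The 0.5 cutoff and all function names are assumptions for illustration.

```python
# Offsets to the six face-adjacent neighbors; each doubles as the face normal.
FACE_NORMALS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def threshold(probabilities: dict, cutoff: float = 0.5) -> set:
    """Keep the cells whose occupancy probability reaches the cutoff."""
    return {cell for cell, p in probabilities.items() if p >= cutoff}


def boundary_faces(occupied: set) -> list:
    """Return (voxel, normal) for every voxel face that borders empty space."""
    faces = []
    for (x, y, z) in occupied:
        for (dx, dy, dz) in FACE_NORMALS:
            if (x + dx, y + dy, z + dz) not in occupied:
                faces.append(((x, y, z), (dx, dy, dz)))
    return faces


def hollow_box(n: int) -> set:
    """Voxels of an n x n x n box whose wall is one voxel thick."""
    return {
        (x, y, z)
        for x in range(n) for y in range(n) for z in range(n)
        if min(x, y, z) == 0 or max(x, y, z) == n - 1
    }


if __name__ == "__main__":
    box = hollow_box(5)
    faces = boundary_faces(box)
    # Each quad face splits into two triangles in the final mesh.
    print("voxels:", len(box), "faces:", len(faces), "triangles:", 2 * len(faces))
```

Because the cavity inside the box is empty, the extraction produces an inner surface as well as an outer one, so the wall keeps its thickness instead of collapsing into a solid block.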

Export Path

Once TRELLIS 2 generates your model, you can export directly to glTF/FBX and drop it into Unity, Unreal Engine, or Godot. The PBR maps import cleanly — no re-authoring needed.
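For readers curious what the engine actually receives, here is a minimal sketch of a glTF 2.0 file holding one triangle and a metallic-roughness material. This is not TRELLIS 2's exporter: the function `minimal_gltf` is hypothetical, and it uses scalar material factors where a real export would reference texture maps. The field names and numeric codes come from the glTF 2.0 specification.

```python
import base64
import json
import struct


def minimal_gltf(positions, indices, base_color=(0.8, 0.8, 0.8, 1.0),
                 metallic=0.0, roughness=0.5) -> str:
    """Return a self-contained .gltf document (JSON text) for one mesh."""
    pos_bytes = b"".join(struct.pack("<3f", *p) for p in positions)   # float32 xyz
    idx_bytes = struct.pack(f"<{len(indices)}H", *indices)            # uint16
    blob = pos_bytes + idx_bytes
    uri = "data:application/octet-stream;base64," + base64.b64encode(blob).decode("ascii")

    doc = {
        "asset": {"version": "2.0"},
        "scene": 0,
        "scenes": [{"nodes": [0]}],
        "nodes": [{"mesh": 0}],
        "meshes": [{"primitives": [
            {"attributes": {"POSITION": 0}, "indices": 1, "material": 0}
        ]}],
        "materials": [{"pbrMetallicRoughness": {
            "baseColorFactor": list(base_color),
            "metallicFactor": metallic,
            "roughnessFactor": roughness,
        }}],
        "buffers": [{"uri": uri, "byteLength": len(blob)}],
        "bufferViews": [
            # 34962 = ARRAY_BUFFER (vertex data), 34963 = ELEMENT_ARRAY_BUFFER (indices)
            {"buffer": 0, "byteOffset": 0, "byteLength": len(pos_bytes), "target": 34962},
            {"buffer": 0, "byteOffset": len(pos_bytes), "byteLength": len(idx_bytes), "target": 34963},
        ],
        "accessors": [
            # 5126 = FLOAT; POSITION accessors must declare min and max
            {"bufferView": 0, "componentType": 5126, "count": len(positions), "type": "VEC3",
             "min": [min(axis) for axis in zip(*positions)],
             "max": [max(axis) for axis in zip(*positions)]},
            # 5123 = UNSIGNED_SHORT
            {"bufferView": 1, "componentType": 5123, "count": len(indices), "type": "SCALAR"},
        ],
    }
    return json.dumps(doc, indent=2)


if __name__ == "__main__":
    triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    with open("triangle.gltf", "w", encoding="utf-8") as f:
        f.write(minimal_gltf(triangle, [0, 1, 2], metallic=1.0, roughness=0.3))
```

The base color, metallic, and roughness values live in a single `pbrMetallicRoughness` block, which is why engines that follow the glTF material model can pick them up without re-authoring.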


Frequently Asked Questions

Can O-Voxel represent solid objects too, or only hollow ones?

It handles both equally well. A solid rock and a hollow vase are just different occupancy patterns in the same representation. The advantage is that O-Voxel doesn't force you to commit to solid or hollow; it represents geometry accurately regardless of interior topology.

Does O-Voxel work for organic shapes (characters, creatures)?

Yes, though this isn't TRELLIS 2's primary strength compared to competitors focused on character generation. O-Voxel's sweet spot is hard-surface and architectural geometry. For organic characters, the sparse octree can represent smooth surfaces at appropriate resolution, but the generation quality depends more on training data distribution.

Why is TRELLIS 2 slower than TripoSR if O-Voxel is more efficient?

The slowness isn't from O-Voxel itself — it's from the full pipeline: DINOv2 feature extraction, the generation model inference, and multi-pass refinement. O-Voxel's sparse representation actually makes inference faster per generated voxel than a dense grid, but TRELLIS 2's model architecture does more refinement passes for quality. 17s on H100 for a high-quality 1024³ model is still remarkable.

Can I convert O-Voxel output to other formats?

Yes. The mesh extraction step (Marching Cubes variant) produces a standard triangle mesh you can export as OBJ, glTF, FBX, or USD. The PBR textures export as standard image maps. All standard pipeline tools work with the output.

Does O-Voxel require special hardware to use?

No. The O-Voxel representation is TRELLIS 2's internal format, so you don't interact with it directly. The real hardware requirement is VRAM: the 4B model needs ~24GB (A100 or RTX 3090+), while the 2B model needs 12GB. If your GPU has less, use the HuggingFace Spaces online version or rent a cloud GPU (around $0.50/hr on Vast.ai).


Learn More

Next Step: Ready to put TRELLIS 2 to work? Try the free HuggingFace Spaces version or follow the installation guide to run locally.

trellis2.com
