Asset Harvester: turn road-object photos into simulation-ready 3D assets

01 What it does

Asset Harvester is an image-to-3D workflow from NVIDIA for autonomous-driving and robotics simulation. You give it one image, or a small set of object views, and it builds a complete 3D asset for that object.

The browser demo keeps the workflow concrete. It segments the foreground object, recenters it, estimates camera information, generates a 16-view orbit with a multiview diffusion model, then lifts those views into a 3D Gaussian splat. The output is not just a preview video. You also get a PLY file that can be used as a simulation asset.
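
To make the "16-view orbit" step concrete, here is a minimal sketch of what evenly spaced orbit cameras look like geometrically. It is not Asset Harvester's code: the function name, the radius, the elevation, and the axis convention (object at the origin, +z up) are all assumptions chosen for illustration.

```python
import math

def orbit_camera_positions(num_views=16, radius=2.0, elevation_deg=15.0):
    """Return (x, y, z) camera positions evenly spaced on a circle around the origin.

    Assumptions for illustration only: the object sits at the origin, +z is up,
    and radius / elevation_deg are made-up values, not Asset Harvester settings.
    """
    elevation = math.radians(elevation_deg)
    positions = []
    for i in range(num_views):
        # Evenly spaced azimuths: 360 / 16 = 22.5 degrees between neighbouring views.
        azimuth = 2.0 * math.pi * i / num_views
        x = radius * math.cos(elevation) * math.cos(azimuth)
        y = radius * math.cos(elevation) * math.sin(azimuth)
        z = radius * math.sin(elevation)
        positions.append((x, y, z))
    return positions

cameras = orbit_camera_positions()
print(len(cameras), "camera positions on the orbit")
```

Each position would be paired with a look-at rotation toward the origin. The multiview diffusion model fills in what the object looks like from each of these viewpoints, and the splat-lifting step then fits one 3D representation consistent with all of them.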

That makes the model more specific than a general "make a 3D object" demo. It is built for vehicles, pedestrians, cyclists, strollers, bins, trucks, buses, and similar objects that show up in autonomous-driving logs. If you work with simulation, synthetic data, scene reconstruction, or robotics datasets, the value lies in turning real observations into reusable objects.

02 How to try it

Start with the Hugging Face Space and use one of the built-in examples first. Watch the object segmentation result before judging the final 3D render. If the mask misses part of the object or grabs the wrong region, the 3D result will likely carry that error forward.
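
If you save the segmentation mask and want a quick numeric sanity check alongside the visual one, a sketch like the following can help. It assumes the mask is already loaded as rows of 0/1 values. The function name and the idea of flagging masks that touch the image border (a hint that the object may be cropped) are illustrative choices, not part of the demo.

```python
def mask_report(mask):
    """Summarize a binary mask given as a list of rows of 0/1 values."""
    height = len(mask)
    width = len(mask[0])
    # Collect (row, col) coordinates of every foreground pixel.
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return {"coverage": 0.0, "bbox": None, "touches_border": False}
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    top, bottom, left, right = min(rows), max(rows), min(cols), max(cols)
    return {
        # Fraction of the image covered by the mask.
        "coverage": len(coords) / (height * width),
        # Tight bounding box as (top, left, bottom, right), inclusive.
        "bbox": (top, left, bottom, right),
        # True if the mask reaches any image edge, which may mean a cropped object.
        "touches_border": top == 0 or left == 0
        or bottom == height - 1 or right == width - 1,
    }

example = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
report = mask_report(example)
print(report)
```

A very small coverage value or a mask that touches the border is a reason to look at the segmentation more closely before trusting the 3D result.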

For your own test, upload a clear single-object image that looks similar to a road-scene crop: the full object visible, one main subject, and enough surrounding context for the model to estimate the view. The demo returns an orbit render and a downloadable PLY, so check both the visual rotation and whether the file is useful for your downstream tool.
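
One way to check whether the downloaded file is usable is to read its PLY header, which lists each element and its properties in plain text even when the body is binary. The sketch below uses only the standard PLY header keywords (`format`, `element`, `property`, `end_header`). The function name is hypothetical, and the tiny sample file written here is a stand-in: which properties the Asset Harvester PLY actually contains is not stated in this article, so inspect the real header rather than assuming.

```python
import os
import tempfile

def read_ply_header(path):
    """Parse a PLY header and return (format, element counts, property names per element)."""
    fmt = None
    counts = {}
    props = {}
    current = None
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            parts = raw.decode("ascii", errors="replace").split()
            if not parts:
                continue
            if parts[0] == "end_header":
                break
            if parts[0] == "format":
                fmt = parts[1]
            elif parts[0] == "element":
                current = parts[1]
                counts[current] = int(parts[2])
                props[current] = []
            elif parts[0] == "property" and current is not None:
                # The property name is always the last token, also for list properties.
                props[current].append(parts[-1])
        else:
            raise ValueError("PLY header has no end_header line")
    return fmt, counts, props

# Stand-in file for demonstration; replace sample_path with your downloaded PLY.
sample = (b"ply\nformat ascii 1.0\nelement vertex 2\n"
          b"property float x\nproperty float y\nproperty float z\n"
          b"end_header\n0 0 0\n1 1 1\n")
with tempfile.NamedTemporaryFile(suffix=".ply", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

fmt, counts, props = read_ply_header(sample_path)
os.remove(sample_path)
print(fmt, counts, props)
```

Comparing the listed properties against what your viewer or simulator expects is a quick way to find out whether the file will load before you build anything around it.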

If the demo result is useful, move to the GitHub repo for local deployment. The repo covers setup, checkpoint download, segmentation, camera estimation, inference, and benchmarking. Treat local use as a real CUDA workflow, not a laptop experiment: NVIDIA lists Ampere-or-newer GPUs and heavy memory and storage requirements for the full system.
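
Before cloning anything, you can check the GPU generation from Python. The sketch below assumes PyTorch is how you query the device, and it relies on the fact that Ampere GPUs report CUDA compute capability 8.x, so "Ampere or newer" is treated as a major version of 8 or higher. The helper names are hypothetical, and the check says nothing about memory or storage, which you still need to compare against the repo's own requirements.

```python
AMPERE_MAJOR = 8  # NVIDIA Ampere GPUs report compute capability 8.x

def is_ampere_or_newer(capability):
    """capability is a (major, minor) tuple of CUDA compute capability."""
    major, _minor = capability
    return major >= AMPERE_MAJOR

def local_gpu_capability():
    """Return (major, minor) for GPU 0 via PyTorch, or None if that is not possible."""
    try:
        import torch  # optional dependency; assumed here only for the query
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)

cap = local_gpu_capability()
if cap is None:
    print("No CUDA GPU visible from PyTorch (or PyTorch is not installed).")
elif is_ampere_or_newer(cap):
    print("Compute capability", cap, "meets the Ampere-or-newer requirement.")
else:
    print("Compute capability", cap, "is older than Ampere.")
```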

03 Caveat

Asset Harvester is domain-specific and hardware-heavy. It is trained for autonomous-driving-style objects, so arbitrary internet photos, crowded scenes, severe occlusion, or unusual object categories can produce poor or hallucinated geometry. Do not treat the output as safety-critical ground truth.

04 What you can do with it

Convert road-scene object crops into 3D Gaussian splat assets for simulation.
Build reusable vehicle, cyclist, pedestrian, stroller, bus, truck, or roadside-object assets from real observations.
Test whether a simulation pipeline can ingest objects derived from driving logs instead of hand-made assets.
Compare the generated 3D orbit against the original image before using it in a synthetic-data workflow.
Use the downloadable PLY as a starting point for AV, robotics, or scene-reconstruction experiments.

Neural Expedition · Useful open-source AI, curated without hype.
