
Segment Anything Model 3D (SAM 3D) is a computer vision model from Meta designed to reconstruct 3D information from single 2D images. Building on the Segment Anything ecosystem, SAM 3D predicts dense depth maps and per-pixel surface normals with a focus on generalization across diverse object categories and scenes. It is intended primarily as a research model to advance 3D understanding, rather than a turnkey content creation tool.
Users can leverage SAM 3D to obtain geometry-aware representations useful for downstream tasks such as 3D reconstruction, novel-view synthesis, robotics perception, AR/VR scene understanding, and human or object pose analysis. The project page provides technical details, model capabilities, datasets used, and links to code or weights where available.
Explore 182+ top alternatives to Meta SAM 3D

OpenMMLab is an open-source computer vision platform providing modular libraries, algorithms, and pretrained models for tasks such as classification, detection, segmentation, and video understanding.

Gemini Robotics is a suite of Gemini-based models and tools for enabling robots to understand instructions, perceive environments, and perform complex real-world manipulation tasks.

Nvidia Omniverse is a platform for building, simulating, and connecting physically accurate, real-time 3D applications and collaborative virtual worlds using USD-based workflows.

CORLEO Kawasaki is a digital guide that presents Kawasaki Heavy Industries' technologies, exhibits, and initiatives for Expo 2025 Osaka, Kansai in an interactive online format.

NVIDIA Cosmos is a multimodal AI platform that unifies and orchestrates specialized models to understand, simulate, and reason about complex real-world environments and dynamics.

World Labs provides spatial intelligence models that perceive, generate, and interact with 3D environments for applications such as robotics, simulation, mapping, and virtual reality.

CoTracker3 is a computer vision model that jointly tracks multiple points across video frames with long-range temporal consistency and dense, pixel-precise motion estimation.

Trellis 3D is a neural rendering framework that synthesizes detailed 3D scenes from sparse, casually captured mobile phone videos using distillation-based view generation.