# segmentation

> SAM2 (Segment Anything Model 2) for dora-rs. Use when user needs instance segmentation, mask generation, or object segmentation from bounding boxes or points.

- Author: Haixuan Xavier Tao
- Repository: dora-rs/dora-skills
- Version: 20260116112205
- Stars: 3
- Forks: 0
- Last Updated: 2026-02-06
- Source: https://github.com/dora-rs/dora-skills
- Web: https://mule.run/skillshub/@@dora-rs/dora-skills~segmentation:20260116112205

---

---
name: segmentation
description: SAM2 (Segment Anything Model 2) for dora-rs. Use when user needs instance segmentation, mask generation, or object segmentation from bounding boxes or points.
---

# Segmentation with SAM2

Generate precise segmentation masks using SAM2 (Segment Anything Model 2).

## Node Configuration

```yaml
- id: sam2
  build: pip install dora-sam2
  path: dora-sam2
  inputs:
    image: camera/image
    bbox: detector/bbox  # Bounding box prompts
  outputs:
    - mask
  env:
    MODEL: sam2-hiera-small  # Model variant
    DEVICE: cuda  # cuda, mps, or cpu
```

## Model Options

| Model | Size | Speed | Quality |
|-------|------|-------|---------|
| `sam2-hiera-tiny` | Tiny | Fastest | Good |
| `sam2-hiera-small` | Small | Fast | Better |
| `sam2-hiera-base` | Base | Balanced | High |
| `sam2-hiera-large` | Large | Slower | Best |

## Input Prompts

### From Bounding Boxes

```yaml
inputs:
  image: camera/image
  bbox: yolo/bbox  # YOLO detection boxes
```

### From Points

```yaml
inputs:
  image: camera/image
  points: click/points  # User-selected points
```

## Output Format

Segmentation masks are sent as Arrow arrays, with the image dimensions carried in the metadata as strings:

```python
# Mask metadata
metadata = {
    "width": "640",
    "height": "480",
    "encoding": "mask"  # Binary mask
}

# Mask data: flattened boolean array (H * W)
# 1 = object, 0 = background
```

## Complete Pipeline: Detection + Segmentation

```yaml
nodes:
  # Camera
  - id: camera
    build: pip install opencv-video-capture
    path: opencv-video-capture
    inputs:
      tick: dora/timer/millis/33
    outputs:
      - image
    env:
      CAPTURE_PATH: "0"

  # Object detection
  - id: yolo
    build: pip install dora-yolo
    path: dora-yolo
    inputs:
      image: camera/image
    outputs:
      - bbox

  # Segmentation from detections
  - id: sam2
    build: pip install dora-sam2
    path: dora-sam2
    inputs:
      image: camera/image
      bbox: yolo/bbox
    outputs:
      - mask

  # Visualization
  - id: viz
    build: pip install dora-rerun
    path: dora-rerun
    inputs:
      image: camera/image
      mask: sam2/mask
```

## Processing Masks in Custom Node

```python
from dora import Node
import numpy as np

node = Node()

for event in node:
    if event["type"] == "INPUT":
        if event["id"] == "mask":
            # Get mask dimensions (metadata values are strings)
            metadata = event["metadata"]
            width = int(metadata["width"])
            height = int(metadata["height"])

            # Reshape to 2D. Boolean Arrow arrays are bit-packed, so the
            # conversion to NumPy needs a copy (zero_copy_only=False).
            mask_flat = event["value"].to_numpy(zero_copy_only=False)
            mask = mask_flat.reshape((height, width))

            # Calculate object area as a fraction of the frame
            object_area = np.sum(mask)
            total_area = width * height
            coverage = object_area / total_area

            print(f"Object covers {coverage:.1%} of frame")

            # Find object center (centroid of the mask pixels)
            if object_area > 0:
                y_coords, x_coords = np.where(mask > 0)
                center_x = np.mean(x_coords)
                center_y = np.mean(y_coords)
                print(f"Object center: ({center_x:.0f}, {center_y:.0f})")
```

## Interactive Segmentation

Create point prompts from user input. The dataflow below assumes the `camera` node from the complete pipeline above; the click handler subscribes to `camera/image` because it reacts to `image` events.

```yaml
nodes:
  # Click handler node
  - id: click
    path: ./click_handler.py
    inputs:
      tick: dora/timer/millis/100
      image: camera/image  # Needed: the handler reacts to image events
    outputs:
      - points

  # Segment from clicks
  - id: sam2
    build: pip install dora-sam2
    path: dora-sam2
    inputs:
      image: camera/image
      points: click/points
    outputs:
      - mask
```

**click_handler.py:**

```python
import cv2
import pyarrow as pa
from dora import Node

# Clicks collected since the last send, as (x, y, label) tuples
clicked_points = []

def mouse_callback(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        clicked_points.append((x, y, 1))  # 1 = foreground
    elif event == cv2.EVENT_RBUTTONDOWN:
        clicked_points.append((x, y, 0))  # 0 = background

node = Node()
cv2.namedWindow("Click to segment")
cv2.setMouseCallback("Click to segment", mouse_callback)

for event in node:
    if event["type"] == "INPUT" and event["id"] == "image":
        # Display image
        # ...

        # Send any pending clicks, then reset the buffer
        if clicked_points:
            points = pa.array(clicked_points)
            node.send_output("points", points)
            clicked_points.clear()
```

## Multi-Object Segmentation

Each bounding box prompt produces its own mask, so detections and masks can be processed pairwise:

```python
# Each bbox produces a separate mask.
# `detections` and `masks` are placeholders for the detector and SAM2 outputs.
for i, detection in enumerate(detections):
    mask = masks[i]
    # Process each object mask
```

## Performance Tips

### GPU Memory

```yaml
env:
  # Use smaller model for limited GPU memory
  MODEL: sam2-hiera-tiny

  # Or use CPU
  DEVICE: cpu
```

### Batch Processing

Setting `queue_size: 1` keeps only the latest frame in the input queue, so the node skips stale frames instead of falling behind the camera.

```yaml
inputs:
  image:
    source: camera/image
    queue_size: 1  # Latest frame only
```

## Applications

### Object Removal

```python
import cv2
import numpy as np

# Create inpainting mask from segmentation (8-bit, 255 = region to fill).
# `image` is the source frame and `mask` the 2D mask for the object.
inpaint_mask = (mask * 255).astype(np.uint8)
result = cv2.inpaint(image, inpaint_mask, 3, cv2.INPAINT_TELEA)
```

### Object Extraction

```python
# Extract object from background
object_only = image.copy()
object_only[mask == 0] = 0  # Black background
```

### Collision Detection

```python
import numpy as np

# Check if masks overlap (both masks must have the same shape)
overlap = np.logical_and(mask1, mask2)
if np.any(overlap):
    print("Objects are colliding!")
```

## Related Skills

- `object-detection` - YOLO for bounding boxes
- `tracking` - Track segmented objects
- `ml-vision` - Vision pipeline overview
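
## Worked Example: Mask Utilities

The mask handling described under Output Format and Collision Detection can be tried without a running dataflow. The sketch below uses only NumPy and small synthetic masks. The helper names `decode_mask`, `mask_bbox`, and `mask_iou` are hypothetical and not part of `dora-sam2`; the metadata dictionary simply mirrors the format shown earlier, with dimensions as strings.

```python
import numpy as np


def decode_mask(flat, width, height):
    """Reshape a flattened mask of length H * W into a 2D boolean array."""
    flat = np.asarray(flat)
    if flat.size != width * height:
        raise ValueError(f"expected {width * height} values, got {flat.size}")
    return flat.reshape((height, width)).astype(bool)


def mask_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of the mask, or None if it is empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def mask_iou(mask_a, mask_b):
    """Intersection over union of two masks with the same shape."""
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(mask_a, mask_b).sum() / union)


# Synthetic 4x3 frame; metadata values are strings, as in the output format.
metadata = {"width": "4", "height": "3", "encoding": "mask"}
width, height = int(metadata["width"]), int(metadata["height"])

# Two flattened masks: 2x2 squares that share exactly one pixel.
flat_a = np.zeros(width * height, dtype=np.uint8)
flat_a[[0, 1, 4, 5]] = 1
flat_b = np.zeros(width * height, dtype=np.uint8)
flat_b[[5, 6, 9, 10]] = 1

mask_a = decode_mask(flat_a, width, height)
mask_b = decode_mask(flat_b, width, height)

print("bbox a:", mask_bbox(mask_a))
print("bbox b:", mask_bbox(mask_b))
print("overlap:", bool(np.any(np.logical_and(mask_a, mask_b))))
print(f"IoU: {mask_iou(mask_a, mask_b):.3f}")
```

Intersection over union is one possible way to turn the boolean overlap check into a score, which may help when a single shared pixel should not count as a collision; the threshold to use is application-specific.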