⚡️Google DeepMind launched Gemini Robotics-ER 1.6, a model that helps robots understand space, tools, and task progress with fewer human checks. Because vision alone is insufficient, the model uses stepwise reasoning, pointing, counting, multi-view success detection, and spatial constraints to convert camera feeds into decisions. Its standout feature is instrument reading: agentic vision zooms in, marks indicators, runs code to compute proportions, and reads gauges with 93% accuracy. Demos with Boston Dynamics Spot show the system checking doors, reading dials, correcting distortion, and refusing risky actions such as handling liquids or heavy items. DeepMind also offers an API for integrating this reasoning into industrial tasks such as pallet counting, inspection, and puddle detection. Source. @aipost🏴
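
To make the "runs code for proportions" step concrete, here is a minimal sketch of how a needle position could be turned into a gauge reading. It is not DeepMind's code: the function name `gauge_value`, its parameters, and the assumption of a linear dial scale are all hypothetical and chosen only for illustration.

```python
def gauge_value(needle_deg, min_deg, max_deg, min_val, max_val):
    """Map a needle angle to a reading, assuming a linear dial scale.

    All names here are illustrative assumptions, not part of any real API.
    """
    span = max_deg - min_deg
    if span == 0:
        raise ValueError("min_deg and max_deg must differ")
    # Fraction of the dial sweep covered by the needle.
    fraction = (needle_deg - min_deg) / span
    # Clamp so a needle slightly past an end stop still gives a valid reading.
    fraction = max(0.0, min(1.0, fraction))
    return min_val + fraction * (max_val - min_val)


# A needle halfway along a 270-degree sweep on a 0-10 scale.
reading = gauge_value(135, 0, 270, 0, 10)
```

In a pipeline like the one described, the angles would presumably come from the marked indicator positions in the zoomed image, with the arithmetic delegated to code rather than estimated visually.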