Agent skill

alfworld-environment-scanner

Performs an initial scan of the Alfworld environment to identify all visible objects and receptacles. Processes raw observation text into a structured list of entities to build a mental map for planning.

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/alfworld-environment-scanner

SKILL.md

Instructions

Primary Objective

Execute an initial environmental scan at the start of any Alfworld task. Your goal is to systematically identify and catalog all objects and receptacles mentioned in the initial observation text.

Core Workflow

  1. Trigger: Run this skill as soon as the simulator returns the initial environment description (e.g., "You are in the middle of a room. Looking quickly around you, you see...").
  2. Parse & Extract: Process the raw observation text. Extract every noun phrase that represents a physical entity (object or receptacle). Note the naming convention (e.g., "armchair 2", "diningtable 1").
  3. Categorize: Mentally categorize each entity. A receptacle is a surface or container that can hold other objects (e.g., sofa, sidetable, diningtable, dresser). An object is an item that can be manipulated (e.g., laptop, creditcard, pillow). Some entities (like ottoman) can be both depending on context.
  4. Output Structured Mental Map: Formulate a clear, concise internal summary. Do not output this summary to the simulator—it is for your planning use only.
    • Format: Scan Complete. Receptacles: [list]. Objects: [list].
    • Example: For the trajectory below, the mental map is: Scan Complete. Receptacles: [armchair 2, armchair 1, diningtable 1, drawer 4, drawer 3, drawer 2, drawer 1, dresser 1, garbagecan 1, ottoman 1, sidetable 3, sidetable 2, sidetable 1, sofa 1]. Objects: []. (Note: The initial scan lists only the visible receptacles; objects on or in them are discovered upon interaction.)
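
The parse, categorize, and summarize steps above can be sketched in Python. This is a minimal illustration, not part of the skill itself: the `RECEPTACLE_TYPES` set, the regular expression, and the function names `scan` and `format_map` are assumptions made for this example, and the sample observation is a shortened, hypothetical one.

```python
import re

# Assumed (hypothetical) receptacle type names, taken from the examples in
# this skill; a real task may need a longer list.
RECEPTACLE_TYPES = {
    "armchair", "diningtable", "drawer", "dresser",
    "garbagecan", "ottoman", "sidetable", "sofa",
}

# Matches entity names such as "armchair 2" that follow an article.
ENTITY_RE = re.compile(r"\b(?:a|an)\s+([a-z]+ \d+)")


def scan(observation):
    """Split the entities named in an observation into receptacles and objects."""
    receptacles, objects = [], []
    for name in ENTITY_RE.findall(observation.lower()):
        kind = name.split()[0]
        bucket = receptacles if kind in RECEPTACLE_TYPES else objects
        if name not in bucket:  # keep first-seen order, drop duplicates
            bucket.append(name)
    return {"receptacles": receptacles, "objects": objects}


def format_map(mental_map):
    """Render the internal summary in the 'Scan Complete' format."""
    return "Scan Complete. Receptacles: [{}]. Objects: [{}].".format(
        ", ".join(mental_map["receptacles"]),
        ", ".join(mental_map["objects"]),
    )


if __name__ == "__main__":
    # Shortened, hypothetical observation for illustration only.
    obs = ("You are in the middle of a room. Looking quickly around you, "
           "you see a armchair 1, a diningtable 1, and a sofa 1.")
    print(format_map(scan(obs)))
```

Anything not in the assumed receptacle set falls into the objects list, so ambiguous entities such as ottoman are decided only by whether they appear in `RECEPTACLE_TYPES`.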

Execution Rules

  • Single Action: This skill culminates in a single go to <receptacle> action to begin the task-specific search. Choose the most logical first receptacle to inspect (e.g., large central surfaces like diningtable 1 or sofa 1).
  • No Looping: Do not create a loop of go to actions. After executing the first go to, the skill ends, and standard task planning takes over.
  • Integration: The mental map generated by this skill must inform all subsequent planning. When searching for a target object, reason about which receptacles from your map are most likely to contain it.
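
One way to picture the single-action rule is a small helper that picks the first receptacle to visit and returns exactly one `go to` command. The sketch below is illustrative only: the function name `first_action` and the default priority order are assumptions, loosely based on the "large central surfaces" hint above, and a real agent would choose the order from the target object.

```python
from typing import Optional, Sequence

# Assumed default order: large central surfaces first (hypothetical choice).
DEFAULT_PRIORITY = ("diningtable", "sofa")


def first_action(
    receptacles: Sequence[str],
    preferred: Sequence[str] = DEFAULT_PRIORITY,
) -> Optional[str]:
    """Return one 'go to <receptacle>' command, or None if the map is empty."""
    for kind in preferred:
        for name in receptacles:
            if name.split()[0] == kind:
                return f"go to {name}"
    # No preferred type is present: fall back to the first receptacle listed.
    return f"go to {receptacles[0]}" if receptacles else None


if __name__ == "__main__":
    scanned = ["armchair 2", "diningtable 1", "sofa 1"]
    print(first_action(scanned))                       # default priority
    print(first_action(scanned, preferred=("sofa",)))  # target-specific hint
```

The function returns a single command and holds no state, which mirrors the no-looping rule: any further `go to` decisions belong to the task planner, not to the scan.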

Example from Trajectory

Initial Observation: "You are in the middle of a room... you see a armchair 2, a armchair 1, a diningtable 1, a drawer 4... sofa 1."

Skill Execution:

  • Thought: Initiating environment scan. I see multiple receptacles: armchairs, tables, drawers, a dresser, garbage can, ottoman, sidetables, and a sofa. I will build a mental map and then start searching for the laptop on a likely surface.
  • Action: go to sofa 1 (The agent then proceeds with the main task using the knowledge from the scan.)
