API Reference
This section documents all public classes, methods, and functions provided by the CUA SDK.
ComputerAgent
Class: ComputerAgent
The main interface for creating and controlling agents.
Constructor
ComputerAgent(name: str = "Merlin", mode: str = "interactive", config: dict = None)
- name (str): Name of the agent (default: "Merlin")
- mode (str): Operation mode, e.g., "interactive" or "batch" (default: "interactive")
- config (dict): Optional configuration dictionary (default: None)
Methods
run_task(task: str) -> Any
Runs a task using the agent.
- task (str): Description of the task to perform (e.g., "Open Notepad and type 'Hello'")
- Returns: Result of the task (type may vary)
agent = ComputerAgent()
result = agent.run_task("Open Notepad and type 'Hello'")
stop() -> None
Stops the agent and cleans up resources.
agent.stop()
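Because stop() releases resources, it is usually worth making sure it runs even when a task raises. The sketch below shows one way to do that with a small context manager. StandInAgent and managed are hypothetical names used only for illustration: StandInAgent mimics the run_task()/stop() surface documented above so the example runs on its own, and it is not part of the SDK.

```python
from contextlib import contextmanager
from typing import Any, Iterator


class StandInAgent:
    """Hypothetical stand-in exposing the documented run_task()/stop() methods."""

    def __init__(self, name: str = "Merlin") -> None:
        self.name = name
        self.stopped = False

    def run_task(self, task: str) -> Any:
        # A real agent would carry out the task; the stand-in only echoes it.
        if self.stopped:
            raise RuntimeError("agent already stopped")
        return f"{self.name} completed: {task}"

    def stop(self) -> None:
        self.stopped = True


@contextmanager
def managed(agent: Any) -> Iterator[Any]:
    """Yield the agent and always call stop() on exit, even if a task raises."""
    try:
        yield agent
    finally:
        agent.stop()


agent = StandInAgent()
with managed(agent) as active:
    result = active.run_task("Open Notepad and type 'Hello'")
```

The same managed() helper should work with any object that has a stop() method, so swapping the stand-in for a real ComputerAgent instance would leave the with block unchanged.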
Docker Utilities
If you run agents in Docker, the following files are available:
- Dockerfile: Builds the agent image.
- cua_docker/cua_docker_loop.py: Main entrypoint for running agents in Docker.
- cua_docker/config.py: Handles Docker-specific configuration.
Configuration Utilities
- load_dotenv(): Loads environment variables from .env.
- os.getenv("VAR_NAME"): Accesses environment variables in code.
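As a minimal sketch of how the two utilities can be combined, the helper below loads .env (when the python-dotenv package is installed) and then reads settings with os.getenv, falling back to the constructor defaults listed above. The variable names CUA_AGENT_NAME and CUA_AGENT_MODE and the function agent_settings_from_env are assumptions made for this example. The SDK does not define them.

```python
import os

try:
    # load_dotenv is assumed here to come from the python-dotenv package.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def agent_settings_from_env() -> dict:
    """Collect agent settings from the environment (hypothetical variable names)."""
    if load_dotenv is not None:
        # Loads a .env file if one is found; by default it does not
        # override variables that are already set in the environment.
        load_dotenv()
    return {
        "name": os.getenv("CUA_AGENT_NAME", "Merlin"),
        "mode": os.getenv("CUA_AGENT_MODE", "interactive"),
    }


settings = agent_settings_from_env()
```

The resulting dictionary could then be unpacked into the constructor, for example ComputerAgent(**settings), assuming the keys match the documented parameter names.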
Additional Functions
Document any other public functions or utilities here as your SDK evolves.
For more usage examples, see Usage Examples.