Thank you for your interest in contributing to browseruse-bench! This guide explains how to take part in project development.

Development Environment Setup

1. Fork Repository

Fork the browseruse-bench repository on GitHub.

2. Clone Code

git clone https://github.com/your-username/browseruse-bench.git
cd browseruse-bench

3. Install Dependencies

# Using uv (Recommended)
uv sync --all-extras

# Or using pip
pip install -e ".[all,dev]"

4. Create Branch

git checkout -b feature/your-feature-name

Contribution Types

Adding New Agents

  1. Add a new agent module in browseruse_bench/agents/ (e.g., your_agent.py)
  2. Implement BaseAgent.run_task and register with @register_agent
  3. Add runtime config under configs/agents/<agent>/config.yaml (optional config.yaml.example)
  4. Import the module in browseruse_bench/agents/__init__.py
  5. Register the agent in root config.yaml
  6. Update documentation
See Adding New Agents for details. For browser-capable agents and new browser types, also follow Custom Agent Integration.

Adding New Benchmarks

  1. Create a new benchmark directory in benchmarks/
  2. Prepare task data and data_info.json
  3. Implement evaluator (optional)
  4. Update documentation
See Custom Benchmark for details.

Fixing Bugs

  1. Create an Issue to describe the problem
  2. Submit a PR with the fix
  3. Ensure tests pass

Improving Documentation

  1. Modify documentation in the docs/ directory
  2. Submit a PR

Adding New Agents

Directory Structure

browseruse_bench/
└── agents/
    └── your_agent.py
configs/
└── agents/
    └── your-agent/
        ├── config.yaml
        └── config.yaml.example  # Optional example config
Keep secrets in the root .env file and avoid storing API keys in YAML.
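
As an illustration of keeping keys out of YAML, an agent could read its key from the environment at run time. This is a minimal sketch: the helper `load_api_key` and the variable name `YOUR_AGENT_API_KEY` are hypothetical, and it assumes the values in the root .env file have already been exported into the process environment.

```python
import os


def load_api_key(env_var: str = "YOUR_AGENT_API_KEY") -> str:
    """Return the API key from the environment, failing loudly if it is unset.

    The variable name is a placeholder chosen for this example; use whatever
    name your agent documents in its config.yaml.example.
    """
    value = os.environ.get(env_var, "").strip()
    if not value:
        raise RuntimeError(f"{env_var} is not set; add it to the root .env file")
    return value
```

Failing at startup with a clear message tends to be easier to debug than a rejected request halfway through a benchmark run.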

Implement Interface

New agents implement BaseAgent and the run_task method:
from __future__ import annotations

import logging
from pathlib import Path
from typing import Any, Dict

from browseruse_bench.agents.base import BaseAgent
from browseruse_bench.agents.registry import register_agent

logger = logging.getLogger(__name__)


@register_agent
class YourAgent(BaseAgent):
    name = "your-agent"

    def run_task(
        self,
        task_info: Dict[str, Any],
        agent_config: Dict[str, Any],
        task_workspace: Path,
    ) -> Dict[str, Any]:
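        # Fall back to "unknown" so logging still works on malformed task data.
        task_id = task_info.get("task_id", "unknown")
        logger.info("Running task %s", task_id)
        # Placeholder result: replace with the agent's real answer and metrics.
        return {
            "task_id": task_id,
            "status": "success",
            "answer": "",
            "metrics": {"steps": 0},
        }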

Register Agent

  1. Import the module in browseruse_bench/agents/__init__.py:
from browseruse_bench.agents import your_agent  # noqa: F401
  2. Add the agent entry to the root config.yaml:
agents:
  your-agent:
    path: browseruse_bench/agents
    entrypoint: scripts/agent_runner.py
    config: configs/agents/your-agent/config.yaml
    supported_benchmarks:
      - Online-Mind2Web
    venv: .venv
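
Before opening a PR, it can help to check that a new entry has the same fields as the example above. The sketch below works on an already-parsed dictionary, so it needs no YAML library. Treating all five keys as required is an assumption drawn from the example, and `missing_agent_keys` is a hypothetical helper, not part of the project.

```python
from typing import Any, Dict, List

# Keys that appear in the example entry; assumed (not confirmed) to be required.
EXPECTED_KEYS = ("path", "entrypoint", "config", "supported_benchmarks", "venv")


def missing_agent_keys(entry: Dict[str, Any]) -> List[str]:
    """Return the expected keys that are absent from one agent entry."""
    return [key for key in EXPECTED_KEYS if key not in entry]


example_entry = {
    "path": "browseruse_bench/agents",
    "entrypoint": "scripts/agent_runner.py",
    "config": "configs/agents/your-agent/config.yaml",
    "supported_benchmarks": ["Online-Mind2Web"],
    "venv": ".venv",
}
```

Calling `missing_agent_keys(example_entry)` returns an empty list, while an entry that omits a field returns that field's name.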

Code Standards

Formatting

# Use ruff to format
ruff format .

# Check code style
ruff check .

Type Checking

# Use mypy for type checking
mypy browseruse_bench

Testing

# Run all tests
pytest tests/

# Run specific test
pytest tests/test_eval.py -v

Submitting PR

1. Ensure Tests Pass

pytest tests/
ruff check .

2. Commit Changes

git add .
git commit -m "feat: add your feature description"

3. Push Branch

git push origin feature/your-feature-name

4. Create PR

Create a Pull Request on GitHub and describe your changes.

Commit Convention

Use the Conventional Commits format:
  • feat: New feature
  • fix: Bug fix
  • docs: Documentation update
  • refactor: Code refactoring
  • test: Add or update tests
  • chore: Other changes
Examples:
feat: add WebArena benchmark support
fix: resolve timeout issue in browser-use agent
docs: update quickstart guide
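
A commit subject can be checked against this convention with a short script. The following is a sketch, not a tool shipped with the project: `is_conventional` is a hypothetical helper, and it accepts only the six types listed above, plus the optional `(scope)` and `!` markers that the Conventional Commits specification allows.

```python
import re

# The commit types listed in this guide.
ALLOWED_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")

# type, optional "(scope)", optional "!", then ": " and a non-empty description.
_TYPES = "|".join(ALLOWED_TYPES)
SUBJECT_RE = re.compile(r"^(?:" + _TYPES + r")(?:\([\w.-]+\))?!?: \S.*$")


def is_conventional(subject: str) -> bool:
    """Return True if the commit subject line matches the convention."""
    return SUBJECT_RE.match(subject) is not None
```

A function like this could be wired into a local commit-msg hook, although whether the project enforces the convention automatically is not stated here.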