Installation
bubench run will create the agent venv defined in config.yaml (default .venvs/browser_use) and install the browser-use
extra on first use.
Activate .venv (or use uv run bubench ...) before running bubench commands.
Configuration
First copy the example configuration, then edit configs/agents/browser-use/config.yaml.
Supported Model Types
| MODEL_TYPE | Description | Configuration Items |
|---|---|---|
| OPENAI | OpenAI Models | OPENAI_API_KEY, OPENAI_BASE_URL |
| GEMINI | Gemini Models | GEMINI_API_KEY, GEMINI_BASE_URL |
| BROWSER_USE | Browser Use Official API | BROWSER_USE_API_KEY |
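The mapping in the table above can be checked before a run. The following is a minimal sketch, not part of bubench: the helper name `missing_settings` is hypothetical, and it assumes the configuration items are supplied as environment variables. The source does not say which items are mandatory (a base URL may well be optional), so the helper only reports what is unset and leaves that decision to the caller.

```python
import os

# Configuration items listed per MODEL_TYPE in the table above.
PROVIDER_SETTINGS = {
    "OPENAI": ("OPENAI_API_KEY", "OPENAI_BASE_URL"),
    "GEMINI": ("GEMINI_API_KEY", "GEMINI_BASE_URL"),
    "BROWSER_USE": ("BROWSER_USE_API_KEY",),
}


def missing_settings(model_type, env=None):
    """Return the documented items for model_type that are unset or empty.

    Hypothetical helper: it assumes the items live in the environment and
    does not decide which of them are actually required.
    """
    env = os.environ if env is None else env
    if model_type not in PROVIDER_SETTINGS:
        raise ValueError(
            f"Unknown MODEL_TYPE {model_type!r}; expected one of {sorted(PROVIDER_SETTINGS)}"
        )
    return [name for name in PROVIDER_SETTINGS[model_type] if not env.get(name)]
```

Calling `missing_settings("OPENAI")` with no argument for `env` inspects the current process environment; passing a plain dict makes the check easy to test.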
Configuration Description
| Parameter | Description | Options/Examples |
|---|---|---|
| MODEL_TYPE | Model Provider Type | OPENAI, GEMINI, BROWSER_USE |
| MODEL_ID | Model ID | BU-1.0, gpt-4o, gemini-3-flash-preview |
| BROWSER_USE_API_KEY | Browser Use Official API Key | Used mainly for BROWSER_USE mode |
| BROWSER_ID | Browser Type | Chrome-Local, lexmount, browser-use-cloud, agentbay |
| USE_VISION | Enable Vision Capabilities | true, false (default) |
| MAX_STEPS | Maximum Task Steps | Integer (default 40) |
| TIMEOUT | Task Timeout (seconds) | Default 600, CLI --timeout overrides |
| GEMINI3_THINKING_LEVEL | Gemini 3 Thinking Level | low, medium, high |
| LEXMOUNT_BROWSER_MODE | Lexmount Browser Mode | normal (default), uc |
| AGENTBAY_API_KEY | AgentBay API Key | Set in .env as AGENTBAY_API_KEY (env only) |
| AGENTBAY_IMAGE_ID | AgentBay Session Image ID | Default browser_latest |
| AGENTBAY_ENABLE_BROWSER_REPLAY | Enable AgentBay Browser Replay | true (default), false |
| AGENTBAY_BROWSER_USE_STEALTH | Enable AgentBay Stealth Browser Option | false (default), true |
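The defaults, allowed values, and the --timeout override described in the table can be illustrated with a small sketch. It is not bubench's own loader: `resolve_config` is a hypothetical helper, and it assumes the settings arrive as a flat mapping of the parameter names shown above, which may not match the real layout of config.yaml.

```python
# Allowed values taken from the Options/Examples column above.
ALLOWED = {
    "MODEL_TYPE": {"OPENAI", "GEMINI", "BROWSER_USE"},
    "BROWSER_ID": {"Chrome-Local", "lexmount", "browser-use-cloud", "agentbay"},
    "GEMINI3_THINKING_LEVEL": {"low", "medium", "high"},
    "LEXMOUNT_BROWSER_MODE": {"normal", "uc"},
}

# Defaults stated in the table above.
DEFAULTS = {
    "USE_VISION": False,
    "MAX_STEPS": 40,
    "TIMEOUT": 600,
    "LEXMOUNT_BROWSER_MODE": "normal",
    "AGENTBAY_IMAGE_ID": "browser_latest",
    "AGENTBAY_ENABLE_BROWSER_REPLAY": True,
    "AGENTBAY_BROWSER_USE_STEALTH": False,
}


def resolve_config(user_config, cli_timeout=None):
    """Merge user settings over the documented defaults and validate them.

    Hypothetical helper: assumes a flat mapping of parameter names.
    """
    config = {**DEFAULTS, **user_config}
    for key, allowed in ALLOWED.items():
        if key in config and config[key] not in allowed:
            raise ValueError(f"{key}={config[key]!r} is not one of {sorted(allowed)}")
    # A timeout given on the command line takes precedence over the config value.
    if cli_timeout is not None:
        config["TIMEOUT"] = cli_timeout
    return config
```

For example, `resolve_config({"MODEL_TYPE": "OPENAI", "BROWSER_ID": "Chrome-Local"}, cli_timeout=120)` keeps MAX_STEPS at its default while replacing the 600-second timeout with 120.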
AgentBay Runtime Notes
- AgentBay SDK is treated as an optional dependency. Missing packages or incompatible exports fail only when BROWSER_ID=agentbay; other browser modes continue to work.
- Session cleanup failures in AgentBay backend are logged and do not mask task execution errors.
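The two behaviors described in these notes follow a common pattern, sketched below under stated assumptions. This is not the project's actual backend code: both function names are hypothetical, and the default module name `agentbay` is a guess at the SDK's import name. The sketch covers only the missing-package case, not incompatible exports.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def load_agentbay_sdk(browser_id, module_name="agentbay"):
    """Import the SDK only when the agentbay browser is selected.

    module_name is an assumed import name, not confirmed by the source.
    """
    if browser_id != "agentbay":
        # Other browser modes never touch the optional dependency.
        return None
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"BROWSER_ID=agentbay requires the optional package {module_name!r}"
        ) from exc


def run_with_cleanup(task, cleanup):
    """Run task(), then cleanup(); a cleanup failure is logged, not raised."""
    try:
        return task()
    finally:
        try:
            cleanup()
        except Exception:
            # Logging here keeps any exception raised by task() as the one
            # the caller sees.
            logger.exception("AgentBay session cleanup failed")
```

Because the cleanup error is caught inside the `finally` block, an exception raised by the task itself still propagates unchanged, and a successful task still returns its result.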
Browser Modes
Local Browser
Use local Chrome browser, suitable for development and debugging.
Cloud Browser
Use Lexmount cloud browser, suitable for large-scale evaluation.
Usage Examples
Basic Run
Run Specific Tasks
Evaluation
Supported Benchmarks
- ✅ LexBench-Browser
- ✅ Online-Mind2Web
- ✅ BrowseComp