Getting Started
This guide will help you set up and run browseruse-bench.
Prerequisites
- Python 3.8+
- Node.js (for Agent-TARS)
- PostgreSQL (optional, for database integration)
Installation
1. Clone the Repository
2. Install Dependencies
3. Install Agent-TARS (Optional)
4. Configure Environment
- OPENAI_API_KEY: OpenAI API key for evaluation
- LEXMOUNT_API_KEY: Lexmount cloud browser API key
- LEXMOUNT_PROJECT_ID: Lexmount project ID
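Before running anything, it can help to confirm that these variables are actually visible to Python. The following is a minimal sketch and not part of browseruse-bench: the helper name `missing_env_vars` is hypothetical, and it assumes all three variables are needed, which may not hold if you do not use the Lexmount cloud browser.

```python
import os

# Variable names come from the environment list above.
# Treating all three as required is an assumption of this sketch.
CHECKED_VARS = ("OPENAI_API_KEY", "LEXMOUNT_API_KEY", "LEXMOUNT_PROJECT_ID")


def missing_env_vars(env=None):
    """Return the names of checked variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in CHECKED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All checked environment variables are set.")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to try without touching your shell configuration.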
5. Configure Agents
Quick Start
Run a Benchmark
Evaluate Results
Logs: Script execution logs are saved in output/logs/.
- run.py: output/logs/run/
- eval.py: output/logs/eval/
- generate_leaderboard.py: output/logs/leaderboard/
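Given the log layout described above, a short script can locate the most recent log for a given script. This is only a sketch: the helper `newest_log` is hypothetical, the directory names are taken from this guide, and nothing is assumed about how individual log files are named (it just picks the most recently modified file).

```python
from pathlib import Path

# Log directories as listed in this guide, relative to the repository root.
LOG_DIRS = {
    "run.py": Path("output/logs/run"),
    "eval.py": Path("output/logs/eval"),
    "generate_leaderboard.py": Path("output/logs/leaderboard"),
}


def newest_log(script, root="."):
    """Return the most recently modified file in the script's log directory.

    Returns None if the directory does not exist or contains no files.
    """
    log_dir = Path(root) / LOG_DIRS[script]
    if not log_dir.is_dir():
        return None
    files = [p for p in log_dir.iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    for script in LOG_DIRS:
        print(script, "->", newest_log(script))
```

Run it from the repository root, or pass the root explicitly as the second argument.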