browseruse_bench.utils.db_utils
PostgreSQL database connection and operation utilities.
Import
get_db_connection
Get a PostgreSQL connection.
Parameter: optional database URL.
Returns: a psycopg2 connection object.
db_cursor
Context manager providing a cursor that commits on success and rolls back on error.
Usage Example
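A minimal sketch of the commit-on-success, rollback-on-error pattern described for `db_cursor`. It uses the standard-library `sqlite3` module as a stand-in for psycopg2 so that it runs without a PostgreSQL server. The name `cursor_scope` and the `demo` table are hypothetical, and this is not the library's own implementation.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def cursor_scope(conn):
    """Yield a cursor; commit if the block succeeds, roll back if it raises."""
    cur = conn.cursor()
    try:
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()


# sqlite3 stands in for psycopg2 here (assumption, to keep the demo runnable).
conn = sqlite3.connect(":memory:")

with cursor_scope(conn) as cur:
    cur.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, status TEXT)")
    cur.execute("INSERT INTO demo (status) VALUES (?)", ("running",))

try:
    with cursor_scope(conn) as cur:
        cur.execute("INSERT INTO demo (status) VALUES (?)", ("failed",))
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass  # the insert inside the failed block was rolled back

row_count = conn.execute("SELECT COUNT(*) FROM demo").fetchone()[0]
```

Only the row written by the successful block remains; the caller never calls `commit()` or `rollback()` directly.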
init_db_schema
Initialize the database schema (executes scripts/db_schema.sql).
create_run_record
Create a record in the benchmark_run table and return its run_uuid.
Agent name
Benchmark name
Execution mode
Number of tasks
JSON string of task IDs
Start timestamp
Initial status
Path to experiment directory
Returns: UUID of the created run record.
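As a hedged sketch, the values listed above could be assembled like this before calling `create_run_record`. The keyword names (`agent_name`, `task_ids_json`, and so on), the helper `build_run_args`, and all example values are assumptions for illustration; the source describes the arguments but not their names. The call itself is left commented out because it needs a live database.

```python
import json
from datetime import datetime, timezone


def build_run_args(agent_name, benchmark_name, mode, task_ids, experiment_dir):
    """Assemble the values described for create_run_record (names are hypothetical)."""
    return {
        "agent_name": agent_name,
        "benchmark_name": benchmark_name,
        "mode": mode,
        "task_count": len(task_ids),
        "task_ids_json": json.dumps(task_ids),  # task IDs are passed as a JSON string
        "started_at": datetime.now(timezone.utc),
        "status": "running",  # assumed initial status value
        "experiment_dir": experiment_dir,
    }


run_args = build_run_args(
    "example-agent", "example-benchmark", "example-mode",
    ["task-1", "task-2"], "experiments/example-run",
)
# run_uuid = create_run_record(**run_args)  # hypothetical call; requires PostgreSQL
```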
update_run_status
Update the benchmark_run status, finished time, and error message.
insert_task_results
Batch insert into benchmark_task_result.
Required Fields
| Field | Description |
|---|---|
| run_uuid | Run UUID |
| task_id | Task ID |
| benchmark_name | Benchmark name |
| agent_name | Agent name |
| status | Status |
Optional Fields
started_at, finished_at, latency_ms, retries, error_type, raw_log_path, extra
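Because every row in a batch needs the required fields, it can help to check rows before handing them to `insert_task_results`. The following is a minimal sketch under that assumption; `missing_required_fields` is a hypothetical helper, not part of the module, and the row values are placeholders.

```python
REQUIRED_FIELDS = ("run_uuid", "task_id", "benchmark_name", "agent_name", "status")


def missing_required_fields(row):
    """Return the required field names that are absent or None in a result row."""
    return [name for name in REQUIRED_FIELDS if row.get(name) is None]


rows = [
    {
        "run_uuid": "example-run-uuid",
        "task_id": "task-1",
        "benchmark_name": "example-benchmark",
        "agent_name": "example-agent",
        "status": "success",  # placeholder status value
        "latency_ms": 1250,   # optional field
    },
    {"run_uuid": "example-run-uuid", "task_id": "task-2"},  # incomplete row
]

# Map row index -> missing field names, keeping only rows with problems.
problems = {
    index: missing
    for index, row in enumerate(rows)
    if (missing := missing_required_fields(row))
}
```

Here `problems` flags only the second row, so the batch can be corrected or filtered before the insert is attempted.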
insert_eval_metric
Insert a benchmark_eval_metric record.
Field Description
| Field | Type | Description |
|---|---|---|
| run_uuid | str | Run UUID (required) |
| agent_name | str | Agent name (required) |
| benchmark_name | str | Benchmark name (required) |
| model_name | str | Model name |
| score_threshold | int | Score threshold |
| total_tasks | int | Total tasks |
| completed_tasks | int | Completed tasks |
| passed_tasks | int | Passed tasks |
| avg_score | float | Average score |
| pass_rate | float | Pass rate |
| avg_latency_ms | int | Average latency (ms) |
| extra | dict | Extra data (JSON) |
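The field table above can be turned into a small pre-insert check. This is an illustrative sketch only: `check_metric` and `METRIC_FIELD_TYPES` are hypothetical names, the example values are made up, and accepting an `int` where the table says `float` is an assumption made for convenience.

```python
METRIC_FIELD_TYPES = {
    "run_uuid": str,
    "agent_name": str,
    "benchmark_name": str,
    "model_name": str,
    "score_threshold": int,
    "total_tasks": int,
    "completed_tasks": int,
    "passed_tasks": int,
    "avg_score": (int, float),  # assumption: ints are accepted for float fields
    "pass_rate": (int, float),
    "avg_latency_ms": int,
    "extra": dict,
}
REQUIRED_METRIC_FIELDS = ("run_uuid", "agent_name", "benchmark_name")


def check_metric(metric):
    """Return a list of problems: missing required fields, unknown fields, wrong types."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_METRIC_FIELDS
        if name not in metric
    ]
    for name, value in metric.items():
        expected = METRIC_FIELD_TYPES.get(name)
        if expected is None:
            problems.append(f"unknown field: {name}")
        elif not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


metric = {
    "run_uuid": "example-run-uuid",
    "agent_name": "example-agent",
    "benchmark_name": "example-benchmark",
    "total_tasks": 2,
    "passed_tasks": 1,
    "avg_score": 0.5,
    "extra": {"note": "example"},
}
issues = check_metric(metric)
```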