What this is

Use this when you want LexBench Server to run benchmarks (e.g. via Kestra) instead of executing tasks on your machine.

Dedicated CLI entrypoint

Use bubench submit for LexBench jobs. It is intentionally separate from bubench run:
  • bubench run: execute tasks locally on your machine
  • bubench submit: create a job on LexBench Server
Under the hood, bubench builds the payload and calls the official Python SDK (lexbench-sdk) method LexbenchClient.submit_eval_run().

When to use the SDK directly

If you are wiring LexBench from your own Python service, script, or CI and do not need bubench task filtering or agent YAML mapping, you may call submit_eval_run() yourself. That path does not go through bubench submit.

Prerequisites

  • A reachable LexBench Server instance
  • An API token (the recommended way to obtain one is the Node CLI):
npm install -g @lexbench/sdk
lexbench login --base-url "https://bench.lexmount.com"
After lexbench login, bubench submit can automatically reuse ~/.config/lexbench/credentials.json, so you do not have to duplicate the base URL and token in .env.

Environment variables

Put recurring values in .env (see .env.example in the repo root). For bubench submit, the .env in the current working directory is loaded first, then the repo-root .env is used as a fallback. Typical keys:
  • LEXBENCH_BASE_URL: Server base URL (optional if lexbench login has saved credentials)
  • LEXBENCH_API_TOKEN: API token (optional if lexbench login has saved credentials)
  • LEXBENCH_EVAL_API_KEY: Evaluation model API key (often required for non-admin users)
  • LEXBENCH_PROJECT_ID: Optional target project
  • LEXBENCH_PROJECT_BENCHMARK_ID: Optional project benchmark
  • LEXBENCH_UI_LANGUAGE: Optional UI locale (zh / en)
  • LEXBENCH_RESUME_TIMESTAMP: Optional resume timestamp
  • LEXBENCH_FORCE_RERUN: true / false
  • LEXBENCH_DEBUG: true / false
  • LEXBENCH_BATCH_SEQUENTIAL: true / false
  • LEXBENCH_CAPTCHA_SOLVER_SERVICE: Optional captcha provider
  • LEXBENCH_CAPTCHA_SOLVER_API_KEY: Optional captcha API key
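The .env lookup order described above (current working directory first, repo root as fallback) can be sketched as:

```python
from pathlib import Path


def env_files_in_load_order(cwd: Path, repo_root: Path) -> list[Path]:
    """Return the .env files bubench submit would load, in priority order:
    the current working directory first, then the repo root as fallback."""
    return [p for p in (cwd / ".env", repo_root / ".env") if p.is_file()]
```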
Most LexBench platform settings are env-only by design, so users do not need to maintain the same option in both .env and CLI flags. The main one-off CLI overrides kept on bubench submit are --run-name and --version. The current platform accepts all, first_n, and sample_n as submit modes, and LEXBENCH_EVAL_API_KEY falls back to OPENAI_API_KEY.

For browser-use, the current platform validates the following values even in --dry-run:
  • model API key (OPENAI_API_KEY or equivalent)
  • LEXMOUNT_API_KEY
  • LEXMOUNT_PROJECT_ID
  • eval API key (LEXBENCH_EVAL_API_KEY, with fallback to OPENAI_API_KEY)
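A small local preflight check mirroring that list can catch missing keys before you submit. This only reproduces the checks named above; whatever additional validation the server performs is not modeled here.

```python
import os


def missing_browser_use_keys(env=None) -> list[str]:
    """Return the names of values the platform validates for browser-use
    submissions (even under --dry-run) that are not set."""
    env = os.environ if env is None else env
    missing = []
    # Model API key (OPENAI_API_KEY or an equivalent provider key).
    if not env.get("OPENAI_API_KEY"):
        missing.append("OPENAI_API_KEY")
    for var in ("LEXMOUNT_API_KEY", "LEXMOUNT_PROJECT_ID"):
        if not env.get(var):
            missing.append(var)
    # Eval key, honoring the documented OPENAI_API_KEY fallback.
    if not (env.get("LEXBENCH_EVAL_API_KEY") or env.get("OPENAI_API_KEY")):
        missing.append("LEXBENCH_EVAL_API_KEY")
    return missing
```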

Example

Production defaults used by the hosted LexBench platform:
  • LexBench base URL: https://bench.lexmount.com
  • Model gateway: https://llmapi.hk.lexmount.net
  • Lexmount browser API: https://api.lexmount.com
export LEXBENCH_BASE_URL="https://bench.lexmount.com"
export LEXBENCH_API_TOKEN="<api_token>"
export OPENAI_API_KEY="<model_api_key>"
export LEXBENCH_EVAL_API_KEY="<eval_api_key>"
export LEXMOUNT_API_KEY="<lexmount_api_key>"
export LEXMOUNT_PROJECT_ID="<lexmount_project_id>"
export LEXMOUNT_BASE_URL="https://api.lexmount.com"
export OPENAI_BASE_URL="https://llmapi.hk.lexmount.net"

cp config.example.yaml config.yaml

uv run bubench submit \
  --agent browser-use \
  --benchmark LexBench-Browser \
  --mode sample_n \
  --count 1 \
  --split L1 \
  --dry-run \
  --run-name nightly-smoke

Notes

  • bubench submit prefers the current working directory config.yaml; if missing, it falls back to the repo root config.yaml.
  • bubench submit maps root config.yaml plus relevant env fallbacks to the server-side config schema; it does not send raw YAML keys blindly.
  • --dry-run is useful for connectivity checks; the server may still create a run record and validate inputs.