## What this is

Use this when you want LexBench Server to run benchmarks (e.g. via Kestra) instead of executing tasks on your machine.

## Dedicated CLI entrypoint

Use `bubench submit` for LexBench jobs. It is intentionally separate from `bubench run`:

- `bubench run`: execute tasks locally on your machine
- `bubench submit`: create a job on LexBench Server

`bubench` builds the payload and calls the official Python SDK (`lexbench-sdk`) method `LexbenchClient.submit_eval_run()`.
## When to use the SDK directly

If you are wiring LexBench from your own Python service, script, or CI and do not need `bubench` task filtering or agent YAML mapping, you may call `submit_eval_run()` yourself. That path does not go through `bubench submit`.
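A minimal sketch of that direct path. Only the class and method names (`LexbenchClient.submit_eval_run()`) come from this document; the import name `lexbench_sdk`, the constructor arguments, and every payload field are illustrative assumptions:

```python
import os


def build_payload(run_name: str, version: str) -> dict:
    """Assemble a submit payload. Field names here are illustrative
    assumptions, not the documented server schema."""
    payload = {"run_name": run_name, "version": version}
    project_id = os.environ.get("LEXBENCH_PROJECT_ID")
    if project_id:
        payload["project_id"] = project_id
    return payload


def submit_directly(run_name: str, version: str):
    # Imported lazily so this sketch stays importable without the SDK installed.
    from lexbench_sdk import LexbenchClient  # import name assumed from "lexbench-sdk"

    client = LexbenchClient(
        base_url=os.environ["LEXBENCH_BASE_URL"],    # constructor args are assumptions
        api_token=os.environ["LEXBENCH_API_TOKEN"],
    )
    return client.submit_eval_run(build_payload(run_name, version))
```

Check the SDK's own documentation for the real payload schema before relying on this shape.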
## Prerequisites

- A reachable LexBench Server instance
- An API token (recommended: the Node CLI's `lexbench login`)

After `lexbench login`, `bubench submit` can automatically reuse `~/.config/lexbench/credentials.json`, so you do not have to duplicate the base URL and token in `.env`.
## Environment variables

Put recurring values in `.env` (see `.env.example` in the repo root). For `bubench submit`, the `.env` in the current working directory is loaded first; the repo root `.env` is used as a fallback. Typical keys:
| Variable | Purpose |
|---|---|
| `LEXBENCH_BASE_URL` | Server base URL (optional if `lexbench login` has saved credentials) |
| `LEXBENCH_API_TOKEN` | API token (optional if `lexbench login` has saved credentials) |
| `LEXBENCH_EVAL_API_KEY` | Evaluation model API key (often required for non-admin users) |
| `LEXBENCH_PROJECT_ID` | Optional target project |
| `LEXBENCH_PROJECT_BENCHMARK_ID` | Optional project benchmark |
| `LEXBENCH_UI_LANGUAGE` | Optional UI locale (`zh` / `en`) |
| `LEXBENCH_RESUME_TIMESTAMP` | Optional resume timestamp |
| `LEXBENCH_FORCE_RERUN` | `true` / `false` |
| `LEXBENCH_DEBUG` | `true` / `false` |
| `LEXBENCH_BATCH_SEQUENTIAL` | `true` / `false` |
| `LEXBENCH_CAPTCHA_SOLVER_SERVICE` | Optional captcha provider |
| `LEXBENCH_CAPTCHA_SOLVER_API_KEY` | Optional captcha API key |
`.env` vs. CLI flags: the main one-off overrides kept on `bubench submit` are `--run-name` and `--version`. The current platform accepts `all`, `first_n`, and `sample_n` as submit modes. `LEXBENCH_EVAL_API_KEY` also falls back to `OPENAI_API_KEY`.
For browser-use, the current platform validates these values even in `--dry-run`:

- model API key (`OPENAI_API_KEY` or equivalent)
- `LEXMOUNT_API_KEY`
- `LEXMOUNT_PROJECT_ID`
- eval API key (`LEXBENCH_EVAL_API_KEY` or fallback `OPENAI_API_KEY`)
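A local pre-flight check mirroring that list can catch missing values before a submit. This is only an illustrative sketch: the real validation runs server-side, and "or equivalent" is narrowed here to `OPENAI_API_KEY` alone:

```python
def check_browser_use_env(env: dict) -> list[str]:
    """Return names of required browser-use values missing from `env`."""
    missing = []
    # "or equivalent" per the docs; only OPENAI_API_KEY is checked in this sketch.
    if not env.get("OPENAI_API_KEY"):
        missing.append("model API key (OPENAI_API_KEY or equivalent)")
    for key in ("LEXMOUNT_API_KEY", "LEXMOUNT_PROJECT_ID"):
        if not env.get(key):
            missing.append(key)
    # The eval key falls back to OPENAI_API_KEY, so either one satisfies it.
    if not (env.get("LEXBENCH_EVAL_API_KEY") or env.get("OPENAI_API_KEY")):
        missing.append("eval API key (LEXBENCH_EVAL_API_KEY or OPENAI_API_KEY)")
    return missing
```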
## Example

Production defaults used by the hosted LexBench platform:

- LexBench base URL: `https://bench.lexmount.com`
- Model gateway: `https://llmapi.hk.lexmount.net`
- Lexmount browser API: `https://api.lexmount.com`
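An illustrative `.env` against the hosted platform, using only keys from the table above (token values are placeholders; which variables, if any, carry the gateway and browser API URLs is not specified in this document):

```shell
# Hosted-platform example; replace the placeholder secrets with your own.
LEXBENCH_BASE_URL=https://bench.lexmount.com
LEXBENCH_API_TOKEN=replace-with-your-token
LEXBENCH_EVAL_API_KEY=replace-with-your-eval-key
LEXBENCH_UI_LANGUAGE=en
```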
## Notes

- `bubench submit` prefers the current working directory `config.yaml`; if missing, it falls back to the repo root `config.yaml`.
- `bubench submit` maps the root `config.yaml` plus relevant env fallbacks to the server-side config schema; it does not send raw YAML keys blindly.
- `--dry-run` is useful for connectivity checks; the server may still create a run record and validate inputs.