Evaluation Command
Basic Evaluation
Evaluate Data Subset
Force Re-evaluation
Results
Results are saved in experiments/Online-Mind2Web/<Agent>/<Timestamp>/tasks_eval_result/.
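To locate the results of the most recent run programmatically, something like the following may help. It is a minimal sketch, not part of the project's tooling: the helper name `latest_eval_result_dir` is hypothetical, and it assumes only the directory layout stated above. Because the format of `<Timestamp>` is not specified here, the sketch picks the newest run by filesystem modification time rather than by parsing the directory name.

```python
from pathlib import Path
from typing import Optional


def latest_eval_result_dir(
    agent: str,
    root: Path = Path("experiments/Online-Mind2Web"),
) -> Optional[Path]:
    """Return the most recently modified tasks_eval_result directory for an agent.

    Hypothetical helper: assumes the layout
    <root>/<Agent>/<Timestamp>/tasks_eval_result/ described above.
    Returns None if the agent has no evaluation results yet.
    """
    agent_dir = root / agent
    if not agent_dir.is_dir():
        return None

    # One candidate per timestamped run directory.
    candidates = [
        path for path in agent_dir.glob("*/tasks_eval_result") if path.is_dir()
    ]
    if not candidates:
        return None

    # Choose by modification time, since the timestamp format is not assumed.
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

For example, `latest_eval_result_dir("MyAgent")` (with `MyAgent` standing in for a real agent directory name) returns a `Path` to the newest results directory, or `None` if no evaluation has been run for that agent.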