xiangyi-li / OS-World
mirrored a minute ago
  • __init__.py (1.59 kB)
  • calc.py (522 B)
  • chrome.py (123 kB)
  • file.py (5.46 kB)
  • general.py (1.25 kB)
  • gimp.py (1.11 kB)
  • impress.py (7.03 kB)
  • info.py (1.5 kB)
  • misc.py (21.6 kB)
  • replay.py (709 B)
  • vlc.py (3.69 kB)
  • vscode.py (1.08 kB)
Timothyxxx: Add new section in README for OSWorld-MCP project (8365edc)
/ evaluators / desktop_env / getters
Clean code; Refactor environment to pass screenshot content instead of path
2 years ago
Add safe browsing feature to Chrome evaluator
- Implemented `get_enable_safe_browsing` function to retrieve safe browsing settings based on the operating system.
- Updated the `__init__.py` to include the new function.
- Modified JSON examples to reflect the change from enabling enhanced safety browsing to enabling safe browsing.
- Added necessary commands in the JSON examples for setting up preferences for safe browsing.
a month ago
Add safe browsing feature to Chrome evaluator
- Implemented `get_enable_safe_browsing` function to retrieve safe browsing settings based on the operating system.
- Updated the `__init__.py` to include the new function.
- Modified JSON examples to reflect the change from enabling enhanced safety browsing to enabling safe browsing.
- Added necessary commands in the JSON examples for setting up preferences for safe browsing.
a month ago
fix: Enhance error handling and logging across multiple evaluators
- Added logging for file retrieval and error handling in file.py, improving robustness during file operations.
- Implemented checks for file existence and parsing errors in general.py, enhancing reliability in JSON/YAML processing.
- Improved table comparison logic in table.py with detailed error logging for sheet loading and cell value reading.
- Enhanced metrics evaluation in slides.py with additional checks for paragraph and run counts, ensuring thorough comparison.
- Updated utils.py to include file existence checks and detailed error logging during cell value reading.
4 months ago
feat: enhance VM wallpaper retrieval and image similarity checks
- Added logging to the VM wallpaper retrieval function to capture errors and warnings related to content retrieval and file creation.
- Implemented checks for None, empty, and invalid content types to ensure robustness in wallpaper handling.
- Enhanced the SSIM structure check function with size validation and improved error handling for image processing.
- Added logging for image size discrepancies and exceptions during SSIM computation to aid in debugging.
These changes improve error handling and logging, ensuring better maintainability and reliability of the evaluators.
3 months ago
Support Docker VM manager and provider (#75)
* Add docker provider framework
* Update VM download link
* Add stop container
* Update docker manager & provider
* Update
* Update
* Update provider
a year ago
Finish loading the vscode examples v1; Improve on the infra: Add accessibility tree into the observation; Add activate window function, etc
2 years ago
[Feature] Initialize and Implement Aguvis Evaluation on OSWorld (#98)
* Initialize Aguvis eval on OSWorld
* Debug
* Debug
* v1, internal version
* Add experiments script
* Fix minor bugs
* Update new endpoint
* Update ip
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Fix model name
* Fix docker close issues; update prompting
* Fix missed
* Fix the default port to avoid crashing on examples like '_update_browse_history_setup'
* Fix server and chromium ports in setup
* Revert and add missed dependency
* Add VLC port for docker
* Update
* Clean
---------
Co-authored-by: Tianbao Xie <tianbaoxie@U-492FC39R-0217.local>
Co-authored-by: FredWuCZ <fredwucz@outlook.com>
a year ago
Fix minor errors in vscode and gimp about path and postconfig
2 years ago
add multi-app examples
2 years ago
Check and fix on Chrome tasks
- Added `pytz` dependency to `requirements.txt` for timezone handling.
- Introduced `get_macys_product_url_parse` function to replace the old `get_url_path_parse` for better clarity and maintain backward compatibility.
- Enhanced logging throughout the `get_active_tab_html_parse` and `get_rule_relativeTime` functions for improved debugging and traceability.
- Updated JSON examples to reflect changes in expected keys and added new fields for better evaluation context.
- Removed deprecated execution commands from JSON examples to streamline the evaluation process.
4 months ago
update multi-apps
2 years ago