[completely optional] direnv+mise autosetup (#87)
Makes life a lot easier in my experience.
a year ago
feat&refactor: add proxy setup functionality and update .gitignore for proxy config file
3 months ago
feat: add client password argument to multiple agents and scripts
- Introduced `--client_password` argument in `run_multienv_aguvis.py`, `run_multienv_claude.py`, and `run_multienv_gta1.py` for enhanced security and flexibility.
- Updated agent classes (`PromptAgent`, `AguvisAgent`, `GTA1Agent`) to accept and utilize `client_password` for improved configuration.
- Modified evaluation guidelines to reflect the new client password requirement.
- Ensured existing logic remains intact while enhancing functionality for better user experience.
2 months ago
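The `--client_password` change described above could look roughly like the sketch below. This is only an illustration: the empty-string default, the help text, and the `DemoAgent` class are assumptions made for the example and are not taken from the repository's run scripts or agent classes.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing a --client_password option (illustrative only)."""
    parser = argparse.ArgumentParser(description="Hypothetical multi-env runner")
    # The empty-string default is an assumption, not the repository's value.
    parser.add_argument(
        "--client_password",
        type=str,
        default="",
        help="Client password forwarded to the agent",
    )
    return parser


class DemoAgent:
    """Hypothetical stand-in for an agent class that accepts client_password."""

    def __init__(self, client_password: str = "") -> None:
        # Stored so later steps can use it when a password is required.
        self.client_password = client_password


# Parse an explicit argument list so the sketch runs without a real CLI call.
args = build_parser().parse_args(["--client_password", "example-secret"])
agent = DemoAgent(client_password=args.client_password)
```

Passing the value through the constructor keeps the password out of module-level globals, so each agent instance can be configured independently.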
PROXY_GUIDELINE.md Updates by Changyu Pang from Tsinghua (#41)
* fix proxy readme
* Add logs directory with .gitignore
* Update PROXY_GUIDELINE.md
a year ago
Hiroid
fix(maestro): Fixed the debug logging level (#334)
Co-authored-by: Liangxuan Guo <guoliangxuan@deepmatrix.com.cn>
a668670
Fix README update
14 days ago
Refactoring VMware Integration and Implementing AWS Support (#44)
* Initialize aws support
* Add README for the VM server
* Refactor OSWorld for supporting more cloud services.
* Initialize vmware and aws implementation v1, waiting for verification
* Initialize files for azure, gcp and virtualbox support
* Debug on the VMware provider
* Fix on aws interface mapping
* Fix instance type
* Refactor
* Clean
* hk region; debug
* Fix lock
* Remove print
* Remove key_name requirements when allocating aws vm
* Clean README
---------
Co-authored-by: XinyuanWangCS <xywang626@gmail.com>
a year ago
feat: add run_multienv_o3.py script for multi-environment evaluation
- Introduced a new script `run_multienv_o3.py` to facilitate end-to-end evaluation across multiple environments.
- Implemented command-line argument parsing for various configurations including environment settings, logging levels, and AWS parameters.
- Integrated signal handling for graceful shutdown of environments and processes.
- Enhanced logging capabilities for better traceability during execution.
- Maintained existing logic from previous scripts while introducing new functionalities for improved evaluation processes.
2 months ago
Enhance Public Evaluation Guidelines by adding new images for AWS setup and monitoring instructions. Included additional contact information for leaderboard updates and error reporting. Ensured clarity and usability for users while preserving existing content structure.
2 months ago
feat: refactor run_multienv_qwen25vl.py and qwen25vl_agent.py for improved logging and task management
- Introduced signal handling for graceful shutdown of environments and processes.
- Enhanced logging configuration to support dynamic log levels and structured output.
- Updated argument parsing to include new parameters for model selection and task execution.
- Refactored task distribution logic to streamline environment task management.
- Improved error handling during task execution and environment cleanup.
- Adjusted Qwen25VLAgent initialization to support new model and thought prefix options.
- Reduced max tries for LLM calls to optimize performance.
2 months ago
Update default path_to_vm argument to None in quickstart.py for improved flexibility
10 days ago
Update instruction wording in LibreOffice Impress example to clarify text color change requirements. Address https://github.com/xlang-ai/OSWorld/issues/324
14 days ago
fix(maestro): Fixed the debug logging level (#334)
Co-authored-by: Liangxuan Guo <guoliangxuan@deepmatrix.com.cn>
4 days ago
Add AutoGLM-OS agent (#309)
* autoglm-os initialize
* clean code
* chore: use proxy for download setup
* feat(autoglm-os): add parameter to toggle images
* fix: use temporary directory for files pulled from the vm to prevent potential collision when running multiple instances of the same task in parallel
* update
* add client_password
* update multienv
* fix
* fix prompt
* fix prompt
* fix prompt
* fix sys prompt
* feat: use proxy in file evaluator
* fix client_password
* fix note_prompt
* fix autoglm agent cmd type
* fix
* revert: fix: use temporary directory for files pulled from the vm to prevent potential collision when running multiple instances of the same task in parallel
reverts commit bab5473eea1de0e61b0e1d68b23ce324a5b0ee57
* feat(autoglm): setup tools
* fix(autoglm): remove second time of get a11y tree
* add osworld server restart
* Revert "add osworld server restart"
This reverts commit 7bd9d84122e246ce2a26de0e49c25494244c2b3d.
* fix _launch_setup
* fix autoglm agent tools & xml tree
* fix desktop_env
* fix bug for tool name capitalization
* fix: always use proxy for setup download
* add fail after exceeding max turns
* fix(autoglm): avoid adding image to message when screenshot is empty
* fix maximize_window
* fix maximize_window
* fix maximize_window
* fix import browsertools module bug
* fix task proxy config bug
* restore setup
* refactor desktop env
* restore image in provider
* restore file.py
* refactor desktop_env
* quick fix
* refactor desktop_env.step
* fix our env reset
* add max turns constraint
* clean run script
* clean lib_run_single.py
---------
Co-authored-by: hanyullai <hanyullai@outlook.com>
Co-authored-by: JingBh <jingbohao@yeah.net>
a month ago
fix: update Flask port configuration to support environment variable
- Modified the Flask application to allow the port to be set via the `FLASK_PORT` environment variable, defaulting to 8080 if not specified.
- Ensured existing application logic remains unchanged while enhancing configurability for deployment environments.
2 months ago
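A minimal sketch of the `FLASK_PORT` behavior described above, assuming the port is read once at startup. The helper name `resolve_port` is hypothetical and the repository's actual code may be structured differently.

```python
import os


def resolve_port(environ=os.environ) -> int:
    """Return the port from FLASK_PORT, falling back to 8080 when it is unset."""
    # environ is injectable so the lookup can be exercised without
    # modifying the real process environment.
    return int(environ.get("FLASK_PORT", "8080"))


# In a Flask app, the value would typically be passed to app.run(port=...).
port = resolve_port({"FLASK_PORT": "5000"})
```

With no variable set, `resolve_port({})` falls back to 8080, matching the default stated in the commit message.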
Update OpenCV dependency to headless version in requirements and setup files
- Replaced 'opencv-python' with 'opencv-python-headless' in both requirements.txt and setup.py to reduce unnecessary GUI dependencies.
- Added a new .gitkeep file in the logs directory to ensure it is tracked in version control.
- Maintained existing code logic while improving dependency management.
a month ago
feat: enhance logging and signal handling in run_multienv_claude.py
- Refactored logging configuration to support dynamic log levels via command-line arguments, allowing for better control over log verbosity.
- Introduced a new signal handler for graceful shutdown of environments and processes, improving robustness during termination.
- Added functionality to save command-line arguments to a JSON file for better traceability of execution parameters.
- Maintained existing logic while enhancing the overall structure and error handling capabilities of the script.
2 months ago
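The three changes listed above (a log level set from the command line, a signal handler for graceful shutdown, and saving the arguments to JSON) can be sketched as follows. The flag names `--log_level` and `--result_dir`, the file name `args.json`, and the logger name are assumptions for illustration, not the script's actual interface.

```python
import argparse
import json
import logging
import signal
import sys
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """Parse hypothetical flags for log verbosity and output location."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--log_level",
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
    )
    parser.add_argument("--result_dir", default="./results")
    return parser.parse_args(argv)


def configure_logging(level_name: str) -> logging.Logger:
    """Set the logger's level from the name given on the command line."""
    logger = logging.getLogger("demo_runner")
    logger.setLevel(getattr(logging, level_name))
    return logger


def save_args(args: argparse.Namespace, out_dir) -> Path:
    """Write the parsed arguments to a JSON file for later traceability."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    target = out_path / "args.json"
    target.write_text(json.dumps(vars(args), indent=2))
    return target


def install_signal_handlers(cleanup) -> None:
    """Run cleanup() and exit when SIGINT or SIGTERM arrives."""

    def handler(signum, frame):
        cleanup()
        sys.exit(0)

    # signal.signal must be called from the main thread.
    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGTERM, handler)
```

A runner would call `install_signal_handlers` once at startup with a function that closes its environments and terminates worker processes, then call `save_args` before the first task starts.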
fix some multi_apps tasks (#245)
* fix chrome
* fix some multi_apps tasks.
* fix some multiapps tasks
* fix some multiapps tasks
---------
Co-authored-by: yuanmengqi <yuanmengqi@mail.ustc.edu.cn>
2 months ago
add GDrive guideline
3 months ago
feat: enhance run_coact.py and related agents with improved task handling and configuration
- Updated TASK_DESCRIPTION in run_coact.py to clarify task-solving steps and requirements.
- Modified configuration parameters for provider name and client password for better security and flexibility.
- Enhanced OrchestratorUserProxyAgent to include user instruction in the auto-reply and improved screenshot handling.
- Adjusted coding_agent.py to ensure proper verification of results before saving changes.
- Improved CUA agent prompts to maintain application state and handle user instructions more effectively.
- Ensured existing code logic remains unchanged while enhancing functionality and usability.
a month ago
Clean Code; Refactor README
a year ago
Wxy/opencua (#290)
* OpenCUA Agent code base
* update url
* debug, modify url input
* debug opencua
* show result
* debug agent history overlap
* modify opencua agent; add comment lines
* update parallel; clean code; use sleep 3s
* ui-tars-0717
* update detail
* add system password to system prompt
* add running command
2 months ago
Fix minor problems when aggregating the results (#106)
10 months ago
Merge branch 'main' of github.com:xlang-ai/OSWorld
24 days ago
Uitars/dev (#291)
* use aws pub ip
* os task fix: set the default dim screen time to be 300s
* add all the uitars agents:
1. run_multienv_uitars.py: Qwen2VL-based UITARS models
2. run_multienv_uitars15_v1.py: UITARS1.5-7B
3. run_multienv_uitars15_v2.py: SeedVL1.5 thinking/non-thinking
---------
Co-authored-by: Jiaqi <dengjiaqi@moonshot.cn>
2 months ago
fix multienv bug (#327)
16 days ago
Merge pull request #264 from yuanmengqi/main
Improve the parallel logic
2 months ago
Add support for GUI-Owl agent (#318)
* add run_multienv_owl.py
* add owl_agent.py
19 days ago
add support for mobile agent v3 (#328)
* add support for mobile agent v3
* add mobile_agent
* add support for mobile agent v3
15 days ago
Add multiple new modules and tools to enhance the functionality and extensibility of the Maestro project (#333)
* Added a **pyproject.toml** file to define project metadata and dependencies.
* Added **run\_maestro.py** and **osworld\_run\_maestro.py** to provide the main execution logic.
* Introduced multiple new modules, including **Evaluator**, **Controller**, **Manager**, and **Sub-Worker**, supporting task planning, state management, and data analysis.
* Added a **tools module** containing utility functions and tool configurations to improve code reusability.
* Updated the **README** and documentation with usage examples and module descriptions.
These changes lay the foundation for expanding the Maestro project’s functionality and improving the user experience.
Co-authored-by: Hiroid <guoliangxuan@deepmatrix.com>
7 days ago