[Add] browser-use and main.py

2025-05-18 21:57:54 +09:00
commit 96914d44ac
parent 08e64bdf45
221 changed files with 30952 additions and 1 deletions
--- a/browser-use/docs/development/contribution-guide.mdx
+++ b/browser-use/docs/development/contribution-guide.mdx
@ -0,0 +1,12 @@
+---
+title: "Contribution Guide"
+description: "Learn how to contribute to Browser Use"
+icon: "github"
+---
+
+
+- check out our most active issues or ask in [Discord](https://discord.gg/zXJJHtJf3k) for ideas of what to work on
+- get inspiration / share what you build in the [`#showcase-your-work`](https://discord.com/channels/1303749220842340412/1305549200678850642) channel and on [`awesome-browser-use-prompts`](https://github.com/browser-use/awesome-prompts)!
+- no typo/style-only nit PRs, you can submit nit fixes but only if part of larger bugfix or new feature PRs
+- include a demo screenshot/gif, tests, and ideally an example script demonstrating any changes in your PR
+- bump your issues/PRs with comments periodically if you want them to be merged faster
--- a/browser-use/docs/development/evaluations.mdx
+++ b/browser-use/docs/development/evaluations.mdx
@ -0,0 +1,48 @@
+---
+title: "Evaluations"
+description: "Test the Browser Use agent on standardized benchmarks"
+icon: "chart-bar"
+---
+
+## Prerequisites
+
+Browser Use uses proprietary/private test sets that must never be committed to GitHub and must be fetched through an authorized API request.
+Accessing these test sets requires an approved Browser Use account.
+There are currently no publicly available test sets, but some may be released in the future.
+
+## Get an API Access Key
+
+First, navigate to https://browser-use.tools and log in with an authorized Browser Use account.
+
+Then, click the "Account" button at the top right of the page, and click the "Cycle New Key" button on that page.
+
+Copy the resulting URL and secret key into your `.env` file. It should look like this:
+
+```bash .env
+EVALUATION_TOOL_URL= ...
+EVALUATION_TOOL_SECRET_KEY= ...
+```
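
Before launching a run, it can help to confirm that both variables are visible to the process. The following is a minimal sketch, assuming the two variable names shown above and that the `.env` values have already been exported or loaded into the environment. The helper `missing_eval_settings` is hypothetical and not part of Browser Use.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_EVAL_VARS = ("EVALUATION_TOOL_URL", "EVALUATION_TOOL_SECRET_KEY")


def missing_eval_settings(env=None):
    """Return the names of required evaluation variables that are unset or blank.

    Hypothetical helper for illustration; not part of Browser Use.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_EVAL_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_eval_settings()
    if missing:
        print("Missing evaluation settings: " + ", ".join(missing))
    else:
        print("Evaluation settings look complete.")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise in isolation.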
+
+## Running Evaluations
+
+First, ensure your file `eval/service.py` is up to date.
+
+Then run the file:
+
+```bash
+python eval/service.py
+```
+
+## Configuring Evaluations
+
+You can modify the evaluation by providing flags to the evaluation script. For instance:
+
+```bash
+python eval/service.py --parallel_runs 5 --parallel_evaluations 5 --max-steps 25 --start 0 --end 100 --model gpt-4o
+```
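
If you launch runs from a script, the same command can be assembled programmatically. The sketch below only builds the argument list and does not run anything. The flag spellings are copied verbatim from the example above, and `build_eval_command` is a hypothetical helper, not something shipped with the repository.

```python
import shlex
import sys


def build_eval_command(parallel_runs=5, parallel_evaluations=5, max_steps=25,
                       start=0, end=100, model="gpt-4o"):
    """Build the argument list for eval/service.py using the flags shown above.

    Hypothetical helper; defaults mirror the example command in this guide.
    """
    return [
        sys.executable, "eval/service.py",
        "--parallel_runs", str(parallel_runs),
        "--parallel_evaluations", str(parallel_evaluations),
        "--max-steps", str(max_steps),
        "--start", str(start),
        "--end", str(end),
        "--model", model,
    ]


if __name__ == "__main__":
    # Print the command rather than executing it; the list can be handed
    # to subprocess.run(...) from the repository root to actually start a run.
    print(shlex.join(build_eval_command(end=10)))
```

Keeping the command as a list avoids shell quoting issues if a model name ever contains special characters.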
+
+The evaluations webpage has a convenient GUI for generating these commands. To use it, navigate to https://browser-use.tools/dashboard.
+
+Then click the "New Eval Run" button on the left panel. This will open an interface with selectors, inputs, sliders, and switches.
+
+Input your desired configuration into the interface and copy the resulting Python command at the bottom. Then run this command as before.
--- a/browser-use/docs/development/local-setup.mdx
+++ b/browser-use/docs/development/local-setup.mdx
@ -0,0 +1,119 @@
+---
+title: "Local Setup"
+description: "Set up Browser Use development environment locally"
+icon: "laptop-code"
+---
+
+## Prerequisites
+
+Browser Use requires Python 3.11 or higher. We recommend using [uv](https://docs.astral.sh/uv/) for Python environment management.
+
+## Clone the Repository
+
+First, clone the Browser Use repository:
+
+```bash
+git clone https://github.com/browser-use/browser-use
+cd browser-use
+```
+
+## Environment Setup
+
+1. Create and activate a virtual environment:
+
+```bash
+uv venv --python 3.11
+source .venv/bin/activate
+```
+
+2. Install dependencies:
+
+```bash
+# Install the package in editable mode with all development dependencies
+uv sync --all-extras
+
+# Install the default browser
+playwright install chromium --with-deps --no-shell
+```
+
+## Configuration
+
+Set up your environment variables:
+
+```bash
+# Copy the example environment file
+cp .env.example .env
+```
+
+Or manually create a `.env` file and set the API keys for the models you want to use:
+
+```bash .env
+OPENAI_API_KEY=...
+ANTHROPIC_API_KEY=
+AZURE_ENDPOINT=
+AZURE_OPENAI_API_KEY=
+GOOGLE_API_KEY=
+DEEPSEEK_API_KEY=
+GROK_API_KEY=
+NOVITA_API_KEY=
+```
+
+<Note>
+  You can use any LLM supported by LangChain. See
+  [LangChain Models](/customize/supported-models) for available options and their specific
+  API key requirements.
+</Note>
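
To see which keys in a `.env` file actually carry a value, a few lines of standard-library Python are enough. This is a simplified stand-in for a real dotenv loader, written for illustration only: it handles plain `KEY=VALUE` lines and ignores `export` prefixes, multiline values, and inline comments. The function names are hypothetical.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and simple surrounding quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def keys_with_values(text):
    """Return the sorted names of keys whose value is non-empty."""
    return sorted(key for key, value in parse_env_text(text).items() if value)


if __name__ == "__main__":
    sample = "OPENAI_API_KEY=sk-example\nANTHROPIC_API_KEY=\n# comment\n"
    print(keys_with_values(sample))  # prints ['OPENAI_API_KEY']
```

Only the providers you intend to use need a value; empty entries such as `ANTHROPIC_API_KEY=` are reported as unset by this sketch.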
+
+## Development
+
+After setup, you can:
+
+- Try demos in the example library with `uv run examples/simple.py`
+- Run the linter/formatter with `uv run ruff format examples/some/file.py`
+- Run tests with `uv run pytest`
+- Build the package with `uv build`
+
+### Linting
+
+```bash
+# Run the linter on the whole project (must pass for PR to be allowed to merge)
+uv run pre-commit run --all-files
+
+# Install the linter & formatter pre-commit hooks to run automatically
+pre-commit install --install-hooks
+
+# Experimental: run the type checker
+uv run type
+```
+
+### Tests
+
+```bash
+# Run tests
+uv run pytest                                                                         # run everything
+uv run pytest tests/test_controller.py                                                # run a specific test file
+uv run pytest tests/test_sensitive_data.py tests/test_tab_management.py               # run two test files
+uv run pytest tests/test_tab_management.py::TestTabManagement::test_user_changes_tab  # run a single test
+```
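
The last command uses a pytest node ID, which joins the file path, the test class, and the test function with `::`. The sketch below illustrates that structure. `split_node_id` is a hypothetical helper for explanation, not a pytest API, and it does not account for parametrized IDs that contain `::` inside brackets.

```python
def split_node_id(node_id):
    """Split a pytest node ID into its file path and the nested test names."""
    path, *names = node_id.split("::")
    return path, names


if __name__ == "__main__":
    path, names = split_node_id(
        "tests/test_tab_management.py::TestTabManagement::test_user_changes_tab"
    )
    print(path)   # file that pytest collects
    print(names)  # class name followed by test function name
```

Dropping trailing components widens the selection: the path alone runs the whole file, and path plus class runs every test in that class.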
+
+### Build
+
+```bash
+uv build
+uv pip install dist/*.whl
+
+# push build to PyPI (automatically run by GitHub Actions CI)
+uv publish
+```
+
+## Getting Help
+
+If you run into any issues:
+
+1. Check our [GitHub Issues](https://github.com/browser-use/browser-use/issues)
+2. Join our [Discord community](https://link.browser-use.com/discord) for support
+
+<Note>
+  We welcome contributions! See our [Contribution Guide](/development/contribution-guide) for guidelines on how to help improve
+  Browser Use.
+</Note>
--- a/browser-use/docs/development/n8n-integration.mdx
+++ b/browser-use/docs/development/n8n-integration.mdx
@ -0,0 +1,122 @@
+---
+title: 'n8n Integration'
+description: 'Learn how to integrate Browser Use with n8n workflows'
+---
+
+# Browser Use n8n Integration
+
+Browser Use can be integrated with [n8n](https://n8n.io), a workflow automation platform, using our community node. This integration allows you to trigger browser automation tasks directly from your n8n workflows.
+
+## Installing the n8n Community Node
+
+There are several ways to install the Browser Use community node in n8n:
+
+### Using n8n Desktop or Cloud
+
+1. Navigate to **Settings > Community Nodes**
+2. Click on **Install**
+3. Enter `n8n-nodes-browser-use` in the **Name** field
+4. Click **Install**
+
+### Using a Self-hosted n8n Instance
+
+Run the following command in your n8n installation directory:
+
+```bash
+npm install n8n-nodes-browser-use
+```
+
+### For Development
+
+If you want to develop with the n8n node:
+
+1. Clone the repository:
+   ```bash
+   git clone https://github.com/draphonix/n8n-nodes-browser-use.git
+   ```
+2. Install dependencies:
+   ```bash
+   cd n8n-nodes-browser-use
+   npm install
+   ```
+3. Build the code:
+   ```bash
+   npm run build
+   ```
+4. Link to your n8n installation:
+   ```bash
+   npm link
+   ```
+5. In your n8n installation directory:
+   ```bash
+   npm link n8n-nodes-browser-use
+   ```
+
+## Setting Up Browser Use Cloud API Credentials
+
+To use the Browser Use node in n8n, you need to configure API credentials:
+
+1. Sign up for an account at [Browser Use Cloud](https://cloud.browser-use.com)
+2. Navigate to the Settings or API section
+3. Generate or copy your API key
+4. In n8n, create a new credential:
+   - Go to **Credentials** tab
+   - Click **Create New**
+   - Select **Browser Use Cloud API**
+   - Enter your API key
+   - Save the credential
+
+## Using the Browser Use Node
+
+Once installed, you can add the Browser Use node to your workflows:
+
+1. In your workflow editor, search for "Browser Use" in the nodes panel
+2. Add the node to your workflow
+3. Set up the credentials
+4. Choose your saved credentials
+5. Select an operation:
+   - **Run Task**: Execute a browser automation task with natural language instructions
+   - **Get Task**: Retrieve task details
+   - **Get Task Status**: Check task execution status
+   - **Pause/Resume/Stop Task**: Control running tasks
+   - **Get Task Media**: Retrieve screenshots, videos, or PDFs
+   - **List Tasks**: Get a list of tasks
+
+### Example: Running a Browser Task
+
+Here's a simple example of how to use the Browser Use node to run a browser task:
+
+1. Add the Browser Use node to your workflow
+2. Select the "Run Task" operation
+3. In the "Instructions" field, enter a natural language description of what you want the browser to do, for example:
+   ```
+   Go to example.com, take a screenshot of the homepage, and extract all the main heading texts
+   ```
+4. Optionally enable "Save Browser Data" to preserve cookies and session information
+5. Connect the node to subsequent nodes to process the results
+
+## Workflow Examples
+
+The Browser Use n8n node enables various automation scenarios:
+
+- **Web Scraping**: Extract data from websites on a schedule
+- **Form Filling**: Automate data entry across web applications
+- **Monitoring**: Check website status and capture visual evidence
+- **Report Generation**: Generate PDFs or screenshots of web dashboards
+- **Multi-step Processes**: Chain browser tasks together using session persistence
+
+## Troubleshooting
+
+If you encounter issues with the Browser Use node:
+
+- Verify your API key is valid and has sufficient credits
+- Check that your instructions are clear and specific
+- For complex tasks, consider breaking them into multiple steps
+- Refer to the [Browser Use documentation](https://docs.browser-use.com) for instruction best practices
+
+## Resources
+
+- [n8n Community Nodes Documentation](https://docs.n8n.io/integrations/community-nodes/)
+- [Browser Use Documentation](https://docs.browser-use.com)
+- [Browser Use Cloud](https://cloud.browser-use.com)
+- [n8n-nodes-browser-use GitHub Repository](https://github.com/draphonix/n8n-nodes-browser-use) 
--- a/browser-use/docs/development/observability.mdx
+++ b/browser-use/docs/development/observability.mdx
@ -0,0 +1,66 @@
+---
+title: "Observability"
+description: "Trace Browser Use's agent execution steps and browser sessions"
+icon: "eye"
+---
+
+## Overview
+
+Browser Use has a native integration with [Laminar](https://lmnr.ai), an open-source platform for tracing, evaluating, and labeling AI agents.
+Read more about Laminar in the [Laminar docs](https://docs.lmnr.ai).
+
+<Note>
+  Laminar excels at tracing browser agents by providing unified visibility into both browser session recordings and agent execution steps.
+</Note>
+
+## Setup
+
+To set up Laminar, install the `lmnr` package and set the `LMNR_PROJECT_API_KEY` environment variable.
+
+To get your project API key, you can either:
+- Register on [Laminar Cloud](https://lmnr.ai) and get the key from your project settings
+- Or spin up a local Laminar instance and get the key from the settings page
+
+```bash
+pip install 'lmnr[all]'
+export LMNR_PROJECT_API_KEY=<your-project-api-key>
+```
+
+## Usage
+
+Then initialize Laminar at the top of your project. Both Browser Use agent steps and browser session recordings will be traced automatically.
+
+```python {5-8}
+from langchain_openai import ChatOpenAI
+from browser_use import Agent
+import asyncio
+
+from lmnr import Laminar
+# this line auto-instruments Browser Use and any browser you use (local or remote)
+Laminar.initialize(project_api_key="...") # you can also pass project api key here
+
+async def main():
+    agent = Agent(
+        task="open google, search Laminar AI",
+        llm=ChatOpenAI(model="gpt-4o-mini"),
+    )
+    result = await agent.run()
+    print(result)
+
+asyncio.run(main())
+```
+
+## Viewing Traces
+
+You can view traces in the Laminar UI by going to the traces tab in your project.
+When you select a trace, you can see both the browser session recording and the agent execution steps.
+
+The timeline of the browser session is synced with the agent execution steps: timeline highlights indicate which step the agent is on at each point in the recording.
+In the trace view, you can also see the agent's current step, the tool it's using, and the tool's input and output. Tools are highlighted in the timeline with a yellow color.
+
+<img className="block" src="/images/laminar.png" alt="Laminar" />
+
+
+## Laminar
+
+To learn more about tracing and evaluating your browser agents, check out the [Laminar docs](https://docs.lmnr.ai).
--- a/browser-use/docs/development/roadmap.mdx
+++ b/browser-use/docs/development/roadmap.mdx
@ -0,0 +1,7 @@
+---
+title: "Roadmap"
+description: "Future plans and upcoming features for Browser Use"
+icon: "road"
+---
+
+Big things coming soon!
--- a/browser-use/docs/development/telemetry.mdx
+++ b/browser-use/docs/development/telemetry.mdx
@ -0,0 +1,39 @@
+---
+title: "Telemetry"
+description: "Understanding Browser Use's telemetry and privacy settings"
+icon: "chart-mixed"
+---
+
+## Overview
+
+Browser Use collects anonymous usage data to help us understand how the library is being used and to improve the user experience. It also helps us fix bugs faster and prioritize feature development.
+
+## Data Collection
+
+We use [PostHog](https://posthog.com) for telemetry collection. The data is completely anonymized and contains no personally identifiable information.
+
+<Note>
+  We never collect personal information, credentials, or specific content from
+  your browser automation tasks.
+</Note>
+
+## Opting Out
+
+You can disable telemetry by setting an environment variable:
+
+```bash .env
+ANONYMIZED_TELEMETRY=false
+```
+
+Or in your Python code:
+
+```python
+import os
+os.environ["ANONYMIZED_TELEMETRY"] = "false"
+```
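
Environment variables are always strings, so a flag like this has to be interpreted by whatever code reads it. The sketch below shows one way such a value could be read. It is an illustration only and is not the library's actual parsing logic; `telemetry_enabled` is a hypothetical helper, and only the documented value `false` is treated as an opt-out here.

```python
import os


def telemetry_enabled(env=None, default=True):
    """Interpret ANONYMIZED_TELEMETRY as a boolean; unset falls back to `default`.

    Hypothetical helper: only the documented value "false" disables telemetry.
    """
    env = os.environ if env is None else env
    raw = env.get("ANONYMIZED_TELEMETRY")
    if raw is None:
        return default
    return raw.strip().lower() != "false"


if __name__ == "__main__":
    os.environ["ANONYMIZED_TELEMETRY"] = "false"
    print(telemetry_enabled())  # prints False
```

A flag set from Python only takes effect for code that reads the environment afterwards, so setting it as early as possible in your program is the safer choice.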
+
+<Note>
+  Even when enabled, telemetry has zero impact on the library's performance or
+  functionality. Code is available in [Telemetry
+  Service](https://github.com/browser-use/browser-use/tree/main/browser_use/telemetry).
+</Note>