browser-use-oauth/.github/instructions/browser-settings.instructions.md at bcca364021fbe54f2404de9664072c23ecd78aa0

mirror of https://github.com/j93es/browser-use-oauth.git synced 2026-06-04 06:41:53 +09:00

imnyang 638a3d47ce Add comprehensive documentation for Browser Use features

- Introduced custom output format instructions with example code.
- Detailed connection methods for launching and connecting to browsers, including local and remote options.
- Provided guidelines for handling sensitive data securely, including best practices and examples.
- Documented supported LangChain chat models with setup instructions and environment variable requirements.
- Added instructions for customizing the system prompt to control agent behavior.

2025-06-21 16:18:16 +09:00


description	applyTo
Launch or connect to an existing browser and configure it to your needs.	**

Browser Use uses playwright (or patchright) to manage its connection with a real browser.

To launch or connect to a browser, pass any playwright / browser-use configuration arguments you want to BrowserSession(...):

from browser_use import BrowserSession, Agent

browser_session = BrowserSession(
    headless=True,
    viewport={'width': 964, 'height': 647},
    user_data_dir='~/.config/browseruse/profiles/default',
)
agent = Agent('fill out the form on this page', browser_session=browser_session)

The new `BrowserSession` & `BrowserProfile` accept all the same arguments that Playwright's [`launch_persistent_context(...)`](https://playwright.dev/python/docs/api/class-browsertype#browser-type-launch-persistent-context) takes, giving you full control over browser settings at launch. (see below for the full list)

`BrowserSession`

🎭 BrowserSession(**params) is Browser Use's object that tracks a playwright connection to a running browser. It sets up:
- the playwright library, browser and/or browser_context, and page objects and tracks which tabs the agent & human are focused on
- methods to interact with the browser window, apply config needed by the Agent, and run the DOMService for element detection
- it can take a browser_profile=BrowserProfile(...) template containing some config defaults, and **kwargs session-specific config overrides

Browser Connection Parameters

Provide any one of these options to connect to an existing browser. These options are session-specific and cannot be stored in a BrowserProfile(...) template.

`wss_url`

wss_url: str | None = None

WSS URL of the playwright-protocol browser server to connect to. See here for WSS connection instructions.

`cdp_url`

cdp_url: str | None = None

CDP URL of the browser to connect to (e.g. http://localhost:9222). See here for CDP connection instructions.

`browser_pid`

browser_pid: int | None = None

PID of a running chromium-based browser process to connect to on localhost. See here for connection via pid instructions.

For web scraping tasks on sites that restrict automated access, we recommend using [our cloud](https://browser-use.com) or an external browser provider for better reliability. See the [Connect to your Browser](real-browser) guide for detailed connection instructions.
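Since the three connection options are alternatives, a small guard on the caller's side can catch accidental combinations before a session is created. The sketch below is only an illustration: `describe_connection` is a hypothetical helper, not part of Browser Use, and rejecting multiple options is an assumption made here for clarity.

```python
from typing import Optional


def describe_connection(
    wss_url: Optional[str] = None,
    cdp_url: Optional[str] = None,
    browser_pid: Optional[int] = None,
) -> str:
    """Hypothetical pre-check: report which connection option would be used."""
    candidates = (("wss_url", wss_url), ("cdp_url", cdp_url), ("browser_pid", browser_pid))
    provided = {name: value for name, value in candidates if value is not None}
    if len(provided) > 1:
        # Assumption of this sketch: treat more than one option as a caller mistake.
        raise ValueError(f"Provide only one of: {', '.join(provided)}")
    if not provided:
        return "launch a new browser"
    name, value = next(iter(provided.items()))
    return f"connect via {name}={value!r}"


print(describe_connection(cdp_url="http://localhost:9222"))
print(describe_connection())
```

Running the check first keeps configuration mistakes out of the browser start-up path, where they are harder to diagnose.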

Session-Specific Parameters

`browser_profile`

browser_profile: BrowserProfile = BrowserProfile()

Optional BrowserProfile template containing default config to use for the BrowserSession. (see below for more info)

`playwright`

playwright: Playwright | None = None

Optional playwright or patchright API client handle to use, the result of (await async_playwright().start()) or (await async_patchright().start()), which spawns a node.js child subprocess that relays commands to the browser over CDP.

See here for more detailed usage instructions.

`browser`

browser: Browser | None = None

Playwright Browser object to use (optional). See here for more detailed usage instructions.

`browser_context`

browser_context: BrowserContext | None = None

Playwright BrowserContext object to use (optional). See here for more detailed usage instructions.

`page` aka `agent_current_page`

page: Page | None = None

Foreground Page that the agent is focused on, can also be passed as page=... as a shortcut. See here for more detailed usage instructions.

`human_current_page`

human_current_page: Page | None = None

Foreground Page that the human is focused on to start, not necessary to set manually.

`initialized`

initialized: bool = False

Mark BrowserSession as already initialized, skips launch/connection (not recommended)

`**kwargs`

BrowserSession can also accept all of the parameters below. (the parameters above this point are specific to BrowserSession and cannot be stored in a BrowserProfile template)

Extra **kwargs passed to BrowserSession(...) act as session-specific overrides to the BrowserProfile(...) template.

base_iphone13 = BrowserProfile(
    storage_state='/tmp/auth.json',     # share cookies between parallel browsers
    **playwright.devices['iPhone 13'],
    timezone_id='UTC',
)
usa_phone = BrowserSession(
    browser_profile=base_iphone13,
    timezone_id='America/New_York',     # kwargs override values in base_iphone13
)
eu_phone = BrowserSession(
    browser_profile=base_iphone13,
    timezone_id='Europe/Paris',
)

usa_agent = Agent(task="show me today's schedule...", browser_session=usa_phone)
eu_agent = Agent(task="show me today's schedule...", browser_session=eu_phone)
await asyncio.gather(usa_agent.run(), eu_agent.run())
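The override rule can be pictured as a plain dictionary merge in which session kwargs win over the profile's defaults. This is only a sketch of the precedence, not Browser Use's actual merging code; `resolve_config` and the sample values are hypothetical.

```python
def resolve_config(profile_defaults: dict, **session_overrides) -> dict:
    """Return the effective config: session kwargs win over profile defaults."""
    # Later keys win in a dict merge, so overrides replace template values.
    return {**profile_defaults, **session_overrides}


base_phone = {"storage_state": "/tmp/auth.json", "timezone_id": "UTC"}
usa_config = resolve_config(base_phone, timezone_id="America/New_York")
eu_config = resolve_config(base_phone, timezone_id="Europe/Paris")

print(usa_config["timezone_id"])  # overridden per session
print(base_phone["timezone_id"])  # the shared template itself is unchanged
```

Because the merge builds a new dict, the shared template can be reused safely for any number of sessions.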

`BrowserProfile`

A BrowserProfile is a 📋 config template for a 🎭 BrowserSession(...).

It's basically just a typed + validated version of a dict to hold config.

When you find yourself storing or re-using many browser configs, you can upgrade from:

- config = {key: val, key: val, ...}
- BrowserSession(**config)

To this instead:

+ config = BrowserProfile(key=val, key=val, ...)
+ BrowserSession(browser_profile=config)

You don't ever *need* to use a `BrowserProfile`, you can always pass config parameters directly to `BrowserSession`:

```python
session = BrowserSession(headless=True, storage_state='auth.json', viewport={...}, ...)
```

BrowserProfile is optional, but it provides a number of benefits over a normal dict for holding config:

- has type hints and pydantic field descriptions that show up in your IDE
- validates config at runtime quickly without having to start a browser
- provides helper methods to autodetect screen size, set up local paths, save/load config as json, and more...

`BrowserProfile`s are designed to easily be given 🆔 `uuid`s, put in a database, and made editable by users. `BrowserSession`s get their own 🆔 `uuid`s and can be linked by 🖇 foreign key to whatever `BrowserProfile` they use.

This cleanly separates the per-connection rows from the bulky re-usable config and avoids wasting space in your db. This is useful because a user may only have 2 or 3 profiles, but they could have 100k+ sessions within a few months.
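A minimal sketch of that layout using Python's built-in `sqlite3` module is shown here. The table and column names are hypothetical, chosen only to illustrate the profile/session split; they are not a schema that Browser Use ships.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the profile -> session link
conn.executescript(
    """
    CREATE TABLE browser_profiles (
        id TEXT PRIMARY KEY,
        config_json TEXT NOT NULL          -- the bulky, reusable config
    );
    CREATE TABLE browser_sessions (
        id TEXT PRIMARY KEY,
        profile_id TEXT NOT NULL REFERENCES browser_profiles(id),
        started_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    """
)

# One profile row holds the config once...
profile_id = str(uuid.uuid4())
conn.execute(
    "INSERT INTO browser_profiles (id, config_json) VALUES (?, ?)",
    (profile_id, json.dumps({"headless": True, "timezone_id": "UTC"})),
)

# ...while each session row only stores its own id and a foreign key.
for _ in range(3):
    conn.execute(
        "INSERT INTO browser_sessions (id, profile_id) VALUES (?, ?)",
        (str(uuid.uuid4()), profile_id),
    )

session_count = conn.execute(
    "SELECT COUNT(*) FROM browser_sessions WHERE profile_id = ?", (profile_id,)
).fetchone()[0]
print(session_count)
```

The session rows stay small no matter how large the stored config grows, which is the space saving described above.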

BrowserProfile and BrowserSession can both take any of the:

- Playwright parameters
- Browser-Use parameters (extra options we provide on top of playwright)

The only parameters BrowserProfile can NOT take are the session-specific connection parameters and live playwright objects:
cdp_url, wss_url, browser_pid, page, browser, browser_context, playwright, etc.

Basic Example

from browser_use import BrowserProfile, BrowserSession

profile = BrowserProfile(
    stealth=True,
    storage_state='/tmp/google_docs_cookies.json',
    allowed_domains=['docs.google.com', 'https://accounts.google.com'],
    viewport={'width': 396, 'height': 774},
    # ... playwright args / browser-use config args ...
)

phone1 = BrowserSession(browser_profile=profile, device_scale_factor=1)
phone2 = BrowserSession(browser_profile=profile, device_scale_factor=2)
phone3 = BrowserSession(browser_profile=profile, device_scale_factor=3)

Browser-Use Parameters

These parameters control Browser Use-specific features, and are outside the standard playwright set. They can be passed to BrowserSession(...) and/or stored in a BrowserProfile template.

`keep_alive`

keep_alive: bool | None = None

If True, it won't close the browser after the first agent.run() ends. Useful for running multiple tasks with the same browser instance. If this is left as None and the Agent launched its own browser, the default is to close the browser after the agent completes. If the agent connected to an existing browser, it will leave it open.
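The three cases can be summarized as a small decision function. This is a sketch of the rule as described here, not the library's code; `should_close_browser` is a hypothetical name, and treating an explicit `keep_alive=False` as "always close" is an assumption.

```python
from typing import Optional


def should_close_browser(keep_alive: Optional[bool], launched_by_agent: bool) -> bool:
    """Decide whether the browser is closed once agent.run() finishes."""
    if keep_alive is None:
        # Default: close only browsers the agent launched itself.
        return launched_by_agent
    # An explicit setting wins (assumption: False means always close).
    return not keep_alive


print(should_close_browser(None, launched_by_agent=True))   # agent-owned browser
print(should_close_browser(None, launched_by_agent=False))  # pre-existing browser
print(should_close_browser(True, launched_by_agent=True))   # kept for reuse
```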

`stealth`

stealth: bool = False

Set to True to use patchright to avoid bot-blocking. (Might cause issues with some sites, requires manual testing.)

`allowed_domains`

allowed_domains: list[str] | None = None

List of allowed domains for navigation. If None, all domains are allowed. Example: ['google.com', '*.wikipedia.org'] - Here the agent will only be able to access google.com exactly and wikipedia.org + *.wikipedia.org.

Glob patterns are supported:

- ['example.com'] ✅ will match only https://example.com/* exactly; subdomains will not be allowed. It is always most secure to explicitly list every domain you want to give access to, with schemes, e.g. ['https://google.com', 'http*://www.google.com', 'https://myaccount.google.com', 'https://mail.google.com', 'https://docs.google.com']
- ['*.example.com'] ⚠️ CAUTION: this will match https://example.com and all of its subdomains. Make sure all the subdomains are safe for the agent: abc.example.com, def.example.com, ..., useruploads.example.com, admin.example.com
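The matching rules above can be approximated with the standard library's `fnmatch`. The sketch below mirrors the behaviour described in this section; it is not Browser Use's own matcher. The names `url_matches_pattern` and `is_url_allowed` are hypothetical, and treating a scheme-less pattern as https-only is an assumption drawn from the first example.

```python
from fnmatch import fnmatch
from typing import Optional
from urllib.parse import urlparse


def url_matches_pattern(url: str, pattern: str) -> bool:
    """Check one URL against one allowed_domains-style glob pattern."""
    parsed = urlparse(url)
    scheme = parsed.scheme.lower()
    host = (parsed.hostname or "").lower()
    # Assumption: a pattern without a scheme only matches https URLs.
    if "://" in pattern:
        pattern_scheme, pattern_host = pattern.split("://", 1)
    else:
        pattern_scheme, pattern_host = "https", pattern
    pattern_host = pattern_host.lower()
    if not fnmatch(scheme, pattern_scheme.lower()):
        return False
    # '*.example.com' also covers the bare parent domain 'example.com'.
    if pattern_host.startswith("*.") and host == pattern_host[2:]:
        return True
    return fnmatch(host, pattern_host)


def is_url_allowed(url: str, allowed_domains: Optional[list]) -> bool:
    """None means every domain is allowed; otherwise any pattern may match."""
    if allowed_domains is None:
        return True
    return any(url_matches_pattern(url, pattern) for pattern in allowed_domains)


print(is_url_allowed("https://sub.example.com/", ["example.com"]))      # exact host only
print(is_url_allowed("https://admin.example.com/", ["*.example.com"]))  # wildcard subdomain
```

Trying candidate URLs against a pattern list like this is a cheap way to see how wide a wildcard really is before handing it to an agent.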

`disable_security`

disable_security: bool = False

Completely disables all basic browser security features. Allows interacting across cross-site iframe boundaries, but at a severe security cost.

This option is very INSECURE and is only for niche use cases. DO NOT LET YOUR AGENT visit untrusted URLs or give it real cookies when `disable_security=True`. Visiting a single malicious site in this mode can trivially compromise *all* the cookies in the browser profile in under 1 second.

`deterministic_rendering`

deterministic_rendering: bool = False

Attempts to force more deterministic rendering for consistent screenshots across different host operating systems and hardware.

Disables OS-specific font hints, aliasing, GPU-accelerated rendering, normalizes DPI, and sets a specific JS random seed to try to avoid nondeterministic JS.

This flag is for niche use cases (e.g. screenshot diffing) where pixel-perfect rendering across different server operating systems is more important than stability. It makes the agent more likely to be blocked as a bot and occasionally triggers glitchy behavior in Chrome, so it's not recommended unless you know you need it.

`highlight_elements`

highlight_elements: bool = True

Highlight interactive elements on the screen with colorful bounding boxes.

`viewport_expansion`

viewport_expansion: int = 500

Viewport expansion in pixels. With this you can control how much of the page is included in the context of the LLM:

- -1: All elements from the entire page will be included, regardless of visibility (highest token usage but most complete).
- 0: Only elements which are currently visible in the viewport will be included.
- 500 (default): Elements in the viewport plus an additional 500 pixels in each direction will be included, providing a balance between context and token usage.
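To make the three settings concrete, here is a simplified, vertical-only sketch of the inclusion test. It assumes elements and the viewport are described by pixel offsets from the top of the page; `is_element_in_context` is a hypothetical helper, not the DOM service's real logic.

```python
def is_element_in_context(
    element_top: int,
    element_bottom: int,
    viewport_top: int,
    viewport_height: int,
    viewport_expansion: int = 500,
) -> bool:
    """Return True if the element would be included for the LLM (vertical axis only)."""
    if viewport_expansion == -1:
        return True  # the whole page is included
    # Grow the visible window by the expansion above and below the viewport.
    window_top = viewport_top - viewport_expansion
    window_bottom = viewport_top + viewport_height + viewport_expansion
    return element_bottom >= window_top and element_top <= window_bottom


# An element 200px below an 800px-tall viewport scrolled to the top of the page:
print(is_element_in_context(1000, 1050, 0, 800, viewport_expansion=0))
print(is_element_in_context(1000, 1050, 0, 800, viewport_expansion=500))
```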

`include_dynamic_attributes`

include_dynamic_attributes: bool = True

Include dynamic attributes in selectors for better element targeting.

`minimum_wait_page_load_time`

minimum_wait_page_load_time: float = 0.25

Minimum time to wait before capturing page state for LLM input.

`wait_for_network_idle_page_load_time`

wait_for_network_idle_page_load_time: float = 0.5

Time to wait for network activity to cease. Increase to 3-5s for slower websites. This tracks essential content loading, not dynamic elements like videos.

`maximum_wait_page_load_time`

maximum_wait_page_load_time: float = 5.0

Maximum time to wait for page load before proceeding.
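One way to read the three page-load settings together is as a floor, an idle window, and a cap. The sketch below models that reading on a list of request timestamps; it is a simplified mental model under that assumption, not the library's actual wait loop, and `capture_time` is a hypothetical name.

```python
def capture_time(
    request_times: list,
    minimum_wait: float = 0.25,
    network_idle_wait: float = 0.5,
    maximum_wait: float = 5.0,
) -> float:
    """Seconds after navigation at which page state would be captured (simplified)."""
    last_activity = 0.0
    for t in sorted(request_times):
        if t > last_activity + network_idle_wait:
            break  # the network already went quiet before this request
        last_activity = t
    idle_reached = last_activity + network_idle_wait
    # Never capture before the minimum, never wait past the maximum.
    return min(max(minimum_wait, idle_reached), maximum_wait)


print(capture_time([0.0, 0.2, 0.4]))                  # settles shortly after the last request
print(capture_time([i * 0.4 for i in range(16)]))     # a chatty page hits the cap
```

Under this model, raising `wait_for_network_idle_page_load_time` delays capture on slow sites, while `maximum_wait_page_load_time` bounds the delay on pages that never go quiet.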

`wait_between_actions`

wait_between_actions: float = 0.5

Time to wait between agent actions.

`cookies_file`

cookies_file: str | None = None

JSON file path to save cookies to.

This option is DEPRECATED. Use [`storage_state`](#storage-state) instead, it's the standard playwright format and also supports `localStorage` and `indexedDB`!

The library will automatically save a new storage_state.json next to any cookies_file path you provide. To switch to the new format, just use `storage_state='path/to/storage_state.json'`:

cookies_file.json: [{cookie}, {cookie}, {cookie}]
⬇️ storage_state.json: {"cookies": [{cookie}, {cookie}, {cookie}], "origins": [... optional localStorage state ...]}

Or run playwright open https://example.com/ --save-storage=storage_state.json and log into any sites you need to generate a fresh storage state file.
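If you would rather convert an old cookie file by hand, the change in shape is small. The sketch below is a minimal illustration, assuming the legacy file is a plain JSON list of cookie objects; the helper names and the sample cookie are hypothetical, and it does not replace the automatic migration described above.

```python
import json
import tempfile
from pathlib import Path


def cookies_to_storage_state(cookies: list) -> dict:
    """Wrap a legacy cookie list in the storage_state layout."""
    # storage_state keeps cookies plus a list of per-origin localStorage
    # entries; a converted cookie file has no origins yet.
    return {"cookies": list(cookies), "origins": []}


def convert_cookies_file(cookies_path: Path, storage_state_path: Path) -> dict:
    """Read a legacy cookies file and write the storage_state equivalent."""
    cookies = json.loads(cookies_path.read_text(encoding="utf-8"))
    state = cookies_to_storage_state(cookies)
    storage_state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return state


# Demo in a throwaway directory so no real profile data is touched.
with tempfile.TemporaryDirectory() as tmp:
    legacy = Path(tmp) / "cookies_file.json"
    sample = [{"name": "sid", "value": "abc", "domain": "example.com", "path": "/"}]
    legacy.write_text(json.dumps(sample), encoding="utf-8")
    target = Path(tmp) / "storage_state.json"
    converted = convert_cookies_file(legacy, target)
    reloaded = json.loads(target.read_text(encoding="utf-8"))

print(sorted(converted))
```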

`profile_directory`

profile_directory: str = 'Default'

Chrome profile subdirectory name inside of your user_data_dir (e.g. Default, Profile 1, Work, etc.). No need to set this unless you have multiple profiles set up in a single user_data_dir and need to use a specific one.

`window_position`

window_position: dict | None = {"width": 0, "height": 0}

Window position from top-left.

Playwright Launch Options

All the parameters below are standard playwright parameters and can be passed to both BrowserSession and BrowserProfile. They are defined in browser_use/browser/profile.py. See here for the official Playwright documentation for all of these options.

`headless`

headless: bool | None = None

Runs the browser without a visible UI. If None, auto-detects based on display availability. If you set headless=False on a server with no monitor attached, the browser will fail to launch (use xvfb + vnc to give a headless server a virtual display you can remote control).

headless=False is recommended for maximum stealth and is required for human-in-the-loop workflows.

`channel`

channel: BrowserChannel = 'chromium'

Browser channel: 'chromium' (default when stealth=False), 'chrome' (default when stealth=True), 'chrome-beta', 'chrome-dev', 'chrome-canary', 'msedge', 'msedge-beta', 'msedge-dev', 'msedge-canary'

Don't worry, other chromium-based browsers not in this list (e.g. brave) are still supported if you provide your own executable_path, just set it to chromium for those.

`executable_path`

executable_path: str | Path | None = None

Path to browser executable for custom installations.

`user_data_dir`

user_data_dir: str | Path | None = '~/.config/browseruse/profiles/default'

Directory for browser profile data. Set to None to use an ephemeral temporary profile (aka incognito mode).

Multiple running browsers cannot share a single user_data_dir at the same time. You must set it to None or provide a unique user_data_dir per-session if you plan to run multiple browsers.

The browser version run must always be equal to or greater than the version used to create the user_data_dir. If you see errors like Failed to parse Extensions or similar and failures when launching, you're attempting to run an older browser with an incompatible user_data_dir that's already been migrated to a newer schema version.
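When several browsers need persistent (non-ephemeral) profiles at the same time, one simple approach is to create a fresh directory per session. The sketch below uses only the standard library; `make_unique_user_data_dir` and its prefix are hypothetical, and the resulting path would then be passed as `user_data_dir`.

```python
import shutil
import tempfile
from pathlib import Path


def make_unique_user_data_dir(prefix: str = "browseruse-profile-") -> Path:
    """Create and return a new, empty directory that no other session uses."""
    return Path(tempfile.mkdtemp(prefix=prefix))


dir_a = make_unique_user_data_dir()
dir_b = make_unique_user_data_dir()
print(dir_a != dir_b)  # each session gets its own directory

# Remove the demo directories; real sessions would keep theirs while running.
shutil.rmtree(dir_a)
shutil.rmtree(dir_b)
```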

`args`

args: list[str] = []

Additional command-line arguments to pass to the browser. See here for the full list of available chrome launch options.

`ignore_default_args`

ignore_default_args: list[str] | bool = ['--enable-automation', '--disable-extensions']

List of default CLI args to stop playwright from including when launching chrome. Set it to True to disable all default options (not recommended).

`env`

env: dict[str, str] = {}

Extra environment variables to set when launching the browser, e.g. {'DISPLAY': ':1'} to use a specific X11 display.

`chromium_sandbox`

chromium_sandbox: bool = not IN_DOCKER

Whether to enable Chromium sandboxing (recommended for security). Should always be False when running inside Docker, because Docker provides its own sandboxing, which can conflict with Chrome's.

`devtools`

devtools: bool = False

Whether to open DevTools panel automatically (only works when headless=False).

`slow_mo`

slow_mo: float = 0

Slow down actions by this many milliseconds.

`timeout`

timeout: float = 30000

Default timeout in milliseconds for connecting to a remote browser.

`accept_downloads`

accept_downloads: bool = True

Whether to automatically accept all downloads.

`proxy`

proxy: dict | None = None

Proxy settings. Example: {"server": "http://proxy.com:8080", "username": "user", "password": "pass"}.

`permissions`

permissions: list[str] = ['clipboard-read', 'clipboard-write', 'notifications']

Browser permissions to grant. See here for the full list of available permissions.

`storage_state`

storage_state: str | Path | dict | None = None

Browser storage state (cookies, localStorage). Can be file path or dict. See here for the Playwright storage_state documentation on how to use it. This option is only applied when launching a new browser using the default builtin playwright chromium and user_data_dir=None is set.

# to create a storage state file, run the following and log into the sites you need once the browser opens:
playwright open https://example.com/ --save-storage=./storage_state.json
# then setup a BrowserSession with storage_state='./storage_state.json' and user_data_dir=None to use it

Playwright Timing Settings

These control how the browser waits for CDP API calls to complete and pages to load.

`default_timeout`

default_timeout: float | None = None

Default timeout for Playwright operations in milliseconds.

`default_navigation_timeout`

default_navigation_timeout: float | None = None

Default timeout for page navigation in milliseconds.

Playwright Viewport Options

Configure browser window size, viewport, and display properties:

`user_agent`

user_agent: str | None = None

Specific user agent to use in this context.

`is_mobile`

is_mobile: bool = False

Whether the meta viewport tag is taken into account and touch events are enabled.

`has_touch`

has_touch: bool = False

Specifies if viewport supports touch events.

`geolocation`

geolocation: dict | None = None

Geolocation coordinates. Example: {"latitude": 59.95, "longitude": 30.31667}

`locale`

locale: str | None = None

Specify user locale, for example en-GB, de-DE, etc. Locale will affect the navigator.language value, Accept-Language request header value as well as number and date formatting rules.

`timezone_id`

timezone_id: str | None = None

Timezone identifier (e.g., 'America/New_York').

`window_size`

window_size: dict | None = None

Browser window size for headful mode. Example: {"width": 1920, "height": 1080}

`viewport`

viewport: dict | None = None

Viewport size with width and height. Example: {"width": 1280, "height": 720}

`no_viewport`

no_viewport: bool | None = not headless

Disable fixed viewport. Content will resize with window.

Tip: don't use this parameter, it's a playwright standard parameter but it's redundant and only serves to override the viewport setting above. A viewport is always used in headless mode regardless of this setting, and is never used in headful mode unless you pass viewport={width, height} explicitly.

`device_scale_factor`

device_scale_factor: float | None = None

Device scale factor (DPI). Useful for high-resolution screenshots (set it to 2).

`screen`

screen: dict | None = None

Screen size available to browser. Auto-detected if not specified.

`color_scheme`

color_scheme: ColorScheme = 'light'

Preferred color scheme: 'light', 'dark', 'no-preference'

`contrast`

contrast: Contrast = 'no-preference'

Contrast preference: 'no-preference', 'more', 'null'

`reduced_motion`

reduced_motion: ReducedMotion = 'no-preference'

Reduced motion preference: 'reduce', 'no-preference', 'null'

`forced_colors`

forced_colors: ForcedColors = 'none'

Forced colors mode: 'active', 'none', 'null'

`**playwright.devices[...]`

Playwright provides launch & context arg presets to emulate common device fingerprints.

BrowserProfile(
    ...
    **playwright.devices['iPhone 13'],    # playwright = await async_playwright().start()
)

Because BrowserSession and BrowserProfile take all the standard playwright args, we are able to support these device presets as well.

Playwright Security Options

See allowed_domains above too!

`offline`

offline: bool = False

Emulate network being offline.

`http_credentials`

http_credentials: dict | None = None

Credentials for HTTP authentication.

`extra_http_headers`

extra_http_headers: dict[str, str] = {}

Additional HTTP headers to be sent with every request.

`ignore_https_errors`

ignore_https_errors: bool = False

Whether to ignore HTTPS errors when sending network requests.

`bypass_csp`

bypass_csp: bool = False

Toggles bypassing Content-Security-Policy.

`java_script_enabled`

java_script_enabled: bool = True

Whether or not to enable JavaScript in the context.

`service_workers`

service_workers: ServiceWorkers = 'allow'

Whether to allow sites to register Service workers: 'allow', 'block'

`base_url`

base_url: str | None = None

Base URL to be used in page.goto() and similar operations.

`strict_selectors`

strict_selectors: bool = False

If true, selector passed to Playwright methods will throw if more than one element matches.

`client_certificates`

client_certificates: list[ClientCertificate] = []

Client certificates to be used with requests.

Playwright Recording Options

Note: Browser Use also provides some of our own recording-related options not listed below (see above).

`record_video_dir`

record_video_dir: str | Path | None = None

Directory to save .webm video recordings. Playwright Docs: record_video_dir

This parameter also has an alias `save_recording_path` for backwards compatibility with past versions, but we recommend using the standard Playwright name `record_video_dir` going forward.

`record_video_size`

record_video_size: dict | None = None

[Playwright Docs: `record_video_size`](https://playwright.dev/python/docs/api/class-browsertype#browser-type-launch-persistent-context-option-record-video-size)

Video size. Example: {"width": 1280, "height": 720}

`record_har_path`

record_har_path: str | Path | None = None

Path to save .har network trace files. Playwright Docs: record_har_path

This parameter also has an alias `save_har_path` for backwards compatibility with past versions, but we recommend using the standard Playwright name `record_har_path` going forward.

`record_har_content`

record_har_content: RecordHarContent = 'embed'

How to persist HAR content: 'omit', 'embed', 'attach'

`record_har_mode`

record_har_mode: RecordHarMode = 'full'

HAR recording mode: 'full', 'minimal'

`record_har_omit_content`

record_har_omit_content: bool = False

Whether to omit request content from the HAR.

`record_har_url_filter`

record_har_url_filter: str | Pattern | None = None

URL filter for HAR recording.

`downloads_path`

downloads_path: str | Path | None = '~/.config/browseruse/downloads'

(aliases: downloads_dir, save_downloads_path)

Local filesystem directory to save browser file downloads to.

`traces_dir`

traces_dir: str | Path | None = None

Directory to save all-in-one trace files. Files are automatically named as {traces_dir}/{context_id}.zip. Playwright Docs: traces_dir

This parameter also has an alias `trace_path` for backwards compatibility with past versions, but we recommend using the standard Playwright name `traces_dir` going forward.

`handle_sighup`

handle_sighup: bool = True

Whether playwright should swallow SIGHUP signals and kill the browser.

`handle_sigint`

handle_sigint: bool = False

Whether playwright should swallow SIGINT signals and kill the browser.

`handle_sigterm`

handle_sigterm: bool = False

Whether playwright should swallow SIGTERM signals and kill the browser.

Full Example

from browser_use import BrowserSession, BrowserProfile, Agent

browser_profile = BrowserProfile(
    headless=False,
    storage_state="path/to/storage_state.json",
    wait_for_network_idle_page_load_time=3.0,
    viewport={"width": 1280, "height": 1100},
    locale='en-US',
    user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36',
    highlight_elements=True,
    viewport_expansion=500,
    allowed_domains=['*.google.com', 'http*://*.wikipedia.org'],
    user_data_dir=None,
)

browser_session = BrowserSession(
    browser_profile=browser_profile,
    headless=True,                          # extra kwargs to the session override the defaults in the profile
)

# you can drive a session without the agent / reuse it between agents
await browser_session.start()
page = await browser_session.get_current_page()
await page.goto('https://example.com/first/page')

async def run_search():
    agent = Agent(
        task='Your task',
        llm=llm,                          # any supported chat model instance, created elsewhere
        page=page,                        # optional: pass a specific playwright page to start on
        browser_session=browser_session,  # optional: pass an existing browser session to an agent
    )
    await agent.run()

await run_search()

Summary

- BrowserSession (defined in browser_use/browser/session.py) handles the live browser connection and runtime state
- BrowserProfile (defined in browser_use/browser/profile.py) is a template that can store default config parameters for a BrowserSession(...)

Configuration parameters defined in both scopes are consumed by the following calls, depending on whether we're connecting or launching:

- BrowserConnectArgs - args for playwright.BrowserType.connect_over_cdp(...)
- BrowserLaunchArgs - args for playwright.BrowserType.launch(...)
- BrowserNewContextArgs - args for playwright.Browser.new_context(...)
- BrowserLaunchPersistentContextArgs - args for playwright.BrowserType.launch_persistent_context(...)
- Browser Use's own internal methods

For more details on Playwright's browser context options, see their launch args documentation.
