How Playwright Powers Browser Automation: A Deep Dive

7 min read · Dec 21, 2024

Playwright, a Web UI automation testing framework developed by Microsoft, is engineered to provide a cross-platform, cross-language, and cross-browser automation testing solution, with additional support for mobile browsers.

As stated on its official homepage:

Playwright incorporates automatic waiting mechanisms, intelligent assertions for page elements, and execution tracing capabilities. These features collectively enhance its robustness in managing the inherent instability of web pages.
It operates browsers in a separate process from the test execution process. This design circumvents the limitations of in-process test runners and facilitates Shadow DOM penetration.
Each test is executed within its own browser context, effectively functioning as a new browser profile. This ensures complete isolation with minimal overhead, and the creation of a new browser context occurs within milliseconds.
Playwright also offers valuable utilities such as code generation, step-by-step debugging, and a trace viewer.

Playwright vs. Selenium vs. Cypress

Among the Web UI automation testing frameworks currently available, the leading contenders are the long-established Selenium, the increasingly popular Cypress, and the subject of this article, Playwright. The following comparison is provided for reference:

Languages Supported: Playwright supports JavaScript, Java, C#, and Python. Selenium supports a wider range, including Ruby. Cypress is limited to JavaScript/TypeScript.
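Browsers: Playwright drives Chromium (including Chrome and Edge), Firefox, and WebKit, the engine behind Safari. Selenium drives Chrome, Edge, Firefox, and Safari, and also supports Internet Explorer, which is becoming outdated. Cypress supports Chrome-family browsers, Edge, and Firefox, with WebKit support being experimental.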
Usability: Playwright and Cypress are user-friendly and easy to set up, while Selenium has a more complex setup and steeper learning curve.
Code Complexity: Playwright and Cypress tests tend to be simple to write, whereas Selenium code is moderately complex.
DOM Manipulation: Playwright and Cypress handle DOM manipulation simply, while Selenium is more complex in this regard.
Community Maturity: Selenium has the most mature community, followed by Cypress. Playwright is growing but still relatively new.
Headless Mode: All three frameworks support headless mode.
Concurrency: Playwright and Selenium support concurrency. Cypress depends on CI/CD tools for parallelism.
iframe Support: All three frameworks support iframes, though Cypress requires plugins for full support.
Driver: Playwright and Cypress do not require a browser-specific driver, unlike Selenium.
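Multi-Tab Operations: Playwright supports multi-tab operations natively. Selenium can switch between tabs and windows through window handles, while Cypress does not support controlling multiple tabs.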
Drag and Drop: All three frameworks support drag-and-drop functionality.
Built-in Reporting: Playwright and Cypress have built-in reporting, while Selenium does not.
Cross-Origin Support: All frameworks support cross-origin testing.
Built-in Debugging: Playwright and Cypress include built-in debugging, but Selenium does not.
Automatic Wait: Playwright and Cypress automatically wait for elements, while Selenium requires manual waiting.
Built-in Screenshot/Video: Playwright and Cypress offer built-in screenshot/video support, whereas Selenium lacks built-in video recording.

Key Comparisons:

Supported Languages: Both Playwright and Selenium support Java, C#, and Python, making them favorable choices for test engineers who may lack proficiency in JavaScript/TypeScript.
Technical Approach: Playwright controls Chromium-based browsers through the Chrome DevTools Protocol (CDP) and communicates with its own patched builds of Firefox and WebKit through comparable protocols, invoking them directly without a separate driver. Selenium speaks the W3C WebDriver protocol to a browser-specific driver, which in turn controls the browser. Cypress runs JavaScript inside the browser alongside the application under test.
Browser Support: Selenium supports Internet Explorer, although its relevance is diminishing as IE is being phased out.
Ease of Use: All three frameworks entail a learning curve, but Playwright and Cypress generally offer enhanced user-friendliness for straightforward test cases compared to Selenium.

Getting Started with Playwright

Although Playwright supports multiple programming languages, it relies heavily on Node.js. Whether you use the Python or the Java version, Playwright depends on a Node.js-based driver that is set up during initialization. This guide therefore concentrates on JavaScript/TypeScript.

Installation and Setup

Ensure Node.js is installed.
Initialize a Playwright project using either npm or yarn:

# Using npm
npm init playwright@latest
# Using yarn
yarn create playwright

Follow the prompts:

Select TypeScript or JavaScript (default: TypeScript).
Specify the test directory name.
Decide whether to install Playwright-supported browsers (default: True).

If you choose to download browsers, Playwright will retrieve Chromium, Firefox, and WebKit, which may take some time. This download is required only once, unless Playwright is updated.

Project Structure

Upon initialization, your project will assume the following structure:

playwright.config.ts        # Playwright configuration file
package.json                # Node.js configuration file
package-lock.json           # Node.js dependency lock file
tests/                      # Your test directory
  example.spec.ts           # Template test case
tests-examples/             # Example tests directory
  demo-todo-app.spec.ts     # Example test case

To execute the example test case:

npx playwright test

Tests will run silently (in headless mode), and the results will be presented as follows:

Running 6 tests using 6 workers
 6 passed (10s)
To open the last HTML report, run:
 npx playwright show-report
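
The six tests in the output above are consistent with the two cases in example.spec.ts each running once per browser project. The sketch below shows a minimal playwright.config.ts along those lines. It assumes the default tests directory and three desktop projects; the file generated by the initializer may contain more options, so treat this as an illustration rather than the exact scaffold.

```ts
// playwright.config.ts (illustrative sketch; the generated file may differ)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',        // test directory chosen during initialization (assumed)
  fullyParallel: true,       // run tests in all files in parallel
  reporter: 'html',          // produces the report opened by `npx playwright show-report`
  use: {
    trace: 'on-first-retry', // record a trace when a failed test is retried
  },
  // One project per browser engine: 2 test cases x 3 projects = 6 tests.
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
  ],
});
```

Removing a project from the list would reduce the number of runs accordingly, which can be handy while iterating locally.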

Example Source Code

Here is the example.spec.ts test case:

import { test, expect } from '@playwright/test';
test('has title', async ({ page }) => {
 await page.goto('https://playwright.dev/');
 await expect(page).toHaveTitle(/Playwright/);
});
test('get started link', async ({ page }) => {
 await page.goto('https://playwright.dev/');
 await page.getByRole('link', { name: 'Get started' }).click();
 await expect(page).toHaveURL(/.*intro/);
});

First Test: Verifies that the page title contains “Playwright”.
Second Test: Clicks the “Get started” link and validates that the URL matches the expected pattern.

Each test method comprises:

A test name (e.g., ‘has title’).
A function to execute the test logic.

Key methods include:

page.goto: Navigates to a URL.
expect(page).toHaveTitle: Asserts the page title.
page.getByRole: Locates an element based on its role.
await: Waits for asynchronous operations to conclude.

Running Tests from the Command Line

The following are some commonly used commands:

Run all tests:

npx playwright test

Run a specific test file:

npx playwright test landing-page.spec.ts

Debug a test case:

npx playwright test --debug

Code Recording

The codegen feature can be utilized to record interactions:

npx playwright codegen https://leapcell.io/

The recorded code can then be transferred into your project files. It should be noted that the recorder may not capture complex actions such as hovering.

In-Depth Playwright Guide

Actions and Behaviors

This section introduces some typical Playwright actions for interacting with page elements. A locator, such as the one returned by page.getByRole in the example above, does not look up the element when it is created. Even if the element is absent from the page, creating the locator will not throw. The element is resolved only when an interaction is initiated, and Playwright waits for it at that point. This differs from Selenium’s findElement method, which searches the page immediately and throws an exception if the element is not found.
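
To make the deferred lookup concrete, here is a small conceptual model in plain TypeScript. It is not Playwright's implementation: LazyLocator, FakeElement, Query, and the timeout values are hypothetical names chosen only to illustrate that creating a locator performs no lookup, while an action resolves the element at call time and retries until a deadline.

```ts
// Conceptual model only; Playwright's real Locator is far more involved.
interface FakeElement {
  click(): void;
}

// A query returns the element if it currently exists, otherwise undefined.
type Query = () => FakeElement | undefined;

class LazyLocator {
  private readonly query: Query;
  private readonly timeoutMs: number;
  private readonly pollMs: number;

  // Constructing the locator only stores the query; nothing is looked up.
  constructor(query: Query, timeoutMs = 1000, pollMs = 20) {
    this.query = query;
    this.timeoutMs = timeoutMs;
    this.pollMs = pollMs;
  }

  // The lookup happens here, at action time, and is retried until the deadline.
  async click(): Promise<void> {
    const deadline = Date.now() + this.timeoutMs;
    for (;;) {
      const element = this.query();
      if (element) {
        element.click();
        return;
      }
      if (Date.now() >= deadline) {
        throw new Error(`Timeout ${this.timeoutMs}ms exceeded while waiting for element`);
      }
      // Wait a little before querying again.
      await new Promise<void>((resolve) => setTimeout(resolve, this.pollMs));
    }
  }
}
```

In Playwright itself the analogous behavior is that page.getByRole('button') succeeds even on an empty page, and it is the later click() call that waits for the element and eventually times out.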

Text Input

To input text, utilize the fill method, primarily applicable to <input>, <textarea>, or elements with the [contenteditable] attribute:

// Text input
await page.getByRole('textbox').fill('Peter');

Checkbox and Radio Buttons

Employ locator.setChecked() or locator.check() to interact with input[type=checkbox], input[type=radio], or elements with the [role=checkbox] attribute:

await page.getByLabel('I agree to the terms above').check();
// Assert the checked state (the assertion retries until it passes or times out)
await expect(page.getByLabel('Subscribe to newsletter')).toBeChecked();
// Uncheck
await page.getByLabel('XL').setChecked(false);

Select Control

Use locator.selectOption() to interact with <select> elements:

// Select by value
await page.getByLabel('Choose a color').selectOption('blue');
// Select by label
await page.getByLabel('Choose a color').selectOption({ label: 'Blue' });
// Multi-select
await page.getByLabel('Choose multiple colors').selectOption(['red', 'green', 'blue']);

Mouse Clicks

Basic operations:

// Left click
await page.getByRole('button').click();
// Double click
await page.getByText('Item').dblclick();
// Right click
await page.getByText('Item').click({ button: 'right' });
// Shift+click
await page.getByText('Item').click({ modifiers: ['Shift'] });
// Hover
await page.getByText('Item').hover();
// Click at specific position
await page.getByText('Item').click({ position: { x: 0, y: 0 } });

For elements obscured by others, use force click:

await page.getByRole('button').click({ force: true });

Or trigger the click event programmatically:

await page.getByRole('button').dispatchEvent('click');

Typing Characters

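The locator.pressSequentially() method simulates typing character by character, triggering keydown, keypress/input, and keyup events for each character. It replaces locator.type(), which is deprecated in recent Playwright versions:

await page.locator('#area').pressSequentially('Hello World!');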

Special Keys

Use locator.press() for special keys:

await page.getByText('Submit').press('Enter');
// Ctrl+Right Arrow
await page.getByRole('textbox').press('Control+ArrowRight');
// Dollar key
await page.getByRole('textbox').press('$');

Supported keys include Backquote, Minus, Equal, Backslash, Backspace, Tab, Delete, Escape, ArrowDown, End, Enter, Home, Insert, PageDown, PageUp, ArrowLeft, ArrowRight, ArrowUp, F1-F12, Digit0-Digit9, and KeyA-KeyZ, among others.
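
Key combinations passed to press() are modifier names and a key name joined with +. The following sketch, in plain TypeScript, models a subset of such key names as types and builds the combined string. Modifier, KeyName, and shortcut are hypothetical helpers introduced for illustration; they are not part of Playwright's API, which simply accepts a string.

```ts
// Hypothetical helper types; Playwright itself accepts a plain string.
type Modifier = 'Shift' | 'Control' | 'Alt' | 'Meta';
type Letter =
  | 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I' | 'J' | 'K' | 'L' | 'M'
  | 'N' | 'O' | 'P' | 'Q' | 'R' | 'S' | 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z';
type Digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9;
type NamedKey =
  | 'Backquote' | 'Minus' | 'Equal' | 'Backslash' | 'Backspace' | 'Tab'
  | 'Delete' | 'Escape' | 'Enter' | 'Home' | 'End' | 'Insert'
  | 'PageDown' | 'PageUp' | 'ArrowDown' | 'ArrowLeft' | 'ArrowRight' | 'ArrowUp';
type KeyName =
  | NamedKey
  | `F${1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12}`
  | `Digit${Digit}`
  | `Key${Letter}`;

// Joins modifiers and a key into the "Modifier+Key" form used by press().
function shortcut(modifiers: Modifier[], key: KeyName): string {
  return [...modifiers, key].join('+');
}
```

A call such as shortcut(['Control'], 'ArrowRight') yields the same 'Control+ArrowRight' string used in the example above, with the compiler rejecting misspelled key names.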

File Upload

Use locator.setInputFiles() to specify files for upload. Multiple files are supported:

await page.getByLabel('Upload file').setInputFiles('myfile.pdf');
// Upload multiple files
await page.getByLabel('Upload files').setInputFiles(['file1.txt', 'file2.txt']);
// Remove files
await page.getByLabel('Upload file').setInputFiles([]);
// Upload from buffer
await page.getByLabel('Upload file').setInputFiles({
 name: 'file.txt',
 mimeType: 'text/plain',
 buffer: Buffer.from('this is test')
});

Focus Element

Use locator.focus() to focus on an element:

await page.getByLabel('Password').focus();

Drag and Drop

The drag-and-drop process entails four steps:

Hover the mouse over the draggable element.
Press the left mouse button.
Move the mouse to the target position.
Release the left mouse button.

You can use the locator.dragTo() method:

await page.locator('#item-to-be-dragged').dragTo(page.locator('#item-to-drop-at'));

Alternatively, manually implement the process:

await page.locator('#item-to-be-dragged').hover();
await page.mouse.down();
await page.locator('#item-to-drop-at').hover();
await page.mouse.up();

Dialog Handling

By default, Playwright automatically cancels dialogs such as alert, confirm, and prompt. You can pre-register a dialog handler to accept dialogs:

page.on('dialog', dialog => dialog.accept());
await page.getByRole('button').click();
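
A handler can also branch on the kind of dialog instead of accepting everything. The sketch below is typed against a minimal DialogLike interface (a hypothetical name) that mirrors the type(), accept(), and dismiss() methods of Playwright's Dialog, so the logic can be exercised without a browser. The policy of which dialogs to accept, and the default prompt answer, are assumptions made for illustration.

```ts
// Minimal shape of the dialog methods used here (hypothetical interface name).
interface DialogLike {
  type(): string; // 'alert', 'confirm', 'prompt', or 'beforeunload'
  accept(promptText?: string): Promise<void>;
  dismiss(): Promise<void>;
}

// Illustrative policy: answer prompts, accept alerts and confirms, dismiss the rest.
async function handleDialog(dialog: DialogLike, promptAnswer = 'Peter'): Promise<void> {
  switch (dialog.type()) {
    case 'prompt':
      // Accept the prompt, supplying the text to enter into it.
      await dialog.accept(promptAnswer);
      break;
    case 'alert':
    case 'confirm':
      await dialog.accept();
      break;
    default:
      await dialog.dismiss();
  }
}
```

With a real page, such a handler would be registered before the action that triggers the dialog, for example page.on('dialog', (dialog) => handleDialog(dialog)).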

Handling New Pages

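When an action opens a new page (for example, a link that opens in a new tab), you can use the popup event to capture it. Start waiting for the event before performing the click so that the event is not missed: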

const newPagePromise = page.waitForEvent('popup');
await page.getByText('Click me').click();
const newPage = await newPagePromise;
await newPage.waitForLoadState();
console.log(await newPage.title());

The Best Platform for Playwright: Leapcell

Leapcell represents a modern cloud computing platform tailored for distributed applications. It adopts a pay-as-you-go model, eliminating idle costs and ensuring that you are billed only for the resources you consume.

Unique Benefits of Leapcell for Playwright Applications

1. Multi-Language Support

Develop with JavaScript, Python, Go, or Rust.

2. Deploy unlimited projects for free

Pay only for usage: no requests, no charges.

3. Unbeatable Cost Efficiency

Pay-as-you-go with no idle charges.
Example: $25 supports 6.94M requests at a 60ms average response time.

4. Streamlined Developer Experience

Intuitive UI for effortless setup.
Fully automated CI/CD pipelines and GitOps integration.
Real-time metrics and logging for actionable insights.

5. Effortless Scalability and High Performance

Auto-scaling to handle high concurrency with ease.
Zero operational overhead — just focus on building.

For more deployment examples, refer to the documentation.