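In web test automation and crawler development, Playwright has become a popular modern tool thanks to its cross-browser support, auto-waiting, and fine-grained network control. This article walks through installing and configuring Playwright for Python, its core features, and a worked example, so that you can become productive with it quickly.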
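I. Environment Setup: Ready in Three Steps
1. System requirements
- Python: 3.8+ (3.10+ recommended for best compatibility)
- Operating system: Windows 10+ / macOS 12+ / Linux (Ubuntu 20.04+ or Debian 11+)
- Browsers: Chromium, Firefox, WebKit (the engine behind Safari)
2. Installation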
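# 1. Install the core library (the Tsinghua mirror is optional; it speeds up downloads in mainland China)
pip install playwright -i https://pypi.tuna.tsinghua.edu.cn/simple
# 2. Download the browser binaries (matched to your system automatically)
python -m playwright install
# 3. Install the pytest plugin (optional, for test framework integration)
pip install pytest-playwright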
3. Verify the installation
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)  # headed mode makes debugging easier
    page = browser.new_page()
    page.goto("https://playwright.dev")
    print(page.title())  # the printed title should contain "Playwright"
    browser.close()
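The check above only exercises Chromium. As a rough sketch, the same idea can be extended to all three engines. The helper name `check_browsers` is made up for this article; it only assumes that the object passed in exposes `chromium`, `firefox`, and `webkit` attributes with a `launch()` method, as the Playwright object does.

```python
def check_browsers(playwright, names=("chromium", "firefox", "webkit")):
    """Try to launch each engine headlessly; return {engine name: launched OK?}."""
    results = {}
    for name in names:
        try:
            browser = getattr(playwright, name).launch(headless=True)
            browser.close()
            results[name] = True
        except Exception as exc:  # typically: browser binaries not downloaded yet
            print(f"{name}: launch failed ({exc})")
            results[name] = False
    return results
```

With Playwright installed, it could be called inside `with sync_playwright() as p:` as `print(check_browsers(p))`. An engine reported as `False` usually means `python -m playwright install` has not been run for it yet.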
II. Core Features in Practice
1. Basic workflow
import re
from playwright.sync_api import sync_playwright, expect

def test_baidu_search():
    with sync_playwright() as p:
        # Launch the browser (headless mode in this example)
        browser = p.chromium.launch(headless=True)
        context = browser.new_context()  # an isolated environment (cookies, storage)
        page = context.new_page()
        # Perform the actions
        page.goto("https://www.baidu.com")
        page.fill("input[name=wd]", "playwright教程")
        page.click("text=百度一下")
        # Verify the result; expect() retries until the title matches or times out
        expect(page).to_have_title(re.compile("playwright"))
        # Clean up
        context.close()
        browser.close()
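In the flow above, the cleanup lines are skipped if an assertion fails first. One possible way to guarantee cleanup is a small context manager. This is a sketch, not part of Playwright: `open_page` is a hypothetical helper name, and it takes the object yielded by `sync_playwright()` as its argument.

```python
from contextlib import contextmanager

@contextmanager
def open_page(playwright, headless=True):
    """Yield a page in a fresh context; always close context and browser afterwards."""
    browser = playwright.chromium.launch(headless=headless)
    context = browser.new_context()
    try:
        yield context.new_page()
    finally:
        # Runs even when the body of the with-block raises
        context.close()
        browser.close()
```

Inside `with sync_playwright() as p:`, a test body would then read `with open_page(p) as page: page.goto(...)`, with no explicit close calls.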
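2. Locator strategies (worth mastering)
Playwright supports several locator strategies. The three groups below are listed in recommended order of preference:
Semantic locators (best practice)
# Locate a button by its ARIA role and accessible name ("登录" means "Log in")
page.get_by_role("button", name="登录").click()
# Locate by visible text ("立即购买" means "Buy now")
page.get_by_text("立即购买").click()
# Locate an input by its placeholder ("请输入手机号" means "Enter your phone number")
page.get_by_placeholder("请输入手机号").fill("13800138000")
CSS selectors
# By id
page.locator("#submit-btn").click()
# By a combination of attributes
page.locator("input[type='text'][class='form-control']").fill("test")
# By pseudo-class (odd-numbered rows); .first avoids a strict-mode error when several rows match
page.locator("tr:nth-of-type(odd)").first.click()
XPath
# Combined conditions (placeholder contains "密码", i.e. "password")
page.locator("//input[@type='text' and contains(@placeholder,'密码')]").fill("123456")
# Axis navigation (the div that is the parent of a button)
page.locator("//button/parent::div").click()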
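When a locator matches several elements, chaining and filtering is often more robust than a longer CSS or XPath expression. The sketch below assumes a page containing a table whose rows each hold action buttons; the function name `click_row_button` and its parameters are invented for illustration.

```python
def click_row_button(page, row_text, button_name):
    """Click the button named `button_name` inside the table row containing `row_text`."""
    # Narrow all rows down to the one containing the given text
    row = page.get_by_role("row").filter(has_text=row_text)
    # Search for the button only inside that row
    row.get_by_role("button", name=button_name).click()
```

The same pattern works for list items or cards: locate the container semantically, filter it, then locate the target inside it.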
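3. Advanced features
Network request interception (API mocking)
from playwright.sync_api import sync_playwright

def test_api_mock():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()

        # Intercept requests to a specific API and answer them with mock data
        def handle_route(route):
            if "api/data" in route.request.url:
                route.fulfill(
                    status=200,
                    json={"message": "mock data"}
                )
            else:
                route.continue_()

        page.route("**/*", handle_route)
        page.goto("https://example.com")
        # Holds only if the page under test requests api/data and renders the message
        assert "mock data" in page.content()
        browser.close()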
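If several endpoints need mocking, the handler can be generated from a table instead of growing an if/else chain. The following is a minimal sketch; `make_mock_handler` and the shape of the `mocks` dictionary are assumptions made for this example, while `route.fulfill()` and `route.continue_()` are the same calls used above.

```python
def make_mock_handler(mocks):
    """Build a route handler from {URL substring: JSON payload}."""
    def handle_route(route):
        for fragment, payload in mocks.items():
            if fragment in route.request.url:
                route.fulfill(status=200, json=payload)
                return
        # Requests that match nothing go to the network untouched
        route.continue_()
    return handle_route
```

It would be registered with something like `page.route("**/*", make_mock_handler({"api/data": {"message": "mock data"}}))`. Note that every intercepted request must end in exactly one of fulfill, continue, or abort, otherwise it stays pending.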
Mobile device emulation
from playwright.sync_api import sync_playwright

def test_mobile_view():
    with sync_playwright() as p:
        # Use a built-in device descriptor (names are case-sensitive)
        iphone = p.devices["iPhone 12"]
        browser = p.chromium.launch()
        # Create the mobile environment
        context = browser.new_context(
            **iphone,
            locale="zh-CN",
            timezone_id="Asia/Shanghai"
        )
        page = context.new_page()
        page.goto("https://m.taobao.com")
        page.screenshot(path="mobile_view.png")
        context.close()
        browser.close()
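Since each device is just a context created from a descriptor, one browser instance can serve several emulated devices in turn. A possible sketch follows; the helper name `screenshot_on_devices` and the file naming scheme are invented here, and the device names passed in must exist in `playwright.devices`.

```python
def screenshot_on_devices(playwright, url, device_names, prefix="shot"):
    """Take one screenshot of `url` per emulated device; return the file names."""
    browser = playwright.chromium.launch()
    paths = []
    try:
        for name in device_names:
            # One isolated context per device descriptor
            context = browser.new_context(**playwright.devices[name])
            page = context.new_page()
            page.goto(url)
            path = f"{prefix}_{name.replace(' ', '_').lower()}.png"
            page.screenshot(path=path)
            paths.append(path)
            context.close()
    finally:
        browser.close()
    return paths
```

For example, `screenshot_on_devices(p, "https://m.taobao.com", ["iPhone 12", "Pixel 5"])` would write `shot_iphone_12.png` and `shot_pixel_5.png` to the working directory.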
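III. Best Practices: Making Tests More Stable
1. Organizing tests
# conftest.py: a hand-written, session-scoped browser fixture.
# Note: pytest-playwright already provides its own `browser` and `page` fixtures;
# if the plugin is installed, use those (or rename this one) rather than defining both.
import pytest
from playwright.sync_api import sync_playwright

@pytest.fixture(scope="session")
def browser():
    pw = sync_playwright().start()
    browser = pw.chromium.launch(headless=True)
    yield browser
    browser.close()
    pw.stop()

def test_example(browser):
    page = browser.new_page()
    page.goto("https://bing.com")
    assert "bing" in page.title().lower()  # compare case-insensitively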
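2. Stability measures
Explicit waits: prefer page.wait_for_selector() (or the auto-waiting built into locators) over fixed sleeps
Test id convention: agree with the developers on a data-testid attribute
<button data-testid="login-submit">登录</button>
page.get_by_test_id("login-submit").click()
Working inside iframes:
frame = page.frame_locator("iframe.login-frame")
frame.locator("#username").fill("admin")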
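As an illustration of an explicit wait combined with a test id, the sketch below waits for a loading indicator to disappear before reading a result. The function name, the spinner selector, and the test id are all hypothetical; `Locator.wait_for()` and `Locator.inner_text()` are the Playwright calls being relied on.

```python
def read_when_loaded(page, spinner_selector, result_test_id, timeout_ms=10_000):
    """Wait until the loading indicator is hidden, then return the result text."""
    # state="hidden" also succeeds if the spinner never appeared at all
    page.locator(spinner_selector).wait_for(state="hidden", timeout=timeout_ms)
    return page.get_by_test_id(result_test_id).inner_text()
```

A call such as `read_when_loaded(page, ".spinner", "result-count")` replaces a fixed `wait_for_timeout()` with a wait that ends as soon as the page is actually ready.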
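3. Debugging tips
Recorder: use playwright codegen to generate starter code
playwright codegen https://example.com
Visual debugging: install the Playwright extension in VS Code and use its "Pick locator" tool
Log level control:
import logging
logging.basicConfig(level=logging.INFO)  # show INFO-level messages from Python loggers
# Playwright's own API call log is enabled separately, via the DEBUG=pw:api environment variable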
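Python's logging module only shows what your own code logs. To see what happens inside the browser, page events can be forwarded to a logger. This is one possible sketch: `attach_debug_logging` and the logger name "browser" are made up for this article, while the `console` and `requestfailed` page events are standard Playwright events.

```python
import logging

def attach_debug_logging(page, logger=None):
    """Forward browser console messages and failed requests to a Python logger."""
    log = logger if logger is not None else logging.getLogger("browser")
    # Each console.log/warn/error call in the page becomes an INFO record
    page.on("console", lambda msg: log.info("console[%s] %s", msg.type, msg.text))
    # Requests that never got a response (DNS errors, aborts, ...) become warnings
    page.on("requestfailed", lambda req: log.warning("request failed: %s %s", req.method, req.url))
    return log
```

Calling `attach_debug_logging(page)` right after `new_page()` and before `goto()` ensures that messages emitted during the initial load are captured too.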
IV. Case Study: Automating a Login Flow
from playwright.sync_api import sync_playwright
import json
import os

class QuarkDownloader:
    def __init__(self):
        self.storage_path = "state.json"
        self.quark_url = "https://quark.cn/share"
        self.username = "your_username"  # replace with the real user name

    def login_and_save_state(self):
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=False)  # show the browser window
            context = browser.new_context()
            page = context.new_page()
            # Open the login page
            page.goto(self.quark_url)
            # Wait for the QR-code login to finish (replace with the real login logic);
            # allow up to two minutes for scanning
            page.wait_for_selector(f"span:text('{self.username}')", timeout=120_000)
            # Save the state of the context that actually logged in;
            # a freshly created context would hold no cookies
            storage_state = context.storage_state()
            with open(self.storage_path, "w") as f:
                json.dump(storage_state, f)
            context.close()
            browser.close()

    def download_file(self, share_url, extract_code):
        if not os.path.exists(self.storage_path):
            self.login_and_save_state()
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            context = browser.new_context(storage_state=self.storage_path)
            page = context.new_page()

            # Watch for download requests
            def handle_download(route):
                if "download" in route.request.url:
                    print("download request:", route.request.url)
                # Always continue; a route that is never resolved leaves its request pending
                route.continue_()

            page.route("**/*", handle_download)
            # Open the share link
            page.goto(share_url)
            page.fill("input[placeholder='提取码']", extract_code)
            page.click("text=提取文件")
            # Wait for the download to finish (adjust to the real download logic in practice)
            page.wait_for_timeout(10000)
            context.close()
            browser.close()

# Usage example
downloader = QuarkDownloader()
downloader.download_file(
    share_url="https://quark.cn/share/xxxxxx",  # replace with the real share link
    extract_code="1234"  # replace with the real extraction code
)
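The case study decides whether to log in again purely by checking that state.json exists, so an empty or expired state file would be reused silently. A slightly stricter check could look like the sketch below. `state_is_usable` is a hypothetical helper, and it assumes the storage state layout Playwright writes: a top-level "cookies" list whose entries carry an "expires" field in Unix seconds, with -1 for session cookies.

```python
import json
import os
import time

def state_is_usable(path, now=None):
    """Return True if `path` holds a storage state with at least one unexpired cookie."""
    if not os.path.exists(path):
        return False
    try:
        with open(path, encoding="utf-8") as f:
            state = json.load(f)
    except (OSError, json.JSONDecodeError):
        return False
    now = time.time() if now is None else now
    for cookie in state.get("cookies", []):
        expires = cookie.get("expires", -1)
        # -1 marks a session cookie; anything else is a Unix timestamp in seconds
        if expires == -1 or expires > now:
            return True
    return False
```

In `download_file`, the condition `not os.path.exists(self.storage_path)` could then become `not state_is_usable(self.storage_path)`. An unexpired cookie does not prove that the server still accepts the session, so a real project would likely also verify the login on the page itself.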
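V. Summary and Outlook
Playwright's modern architecture addresses three pain points of traditional automation tools:
- Cross-browser consistency: one API covers Chromium, Firefox, and WebKit
- Auto-waiting: element readiness is handled automatically
- Network control: requests can be intercepted and modified
For Python developers, learning Playwright not only makes testing more efficient but also opens up new options for crawler development. A good path is to start with semantic locators, then move on to advanced features such as network interception and mobile emulation, and finally build a robust automated testing setup.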