1. Installing httpx
First, make sure httpx is installed. It can be installed with:

pip install httpx

If HTTP/2 support is needed, install the extra dependency (quoted so shells like zsh don't expand the brackets):

pip install "httpx[http2]"
2. Synchronous requests
Sending a GET request

import httpx

# Send a GET request
response = httpx.get('https://httpbin.org/get')
print(response.status_code)  # Status code
print(response.text)         # Response body
Sending a POST request

# Send a POST request with a JSON body
data = {'key': 'value'}
response = httpx.post('https://httpbin.org/post', json=data)
print(response.json())  # Parse the JSON response
Setting request headers

headers = {'user-agent': 'my-app/1.0.0'}
response = httpx.get('https://httpbin.org/headers', headers=headers)
print(response.json())
Setting query parameters

params = {'key1': 'value1', 'key2': 'value2'}
response = httpx.get('https://httpbin.org/get', params=params)
print(response.json())
Handling timeouts

try:
    response = httpx.get('https://httpbin.org/delay/5', timeout=2.0)
except httpx.TimeoutException:
    print("Request timed out")
3. Asynchronous requests
httpx supports asynchronous operation, which suits high-performance scenarios.
Sending an async GET request

import httpx
import asyncio

async def fetch(url):
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        print(response.text)

asyncio.run(fetch('https://httpbin.org/get'))
Sending an async POST request

async def post_data(url, data):
    async with httpx.AsyncClient() as client:
        response = await client.post(url, json=data)
        print(response.json())

asyncio.run(post_data('https://httpbin.org/post', {'key': 'value'}))
Concurrent requests

async def fetch_multiple(urls):
    async with httpx.AsyncClient() as client:
        tasks = [client.get(url) for url in urls]
        responses = await asyncio.gather(*tasks)
        for response in responses:
            print(response.text)

urls = ['https://httpbin.org/get', 'https://httpbin.org/ip']
asyncio.run(fetch_multiple(urls))
4. Advanced features
Using HTTP/2

# Enable HTTP/2 (requires the httpx[http2] extra)
client = httpx.Client(http2=True)
response = client.get('https://httpbin.org/get')
print(response.http_version)  # Protocol version, e.g. "HTTP/2"
File uploads

# Open the file in binary mode; the with-block ensures it is closed
with open('example.txt', 'rb') as f:
    files = {'file': f}
    response = httpx.post('https://httpbin.org/post', files=files)
print(response.json())
Streaming requests

# Streaming upload from a generator (recent httpx versions take raw or
# streamed bodies via content=, not data=)
def generate_data():
    yield b"part1"
    yield b"part2"

response = httpx.post('https://httpbin.org/post', content=generate_data())
print(response.json())
Streaming responses

# Streaming download
with httpx.stream('GET', 'https://httpbin.org/stream/10') as response:
    for chunk in response.iter_bytes():
        print(chunk)
5. Error handling
httpx provides a range of exception classes to make error handling easier.
Handling network errors

try:
    response = httpx.get('https://nonexistent-domain.com')
except httpx.NetworkError:
    print("Network error")
Handling HTTP error status codes

response = httpx.get('https://httpbin.org/status/404')
if response.status_code == 404:
    print("Page not found")
6. Configuring a client
Global settings can be configured through httpx.Client or httpx.AsyncClient.
Setting a timeout

client = httpx.Client(timeout=10.0)
response = client.get('https://httpbin.org/get')
print(response.text)
Setting a proxy

proxies = {
    "http://": "http://proxy.example.com:8080",
    "https://": "http://proxy.example.com:8080",
}
# Note: newer httpx releases replace the proxies= argument with proxy= / mounts=
client = httpx.Client(proxies=proxies)
response = client.get('https://httpbin.org/get')
print(response.text)
Setting a base URL

client = httpx.Client(base_url='https://httpbin.org')
response = client.get('/get')
print(response.text)
7. Using httpx with Beautiful Soup
httpx can be combined with Beautiful Soup to fetch and parse web pages.

import httpx
from bs4 import BeautifulSoup

# Fetch the page
response = httpx.get('https://example.com')
html = response.text

# Parse the page (the 'lxml' parser requires the lxml package)
soup = BeautifulSoup(html, 'lxml')
title = soup.find('title').text
print("Page title:", title)
8. Example: fetching and parsing a web page
The following complete example shows how to use httpx to fetch and parse data from a web page:

import httpx
from bs4 import BeautifulSoup

# Fetch the page
url = 'https://example.com'
response = httpx.get(url)
html = response.text

# Parse the page
soup = BeautifulSoup(html, 'lxml')

# Extract the title
title = soup.find('title').text
print("Page title:", title)

# Extract all links
links = soup.find_all('a', href=True)
for link in links:
    href = link['href']
    text = link.text
    print(f"Link text: {text}, link URL: {href}")
9. Notes
Performance: httpx's async mode suits high-concurrency scenarios.
Compatibility: the httpx API is highly compatible with requests, so migration cost is low.
HTTP/2: if HTTP/2 is needed, make sure httpx[http2] is installed.
With the methods above, httpx can be used to send HTTP requests efficiently and, combined with other tools (such as Beautiful Soup), to scrape and parse data.