Python File Read/Write Operations: Fundamentals and Practical Applications

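1. File Operation Basics

1.1 Opening and Closing Files

Python's built-in open() function is the entry point for file operations. Its two core parameters are the file path and the mode; for text files it is also worth passing encoding explicitly, because the default depends on the platform. For example, open('data.txt', 'r') opens data.txt in the current working directory in read-only mode. Common modes include:

r: read-only (the default); raises FileNotFoundError if the file does not exist
w: write; truncates existing content and creates the file if it does not exist
a: append; writes go to the end of the file, and the file is created if it does not exist
b: binary modifier, combined with another mode (for example rb to read an image, wb to write audio)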
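
To make the difference between w and a concrete, here is a minimal sketch. It works in a temporary directory, and the file name modes_demo.txt is only an illustrative assumption.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

with open(demo_path, "w", encoding="utf-8") as f:  # creates the file
    f.write("first\n")

with open(demo_path, "w", encoding="utf-8") as f:  # "w" truncates the old content
    f.write("second\n")

with open(demo_path, "a", encoding="utf-8") as f:  # "a" appends at the end
    f.write("third\n")

with open(demo_path, "r", encoding="utf-8") as f:
    result = f.read()

print(repr(result))  # 'second\nthird\n'
```

The first write is gone because the second open() used w, while the a write was added after the existing content.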

The traditional approach closes the file manually:

file = open('demo.txt', 'w')
file.write('hello world')
file.close()  # must be called explicitly; it is skipped if write() raises

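The with statement is the preferred approach because it manages the resource automatically:

with open('demo.txt', 'w') as f:
    f.write('auto-closed file')  # the file is closed automatically when the block exits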
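
The practical benefit shows up when something goes wrong inside the block. The sketch below simulates a failure in the middle of a write (the ValueError is an assumption made for illustration) and checks that the file was still closed.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")

try:
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("partial")
        raise ValueError("simulated failure")  # hypothetical error mid-write
except ValueError:
    pass

# The with statement closed the file even though the block raised.
closed_after_error = f.closed
print(closed_after_error)  # True
```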

1.2 Core Read and Write Methods

Three ways to read

read(): reads the entire content at once (suitable for small files)

with open('example.txt', 'r') as f:
    full_content = f.read()

readline(): reads one line per call and returns it as a string

with open('example.txt', 'r') as f:
    first_line = f.readline()

readlines(): returns a list containing all lines

with open('example.txt', 'r') as f:
    lines_list = f.readlines()
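
Besides these three methods, a file object can be iterated directly, which yields one line at a time without loading the whole file. The following sketch assumes a small sample file created on the spot.

```python
import os
import tempfile

sample_path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("alpha\nbeta\ngamma\n")

line_lengths = []
with open(sample_path, "r", encoding="utf-8") as f:
    for line in f:  # yields one line per iteration, newline included
        line_lengths.append(len(line.rstrip("\n")))

print(line_lengths)  # [5, 4, 5]
```

This is usually the better choice than readlines() when the file may be large.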

Two ways to write

write(): writes a string (newlines must be added manually)

with open('output.txt', 'w') as f:
    f.write('line 1\nline 2')  # add the newline characters yourself

writelines(): writes a list of strings (does not add newlines)

lines = ['line 1\n', 'line 2\n']
with open('output.txt', 'w') as f:
    f.writelines(lines)  # every element must already end with a newline
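
When the items do not already end with a newline, one option is to pass writelines() a generator that appends it. This is a minimal sketch; the items list and the output path are illustrative assumptions.

```python
import os
import tempfile

out_path = os.path.join(tempfile.mkdtemp(), "output.txt")
items = ["line 1", "line 2", "line 3"]  # no trailing newlines

with open(out_path, "w", encoding="utf-8") as f:
    # writelines() accepts any iterable of strings, including a generator.
    f.writelines(f"{item}\n" for item in items)

with open(out_path, "r", encoding="utf-8") as f:
    written = f.read()

print(repr(written))  # 'line 1\nline 2\nline 3\n'
```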

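2. Advanced Techniques

2.1 Controlling the File Pointer

Every file object keeps its own pointer that records the current read/write position:

tell(): returns the current pointer position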

with open('example.txt', 'r') as f:
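    print(f.tell())  # initial position: 0
    f.read(5)
    print(f.tell())  # 5 after reading five single-byte (ASCII) characters

seek(): moves the pointer

f.seek(offset, whence)  # whence = 0 (start, the default) / 1 (current position) / 2 (end)

In text mode, tell() returns an opaque value and seek() only accepts offsets obtained from tell(), plus seeking to the start or the end. Use binary mode for arbitrary byte offsets.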
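
The three whence values are easiest to see in binary mode, where offsets are plain byte counts. The sketch below uses a ten-byte sample file created for the purpose.

```python
import os
import tempfile

data_path = os.path.join(tempfile.mkdtemp(), "digits.bin")
with open(data_path, "wb") as f:
    f.write(b"0123456789")

with open(data_path, "rb") as f:
    f.seek(3)                  # absolute: byte 3 from the start
    first = f.read(2)          # b"34", pointer is now at 5
    f.seek(2, os.SEEK_CUR)     # relative: skip 2 bytes, pointer at 7
    second = f.read(1)         # b"7"
    f.seek(-2, os.SEEK_END)    # 2 bytes before the end
    last = f.read()            # b"89"

print(first, second, last)
```

os.SEEK_SET, os.SEEK_CUR, and os.SEEK_END are the named equivalents of 0, 1, and 2.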

2.2 Working with Binary Files

Non-text files such as images and audio must be opened in binary mode:

# Copy an image file
with open('image.jpg', 'rb') as src:
    binary_data = src.read()
with open('copy.jpg', 'wb') as dst:
    dst.write(binary_data)
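
Reading the whole file into memory is fine for small images. For larger files, a chunked copy with shutil.copyfileobj avoids holding everything in memory at once. In this sketch the source file is filled with random stand-in bytes, and the 64 KiB chunk size is an arbitrary choice.

```python
import filecmp
import os
import shutil
import tempfile

tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "image.jpg")
dst_path = os.path.join(tmp_dir, "copy.jpg")

# Stand-in data: random bytes instead of a real image.
with open(src_path, "wb") as f:
    f.write(os.urandom(200_000))

with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
    shutil.copyfileobj(src, dst, 64 * 1024)  # copy in 64 KiB chunks

identical = filecmp.cmp(src_path, dst_path, shallow=False)
print(identical)  # True
```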

2.3 Exception Handling

File operations should guard against common exceptions:

try:
    with open('missing.txt', 'r') as f:
        content = f.read()
except FileNotFoundError:
    print("File not found!")
except PermissionError:
    print("No permission to read the file!")
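
In practice this pattern is often wrapped in a small helper. The function below, read_text_or_default, is a hypothetical name introduced for illustration; it returns a fallback value when the file is missing and reports any other OSError (PermissionError is one of its subclasses).

```python
import os
import tempfile


def read_text_or_default(path, default=""):
    """Return the file's text, or `default` if it cannot be read."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return default
    except OSError as exc:  # covers PermissionError, IsADirectoryError, ...
        print(f"Could not read {path}: {exc}")
        return default


missing_path = os.path.join(tempfile.mkdtemp(), "missing.txt")
fallback = read_text_or_default(missing_path, default="fallback")
print(fallback)  # fallback
```

FileNotFoundError is listed first because it is also a subclass of OSError, so the more specific handler has to come before the general one.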

3. Practical Scenarios

3.1 Text Data Processing

Log file analysis

# Extract log entries that contain "error" (the match is case-sensitive)
with open('app.log', 'r') as f:
    errors = [line for line in f if 'error' in line]
    for error in errors:
        print(error.strip())
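
Many logs write the level in upper case, so a case-insensitive match may be closer to what is wanted. The sketch below assumes a tiny made-up log file to show the difference.

```python
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "app.log")
with open(log_path, "w", encoding="utf-8") as f:
    # Made-up sample entries for illustration.
    f.write("INFO start\nERROR disk full\nWarning: low memory\nerror: retry failed\n")

with open(log_path, "r", encoding="utf-8") as f:
    # Lower-casing each line makes the test match ERROR, Error, and error.
    error_lines = [line.strip() for line in f if "error" in line.lower()]

print(error_lines)  # ['ERROR disk full', 'error: retry failed']
```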

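CSV data cleaning

Use pandas to work with structured data:

import pandas as pd  # third-party: pip install pandas

# Read the CSV file
df = pd.read_csv('sales.csv')
# Drop rows with missing values
df.dropna(inplace=True)
# Save the cleaned result
df.to_csv('cleaned_sales.csv', index=False)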
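
Where pandas is not available, a rough approximation of the same cleaning step can be written with the standard csv module. This is a swapped-in technique rather than what pandas does internally, and the region and amount columns are made-up sample data.

```python
import csv
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "sales.csv")
dst_path = os.path.join(tmp_dir, "cleaned_sales.csv")

# Made-up sample data with two incomplete rows.
with open(src_path, "w", newline="", encoding="utf-8") as f:
    f.write("region,amount\nnorth,100\nsouth,\n,50\nwest,75\n")

kept = 0
with open(src_path, newline="", encoding="utf-8") as fin, \
        open(dst_path, "w", newline="", encoding="utf-8") as fout:
    reader = csv.DictReader(fin)
    writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        # Keep only rows in which every field has a value.
        if all(value not in (None, "") for value in row.values()):
            writer.writerow(row)
            kept += 1

with open(dst_path, newline="", encoding="utf-8") as f:
    cleaned_rows = list(csv.DictReader(f))

print(kept, cleaned_rows)
```

Opening CSV files with newline="" is what the csv module documentation recommends, so that the module controls line endings itself.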

3.2 Optimizing Large-File Processing

Chunked reading

block_size = 1024 * 1024  # 1 MiB per chunk
with open('large_file.bin', 'rb') as f:
    while True:
        chunk = f.read(block_size)
        if not chunk:
            break
        # process the current chunk here
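
A typical use of this loop is computing a checksum without loading the file into memory. The sketch below hashes a temporary file of random bytes; the file size is chosen only so that the last chunk is partial.

```python
import hashlib
import os
import tempfile

big_path = os.path.join(tempfile.mkdtemp(), "large_file.bin")
payload = os.urandom(3 * 1024 * 1024 + 123)  # stand-in for a large file
with open(big_path, "wb") as f:
    f.write(payload)

block_size = 1024 * 1024  # 1 MiB per chunk
digest = hashlib.sha256()
chunk_count = 0
with open(big_path, "rb") as f:
    while True:
        chunk = f.read(block_size)
        if not chunk:
            break
        digest.update(chunk)  # feed each chunk into the running hash
        chunk_count += 1

print(chunk_count, digest.hexdigest())
```

The result matches hashing the whole payload in one call, but peak memory stays near one chunk.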

Processing with a generator

def read_in_chunks(file_path, chunk_size):
    with open(file_path, 'r') as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            yield data

for chunk in read_in_chunks('huge.log', 4096):
    process(chunk)  # process() stands for your own handler function
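
The same generator can be written more compactly with the two-argument form of iter(), which keeps calling a function until it returns a sentinel value. The name iter_chunks and the sample content are assumptions made for this sketch.

```python
import os
import tempfile
from functools import partial


def iter_chunks(file_path, chunk_size):
    """Yield successive text chunks of at most `chunk_size` characters."""
    with open(file_path, "r", encoding="utf-8") as f:
        # iter(callable, sentinel) stops when f.read() returns "" at end of file.
        yield from iter(partial(f.read, chunk_size), "")


log_path = os.path.join(tempfile.mkdtemp(), "huge.log")
content = "entry\n" * 2000
with open(log_path, "w", encoding="utf-8") as f:
    f.write(content)

chunks = list(iter_chunks(log_path, 4096))
print(len(chunks), sum(len(c) for c in chunks))
```

Note that fixed-size chunks can split a line in two, so line-oriented processing is usually better served by iterating over the file object.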

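3.3 Managing Configuration Files

Working with JSON configuration

import json

# Read the configuration
with open('config.json', 'r') as f:
    config = json.load(f)
# Modify the configuration
config['debug'] = True
# Write it back to the file
with open('config.json', 'w') as f:
    json.dump(config, f, indent=4)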
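
Opening the configuration file with w truncates it immediately, so a crash during json.dump() can leave it empty or half written. One common way to reduce that risk is to write to a temporary file in the same directory and then swap it in with os.replace(). The helper name save_json_atomically is hypothetical.

```python
import json
import os
import tempfile


def save_json_atomically(path, data):
    """Write JSON to a temp file, then replace the target in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=4)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before the swap
        os.replace(tmp_path, path)
    except Exception:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave the temp file behind
        raise


config_path = os.path.join(tempfile.mkdtemp(), "config.json")
save_json_atomically(config_path, {"debug": False, "port": 8080})

with open(config_path, "r", encoding="utf-8") as f:
    config = json.load(f)
config["debug"] = True
save_json_atomically(config_path, config)

with open(config_path, "r", encoding="utf-8") as f:
    reloaded = json.load(f)
print(reloaded)
```

The temporary file is created in the target's directory because os.replace() is only atomic within a single file system.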

YAML configuration example

import yaml  # third-party: pip install pyyaml

with open('settings.yaml', 'r') as f:
    settings = yaml.safe_load(f)
# Modify a parameter
settings['max_connections'] = 100
with open('settings.yaml', 'w') as f:
    yaml.dump(settings, f)

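4. Performance Optimization Guide

4.1 Choosing a Mode

Scenario	Recommended mode	Notes
Frequent log appends	`a`	Writes always go to the end of the file
Random-access reads and writes	`r+`	Combine with seek() and tell()
Binary processing of large files	`rb/wb`	Avoids text encoding and decoding overhead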
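
As an illustration of the r+ row, the sketch below overwrites a fixed-width field in place without rewriting the rest of the file. It uses the binary variant r+b so that offsets are plain byte counts; the record layout is made up for the example.

```python
import os
import tempfile

record_path = os.path.join(tempfile.mkdtemp(), "record.dat")
with open(record_path, "wb") as f:
    f.write(b"id=001;state=NEW;\n")  # made-up record layout

with open(record_path, "r+b") as f:  # read and write, no truncation
    data = f.read()
    offset = data.index(b"NEW")
    f.seek(offset)       # move the pointer to the field
    f.write(b"OLD")      # same length, so the bytes are overwritten in place

with open(record_path, "rb") as f:
    updated = f.read()

print(updated)  # b'id=001;state=OLD;\n'
```

In-place overwrites only work cleanly when the new value has the same length as the old one; otherwise the following bytes are clobbered or left stale.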

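4.2 Tuning Buffering

Python buffers file I/O by default: binary files use a fixed-size buffer, and text files use line buffering only when attached to an interactive terminal. The buffering parameter adjusts this:

# Line buffering (text mode only)
with open('realtime.log', 'w', buffering=1) as f:
    f.write('log entry\n')  # flushed as soon as the newline is written

# Custom buffer size (binary mode)
with open('data.bin', 'wb', buffering=8192) as f:
    f.write(b'x'*16384)  # a 16 KiB write through an 8 KiB buffer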
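
The effect of buffering can be observed directly: a small write stays in Python's buffer until it is flushed. This sketch checks the on-disk size before and after flush(), and adds os.fsync() for cases where the data must also leave the operating system's cache.

```python
import os
import tempfile

buf_path = os.path.join(tempfile.mkdtemp(), "data.bin")

with open(buf_path, "wb") as f:  # default buffering
    f.write(b"hello")
    size_before_flush = os.path.getsize(buf_path)  # data is still in the buffer
    f.flush()                 # hand the buffered bytes to the operating system
    os.fsync(f.fileno())      # ask the OS to write them to the storage device
    size_after_flush = os.path.getsize(buf_path)

print(size_before_flush, size_after_flush)  # 0 5
```

flush() alone is enough for another process to see the data; fsync() matters when durability across a crash or power loss is required.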

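4.3 Memory-Mapped Files

For very large files, the mmap module can help:

import mmap

with open('huge_file.bin', 'r+b') as f:
    mm = mmap.mmap(f.fileno(), 0)  # length 0 maps the whole file; the file must not be empty
    # Work with the file much like a mutable bytes object
    mm.find(b'pattern')
    mm.close()  # changes made through the mapping are written back to the file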
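
A mapping can also be used as a context manager and modified through slice assignment. The following sketch works on a small temporary file with made-up content, finds a byte pattern, and overwrites it in place.

```python
import mmap
import os
import tempfile

map_path = os.path.join(tempfile.mkdtemp(), "huge_file.bin")
with open(map_path, "wb") as f:
    f.write(b"header:pattern:payload")  # made-up content

with open(map_path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mm:   # closed automatically on exit
        pos = mm.find(b"pattern")          # byte offset, or -1 if absent
        mm[pos:pos + 7] = b"PATTERN"       # replacement must keep the same length
        mm.flush()                         # write the change back to the file

with open(map_path, "rb") as f:
    mapped_result = f.read()

print(pos, mapped_result)  # 7 b'header:PATTERN:payload'
```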

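5. Solutions to Common Problems

5.1 Handling Encoding Issues

# Specify the correct encoding (for example, a GBK-encoded file)
with open('chinese.txt', 'r', encoding='gbk') as f:
    content = f.read()

# Ignore characters that cannot be decoded (this silently drops data)
with open('corrupted.txt', 'r', errors='ignore') as f:
    content = f.read()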
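
The sketch below shows what happens when the encoding is wrong. It writes a short GBK-encoded sample, reads it back correctly, then reads it as UTF-8 both strictly and with errors='replace', which marks bad bytes instead of dropping them.

```python
import os
import tempfile

gbk_path = os.path.join(tempfile.mkdtemp(), "chinese.txt")
with open(gbk_path, "w", encoding="gbk") as f:
    f.write("中文")  # sample text stored as GBK bytes

with open(gbk_path, "r", encoding="gbk") as f:
    correct = f.read()  # decodes cleanly with the right encoding

try:
    with open(gbk_path, "r", encoding="utf-8") as f:
        f.read()
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True  # GBK bytes are not valid UTF-8 here

with open(gbk_path, "r", encoding="utf-8", errors="replace") as f:
    replaced = f.read()  # undecodable bytes become U+FFFD

print(correct, decode_failed, "\ufffd" in replaced)
```

errors='replace' is often easier to debug than errors='ignore' because the damage remains visible in the output.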

5.2 File Locking

import fcntl  # Linux/Unix only

with open('critical.dat', 'r') as f:
    fcntl.flock(f, fcntl.LOCK_SH)  # shared (advisory) lock
    # read here
    fcntl.flock(f, fcntl.LOCK_UN)  # release the lock
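
For writers, an exclusive lock with try/finally makes sure the lock is released even if the write fails. This sketch is Unix-only like the example above, and the file name is illustrative. The lock is advisory, so it only protects against other processes that also call flock().

```python
import fcntl  # Linux/Unix only
import os
import tempfile

lock_path = os.path.join(tempfile.mkdtemp(), "critical.dat")

with open(lock_path, "a+", encoding="utf-8") as f:
    fcntl.flock(f, fcntl.LOCK_EX)  # exclusive lock; blocks until it is available
    try:
        f.write("entry\n")
        f.flush()  # make the write visible before releasing the lock
    finally:
        fcntl.flock(f, fcntl.LOCK_UN)  # always release the lock

with open(lock_path, "r", encoding="utf-8") as f:
    locked_content = f.read()

print(repr(locked_content))  # 'entry\n'
```

Adding fcntl.LOCK_NB to the flags makes the call fail immediately with an OSError instead of waiting when another process holds the lock.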

5.3 Path Handling Tips

from pathlib import Path

# Cross-platform path handling
file_path = Path('documents') / 'report.txt'
# Working with the file extension
if file_path.suffix == '.tmp':
    file_path.rename(file_path.with_suffix('.bak'))
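
Path objects can also read and write files directly, which removes the need for open() in simple cases. The sketch below works in a temporary directory, and the file names are illustrative assumptions.

```python
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())
report = base / "documents" / "report.txt"
report.parent.mkdir(parents=True, exist_ok=True)  # create missing folders

report.write_text("quarterly summary\n", encoding="utf-8")
report_text = report.read_text(encoding="utf-8")

draft = report.parent / "draft.tmp"
draft.write_text("draft", encoding="utf-8")
if draft.suffix == ".tmp":
    draft.rename(draft.with_suffix(".bak"))  # draft.tmp becomes draft.bak

names = sorted(p.name for p in report.parent.iterdir())
print(names)  # ['draft.bak', 'report.txt']
```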

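6. Looking Ahead

File handling in Python is moving toward greater efficiency and safety:

Asynchronous file I/O: the third-party aiofiles library (not part of the standard library) supports asynchronous file operations

import asyncio
import aiofiles  # third-party: pip install aiofiles

async def main():
    async with aiofiles.open('data.txt', 'r') as f:
        content = await f.read()

asyncio.run(main())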
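
Where adding a dependency is not an option, a similar effect can be approximated with the standard library by running an ordinary blocking read in a worker thread via asyncio.to_thread() (Python 3.9+). This is a swapped-in technique, not how aiofiles is used, and the helper names are assumptions.

```python
import asyncio
import os
import tempfile


def read_file(path):
    """Ordinary blocking read, executed in a worker thread."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


async def main(path):
    # The event loop stays free while the thread performs the read.
    return await asyncio.to_thread(read_file, path)


data_path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(data_path, "w", encoding="utf-8") as f:
    f.write("async content\n")

async_content = asyncio.run(main(data_path))
print(repr(async_content))  # 'async content\n'
```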

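Memory-mapping improvements: newer Python releases continue to refine the mmap module across platforms
Standardized path handling: pathlib is gradually replacing os.path as the preferred option

Mastering these techniques can noticeably improve data-processing efficiency. In real projects, choose the method that fits the scenario, so that the feature works correctly while system resources are used efficiently.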


June 24, 2025 • Python