Welcome to the personal blog of Xu Qinggao (Tea)

A Practical Guide to Efficient Word Document Processing with Java docx4j

July 15, 2025 Java

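Introduction

In office automation and document processing, Microsoft Word's .docx format has become the de facto standard. For developers who need to generate, modify, or process Word documents from Java applications, docx4j is a capable, specialized choice. This article introduces the docx4j library's features, usage, and typical use cases, and demonstrates them through code examples.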

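1. Environment Setup and Basic Configuration

1.1 Maven dependency configuration

<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-core</artifactId>
    <version>8.3.4</version>
</dependency>
<!-- docx4j 8.x also needs a JAXB binding module on the classpath at runtime -->
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
    <version>8.3.4</version>
</dependency>
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j-export-fo</artifactId>
    <version>8.3.4</version>
</dependency>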

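1.2 Initializing the test class

import static org.junit.jupiter.api.Assertions.*;

import java.io.File;

import org.docx4j.jaxb.Context;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.wml.ObjectFactory;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

public class Docx4jTest {
    private WordprocessingMLPackage wordPackage;
    private ObjectFactory factory;

    @BeforeEach
    public void setUp() throws Exception {
        // Start every test from a fresh, empty document
        wordPackage = WordprocessingMLPackage.createPackage();
        factory = Context.getWmlObjectFactory();
    }

    @AfterEach
    public void tearDown() throws Exception {
        // Save the result so it can be inspected in Word afterwards
        if (wordPackage != null) {
            wordPackage.save(new File("test_output.docx"));
        }
    }
}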

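2. Extended Document Operation Examples

2.1 Generating a complex table (with styling and merged cells)

// Requires java.math.BigInteger and org.docx4j.wml.{Tbl, TblPr, TblWidth, Tr, Tc, TcPr, TcPrInner}.
// createTableCell(...) is the helper from the utility class in section 6,
// assumed to be copied into this test class.
@Test
public void testCreateComplexTable() throws Exception {
    // Create a 5x5 table (1 header row + 4 data rows)
    Tbl table = factory.createTbl();

    // Table properties: total width of 5000 dxa (twentieths of a point)
    TblPr tblPr = factory.createTblPr();
    TblWidth tblWidth = factory.createTblWidth();
    tblWidth.setW(BigInteger.valueOf(5000));
    tblWidth.setType("dxa");
    tblPr.setTblW(tblWidth);
    table.setTblPr(tblPr);

    // Header row
    Tr headerRow = factory.createTr();
    for (int i = 0; i < 5; i++) {
        Tc cell = createTableCell("Header " + (i + 1), true, "FF0000");
        headerRow.getContent().add(cell);
    }
    table.getContent().add(headerRow);

    // Data rows (with merged cells)
    for (int row = 0; row < 4; row++) {
        Tr dataRow = factory.createTr();
        for (int col = 0; col < 5; col++) {
            if (row == 1 && col == 1) {
                // Horizontal merge: this cell spans 2 grid columns
                Tc cell = createTableCell("Merged cell", false, "00FF00");
                TcPrInner.GridSpan gridSpan = new TcPrInner.GridSpan();
                gridSpan.setVal(BigInteger.valueOf(2));
                TcPr tcPr = factory.createTcPr();
                tcPr.setGridSpan(gridSpan);
                cell.setTcPr(tcPr);
                dataRow.getContent().add(cell);
                col++; // Skip the column covered by the span
            } else if (row == 2 && col == 0) {
                // Vertical merge: first cell of a 2-row merge
                Tc cell = createTableCell("Vertical merge", false, "0000FF");
                TcPrInner.VMerge vMerge = new TcPrInner.VMerge();
                vMerge.setVal("restart");
                TcPr tcPr = factory.createTcPr();
                tcPr.setVMerge(vMerge);
                cell.setTcPr(tcPr);
                dataRow.getContent().add(cell);
            } else if (row == 3 && col == 0) {
                // Vertical merge: continuation cell
                Tc cell = createTableCell("", false, "0000FF");
                TcPrInner.VMerge vMerge = new TcPrInner.VMerge();
                vMerge.setVal("continue");
                TcPr tcPr = factory.createTcPr();
                tcPr.setVMerge(vMerge);
                cell.setTcPr(tcPr);
                dataRow.getContent().add(cell);
            } else {
                dataRow.getContent().add(
                    createTableCell("Data " + row + "," + col, false, null));
            }
        }
        table.getContent().add(dataRow);
    }

    wordPackage.getMainDocumentPart().addObject(table);

    assertNotNull(table);
    assertEquals(5, table.getContent().size());
}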
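
The table width above is expressed in dxa, and WordprocessingML font sizes are stored in half-points, which makes raw values such as 5000 hard to read at a glance. The following is a minimal sketch of a conversion helper. `WordUnits` and its method names are hypothetical (they are not part of docx4j), and it uses only the JDK.

```java
/** Unit conversions for WordprocessingML measurements (hypothetical helper, not part of docx4j). */
final class WordUnits {
    private static final int DXA_PER_POINT = 20;    // 1 dxa (twip) = 1/20 pt
    private static final int POINTS_PER_INCH = 72;  // 1 in = 72 pt

    private WordUnits() {}

    /** Converts dxa (twentieths of a point) to points. */
    static double dxaToPoints(long dxa) {
        return dxa / (double) DXA_PER_POINT;
    }

    /** Converts inches to dxa, rounded to the nearest whole unit. */
    static long inchesToDxa(double inches) {
        return Math.round(inches * POINTS_PER_INCH * DXA_PER_POINT);
    }

    /** Converts a font size in points to the half-point value stored in w:sz. */
    static long pointsToHalfPoints(double points) {
        return Math.round(points * 2);
    }
}
```

With these helpers, the 5000 dxa table width works out to 250 pt (roughly 3.47 inches), and a 16 pt font corresponds to a size value of 32.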

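2.2 Managing document styles

// Requires java.math.BigInteger, org.docx4j.wml.* and
// org.docx4j.openpackaging.parts.WordprocessingML.StyleDefinitionsPart.
@Test
public void testDocumentStyles() throws Exception {
    // createPackage() already attaches a styles part with default styles,
    // so reuse it instead of constructing a new part
    StyleDefinitionsPart stylesPart =
        wordPackage.getMainDocumentPart().getStyleDefinitionsPart();
    Styles styles = stylesPart.getJaxbElement();

    // Drop any existing Heading1 definition to avoid duplicate style IDs
    styles.getStyle().removeIf(s -> "Heading1".equals(s.getStyleId()));

    // Heading 1 style
    Style titleStyle = factory.createStyle();
    titleStyle.setType("paragraph");
    titleStyle.setStyleId("Heading1");
    Style.Name name = factory.createStyleName();
    name.setVal("heading 1");
    titleStyle.setName(name);

    // Paragraph properties: outline level 0, centered
    PPr stylePPr = factory.createPPr();
    PPrBase.OutlineLvl outlineLvl = new PPrBase.OutlineLvl();
    outlineLvl.setVal(BigInteger.ZERO);
    stylePPr.setOutlineLvl(outlineLvl);
    Jc jc = factory.createJc();
    jc.setVal(JcEnumeration.CENTER);
    stylePPr.setJc(jc);
    titleStyle.setPPr(stylePPr);

    // Run properties: bold, 16 pt (sz is in half-points), blue
    RPr rpr = factory.createRPr();
    rpr.setB(new BooleanDefaultTrue());
    HpsMeasure size = new HpsMeasure();
    size.setVal(BigInteger.valueOf(32));
    rpr.setSz(size);
    Color color = new Color();
    color.setVal("2F5496");
    rpr.setColor(color);
    titleStyle.setRPr(rpr);

    styles.getStyle().add(titleStyle);

    // Use the style
    P p = factory.createP();
    PPr ppr = factory.createPPr();
    PPrBase.PStyle pStyle = new PPrBase.PStyle();
    pStyle.setVal("Heading1");
    ppr.setPStyle(pStyle);
    p.setPPr(ppr);
    R r = factory.createR();
    Text t = factory.createText();
    t.setValue("This paragraph uses the Heading 1 style");
    r.getContent().add(t);
    p.getContent().add(r);
    wordPackage.getMainDocumentPart().addObject(p);

    // Verify that exactly one Heading1 definition exists
    assertNotNull(wordPackage.getMainDocumentPart().getStyleDefinitionsPart());
    assertEquals(1, styles.getStyle().stream()
        .filter(s -> "Heading1".equals(s.getStyleId())).count());
}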

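3. Advanced Features

3.1 Generating a document with a table of contents

// Requires org.docx4j.wml.{P, R, Text, FldChar, STFldCharType} and
// org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart.
@Test
public void testGenerateToc() throws Exception {
    MainDocumentPart mainPart = wordPackage.getMainDocumentPart();

    // Add a few styled headings. Heading1/Heading2 are assumed to be defined
    // (see section 2.2 for how to define a style)
    mainPart.addStyledParagraphOfText("Heading1", "Chapter 1 Introduction");
    mainPart.addStyledParagraphOfText("Heading2", "1.1 Background");
    mainPart.addStyledParagraphOfText("Heading1", "Chapter 2 Implementation");
    mainPart.addStyledParagraphOfText("Heading2", "2.1 Technology choices");

    // Build the TOC as a complex field: begin, instruction, separate,
    // placeholder result, end. Each field character sits inside its own run.
    P tocParagraph = factory.createP();

    R beginRun = factory.createR();
    FldChar fldCharBegin = factory.createFldChar();
    fldCharBegin.setFldCharType(STFldCharType.BEGIN);
    beginRun.getContent().add(factory.createRFldChar(fldCharBegin));
    tocParagraph.getContent().add(beginRun);

    // The field code goes into w:instrText, not ordinary w:t text
    R instrRun = factory.createR();
    Text instrText = factory.createText();
    instrText.setSpace("preserve");
    instrText.setValue(" TOC \\o \"1-3\" \\h \\z \\u ");
    instrRun.getContent().add(factory.createRInstrText(instrText));
    tocParagraph.getContent().add(instrRun);

    R separateRun = factory.createR();
    FldChar fldCharSep = factory.createFldChar();
    fldCharSep.setFldCharType(STFldCharType.SEPARATE);
    separateRun.getContent().add(factory.createRFldChar(fldCharSep));
    tocParagraph.getContent().add(separateRun);

    // Placeholder text shown until the field is updated
    R placeholderRun = factory.createR();
    Text placeholderText = factory.createText();
    placeholderText.setValue("The table of contents will be generated here...");
    placeholderRun.getContent().add(placeholderText);
    tocParagraph.getContent().add(placeholderRun);

    R endRun = factory.createR();
    FldChar fldCharEnd = factory.createFldChar();
    fldCharEnd.setFldCharType(STFldCharType.END);
    endRun.getContent().add(factory.createRFldChar(fldCharEnd));
    tocParagraph.getContent().add(endRun);

    // Insert the TOC at the start of the document
    mainPart.getContent().add(0, tocParagraph);

    // Note: docx4j's FieldUpdater handles DOCPROPERTY/DOCVARIABLE fields, not TOC.
    // Word fills in the entries when the user updates the field (F9); docx4j's
    // org.docx4j.toc.TocGenerator is the route for generating them in code.

    // Verify that the TOC paragraph is the first body element
    assertTrue(mainPart.getContent().get(0) instanceof P);
}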
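
The TOC field code used above is a plain string of switches, so it is easy to get wrong when the heading range changes. As a rough sketch, the instruction can be assembled by a small helper. `TocInstruction` is a hypothetical name introduced here for illustration, and the code depends only on the JDK.

```java
/** Builds TOC field instructions (hypothetical helper, not part of docx4j). */
final class TocInstruction {
    private TocInstruction() {}

    /**
     * Returns a TOC field code covering heading levels fromLevel..toLevel.
     * Switches: \o = heading level range, \h = make entries hyperlinks,
     * \z = hide tab leaders and page numbers in Web layout view,
     * \u = also use paragraphs with an applied outline level.
     */
    static String forHeadingLevels(int fromLevel, int toLevel) {
        if (fromLevel < 1 || toLevel > 9 || fromLevel > toLevel) {
            throw new IllegalArgumentException(
                "levels must satisfy 1 <= fromLevel <= toLevel <= 9");
        }
        // Backslashes are not special in format strings; only % is
        return String.format(" TOC \\o \"%d-%d\" \\h \\z \\u ", fromLevel, toLevel);
    }
}
```

Calling `TocInstruction.forHeadingLevels(1, 3)` yields the same instruction string that the test sets on the instruction text; the leading and trailing spaces are kept deliberately, which is why the text element is marked with `preserve`.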

3.2 Document protection (editing restrictions)

// Assumes docx4j's ProtectDocument helper
// (org.docx4j.openpackaging.packages.ProtectDocument) plus
// org.docx4j.wml.{CTSettings, STDocProtect} and java.io.*.
@Test
public void testDocumentProtection() throws Exception {
    // Add some content
    wordPackage.getMainDocumentPart().addParagraphOfText("This is a protected document");

    // Restrict editing to read-only, guarded by a password
    ProtectDocument protection = new ProtectDocument(wordPackage);
    protection.restrictEditing(STDocProtect.READ_ONLY, "123456");

    // Save and reload to verify that the setting was persisted
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    wordPackage.save(baos);

    ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
    WordprocessingMLPackage protectedPackage = WordprocessingMLPackage.load(bais);

    CTSettings settings = protectedPackage.getMainDocumentPart()
        .getDocumentSettingsPart().getContents();
    assertNotNull(settings.getDocumentProtection());
    assertEquals(STDocProtect.READ_ONLY, settings.getDocumentProtection().getEdit());

    // This kind of protection is enforced by Word's UI, not by docx4j:
    // the library can still modify the package, so no exception is expected
    protectedPackage.getMainDocumentPart().addParagraphOfText("Attempted modification");

    // Remove the restriction by clearing the setting, then save
    settings.setDocumentProtection(null);
    protectedPackage.getMainDocumentPart().addParagraphOfText("Protection removed");
    protectedPackage.save(new File("unprotected.docx"));
}

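4. Test Utilities and Helper Methods

4.1 Document comparison utility

// Requires java.io.File, org.docx4j.XmlUtils,
// org.docx4j.openpackaging.packages.WordprocessingMLPackage,
// org.docx4j.openpackaging.parts.JaxbXmlPart and
// org.docx4j.openpackaging.parts.WordprocessingML.StyleDefinitionsPart.
public class DocxComparator {
    public static boolean compareDocs(File doc1, File doc2) throws Exception {
        WordprocessingMLPackage pkg1 = WordprocessingMLPackage.load(doc1);
        WordprocessingMLPackage pkg2 = WordprocessingMLPackage.load(doc2);

        // Compare the main document content first
        if (!compareParts(pkg1.getMainDocumentPart(), pkg2.getMainDocumentPart())) {
            return false;
        }

        // Then compare the styles
        return compareStyles(pkg1, pkg2);
    }

    private static boolean compareParts(JaxbXmlPart<?> part1, JaxbXmlPart<?> part2) {
        // Minimal comparison: serialize both parts and compare the XML strings.
        // Swap in a structural diff if formatting-insensitive comparison is needed.
        String xml1 = XmlUtils.marshaltoString(part1.getJaxbElement());
        String xml2 = XmlUtils.marshaltoString(part2.getJaxbElement());
        return xml1.equals(xml2);
    }

    private static boolean compareStyles(WordprocessingMLPackage pkg1,
                                         WordprocessingMLPackage pkg2) {
        StyleDefinitionsPart styles1 = pkg1.getMainDocumentPart().getStyleDefinitionsPart();
        StyleDefinitionsPart styles2 = pkg2.getMainDocumentPart().getStyleDefinitionsPart();
        if (styles1 == null || styles2 == null) {
            return styles1 == styles2; // equal only if both are missing
        }
        return compareParts(styles1, styles2);
    }
}

// Test case
@Test
public void testDocumentComparison() throws Exception {
    File generated = new File("test_output.docx");
    File other = new File("test_output_other.docx");

    // Generate two test documents with different content
    WordprocessingMLPackage pkg = WordprocessingMLPackage.createPackage();
    pkg.getMainDocumentPart().addParagraphOfText("Test content");
    pkg.save(generated);

    WordprocessingMLPackage otherPkg = WordprocessingMLPackage.createPackage();
    otherPkg.getMainDocumentPart().addParagraphOfText("Different content");
    otherPkg.save(other);

    // Different documents must not compare equal
    assertFalse(DocxComparator.compareDocs(other, generated));

    // A document always equals itself
    assertTrue(DocxComparator.compareDocs(generated, generated));
}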
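
A lighter-weight alternative to loading both packages is to compare the raw main document part directly, since a .docx file is a ZIP archive whose body lives in the entry `word/document.xml`. The sketch below is not the approach used by the comparator above; it swaps in a byte-level comparison using only `java.util.zip`. `DocxZipCompare` and its methods are hypothetical names, and `zipOf` exists only so the sketch can be tried without real files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Byte-level comparison of the main part of two .docx archives (hypothetical helper). */
final class DocxZipCompare {
    static final String MAIN_PART = "word/document.xml";

    private DocxZipCompare() {}

    /** Returns the bytes of the named entry, or null if the archive has no such entry. */
    static byte[] readEntry(byte[] zipBytes, String entryName) {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry e = zin.getNextEntry(); e != null; e = zin.getNextEntry()) {
                if (e.getName().equals(entryName)) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[8192];
                    for (int n; (n = zin.read(buf)) != -1; ) {
                        out.write(buf, 0, n);
                    }
                    return out.toByteArray();
                }
            }
            return null;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** True if both archives contain a main part and the two parts are byte-identical. */
    static boolean sameMainPart(byte[] docxA, byte[] docxB) {
        byte[] a = readEntry(docxA, MAIN_PART);
        byte[] b = readEntry(docxB, MAIN_PART);
        return a != null && Arrays.equals(a, b);
    }

    /** Demo helper: builds a one-entry ZIP in memory. */
    static byte[] zipOf(String entryName, String content) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            zout.putNextEntry(new ZipEntry(entryName));
            zout.write(content.getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }
}
```

For real files, the byte arrays could come from `java.nio.file.Files.readAllBytes`. A byte-level check is strict: two documents that render identically but were serialized differently will compare as unequal, so it suits regression tests against a known-good output rather than semantic comparison.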

4.2 Performance testing utility

public class Docx4jBenchmark {
    public static long measureDocumentCreation(int paragraphCount) throws Exception {
        long start = System.currentTimeMillis();

        WordprocessingMLPackage wordPackage = WordprocessingMLPackage.createPackage();
        ObjectFactory factory = Context.getWmlObjectFactory();

        // Build the requested number of single-run paragraphs
        for (int i = 0; i < paragraphCount; i++) {
            P p = factory.createP();
            R r = factory.createR();
            Text t = factory.createText();
            t.setValue("Paragraph " + (i + 1));
            r.getContent().add(t);
            p.getContent().add(r);
            wordPackage.getMainDocumentPart().addObject(p);
        }

        // Save to memory so that disk I/O does not skew the measurement
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        wordPackage.save(out);

        return System.currentTimeMillis() - start;
    }
}

// Performance test case
@Test
public void testPerformance() throws Exception {
    int[] testSizes = {100, 1000, 5000};

    for (int size : testSizes) {
        long time = Docx4jBenchmark.measureDocumentCreation(size);
        System.out.printf("Generating a document with %d paragraphs took %d ms%n", size, time);
        assertTrue(time < 10000, "Performance test failed: generation took too long");
    }
}
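
A single wall-clock measurement like the one above includes JVM warm-up and one-off initialization, so the first run is often much slower than later ones. A minimal sketch of a steadier measurement, assuming nothing beyond the JDK, runs the task a few times untimed and then reports the median of several timed runs. `TimingHarness` is a hypothetical name; for rigorous numbers a dedicated tool such as JMH is the usual choice.

```java
import java.util.Arrays;

/** Warm-up plus median timing for a repeatable task (hypothetical helper). */
final class TimingHarness {
    private TimingHarness() {}

    /**
     * Runs the task warmupRuns times without timing it, then returns the
     * median duration in milliseconds over measuredRuns timed executions.
     */
    static double medianMillis(Runnable task, int warmupRuns, int measuredRuns) {
        if (warmupRuns < 0 || measuredRuns < 1) {
            throw new IllegalArgumentException("need warmupRuns >= 0 and measuredRuns >= 1");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long[] nanos = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime(); // monotonic, unlike currentTimeMillis
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        int mid = measuredRuns / 2;
        double medianNanos = (measuredRuns % 2 == 1)
            ? nanos[mid]
            : (nanos[mid - 1] + nanos[mid]) / 2.0;
        return medianNanos / 1_000_000.0;
    }
}
```

To time document creation with it, the task would wrap the call to `measureDocumentCreation` in a lambda that rethrows its checked exception as a `RuntimeException`. The median is used rather than the mean so that a single garbage-collection pause does not dominate the result.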

5. Integration Test Examples

5.1 End-to-end document generation test

// addCoverPage, addTableOfContents, addChapter, addSampleTable, addSampleChart
// and addHeaderFooter are placeholder helpers that are not shown in this article.
@Test
public void testEndToEndDocumentGeneration() throws Exception {
    // 1. Create the document
    WordprocessingMLPackage wordPackage = WordprocessingMLPackage.createPackage();

    // 2. Add a cover page
    addCoverPage(wordPackage);

    // 3. Add a table of contents
    addTableOfContents(wordPackage);

    // 4. Add chapter content
    addChapter(wordPackage, "1. Introduction", "This is the introduction of the document...");
    addChapter(wordPackage, "2. Implementation", "Detailed implementation notes...");

    // 5. Add a table
    addSampleTable(wordPackage);

    // 6. Add a chart
    addSampleChart(wordPackage);

    // 7. Add header and footer
    addHeaderFooter(wordPackage);

    // 8. Save the document
    File output = new File("full_document.docx");
    wordPackage.save(output);

    // Verify the file
    assertTrue(output.exists());
    assertTrue(output.length() > 1024); // The document should be larger than 1 KB

    // Verify the document structure
    WordprocessingMLPackage loaded = WordprocessingMLPackage.load(output);
    assertNotNull(loaded.getMainDocumentPart());
    assertNotNull(loaded.getMainDocumentPart().getStyleDefinitionsPart());

    // Verify the content
    String xml = XmlUtils.marshaltoString(
        loaded.getMainDocumentPart().getJaxbElement());
    assertTrue(xml.contains("Introduction"));
    assertTrue(xml.contains("Implementation"));
}

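5.2 Exception handling tests

@Test
public void testExceptionHandling() {
    // Loading a file that does not exist
    assertThrows(Docx4JException.class, () -> {
        WordprocessingMLPackage.load(new File("nonexistent.docx"));
    });

    // Saving to a null target. The cast is needed because save(File) and
    // save(OutputStream) would otherwise be ambiguous. The exact exception
    // type is not specified, so only the failure itself is asserted.
    assertThrows(Exception.class, () -> {
        WordprocessingMLPackage pkg = WordprocessingMLPackage.createPackage();
        pkg.save((File) null);
    });

    // Referencing an undefined style: docx4j does not validate style IDs,
    // so the document still saves (Word typically falls back to the default style)
    assertDoesNotThrow(() -> {
        WordprocessingMLPackage pkg = WordprocessingMLPackage.createPackage();
        P p = factory.createP();
        PPr ppr = factory.createPPr();
        PPrBase.PStyle pStyle = new PPrBase.PStyle();
        pStyle.setVal("InvalidStyle");
        ppr.setPStyle(pStyle);
        p.setPPr(ppr);
        pkg.getMainDocumentPart().addObject(p);
        pkg.save(new File("invalid_style.docx"));
    });
}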
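
When a load fails, it helps to distinguish "file missing" from "file is not a .docx at all" before handing the path to docx4j. Since a .docx is a ZIP archive, one inexpensive pre-check is to look for the ZIP local-file-header signature (`PK\003\004`) in the first four bytes. The following is only a sketch with hypothetical names (`DocxPrecheck`, `hasZipSignature`, `looksLikeZip`). It confirms the container format, not that the archive is a valid Word document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Cheap sanity check before loading a .docx (hypothetical helper, JDK only). */
final class DocxPrecheck {
    private DocxPrecheck() {}

    /** True if the first four bytes are the ZIP local file header signature "PK\003\004". */
    static boolean hasZipSignature(byte[] header) {
        return header != null && header.length >= 4
            && header[0] == 0x50 && header[1] == 0x4B
            && header[2] == 0x03 && header[3] == 0x04;
    }

    /** True if the path is a readable regular file that starts with the ZIP signature. */
    static boolean looksLikeZip(Path file) {
        if (!Files.isRegularFile(file)) {
            return false;
        }
        byte[] header = new byte[4];
        try (InputStream in = Files.newInputStream(file)) {
            int read = 0;
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    return false; // file is shorter than 4 bytes
                }
                read += n;
            }
            return hasZipSignature(header);
        } catch (IOException e) {
            return false; // unreadable files are treated as "not a ZIP"
        }
    }
}
```

A caller could run `looksLikeZip` first and report a clear message for a renamed .doc or a truncated upload, while still wrapping the actual load in its own exception handling for archives that pass the check but are malformed inside.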

6. Utility Methods

Document generation utility class

import java.io.File;
import java.util.List;

import org.docx4j.jaxb.Context;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.wml.*;

public class DocxGenerator {
    private final WordprocessingMLPackage wordPackage;
    private final ObjectFactory factory;

    public DocxGenerator() throws Docx4JException {
        this.wordPackage = WordprocessingMLPackage.createPackage();
        this.factory = Context.getWmlObjectFactory();
    }

    /** Adds a heading paragraph using the style "Heading" + level. */
    public void addTitle(String text, int level) {
        P p = factory.createP();
        PPr ppr = factory.createPPr();
        PPrBase.PStyle pStyle = new PPrBase.PStyle();
        pStyle.setVal("Heading" + level);
        ppr.setPStyle(pStyle);
        p.setPPr(ppr);

        R r = factory.createR();
        Text t = factory.createText();
        t.setValue(text);
        r.getContent().add(t);
        p.getContent().add(r);

        wordPackage.getMainDocumentPart().addObject(p);
    }

    public void addParagraph(String text) {
        wordPackage.getMainDocumentPart().addParagraphOfText(text);
    }

    /** Adds a table; the first row of data is rendered as a bold header row. */
    public void addTable(List<List<String>> data) {
        Tbl table = factory.createTbl();

        // Header row
        if (!data.isEmpty()) {
            Tr headerRow = factory.createTr();
            for (String header : data.get(0)) {
                headerRow.getContent().add(createTableCell(header, true, null));
            }
            table.getContent().add(headerRow);
        }

        // Data rows
        for (int i = 1; i < data.size(); i++) {
            Tr dataRow = factory.createTr();
            for (String cellData : data.get(i)) {
                dataRow.getContent().add(createTableCell(cellData, false, null));
            }
            table.getContent().add(dataRow);
        }

        wordPackage.getMainDocumentPart().addObject(table);
    }

    public void saveToFile(String filename) throws Docx4JException {
        wordPackage.save(new File(filename));
    }

    /** Creates a cell with one text run; color is a hex RGB text color or null. */
    private Tc createTableCell(String text, boolean isHeader, String color) {
        Tc cell = factory.createTc();
        P p = factory.createP();
        R r = factory.createR();
        Text t = factory.createText();
        t.setValue(text);
        r.getContent().add(t);

        if (isHeader || color != null) {
            RPr rpr = factory.createRPr();
            if (isHeader) {
                rpr.setB(new BooleanDefaultTrue());
            }
            if (color != null) {
                Color textColor = new Color();
                textColor.setVal(color);
                rpr.setColor(textColor);
            }
            r.setRPr(rpr);
        }

        p.getContent().add(r);
        cell.getContent().add(p);
        return cell;
    }
}

// Usage example (requires java.util.ArrayList, java.util.Arrays, java.util.List)
@Test
public void testDocxGenerator() throws Exception {
    DocxGenerator generator = new DocxGenerator();

    generator.addTitle("Test document", 1);
    generator.addParagraph("This is an automatically generated test document");

    List<List<String>> tableData = new ArrayList<>();
    tableData.add(Arrays.asList("ID", "Name", "Quantity"));
    tableData.add(Arrays.asList("1", "Product A", "100"));
    tableData.add(Arrays.asList("2", "Product B", "200"));
    generator.addTable(tableData);

    generator.saveToFile("generated_doc.docx");

    File output = new File("generated_doc.docx");
    assertTrue(output.exists());
}

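Conclusion

This article presented an extended set of docx4j examples together with test cases. Working through them, developers can:

  • Learn how to implement docx4j's more advanced features
  • Learn how to write effective test cases for docx4j code
  • See basic approaches to performance measurement and exception handling
  • Use the provided utility classes to simplify day-to-day development

In real projects, it is advisable to:

  • Wrap docx4j in utility classes tailored to business requirements
  • Build a thorough test suite
  • Monitor the performance of document generation
  • Handle exceptions carefully and log them

With these practices in place, docx4j can serve as the basis for a stable and efficient document processing system.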
