
Tuning the Concurrent Performance of a Spring Boot File Upload Endpoint

June 12, 2024 · Java

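Preface

At a customer site, a file upload endpoint (500 KB files) was reaching only 30 QPS, which is worrying concurrent performance. This article records how the problem was tracked down.

Problem 1: Reading the InputStream byte by byte is inefficient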

// Read the uploaded file
Part part = request.getPart("data");
InputStream in = part.getInputStream();

ByteArrayOutputStream str = new ByteArrayOutputStream();
int k;
byte[] file = null;
// in.read() returns a single byte per call, which is the slow part
while ((k = in.read()) != -1) {
    str.write(k);
}
file = str.toByteArray();
str.close();
in.close();
/**
 * Reads the next byte of data from the input stream. The value byte is
 * returned as an {@code int} in the range {@code 0} to
 * {@code 255}. If no byte is available because the end of the stream
 * has been reached, the value {@code -1} is returned. This method
 * blocks until input data is available, the end of the stream is detected,
 * or an exception is thrown.
 *
 * <p> A subclass must provide an implementation of this method.
 *
 * @return     the next byte of data, or {@code -1} if the end of the
 *             stream is reached.
 * @throws     IOException  if an I/O error occurs.
 */
public abstract int read() throws IOException;

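Calling the endpoint directly confirmed that it responded slowly. Investigation showed that the in.read() call above, which reads one byte at a time, is extremely inefficient. With the cause located, the fix is to read 8 KB at a time instead.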

byte[] buffer = new byte[8192];
int bytesRead;
while ((bytesRead = in.read(buffer)) != -1) {
    str.write(buffer, 0, bytesRead);
}
/**
 * Reads some number of bytes from the input stream and stores them into
 * the buffer array <code>b</code>. The number of bytes actually read is
 * returned as an integer.  This method blocks until input data is
 * available, end of file is detected, or an exception is thrown.
 *
 * <p> If the length of <code>b</code> is zero, then no bytes are read and
 * <code>0</code> is returned; otherwise, there is an attempt to read at
 * least one byte. If no byte is available because the stream is at the
 * end of the file, the value <code>-1</code> is returned; otherwise, at
 * least one byte is read and stored into <code>b</code>.
 *
 * <p> The first byte read is stored into element <code>b[0]</code>, the
 * next one into <code>b[1]</code>, and so on. The number of bytes read is,
 * at most, equal to the length of <code>b</code>. Let <i>k</i> be the
 * number of bytes actually read; these bytes will be stored in elements
 * <code>b[0]</code> through <code>b[</code><i>k</i><code>-1]</code>,
 * leaving elements <code>b[</code><i>k</i><code>]</code> through
 * <code>b[b.length-1]</code> unaffected.
 *
 * <p> The <code>read(b)</code> method for class <code>InputStream</code>
 * has the same effect as: <pre><code> read(b, 0, b.length) </code></pre>
 *
 * @param      b   the buffer into which the data is read.
 * @return     the total number of bytes read into the buffer, or
 *             <code>-1</code> if there is no more data because the end of
 *             the stream has been reached.
 * @exception  IOException  If the first byte cannot be read for any reason
 * other than the end of the file, if the input stream has been closed, or
 * if some other I/O error occurs.
 * @exception  NullPointerException  if <code>b</code> is <code>null</code>.
 * @see        java.io.InputStream#read(byte[], int, int)
 */
public int read(byte b[]) throws IOException {
    return read(b, 0, b.length);
}

On JDK 9 or later, the readAllBytes method is a more convenient option. Internally it also reads in 8 KB chunks.
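For reference, the chunked read can be packaged as a small self-contained helper and compared against readAllBytes. This is only a sketch: the class name ChunkedRead and the method name readFully are made up for illustration, and a ByteArrayInputStream stands in for the uploaded Part's stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; the name is illustrative and not from the original project.
final class ChunkedRead {
    private static final int BUFFER_SIZE = 8192;

    /** Drains the stream into a byte array, reading up to 8 KB per call, then closes it. */
    static byte[] readFully(InputStream in) throws IOException {
        try (InputStream source = in;
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[BUFFER_SIZE];
            int bytesRead;
            while ((bytesRead = source.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[500 * 1024]; // stand-in for a 500 KB upload
        byte[] viaLoop = readFully(new ByteArrayInputStream(payload));
        byte[] viaJdk = new ByteArrayInputStream(payload).readAllBytes(); // JDK 9+
        System.out.println("same length: " + (viaLoop.length == viaJdk.length));
    }
}
```

The try-with-resources block closes both streams even if a read fails, which the hand-written close() calls in the original snippet do not guarantee.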

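A file upload endpoint usually handles only business logic, while file storage is typically delegated to a dedicated storage service. There are two approaches: 1. receive the complete file into memory, then call the storage interface; 2. stream the data, reading from ServletRequest#getInputStream while writing to the storage service's stream. I consider approach 2 more reasonable because it saves memory.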
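Approach 2 might look roughly like the sketch below. The class name StreamingUpload and the method name forward are hypothetical, and in-memory streams stand in for both the servlet input stream and the storage service's output stream, whose real API is not shown in this article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; the storage stream is whatever OutputStream the storage client exposes.
final class StreamingUpload {

    /** Copies the upload to the storage stream in 8 KB chunks and returns the byte count. */
    static long forward(InputStream upload, OutputStream storage) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = upload.read(buffer)) != -1) {
            storage.write(buffer, 0, n); // only one buffer's worth is held in memory
            total += n;
        }
        storage.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream upload = new ByteArrayInputStream(new byte[500 * 1024]);
        ByteArrayOutputStream storage = new ByteArrayOutputStream();
        System.out.println("bytes forwarded: " + forward(upload, storage));
    }
}
```

On JDK 9 or later, InputStream#transferTo(OutputStream) performs the same copy in a single call.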

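Problem 2: Tomcat temporary-storage bottleneck

The endpoint accepts uploads as multipart/form-data. After receiving a request, Tomcat stages the request content on local disk, normally under Tomcat's basedir; on my machine the path is {basedir}\work\Tomcat\localhost\ROOT. Disk write throughput therefore caps the endpoint's performance.

Assuming a mechanical hard disk writes at roughly 100 MB/s, disk is not the bottleneck on a gigabit network, and an SSD writes faster still. This setting is therefore only worth tuning on networks of 2 Gbps and above.

The fix is to change sizeThreshold, whose default is 0. The configuration below sets it to 1 MB, so content larger than 1 MB is written to disk and anything smaller is kept in memory.

Regarding sizeThreshold: the logic in the catalina package uses the configured value if the servlet has a multipart configuration, and falls back to 0 otherwise. DiskFileItemFactory in the util package defaults to 10 KB.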

spring:
  servlet:
    multipart:
      file-size-threshold: 1MB

The relevant Tomcat logic is shown below. The parseRequest method processes the request according to RFC 1867.

// org.apache.tomcat.util.http.fileupload.disk.DiskFileItemFactory.java

/**
 * <p>The default {@link org.apache.tomcat.util.http.fileupload.FileItemFactory}
 * implementation. This implementation creates
 * {@link org.apache.tomcat.util.http.fileupload.FileItem} instances which keep
 * their
 * content either in memory, for smaller items, or in a temporary file on disk,
 * for larger items. The size threshold, above which content will be stored on
 * disk, is configurable, as is the directory in which temporary files will be
 * created.</p>
 *
 * <p>If not otherwise configured, the default configuration values are as
 * follows:</p>
 * <ul>
 *   <li>Size threshold is 10 KiB.</li>
 *   <li>Repository is the system default temp directory, as returned by
 *       {@code System.getProperty("java.io.tmpdir")}.</li>
 * </ul>
 * <p>
 * <b>NOTE</b>: Files are created in the system default temp directory with
 * predictable names. This means that a local attacker with write access to that
 * directory can perform a TOCTOU attack to replace any uploaded file with a
 * file of the attackers choice. The implications of this will depend on how the
 * uploaded file is used but could be significant. When using this
 * implementation in an environment with local, untrusted users,
 * {@link #setRepository(File)} MUST be used to configure a repository location
 * that is not publicly writable. In a Servlet container the location identified
 * by the ServletContext attribute {@code javax.servlet.context.tempdir}
 * may be used.
 * </p>
 *
 * <p>Temporary files, which are created for file items, will be deleted when
 * the associated request is recycled.</p>
 *
 * @since FileUpload 1.1
 */
public class DiskFileItemFactory implements FileItemFactory {

    // ----------------------------------------------------- Manifest constants

    /**
     * The default threshold above which uploads will be stored on disk.
     */
    public static final int DEFAULT_SIZE_THRESHOLD = 10240;
}
// org.apache.tomcat.util.http.fileupload.disk.DiskFileItem.java

/**
 * The threshold above which uploads will be stored on disk.
 */
private final int sizeThreshold;

/**
 * Returns an {@link java.io.OutputStream OutputStream} that can
 * be used for storing the contents of the file.
 *
 * @return An {@link java.io.OutputStream OutputStream} that can be used
 *         for storing the contents of the file.
 *
 */
@Override
public OutputStream getOutputStream() {
    if (dfos == null) {
        final File outputFile = getTempFile();
        dfos = new DeferredFileOutputStream(sizeThreshold, outputFile);
    }
    return dfos;
}
// org.apache.tomcat.util.http.fileupload.DeferredFileOutputStream.java

/**
 * An output stream which will retain data in memory until a specified
 * threshold is reached, and only then commit it to disk. If the stream is
 * closed before the threshold is reached, the data will not be written to
 * disk at all.
 * <p>
 * This class originated in FileUpload processing. In this use case, you do
 * not know in advance the size of the file being uploaded. If the file is small
 * you want to store it in memory (for speed), but if the file is large you want
 * to store it to file (to avoid memory issues).
 */
public class DeferredFileOutputStream
    extends ThresholdingOutputStream
{
     /**
     * Constructs an instance of this class which will trigger an event at the
     * specified threshold, and save data to a file beyond that point.
     * The initial buffer size will default to 1024 bytes which is ByteArrayOutputStream's default buffer size.
     *
     * @param threshold  The number of bytes at which to trigger an event.
     * @param outputFile The file to which data is saved beyond the threshold.
     */
    public DeferredFileOutputStream(final int threshold, final File outputFile)
    {
        this(threshold,  outputFile, null, null, null, ByteArrayOutputStream.DEFAULT_SIZE);
    }
}
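
To make the threshold behavior concrete, here is a simplified stand-in for the idea behind DeferredFileOutputStream. It is not Tomcat's implementation: the class ThresholdBuffer and its methods are invented for illustration. Data stays in memory until a write would push the total past the threshold, at which point everything buffered so far is moved to a file and later writes go straight to disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch only; not the Tomcat class.
final class ThresholdBuffer extends OutputStream {
    private final int threshold;
    private final Path spillFile;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream current = memory;
    private long written;

    ThresholdBuffer(int threshold, Path spillFile) {
        this.threshold = threshold;
        this.spillFile = spillFile;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (memory != null && written + len > threshold) {
            // Threshold exceeded: flush what was buffered to disk and switch over.
            OutputStream file = Files.newOutputStream(spillFile);
            memory.writeTo(file);
            memory = null;
            current = file;
        }
        current.write(b, off, len);
        written += len;
    }

    /** True while no data has been written to disk. */
    boolean isInMemory() {
        return memory != null;
    }

    @Override
    public void close() throws IOException {
        current.close();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("upload", ".tmp");
        try (ThresholdBuffer out = new ThresholdBuffer(1024 * 1024, tmp)) {
            out.write(new byte[500 * 1024], 0, 500 * 1024);
            System.out.println("500 KB kept in memory: " + out.isInMemory());
        }
        Files.delete(tmp);
    }
}
```

With a threshold of 0 the very first byte already exceeds the limit, which is why every multipart request touches the disk under the default described above.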

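Problem 3: Network bandwidth bottleneck

A typical in-house enterprise application runs on a LAN that provides at least a stable gigabit link, so ordinary business endpoints do not hit a bandwidth bottleneck. A file upload endpoint is different: even with small files, bandwidth consumption under high concurrency is still large and can become the bottleneck.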

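Taking a gigabit link as an example, the theoretical maximum upload rate = 1000 Mbps ÷ 8 = 125 MB/s. Real deployments rarely reach the theoretical maximum, so estimate 100 MB/s. That gives a ceiling of roughly 200 QPS for 500 KB files, 100 QPS for 1 MB files, and 50 QPS for 2 MB files.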
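
The estimate is simply usable bandwidth divided by file size. A tiny sketch of the arithmetic, with a hypothetical class and method name, treating 500 KB as 0.5 MB as the figures above do:

```java
// Hypothetical helper for the back-of-the-envelope estimate.
final class BandwidthCeiling {

    /** Upper bound on uploads per second for a given usable bandwidth and file size. */
    static double maxQps(double usableMBps, double fileSizeMB) {
        return usableMBps / fileSizeMB;
    }

    public static void main(String[] args) {
        double usable = 100.0; // MB/s, the practical estimate for a gigabit link
        for (double sizeMB : new double[] {0.5, 1.0, 2.0}) {
            System.out.printf("%.1f MB file -> at most %.0f QPS%n", sizeMB, maxQps(usable, sizeMB));
        }
    }
}
```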

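Troubleshooting approach

  • client
    The client that calls the endpoint
  • nginx
    Acts as the reverse proxy server
  • tomcat
    The web container
  • webserver
    The web service, for example a Spring Boot project

Troubleshooting can proceed layer by layer from the outside in. Alternatively, rely on experience and start with the webserver, the layer most likely to contain the bottleneck.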

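  • Reproduce the problem, either by calling the endpoint under high load or by running a concurrent stress test with a tool such as JMeter. Reproducing the problem is the basis for solving it.
  • Look at the request timing and analyze its breakdown, such as Waiting (TTFB) and Content Download. If Content Download takes a long time, for example, bandwidth is the first suspect.
  • nginx performs well and is unlikely to be the bottleneck. Check the nginx access log and compare it with the total request time; if the two differ significantly, investigate nginx itself and the network between nginx and Tomcat.
  • Tomcat is a mainstream web container. The settings that mainly affect its performance are maxThreads, maxConnections, heap size, and garbage collection, and a mature development team will already have reasonable initial values. Check the Tomcat access log and compare it with the webserver's time for the endpoint; if the two differ significantly, investigate Tomcat's own performance.
  • The business logic in the webserver usually accounts for the largest share of total request time. Start by logging at the controller's entry and exit to compute the controller's total time. If the business logic is confirmed to be slow, narrow the scope step by step until the culprit is found.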
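
The entry/exit timing from the last step can be sketched as a small wrapper. The names Timing, Timed, and measure are made up for illustration; in a real Spring Boot controller the same measurement would wrap the handler body and go to the project's logger rather than standard output.

```java
import java.util.function.Supplier;

// Hypothetical timing helper; not part of Spring or of the original project.
final class Timing {

    /** A result together with the wall-clock time it took to produce. */
    static final class Timed<T> {
        final T value;
        final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the work and records the elapsed time between entry and exit. */
    static <T> Timed<T> measure(Supplier<T> work) {
        long start = System.nanoTime();
        T value = work.get();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Timed<>(value, elapsedMillis);
    }

    public static void main(String[] args) {
        Timed<String> result = measure(() -> "upload handled"); // stand-in for the controller body
        System.out.println(result.value + " in " + result.elapsedMillis + " ms");
    }
}
```

Comparing this number with the Tomcat and nginx access-log timings shows which layer the time is spent in.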

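Performance test summary

Test environment

  • Server host and client host
    Because of test-environment constraints, the server and the client run on the same development machine. OS: Windows 10; CPU: Intel(R) Xeon(R) Gold 6242R CPU @ 3.10GHz; memory: 16 GB

  • Disk
    RND512K Q1T1: read 1219.86 MB/s, write 44.88 MB/s

  • JMeter
    400 threads, all threads ramped up within 60 s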

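  • Tomcat
    Tomcat 9, with the following configuration

server:
  tomcat:
    threads:
      max: 400
    max-connections: 10000
    accept-count: 1000

  • Jar startup parameters
    The initial heap size was configured: java -Dfile.encoding=UTF-8 -jar .\xxx.jar -server -Xms4096m -Xmx9000m
    Note that the JVM only applies options placed before -jar. In the order shown, -server, -Xms4096m, and -Xmx9000m are passed to the application as program arguments.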

Test results

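Type                                      Avg response time (ms)   Throughput (/s)
Original state                            2208                     10.18
Optimized byte[]                          3966                     89
Optimized file-size-threshold             1203                     265
Baseline (form-data)                      1279                     279
Baseline (optimized file-size-threshold)  109                      2930
Baseline (empty endpoint)                 28                       12401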
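Original state: the version in use when the performance problem was reported on site. Performance was so poor that the JMeter thread count was reduced to 4. Test upload file: 5 KB.
Optimized byte[]: the code that reads the stream into a byte[] was optimized. Test upload file: 5 KB. Network throughput at this point was 45 MB/s. Production servers are at least 2x as powerful as this test machine, so endpoint performance there should at least double. For a gigabit network no further optimization is needed, because the concurrency bottleneck is network bandwidth.
Optimized file-size-threshold: only files larger than 1 MB are written to disk, so in this test every file is read into memory. Test upload file: 5 KB. Network throughput at this point already exceeded 100 MB/s.
Baseline (form-data): the form-data request carries a simple key parameter and no file, and the server endpoint returns a simple string. This is the performance baseline for form-data endpoints under the default configuration; the bottleneck is disk write speed.
Baseline (optimized file-size-threshold): the form-data request carries a simple key parameter and no file, the server endpoint returns a simple string, and only content larger than 1 MB is written to disk. Comparing it with the previous row shows the speed difference between disk and memory.
Baseline (empty endpoint): a plain GET endpoint with no parameters that returns "hello", used as the upper limit of Tomcat endpoint performance in this environment.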

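On-site solution

Investigation showed that the on-site bottleneck was the network. The site uses a distributed architecture, with the client and the server each deployed on multiple nodes. Clients now call the server through the local loopback address, which reduces network pressure.

[Figure: original architecture]

[Figure: new architecture]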
