Java知音 - MySQL批量插入數據的四種方案（性能測試對比）－鑽石舞台

點擊關注公眾號，實用技術文章及時了解

本文記錄個人使用MySQL插入大數據總結較實用的方案，通過對常用插入大數據的4種方式進行測試，即for循環單條、拼接SQL、批量插入saveBatch()、循環 + 開啟批處理模式，得出比較實用的方案心得。

一、前言

最近趁空閒之餘，在對MySQL數據庫進行插入數據測試，對於如何快速插入數據的操作無從下手，在僅1W數據量的情況下，竟花費接近47s，實在不忍直視！在不斷摸索之後，整理出一些較實用的方案。

二、準備工作

測試環境：SpringBoot項目、MyBatis-Plus框架、MySQL8.0.24、JDK13

前提：SpringBoot項目集成MyBatis-Plus上述文章有配置過程，同時實現IService接口用於進行批量插入數據操作saveBatch()方法

1、Maven項目中pom.xml文件引入的相關依賴如下

<dependencies><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot-starter-web</artifactId></dependency><dependency><groupId>com.baomidou</groupId><artifactId>mybatis-plus-boot-starter</artifactId><version>3.3.1</version></dependency><dependency><groupId>mysql</groupId><artifactId>mysql-connector-java</artifactId></dependency><dependency><groupId>org.projectlombok</groupId><artifactId>lombok</artifactId></dependency></dependencies>

2、application.yml配置屬性文件內容（重點：開啟批處理模式）

server:#端口號port:8080#MySQL連接配置信息（以下僅簡單配置，更多設置可自行查看）spring:datasource:#連接地址（解決UTF-8中文亂碼問題+時區校正）#（rewriteBatchedStatements=true開啟批處理模式）url:jdbc:mysql://127.0.0.1:3306/bjpowernode?useUnicode=true&characterEncoding=UTF-8&serverTimezone=Asia/Shanghai&rewriteBatchedStatements=true#用戶名username:root#密碼password:xxx#連接驅動名稱driver-class-name:com.mysql.cj.jdbc.Driver

3、Entity實體類（測試）

/***Student測試實體類**@Data註解：引入Lombok依賴，可省略Setter、Getter方法*@authorLBF*@date2022/3/1816:06*/@Data@TableName(value="student")publicclassStudent{/**主鍵type:自增*/@TableId(type=IdType.AUTO)privateintid;/**名字*/privateStringname;/**年齡*/privateintage;/**地址*/privateStringaddr;/**地址號@TableField：與表字段映射*/@TableField(value="addr_num")privateStringaddrNum;publicStudent(Stringname,intage,Stringaddr,StringaddrNum){this.name=name;this.age=age;this.addr=addr;this.addrNum=addrNum;}}

4、數據庫student表結構（注意：無索引）

三、測試工作

簡明：完成準備工作後，即對for循環、拼接SQL語句、批量插入saveBatch()、循環插入+開啟批處理模式，該4種插入數據的方式進行測試性能。

注意：測試數據量為5W、單次測試完清空數據表（確保不受舊數據影響）

以下測試內容可能受測試配置環境、測試規範和數據量等諸多因素影響，讀者可自行結合參考進行測試

1、for循環插入（單條）（總耗時：177秒）

總結：測試平均時間約是177秒，實在是不忍直視（捂臉），因為利用for循環進行單條插入時，每次都是在獲取連接(Connection)、釋放連接和資源關閉等操作上，（如果數據量大的情況下）極其消耗資源，導致時間長。

@GetMapping("/for")publicvoidforSingle(){//開始時間longstartTime=System.currentTimeMillis();for(inti=0;i<50000;i++){Studentstudent=newStudent("李毅"+i,24,"張家界市"+i,i+"號");studentMapper.insert(student);}//結束時間longendTime=System.currentTimeMillis();System.out.println("插入數據消耗時間："+(endTime-startTime));}

（1）第一次測試結果：190155 約等於 190秒

（2）第二次測試結果：175926 約等於 176秒（服務未重啟）

（3）第三次測試結果：174726 約等於 174秒（服務重啟）

2、拼接SQL語句（總耗時：2.9秒）

簡明：拼接格式：insert into student(xxxx) value(xxxx),(xxxx),(xxxxx).......

總結：拼接結果就是將所有的數據集成在一條SQL語句的value值上，其由於提交到服務器上的insert語句少了，網絡負載少了，性能也就提上去。

但是當數據量上去後，可能會出現內存溢出、解析SQL語句耗時等情況，但與第一點相比，提高了極大的性能。

@GetMapping("/sql")publicvoidsql(){ArrayList<Student>arrayList=newArrayList<>();longstartTime=System.currentTimeMillis();for(inti=0;i<50000;i++){Studentstudent=newStudent("李毅"+i,24,"張家界市"+i,i+"號");arrayList.add(student);}studentMapper.insertSplice(arrayList);longendTime=System.currentTimeMillis();System.out.println("插入數據消耗時間："+(endTime-startTime));}//使用@Insert註解插入：此處為簡便，不寫Mapper.xml文件@Insert("<script>"+"insertintostudent(name,age,addr,addr_num)values"+"<foreachcollection='studentList'item='item'separator=','>"+"(#{item.name},#{item.age},#{item.addr},#{item.addrNum})"+"</foreach>"+"</script>")intinsertSplice(@Param("studentList")List<Student>studentList);

（1）第一次測試結果：3218 約等於 3.2秒

（2）第二次測試結果：2592 約等於 2.6秒（服務未重啟）

（3）第三次測試結果：3082 約等於 3.1秒（服務重啟）

3、批量插入saveBatch（總耗時：2.7秒）

簡明：使用MyBatis-Plus實現IService接口中批處理saveBatch()方法，對底層源碼進行查看時，可發現其實是for循環插入，但是與第一點相比，為什麼性能上提高了呢？因為利用分片處理（batchSize = 1000） + 分批提交事務的操作，從而提高性能，並非在Connection上消耗性能。

@GetMapping("/saveBatch1")publicvoidsaveBatch1(){ArrayList<Student>arrayList=newArrayList<>();longstartTime=System.currentTimeMillis();//模擬數據for(inti=0;i<50000;i++){Studentstudent=newStudent("李毅"+i,24,"張家界市"+i,i+"號");arrayList.add(student);}//批量插入studentService.saveBatch(arrayList);longendTime=System.currentTimeMillis();System.out.println("插入數據消耗時間："+(endTime-startTime));}

（1）第一次測試結果：2864 約等於 2.9秒

（2）第二次測試結果：2302 約等於 2.3秒（服務未重啟）

（3）第三次測試結果：2893 約等於 2.9秒（服務重啟）

重點注意：MySQL JDBC驅動默認情況下忽略saveBatch()方法中的executeBatch()語句，將需要批量處理的一組SQL語句進行拆散，執行時一條一條給MySQL數據庫，造成實際上是分片插入，即與單條插入方式相比，有提高，但是性能未能得到實質性的提高。

測試：數據庫連接URL地址缺少 rewriteBatchedStatements = true 參數情況

#MySQL連接配置信息spring:datasource:#連接地址（未開啟批處理模式）url:jdbc:mysql://127.0.0.1:3306/bjpowernode?useUnicode=true&characterEncoding=UTF-8&serverTimezone=Asia/Shanghai#用戶名username:root#密碼password:xxx#連接驅動名稱driver-class-name:com.mysql.cj.jdbc.Driver

測試結果：10541 約等於 10.5秒（未開啟批處理模式）

4、循環插入 + 開啟批處理模式（總耗時：1.7秒）（重點：一次性提交）

簡明：開啟批處理，關閉自動提交事務，共用同一個SqlSession之後，for循環單條插入的性能得到實質性的提高；由於同一個SqlSession省去對資源相關操作的耗能、減少對事務處理的時間等，從而極大程度上提高執行效率。（目前個人覺得最優方案）

@GetMapping("/forSaveBatch")publicvoidforSaveBatch(){//開啟批量處理模式BATCH、關閉自動提交事務falseSqlSessionsqlSession=sqlSessionFactory.openSession(ExecutorType.BATCH,false);//反射獲取，獲取MapperStudentMapperstudentMapper=sqlSession.getMapper(StudentMapper.class);longstartTime=System.currentTimeMillis();for(inti=0;i<50000;i++){Studentstudent=newStudent("李毅"+i,24,"張家界市"+i,i+"號");studentMapper.insertStudent(student);}//一次性提交事務sqlSession.commit();//關閉資源sqlSession.close();longendTime=System.currentTimeMillis();System.out.println("總耗時："+(endTime-startTime));}

（1）第一次測試結果：1831 約等於 1.8秒

（2）第二次測試結果：1382 約等於 1.4秒（服務未重啟）

（3）第三次測試結果：1883 約等於 1.9秒（服務重啟）

四、總結

本文記錄個人學習MySQL插入大數據一些方案心得，可得知主要是在獲取連接、關閉連接、釋放資源和提交事務等方面較耗能，其中最需要注意是開啟批處理模式，即URL地址的參數：rewriteBatchedStatements = true，否則也無法發揮作用。

對於測試方案的設定、對考慮不周、理解和編寫錯誤的地方等情況，請多指出，共同學習！

來源：blog.csdn.net/a18505947362/article/

details/123667215

鑽石舞台

鑽石鑽石亮晶晶

Java知音 - MySQL批量插入數據的四種方案（性能測試對比）

歷史上的今天

留言列表

文章搜尋

最新文章

熱門文章

誰來我家

參觀人氣