主页 > 软件开发  > 

MySQL按序批量操作大量数据

MySQL按序批量操作大量数据
MySQL按序批量操作大量数据(Java、springboot、mybatisplus、ElasticSearch)

以同步全量MySQL数据到ElasticSearch为例。

核心代码

业务逻辑:

public boolean syncToElasticsearch() { log.info("Starting data synchronization to Elasticsearch."); // 获取最大id和最小id IdRange idRange = newsMapper.getIdRange(); log.info("idRange is:{}", JSON.toJSONString(idRange)); if (idRange == null || idRange.getMinId() <= 0 || idRange.getMaxId() <= 0) { log.warn("Invalid id range or no data found in MySQL. Sync process aborted."); return false; } long pageSize = 200L; long startId = idRange.getMinId(); try { // 循环处理所有数据 while (startId <= idRange.getMaxId()) { // 业务逻辑 可以替换成自己需要的 log.info("syncToElasticsearch startId:{}", startId); List<News> newsList = newsService.getByIdRange(startId, pageSize); log.info("syncToElasticsearch newsList size:{}", newsList.size()); if (CollectionUtils.isEmpty(newsList)) { break; } newsEsService.bulkUpsertToElasticsearch(getNewsEsDTOList(newsList)); // 更新startId startId = newsList.get(newsList.size() - 1).getId() + 1; log.info("Synced {} Newss to Elasticsearch, current id is:{}", newsList.size(), startId); } } catch (Exception e) { log.error("Error occurred during News data synchronization to Elasticsearch.", e); return false; } log.info("Data synchronization to Elasticsearch completed."); return true; }

mapper:

public interface NewsMapper extends BaseMapper<News> { /** * 获取最小和最大id值的范围 * * @return */ @Select("SELECT MIN(id) AS minId, MAX(id) AS maxId FROM news") IdRange getIdRange(); }

newsService:

public List<News> getByIdRange(long startId, long pageSize) { if (startId <= 0 || pageSize <= 0) { return Collections.emptyList(); } return this.list(new LambdaQueryWrapper<News>() .ge(News::getId, startId) .orderByAsc(News::getId) .last("limit " + pageSize)); }

实体类定义:

@Data public class IdRange { private Long minId; private Long maxId; } 为什么不直接用分页?

页面深度越大查询性能越慢,当表有大量数据时处理后面的数据会很耗时。

标签:

MySQL按序批量操作大量数据由讯客互联软件开发栏目发布,感谢您对讯客互联的认可,以及对我们原创作品以及文章的青睐,非常欢迎各位朋友分享到个人网站或者朋友圈,但转载请说明文章出处“MySQL按序批量操作大量数据