Cross-language refactoring detection method based on edit sequence
CSTR:
Author:
Affiliation:

Author biography:

Corresponding author:

CLC number:

Fund project:

National Natural Science Foundation of China (61440012); Natural Science Foundation of Hebei Province (F2023208001); Hebei Province Funding Program for Returned Overseas Scholars (C20230358)




    Abstract:

Aiming at the problems of unreliable commit messages, caused by developers not consistently recording refactoring operations, and the single-language limitation of deep learning-based refactoring detection methods, a cross-language refactoring detection method named RefCode was proposed. Firstly, refactoring collection tools were employed to collect commit messages, code change information, and refactoring types from different programming languages; edit sequences were generated from the code change information, and all the data were combined into a dataset. Secondly, the CodeBERT pre-trained model was combined with a BiLSTM-attention model, and the combined model was trained and tested on the dataset. Finally, the effectiveness of the proposed method was evaluated from six perspectives. The results show that RefCode achieves a significant improvement of about 50 percentage points in both precision and recall compared with the refactoring detection method that uses only commit messages as input to an LSTM model. The research realizes cross-language refactoring detection, effectively compensates for the unreliability of commit messages, and provides a reference for the detection of other programming languages and refactoring types.
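To make the edit-sequence step above concrete, the following Python sketch shows one plausible way to turn the before/after code of a change into a flat, token-level edit sequence. The whitespace tokenization, the <del>/<ins> markers and the to_edit_sequence helper are illustrative assumptions for this page, not the representation actually used by RefCode.

# Illustrative only: derive a token-level edit sequence from a before/after code pair
# using difflib opcodes. RefCode's real edit-sequence format may differ.
import difflib

def to_edit_sequence(old_code: str, new_code: str) -> list:
    """Flatten the diff between two code versions into one token sequence."""
    old_tokens = old_code.split()
    new_tokens = new_code.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens)
    sequence = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            sequence.extend(old_tokens[i1:i2])                    # unchanged context tokens
        elif tag in ("delete", "replace"):
            sequence.extend("<del> " + t for t in old_tokens[i1:i2])
        if tag in ("insert", "replace"):
            sequence.extend("<ins> " + t for t in new_tokens[j1:j2])
    return sequence

# Example: a rename-style change surfaces as paired <del>/<ins> tokens.
before = "public int getSize() { return this.size; }"
after = "public int length() { return this.size; }"
print(to_edit_sequence(before, after))
# ['public', 'int', '<del> getSize()', '<ins> length()', '{', 'return', 'this.size;', '}']

In a dataset built as described above, such a sequence would be combined with the corresponding commit message and the refactoring-type label to form one training example.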

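The classification step can be pictured along the same lines. The sketch below is a minimal PyTorch/HuggingFace version of a CodeBERT encoder feeding a BiLSTM with attention pooling; the checkpoint name microsoft/codebert-base, the hidden size, the attention-pooling details and the number of refactoring types (13) are assumptions for illustration, not the configuration reported in the paper.

# Minimal sketch, assuming a PyTorch + transformers setup; hyper-parameters and the
# label set are placeholders, not RefCode's reported configuration.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class RefactoringClassifier(nn.Module):
    """CodeBERT encoder followed by a BiLSTM with additive attention pooling."""

    def __init__(self, num_refactoring_types: int, hidden_size: int = 256):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("microsoft/codebert-base")
        self.bilstm = nn.LSTM(
            input_size=self.encoder.config.hidden_size,
            hidden_size=hidden_size,
            batch_first=True,
            bidirectional=True,
        )
        self.attention = nn.Linear(2 * hidden_size, 1)        # per-token attention scores
        self.classifier = nn.Linear(2 * hidden_size, num_refactoring_types)

    def forward(self, input_ids, attention_mask):
        # Contextual token embeddings from CodeBERT.
        token_states = self.encoder(input_ids=input_ids,
                                    attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.bilstm(token_states)
        # Attention pooling: weight each token state, ignoring padding positions.
        scores = self.attention(lstm_out).squeeze(-1)
        scores = scores.masked_fill(attention_mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)
        pooled = (weights * lstm_out).sum(dim=1)
        return self.classifier(pooled)

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = RefactoringClassifier(num_refactoring_types=13)
batch = tokenizer(["fix bug <del> getSize() <ins> length()"],
                  padding=True, truncation=True, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])   # shape: (1, 13)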
Cite this article

LI Tao, ZHANG Dongwen, ZHANG Yang, ZHENG Kun. Cross-language refactoring detection method based on edit sequence[J]. Journal of Hebei University of Science and Technology, 2024, 45(6): 627-635.

History
  • Received: 2023-12-20
  • Revised: 2024-04-06
  • Accepted:
  • Published online: 2025-01-02
  • Publication date: