基于元路径异构网络嵌入的姓名实体消歧方法
CSTR:
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

基金项目:

中国留学基金委地方合作项目(201808130283); 中国教育部人工智能协同育人项目(201801003011); 河北科技大学校立课题(82/1182108); 河北科技大学雾霾与空气污染防治科研项目(82/1182169); 河北省科技支撑计划项目(17210104D, 18210109D); 河北省高等学校科学技术研究项目(ZD2015099); 河北省高层次人才资助项目(A2016002015)


Disambiguation method of name entities embedded in meta-path heterogeneous networks
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    为了解决大型学术数据库中重名作者的歧义消解问题,提出了基于元路径异构网络嵌入的姓名实体消歧模型。使用大型在线学术搜索系统DBLP上的公开数据集,首先抽取学术出版物的作者信息、标题和会议期刊名称等特征属性,再利用word2vec模型工具生成的特征属性词嵌入输入到GRU网络中进行训练,构造出一个PHNet矩阵网络进行随机游走操作,从而捕捉不同类型节点之间的关系,最后进行相似节点的划分,完成姓名消歧工作。实验结果显示,新方法的精确度为0.865,召回率为0.792,F1值为0.815。基于元路径的异构网络嵌入模型的精确度、召回率等指标都优于对比模型。因此,所提出的模型在提高大型学术数据库的消歧精准度方面具有良好的应用前景。

    Abstract:

    In order to solve the problem of disambiguation of duplicate authors in large academic databases, a name entity disambiguation model based on meta-path heterogeneous network was proposed. Based on the public data of the large online academic search system DBLP, the author information, title, name of conference journal and other characteristic attributes of academic publications were extracted first. Then the characteristic attribute words generated by the word2vec model tool were embedded into the GRU network for training, so that a PHNet matrix network for random walk operation was constructed to capture the relationship between different types of nodes and finally similar nodes were divided to complete the name disambiguation. The experimental results show that the accuracy of the method is 0.865, the recall rate is 0.792, and the F1 value is 0.815.The meta-path-based heterogeneous network embedding model is superior to the comparison model in terms of accuracy and recall rate. Therefore, the proposed model has a good application prospect in improving the accuracy of disambiguation of large academic databases.

    参考文献
    相似文献
    引证文献
引用本文

王建霞,张玉璇,许云峰.基于元路径异构网络嵌入的姓名实体消歧方法[J].河北科技大学学报,2020,41(3):233-241

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2020-03-25
  • 最后修改日期:2020-05-25
  • 录用日期:
  • 在线发布日期: 2020-07-06
  • 出版日期:
文章二维码