Research on the Construction of a Semantic Description Framework for Scientific Document Images Oriented toward AI4Science

Abstract

Abstract: Semantic description and annotation of scientific paper images are key to improving knowledge mining and intelligent information management of academic literature. Based on the metadata structure and information requirements of academic images, this paper proposes a multi-level semantic description framework for scientific paper images (SDF-SLI), aimed at precise semantic parsing of image content in scientific literature in the context of AI4Science. The framework comprises four interconnected layers (a basic identification layer, a content layer, a semantic layer, and a relationship layer), which describe an image's basic information, visual content, semantic connotation, and hierarchical associations, respectively. The paper also builds an ontology model that systematically maps the semantic relationships between the framework layers. Using multimodal large language models, an empirical case study on papers in library and information science demonstrates the framework's practicality and effectiveness. Finally, the paper discusses key challenges and future opportunities in the semantic description of scientific paper images, offering a reference for the intelligent development of scientific knowledge management.

Keywords: scientific paper images, semantic description framework, ontology model, knowledge services
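
As a rough illustration of the four-layer structure described in the abstract, the following minimal Python sketch models one figure description as four nested records and flattens it into subject-predicate-object triples of the kind an ontology model might consume. All class names, field names, and the `to_triples` helper are hypothetical assumptions made for illustration; the abstract does not specify the framework's actual properties or its ontology vocabulary.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional, Tuple


@dataclass
class IdentificationLayer:
    """Basic identification layer: basic information about the image (fields assumed)."""
    figure_id: str
    source_paper: str
    caption: str = ""


@dataclass
class ContentLayer:
    """Content layer: what is visually present in the image (fields assumed)."""
    figure_type: str
    visual_elements: List[str] = field(default_factory=list)


@dataclass
class SemanticLayer:
    """Semantic layer: what the image means in the paper (fields assumed)."""
    topics: List[str] = field(default_factory=list)
    summary: str = ""


@dataclass
class RelationLayer:
    """Relationship layer: hierarchical links to other figures (fields assumed)."""
    parent_figure: Optional[str] = None
    related_figures: List[str] = field(default_factory=list)


@dataclass
class FigureDescription:
    """One figure described across all four layers."""
    identification: IdentificationLayer
    content: ContentLayer
    semantics: SemanticLayer
    relations: RelationLayer


def to_triples(desc: FigureDescription) -> List[Tuple[str, str, str]]:
    """Flatten a description into (subject, predicate, object) triples.

    Predicates are named "<layer>.<field>"; empty strings and None are skipped.
    """
    subject = desc.identification.figure_id
    triples = []
    for layer_name, layer in asdict(desc).items():
        for prop, value in layer.items():
            values = value if isinstance(value, list) else [value]
            for v in values:
                if v not in ("", None):
                    triples.append((subject, f"{layer_name}.{prop}", v))
    return triples


# Hypothetical example record; the identifiers are made up.
example = FigureDescription(
    identification=IdentificationLayer("fig-1", "paper-001", "Framework overview"),
    content=ContentLayer("diagram", ["box", "arrow"]),
    semantics=SemanticLayer(["semantic description"]),
    relations=RelationLayer(related_figures=["fig-2"]),
)
triples = to_triples(example)
for triple in triples:
    print(triple)
```

Keeping each layer as its own record mirrors the layered design in the abstract, so a layer can be filled in or revised independently, for example by a multimodal model that produces only the semantic layer.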

Zhang Yiqin, Deng Sanhong, Gong Hongcun, Yang Jie, Liu Liu. Research on the Construction of a Semantic Description Framework for Scientific Document Images Oriented toward AI4Science[J]. Library Journal, 2026, 45(5): 15-26.

