Library Journal (图书馆杂志)

Library Journal, 2023, Vol. 42, Issue (386): 47-55.

• Work Research •

Research on Library Borrowing Prediction Based on the Generalized Additive Model

Chen Jinzhuan, Cheng Zhiqiang (East China Normal University Library)

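  • Publication date: 2023-06-15  Release date: 2023-07-03
  • About the authors: Chen Jinzhuan, librarian, East China Normal University Library, Shanghai 200241. Research interests: big data, artificial intelligence. Contribution: writing and revising the paper; writing the code. E-mail: chenjinzhuan@library.ecnu.edu.cn. Cheng Zhiqiang, librarian, East China Normal University Library, Shanghai 200241. Research interests: competitive intelligence, big data. Contribution: writing the paper; collecting and organizing the experimental data.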

Analysis on the Application of Generalized Additive Model Based on Nesterov Acceleration in Library Lending Prediction

Chen Jinzhuan, Cheng Zhiqiang (The Library of East China Normal University)   

  • Online: 2023-06-15  Published: 2023-07-03
  • About the authors: Chen Jinzhuan, Cheng Zhiqiang (The Library of East China Normal University)

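Abstract:

This paper models the relationship among three factors (reader characteristics, the circulation volume of different categories of books, and reader borrowing time) to uncover the implicit patterns linking reader characteristics to borrowing trends, and thereby to provide reliable and fast prediction and analysis for smart library management. It proposes a novel three-stage fast-fitting model based on the generalized additive model (GAM), which fits the data with three kinds of functions (one-hot encoding, linear, and nonlinear) to build a regression model between reader characteristics and book circulation. Because library data sets are large, the regression model is accelerated with the Nesterov method and the power iteration method, which substantially increases the speed of the algorithm while preserving regression accuracy. Experiments on real library data show that, compared with a purely linear model, the method improves accuracy by about 70% while speed falls by only about 30%; compared with a purely nonlinear model, it is about 6 times faster while accuracy falls by only about 15%, which suits the analysis of large-scale library data well.

Keywords: generalized additive model, library, Nesterov acceleration, power iteration method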

Abstract:

This paper explores the implicit relationship between reader characteristics and borrowing trends by establishing a model that relates reader characteristics, the circulation of different types of books, and reader borrowing time, so as to provide reliable and rapid prediction and analysis for the intelligent management of libraries. It proposes a three-stage fast-fitting model based on the generalized additive model (GAM), which uses one-hot encoding, linear functions, and nonlinear functions to fit the data and establishes a regression model between reader characteristics and book circulation. Considering the large scale of library data, the paper uses the Nesterov method and the power iteration method to accelerate the regression model, which greatly improves the speed of the algorithm while maintaining regression accuracy. Experiments on real library data show that, compared with a purely linear model, the proposed method improves accuracy by about 70% while speed is reduced by only about 30%. Compared with a purely nonlinear model, speed increases by about 6 times while accuracy is reduced by only about 15%, which makes the method well suited to the analysis of large-scale library data.

Key words: Generalized additive model, Library, Nesterov acceleration, Power Iteration method
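
The abstract names its ingredients (one-hot, linear, and nonlinear terms; Nesterov acceleration; power iteration) but does not give the fitting procedure. As a rough illustration only, the sketch below fits a small additive least-squares model in plain Python. It rests on several assumptions that are not taken from the paper: the feature names (reader category, years enrolled, borrowing month) are hypothetical, the nonlinear term is a simple sine/cosine pair, power iteration is used to estimate the largest eigenvalue of X^T X so that it can set the gradient step size, and the data are synthetic. It does not reproduce the paper's three-stage procedure.

```python
import math

CATEGORIES = ["undergraduate", "graduate", "staff"]  # hypothetical reader categories

def one_hot(value, categories):
    """Encode a categorical value as a list of 0/1 indicators."""
    return [1.0 if value == c else 0.0 for c in categories]

def design_row(reader_type, years, month):
    """Additive features: one-hot term + linear term + nonlinear (periodic) term."""
    angle = 2.0 * math.pi * month / 12.0
    return one_hot(reader_type, CATEGORIES) + [float(years)] + [math.sin(angle), math.cos(angle)]

def matvec(A, v):
    """Return A v for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def rmatvec(A, u):
    """Return A^T u without forming the transpose."""
    out = [0.0] * len(A[0])
    for row, ui in zip(A, u):
        for j, a in enumerate(row):
            out[j] += a * ui
    return out

def power_iteration(X, iters=100):
    """Estimate the largest eigenvalue of X^T X (assumes X is not all zeros)."""
    v = [1.0] * len(X[0])
    lam = 0.0
    for _ in range(iters):
        w = rmatvec(X, matvec(X, v))
        lam = math.sqrt(sum(x * x for x in w))
        v = [x / lam for x in w]
    return lam

def nesterov_least_squares(X, y, iters=2000):
    """Minimize 0.5 * ||X beta - y||^2 with Nesterov-accelerated gradient steps."""
    L = 1.05 * power_iteration(X)  # small safety margin on the Lipschitz estimate
    beta = [0.0] * len(X[0])
    z = beta[:]                    # look-ahead point
    t = 1.0
    for _ in range(iters):
        resid = [p - yi for p, yi in zip(matvec(X, z), y)]
        grad = rmatvec(X, resid)
        beta_next = [zj - g / L for zj, g in zip(z, grad)]
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        momentum = (t - 1.0) / t_next
        z = [bn + momentum * (bn - b) for bn, b in zip(beta_next, beta)]
        beta, t = beta_next, t_next
    return beta

# Synthetic, noise-free demo data (not the paper's data).
X = [design_row(c, yrs, m) for c in CATEGORIES for yrs in range(4) for m in range(0, 12, 2)]
true_beta = [5.0, 8.0, 3.0, 1.5, 2.0, -1.0]
y = matvec(X, true_beta)
beta_hat = nesterov_least_squares(X, y)
print([round(b, 3) for b in beta_hat])
```

On this toy problem the fitted coefficients should land close to `true_beta`. Using the power-iteration estimate as the step-size constant avoids a full eigendecomposition, which is one plausible reason to pair the two methods on large data; whether the paper combines them this way is not stated in the abstract.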