TY - JOUR
T1 - Towards linking libraries and Wikipedia
T2 - Automatic subject indexing of library records with Wikipedia concepts
AU - Joorabchi, Arash
AU - Mahdi, Abdulhussain E.
PY - 2014/4
Y1 - 2014/4
AB - In this article, we first argue the importance and timely need of linking libraries and Wikipedia for improving the quality of their services to information consumers, as such linkage will enrich the quality of Wikipedia articles and at the same time increase the visibility of library resources which are currently overlooked to a large degree. We then describe the development of an automatic system for subject indexing of library metadata records with Wikipedia concepts as an important step towards library-Wikipedia integration. The proposed system is based on first identifying all Wikipedia concepts occurring in the metadata elements of library records. This is then followed by training and deploying generic machine learning algorithms to automatically select those concepts which most accurately reflect the core subjects of the library materials whose records are being indexed. We have assessed the performance of the developed system using standard information retrieval measures of precision, recall and F-score on a dataset consisting of 100 library metadata records manually indexed with a total of 469 Wikipedia concepts. The evaluation results show that the developed system is capable of achieving an averaged F-score as high as 0.92.
KW - bibliographic records
KW - library metadata
KW - metadata generation
KW - subject metadata
KW - text mining
KW - Wikipedia
UR - http://www.scopus.com/inward/record.url?scp=84896744099&partnerID=8YFLogxK
U2 - 10.1177/0165551513514932
DO - 10.1177/0165551513514932
M3 - Article
AN - SCOPUS:84896744099
SN - 0165-5515
VL - 40
SP - 211
EP - 221
JO - Journal of Information Science
JF - Journal of Information Science
IS - 2
ER -