TY - JOUR
T1 - Automatic mapping of user tags to Wikipedia concepts
T2 - The case of a Q&A website - StackOverflow
AU - Joorabchi, Arash
AU - English, Michael
AU - Mahdi, Abdulhussain E.
N1 - Publisher Copyright: © Chartered Institute of Library and Information Professionals.
PY - 2015/10/24
Y1 - 2015/10/24
N2 - The uncontrolled nature of user-assigned tags makes them prone to various inconsistencies caused by spelling variations, synonyms, acronyms and hyponyms. These inconsistencies in turn lead to some of the common problems associated with the use of folksonomies such as the tag explosion phenomenon. Mapping user tags to their corresponding Wikipedia articles, as well-formed concepts, offers multifaceted benefits to the process of subject metadata generation and management in a wide range of online environments. These include normalization of inconsistencies, elimination of personal tags and improvement of the interchangeability of existing subject metadata. In this article, we propose a machine learning-based method capable of automatic mapping of user tags to their equivalent Wikipedia concepts. We have demonstrated the application of the proposed method and evaluated its performance using the currently most popular computer programming Q&A website, StackOverflow.com, as our test platform. Currently, around 20 million posts in StackOverflow are annotated with about 37,000 unique user tags, from which we have chosen a subset of 1256 tags to evaluate the accuracy performance of our proposed mapping method. We have evaluated the performance of our method using the standard information retrieval measures of precision, recall and F1. Depending on the machine learning-based classification algorithm used as part of the mapping process, F1 scores as high as 99.6% were achieved.
KW - Semantic mapping
KW - StackOverflow
KW - Wikipedia
KW - subject metadata
KW - user tags
UR - http://www.scopus.com/inward/record.url?scp=84942125266&partnerID=8YFLogxK
U2 - 10.1177/0165551515586669
DO - 10.1177/0165551515586669
M3 - Article
AN - SCOPUS:84942125266
SN - 0165-5515
VL - 41
SP - 570
EP - 583
JO - Journal of Information Science
JF - Journal of Information Science
IS - 5
ER -