TY - GEN
T1 - Automatic subject classification of scientific literature using citation metadata
AU - Mahdi, Abdulhussain E.
AU - Joorabchi, Arash
PY - 2011
Y1 - 2011
N2 - This paper describes a new method for automatic classification of scientific literature archived in digital libraries and repositories according to a standard library classification scheme. The method is based on identifying all the references cited in the document to be classified and, using the subject classification metadata of extracted references as catalogued in existing conventional libraries, inferring the most probable class for the document itself with the help of a weighting mechanism. We have demonstrated the application of the proposed method and assessed its performance by developing a prototype software system for automatic classification of scientific documents according to the Dewey Decimal Classification (DDC) scheme. A dataset of one thousand research articles, papers, and reports from a well-known scientific digital library, CiteSeer, was used to evaluate the classification performance of the system. Detailed results of this experiment are presented and discussed.
AB - This paper describes a new method for automatic classification of scientific literature archived in digital libraries and repositories according to a standard library classification scheme. The method is based on identifying all the references cited in the document to be classified and, using the subject classification metadata of extracted references as catalogued in existing conventional libraries, inferring the most probable class for the document itself with the help of a weighting mechanism. We have demonstrated the application of the proposed method and assessed its performance by developing a prototype software system for automatic classification of scientific documents according to the Dewey Decimal Classification (DDC) scheme. A dataset of one thousand research articles, papers, and reports from a well-known scientific digital library, CiteSeer, was used to evaluate the classification performance of the system. Detailed results of this experiment are presented and discussed.
KW - Dewey Decimal Classification (DDC)
KW - Digital library organization
KW - Library classification schemes
KW - Library Online Public Access Catalogues (OPACs)
KW - Scientific literature classification
UR - http://www.scopus.com/inward/record.url?scp=80052147940&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-22603-8_48
DO - 10.1007/978-3-642-22603-8_48
M3 - Conference contribution
AN - SCOPUS:80052147940
SN - 9783642226021
T3 - Communications in Computer and Information Science
SP - 545
EP - 559
BT - Digital Enterprise and Information Systems - International Conference, DEIS 2011, Proceedings
T2 - International Conference on Digital Enterprise and Information Systems, DEIS 2011
Y2 - 20 July 2011 through 22 July 2011
ER -