Automatic subject classification of scientific literature using citation metadata

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This paper describes a new method for automatically classifying scientific literature archived in digital libraries and repositories according to a standard library classification scheme. The method identifies all the references cited in the document to be classified and, using the subject classification metadata of those references as catalogued in existing conventional libraries, infers the most probable class for the document itself with the help of a weighting mechanism. We demonstrate the application of the proposed method and assess its performance by developing a prototype software system that classifies scientific documents according to the Dewey Decimal Classification (DDC) scheme. A dataset of one thousand research articles, papers, and reports from a well-known scientific digital library, CiteSeer, was used to evaluate the classification performance of the system. Detailed results of this experiment are presented and discussed.
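The inference step described above can be illustrated with a minimal sketch. The abstract does not specify the weighting mechanism, so the code below assumes a simple weighted vote: each cited reference found in a library catalogue contributes a caller-supplied weight to its DDC class, truncated to its first three digits. The function name `infer_ddc_class`, the `(ddc_number, weight)` input format, and the truncation rule are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def infer_ddc_class(cited_refs, digits=3):
    """Return the DDC class prefix with the highest total weight.

    cited_refs: iterable of (ddc_number, weight) pairs, one per cited
    reference that was found in a library catalogue (assumed format).
    digits: how many leading digits of the DDC number define a class.
    """
    scores = defaultdict(float)
    for ddc_number, weight in cited_refs:
        # Drop the decimal point and keep only the leading digits,
        # e.g. "006.31" -> "006".
        prefix = ddc_number.replace(".", "")[:digits]
        scores[prefix] += weight
    if not scores:
        # No catalogued references: nothing to infer from.
        return None
    # Pick the class whose cited references carry the most weight.
    return max(scores, key=scores.get)


refs = [("006.31", 2.0), ("006.4", 1.0), ("519.5", 2.5)]
print(infer_ddc_class(refs))
```

In this example the two references under 006 together outweigh the single reference under 519, so the sketch selects 006. How the weights themselves are derived is left open here, since the abstract gives no detail on that point.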

Original language: English
Title of host publication: Digital Enterprise and Information Systems - International Conference, DEIS 2011, Proceedings
Pages: 545-559
Number of pages: 15
Publication status: Published - 2011
Event: International Conference on Digital Enterprise and Information Systems, DEIS 2011 - London, United Kingdom
Duration: 20 Jul 2011 to 22 Jul 2011

Publication series

Name: Communications in Computer and Information Science
Volume: 194 CCIS
ISSN (Print): 1865-0929

Conference

Conference: International Conference on Digital Enterprise and Information Systems, DEIS 2011
Country/Territory: United Kingdom
City: London
Period: 20/07/11 to 22/07/11

Keywords

  • Dewey Decimal Classification (DDC)
  • Digital library organization
  • library classification schemes
  • library Online Public Access Catalogues (OPACs)
  • scientific literature classification
