Abstract
In recent years, recognizing drug and chemical named entities and automatically extracting relevant information from the biological literature have remained difficult for domain experts. Data mining techniques are needed to build systems that extract this information automatically, reducing the effort of finding it manually. To address this need, this paper proposes a methodology for recognizing biological entities, covering five entity types (protein, DNA, RNA, cell type, and cell line). At the core of the solution, the Biomedical Name Entity Recognizer, are Conditional Random Fields (CRFs) trained on orthographic and contextual features to segment and label sequence data. The system can also interpret chemical formulas. It annotates chemical entities using 3000 abstracts as training data, 3500 abstracts as the development set, and 14000 records, containing a 7000-record subset, as test data. The results are encouraging: 92.2% precision, 93.2% recall, and a 92.48% F-score for Chemical Entity Mention in Patent (CEMP), and 92% precision, 95.21% recall, and a 93.4% F-score for Chemical Passage Detection (CPD).
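To make the idea of orthographic and contextual features concrete, here is a minimal Python sketch of per-token feature extraction of the kind a CRF sequence labeler typically consumes. The abstract does not list the paper's actual features, so everything here is an assumption made for illustration: the function name `token_features`, the specific feature set, and the crude `FORMULA_RE` pattern for formula-like tokens.

```python
import re

# Hypothetical heuristic: a token made of element-like symbols (capital letter,
# optional lowercase letter, optional count) that contains at least one digit.
# This is an illustrative assumption, not the paper's formula interpreter.
FORMULA_RE = re.compile(r"(?=.*\d)(?:[A-Z][a-z]?\d*){2,}")


def token_features(tokens, i):
    """Return illustrative orthographic and contextual features for tokens[i]."""
    tok = tokens[i]
    feats = {
        # Orthographic features: the surface shape of the token itself.
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "is_all_caps": tok.isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        "has_hyphen": "-" in tok,
        "suffix3": tok[-3:].lower(),
        "looks_like_formula": bool(FORMULA_RE.fullmatch(tok)),
    }
    # Contextual features: the neighbouring tokens, with boundary markers.
    feats["prev_lower"] = tokens[i - 1].lower() if i > 0 else "<BOS>"
    feats["next_lower"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>"
    return feats


def sentence_features(tokens):
    """Feature dictionaries for every token in one sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


example_tokens = ["The", "IL-2", "gene", "binds", "H2O"]
example_features = sentence_features(example_tokens)
```

A sequence of such feature dictionaries, paired with per-token labels (for example a begin/inside/outside tagging scheme, also an assumption here), is a common input format for training a CRF to segment and label entities.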
| Original language | English |
|---|---|
| Pages (from-to) | 5198-5214 |
| Number of pages | 17 |
| Journal | Advances in Artificial Intelligence and Machine Learning |
| Volume | 6 |
| Issue number | 2 |
| Publication status | Published - 2026 |
| Externally published | Yes |
Keywords
- Biomedical Name Entity Recognizer
- Chemical entities
- Conditional random fields
- Machine learning
- Regular expressions