Paper: Species Disambiguation for Biomedical Term Identification


Basic Info:

id: W08-0610
title: Species Disambiguation for Biomedical Term Identification
authors: Wang, Xinglong (University of Edinburgh, Edinburgh UK), Matthews, Michael (University of Edinburgh, Edinburgh UK)
venue: Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
year: 2008


Abstract


An important task in information extraction (IE) from biomedical articles is term identification (TI), which concerns linking entity mentions (e.g., terms denoting proteins) in text to unambiguous identifiers in standard databases (e.g., RefSeq). Previous work on TI has focused on species-specific documents. However, biomedical documents, especially full-length articles, often talk about entities across a number of species, in which case resolving species ambiguity becomes an indispensable part of TI. This paper describes our rule-based and machine-learning based approaches to species disambiguation and demonstrates that performance of TI can be improved by over 20% if the correct species are known. We also show that using the species predicted by the automatic species taggers can improve TI by a large margin.








Top Similar Papers
By Title
ID Title
L08-1074 Learning the Species of Biomedical Named Entities from Annotated Corpora
P08-2026 Self-Training for Biomedical Parsing
L08-1564 Automatic Translation of Biomedical Terms by Supervised Machine Learning
W08-0611 Knowledge Sources for Word Sense Disambiguation of Biomedical Text
W04-0816 Word Sense Disambiguation Based On Term To Term Similarity In A Context Space
W06-3319 Biomedical Term Recognition With The Perceptron HMM Algorithm
P08-3009 An Unsupervised Vector Approach to Biomedical Term Disambiguation: Integrating UMLS and Medline
E99-1034 Finding Content-Bearing Terms Using Term Similarities
W03-0108 A Confidence-Based Framework For Disambiguating Geographic Terms
N07-2041 Simultaneous Identification of Biomedical Named-Entity and Functional Relation Using Statistical Parsing Techniques


By Abstract
ID Title
W06-3319 Biomedical Term Recognition With The Perceptron HMM Algorithm
W04-0711 BioAR: Anaphora Resolution For Relating Protein Names To Proteome Database Entries
N06-3009 A Hybrid Approach To Biomedical Named Entity Recognition And Semantic Role Labeling
W06-3308 BIOSMILE: Adapting Semantic Role Labeling For Biomedical Verbs:
H05-1031 Automatically Learning Cognitive Status For Multi-Document Summarization Of Newswire
W04-1217 Exploiting Context For Biomedical Entity Recognition: From Syntax To The Web
H92-1042 Inferencing In Information Retrieval
W02-0302 Tagging Gene And Protein Names In Full Text Articles
P08-2026 Self-Training for Biomedical Parsing
P06-4005 An Intelligent Search Engine And GUI-Based Efficient MEDLINE Search Tool Based On Deep Syntactic Parsing


By Full Text
ID Title
P06-1060 Factorizing Complex Models: A Case Study In Mention Detection
X96-1048 Overview Of Results Of The MUC-6 Evaluation
X98-1014 Algorithms That Learn To Extract Information BBN: TIPSTER Phase III
M95-1002 Overview Of Results Of The MUC-6 Evaluation
M98-1009 Algorithms That Learn To Extract Information - BBN: Description Of The SIFT System As Used For MUC-7
N07-1067 Question Answering Using Integrated Information Retrieval and Information Extraction
W07-1019 The Extraction of Enriched Protein-Protein Interactions from Biomedical Text
M98-1014 FACILE: Description Of The NE System Used For MUC-7
X96-1028 Progress In Information Extraction
W06-2202 Simple Information Extraction (SIE): A Portable And Effective IE System





