Named Entity Recognition

From WebLichtWiki

A named entity is a word or phrase that identifies an item and distinguishes it from other items with similar properties. Examples include names of people and organizations, place names, and names of genes and gene products. Named entity recognition tools automatically identify and categorize such named entities. Named entities are informative and discriminative elements, so identifying and classifying them automatically can improve the accuracy of natural language processing applications such as Information Retrieval and Data Mining, Question Answering, Topic Detection and Tracking, Document Classification, Document Summarization, and Machine Translation. Named entity recognition tools can be based on handcrafted rules, on statistical methods and machine learning algorithms, or on a combination of these.

The most studied named entity types are names of people, locations and organizations. This group is sometimes called ENAMEX, the name used at the sixth Message Understanding Conference (MUC-6). Other commonly used types are date and time expressions, measures (percentage, money, weight, etc.) and email addresses. Domain-specific named entity types are also in use. For example, in biological and medical applications the named entities of interest include names of genes, proteins, DNA, cell types, medical conditions, diseases and drugs. Recent work also shows interest in open-class (open-domain) named entity recognition, which does not limit the possible types of named entities.

Although the meaning of types such as person, location and organization is relatively clear, named entity types may in general be defined differently for different purposes. For some applications a location is defined only as a country, while for others all kinds of locations are of interest and the definition is broader.
For some applications a country is treated as a location in all contexts; for others it is a location only in the context of its geographical properties and is considered an organization in the context of government decisions and actions. For some applications the exact boundaries of a named entity do not matter, for others they do. As a result, the exact definition of a particular named entity type can vary considerably and depends on the application and its ultimate purpose.
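
To make the handcrafted-rule approach mentioned above more concrete, here is a minimal sketch of a rule-based recognizer for three of the simpler entity types (email addresses, percentages and money amounts). It is an illustration only and not taken from any WebLicht tool: the `Entity` type, the `RULES` table, the `recognize` function and the regular expressions themselves are all hypothetical, and the patterns are deliberately simplistic (for instance, only dollar amounts are covered).

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    """A recognized span: surface text, type label, and character offsets."""
    text: str
    label: str
    start: int
    end: int


# Hypothetical handcrafted rules: one regular expression per entity type.
# These patterns are illustrative assumptions, not a complete grammar.
RULES = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")),
    ("PERCENT", re.compile(r"\d+(?:\.\d+)?\s?%")),
    ("MONEY", re.compile(r"\$\s?\d+(?:,\d{3})*(?:\.\d+)?")),
]


def recognize(text: str) -> List[Entity]:
    """Apply every rule to the text and return matches in reading order."""
    entities = []
    for label, pattern in RULES:
        for match in pattern.finditer(text):
            entities.append(
                Entity(match.group(), label, match.start(), match.end())
            )
    return sorted(entities, key=lambda entity: entity.start)


if __name__ == "__main__":
    sample = "Contact info@example.org: prices rose 4.5% to $1,200.50 last year."
    for entity in recognize(sample):
        print(entity.label, repr(entity.text), entity.start, entity.end)
```

Because each entity carries its character offsets, an application that cares about exact boundaries can compare spans strictly, while one that does not can accept any overlap. Names of people, locations and organizations are much harder to capture with patterns like these, which is one reason statistical and machine learning methods are commonly used for them.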