Diesner, J., & Carley, K.M. (2007). Conditional Random Fields for Entity Extraction and Ontological Text Coding. Proc. of the North American Association for Computational Social and Organizational Science (NAACSOS) 2007 Conference, Atlanta, GA. Best Student Paper Award.

Previous research has shown that the analysis of socio-technical systems is one field with a strong yet unsatisfied need for automated extraction of instances of various entity classes from text data (Carley, 2002; Diesner & Carley, 2005). Domain-specific entity classes and the relations between them are often specified in ontologies or taxonomies. We present a Conditional Random Field-based approach to distilling a non-canonical set of entities, which is defined in an ontology that originates from organization science. The supervised learning technique applied herein facilitates the derivation of relational data from corpora by locating and classifying instances of various entity classes. The classified entities can then be used as nodes for the construction of socio-technical networks. We envision researchers using the presented methodology as one crucial step in the process of advanced modeling and analysis of complex and dynamic real-world organizations or networks. We find the outcome, particularly in the critical recall statistic, sufficiently successful to be applied in the described problem domain in the future.
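
The step from classified entities to network nodes can be illustrated with a minimal Python sketch. It is not the implementation from the paper: it assumes a sequence labeler has already emitted one BIO-style tag per token, the class names (agent, organization, location) are hypothetical placeholders for the ontology's classes, and linking entities that co-occur in a sentence is one simple convention chosen here for illustration.

```python
from itertools import combinations


def extract_entities(tokens, tags):
    """Group BIO-tagged tokens into (text, entity_class) pairs.

    Assumes tags look like "B-agent", "I-agent", or "O" (hypothetical scheme).
    An "I-" tag that does not continue an open entity of the same class
    is treated like "O" and dropped.
    """
    entities = []
    words, cls = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity that was still open.
            if words:
                entities.append((" ".join(words), cls))
            words, cls = [token], tag[2:]
        elif tag.startswith("I-") and words and tag[2:] == cls:
            # Continue the currently open entity.
            words.append(token)
        else:
            # Outside any entity (or an inconsistent tag): close what is open.
            if words:
                entities.append((" ".join(words), cls))
            words, cls = [], None
    if words:
        entities.append((" ".join(words), cls))
    return entities


def build_network(tagged_sentences):
    """Turn tagged sentences into nodes (name -> class) and undirected edges.

    Assumption: two entities are linked when they occur in the same sentence.
    """
    nodes = {}
    edges = set()
    for tokens, tags in tagged_sentences:
        entities = extract_entities(tokens, tags)
        for text, cls in entities:
            nodes[text] = cls
        names = sorted({text for text, _ in entities})
        for a, b in combinations(names, 2):
            edges.add((a, b))
    return nodes, edges


# Hypothetical example sentence with hand-assigned tags.
tokens = ["Alice", "Smith", "joined", "Acme", "in", "Atlanta"]
tags = ["B-agent", "I-agent", "O", "B-organization", "O", "B-location"]
nodes, edges = build_network([(tokens, tags)])
```

In this sketch each node keeps its entity class, so a downstream network analysis could distinguish, for example, agent-to-organization links from agent-to-location links.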