Colloquium: Role and Applications of Text Mining in Biomedical Informatics

Dr. Ramakanth Kavuluru, Biomedical Informatics, UK

Even with the emergence of machine-processable data formats, unstructured text remains the main medium for disseminating information in many disciplines. In the life sciences and healthcare operations, text arises in the form of scientific publications, clinical narratives (discharge summaries, pathology notes, progress notes), and interview narratives (drug abuse, relationship counseling). While a human reader can readily glean the information a textual fragment conveys, doing so is a very challenging task for machines. Text mining is the process of converting textual data into, ideally, 'actionable' information. Often, it also includes converting unstructured text into structured data that is more straightforward to process computationally. In the first part of this talk, I will give an overview of different tasks that can be construed as text mining, including named entity recognition, triple extraction, word sense disambiguation, classification, clustering, and sentiment analysis. In the second part, I will discuss two applications I worked on that use text mining to facilitate effective question answering, searching, and exploration of biomedical literature. I will end with an operational task that we (BMI@UK) have taken on to improve clinical trial recruitment for UK Healthcare.