BioNLP Workshop




An ACL 2016 Workshop associated with the SIGBIOMED special interest group, featuring two associated tasks: BioASQ and BioNLP-ST

Berlin, Germany, August 12-13, 2016


  • BioNLP workshop: Friday, August 12, 2016
  • BioNLP-ST and BioASQ workshop: Saturday, August 13, 2016

BIONLP 2016 Workshop Schedule

Friday August 12, 2016
8:30–8:40 Opening remarks
8:40–10:30 Session 1: Entity extraction and representation
8:40–9:00 A Machine Learning Approach to Clinical Terms Normalization
Jose Castano, María Laura Gambarte, Hee Joon Park, Maria del Pilar Avila Williams, David Perez, Fernando Campos, Daniel Luna, Sonia Benitez, Hernan Berinsky and Sofía Zanetti
9:00–9:20 Improved Semantic Representation for Domain-Specific Entities
Mohammad Taher Pilehvar and Nigel Collier
9:20–9:40 Identification, characterization, and grounding of gradable terms in clinical text
Chaitanya Shivade, Marie-Catherine de Marneffe, Eric Fosler-Lussier and Albert M. Lai
9:40–10:00 Graph-based Semi-supervised Gene Mention Tagging
Golnar Sheikhshab, Elizabeth Starks, Aly Karsan, Anoop Sarkar and Inanc Birol
10:00–10:30 Invited Talk: The BioNLP-ST challenges on information extraction and knowledge acquisition in biology
Speakers: Robert Bossy and Jin-Dong Kim
10:30–11:00 Coffee Break
11:00–12:30 Session 2: Event and Relation Extraction
11:00–11:20 Feature Derivation for Exploitation of Distant Annotation via Pattern Induction against Dependency Parses
Dayne Freitag and John Niekrasz
11:40–12:00 Inferring Implicit Causal Relationships in Biomedical Literature
Halil Kilicoglu
12:00–12:20 SnapToGrid: From Statistical to Interpretable Models for Biomedical Information Extraction
Marco A. Valenzuela-Escárcega, Gus Hahn-Powell, Dane Bell and Mihai Surdeanu
12:20–12:40 Character based String Kernels for Bio-Entity Relation Detection
Ritambhara Singh and Yanjun Qi
12:40–14:00 Lunch break
14:00–15:40 Session 3: Disambiguation, Classification, and more
14:00–14:20 Disambiguation of entities in MEDLINE abstracts by combining MeSH terms with knowledge
Amy Siu, Patrick Ernst and Gerhard Weikum
14:20–14:40 Using Distributed Representations to Disambiguate Biomedical and Clinical Concepts
Stephan Tulkens, Simon Suster and Walter Daelemans
14:40–15:00 Unsupervised Document Classification with Informed Topic Models
Timothy Miller, Dmitriy Dligach and Guergana Savova
15:00–15:20 Vocabulary Development To Support Information Extraction of Substance Abuse from Psychiatry Notes
Sumithra Velupillai, Danielle L Mowery, Mike Conway, John Hurdle and Brent Kious
15:20–15:40 Syntactic analyses and named entity recognition for PubMed and PubMed Central - up-to-the-minute
Kai Hakala, Suwisa Kaewphan, Tapio Salakoski and Filip Ginter
15:40–16:00 Coffee Break
16:00–16:30 Invited Talk: BioASQ: A challenge on large-scale biomedical semantic indexing and question answering
Speaker: Anastasia Krithara
16:30–17:30 Poster Session
 Improving Temporal Relation Extraction with Training Instance Augmentation
Chen Lin, Timothy Miller, Dmitriy Dligach, Steven Bethard and Guergana Savova
 Using Centroids of Word Embeddings and Word Mover’s Distance for Biomedical Document Retrieval in Question Answering
Georgios-Ioannis Brokos, Prodromos Malakasiotis and Ion Androutsopoulos
 Measuring the State of the Art of Automated Pathway Curation Using Graph Algorithms - A Case Study of the mTOR Pathway
Michael Spranger, Sucheendra Palaniappan and Samik Ghosh
 Construction of a Personal Experience Tweet Corpus for Health Surveillance
Keyuan Jiang, Ricardo Calix and Matrika Gupta
 Modelling the Combination of Generic and Target Domain Embeddings in a Convolutional Neural Network for Sentence Classification
Nut Limsopatham and Nigel Collier
 PubTermVariants: biomedical term variants and their use for PubMed search
Lana Yeganova, Won Kim, Sun Kim, Rezarta Islamaj Doğan, Wanli Liu, Donald C Comeau, Zhiyong Lu and W John Wilbur
 This before That: Causal Precedence in the Biomedical Domain
Gus Hahn-Powell, Dane Bell, Marco A. Valenzuela-Escárcega and Mihai Surdeanu
 Syntactic methods for negation detection in radiology reports in Spanish
Viviana Cotik, Vanesa Stricker, Jorge Vivaldi and Horacio Rodriguez
 How to Train good Word Embeddings for Biomedical NLP
Billy Chiu, Gamal Crichton, Anna Korhonen and Sampo Pyysalo
 An Information Foraging Approach to Determining the Number of Relevant Features
Brian Connolly, Benjamin Glass and John Pestian
 Assessing the Feasibility of an Automated Suggestion System for Communicating Critical Findings from Chest Radiology Reports to Referring Physicians
Brian E. Chapman, Danielle L Mowery, Evan Narasimhan, Neel Patel, Wendy Chapman and Marta Heilbrun
 Building a dictionary of lexical variants for phenotype descriptors
Simon Kocbek and Tudor Groza
 Applying deep learning on electronic health records in Swedish to predict healthcare-associated infections
Olof Jacobson and Hercules Dalianis
 Identifying First Episodes of Psychosis in Psychiatric Patient Records using Machine Learning
Genevieve Gorrell, Sherifat Oduola, Angus Roberts, Tom Craig, Craig Morgan and Rob Stewart
 Relation extraction from clinical texts using domain invariant convolutional neural network
Sunil Sahu, Ashish Anand, Krishnadev Oruganty and Mahanandeeshwar Gattu

BioASQ / BioNLP-ST Workshop Program

9:00-9:15 Welcome
9:15-10:15 Invited speaker: Sherri Matis-Mitchell
Solving Problems and Supporting Decisions in Pharma R&D using Text Analytics: A Recent History
10:15-10:30 Overview of BioASQ
10:30-11:00 Coffee break
11:00-12:30 BioASQ participant session
11:00-11:15 Using Learning-To-Rank to Enhance NLM Medical Text Indexer Results
Ilya Zavorin, James Mork and Dina Demner-Fushman
11:15-11:30 LABDA at the 2016 BioASQ challenge task 4a: Semantic Indexing by using ElasticSearch
Isabel Segura-Bedmar, Adrián Carruana and Paloma Martínez
11:30-11:45 Learning to Answer Biomedical Questions: OAQA at BioASQ 4B
Zi Yang, Yue Zhou and Eric Nyberg
11:45-12:00 HPI Question Answering System in BioASQ 2016
Frederik Schulze, Ricarda Schuler, Tim Draeger, Daniel Dummer, Alexander Ernst, Pedro Flemming, Cindy Perscheid and Mariana Neves
12:00-12:15 KSAnswer: Question-answering System of Kangwon National University and Sogang University in the 2016 BioASQ Challenge
Hyeon-gu Lee, Minkyoung Kim, Harksoo Kim, Juae Kim, Sunjae Kwon, Jungyun Seo, Yi-Reun Kim and Jung-Kyu Choi
12:15-12:30 Large-Scale Semantic Indexing and Question Answering in Biomedicine
Eirini Papagiannopoulou, Yiannis Papanikolaou, Dimitris Dimitriadis, Sakis Lagopoulos, Grigorios Tsoumakas, Manos Laliotis, Nikos Markantonatos and Ioannis Vlahavas
12:30-14:00 Lunch break
14:00-14:15 Overview of BioNLP-ST
14:15-15:30 BioNLP-ST participant session 1
14:15-14:30 LitWay, discriminative extraction for different bio-events
Chen Li, Zhiqiang Rao and Xiangrong Zhang
14:30-14:45 VERSE: Event and relation extraction in the BioNLP 2016 Shared Task
Jake Lever and Steven JM Jones
14:45-15:00 A dictionary- and rule-based system for identification of bacteria and habitats in text
Helen Cook, Evangelos Pafilis and Lars Juhl Jensen
15:00-15:15 Ontology Based Categorization of Bacteria and Habitat Entities using Information Retrieval Techniques
Mert Tiftikci, Hakan Şahin, Berfu Büyüköz, Alper Yayıkçı and Arzucan Özgür
15:15-15:30 Identification of mentions and relations between bacteria and biotope from PubMed abstracts
Cyril Grouin
15:30-16:00 Coffee break
16:00-17:00 BioNLP-ST participant session 2
16:00-16:15 Deep Learning With Minimal Training Data: TurkuNLP Entry in The BioNLP Shared Task 2016
Farrokh Mehryary, Jari Björne, Sampo Pyysalo, Tapio Salakoski and Filip Ginter
16:15-16:30 SeeDev Binary Event Extraction Using SVMs and a Rich Feature set
Nagesh Panyam Chandrasekarasastry, Gitansh Khirbat, Karin Verspoor, Trevor Cohn and Kotagiri Ramamohanarao
16:30-16:45 Extraction of Regulatory Events Using Kernel-based Classifiers and Distant Supervision
Andre Lamurias, Miguel J. Rodrigues, Luka A. Clarke and Francisco M Couto
16:45-17:00 DUTIR in BioNLP-ST 2016: Utilizing convolutional network and distributed representation to extract complicate relations
Honglei Li, Jianhai Zhang, Jian Wang, Hongfei Lin and Zhihao Yang
17:00-17:30 Closing session


Over the past fourteen years, the ACL BioNLP workshop, associated with the SIGBIOMED special interest group, has established itself as the primary venue for presenting foundational research in language processing for the biological and medical domains. The workshop serves two purposes: it brings together researchers in biomedical and clinical NLP and exposes them to mainstream ACL research, and it informs mainstream ACL researchers about this fast-growing and important domain.

The workshop will continue presenting work on a broad and interesting range of topics in NLP.

The active areas of research include:

  • Entity identification and normalization for a broad range of semantic categories
  • Extraction of complex relations and events
  • Semantic parsing
  • Discourse analysis
  • Anaphora/coreference resolution
  • Text mining
    • Literature-based discovery
  • Summarization
  • Question Answering
  • Resources and novel strategies for system testing and evaluation
    • Infrastructures for biomedical text mining
  • Processing and annotation platforms
  • Translating NLP research to practice
  • Theoretical underpinnings of biomedical language processing

Program Committee:

 * Sophia Ananiadou, National Centre for Text Mining and University of Manchester, UK 
 * Eiji Aramaki, University of Tokyo, Japan 
 * Alan Aronson, US National Library of Medicine 
 * Asma Ben Abacha, US National Library of Medicine 
 * Olivier Bodenreider, US National Library of Medicine 
 * Kevin Bretonnel Cohen, University of Colorado School of Medicine, USA 
 * Aaron Cohen, Oregon Health and Science University 
 * Dina Demner-Fushman, US National Library of Medicine 
 * Filip Ginter, University of Turku, Finland 
 * Cyril Grouin, LIMSI - CNRS, France 
 * Antonio Jimeno Yepes, IBM, Melbourne Area, Australia
 * Halil Kilicoglu, US National Library of Medicine
 * Robert Leaman, US National Library of Medicine 
 * Ulf Leser, Humboldt-Universität zu Berlin, Germany 
 * Zhiyong Lu, US National Library of Medicine 
 * Timothy Miller, Children’s Hospital Boston, USA 
 * Makoto Miwa, Toyota Technological Institute, Japan 
 * Danielle L Mowery, VA Salt Lake City Health Care System, USA
 * Yassine M'Rabet, US National Library of Medicine
 * Aurelie Neveol, LIMSI - CNRS, France 
 * Nhung Nguyen, The University of Manchester, UK
 * Naoaki Okazaki, Tohoku University, Japan 
 * Sampo Pyysalo, University of Cambridge, UK 
 * Bastien Rance, Hopital Europeen Georges Pompidou, France 
 * Fabio Rinaldi, University of Zurich, Switzerland 
 * Thomas Rindflesch, US National Library of Medicine 
 * Kirk Roberts, The University of Texas Health Science Center at Houston, USA 
 * Angus Roberts, The University of Sheffield, UK 
 * Yoshimasa Tsuruoka, University of Tokyo, Japan 
 * Karin Verspoor, The University of Melbourne, Australia 
 * Byron C. Wallace, University of Texas at Austin, USA 
 * W John Wilbur, US National Library of Medicine 
 * Pierre Zweigenbaum, LIMSI - CNRS, France

Organizers:


 * Kevin Bretonnel Cohen, University of Colorado School of Medicine
 * Dina Demner-Fushman, US National Library of Medicine
 * Sophia Ananiadou, National Centre for Text Mining and University of Manchester, UK
 * Jun-ichi Tsujii, National Institute of Advanced Industrial Science and Technology, Japan 

The BioNLP Shared Task (BioNLP-ST) and the BioASQ Challenge associated with the workshop

The BioNLP Shared Task (BioNLP-ST) has been organized three times so far, in 2009, 2011 and 2013, leading to the development of information extraction systems for molecular biology and medicine. One of the major contributions of BioNLP-ST is the availability of resources such as high-quality manually curated corpora, tools, and evaluation services.

Shared Task Organizers:

  • Jin-Dong Kim, Database Center for Life Science (DBCLS), Japan
  • Claire Nedellec, INRA, France
  • Robert Bossy, INRA, France

The second, equally successful shared task, the BioASQ challenge on large-scale biomedical semantic indexing and question answering, has been running on an annual basis since 2012. The results of the challenge have been presented in a workshop, which has so far taken place in conjunction with the CLEF conference and has been extremely well attended.

BioASQ assesses the performance of information systems in supporting two tasks that are central to the biomedical question answering process: (a) the indexing of large volumes of unlabeled data, primarily scientific articles, with biomedical concepts, and (b) the processing of biomedical questions and the generation of answers and supporting material. BioASQ has made the following benchmark data and tools publicly available: more than 1300 questions and related answers, as well as an online "oracle" for objective evaluation of any system throughout the year, not only during the challenge.

BioASQ Organizers:

  • Georgios Paliouras, NCSR "Demokritos", Greece and University of Houston, USA
  • Ioannis Kakadiaris, University of Houston, USA
  • Anastasia Krithara, NCSR "Demokritos", Greece