Constrained Conditional Model
Making complex decisions in real-world problems often involves assigning values to sets of interdependent variables, where the expressive dependency structure can influence, or even dictate, which assignments are possible. Structured learning problems provide one such example, but the setting considered here is broader: it includes cases where decisions depend on multiple models that cannot be learned simultaneously, as well as cases where constraints among the models' outcomes become available only at decision time.
Constrained Conditional Models (CCMs) form a general framework that augments the learning of conditional (probabilistic or discriminative) models with declarative constraints (written, for example, in a first-order representation) as a way to support decisions in an expressive output space while maintaining the modularity and tractability of training and inference. Whereas incorporating non-local dependencies directly into a probabilistic model can make training and inference intractable, this framework allows one to learn one or more relatively simple models and then make decisions with a more expressive model that also takes global declarative (hard or soft) constraints into account.

The framework has been applied successfully to multiple NLP and information extraction (IE) problems, starting with work on named entities and relations (CoNLL'04) and on semantic role labeling (SRL). The idea of learning conditional models and using them as the objective function of a global constrained optimization problem has since been followed by a large body of work in NLP: after Roth and Yih (2004) formalized global decision problems in IE as constrained optimization problems and solved them with Integer Linear Programming (ILP), the approach was adopted by Punyakanok et al. (2005), Barzilay and Lapata (2006), Marciniak and Strube (2005), Clarke and Lapata, and others.
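To make the decision rule concrete, the following is a minimal sketch, not code from the cited systems. It assumes a toy entity-and-relation setting (the label sets, scores, and the "works_for" constraint are illustrative): local scores from independently trained models are summed, each violated declarative constraint subtracts a penalty, and the highest-scoring joint assignment is returned. A real system would encode the same objective as an ILP rather than enumerating assignments.

```python
from itertools import product

# Illustrative CCM-style decision rule (a sketch, not the cited systems' code):
#   y* = argmax_y  sum_i local_score_i(x, y_i)  -  sum_k rho_k * [constraint k violated by y]
ENTITY_LABELS = ["PER", "ORG", "LOC"]
RELATION_LABELS = ["works_for", "located_in", "none"]

def ccm_decode(entity_scores, relation_scores, constraints):
    """Brute-force argmax over joint assignments (two entities, one relation),
    subtracting a penalty rho for every violated declarative constraint."""
    best, best_score = None, float("-inf")
    for e1, e2, r in product(ENTITY_LABELS, ENTITY_LABELS, RELATION_LABELS):
        score = entity_scores[0][e1] + entity_scores[1][e2] + relation_scores[r]
        for rho, violated in constraints:
            if violated(e1, e2, r):
                score -= rho              # hard constraints use an infinite penalty
        if score > best_score:
            best, best_score = (e1, e2, r), score
    return best, best_score

# Hypothetical constraint: "works_for" requires a PER subject and an ORG object.
constraints = [
    (float("inf"), lambda e1, e2, r: r == "works_for" and (e1 != "PER" or e2 != "ORG")),
]
entity_scores = [{"PER": 1.2, "ORG": 0.9, "LOC": 0.1},
                 {"PER": 0.8, "ORG": 0.7, "LOC": 0.2}]
relation_scores = {"works_for": 1.5, "located_in": 0.3, "none": 0.4}

print(ccm_decode(entity_scores, relation_scores, constraints))
```

In this toy example the unconstrained argmax of the local models would label both arguments PER while still predicting "works_for"; the constraint shifts the global decision to a consistent assignment, which is exactly the role the declarative constraints play at decision time.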
Training paradigms for CCMs have also been studied theoretically, yielding an understanding of the advantages of different training regimes. More recently, unsupervised learning has been studied in this framework, showing that declarative constraints can be used to take advantage of unlabeled data when training conditional models.
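As an illustration of that last point, here is a hedged sketch of a constraint-driven semi-supervised loop under assumed components (the train_fn, predict_top_k_fn, and violations_fn arguments are hypothetical placeholders, not an API from the cited work): a model trained on the labeled data labels the unlabeled examples, the declarative constraints re-rank its candidate outputs, and the model is retrained on the corrected labels.

```python
def constraint_driven_training(labeled, unlabeled, constraints,
                               train_fn, predict_top_k_fn, violations_fn,
                               rounds=5, k=5):
    """Sketch of constraint-driven semi-supervised training.
    All *_fn arguments are user-supplied (hypothetical) components:
      train_fn(data)                  -> model
      predict_top_k_fn(model, x, k)   -> list of candidate outputs
      violations_fn(constraints, x, y) -> number / weight of violated constraints
    """
    model = train_fn(labeled)                            # supervised initialization
    for _ in range(rounds):
        auto_labeled = []
        for x in unlabeled:
            candidates = predict_top_k_fn(model, x, k)   # model's top-k outputs
            # Prefer the candidate that violates the constraints least,
            # so the declarative knowledge "corrects" the self-labeled data.
            best = min(candidates, key=lambda y: violations_fn(constraints, x, y))
            auto_labeled.append((x, best))
        model = train_fn(labeled + auto_labeled)         # retrain on corrected labels
    return model
```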