António Branco

Also published as: António Horta Branco, Antonio H. Branco, Antonio Branco


2018

We Are Depleting Our Research Subject as We Are Investigating It: In Language Technology, more Replication and Diversity Are Needed
António Branco

Finely Tuned, 2 Billion Token Based Word Embeddings for Portuguese
João Rodrigues | António Branco

Semantic Equivalence Detection: Are Interrogatives Harder than Declaratives?
João Rodrigues | Chakaveh Saedi | António Branco | João Silva

Browsing and Supporting Pluricentric Global Wordnet, or just your Wordnet of Interest
António Branco | Ruben Branco | Chakaveh Saedi | João Silva

Predicting Brain Activation with WordNet Embeddings
João António Rodrigues | Ruben Branco | João Silva | Chakaveh Saedi | António Branco

The task of taking a semantic representation of a noun and predicting the brain activity triggered by it in terms of fMRI spatial patterns was pioneered by Mitchell et al. 2008. That seminal work used word co-occurrence features to represent the meaning of the nouns. Even though the task does not impose any specific type of semantic representation, the vast majority of subsequent approaches resort to feature-based models or to semantic spaces (aka word embeddings). We address this task, with competitive results, by using instead a semantic network to encode lexical semantics, thus providing further evidence for the cognitive plausibility of this approach to model lexical meaning.

WordNet Embeddings
Chakaveh Saedi | António Branco | João António Rodrigues | João Silva

Semantic networks and semantic spaces have been two prominent approaches to represent lexical semantics. While a unified account of the lexical meaning relies on one being able to convert between these representations, in both directions, the conversion direction from semantic networks into semantic spaces started to attract more attention recently. In this paper we present a methodology for this conversion and assess it with a case study. When it is applied over WordNet, the performance of the resulting embeddings in a mainstream semantic similarity task is very good, substantially superior to the performance of word embeddings based on very large collections of texts like word2vec.

Computational Complexity of Natural Languages: A Reasoned Overview
António Branco

There has been an upsurge of research interest in natural language complexity. As this interest will benefit from being informed by established contributions in this area, this paper presents a reasoned overview of central results concerning the computational complexity of natural language parsing. This overview also seeks to help to understand why, contrary to recent and widespread assumptions, it is by no means sufficient that an agent handles sequences of items under a pattern a^n b^n or under a pattern a^n b^m c^n d^m to ascertain ipso facto that this is the result of at least an underlying context-free grammar or an underlying context-sensitive grammar, respectively. In addition, it seeks to help to understand why it is also not sufficient that an agent handles sequences of items under a pattern a^n b^n for it to be deemed as having a cognitive capacity of higher computational complexity.

Attention Focusing for Neural Machine Translation by Bridging Source and Target Embeddings
Shaohui Kuang | Junhui Li | António Branco | Weihua Luo | Deyi Xiong

In neural machine translation, a source sequence of words is encoded into a vector from which a target sequence is generated in the decoding phase. Differently from statistical machine translation, the associations between source words and their possible target counterparts are not explicitly stored. Source and target words are at the two ends of a long information processing procedure, mediated by hidden states at both the source encoding and the target decoding phases. This makes it possible that a source word is incorrectly translated into a target word that is not any of its admissible equivalent counterparts in the target language. In this paper, we seek to somewhat shorten the distance between source and target words in that procedure, and thus strengthen their association, by means of a method we term bridging source and target word embeddings. We experiment with three strategies: (1) a source-side bridging model, where source word embeddings are moved one step closer to the output target sequence; (2) a target-side bridging model, which explores the more relevant source word embeddings for the prediction of the target sequence; and (3) a direct bridging model, which directly connects source and target word embeddings, seeking to minimize errors in translating the former into the latter. Experiments and analysis presented in this paper demonstrate that the proposed bridging models are able to significantly improve the quality of both sentence translation, in general, and alignment and translation of individual source words with target words, in particular.

2017

Ways of Asking and Replying in Duplicate Question Detection
João António Rodrigues | Chakaveh Saedi | Vladislav Maraev | João Silva | António Branco

This paper presents the results of systematic experimentation on the impact that different types of questions have on duplicate question detection, across both a number of established approaches and a novel, superior one used to address this language processing task. This study yields novel insight into the different levels of robustness of the diverse detection methods with respect to different conditions of their application, including the ones that approximate real usage scenarios.

2016

SMT and Hybrid systems of the QTLeap project in the WMT16 IT-task
Rosa Gaudio | Gorka Labaka | Eneko Agirre | Petya Osenova | Kiril Simov | Martin Popel | Dieke Oele | Gertjan van Noord | Luís Gomes | João António Rodrigues | Steven Neale | João Silva | Andreia Querido | Nuno Rendeiro | António Branco

Proceedings of the 2nd Deep Machine Translation Workshop
Jan Hajič | Gertjan van Noord | António Branco

Adding syntactic structure to bilingual terminology for improved domain adaptation
Mikel Artetxe | Gorka Labaka | Chakaveh Saedi | João Rodrigues | João Silva | António Branco | Eneko Agirre

Evaluating Machine Translation in a Usage Scenario
Rosa Gaudio | Aljoscha Burchardt | António Branco

In this document we report on a user-scenario-based evaluation aiming at assessing the performance of machine translation (MT) systems in a real context of use. We describe a series of experiments performed to estimate the usefulness of MT and to test whether improvements in MT technology lead to better performance in the usage scenario. One goal is to find the best methodology for evaluating the potential benefit of a machine translation system in an application. The evaluation is based on the QTLeap corpus, a novel multilingual language resource that was collected through a real-life support service via chat. It is composed of naturally occurring utterances produced by users while interacting with a human technician providing answers. The corpus is available in eight different languages: Basque, Bulgarian, Czech, Dutch, English, German, Portuguese and Spanish.

Use of Domain-Specific Language Resources in Machine Translation
Sanja Štajner | Andreia Querido | Nuno Rendeiro | João António Rodrigues | António Branco

In this paper, we address the problem of Machine Translation (MT) for a specialised domain in a language pair for which only a very small domain-specific parallel corpus is available. We conduct a series of experiments using a purely phrase-based SMT (PBSMT) system and a hybrid MT system (TectoMT), testing three different strategies to overcome the problem of the small amount of in-domain training data. Our results show that adding a small size in-domain bilingual terminology to the small in-domain training corpus leads to the best improvements of a hybrid MT system, while the PBSMT system achieves the best results by adding a combination of in-domain bilingual terminology and a larger out-of-domain corpus. We focus on qualitative human evaluation of the output of two best systems (one for each approach) and perform a systematic in-depth error analysis which revealed advantages of the hybrid MT system over the pure PBSMT system for this specific task.

CINTIL DependencyBank PREMIUM - A Corpus of Grammatical Dependencies for Portuguese
Rita de Carvalho | Andreia Querido | Marisa Campos | Rita Valadas Pereira | João Silva | António Branco

This paper presents a new linguistic resource for the study and computational processing of Portuguese. CINTIL DependencyBank PREMIUM is a corpus of Portuguese news text, accurately manually annotated with a wide range of linguistic information (morpho-syntax, named-entities, syntactic function and semantic roles), making it an invaluable resource, especially for the development and evaluation of data-driven natural language processing tools. The corpus is under active development, reaching 4,000 sentences in its current version. The paper also reports on the training and evaluation of a dependency parser over this corpus. CINTIL DependencyBank PREMIUM is freely available for research purposes through META-SHARE.

Bootstrapping a Hybrid MT System to a New Language Pair
João António Rodrigues | Nuno Rendeiro | Andreia Querido | Sanja Štajner | António Branco

The usual concern when opting for a rule-based or a hybrid machine translation (MT) system is how much effort is required to adapt the system to a different language pair or a new domain. In this paper, we describe a way of adapting an existing hybrid MT system to a new language pair, and show that such a system can outperform a standard phrase-based statistical machine translation system with an average of 10 persons/month of work. This is specifically important in the case of domain-specific MT for which there is not enough parallel data for training a statistical machine translation system.

Word Sense-Aware Machine Translation: Including Senses as Contextual Features for Improved Translation Models
Steven Neale | Luís Gomes | Eneko Agirre | Oier Lopez de Lacalle | António Branco

Although it is commonly assumed that word sense disambiguation (WSD) should help to improve lexical choice and improve the quality of machine translation systems, how to successfully integrate word senses into such systems remains an unanswered question. Some successful approaches have involved reformulating either WSD or the word senses it produces, but work on using traditional word senses to improve machine translation has met with limited success. In this paper, we build upon previous work that experimented on including word senses as contextual features in maxent-based translation models. Training on a large, open-domain corpus (Europarl), we demonstrate that this approach yields significant improvements in machine translation from English to Portuguese.

QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages
Arantxa Otegi | Nora Aranberri | Antonio Branco | Jan Hajič | Martin Popel | Kiril Simov | Eneko Agirre | Petya Osenova | Rita Pereira | João Silva | Steven Neale

This work presents parallel corpora automatically annotated with several NLP tools, including lemma and part-of-speech tagging, named-entity recognition and classification, named-entity disambiguation, word-sense disambiguation, and coreference. The corpora comprise both the well-known Europarl corpus and a domain-specific question-answer troubleshooting corpus on the IT domain. English is common in all parallel corpora, with translations in five languages, namely, Basque, Bulgarian, Czech, Portuguese and Spanish. We describe the annotated corpora and the tools used for annotation, as well as annotation statistics for each language. These new resources are freely available and will help research on semantic processing for machine translation and cross-lingual transfer.

2015

A Flexible Tool for Manual Word Sense Annotation
Steven Neale | João Silva | António Branco

Bootstrapping a hybrid deep MT system
João Silva | João Rodrigues | Luís Gomes | António Branco

Small in Size, Big in Precision: A Case for Using Language-Specific Lexical Resources for Word Sense Disambiguation
Steven Neale | João Silva | António Branco

Proceedings of the 1st Deep Machine Translation Workshop
Jan Hajič | António Branco

First Steps in Using Word Senses as Contextual Features in Maxent Models for Machine Translation
Steven Neale | Luís Gomes | António Branco

Machine Translation for Multilingual Troubleshooting in the IT Domain: A Comparison of Different Strategies
Sanja Štajner | João Rodrigues | Luís Gomes | António Branco

2014

The Strategic Impact of META-NET on the Regional, National and International Level
Georg Rehm | Hans Uszkoreit | Sophia Ananiadou | Núria Bel | Audronė Bielevičienė | Lars Borin | António Branco | Gerhard Budin | Nicoletta Calzolari | Walter Daelemans | Radovan Garabík | Marko Grobelnik | Carmen García-Mateo | Josef van Genabith | Jan Hajič | Inma Hernáez | John Judge | Svetla Koeva | Simon Krek | Cvetana Krstev | Krister Lindén | Bernardo Magnini | Joseph Mariani | John McNaught | Maite Melero | Monica Monachini | Asunción Moreno | Jan Odijk | Maciej Ogrodniczuk | Piotr Pęzik | Stelios Piperidis | Adam Przepiórkowski | Eiríkur Rögnvaldsson | Michael Rosner | Bolette Pedersen | Inguna Skadiņa | Koenraad De Smedt | Marko Tadić | Paul Thompson | Dan Tufiş | Tamás Váradi | Andrejs Vasiļjevs | Kadri Vider | Jolanta Zabarskaite

Answering List Questions using Web as a corpus
Patrícia Gonçalves | António Branco

2013

Temporal Relation Classification Based on Temporal Reasoning
Francisco Costa | António Branco

2012

TimeBankPT: A TimeML Annotated Corpus of Portuguese
Francisco Costa | António Branco

A PropBank for Portuguese: the CINTIL-PropBank
António Branco | Catarina Carvalheiro | Sílvia Pereira | Sara Silveira | João Silva | Sérgio Castro | João Graça

Treebanking by Sentence and Tree Transformation: Building a Treebank to support Question Answering in Portuguese
Patrícia Gonçalves | Rita Santos | António Branco

Assigning Deep Lexical Types Using Structured Classifier Features for Grammatical Dependencies
João Silva | António Branco

Aspectual Type and Temporal Relation Classification
Francisco Costa | António Branco

2011

Uma abordagem de classificação automática para Tipo de Pergunta e Tipo de Resposta (An Automatic Approach for Classification of Question Type and Answer Type) [in Portuguese]
Patricia Nunes Gonçalves | António Horta Branco

2010

Temporal Information Processing of a New Language: Fast Porting with Minimal Resources
Francisco Costa | António Branco

Top-Performing Robust Constituency Parsing of Portuguese: Freely Available in as Many Ways as you Can Get it
João Silva | António Branco | Patricia Gonçalves

Developing a Deep Linguistic Databank Supporting a Collection of Treebanks: the CINTIL DeepGramBank
António Branco | Francisco Costa | João Silva | Sara Silveira | Sérgio Castro | Mariana Avelãs | Clara Pinto | João Graça

2009

Language Independent System for Definition Extraction: First Results Using Learning Algorithms
Rosa Del Gaudio | António Branco

LX-Center: a center of online linguistic services
António Branco | Francisco Costa | Eduardo Ferreira | Pedro Martins | Filipe Nunes | João Silva | Sara Silveira

2008

High Precision Analysis of NPs with a Deep Processing Grammar
António Branco | Francisco Costa

LXGram in the Shared Task “Comparing Semantic Representations” of STEP 2008
António Branco | Francisco Costa

Anaphora Resolution Exercise: an Overview
Constantin Orăsan | Dan Cristea | Ruslan Mitkov | António Branco

LX-Service: Web Services of Language Technology for Portuguese
António Branco | Francisco Costa | Pedro Martins | Filipe Nunes | João Silva | Sara Silveira

2007

Self- or Pre-Tuning? Deep Linguistic Processing of Language Variants
António Branco | Francisco Costa

2006

Open Resources and Tools for the Shallow Processing of Portuguese: The TagShare Project
Florbela Barreto | António Branco | Eduardo Ferreira | Amália Mendes | Maria Fernanda Bacelar do Nascimento | Filipe Nunes | João Ricardo Silva

A Suite of Shallow Processing Tools for Portuguese: LX-Suite
António Branco | João Ricardo Silva

2004

Evaluating Solutions for the Rapid Development of State-of-the-Art POS Taggers for Portuguese
António Branco | João Silva

2002

Nexing Corpus: a corpus of verbal protocols on syllogistic reasoning
António Branco | José Leitão | João Silva | Luís Gomes

Binding Machines
António Branco

2000

Binding Constraints as Instructions of Binding Machines
Antonio Branco

1998

The Logical Structure of Binding
Antonio Branco

The Logical Structure of Binding
Antonio Branco

1996

Subject-oriented and non Subject-oriented Long-distance Anaphora : an Integrated Approach
Antonio Branco | Palmira Marrafa

Branching Split Obliqueness at the Syntax-Semantics Interface
Antonio H. Branco
