Difference between revisions of "Temporal Information Extraction (State of the art)"
(Splits Clinical TempEval relations into two tables so that sorting works.) |
|||
(26 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
− | == | + | == TempEval 2007 == |
+ | * '''TempEval''', ''Temporal Relation Identification'', 2007: [http://www.timeml.org/tempeval/ web page] | ||
− | == | + | == TempEval 2010 == |
+ | * '''TempEval-2''', ''Evaluating Events, Time Expressions, and Temporal Relations'', 2010: [http://www.timeml.org/tempeval2/ web page] | ||
− | == | + | == TempEval 2013 == |
− | + | * '''TempEval-3''', ''Evaluating Time Expressions, Events, and Temporal Relations'', 2013: [http://www.cs.york.ac.uk/semeval-2013/task1/ web page] | |
− | === | + | === Performance measures === |
− | + | === Results === | |
+ | Tables show the best result for each system. Lower scoring runs for the same system are not shown. | ||
+ | |||
+ | ====Task A: Temporal expression extraction and normalisation==== | ||
+ | {| width="100%" class="wikitable sortable" | ||
|- | |- | ||
− | ! rowspan="3" | System name | + | ! rowspan="3" | System name (best run) |
! rowspan="3" | Short description | ! rowspan="3" | Short description | ||
! rowspan="3" | Main publication | ! rowspan="3" | Main publication | ||
Line 32: | Line 38: | ||
! Value | ! Value | ||
|- | |- | ||
− | | HeidelTime | + | | HeidelTime (t) |
− | | | + | | rule-based |
− | | | + | | <ref name="Stroetgen-2013">Stro ̈tgen, J., Zell, J., and Gertz, M. [http://www.aclweb.org/anthology/S/S13/S13-2003.pdf Heideltime: Tuning english and developing spanish resources for tempeval-3]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 15–19.</ref> |
− | | | + | | 83.85 |
− | | | + | | 78.99 |
− | | | + | | 81.34 |
− | | | + | | 93.08 |
− | | | + | | 87.68 |
− | | | + | | 90.30 |
− | | | + | | 90.91 |
− | | | + | | '''85.95''' |
− | | | + | | '''77.61''' |
− | | | + | | [http://dbs.ifi.uni-heidelberg.de/index.php?id=129 Download] |
− | | | + | | [http://www.gnu.org/licenses/gpl.html GNU GPL v3] |
|- | |- | ||
− | | | + | | NavyTime (1,2) |
− | | | + | | rule-based |
− | | | + | | <ref name="Chambers-2013">Chambers, N. [http://www.aclweb.org/anthology/S/S13/S13-2012.pdf Navytime: Event and time ordering from raw text]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 73–77.</ref> |
− | | | + | | 78.72 |
− | | | + | | '''80.43''' |
− | | | + | | 79.57 |
− | | | + | | 89.36 |
− | | | + | | '''91.30''' |
− | | | + | | '''90.32''' |
− | | | + | | 88.90 |
− | | | + | | 78.58 |
− | | | + | | 70.97 |
− | | | + | | - |
− | | | + | | - |
|- | |- | ||
− | | | + | | ManTIME (4) |
− | | | + | | CRF, probabilistic post-processing pipeline, rule-based normaliser |
− | | | + | | <ref name="Filannino-2013">Filannino, M., Brown, G., and Nenadic, G. [http://www.aclweb.org/anthology/S/S13/S13-2009.pdf ManTIME: Temporal expression identification and normalization in the Tempeval-3 challenge]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evalu- ation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 53–57.</ref> |
− | | | + | | 78.86 |
− | | | + | | 70.29 |
− | | | + | | 74.33 |
− | | | + | | 95.12 |
− | | | + | | 84.78 |
− | | | + | | 89.66 |
− | | | + | | 86.31 |
− | | | + | | 76.92 |
− | | | + | | 68.97 |
− | | | + | | [http://www.cs.man.ac.uk/~filannim/projects/tempeval-3/ Demo & Download] |
− | | | + | | [http://www.gnu.org/licenses/gpl-2.0.html GNU GPL v2] |
|- | |- | ||
− | | | + | | SUTime |
− | | | + | | deterministic rule-based |
− | | | + | | <ref name="Chang-2013">Chang, A., and Manning, C. D. [http://www.aclweb.org/anthology/S/S13/S13-2013.pdf SUTime: Evaluation in TempEval-3]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 78–82.</ref> |
− | | | + | | 78.72 |
− | | | + | | '''80.43''' |
− | | | + | | 79.57 |
− | | | + | | 89.36 |
− | | | + | | '''91.30''' |
− | | | + | | '''90.32''' |
− | | | + | | 88.90 |
− | | | + | | 74.60 |
− | | | + | | 67.38 |
− | | | + | | [http://nlp.stanford.edu/software/sutime.shtml Demo & Download] |
− | | | + | | [http://www.gnu.org/licenses/gpl-2.0.html GNU GPL v2] |
|- | |- | ||
− | | | + | | ATT (2) |
− | | | + | | MaxEnt, third party normalisers |
− | | | + | | <ref name="Jung-2013">Jung, H., and Stent, A. [http://www.aclweb.org/anthology/S/S13/S13-2004.pdf ATT1: Temporal annotation using big windows and rich syntactic and semantic features]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 20–24.</ref> |
− | | | + | | '''90.57''' |
− | | | + | | 69.57 |
− | | | + | | 78.69 |
− | | | + | | '''98.11''' |
− | | | + | | 75.36 |
− | | | + | | 85.25 |
− | | | + | | 91.34 |
− | | | + | | 76.91 |
− | | | + | | 65.57 |
− | | | + | | - |
− | | | + | | - |
|- | |- | ||
− | | | + | | ClearTK (1,2) |
− | | | + | | SVM, Logistic Regression, third party normaliser |
− | | | + | | <ref name="Bethard-2013">Bethard, S. [http://www.aclweb.org/anthology/S/S13/S13-2002.pdf ClearTK-TimeML: A minimalist approach to tempeval 2013]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), vol. 2, Association for Computational Linguistics, Association for Computational Linguistics, pp. 10–14.</ref> |
− | | | + | | 85.94 |
− | | | + | | 79.71 |
− | | | + | | '''82.71''' |
− | | | + | | 93.75 |
− | | | + | | 86.96 |
− | | | + | | 90.23 |
− | | | + | | '''93.33''' |
− | | | + | | 71.66 |
− | | | + | | 64.66 |
− | | | + | | [https://code.google.com/p/cleartk/ Download] |
− | | | + | | [http://opensource.org/licenses/BSD-3-Clause BSD-3 Clause] |
|- | |- | ||
− | | + | JU_CSE |
− | | | + | | CRF, rule-based normaliser |
− | | | + | | <ref name="Kolya-2013">Kolya, A. K., Kundu, A., Gupta, R., Ekbal, A., and Bandyopadhyay, S. [http://www.aclweb.org/anthology/S/S13/S13-2011.pdf JU_CSE: A CRF based approach to annotation of temporal expression, event and temporal relations]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 64–72.</ref> |
− | | | + | | 81.51 |
− | | | + | | 70.29 |
− | | | + | | 75.49 |
− | | | + | | 93.28 |
− | | | + | | 80.43 |
− | | | + | | 86.38 |
− | | | + | | 87.39 |
− | | | + | | 73.87 |
− | | | + | | 63.81 |
− | | | + | | - |
− | | | + | | - |
|- | |- | ||
− | | | + | | KUL (2) |
− | | | + | | Logistic regression, post-processing, rule-based normaliser |
− | | | + | | <ref name="Kolomiyets-2013">Kolomiyets, O., and Moens, M.-F. [http://www.aclweb.org/anthology/S/S13/S13-2014.pdf KUL: Data-driven approach to temporal parsing of newswire articles]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceed- ings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 83–87.</ref> |
− | | | + | | 76.99 |
− | | | + | | 63.04 |
− | | | + | | 69.32 |
− | | | + | | 92.92 |
− | | | + | | 76.09 |
− | | | + | | 83.67 |
− | | | + | | 88.56 |
− | | | + | | 75.24 |
− | | | + | | 62.95 |
− | | | + | | - |
− | | | + | | - |
|- | |- | ||
− | | | + | | FSS-TimEx |
− | | | + | | rule-based |
− | | | + | | <ref name="Zavarella-2013">Zavarella, V., and Tanev, H. [http://www.aclweb.org/anthology/S/S13/S13-2010.pdf FSS-TimEx for tempeval-3: Extracting temporal information from text]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 58–63.</ref> |
− | | | + | | 52.03 |
− | | | + | | 46.38 |
− | | | + | | 49.04 |
− | | | + | | 90.24 |
− | | | + | | 80.43 |
− | | | + | | 85.06 |
− | | | + | | 81.08 |
− | | | + | | 68.47 |
− | | | + | | 58.24 |
− | | | + | | - |
− | | | + | | - |
|- | |- | ||
− | | | + | |} |
− | + | ||
− | + | ====Task B: Event extraction and classification==== | |
− | + | {| width="100%" class="wikitable sortable" | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | | | ||
|- | |- | ||
− | | | + | ! rowspan="3" | System name (best run) |
− | + | ! rowspan="3" | Short description | |
− | + | ! rowspan="3" | Main publication | |
− | + | ! colspan="3" | Identification | |
− | + | ! colspan="3" | Attributes | |
− | + | ! rowspan="3" | Overall score | |
− | + | ! rowspan="3" | Software | |
− | | | + | ! rowspan="3" | License |
− | | | ||
− | | | ||
− | | | ||
− | | | ||
− | | | ||
− | | | ||
|- | |- | ||
− | | | + | ! colspan="3" | Strict matching |
− | + | ! colspan="3" | Accuracy | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | | | ||
|- | |- | ||
− | + | ! Pre. | |
− | + | ! Rec. | |
− | + | ! F1 | |
− | + | ! Class | |
− | + | ! Tense | |
− | + | ! Aspect | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
|- | |- | ||
− | | | + | | ATT (1) |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
| | | | ||
+ | | <ref name="Jung-2013"/> | ||
+ | | 81.44 | ||
+ | | 80.67 | ||
+ | | '''81.05''' | ||
+ | | 88.69 | ||
+ | | 73.37 | ||
+ | | 90.68 | ||
+ | | '''71.88''' | ||
| | | | ||
| | | | ||
|- | |- | ||
− | | | + | | KUL (2) |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
| | | | ||
+ | | <ref name="Kolomiyets-2013"/> | ||
+ | | 80.69 | ||
+ | | 77.99 | ||
+ | | 79.32 | ||
+ | | 88.46 | ||
+ | | - | ||
+ | | - | ||
+ | | 70.17 | ||
| | | | ||
| | | | ||
|- | |- | ||
− | | | + | | ClearTK (4) |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
| | | | ||
+ | | <ref name="Bethard-2013"/> | ||
+ | | 81.40 | ||
+ | | 76.38 | ||
+ | | 78.81 | ||
+ | | 86.12 | ||
+ | | 78.20 | ||
+ | | 90.86 | ||
+ | | 67.87 | ||
+ | | [https://code.google.com/p/cleartk/ Download] | ||
+ | | [http://opensource.org/licenses/BSD-3-Clause BSD-3 Clause] | ||
|- | |- | ||
− | | | + | | NavyTime (1) |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
| | | | ||
+ | | <ref name="Chambers-2013"/> | ||
+ | | 80.73 | ||
+ | | 79.87 | ||
+ | | 80.30 | ||
+ | | 84.03 | ||
+ | | 75.79 | ||
+ | | 91.26 | ||
+ | | 67.48 | ||
| | | | ||
| | | | ||
|- | |- | ||
− | | | + | | Temp: (ESAfeature) |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
| | | | ||
+ | | X, 2013 | ||
+ | | 78.33 | ||
+ | | 61.61 | ||
+ | | 68.97 | ||
+ | | 79.09 | ||
+ | | - | ||
+ | | - | ||
+ | | 54.55 | ||
| | | | ||
| | | | ||
|- | |- | ||
− | | | + | | JU_CSE |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
| | | | ||
+ | | <ref name="Kolya-2013"/> | ||
+ | | 80.85 | ||
+ | | 76.51 | ||
+ | | 78.62 | ||
+ | | 67.02 | ||
+ | | 74.56 | ||
+ | | 91.76 | ||
+ | | 52.69 | ||
| | | | ||
| | | | ||
|- | |- | ||
− | | FSS- | + | | FSS-TimeEx |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
| | | | ||
+ | | <ref name="Zavarella-2013"/> | ||
+ | | 63.13 | ||
+ | | 67.11 | ||
+ | | 65.06 | ||
+ | | 66.00 | ||
+ | | - | ||
+ | | - | ||
+ | | 42.94 | ||
| | | | ||
| | | | ||
Line 334: | Line 290: | ||
|} | |} | ||
− | ===Task | + | ====Task C: Annotating relations given gold entities==== |
+ | |||
+ | ====Task C relation only: Annotating relations given gold entities and related pairs==== | ||
+ | |||
+ | ====Task ABC: Temporal awareness evaluation==== | ||
+ | |||
+ | == Clinical TempEval 2015 == | ||
+ | * '''Clinical TempEval 2015''', ''Clinical TempEval'', 2015: [http://alt.qcri.org/semeval2015/task6/ web page] | ||
+ | |||
+ | === Performance measures === | ||
+ | |||
+ | === Results === | ||
+ | Tables show the best result for each system. Lower scoring runs for the same system are not shown. | ||
+ | |||
+ | ====Time expressions==== | ||
+ | {| width="100%" class="wikitable sortable" | ||
+ | |- | ||
+ | ! rowspan="2" | System name (best run) | ||
+ | ! rowspan="2" | Short description | ||
+ | ! rowspan="2" | Main publication | ||
+ | ! colspan="3" | Span | ||
+ | ! colspan="4" | Class | ||
+ | ! rowspan="2" | Software | ||
+ | ! rowspan="2" | License | ||
+ | |- | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! A | ||
+ | |- | ||
+ | | Baseline: memorize | ||
+ | | - | ||
+ | | - | ||
+ | | 0.743 | ||
+ | | 0.372 | ||
+ | | 0.496 | ||
+ | | 0.723 | ||
+ | | 0.362 | ||
+ | | 0.483 | ||
+ | | 0.974 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | | KPSCMI: run 1 | ||
+ | | Rule-based | ||
+ | | - | ||
+ | | 0.272 | ||
+ | | 0.782 | ||
+ | | 0.404 | ||
+ | | 0.223 | ||
+ | | 0.642 | ||
+ | | 0.331 | ||
+ | | 0.819 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | | KPSCMI: run 3 | ||
+ | | Supervised machine learning | ||
+ | | - | ||
+ | | 0.693 | ||
+ | | 0.706 | ||
+ | | 0.699 | ||
+ | | 0.657 | ||
+ | | 0.669 | ||
+ | | 0.663 | ||
+ | | 0.948 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | | UFPRSheffield-SVM: run 2 | ||
+ | | Supervised machine learning | ||
+ | | - | ||
+ | | 0.741 | ||
+ | | 0.655 | ||
+ | | 0.695 | ||
+ | | 0.723 | ||
+ | | 0.640 | ||
+ | | 0.679 | ||
+ | | 0.977 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | | UFPRSheffield-Hynx: run 5 | ||
+ | | Rule-based | ||
+ | | - | ||
+ | | 0.411 | ||
+ | | 0.795 | ||
+ | | 0.542 | ||
+ | | 0.391 | ||
+ | | 0.756 | ||
+ | | 0.516 | ||
+ | | 0.952 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | | BluLab: run 1-3 | ||
+ | | Supervised machine learning | ||
+ | | - | ||
+ | | 0.797 | ||
+ | | 0.664 | ||
+ | | 0.725 | ||
+ | | 0.778 | ||
+ | | 0.652 | ||
+ | | 0.709 | ||
+ | | 0.978 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | |} | ||
− | === | + | ====Event expressions==== |
+ | {| width="100%" class="wikitable sortable" | ||
+ | |- | ||
+ | ! rowspan="2" | System name (best run) | ||
+ | ! rowspan="2" | Short description | ||
+ | ! rowspan="2" | Main publication | ||
+ | ! colspan="3" | Span | ||
+ | ! colspan="4" | Modality | ||
+ | ! colspan="4" | Degree | ||
+ | ! colspan="4" | Polarity | ||
+ | ! colspan="4" | Type | ||
+ | ! rowspan="2" | Software | ||
+ | ! rowspan="2" | License | ||
+ | |- | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! A | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! A | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! A | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! A | ||
+ | |- | ||
+ | | Baseline | ||
+ | | Memorize | ||
+ | | - | ||
+ | | 0.876 | ||
+ | | 0.810 | ||
+ | | 0.842 | ||
+ | | 0.810 | ||
+ | | 0.749 | ||
+ | | 0.778 | ||
+ | | 0.924 | ||
+ | | 0.871 | ||
+ | | 0.806 | ||
+ | | 0.838 | ||
+ | | 0.995 | ||
+ | | 0.800 | ||
+ | | 0.740 | ||
+ | | 0.769 | ||
+ | | 0.913 | ||
+ | | 0.846 | ||
+ | | 0.783 | ||
+ | | 0.813 | ||
+ | | 0.966 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | | BluLab: run 1-3 | ||
+ | | Supervised machine learning | ||
+ | | - | ||
+ | | 0.887 | ||
+ | | 0.864 | ||
+ | | 0.875 | ||
+ | | 0.834 | ||
+ | | 0.813 | ||
+ | | 0.824 | ||
+ | | 0.942 | ||
+ | | 0.882 | ||
+ | | 0.859 | ||
+ | | 0.870 | ||
+ | | 0.994 | ||
+ | | 0.868 | ||
+ | | 0.846 | ||
+ | | 0.857 | ||
+ | | 0.979 | ||
+ | | 0.834 | ||
+ | | 0.812 | ||
+ | | 0.823 | ||
+ | | 0.941 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | |} | ||
− | === | + | ====Temporal relations==== |
+ | Phase 1: text only | ||
+ | {| width="100%" class="wikitable sortable" | ||
+ | |- | ||
+ | ! rowspan="2" | System name (best run) | ||
+ | ! rowspan="2" | Short description | ||
+ | ! rowspan="2" | Main publication | ||
+ | ! colspan="3" | To Document Time | ||
+ | ! colspan="6" | Narrative Containers | ||
+ | ! rowspan="2" | Software | ||
+ | ! rowspan="2" | License | ||
+ | |- | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | |- | ||
+ | | Baseline | ||
+ | | Memorize | ||
+ | | - | ||
+ | | 0.600 | ||
+ | | 0.555 | ||
+ | | 0.577 | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | | Baseline | ||
+ | | TIMEX3 to closest EVENT | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | 0.368 | ||
+ | | 0.061 | ||
+ | | 0.104 | ||
+ | | 0.400 | ||
+ | | 0.061 | ||
+ | | 0.106 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | | BluLab: run 2 | ||
+ | | Supervised machine learning | ||
+ | | - | ||
+ | | 0.712 | ||
+ | | 0.693 | ||
+ | | 0.702 | ||
+ | | 0.080 | ||
+ | | 0.142 | ||
+ | | 0.102 | ||
+ | | 0.094 | ||
+ | | 0.179 | ||
+ | | 0.123 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | |} | ||
− | == | + | Phase 2: manual EVENTs and TIMEX3s |
− | + | {| width="100%" class="wikitable sortable" | |
− | + | |- | |
− | + | ! rowspan="2" | System name (best run) | |
+ | ! rowspan="2" | Short description | ||
+ | ! rowspan="2" | Main publication | ||
+ | ! colspan="3" | To Document Time | ||
+ | ! colspan="6" | Narrative Containers | ||
+ | ! rowspan="2" | Software | ||
+ | ! rowspan="2" | License | ||
+ | |- | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | ! P | ||
+ | ! R | ||
+ | ! F1 | ||
+ | |- | ||
+ | | Baseline | ||
+ | | Memorize | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | 0.608 | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | | Baseline | ||
+ | | TIMEX3 to closest EVENT | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | 0.433 | ||
+ | | 0.162 | ||
+ | | 0.235 | ||
+ | | 0.469 | ||
+ | | 0.162 | ||
+ | | 0.240 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | | BluLab: run 2 | ||
+ | | Supervised machine learning | ||
+ | | - | ||
+ | | - | ||
+ | | - | ||
+ | | 0.791 | ||
+ | | 0.109 | ||
+ | | 0.210 | ||
+ | | 0.143 | ||
+ | | 0.140 | ||
+ | | 0.254 | ||
+ | | 0.181 | ||
+ | | - | ||
+ | | - | ||
+ | |- | ||
+ | |} | ||
==References== | ==References== | ||
+ | <references/> | ||
+ | Unsorted | ||
* UzZaman, N., Llorens, H., Derczynski, L., Allen, J., Verhagen, M., and Pustejovsky, J. [http://www.aclweb.org/anthology/S/S13/S13-2001.pdf Semeval-2013 task 1: Tempeval-3: Evaluating time expressions, events, and temporal relations]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 1–9. | * UzZaman, N., Llorens, H., Derczynski, L., Allen, J., Verhagen, M., and Pustejovsky, J. [http://www.aclweb.org/anthology/S/S13/S13-2001.pdf Semeval-2013 task 1: Tempeval-3: Evaluating time expressions, events, and temporal relations]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 1–9. | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
* Laokulrat, N., Miwa, M., Tsuruoka, Y., and Chikayama, T. [http://www.aclweb.org/anthology/S/S13/S13-2015.pdf UTTime: Temporal relation classification using deep syntactic features]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 88–92. | * Laokulrat, N., Miwa, M., Tsuruoka, Y., and Chikayama, T. [http://www.aclweb.org/anthology/S/S13/S13-2015.pdf UTTime: Temporal relation classification using deep syntactic features]. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 88–92. | ||
Revision as of 08:11, 31 March 2015
TempEval 2007
- TempEval, Temporal Relation Identification, 2007: web page
TempEval 2010
- TempEval-2, Evaluating Events, Time Expressions, and Temporal Relations, 2010: web page
TempEval 2013
- TempEval-3, Evaluating Time Expressions, Events, and Temporal Relations, 2013: web page
Performance measures
Results
Tables show the best result for each system. Lower scoring runs for the same system are not shown.
Task A: Temporal expression extraction and normalisation
System name (best run) | Short description | Main publication | Identification, strict matching: Pre. | Identification, strict matching: Rec. | Identification, strict matching: F1 | Identification, lenient matching: Pre. | Identification, lenient matching: Rec. | Identification, lenient matching: F1 | Normalisation accuracy: Type | Normalisation accuracy: Value | Overall score | Software | License |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
HeidelTime (t) | rule-based | [1] | 83.85 | 78.99 | 81.34 | 93.08 | 87.68 | 90.30 | 90.91 | 85.95 | 77.61 | Download | GNU GPL v3 |
NavyTime (1,2) | rule-based | [2] | 78.72 | 80.43 | 79.57 | 89.36 | 91.30 | 90.32 | 88.90 | 78.58 | 70.97 | - | - |
ManTIME (4) | CRF, probabilistic post-processing pipeline, rule-based normaliser | [3] | 78.86 | 70.29 | 74.33 | 95.12 | 84.78 | 89.66 | 86.31 | 76.92 | 68.97 | Demo & Download | GNU GPL v2 |
SUTime | deterministic rule-based | [4] | 78.72 | 80.43 | 79.57 | 89.36 | 91.30 | 90.32 | 88.90 | 74.60 | 67.38 | Demo & Download | GNU GPL v2 |
ATT (2) | MaxEnt, third party normalisers | [5] | 90.57 | 69.57 | 78.69 | 98.11 | 75.36 | 85.25 | 91.34 | 76.91 | 65.57 | - | - |
ClearTK (1,2) | SVM, Logistic Regression, third party normaliser | [6] | 85.94 | 79.71 | 82.71 | 93.75 | 86.96 | 90.23 | 93.33 | 71.66 | 64.66 | Download | BSD-3 Clause |
JU_CSE | CRF, rule-based normaliser | [7] | 81.51 | 70.29 | 75.49 | 93.28 | 80.43 | 86.38 | 87.39 | 73.87 | 63.81 | - | - |
KUL (2) | Logistic regression, post-processing, rule-based normaliser | [8] | 76.99 | 63.04 | 69.32 | 92.92 | 76.09 | 83.67 | 88.56 | 75.24 | 62.95 | - | - |
FSS-TimEx | rule-based | [9] | 52.03 | 46.38 | 49.04 | 90.24 | 80.43 | 85.06 | 81.08 | 68.47 | 58.24 | - | - |
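The F1 columns in the table above are consistent with the usual harmonic mean of precision and recall, and each overall score matches the lenient-matching F1 multiplied by the value accuracy. The second relationship is inferred here from the rows of the table rather than stated on this page, so the Python sketch below treats it as an assumption. The function names are illustrative only and do not come from any TempEval tool.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def overall_score(lenient_f1, value_accuracy):
    """Assumed relationship: lenient F1 scaled by value accuracy.

    Both arguments are percentages (0-100), as in the table above.
    """
    return lenient_f1 * value_accuracy / 100.0


# HeidelTime (t) row: strict Pre. 83.85, strict Rec. 78.99,
# lenient F1 90.30, value accuracy 85.95.
heideltime_strict_f1 = f1_score(83.85, 78.99)
heideltime_overall = overall_score(90.30, 85.95)

print(f"strict F1 ~ {heideltime_strict_f1:.2f}")
print(f"overall   ~ {heideltime_overall:.2f}")
```

The results land within a few hundredths of the reported 81.34 and 77.61. The small gap is presumably because the published precision and recall are already rounded.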
Task B: Event extraction and classification
System name (best run) | Short description | Main publication | Identification, strict matching: Pre. | Identification, strict matching: Rec. | Identification, strict matching: F1 | Attribute accuracy: Class | Attribute accuracy: Tense | Attribute accuracy: Aspect | Overall score | Software | License |
---|---|---|---|---|---|---|---|---|---|---|---|
ATT (1) | | [5] | 81.44 | 80.67 | 81.05 | 88.69 | 73.37 | 90.68 | 71.88 | | |
KUL (2) | | [8] | 80.69 | 77.99 | 79.32 | 88.46 | - | - | 70.17 | | |
ClearTK (4) | | [6] | 81.40 | 76.38 | 78.81 | 86.12 | 78.20 | 90.86 | 67.87 | Download | BSD-3 Clause |
NavyTime (1) | | [2] | 80.73 | 79.87 | 80.30 | 84.03 | 75.79 | 91.26 | 67.48 | | |
Temp: (ESAfeature) | | X, 2013 | 78.33 | 61.61 | 68.97 | 79.09 | - | - | 54.55 | | |
JU_CSE | | [7] | 80.85 | 76.51 | 78.62 | 67.02 | 74.56 | 91.76 | 52.69 | | |
FSS-TimEx | | [9] | 63.13 | 67.11 | 65.06 | 66.00 | - | - | 42.94 | | |
Task C: Annotating relations given gold entities
Task ABC: Temporal awareness evaluation
Clinical TempEval 2015
- Clinical TempEval 2015, Clinical TempEval, 2015: web page
Performance measures
Results
Tables show the best result for each system. Lower scoring runs for the same system are not shown.
Time expressions
System name (best run) | Short description | Main publication | Span P | Span R | Span F1 | Class P | Class R | Class F1 | Class A | Software | License |
---|---|---|---|---|---|---|---|---|---|---|---|
Baseline: memorize | - | - | 0.743 | 0.372 | 0.496 | 0.723 | 0.362 | 0.483 | 0.974 | - | - |
KPSCMI: run 1 | Rule-based | - | 0.272 | 0.782 | 0.404 | 0.223 | 0.642 | 0.331 | 0.819 | - | - |
KPSCMI: run 3 | Supervised machine learning | - | 0.693 | 0.706 | 0.699 | 0.657 | 0.669 | 0.663 | 0.948 | - | - |
UFPRSheffield-SVM: run 2 | Supervised machine learning | - | 0.741 | 0.655 | 0.695 | 0.723 | 0.640 | 0.679 | 0.977 | - | - |
UFPRSheffield-Hynx: run 5 | Rule-based | - | 0.411 | 0.795 | 0.542 | 0.391 | 0.756 | 0.516 | 0.952 | - | - |
BluLab: run 1-3 | Supervised machine learning | - | 0.797 | 0.664 | 0.725 | 0.778 | 0.652 | 0.709 | 0.978 | - | - |
Event expressions
System name (best run) | Short description | Main publication | Span P | Span R | Span F1 | Modality P | Modality R | Modality F1 | Modality A | Degree P | Degree R | Degree F1 | Degree A | Polarity P | Polarity R | Polarity F1 | Polarity A | Type P | Type R | Type F1 | Type A | Software | License |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Baseline | Memorize | - | 0.876 | 0.810 | 0.842 | 0.810 | 0.749 | 0.778 | 0.924 | 0.871 | 0.806 | 0.838 | 0.995 | 0.800 | 0.740 | 0.769 | 0.913 | 0.846 | 0.783 | 0.813 | 0.966 | - | - |
BluLab: run 1-3 | Supervised machine learning | - | 0.887 | 0.864 | 0.875 | 0.834 | 0.813 | 0.824 | 0.942 | 0.882 | 0.859 | 0.870 | 0.994 | 0.868 | 0.846 | 0.857 | 0.979 | 0.834 | 0.812 | 0.823 | 0.941 | - | - |
Temporal relations
Phase 1: text only
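System name (best run) | Short description | Main publication | To Document Time P | To Document Time R | To Document Time F1 | Narrative Containers P | Narrative Containers R | Narrative Containers F1 | Narrative Containers P | Narrative Containers R | Narrative Containers F1 | Software | License |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|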
Baseline | Memorize | - | 0.600 | 0.555 | 0.577 | - | - | - | - | - | - | - | - |
Baseline | TIMEX3 to closest EVENT | - | - | - | - | 0.368 | 0.061 | 0.104 | 0.400 | 0.061 | 0.106 | - | - |
BluLab: run 2 | Supervised machine learning | - | 0.712 | 0.693 | 0.702 | 0.080 | 0.142 | 0.102 | 0.094 | 0.179 | 0.123 | - | - |
Phase 2: manual EVENTs and TIMEX3s
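System name (best run) | Short description | Main publication | To Document Time P | To Document Time R | To Document Time F1 | Narrative Containers P | Narrative Containers R | Narrative Containers F1 | Narrative Containers P | Narrative Containers R | Narrative Containers F1 | Software | License |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|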
Baseline | Memorize | - | - | - | 0.608 | - | - | - | - | - | - | - | - |
Baseline | TIMEX3 to closest EVENT | - | - | - | - | 0.433 | 0.162 | 0.235 | 0.469 | 0.162 | 0.240 | - | - |
BluLab: run 2 | Supervised machine learning | - | - | - | 0.791 | 0.109 | 0.210 | 0.143 | 0.140 | 0.254 | 0.181 | - | - |
References
- ↑ Strötgen, J., Zell, J., and Gertz, M. Heideltime: Tuning english and developing spanish resources for tempeval-3. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 15–19.
- ↑ 2.0 2.1 Chambers, N. Navytime: Event and time ordering from raw text. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 73–77.
- ↑ Filannino, M., Brown, G., and Nenadic, G. ManTIME: Temporal expression identification and normalization in the Tempeval-3 challenge. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 53–57.
- ↑ Chang, A., and Manning, C. D. SUTime: Evaluation in TempEval-3. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 78–82.
- ↑ 5.0 5.1 Jung, H., and Stent, A. ATT1: Temporal annotation using big windows and rich syntactic and semantic features. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 20–24.
- ↑ 6.0 6.1 Bethard, S. ClearTK-TimeML: A minimalist approach to tempeval 2013. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 10–14.
- ↑ 7.0 7.1 Kolya, A. K., Kundu, A., Gupta, R., Ekbal, A., and Bandyopadhyay, S. JU_CSE: A CRF based approach to annotation of temporal expression, event and temporal relations. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 64–72.
- ↑ 8.0 8.1 Kolomiyets, O., and Moens, M.-F. KUL: Data-driven approach to temporal parsing of newswire articles. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 83–87.
- ↑ 9.0 9.1 Zavarella, V., and Tanev, H. FSS-TimEx for tempeval-3: Extracting temporal information from text. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 58–63.
Unsorted
- UzZaman, N., Llorens, H., Derczynski, L., Allen, J., Verhagen, M., and Pustejovsky, J. Semeval-2013 task 1: Tempeval-3: Evaluating time expressions, events, and temporal relations. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 1–9.
- Laokulrat, N., Miwa, M., Tsuruoka, Y., and Chikayama, T. UTTime: Temporal relation classification using deep syntactic features. In Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013) (Atlanta, Georgia, USA, June 2013), Association for Computational Linguistics, pp. 88–92.