Noun-Modifier Semantic Relations (State of the art)
Latest revision as of 05:16, 25 June 2012
- 600 noun-modifier pairs labeled with 30 classes of semantic relations
- 30 classes organized into 5 superclasses
- introduced in Nastase and Szpakowicz (2003)
- subsequently used by many other researchers
- data available from UT Repository
- further information about the data available from Vivi Nastase's data page and from Nastase and Szpakowicz (2003)
Five superclasses
- Causality: "cold virus", "onion tear"
- Temporality: "morning frost", "summer travel"
- Spatial: "aquatic mammal", "west coast", "home remedy"
- Participant: "dream analysis", "mail sorter", "blood donor"
- Quality: "copper coin", "rice paper", "picture book"
Table of results
| Algorithm | Reference | 5-class F-measure | 5-class accuracy | 95% confidence for accuracy |
| --- | --- | --- | --- | --- |
| Baseline | Majority class (Participant) | NA | 43.3% | 39.39-47.30% |
| VSM | Turney and Littman (2005) | 43.2% | 45.7% | 41.75-49.70% |
| SVM+28 | Nulty (2007) | NA | 50.1% | 46.11-54.09% |
| PERT | Turney (2006a) | 50.2% | 54.0% | 50.00-57.95% |
| TiMBL+WordNet | Nastase et al. (2006) | 51.5% | NA | NA |
| LRA | Turney (2006b) | 54.6% | 58.0% | 54.01-61.89% |
Explanation of table
- Algorithm = name of algorithm
- Reference = where to find out more about given algorithm and experiments
- 5-class F-measure = macroaveraged F-measure for the 5 superclasses
- 5-class accuracy = accuracy for the 5 superclasses
- 95% confidence for accuracy = confidence interval calculated using the Wilson score interval
- table rows sorted in order of increasing performance
- Baseline = always guess the majority class (Participant)
- VSM = Vector Space Model
- LRA = Latent Relational Analysis
- PERT = Pertinence
- TiMBL+WordNet = Tilburg Memory Based Learner + WordNet-based representation with word sense information
- SVM+28 = Support Vector Machine + all 28 joining terms
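Both evaluation measures above can be reproduced with a short script. The Wilson intervals below match the Baseline and LRA rows of the table (assuming z = 1.96 and n = 600 test pairs, as reported); the `macro_f1` helper and any counts passed to it are purely illustrative and are not taken from the cited papers.

```python
import math

def wilson_interval(p, n, z=1.96):
    """95% Wilson score confidence interval for a proportion p observed over n trials."""
    center = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    halfwidth = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return center - halfwidth, center + halfwidth

def macro_f1(per_class_counts):
    """Macroaveraged F-measure: the unweighted mean of per-class F1 scores.

    per_class_counts: list of (true_positives, false_positives, false_negatives),
    one tuple per class (five tuples for the five superclasses).
    """
    f1s = []
    for tp, fp, fn in per_class_counts:
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Reproduce two rows of the table (600 test pairs):
lo, hi = wilson_interval(0.433, 600)  # Baseline; table reports 39.39-47.30%
print(f"Baseline: {lo:.2%}-{hi:.2%}")
lo, hi = wilson_interval(0.580, 600)  # LRA; table reports 54.01-61.89%
print(f"LRA: {lo:.2%}-{hi:.2%}")
```

Because the macroaverage weights every superclass equally, an algorithm's 5-class F-measure can diverge noticeably from its 5-class accuracy when the class distribution is skewed, which is why the table reports both.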
Other semantic relation test sets
- Diarmuid Ó Séaghdha: 1443 Compound Nouns
- SemEval 2007 Task 4: Classification of Semantic Relations between Nominals
- SemEval 2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals
- SemEval 2010 Task 9: Noun Compound Interpretation Using Paraphrasing Verbs
References
Nastase, Vivi and Stan Szpakowicz. (2003). Exploring noun-modifier semantic relations. In Fifth International Workshop on Computational Semantics (IWCS-5), pages 285–301, Tilburg, The Netherlands.
Nastase, Vivi, Jelber Sayyad Shirabad, Marina Sokolova, and Stan Szpakowicz. (2006). Learning noun-modifier semantic relations with corpus-based and WordNet-based features. In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-06), pages 781–787, Boston, Massachusetts.
Nulty, Paul. (2007). Semantic classification of noun phrases using web counts and learning algorithms. In Proceedings of the ACL 2007 Student Research Workshop (ACL-07), pages 79–84, Prague, Czech Republic.
Turney, Peter D. (2005). Measuring semantic similarity by latent relational analysis. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI-05), pages 1136–1141, Edinburgh, Scotland.
Turney, Peter D. and Michael L. Littman. (2005). Corpus-based learning of analogies and semantic relations. Machine Learning, 60(1–3):251–278.
Turney, Peter D. (2006a). Expressing implicit semantic relations without supervision. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (Coling/ACL-06), pages 313–320, Sydney, Australia.
Turney, Peter D. (2006b). Similarity of semantic relations. Computational Linguistics, 32(3):379–416.