Noun-Modifier Semantic Relations (State of the art)

From ACL Wiki

Latest revision as of 05:16, 25 June 2012

  • 600 noun-modifier pairs labeled with 30 classes of semantic relations
  • 30 classes organized into 5 superclasses
  • introduced in Nastase and Szpakowicz (2003)
  • subsequently used by many other researchers
  • data available from the UT Repository (http://www.cs.utexas.edu/~mfkb/nn/data/vivi.tar.gz)
  • information about the data available from Vivi Nastase (http://www.site.uottawa.ca/~vnastase/noun_modifier_data.html) and from Nastase and Szpakowicz (2003) (http://www.csi.uottawa.ca/~vnastase/papers/wn_roget.ps)


Five superclasses

  • Causality: "cold virus", "onion tear"
  • Temporality: "morning frost", "summer travel"
  • Spatial: "aquatic mammal", "west coast", "home remedy"
  • Participant: "dream analysis", "mail sorter", "blood donor"
  • Quality: "copper coin", "rice paper", "picture book"
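The taxonomy above is small enough to write down directly; a minimal sketch in Python (the example compounds are only the ones listed above, not the full 600-pair dataset, and the released data files use their own format):

```python
# The five superclasses with the example compounds cited above.
# Illustrative only: the released data labels each of the 600 pairs
# with one of 30 fine-grained relations that roll up into these five.
SUPERCLASSES = {
    "Causality": ["cold virus", "onion tear"],
    "Temporality": ["morning frost", "summer travel"],
    "Spatial": ["aquatic mammal", "west coast", "home remedy"],
    "Participant": ["dream analysis", "mail sorter", "blood donor"],
    "Quality": ["copper coin", "rice paper", "picture book"],
}

def superclass_of(compound):
    """Look up which superclass an example compound belongs to."""
    for label, examples in SUPERCLASSES.items():
        if compound in examples:
            return label
    return None

print(superclass_of("copper coin"))  # Quality
```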


Table of results

Algorithm      Reference                     5-class F-measure  5-class accuracy  95% confidence for accuracy
Baseline       Majority class (Participant)  NA                 43.3%             39.39-47.30%
VSM            Turney and Littman (2005)     43.2%              45.7%             41.75-49.70%
SVM+28         Nulty (2007)                  NA                 50.1%             46.11-54.09%
PERT           Turney (2006a)                50.2%              54.0%             50.00-57.95%
TiMBL+WordNet  Nastase et al. (2006)         51.5%              NA                NA
LRA            Turney (2006b)                54.6%              58.0%             54.01-61.89%
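The last column can be reproduced from each row's accuracy and the test-set size (600 labeled pairs) using the Wilson score interval for a binomial proportion; a minimal check:

```python
import math

def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    p_hat: observed accuracy, n: number of test examples,
    z: normal quantile (1.96 for a 95% interval).
    """
    denom = 1 + z * z / n
    centre = p_hat + z * z / (2 * n)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return (centre - margin) / denom, (centre + margin) / denom

# Majority-class baseline row: 43.3% accuracy on the 600 pairs.
lo, hi = wilson_interval(0.433, 600)
print(f"{100 * lo:.2f}-{100 * hi:.2f}%")  # 39.39-47.29%, the table's 39.39-47.30% up to rounding
```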


Explanation of table

  • Algorithm = name of algorithm
  • Reference = where to find out more about given algorithm and experiments
  • 5-class F-measure = macroaveraged F-measure for the 5 superclasses
  • 5-class accuracy = accuracy for the 5 superclasses
  • 95% confidence for accuracy = confidence interval calculated using the Wilson Test (http://www.quantitativeskills.com/sisa/statistics/onemean.htm)
  • table rows sorted in order of increasing performance
  • Baseline = always guess the majority class (Participant)
  • VSM = Vector Space Model
  • LRA = Latent Relational Analysis
  • PERT = Pertinence
  • TiMBL+WordNet = Tilburg Memory Based Learner + WordNet-based representation with word sense information
  • SVM+28 = Support Vector Machine + all 28 joining terms
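Macroaveraging computes F1 separately for each superclass and then averages with equal weight per class, so small classes count as much as the large Participant class. A minimal sketch on toy labels (not the real dataset):

```python
def macro_f1(y_true, y_pred, classes):
    """Macroaveraged F-measure: per-class F1, averaged with equal
    weight per class regardless of class size."""
    f_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        f_scores.append(f1)
    return sum(f_scores) / len(f_scores)

# Toy example over two of the five superclasses.
y_true = ["Participant", "Participant", "Quality", "Quality"]
y_pred = ["Participant", "Quality", "Quality", "Quality"]
print(round(macro_f1(y_true, y_pred, ["Participant", "Quality"]), 3))  # 0.733
```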


Other semantic relation test sets

  • Diarmuid Ó Séaghdha: 1443 Compound Nouns (http://www.cl.cam.ac.uk/~do242/resources.html)
  • SemEval 2007 Task 4: Classification of Semantic Relations between Nominals (http://www.apperceptual.com/semeval.html)
  • SemEval 2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals (http://semeval2.fbk.eu/semeval2.php?location=tasks#T11)
  • SemEval 2010 Task 9: Noun Compound Interpretation Using Paraphrasing Verbs (http://semeval2.fbk.eu/semeval2.php?location=tasks#T12)


References

Nastase, Vivi and Stan Szpakowicz. (2003). Exploring noun-modifier semantic relations. In Fifth International Workshop on Computational Semantics (IWCS-5), pages 285–301, Tilburg, The Netherlands.

Nastase, Vivi, Jelber Sayyad Shirabad, Marina Sokolova, and Stan Szpakowicz. (2006). Learning noun-modifier semantic relations with corpus-based and WordNet-based features. In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-06), pages 781–787, Boston, Massachusetts.

Nulty, Paul. (2007). Semantic classification of noun phrases using web counts and learning algorithms. In Proceedings of the ACL 2007 Student Research Workshop (ACL-07), pages 79–84, Prague, Czech Republic.

Turney, Peter D. (2005). Measuring semantic similarity by latent relational analysis. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI-05), pages 1136–1141, Edinburgh, Scotland.

Turney, Peter D. and Michael L. Littman. (2005). Corpus-based learning of analogies and semantic relations. Machine Learning, 60(1–3):251–278.

Turney, Peter D. (2006a). Expressing implicit semantic relations without supervision. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL-06), pages 313–320, Sydney, Australia.

Turney, Peter D. (2006b). Similarity of semantic relations. Computational Linguistics, 32(3):379–416.


See also

  • Semantic relation identification
  • Noun compound repository
  • Citations of the Diverse Noun Compound Dataset