Difference between revisions of "Resources for English"

From ACL Wiki
(38 intermediate revisions by 7 users not shown)
Most of the early additions have been moved here from the [http://www.aclweb.org/universe ACL NLP/CL Universe].
+
For languages other than English, see [[List of resources by language]].
  
 +
See also [[Multilingual resources]].
 +
 +
<!-- Please keep this list in alphabetical order -->
 +
* [[Corpora (English)|Corpora]]
 +
* [[Dictionaries (English)|Dictionaries]]
 +
* [[Generation grammars]]
 +
* [[Geographical words (English)|Geographical words]]
 +
* [[Knowledge collections and datasets (English)|Knowledge collections and datasets]]
 +
* [[Lexicons (English)|Lexicons]]
 +
* [[Subject specific resources (English)|Subject specific resources]]
 +
* [[Tools and Software for English|Tools and Software]]
 +
* [[Uncategorized resources]] - ''please help in categorizing''
 +
 +
==Other resource lists==
 +
* [[Lists of resources|Other lists of resources]]
 +
 +
==Additional information==
 +
<!-- Please keep this list in alphabetical order -->
 +
 +
* [[Anthology Statistics]]
 
* [[Bibliographies]]
 
 +
* [[Blogs]]
 
* [[Books]]
 
* [[Lists of resources]]
+
* [[Conferences]]
* [[Corpora]]
 
 
* [[Courses]]
 
* [[Dictionaries]]
 
 
* [[Journals]]
 
* [[Software]]
+
* [[Newsgroups, mailing lists|Newsgroups and mailing lists]]
 
+
* [[Papers]]
 
 
==LANGUAGE==
 
*[http://www1.cs.columbia.edu/~mdiab/software/ASVMTools_2.0.tar.gz Basic Arabic Processing Tools]
 
 
 
*[http://www.dunglish.nl/ Dunglish]
 
 
 
*[http://www.up.univ-mrs.fr/veronis/donnees/index.html French Stopword List]
 
 
 
*[http://www.ivrix.org.il/projects/spell-checker/ Hebrew Spellchecker]
 
 
 
*[http://www.up.univ-mrs.fr/tresoc/ Le Trésor de la Langue d'Oc]
 
 
 
*[http://actarus.atilf.fr/morphalou/ Lexique Morphalou]
 
 
 
*[http://online.anu.edu.au/asianstudies/ahcen/proudfoot/MCP/ Malay Concordance Project]
 
 
 
*[http://earth-info.nga.mil/gns/html/cntry_files.html Names Files of Selected Countries]
 
 
 
*[http://orleans.lti.cs.cmu.edu/Reap/ REAP Project: Reader-Specific Lexical Practice for Improved Reading Comprehension]
 
 
 
*[http://people.cs.uchicago.edu/~dinoj/icsi97syl.disc.gz Syllable-Level Conversational English Transcriptions]
 
 
 
*[http://spraakbanken.gu.se/lb/ The Bank of Swedish - A Linguistic Reference Database of Göteborg University]
 
 
 
*[http://www.sil.org/mexico/pub/vimsa.htm The Mariano Silva y Aceves Series]
 
 
 
*[http://www.unine.ch/info/clef/ UniNE stopword list for Portuguese]
 
 
 
*[http://geonames.usgs.gov/domestic/download_data.htm United States Geographic Names]
 
 
 
*[http://www.valencianlanguage.com/ Valencianlanguage.com]
 
 
 
 
 
==MAILING==
 
*[http://ling.ohio-state.edu/HPSG/Majordomo.html HPSG Mailing List]
 
 
 
*[http://www.eamt.org/mt-list.html MT List]
 
 
 
*[https://mailman.rice.edu/pipermail/funknet/2002-September/002331.html Natural Semantic Metalanguage List]
 
 
 
*[http://www.sigir.org/sigirlist/issues/ SIG-IRList Archives]
 
 
 
*[http://www.hd.uib.no/ The CORPORA list]
 
 
 
==ONLINE==
 
*[http://www.ldc.upenn.edu/exploration/survey.html A Survey of Open Language Archives]
 
 
 
*[http://www.siggen.org/resources/ ACL SIGGEN Resources Wiki]
 
 
 
*[http://www.cs.kun.nl/agfl/ AGFL Parser Generator]
 
 
 
*[http://www.let.rug.nl/~vannoord/alp/ Algorithms for Linguistic Processing]
 
 
 
*[http://www.a-i.com/ Artificial Intelligence NV (Ai)]
 
 
 
*[http://www.eprints.org/ Author/Institution Self-Archiving]
 
 
 
*[http://www.chinesecomputing.com Chinese Computing]
 
 
 
*[http://www.copernic.com/ COPERNIC 2000]
 
 
 
*[http://www.linguateca.pt/corpografo Corpógrafo]
 
 
 
*[http://www.cis.upenn.edu/~dbikel/#stat-parser Dan Bikel's Parser]
 
 
 
*[http://java.sun.com/docs/books/tutorial/i18n/text/boundaryintro.html Detecting Text Boundaries]
 
 
 
*[http://www.catchword.com/era Educational Research Abstracts]
 
 
 
*[http://emotion-research.net/wiki/Databases Emotional Databases]
 
 
 
*[http://lingo.stanford.edu/erg.html English Resource Grammar]
 
 
 
*[http://www.cjk.org/cjk/samples/chincomc.htm English-Chinese Chinese-English Dictionary of Computer Terms]
 
 
 
*[http://www.freelangonline.com/ Freelangonline - many on-line dictionaries + more]
 
 
 
*[http://dmoz.org/Computers/Software/Information_Retrieval/ Information Retrieval]
 
 
 
*[http://ir.dcs.gla.ac.uk/resources.html IR resources]
 
 
 
*[http://odin.prohosting.com/hkkim/cgi-bin/kaeps/ Korean Accented English Pronunciation Simulator]
 
 
 
*[http://www.kwicfinder.com/KWiCFinder.html KwicFinder Web Concordancer and Online Research Tool]
 
 
 
*[http://www.academiaisla.com/acadi/gen/0_en.html LANGUAGE LINKS]
 
 
 
*[http://www.link.cs.cmu.edu/lexfn/ Lexical FreeNet]
 
 
 
*[http://www-lfg.stanford.edu/lfg/ilfga LFG Database: List of Names]
 
 
 
*[http://lse.umiacs.umd.edu/ Linguist's Search Engine]
 
 
 
*[http://www.ims.uni-stuttgart.de/projekte/TIGER/ Linguistic Interpretation of a German Corpus]
 
 
 
*[http://www.link.cs.cmu.edu/ LINK GRAMMAR PARSER]
 
 
 
*[http://www.eturner.net/linkgrammar-wn/ LinkGrammar-WN project]
 
 
 
*[http://www.psy.uwa.edu.au/MRCDataBase/uwa_mrc.htm MRC Psycholinguistic Database]
 
 
 
*[http://nl.ijs.si/ME/V3/ Multext East Resources, Version 3]
 
 
 
*[http://multiwordnet.itc.it/english/home.php MultiWordNet]
 
 
 
*[http://www.comp.nus.edu.sg/~rpnlpir/ Natural Language Processing / Information Retrieval Software Repository]
 
 
 
*[http://nlsh.sourceforge.net/ NLSH: Natural Language Shell]
 
 
 
*[http://www.irisa.fr/Omphalos/ Omphalos Context-Free Language Learning Competition]
 
 
 
*[http://ysomeya.hp.infoseek.co.jp/ Online Business Letter Corpus KWIC Concordancer]
 
 
 
*[http://www.grsampson.net/RLeafAnc.html Parse Evaluation]
 
 
 
*[http://pie.usna.edu Phrases in English and the British National Corpus]
 
 
 
*[http://pygoogle.sourceforge.net/ PyGoogle: A Python Interface to the Google API]
 
 
 
*[http://www.ai.uga.edu/mc/PythonForNewbieLinguists.html Python Programming Tutorial]
 
 
 
*[http://corpus.leeds.ac.uk/internet.html Query to Internet Corpora]
 
 
 
*[http://www.lexmasterclass.com/exercises/regex/index.html Regular Expression Exercises]
 
 
 
*[http://www.sims.berkeley.edu/~hjiang1/eng_chi_resources.html Resources for English-Chinese CLIR]  
 
 
 
*[http://www.cs.technion.ac.il/~gabr/resources/resources.html Resources for Text, Speech and Language Processing]  
 
 
 
*[http://www.sfu.ca/rst/ Rhetorical Structure Theory (RST)]
 
 
 
*[http://www.philol.msu.ru/rus/galya-1 Russian Phonetics on the Web]
 
 
 
*[http://www.clres.com/SensSemRoles.html Senseval-3 Task: Automatic Labeling of Semantic Roles]
 
 
 
*[http://www.clres.com/SensWNDisamb.html Senseval-3 Task: Word-Sense Disambiguation of WordNet Glosses]
 
 
 
*[http://www.cs.unt.edu/~rada/wa/ Sentence Alignment and Word Alignment: Projects, Papers, Evaluation, Etc.]
 
 
 
*[http://ontoweb-lt.dfki.de/knowledge_index.htm SIG5 OntoWeb]
 
 
 
*[http://sara.natcorp.ox.ac.uk/lookup.html Simple Search of BNC-World]
 
 
 
*[http://www.sigsem.org/ Special Interest Group on Computational Semantics]
 
 
 
*[http://www.grsampson.net/RSue.html SUSANNE Analytic Scheme]
 
 
 
*[http://www.telemakus.net/ Telemakus: Mining and Mapping Research Findings to Promote Knowledge Discovery]
 
 
 
*[http://www.clef-campaign.org/ The Cross-Language Evaluation Forum]
 
 
 
*[http://www.robotwisdom.com/web/biography.html The Internet Timelines Project]
 
 
 
*[http://www.fb10.uni-bremen.de/anglistik/langpro/NLG-table/NLG-table-root.htm John Bateman and Michael Zock's list of Natural Language Generation Systems]
 
 
 
*[http://www.RosettaProject.org/ The Rosetta Project]
 
 
 
*[http://www.cis.upenn.edu/~xtag The XTAG Project]
 
 
 
*[http://www.TSrali.com/ TransSearch]
 
 
 
*[http://www-nlpir.nist.gov/projects/trecvid/ TREC Video Retrieval Evaluation Page]
 
 
 
*[http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html TreeTagger - a language independent part-of-speech tagger]
 
 
 
*[http://www.biomedcentral.com/info/about/datamining= Using BioMed Central's Open Access Full Text Corpus for Data Mining Research]
 
 
 
*[http://view.byu.edu/ VIEW: Variation in English Words and Phrases]
 
 
 
*[http://beta.visl.sdu.dk VISL Tagger and Parser]
 
 
 
*[http://www.webir.org/ Web IR & IE]
 
 
 
*[http://elib.cs.berkeley.edu/docfreq/ Web Term Document Frequency Form (Berkeley)]
 
 
 
*[http://www.niederlandistik.fu-berlin.de/cgi-bin/web-conc.cgi WEB-CONC]
 
 
 
*[http://www.webcorp.org.uk/ WEBCORP]
 
 
 
*[http://www.webexp.info/ WebExp2 Experimental Software]
 
 
 
*[http://www.comp.lancs.ac.uk/ucrel/bncfreq/ Word Frequencies in Written and Spoken English (Based on the British National Corpus)]
 
 
 
*[http://tcc.itc.it/research/textec/topics/disambiguation/wordnetdomains.html WordNet Domains]
 
 
 
*[http://www.ldc.upenn.edu/exploration/expl2000/papers/ Workshop on Web-Based Language Documentation and Description Papers]
 
 
 
*[http://www.gmi.org/wlms/ World Language Mapping System]
 
 
 
==PAPERS==
 
*[http://www.comp.leeds.ac.uk/eric/iwcs.ps A Domain-Independent Semantic Tagger for the Study of Meaning Associations in English Text]
 
 
 
*[http://www3.interscience.wiley.com/cgi-bin/abstract/104525215/ABSTRACT Automatic Construction of English/Chinese Parallel Corpora]
 
 
 
*[ftp://ftp.icsi.berkeley.edu/pub/techreports/ Berkeley - Technical Reports]
 
 
 
*[http://sslmit.unibo.it/~baroni/publications/lrec2004/bootcat_lrec_2004.pdf BootCaT: Bootstrapping Corpora and Terms from the Web]
 
 
 
*[http://acl.ldc.upenn.edu/I/I05/I05-2015.pdf Building an Annotated Japanese-Chinese Parallel Corpus]
 
 
 
*[http://www.ucl.ac.uk/english-usage/diachronic/index.htm DCPSE: Creating a Parsed and Searchable Diachronic Corpus of Present-Day Spoken English]
 
 
 
*[http://www.ims.uni-stuttgart.de/info/EPapers.html Electronically available papers (list at Univ. of Stuttgart)]
 
 
 
*[http://www.cis.upenn.edu/~cliff-group/94/reports.html Linc Lab (U. Penn) technical reports (not on-line)]
 
 
 
*[http://www.let.rug.nl/~tanja/ Linguistic Knowledge and Word Sense Disambiguation]
 
 
 
*[http://ir.shef.ac.uk/cloughie/papers.html Measuring Text Reuse]
 
 
 
*[http://www.linguistics.rub.de/~kiss/publications/publications.html#boundaries Paper on Sentence Boundary Disambiguation]
 
 
 
*[http://www.dfki.de/lt/papers/cl-abstracts.html Papers of the DFKI CL Department]
 
 
 
*[http://www1.cs.columbia.edu/nlp/theses.html PhD Theses (Columbia Natural Language Processing Group)]
 
 
 
*[http://www.itri.brighton.ac.uk/ucnlg/Proceedings/index.html Proceedings of the Corpus Linguistics 2005 Workshop on Using Corpora for Natural Language Generation]
 
 
 
==SOFTWARE==
 
 
 
 
 
 
 
 
 
 
 
==TOOLS==
 
*[http://sslmit.unibo.it/~baroni/bootcat.html BootCaT Toolkit: Simple Utilities to Bootstrap Corpora and Terms from the Web]
 
 
 
*[http://chasen.aist-nara.ac.jp/hiki/ChaSen/ ChaSen]
 
 
 
*[http://nlp.cs.jhu.edu/~gsm/pd_demo Dendrogram Demo]
 
 
 
*[http://www.olst.umontreal.ca/dicoeng.html DiCo Lexical Database OLST]
 
 
 
*[http://www.wolfson.ox.ac.uk/~peet/eatshow.htm Edinburgh Associative Thesaurus]
 
 
 
*[http://www.smi.ucd.ie/hyppia/ HYPPIA]
 
 
 
*[http://xlex.uni-muenster.de/ Münster Tagging Project]
 
 
 
*[http://nltk.sourceforge.net Natural Language Toolkit (NLTK)]
 
 
 
*[http://perso.wanadoo.fr/rosavram/ NooJ]
 
 
 
*[http://www.informatics.susx.ac.uk/research/nlp/rasp/ Robust Accurate Statistical Parsing (RASP)]
 
 
 
*[http://www.searchtools.com/ Search Tools for Web Sites and Intranets]
 
 
 
*[http://www.lsi.upc.edu/~surdeanu/swirl.html SwiRL Semantic Role Labeler]
 
 
 
*[http://view.byu.edu VIEW (Variation in English Words and Phrases)]
 
 
 
==UNCATEGORIZED==
 
*[http://www40.brinkster.com/dictionarium/index.html dictionarium]
 
 
 
*[http://www.mat.upm.es/~aries ARIES Natural Language Tools]
 
 
 
*[http://www.york.ac.uk/services/library/subjects/langint.htm Language and Linguistic Science information sources]
 
 
 
*[http://www.de.elra.research.ec.org/ The RELATOR language resources server]
 
  
*[ftp://parcftp.xerox.com/pub/ Xerox PARC FTP site.]
+
[[Category:Resources by language|English]]

Revision as of 12:04, 7 September 2012