Resources for English
A
- 11-761 Language and Statistics, course at CMU, Spring 1997
- 1963 Time Magazine corpus
- 2000 NIST Speaker Recognition Evaluation Corpus
- 3rd NASSLLI: North American Summer School in Logic, Language and Information
- 3rd North American Summer School in Logic, Language and Information
- A Survey of Open Language Archives
- A Syntactically Annotated Corpus of German Newspaper Texts
- ACM Transactions on Information Systems
- AFGL Parser Generator
- AL FRESCO Interactive System (at the DFKI NLP archive)
- ALE -- Attribute Logic Engine (at the DFKI NLP archive)
- ALEP (at the DFKI NLP archive)
- AMALGAM project
- AMERICAN NATIONAL CORPUS FIRST RELEASE
- ARIES Natural Language Tools
- AV parser (at the DFKI NLP archive)
- Addison Wesley Longman higher education
- Agglutination on the Basis of Corpus Information
- Alignment of bilingual corpora performed with EasyAlign
- Alpino Treebank
- Alternative dictionaries
- Alvey Natural Language Tools (at the DFKI NLP archive)
- American English SpeechDat-Car
- An Empirical Grammar of the English Verb System
- Annotated list of resources on statistical NLP and corpus-based CL
- Arabic Newswire Part 1
- Aramedia
- Arbora Tree Delivery Service
- Architectures and Mechanisms for Language Processing
- Artificial Intelligence NV (Ai)
- Author/Institution Self-Archiving
- Automatic English Sentence Segmentation
- Automatic Term Extraction System
B
- BIM LOQUI (at the DFKI NLP archive)
- BNC Indexer
- BNC Online Service
- BNCweb a web-based interface to the British National Corpus
- BRITISH NATIONAL CORPUS - WORLD EDITION
- Bancos de dados e Ferramentas de análise
- Base form reduction and search form production (at the DFKI NLP archive)
- Bayes Net Toolbox for Matlab
- Bayesian Network tools in Java (BNJ)
- Bibliographic Search Page, Univ. of Essex
- Bibliography for Phonetics/Speech Technology
- Bibliography to the book "Artificial Intelligence: A Modern Approach" by Russell and Norvig
- Bigram Statistics Package
- Bilingual Dictionary French Arabic
- Bilingual Speech: A Typology of Code-Mixing
- Bookmarks for Corpus-based Linguists
- Brain and Language
- Brainhat Natural Language Processing
- Brill Tagger (Supervised, Trainable)
- British English Example Pronunciations (BEEP) (at the DFKI NLP archive)
C
- CAT2 (at the DFKI NLP archive)
- CELEX - The Dutch Center for Lexical Information
- CEPRIL - Portuguese Segmenter
- CEPRIL aligner
- CFG parser (at the DFKI NLP archive)
- CHARON (at the DFKI NLP archive)
- CLaRK System
- CMU Sphinx Group: Open Source Speech Recognition Engines
- COGNATE (at the DFKI NLP archive)
- COMPARA corpus
- COMPULEXIS (at the DFKI NLP archive)
- COMPUTER SPEECH AND LANGUAGE
- COPERNIC 2000
- CORPUS DEL ESPANOL
- COSMAS II
- CPAN Lingua EN Sentence Splitter
- CPAN Lingua HE Sentence Splitter
- CPAN Suffix Tree Module
- CREA
- CS674: Natural Language Processing (Cornell U., Spring 2000)
- CSLI LinGO Lab (Stanford)
- CSPAN Sentence Splitter
- CUF (at the DFKI NLP archive)
- Cambridge Learner Dictionary
- Canoo.net - German Dictionaries and Grammars
- Cascadilla Press
- Centre for Disease Control - Chinese, French, Japanese, Spanish info on SARS
- Chilibot: NLP based miner for gene/protein/keyword relationships
- Code from James Allen's "Natural Language Understanding" (code at CMU)
- Code from Michael Covington's "NLP for Prolog Programmers" (code at CMU)
- Cognition, a journal from Elsevier Science
- Collections of texts and corpora
- Collective (Chaotic - Emergent) Language
- Comlex Syntax (Syntactic Dictionary of English)
- Common Lisp Hypermedia Server
- Comprehensive Perl Archive Network
- Computational Linguistics, James Pustejovsky, Brandeis University
- Computational Linguistics
- Computer Speech and Language
- Computer Speech, Text and Internet Technology
- Concollate
- Context Feature Structure System (at the DFKI NLP archive)
- Corpus building for minority languages
- Corpus de referencia de la lengua Espanola contemporanea: corpus oral peninsular
- Corpus del Espanol
- Corpus of Spoken Professional English
- Corpus of spoken Bulgarian
- Course in Corpus Linguistics, Tony McEnery & Andrew Wilson
- Cranfield collection
- Czech National Corpus
D
- DCG workbench (at the DFKI NLP archive)
- DECtalk (at the DFKI NLP archive)
- DISCO chart parser (at the DFKI NLP archive)
- DITO -- DIagnostic TOol for German syntax (at the DFKI NLP archive)
- DTREG decision tree generator
- Dan Bikel's Parser
- Danish news corpus
- Data Harmony, Document Management Software
- Debian free software community
- Delphes Technologies International, natural language processing.
- Delphes Technologies International
- Demos of dependency database, parser, and other tools
- Demos, University of Alberta, Canada
- Dialogue Diversity Corpus
- Dictionaries for International Ispell
- Dictionary Maintenance Programs (at the DFKI NLP archive)
- Dictionary Maintenance Utilities (at the DFKI NLP archive)
- Dictionary site, Bucknell University
E
- ECHO - Eurodicautom (multilingual technical dictionary)
- EGG -- editor for GPS-grammars (at the DFKI NLP archive)
- ELSNET: Paper and Electronic Publications
- ELU (at the DFKI NLP archive)
- EMILLE corpus
- ENGCG (at the DFKI NLP archive)
- ESTEAM (ESPRIT 316) (at the DFKI NLP archive)
- ETAI - Electronic Transactions on Artificial Intelligence
- ETL parser (at the DFKI NLP archive)
- EVAR, ERNEST (at the DFKI NLP archive)
- Educational Research Abstracts
- Electronic Text Center -- University of Virginia
- Embedded MT Systems: Leveraging for Real World Applications
- English Intonation in the British Isles - The IViE Corpus
- English Resource Grammar
- English stop words (from SMART)
- English to Estonian
- English-Chinese Chinese-English Dictionary of Computer Terms
- English-Truespel (USA Accent) Text Conversion Tool
- Envisioning Machine Translation in the Information Future: 4th Conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 10-14, 2000, Proceedings
- Ergane
- EuroWordNet
- Evolutionary Web Development
- Experimental Corpus Query System (University of Stuttgart, Germany)
- Experimental machine translation system (at the DFKI NLP archive)
- Exploring Words and Phrases from the British National Corpus
F
- FJGH--grammar (at the DFKI NLP archive)
- FLEMMV31 - Inflectional morphology parser for French
- FUF and SURGE (at the DFKI NLP archive)
- Finite State Automata Utilities v6
- Finnish text bank
- Foundations of Computational Linguistics by Roland Hausser
- FreeLing 1.1
- Freelangonline - many on-line dictionaries + more
- French Foreign Ministry's magazine
G
- GENIA Project Home Page
- GENIA corpus version 3.0p
- GILENA: A Natural-Language Interfaces Generator (at the DFKI NLP archive)
- GLOTTO (at the DFKI NLP archive)
- GPSG parser (at the DFKI NLP archive)
- GPSG tools (at the DFKI NLP archive)
- GPSG--tools (at the DFKI NLP archive)
- GRAMTSY (at the DFKI NLP archive)
- GTU -- Grammatik-Test-Umgebung (at the DFKI NLP archive)
- GULP -- Graph Unification Logic Programming (at the DFKI NLP archive)
- German Morphology Browser
- Geta-run (at the DFKI NLP archive)
- GlossaNet
- Grammar Workbench (at the DFKI NLP archive)
- Grammar Writer's Workbench for Lexical Functional Grammar
H
- HAITIAN CREOLE ELECTRONIC TEXTS
- HANDBOOK OF AUSTRALIAN LANGUAGES
- HCRC Map Task Corpus XML annotations
- HPSG Mailing List
- Haitian Creole corpus - Teknoloji pou lang kreyol
- Hansard French-English parallel corpus
- Hansards Corpus - Searchable
- Hdrug (at the DFKI NLP archive)
- Hebrew Morphological Parser
- Hidden Markov Model Toolkit
- How to Use IT (MS-DOS) (at the DFKI NLP archive)
- How to Use IT (Mac) (at the DFKI NLP archive)
- Hyphenation and Spell-checking (at the DFKI NLP archive)
I
- IBM's Speech Recognition Modules
- ICE corpora
- ICOPOST
- IJECE
- ILA multilingual toolkit (at the DFKI NLP archive)
- IMS Corpus Toolbox, Univ. of Stuttgart
- IMS Corpus Workbench (CWB)
- IR and IE on the web
- IR list
- IR resources
- ISI rewrite decoder
- ISI's version of the RSTTool
- Infogistics - NLProcessor Interactive Demo
- Infogistics: NLProcessor Interactive Demo
- Information Extraction Towards Scalable, Adaptable Systems
- Information Processing and Management
- Information Retrieval (Journal)
- Information Retrieval: Data Structures and Algorithms
- Instructor's Manual for Syntactic Theory: A Formal Introduction
- InterBASE (at the DFKI NLP archive)
- Internet Grammar of English
- IntraText - The missing link between text and hypertext (TM)
J
- JAPE (Joke Analysis and Production Engine) (at the DFKI NLP archive)
- JIM III (at the DFKI NLP archive)
- JPSG parser and CU-prolog (at the DFKI NLP archive)
- James Allen - Natural Language Understanding (source code)
- JavaBayes - v0.346
- JavaBayes - version 0.346
- Journal of Intelligent Information Systems
- Journal of Memory and Language
- Journal of Natural Language Engineering
- Journal of Phonetics
- Journal of the American Society for Information Science
K
- KOMET (at the DFKI NLP archive)
- KOREKTOR 2.0 (at the DFKI NLP archive)
- KPML
- KRIS -- Knowledge Representation and Inference System (at the DFKI NLP archive)
- KURA 1.0
- KWiCFinder
- Kiel University's Institute on Phonetics and Speech Processing
- Kirrkirr 4.0 Dictionary Program
- Kluwer Academic Publishers
- Kluwer series in Text, Speech and Language Technology
- Knowledge Representation for Natural Language Processing in Implemented Systems
- Korean morphological analyzer and part-of-speech tagger
- Kwicfinder
L
- LANGUAGE LEARNING CENTER - ACADEMIC CORPUS
- LANGUAGE LINKS
- LEXICOGRAPHY AND THE OED: Pioneers in the Untrodden Forest
- LFG Database: List of Names
- LFG parser for Turkish (at the DFKI NLP archive)
- LIBSVM -- A Library for Support Vector Machines
- LIBSVM: A Library for Support Vector Machines
- LINGUIST (at the DFKI NLP archive)
- LINK GRAMMAR PARSER
- Lacio Web Corpora
- Language Identification Tools
- Language, Journal of the Linguistic Society of America
- Latin Home Page
- Le corpus BAF (French and English)
- Learner Behaviour on the Internet
- Lecture Notes in Computer Science Vol. 1835
- Lemur Toolkit Download
- Lexical FreeNet
- Lexical information for German
- Leximancer
- LingPipe
- Linguist's Search Engine
- Linguistic Data Consortium, University of Pennsylvania
- Linguistic DataBase (at the DFKI NLP archive)
- Linguistic Interpretation of a German Corpus
- Linguistic Kernel Processor (LKP) (at the DFKI NLP archive)
- Linguistics resources list at Princeton University
- LinkGrammar-WN project
- Links to Linguistic and Related Information (University of Passau)
- List of English stopwords
- List of on-line dictionaries (from lai.com)
- List of resources (at University of Stuttgart)
- List of resources (at University of Toronto)
- List of stop words
- Log-likelihood calculator
- Logos Translation Software and LogosClient (at the DFKI NLP archive)
- Longdo Thai Dictionary
M
- MCHART (at the DFKI NLP archive)
- MODALYS (at the DFKI NLP archive)
- MONKEY (at the DFKI NLP archive)
- MORLEX - A lexical database for French
- MT List
- Machine Learning Journal Special Issue on Natural Language Learning
- Machine Translation
- Magic (at the DFKI NLP archive)
- Managing Gigabytes, by Witten, Moffat, and Bell
- Mapping WordNet Versions 1.6 and 2.0
- Medlars collection
- Mike Scott's Web - Wordsmith Tools
- Miscellaneous Word Lists from Oxford University
- Moby Database
- Module for splitting text into sentences
- Moss - A System for Detecting Software Plagiarism
- Moss: A System for Detecting Software Plagiarism
- MtRecode - Character conversion program
- MtScript - The Multext multi-lingual text editor
- MtStr - Multilingual string library
- Multext-East Project
- Multi-Paradigm Programming in Oz for NLP
- Multilingual PC software
- Multilingual Text Tools and Corpora
- Multiword Expression Resources
N
- NAACL-Supported Two-Week Summer School in Human Language Technologies
- NAUDA generation component (at the DFKI NLP archive)
- NEGRA Corpus
- NL Builder 5.0 (TM) (at the DFKI NLP archive)
- NLL (at the DFKI NLP archive)
- NLP: User Modeling 2001
- NLSH: Natural Language Shell
- NLTK - Natural Language Toolkit
- NMSU Natural Language Processing Tools
- NUGGET (R) (at the DFKI NLP archive)
- Name lists from US census
- Natlanco
- Natural Language (TM) (at the DFKI NLP archive)
- Natural Language Computing: An English Generative Grammar in Prolog
- Natural Language Engineering
- Natural Language Processing / Information Retrieval Software Repository
- Natural Language Processing Systems
- Natural Language Processing for French
- Natural Language Processing for Online Applications
- NeXTeNS - Dutch Extension for Text to Speech
- Newspapers on the Internet
- Nexing Corpus
O
- ONLINE LINGUISTICS JOURNAL
- OPUS - an open source parallel corpus
- ORFO (at the DFKI NLP archive)
- Omphalos Context-Free Language Learning Competition
- On-line books at CMU
- OpenRCT Home
- Opus, a commercial biology text mining system
- Oxford Text Archive Corpus of Italian Newspapers
P
- P--TRA (at the DFKI NLP archive)
- PAKTUS (at the DFKI NLP archive)
- PAPPI (at the DFKI NLP archive)
- PARLANCE / Learner (at the DFKI NLP archive)
- PAULA (at the DFKI NLP archive)
- PC--KIMMO definition files for Turkish morphology (at the DFKI NLP archive)
- PC-Translator (at the DFKI NLP archive)
- PENMAN (at the DFKI NLP archive)
- PLAIN+ (at the DFKI NLP archive)
- PLEUK (at the DFKI NLP archive)
- POLYSEMY: Theoretical and Computational Approaches
- POPEL (at the DFKI NLP archive)
- PROFGLOT (at the DFKI NLP archive)
- Pangloss (at the DFKI NLP archive)
- Paper on Sentence Boundary Disambiguation
- Parser (at the DFKI NLP archive)
- Perl interface to WordNet
- Perl: extending your pos-tagger using regular expressions, Dan Jurafsky
- Phono (at the DFKI NLP archive)
- Phono - Sound Change Model Software
- Phrases in English and the British National Corpus
- Phrases in English
- PlayMoBild (at the DFKI NLP archive)
- Polish subcorpus of the International Corpus of Learner English
- Probability Theory: The Logic Of Science (by E. T. Jaynes, Washington University, Saint Louis)
- Project: Pytalk
- Prosogram
- Protege Project
- Public registry of the Council of the EU
- Publicly available POS tagger
- Pulavan (at the DFKI NLP archive)
- Pundit (at the DFKI NLP archive)
- Python Programming Tutorial
Q
- QDATR (at the DFKI NLP archive)
- QPATR (at the DFKI NLP archive)
- Qanda: Open source question answering system
R
- Réacc - reaccenting software
- README for the daemonized version of Collins' Parser
- REAL WORLD LINGUISTICS 101
- RHET (at the DFKI NLP archive)
- ROBUSTNESS IN LANGUAGE AND SPEECH TECHNOLOGY
- RST LaTeX (Reitter IT and Media)
- Ramon Piero Center for Research
- Release of RSTTool: RSTTool 2.7
- Research-lab.com
- Resource for high-quality tools supporting multi-lingual communication
- Resource for professional-quality language translation tools.
- Resources for Text, Speech and Language Processing
- Restricted English Corpus from Dr. Caroline Lyon for PhD
- Reuters Corpus
- Robot Karaoke
- Rule Engine for the Java Platform
- Russell and Norvig - AI
- Russian Corpora
- Russian Corpus Page
- Russian Corpus Site
- Russian Newspaper Corpus
- Russian Phonetics on the Web
- Russicon Resources
S
- SATZ--Adaptive Sentence Boundary Detector
- SCISOR / NLToolset (at the DFKI NLP archive)
- SEMBLEX (at the DFKI NLP archive)
- SIG-IRList Archives
- SLG (at the DFKI NLP archive)
- SNOOP (at the DFKI NLP archive)
- SNePS (at the DFKI NLP archive)
- SOFTISSIMO
- STAMP (at the DFKI NLP archive)
- STEMMA (at the DFKI NLP archive)
- SUNDIAL (at the DFKI NLP archive)
- Saarland University, Computational Linguistics
- Sanskrit Library
- Search Tools for Web Sites and Intranets
- Senseval-3 Task: Automatic Labeling of Semantic Roles
- Senseval-3 Task: Word-Sense Disambiguation of WordNet Glosses
- Sequence learning: Paradigms, Algorithms and Applications
- Short intensive course: Texts, Discourse and Corpora: Corpora in Linguistics and Related Fields
- Slovene-English Parallel Corpus
- Software - The chunklink script, by Sabine Buchholz
- Software Tools for NLP
- Software for the Extraction of N-ary Textual Associations (SENTA)
- Special Interest Group on Computational Semantics
- Speech and Language Processing, by Daniel Jurafsky and James Martin
- Speech in Noisy Environments 1 (SPINE1 CODED) Coded Audio
- Speech in Noisy Environments 2 (SPINE2 CODED) Coded Audio
- St. Jerome Publishing
- Statistical Natural Language Processing: Models and Methods
- Studies in Language and Linguistics
- Survey of Electronic Corpora (by Jane A. Edwards, file at CMU)
- Survey of English Usage, University College, London
- Survey of the State of the Art of Human Language Technology
- Susanne: Annotated American English Corpus
- Swedish to English
- Swedish to Finnish
- Syntactic Theory: A Formal Introduction by Ivan Sag and Thomas Wasow
- Syntactic dependency parser for English
- Syntactica (at the DFKI NLP archive)
- System for evaluation of anaphoric relations (at the DFKI NLP archive)
T
- TAG--GEN (at the DFKI NLP archive)
- TECHDOC (at the DFKI NLP archive)
- TELRI Research Archive of Computational Tools and Resources
- TEMPOS (at the DFKI NLP archive)
- TFS (Typed Feature Structure) system (at the DFKI NLP archive)
- TIGERSearch - tools for linguistic text exploration
- TREC Video Retrieval Evaluation Page
- TUG (at the DFKI NLP archive)
- Tamil Part-of-speech tagger (at the DFKI NLP archive)
- Telcordia Latent Semantic Indexing Demo Machine
- Term Rewrite System for non-confluent TRS's (at the DFKI NLP archive)
- Terminology for more than 15 languages
- Texas A&M University Linguistics Course Listings
- Text Encoding Initiative -- Tools
- Text, Speech and Dialogue: Third International Workshop, TSD 2000, Brno, Czech Republic, September 13-16, 2000, Proceedings
- The ALPAC Report
- The BNC Index (for the BNCWorld Edition)
- The Bank of Swedish - A Linguistic Reference Database of Göteborg University
- The Brooklyn-Geneva-Amsterdam-Helsinki Parsed Corpus of Old English
- The CLAWS tagging service
- The Childes Corpus - Children's language
- The Cross-Language Evaluation Forum
- The Dialogue Diversity Corpus
- The HALogen Natural Language Generation system
- The IMS Corpus Toolbox Webpage
- The Internet Timelines Project
- The Java Open Source Spell Checker
- John Bateman and Michael Zock's list of Natural Language Generation Systems
- The LUCY Corpus - Documentation
- The Language of Word Meaning
- The Lexical Semantics of a Machine Translation Interlingua
- The Mariano Silva y Aceves Series
- The Moby Corpus
- The NLP Dictionary
- The Naming Company
- The Negra Corpus - German Syntax annotated
- The Ninth Text REtrieval Conference (TREC 9) Conference Proceedings
- The Omicron Inforium
- The Oslo Corpus of Bosnian Texts
- The PLUG Word Aligner - PWA
- The RELATOR language resources server
- The Rosetta Project
- The XTAG Project
- Tools developed at Columbia University (FUF, Surge, Crep, Segmenter, Verber, Xtract)
- Torch3 (www.torch.ch)
- TransSearch
- TreeTagger - a language independent part-of-speech tagger
- TreeTalk: Memory-Based Grapheme-Phoneme Conversion Demo
- Treebank tokenization scheme
- Turbo Lingo
- Type description language (at the DFKI NLP archive)
U
- UBS (at the DFKI NLP archive)
- UBS -- UnifikationsBasierte Sprache (at the DFKI NLP archive)
- UCREL Semantic Analysis System
- UN declaration of human rights in multiple languages
- UNITEX
- UniNE stopword list for Portuguese
- Universal Grammar in Prolog
- Useful links about parallel corpora, by Olivier Kraif
V
- VERTEX - A chart parser for unification grammars (French)
- VISL - Visual Interactive Syntax Learning
- VISL Tagger and Parser
- Valencianlanguage.com
- Verbot preview 4.0
- Versioning Machine 2.0
- Virtual Language Centre's Web Concordancer
- Visual Text - reference documentation
- VisualText