SUMTIME-METEO is a parallel corpus of naturally occurring weather forecast texts and the numerical data on which they are based. The corpus contains 1045 parallel data-text units and is available both as a Microsoft Access database and as CSV (comma-separated values) text files exported from that database. The download zip file also includes documentation (PDF). The SUMTIME-METEO corpus is sometimes referred to simply as the SUMTIME corpus.

The textual portion of the corpus consists of human-written marine weather forecasts intended for offshore oil rigs in the North Sea; these are actual forecasts written by professional forecasters for real clients. The data portion of the corpus consists of the numerical weather predictions (of wind speed, temperature, precipitation, etc.) that the human forecasters examined when they wrote the forecasts. The forecasts were written between 26 June 2000 and 10 May 2002.

