We present ConsentCanvas, a system that structures and “texturizes” End-User License Agreement (EULA) documents to make them more readable. The system aims to help users better understand the terms to which they are giving their informed consent. ConsentCanvas takes unstructured text documents as input and uses unsupervised natural language processing methods to embellish the source document by means of a linked stylesheet. Unlike similar usable-security projects that employ summarization techniques, our system preserves the contents of the source document, minimizing the cognitive and legal burden for both the end user and the licensor. Our system does not require a training corpus.
ConsentCanvas demonstrates a novel application of several natural language processing methods used together, and has a working implementation. The system is an extensible framework for the visual embellishment of plaintext documents, and includes a Python implementation of a non-trivial algorithm for identifying meaningful variable-length phrases.
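To make the idea of embellishing a document without altering its contents concrete, the following is a minimal illustrative sketch and not the ConsentCanvas implementation. It assumes a simple stand-in heuristic (repeated word n-grams that neither begin nor end with a stopword) in place of the system's phrase-identification algorithm. The names `find_phrases`, `embellish`, the `key-phrase` class, and the `consent.css` stylesheet are all hypothetical.

```python
import html
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the system).
STOPWORDS = frozenset({
    "the", "a", "an", "of", "to", "and", "or", "in", "for", "is", "be",
    "this", "that", "you", "your", "we", "our", "any", "by", "with",
    "on", "as", "at", "may", "not",
})


def find_phrases(text, max_len=4, min_count=2):
    """Return repeated n-grams of 2..max_len words as candidate phrases.

    Stand-in heuristic: an n-gram qualifies if it occurs at least
    min_count times and neither starts nor ends with a stopword.
    No training corpus is needed; counts come from the document itself.
    """
    words = re.findall(r"[A-Za-z]+", text.lower())
    counts = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    return {phrase for phrase, c in counts.items() if c >= min_count}


def embellish(text, phrases, css_href="consent.css"):
    """Wrap each phrase occurrence in a styled span; leave all text intact."""
    pieces, last = [], 0
    if phrases:
        # Longest phrases first so they win over their own sub-phrases.
        alternation = "|".join(
            r"\s+".join(re.escape(w) for w in p.split())
            for p in sorted(phrases, key=len, reverse=True)
        )
        pattern = re.compile(r"\b(?:%s)\b" % alternation, re.IGNORECASE)
        for m in pattern.finditer(text):
            pieces.append(html.escape(text[last:m.start()], quote=False))
            pieces.append('<span class="key-phrase">%s</span>'
                          % html.escape(m.group(0), quote=False))
            last = m.end()
    pieces.append(html.escape(text[last:], quote=False))
    return ('<html><head><link rel="stylesheet" href="%s"></head>'
            '<body><pre>%s</pre></body></html>'
            % (html.escape(css_href), "".join(pieces)))


sample = ("The Software is licensed, not sold. Personal data may be shared "
          "with third parties. Third parties may use personal data.")
page = embellish(sample, find_phrases(sample))
```

Because the markup only wraps existing text in `span` elements, removing the tags from `page` recovers the original wording. All visual treatment is left to the linked stylesheet.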