Document & Collection Analysis Engine
Assemble document collection
- select files by .html or .txt filename extension
- record the full path and filename of each document
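
The collection step above might be sketched as follows. This is a minimal illustration, not the engine's actual code: the function name assemble_collection, the recursive directory walk, and the case-insensitive extension match are all assumptions that the outline does not specify.

```python
import os

# Extensions named in the outline; matching them case-insensitively is an assumption.
EXTENSIONS = (".html", ".txt")

def assemble_collection(root):
    """Return a sorted list of absolute paths for matching files under root."""
    collection = []
    # Assumption: subdirectories are searched as well as root itself.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(EXTENSIONS):
                # Store the full path plus filename, not just the bare filename.
                collection.append(os.path.abspath(os.path.join(dirpath, name)))
    return sorted(collection)
```

Keeping absolute paths means two files with the same name in different directories remain distinct entries in the collection.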
Preliminary document analysis
- parse text into its constituent words
- compute the total number of occurrences of each word in the document (DOCFREQ)
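
A rough sketch of the two analysis steps above, under stated assumptions: a "word" is taken to be a run of letters, digits, or apostrophes, words are lowercased before counting, and HTML tags are stripped with a crude regular expression. None of these choices comes from the outline, and the names parse_words and doc_freq are hypothetical.

```python
import re
from collections import Counter

# Assumption: a word is a run of ASCII letters, digits, or apostrophes.
WORD_RE = re.compile(r"[A-Za-z0-9']+")
# Assumption: tags are removed with a simple pattern, not a real HTML parser.
TAG_RE = re.compile(r"<[^>]+>")

def parse_words(text, is_html=False):
    """Split text into lowercased words, optionally dropping HTML tags first."""
    if is_html:
        text = TAG_RE.sub(" ", text)
    return [word.lower() for word in WORD_RE.findall(text)]

def doc_freq(words):
    """Map each word to its total number of occurrences in one document (DOCFREQ)."""
    return Counter(words)

words = parse_words("The cat saw the <b>other</b> cat.", is_html=True)
freq = doc_freq(words)
print(freq["the"], freq["cat"], freq["other"])  # prints: 2 2 1
```

A production version would likely want a proper HTML parser and a tokenizer that handles non-ASCII text; the regular expressions here only keep the sketch short.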