lex-ugnormal-cb100: Words from Unitex + Regra with frequency > 100 in Corpus Brasileiro. This is the main lexicon of UGCNormal.
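
The frequency cutoff used for lex-ugnormal-cb100 can be illustrated with a minimal sketch. It assumes the vocabulary (Unitex + Regra) and the corpus (Corpus Brasileiro) are already loaded in memory as a set of words and a sequence of tokens. The function name, its parameters, and the exact-match counting are hypothetical and are not taken from the original build scripts.

```python
from collections import Counter


def build_frequency_lexicon(vocabulary, corpus_tokens, min_count=100):
    """Return the words in `vocabulary` that occur MORE than `min_count`
    times in `corpus_tokens` (strict inequality, matching "frequency > 100").

    Hypothetical helper: the real lexicon may have been built differently,
    e.g. with case folding or other normalization that is not applied here.
    """
    counts = Counter(corpus_tokens)  # token -> number of occurrences
    return {word for word in vocabulary if counts[word] > min_count}


# Toy data only; these are not real corpus counts.
vocabulary = {"casa", "que", "xyz"}
corpus_tokens = ["que"] * 101 + ["casa"] * 100 + ["vc"] * 500
lexicon = build_frequency_lexicon(vocabulary, corpus_tokens)
```

In this toy run only "que" is kept: "casa" occurs exactly 100 times and so fails the strict threshold, and "vc" is frequent but is not in the vocabulary.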

freq-cgu: Words from Unitex (full) + Regra (full) and their frequencies across all UGC samples: Twitter (full), UOL (full), Buscapé (full), and Mercado Livre (full). Merged with UGCNormal's lexicon.

unitex-full-clean+enelvo-ja-corrigido: unitex-full plus the words annotated as already correct in TUB-UGC.
