the number of equally likely words that can occur at an arbitrary position in a document. A lower perplexity therefore indicates better generalisation. We calculate perplexity on the test corpus
C∗ containing M∗ documents as follows:
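The formula itself does not appear in this excerpt. As a hedged illustration only, the sketch below assumes the held-out perplexity commonly used for topic models: perplexity(C∗) = exp(−Σ_d log p(w_d) / Σ_d N_d), where the sums run over the M∗ test documents, p(w_d) is the probability the model assigns to the words of document d, and N_d is the number of words in document d. The symbols p(w_d) and N_d, and the function and variable names, are assumptions made for this example and are not taken from the source.

```python
import math


def perplexity(doc_log_likelihoods, doc_lengths):
    """Held-out perplexity under the assumed definition:
    exp(-sum_d log p(w_d) / sum_d N_d).

    doc_log_likelihoods: natural-log probability of each test document.
    doc_lengths: number of words N_d in each test document.
    Both names are hypothetical, chosen for this sketch.
    """
    if len(doc_log_likelihoods) != len(doc_lengths):
        raise ValueError("need one log-likelihood per document")
    total_words = sum(doc_lengths)
    if total_words <= 0:
        raise ValueError("test corpus must contain at least one word")
    return math.exp(-sum(doc_log_likelihoods) / total_words)


# Sanity check: a model that treats all 1000 vocabulary words as equally
# likely assigns log(1/1000) to every word, so each document of N_d words
# has log-likelihood N_d * log(1/1000).
vocab_size = 1000
lengths = [50, 120, 80]
log_liks = [n * math.log(1.0 / vocab_size) for n in lengths]
uniform_perplexity = perplexity(log_liks, lengths)
```

Under these assumptions the uniform model's perplexity comes out at approximately the vocabulary size, which matches the reading of perplexity as a count of equally likely words. A model that concentrates probability on the words that actually occur would score lower.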
A. De Waal, E. Barnard, Evaluating topic models with stability, 19th Annu. Symp. Pattern Recognit. Assoc. South Africa. (2008) 79–84. http://researchspace.csir.co.za/dspace/handle/10204/3016.