To minimise the risk of including random concatenations of words, rare spellings or mistakes, any word or expression had to appear in the corpus at least 40 times to merit inclusion in the final, chronologically ordered set.
ECONOMIST: Science invades the humanities
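
The inclusion rule quoted above can be sketched as a simple frequency filter. This is only an illustration, not the researchers' actual pipeline: the input format (one `(year, term)` pair per occurrence), the function name `frequent_terms`, and the choice to order terms by the year they first appear are all assumptions; only the threshold of 40 comes from the text.

```python
from collections import Counter

MIN_COUNT = 40  # threshold stated in the text


def frequent_terms(dated_terms, min_count=MIN_COUNT):
    """Keep terms seen at least `min_count` times, ordered chronologically.

    `dated_terms` is an iterable of (year, term) pairs, one pair per
    occurrence in the corpus (a hypothetical input format).
    Ordering by first year of appearance is an assumption.
    """
    counts = Counter()
    first_seen = {}
    for year, term in dated_terms:
        counts[term] += 1
        # Track the earliest year in which each term occurs.
        if term not in first_seen or year < first_seen[term]:
            first_seen[term] = year
    # Drop rare terms (likely misspellings or chance word combinations).
    kept = [(first_seen[term], term)
            for term, n in counts.items() if n >= min_count]
    return sorted(kept)
```

For example, a term that occurs 45 times is kept, while a misspelling that occurs only 3 times is dropped, however early it first appears.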