abstract:In machine learning and information retrieval, the cluster hypothesis is an assumption about the nature of the data handled in those fields, which takes various forms. In information retrieval, it states that documents that are clustered together "behave similarly with respect to relevance to information needs".