Abstract:
In this paper we propose including collocations in the pre-processing stage of text mining, which we use for fast news article recommendation, and we report experiments based on real data from the biggest Slovak newspaper. The section of a news article can be predicted from several of the article’s characteristics, such as its title, content, and keywords. Our experiments compare several approaches and algorithms, including an expressive vector representation, and take into account the most popular word collocations obtained from the Slovak National Corpus.
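
As an illustration of how collocations might enter pre-processing, the following minimal sketch merges known two-word collocations into single tokens before building a bag-of-words vector. It is not the method used in the paper: the collocation list, the underscore joining convention, the whitespace tokenizer, and the names `COLLOCATIONS`, `merge_collocations`, and `bag_of_words` are all assumptions made for this example, and English words stand in for Slovak ones.

```python
from collections import Counter

# Hypothetical collocation list, used only for illustration.
# In the paper, collocations are obtained from the Slovak National Corpus.
COLLOCATIONS = {("prime", "minister"), ("ice", "hockey")}


def merge_collocations(tokens, collocations):
    """Join adjacent token pairs that form a known collocation into one token."""
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and pair in collocations:
            # Treat the collocation as a single feature, e.g. "ice_hockey".
            merged.append("_".join(pair))
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def bag_of_words(text, collocations=COLLOCATIONS):
    """Lowercase, split on whitespace (an assumed tokenizer), merge, and count."""
    tokens = text.lower().split()
    return Counter(merge_collocations(tokens, collocations))


features = bag_of_words("The prime minister watched ice hockey")
```

With this kind of merging, a collocation contributes one feature to the vector instead of two unrelated word counts, which is the intuition behind using collocations when predicting an article's section.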