Abstract:
We present a method for automatically generating a sentiment-aware lexicon from 500,000 recent online user reviews of restaurants and hotels. The experiments include implementations of a web crawler, a part-of-speech tagger, and an algorithm for sentiment analysis and lexicon expansion. We focus on the domain of restaurant and hotel reviews because it offers a large set of relatively simple, short texts that are self-tagged by their authors and limited to a small set of topics such as service, food quality, and ambience. We find this domain promising for building and testing machine learning algorithms for opinion mining and sentiment analysis, a conclusion supported by our results, which reach an accuracy of up to 88%. The research is based on very recent consumer-generated data.