Please use this identifier to cite or link to this item: http://hdl.handle.net/10506/655

Title: Impact of Ngrams-based indexing on XML retrieval
Authors: Ben Aouicha, Mohamed; Tmar, Mohamed; Boughanem, Mohand
Keywords: Ngrams; XML retrieval
Issue Date: 2009
Publisher: Demetra EOOD
Citation: Ben Aouicha, Mohamed, Tmar, Mohamed, Boughanem, Mohand (2009), Impact of Ngrams-based indexing on XML retrieval, Proceedings of International Conference on SOFTWARE, SERVICES & SEMANTIC TECHNOLOGIES, October 28-29, 2009, Sofia, Bulgaria, ISBN 978-954-9526-62-2, p. 29
Abstract: In this paper we present a statistical approach to term clustering. The approach is based on a statistical analysis of the NGrams shared by a pair of terms and is inspired by the tf × idf criterion commonly used in information retrieval. Being statistical, it is completely independent of the lexical and grammatical characteristics of the language in which the documents to be indexed are written. Classical indexing is often based on stemming, which consists of reducing a term to its stem (radical), and this opens up wide possibilities for customized information access. We consider that the same can be achieved by building term clusters and performing information retrieval on the basis of these clusters. The approach is applied to XML retrieval, and experiments have been carried out on a dataset provided by INEX to show its impact compared with the Porter stemming method.
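Note: the abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of comparing two terms through the character NGrams they share, with an idf-like weight on each NGram. The function names (char_ngrams, ngram_idf, similarity), the trigram size, the weighted-overlap score, and the toy vocabulary are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def char_ngrams(term, n=3):
    """Return the set of character n-grams of a term (no padding)."""
    term = term.lower()
    if len(term) < n:
        return {term}
    return {term[i:i + n] for i in range(len(term) - n + 1)}


def ngram_idf(vocabulary, n=3):
    """idf-like weight: n-grams found in few terms receive a higher weight."""
    df = Counter()
    for term in vocabulary:
        df.update(char_ngrams(term, n))
    total = len(vocabulary)
    return {gram: math.log(total / count) for gram, count in df.items()}


def similarity(a, b, idf, n=3):
    """Hypothetical score: idf-weighted overlap of the n-grams of two terms."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    shared = sum(idf.get(g, 0.0) for g in grams_a & grams_b)
    union = sum(idf.get(g, 0.0) for g in grams_a | grams_b)
    return shared / union if union else 0.0


# Toy vocabulary (assumed); a real index would use the collection's terms.
vocabulary = ["retrieval", "retrieve", "retrieved", "index", "indexing", "indexed"]
idf = ngram_idf(vocabulary)

print(similarity("retrieval", "retrieve", idf))
print(similarity("retrieval", "index", idf))
```

Terms whose score exceeds a chosen threshold could then be grouped into the same cluster, which would play the role that a shared stem plays under Porter stemming. The threshold and the clustering procedure are left open here because the abstract does not specify them.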
URI: http://hdl.handle.net/10506/655
ISBN: 978-954-9526-62-2
Appears in Collections: S3T 2009

Files in This Item:

File: S3T2009_08_MBenAouicha_MTmar_MBoughanem.pdf
Size: 324.46 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

