
The Computer Journal 1998 41(8):589-601; doi:10.1093/comjnl/41.8.589
© 1998 by British Computer Society

Downdating the Latent Semantic Indexing Model for Conceptual Information Retrieval

D. I. Witter and M. W. Berry

Department of Computer Science, University of Tennessee, Knoxville, TN 37996-1301, USA. Email: berry@cs.utk.edu

Due to the growth of large data collections, information retrieval, or database searching, is of vital importance. Lexical matching techniques may retrieve irrelevant or inaccurate results because of synonyms and polysemous words, so effective concept-based techniques are needed. One such technique is latent semantic indexing (LSI), which uses a vector-space approach to identify documents whose content is related to the user's query and to rank them in order of similarity. LSI uses the singular value decomposition (SVD) of a term-by-document matrix to encode the terms and documents in a vector-space model. Existing methods for removing terms or documents from the term-document space are either time consuming or do not sufficiently change the term-document relationships. This paper presents a new downdating method, downdating the reduced model (DRM), and discusses its implementation in the LSI++ software environment. The DRM method can be used to assess the effect that a term or document has on the clustering of relevant information in a collection, and to incorporate user feedback into the existing LSI model. Implementing the DRM method within LSI++ not only provides downdating functionality, but is also less time consuming than recomputing the SVD when removing a term, a document, or both. The DRM method is a viable algorithm for dynamic information modeling and retrieval.
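
To illustrate the vector-space idea described in the abstract, the following Python sketch builds a rank-k LSI model from a small term-by-document matrix with NumPy and scores documents against a query. The toy matrix, the function names (`lsi_fit`, `score_documents`, `remove_document_by_recompute`) and the choice k = 2 are assumptions made for illustration; none of them come from the paper or from LSI++. The last function shows only the baseline that the abstract calls time consuming (deleting a column and recomputing the SVD). It is not the DRM method, whose details are not given in the abstract.

```python
import numpy as np


def lsi_fit(A, k):
    """Return the rank-k truncated SVD factors (U_k, s_k, Vt_k) of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def score_documents(Uk, sk, Vtk, q):
    """Cosine similarity between a query and every document in k-space."""
    # Fold the query into the reduced space: q_hat = q^T U_k S_k^{-1}.
    q_hat = (q @ Uk) / sk
    docs = Vtk.T  # one row per document, k coordinates each
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(q_hat)
    return (docs @ q_hat) / norms


def remove_document_by_recompute(A, doc_index, k):
    """Baseline downdating: drop one column and recompute the SVD."""
    A_reduced = np.delete(A, doc_index, axis=1)
    return A_reduced, lsi_fit(A_reduced, k)


# Hypothetical toy collection: rows are terms, columns are documents.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
], dtype=float)

k = 2
Uk, sk, Vtk = lsi_fit(A, k)

# Query containing only the term "car".
q = np.zeros(len(terms))
q[terms.index("car")] = 1.0
sims = score_documents(Uk, sk, Vtk, q)
ranking = np.argsort(-sims)  # document indices, most similar first

# Remove document 0 the slow way, by recomputing the decomposition.
A_reduced, (Uk2, sk2, Vtk2) = remove_document_by_recompute(A, 0, k)
```

In this toy collection document 1 contains "automobile" but not "car", yet it scores close to document 0 for the query "car" because both share "engine". Lexical matching would miss it, which is the synonymy problem the abstract mentions. Recomputing the SVD is affordable at this size; for a large collection its cost is what motivates a cheaper downdating method.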





