Abstract
We store a growing amount of digital information to support our work and daily lives. Emails, documents, articles, bookmarks, e-books, images, audio and video are kept not only on our hard drives but also in the storage offered by newer Web 2.0 services. Together this information forms a kind of personal digital memory, whose key requirement is that we can find what we are looking for as quickly as possible. Information requests in these personal repositories mostly involve re-finding documents we have already read or seen. Full-text search does the job when we remember the exact words, but that is increasingly difficult; often we remember only a concept that our memory associates with the information object we are looking for. Concept-based query expansion addresses this problem: with query expansion, our chances of finding the requested document increase. However, if the list of results is long, locating the document in it will probably take some time. In repositories such as the Personal Digital Library system, which is designed for optimal use on portable devices, an adequate ranking of the documents is important for a better search experience. Analysis of the link structure between documents is a good ranking factor for general search engines, but in personal collections this factor is not available. More information about each document is therefore needed, and tagging appears to be the best option for providing additional criteria to rank the result list appropriately. This research proposes an architecture for a conceptual information retrieval system for personal repositories, which includes tagging as the mechanism that provides semantics and query expansion as the means of improving recall. The architecture also includes a further feature for gathering semantics: the indexing of concepts using latent semantic analysis.
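The interplay of query expansion and tag-based ranking described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the concept map, the sample documents, the weights and the function names (`expand_query`, `score`, `search`) are all assumptions made for illustration, and the latent semantic analysis component is left out.

```python
from collections import Counter

# Hypothetical concept map (an assumption): each concept lists related terms.
CONCEPTS = {
    "vacation": {"holiday", "trip", "travel"},
    "invoice": {"bill", "receipt", "payment"},
}

# Hypothetical personal collection: free text plus user-assigned tags.
DOCS = [
    {"id": "a", "text": "photos from our trip to lisbon", "tags": ["travel"]},
    {"id": "b", "text": "vacation request form", "tags": []},
    {"id": "c", "text": "electricity bill march", "tags": ["invoice"]},
]


def expand_query(terms):
    """Return the original terms plus every term sharing a concept with them."""
    expanded = set(terms)
    for term in terms:
        # The term names a concept: add its related terms.
        expanded |= CONCEPTS.get(term, set())
        # The term is a related word: add the concept and its other terms.
        for concept, related in CONCEPTS.items():
            if term in related:
                expanded |= {concept} | related
    return expanded


def score(doc, original, expanded, tag_weight=2.0, expansion_weight=0.5):
    """Sum term matches in the text, boosting matches against the tags.

    Terms added by expansion count less than terms the user typed.
    The weights are arbitrary illustrative values.
    """
    words = Counter(doc["text"].lower().split())
    tags = set(doc["tags"])
    total = 0.0
    for term in expanded:
        weight = 1.0 if term in original else expansion_weight
        total += weight * words[term]
        if term in tags:
            total += weight * tag_weight
    return total


def search(docs, query):
    """Return ids of matching documents, best score first (ties by id)."""
    original = set(query.lower().split())
    expanded = expand_query(original)
    scored = [(score(d, original, expanded), d["id"]) for d in docs]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for s, doc_id in scored if s > 0]


if __name__ == "__main__":
    print(search(DOCS, "vacation"))
```

In this toy collection, a plain full-text match on "vacation" would reach only document `b`. Expansion also retrieves document `a` through the related terms "trip" and "travel", and the tag match lifts it in the ranking, which is the kind of extra ranking criterion that tagging is meant to supply when link structure is unavailable.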