In follow-up to my submission for Module 5, two questions have been posed:
(1) Do you think that in the future we will still use controlled vocabulary as it is used in libraries to organize their catalogs?
I believe that controlled vocabularies such as LCSH and thesauri will still have a place in organizing library catalogs in the future, because of the advantages they will continue to impart, including connections to broader, narrower, and related topics and improved recall. However, the expense of learning and maintaining controlled vocabularies, their user-unfriendliness, and the lag in keeping them current are difficult to overcome when we, and libraries, are constantly inundated with new information that is difficult to assess. What I think would be interesting is a way to augment controlled vocabularies with user-supplied tagging, adding new ways for patrons to identify and access relevant information. The social aspect of patron-supplied tagging might have the added benefit of engaging the library's patrons more deeply and reminding them that libraries can provide added value for their research compared to the open web.
This is, coincidentally, a topic I am contemplating exploring for my paper for this class: whether folksonomies can be used effectively in public libraries. Some of the reading I've done already indicates that ILSs such as the SirsiDynix Horizon Portal and a system from EOS International allow end-user-specific customization such as personal lists, and I am interested in learning more.
(2) What is your opinion of Singhal's (2001) Modern Information Retrieval article?
With the explanations of each IR model covered in this article already summarized in my prior post, I'll focus my assessment of the article on the techniques and applications for evaluating search effectiveness. What has struck me to this point is that our reading has focused heavily on "what documents should be relevant," but the fact of the matter is that "what documents are relevant" is very much in the eye of the researcher. So, for example, with tf-idf, a term that appears frequently in a particular document but rarely across the rest of the document set should make that document more relevant to the researcher and thus place it higher in the answer-set rankings, but at the end of the day it is that researcher's own judgment that matters.

That is why the section on Query Modification resonated with me. With relevance feedback, that subjective element is folded back into the user's next search. I feel like this has been incorporated into Google's search results since the article was published in 2001, because it seems that once I do one or two searches on Google and click on a few documents, a second or third search eliminates some of the documents that interpret my search terms in ways contrary to my intent. It's almost as if Google's search engine has concluded, "OK, she wants results that use the term 'record' as a noun meaning the plastic album, rather than as a verb meaning to make a copy of something." Which is a little scary, because when I am logged into my Gmail account those inferred preferences are attributed to me as a named person, but user privacy is another subject for another day.
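To make the tf-idf intuition above concrete, here is a minimal sketch in Python. The toy documents and the query term "album" are my own illustrative example, not taken from the article; it simply shows how a term's frequency within one document, combined with its rarity across the set, pushes that document up the ranking.

```python
import math

# Toy document set (illustrative only). Note that "record" appears in every
# document, so its idf is zero and it contributes nothing to the ranking.
docs = [
    "record player spins the record",
    "record a copy of the song",
    "the band released a new record album",
]

def tf_idf(term, doc, doc_set):
    """Score one term in one document: term frequency within the document
    times the log inverse of how many documents in the set contain it.
    Assumes the term occurs in at least one document in the set."""
    words = doc.split()
    tf = words.count(term) / len(words)
    df = sum(1 for d in doc_set if term in d.split())
    idf = math.log(len(doc_set) / df)
    return tf * idf

# Rank the documents for the query term "album": only the third document
# contains it, so its rarity (high idf) puts that document at the top.
scores = [tf_idf("album", d, docs) for d in docs]
```

Only the third document gets a nonzero score here; a relevance-feedback step of the kind the Query Modification section describes would then reweight the query terms based on which of these ranked documents the researcher actually marks or clicks as relevant.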