Sunday, December 23, 2007

Web 3.0 and medicine: make way for the semantic web

Dean Giustini, UBC biomedical branch librarian
This time last Christmas, medical blogs and RSS feeds were the hot technology topics, and we were debating the merits of newer models of scholarly publishing in web 2.0, such as open access and medical wikis.1 Can web 3.0 be here already?
Recently, a neurologist devised an apt medical metaphor for web 3.0. He suggested that, "The development of the graphical web from its early days in 1995 to the social web of late 2007 is comparable to the developing brain." He went on to say that, "Whereas web 1.0 and 2.0 were embryonic, formative technologies, web 3.0 promises to be a more mature web where better ‘pathways’ for information retrieval will be created, and a greater capacity for cognitive processing of information will be built." (Personal communication, A Wong, 2007.)
So what is web 3.0, and why is it called the semantic web (table)? Although the two terms are used interchangeably, they convey slightly different, if complementary, views of the new web. The web 3.0 label is often used as a marketing ploy for "the next big thing." An important feature of web 3.0 is that it enables computers to talk to each other so that they can perform the tasks necessary for us to do our work. Its primary feature, however, is that it uses metadata (data about data). This will transform the web into a giant database, and organise it along the lines of PubMed, or one of our trusted medical library catalogues.2
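To make the idea of metadata concrete, here is a minimal Python sketch. It is an illustration only and is not drawn from PubMed or any real catalogue: the records, the field names, and the `find` helper are all hypothetical. It shows how documents that carry structured descriptions can be queried like a database rather than matched on words in their text.

```python
# Hypothetical catalogue records: each document carries metadata
# (data about data) alongside its title. All values are invented.
RECORDS = [
    {"title": "Managing acute myocardial infarction",
     "subject": "Myocardial Infarction", "type": "review", "year": 2006},
    {"title": "Heart attack: what patients ask",
     "subject": "Myocardial Infarction", "type": "patient leaflet", "year": 2007},
    {"title": "Stroke rehabilitation at home",
     "subject": "Stroke", "type": "review", "year": 2005},
]


def find(records, **criteria):
    """Return titles of records whose metadata match every given field exactly."""
    return [
        record["title"]
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


# Query by subject and document type, not by words in the title.
print(find(RECORDS, subject="Myocardial Infarction", type="review"))
```

Note that the second record is found by a subject query even though its title never uses the term "myocardial infarction"; that is the practical benefit of describing documents rather than relying on their wording.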
Somehow, the term semantic web has escaped the reproach of web 3.0, perhaps because it was coined by the respected web expert Sir Tim Berners-Lee in his landmark paper in Scientific American.3 His ideas continue to have tremendous salience. Berners-Lee’s view is that we need to use semantic annotation to express the meaning latent in web documents, by drawing out inferences in documents deep within the web. As a pioneer in search technology, and director of the World Wide Web Consortium, Berners-Lee maintains that access to a global "web of data"—what weaves the entire web together into a coherent whole—should help to solve humankind’s most complex problems.4
To understand why we need web 3.0, let’s examine the current state of the web. Currently, access to endless reams of unorganised information in web 2.0 shifts the online habits of doctors to searching, not finding. Consequently, medical librarians believe that it is necessary to build better mechanisms for information retrieval.5 6 As a colleague said to me recently, "we need find engines, not search engines."
In medicine, finding the best evidence has become increasingly difficult, even for librarians. Despite Google’s constant accessibility, its search results are emblematic of an approaching crisis of information overload, and the same is true of Yahoo and other search engines. Consequently, medical librarians are leading doctors back to trusted sources, such as PubMed, Clinical Evidence, and the Cochrane Library, and even taking them back to their library bookshelves. Unless better channels of information are created in web 3.0, we can expect the information glut to continue.
Web 3.0 is likely to have a big effect on medicine in 2008. In bioinformatics, it will become more common to process ever larger amounts of data. In fact, experts in bioinformatics already search for data from disparate systems, and they have started to build rich semantic relations into information tools for knowledge discovery. Finally, greater capacity for creating knowledge in medicine will be possible if we have the will to publish clinical data openly and transparently, and subject it to scrutiny.7
Developing a more personalised healthcare system will be an important challenge for doctors in web 3.0. In an era of greater personalisation, treating patients’ health problems according to their genetic profiles will depend on using the latest information technologies.8 Even the treatment of new diseases and warning systems for natural disasters will benefit from the merging of epidemiological datasets with virtual, three dimensional tools like Google Earth. Making the search for health information efficient and responsive to patients’ needs will also help reduce the costs of medical treatment.
Social software enthusiasts may well find that the new web will be fertile ground for the creation of knowledge. Although already popular, wikis may well serve as platforms for the exploration of web 3.0. One innovative wiki—Wikiproteins—is already using semantic technologies. In contrast to other wikis, Wikiproteins imports data mined from several of the world’s leading biomedical databases, such as PubMed, UniProt, and the National Library of Medicine. Its integrated entries are a useful combination of genetic information and scientific literature. Notably, the confluence of databases in Wikiproteins yields more than two million factual associations for data mining and over five billion associated pairs.9
Each new version of the web should be a better iteration of its predecessor, and web 3.0 should be no exception. In medicine, we should focus on the ability to locate trusted clinical information, while creating the means to produce new knowledge. Information retrieval in web 3.0 should be based less on keywords than on intelligent ontological frameworks, such as the National Library of Medicine’s Unified Medical Language System, Medline’s trusted MeSH vocabulary, or some other tool. The National Library of Medicine is working on automated indexing, which may be part of the solution for searching the biomedical web.10 Finally, as we move further into the digital age, our trusted print libraries must continue to be well funded and should not be forgotten in the midst of the intelligent web.
Whether http://del.icio.us and www.connotea.org, two popular social tagging sites, will be useful in web 3.0 remains doubtful.11 Social tagging or "indexing" has limitations because of poor control of synonyms, homonyms, spelling conventions, and other linguistic variations. Think about the myriad ways we describe a heart attack; these variations have enormous implications for searching and require control to optimise retrieval.
A smarter medical web is coming. Its two most exciting features will be the better organisation of documents and a deeper use of the knowledge base in medicine. In terms of searching, the semantic web should resemble a library catalogue, where documents are described and given meaningful access points for easy retrieval. However, in getting to web 3.0, let’s aim for something better than the incoherent mess of the current web 2.0. Logically, web 3.0 should bring order to the 21st century web in the same way that Dr John Shaw Billings’s Index Medicus brought order to medical research back in the 19th century.12 As a medical librarian, I sincerely hope that web 3.0 will return us to some of the time honoured principles of my profession.
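The synonym problem raised above for social tagging can be illustrated with a short Python sketch. It is a toy example under stated assumptions: the `PREFERRED_TERM` table, the `normalise` and `group_by_concept` helpers, and the tagged documents are hypothetical, and the mapping is hand-written rather than taken from MeSH or the Unified Medical Language System. It shows how mapping free-text tags onto one preferred term brings together documents that uncontrolled tags would scatter.

```python
# Hypothetical, hand-written mapping from free-text tags to a single
# preferred term. A real controlled vocabulary is far larger and is
# maintained by indexers; this table is for illustration only.
PREFERRED_TERM = {
    "heart attack": "Myocardial Infarction",
    "heartattack": "Myocardial Infarction",
    "mi": "Myocardial Infarction",  # abbreviations can be ambiguous in practice
    "myocardial infarction": "Myocardial Infarction",
    "myocardial infarct": "Myocardial Infarction",
}


def normalise(tag):
    """Map a tag to its preferred term, ignoring case, hyphens, and extra spaces."""
    key = " ".join(tag.lower().replace("-", " ").split())
    # Tags with no known preferred term pass through unchanged.
    return PREFERRED_TERM.get(key, tag)


def group_by_concept(tagged_docs):
    """Build an index from preferred term to the set of documents tagged with it."""
    index = {}
    for doc, tags in tagged_docs.items():
        for tag in tags:
            index.setdefault(normalise(tag), set()).add(doc)
    return index


tagged = {
    "doc1": ["heart attack"],
    "doc2": ["MI"],
    "doc3": ["Heart-Attack", "stroke"],
}
print(group_by_concept(tagged))
```

Without the mapping, a search on any one tag would retrieve only one of the three documents; with it, all three are filed under the same concept. The sketch does nothing about homonyms, which need context to resolve and are a harder problem than synonyms.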
Glossary
Data mining—a process of knowledge discovery or retrieval of hidden information from data banks and clusters of databases
Mashup—a web application or site that mixes content from multiple sources
RSS (really simple syndication)—a format for sharing content between different websites
Semantic web—a project that intends to create a universal medium for information exchange from 2008 and beyond by putting documents with computer processable meaning (semantics) on the world wide web
Semantics—a term derived from the Greek to give signs, meaning, or to make significant. Semantics refers to aspects of meaning as expressed in language or other systems of signs
Social tagging—the application of freely chosen labels, or tags, to web documents, web pages, and photo sharing sites, such as www.flickr.com
Web 3.0—a term used to describe the evolution of the web, and our responses to it, in finding and organising new information
Wiki—a website or similar online resource that allows users to add and edit medical information collectively
Dean Giustini, UBC biomedical branch librarian
University of British Columbia Biomedical Branch Library, Diamond Healthcare Centre and Vancouver Hospital, BC, Canada V5Z 1M9

1. Giustini D. How web 2.0 is changing medicine. BMJ 2006;333:1283-4.
2. Cho A, Giustini D. The semantic web as a large searchable catalogue: a librarian’s perspective. Semantic Report. 2007. www.semanticreport.com/index.php?option=com_content&task=view&id=52&Itemid=79.
3. Berners-Lee T, Hendler J, Lassila O. The semantic web: a new form of web content that is meaningful to computers will unleash a revolution of new possibilities. Sci Am 2001. www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21.
4. World Wide Web Consortium. Semantic Web Health Care and Life Sciences Interest Group. 2007. www.w3.org/2001/sw/hcls/.
5. Robu I, Robu V, Thirion B. An introduction to the semantic web for health sciences librarians. J Med Libr Assoc 2006;94:198-205.
6. Lorence DP, Spink A. Semantics and the medical web: a review of the barriers and breakthroughs in effective healthcare query. Health Info Libr J 2004;21:109-16.
7. Willinsky J, Murray S, Kendall C, Palepu A. Doing medical journals differently: open medicine, open access and academic freedom. Can J Commun 2007. http://pkp.sfu.ca/node/776.
8. Cho A, Giustini D. Back to the future: viewing health librarianship through the semantic lens of web 3.0. Canadian Health Libraries Association (in press).
9. Mesko B. Web 3.0 and medicine. ScienceRoll blog. 2007. http://scienceroll.com/2007/04/06/web-30-and-medicine/.
10. Aronson AR, Bodenreider O, Chang HF. The NLM indexing initiative. Proc AMIA Symp 2000:17-21.
11. Kamel Boulos MN, Wheeler S. The emerging web 2.0 social software: an enabling suite of sociable technologies in health and health care education. Health Info Libr J 2007;24:2-23.
12. Lydenberg HM. John Shaw Billings: creator of the National Medical Library, and its catalogue. Chicago: American Library Association, 1924.
