How can we make the results returned by search engines more relevant?
Making search engines more relevant
Does the right information reduce the number of searches, thereby saving time and money?
While the secrets of modern search engine algorithms are some of the most closely guarded in the modern world, City, University of London research has contributed to search functions that deliver relevant results quickly, saving individuals, businesses and government organisations time and money.
The model produced by this research outperforms other methods in benchmark tests and helps users to access better quality information billions of times every day.
Evidence from a variety of sources shows that the work has had a significant economic impact nationally and internationally. Many software companies have benefited from the work, including multinationals (Microsoft) and UK small and medium-sized enterprises (SMEs) such as Grapeshot, as have organisations that use their software, including Reed Recruitment, MyDeco and UNESCO. Getting the right information to people efficiently and reducing the number of searches performed saves time and money and has a wide range of benefits for individuals and society.
What did we explore and how?
Professor Stephen Robertson, Professor of Information Science, who has been with the University since 1978, developed the theory. It was put into practice in software by Research Fellow Stephen Walker, who was at City from 1988 to 1998.
The pair used probability theory to rank information according to the user's information need, in a function known as BM25 (BM standing for Best Match, 25 being the function number in software). The breakthrough came at the 1994 Text Retrieval Conference, where results showed that BM25 could significantly improve search results compared with other search ranking models.
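The article does not give the formula itself, so the following is only a minimal sketch of the commonly cited form of BM25, not necessarily the exact function used in the City software. The parameter names k1 and b, their default values, the smoothed inverse document frequency and the example documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document in `docs` against the tokenised `query`.

    Assumes `docs` is a non-empty list of non-empty token lists.
    k1 and b are assumed tuning parameters (term-frequency saturation
    and document-length normalisation respectively).
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        term_freq = Counter(d)
        # Longer-than-average documents are penalised, shorter ones boosted.
        length_norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in set(query):
            tf = term_freq[term]
            if tf == 0:
                continue
            n_t = doc_freq[term]
            # Smoothed inverse document frequency (an assumed variant
            # that stays positive even for very common terms).
            idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical toy collection, tokenised by whitespace.
docs = [
    "probabilistic ranking of search results".split(),
    "cooking pasta at home".split(),
    "search engines rank results by relevance to the search".split(),
]
scores = bm25_scores("search results".split(), docs)
```

Documents that share no terms with the query score zero, while repeated query terms add to the score with diminishing returns, which is the behaviour the "best match" ranking relies on.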
Further work was carried out with Microsoft Research Cambridge, UK, to refine the ranking function, and research continues at the Department of Computer Science with researchers including Dr Andrew MacFarlane, on the optimisation of filtering queries.
Microsoft uses a form of this ranking function in its Bing search engine, which was first introduced in 2005 and is now the second largest web search engine after Google.
Benefits and influence of the research
Commercial search engine companies do not disclose the inner workings of their search algorithms, but they followed the development of BM25 closely – their attendance at the presentation of the research showed this, and it is likely they have been at least influenced by this research.
As well as powering web searches, this technology enables the advertising activity linked to these searches, which through Bing and other large search engines comprises an $8.4 billion market worldwide. With better ranking and more relevant documents promoting greater user acceptance and therefore greater advertising revenue, the City BM25 model has helped to deliver widespread commercial success.
BM25 has also been adopted by widely used open source software, including Apache Lucene, which is used on nearly two-thirds of all websites.
The BM25 matching function has had a significant economic impact nationally and internationally. It has benefited software companies such as Microsoft, as well as organisations that use the software, such as the Financial Times and UNESCO.
The model has been implemented successfully by small businesses to provide better search services for their clients. Muscat Limited used the probabilistic algorithms to develop internet and intranet search systems in the 1990s, and that software still powers the Yell search engine as of 2014.
Consumers accessing the BBC, Nokia, NASA and a hundred other global websites used the probability ranking Muscat search engine on a daily basis to access the depths of the internet using simple and forgiving query constructs. Today the co-founders of Muscat use similar algorithmic approaches to power behind-the-scenes advertising optimisation, using the words on pages to contextually query 'best match' advertising during the time it takes a consumer to load the web page.
- Professor Stephen Robertson
- Dr Andrew MacFarlane
More about this research
- School of Mathematics, Computer Science and Engineering
- Related academic: Dr Andrew MacFarlane
- Status: Completed
- Topics: Computer science and informatics
- Industry/sector: Software and the internet
- Funder: EPSRC
- Project partners/stakeholders: City University London, United Kingdom (Lead Research Organisation) and ESRC, United Kingdom (Co-funder)