
How can we make the results returned by search engines more relevant?

Making search engines more relevant

Does the right information reduce the number of searches thereby saving time and money?

While modern search engine algorithms are some of the most closely guarded secrets in the world, City, University of London research has contributed to search functions that deliver relevant results quickly, saving individuals, businesses and government organisations time and money.

The model produced in this work outperforms other methods in benchmark tests and helps users to access better-quality information billions of times every day.

Evidence from a variety of sources shows that the work has had a significant economic impact nationally and internationally. Many software companies have benefited from the work, including multinationals (Microsoft) and UK small and medium-sized enterprises (SMEs) (Grapeshot), as have those who use the services of such software, including Reed Recruitment, MyDeco and UNESCO. Getting the right information to people efficiently and reducing the number of searches performed saves time and money, and has a wide range of benefits for individuals and society.

What did we explore and how?

Professor Stephen Robertson, Professor of Information Science, who has been with the University since 1978, developed the theory, which was put into practice in software by Research Fellow Stephen Walker, who was at City from 1988 to 1998.

The pair used probability theory to rank information according to the user's information need, in a function known as BM25 (BM standing for Best Match, 25 being the function number in software). The breakthrough came at the 1994 Text Retrieval Conference, where BM25 was shown to significantly improve search results compared with other search ranking models.
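
The case study does not give the formula itself, so the following is only a minimal sketch of the BM25 scoring function as it is commonly published, not the original software. The function name bm25_scores, the parameter values k1 = 1.2 and b = 0.75 (conventional defaults), the non-negative form of the inverse document frequency and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document in `docs` against `query` with BM25.

    k1 controls term-frequency saturation and b controls document-length
    normalisation; both defaults are conventional values, assumed here.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        # Length normalisation: longer-than-average documents are penalised.
        norm = k1 * (1 - b + b * len(doc) / avg_len)
        score = 0.0
        for term in query:
            tf = term_freq[term]
            if tf == 0:
                continue  # terms absent from the document contribute nothing
            n_t = doc_freq[term]
            # Non-negative IDF variant (assumption); the classic form omits "1 +".
            idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
            score += idf * tf * (k1 + 1) / (tf + norm)
        scores.append(score)
    return scores


# Toy collection, invented for this example.
docs = [
    "probabilistic ranking of documents for search".split(),
    "cooking recipes for a quick dinner".split(),
    "search engines rank documents by relevance to a search query".split(),
]
query = "search ranking".split()

scores = bm25_scores(query, docs)
# Document indices ordered from best match to worst.
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this toy collection the first document, which contains both query terms, ranks highest, and the document sharing no terms with the query scores zero. Rare terms carry more weight than common ones, and repeated occurrences of a term add progressively less to the score.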

Further work was carried out with Microsoft Research Cambridge, UK, to refine the ranking function, and research continues at the Department of Computer Science, with researchers including Dr Andrew MacFarlane, on the optimisation of filtering queries.

Microsoft uses a form of this ranking function in its Bing search engine, which was first introduced in 2005 and is now the second largest web search engine after Google.

Benefits and influence of the research

Commercial search engine companies do not disclose the inner workings of their search algorithms, but they followed the development of BM25 closely – their attendance at the presentation of the research showed this, and it is likely they have been at least influenced by this research.

As well as powering web searches, this technology enables the advertising activity linked to these searches, which through Bing and other large search engines comprises an $8.4 billion market worldwide. With better ranking and more relevant documents promoting greater user acceptance and therefore greater advertising revenue, the City BM25 model has helped to deliver widespread commercial success.

BM25 has also been adopted by widely used open source software, including Apache Lucene, which is used on nearly two thirds of all websites.

The BM25 matching function has had a significant economic impact nationally and internationally. Its reach has benefited many software companies, such as Microsoft, and others who use the software, such as the Financial Times and UNESCO.

The model has been implemented successfully by small businesses to provide better search services for their clients. Muscat Limited used the probabilistic algorithms to develop internet and intranet search systems in the 1990s, software which still powers the Yell search engine as of 2014.

Consumers accessing the BBC, Nokia, NASA and a hundred other global websites used the probability-ranking Muscat search engine on a daily basis to access the depths of the internet using simple and forgiving query constructs. Today the co-founders of Muscat use similar algorithmic approaches to power behind-the-scenes advertising optimisation, using the words on pages to contextually query 'best match' advertising during the time it takes a consumer to load the web page.

The researchers

  • Professor Stephen Robertson
  • Dr Andrew MacFarlane

More about this research

  • School of Mathematics, Computer Science and Engineering
  • Related academic: Dr Andrew MacFarlane
  • Status: Completed
  • Topics: Computer science and informatics
  • Industry/sector: Software and the internet
  • Funder: EPSRC
  • Project partners/stakeholders: City University London, United Kingdom (Lead Research Organisation) and ESRC, United Kingdom (Co-funder)