Document and query expansion for information retrieval on building regulations

Ruben Kruiper*, Ioannis Konstas, Alasdair J.G. Gray, Farhad Sadeghineko, Richard Watson, Bimal Kumar

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Regulations and test criteria for building products are captured in hundreds of interrelated documents. It can be daunting to figure out which of these documents contain information that is relevant to your building project or product. In this paper, we describe work on an Information Retrieval (IR) system that aims to search through the contents of building regulations. Based on practitioner interviews, we develop a small dataset of user queries for which we would like to return relevant passages of documents. We explore several approaches to Query Expansion (QE) and Document Expansion (DE), taking into account the scarcity of openly available knowledge sources in our small technical domain. We show that IR performance can be greatly improved using QE and DE, and we retrieve a relevant result in the top 3 for up to 85% of our queries. We share our IR dataset and the code to replicate our approach.
Original language: English
Title of host publication: Proceedings of the 30th EG-ICE
Subtitle of host publication: International Conference on Intelligent Computing in Engineering
Place of Publication: London
Publisher: University College London
Chapter: 13
Pages: 1-12
Number of pages: 12
Publication status: Published - 4 Jul 2023
Event: 30th EG-ICE: International Conference on Intelligent Computing in Engineering - University College London, London, United Kingdom
Duration: 4 Jul 2023 → 7 Jul 2023
https://www.ucl.ac.uk/bartlett/construction/research/virtual-research-centres/institute-digital-innovation-built-environment/30th-eg-ice

Conference

Conference: 30th EG-ICE: International Conference on Intelligent Computing in Engineering
Country/Territory: United Kingdom
City: London
Period: 4/07/23 → 7/07/23