Semantics Based Web Ranking Using a Robust Weight Scheme

R. Vishnu Priya, V. Vijayakumar, Longzhi Yang

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)
28 Downloads (Pure)

Abstract

In this paper, HTML tags and attributes are used to determine the different structural positions of text in a web page. Tag- and attribute-based models assign a weight to text according to the structural position in which it appears. Genetic algorithms (GAs), harmony search (HS), and particle swarm optimization (PSO) are used to select informative terms using a novel weighting scheme that combines tags-attributes and term frequency. These informative terms, with their heuristic weights, give emphasis to important terms, qualifying how well they semantically explain a web page and distinguish web pages from each other. The proposed approach is developed by customizing Terrier and tested on the ClueWeb09B, WT10g, .GOV2 and uncontrolled data collections. Its performance is found to be encouraging against five baseline ranking models: the percentage gains achieved are 75-90%, 70-83% and 43-60% in P@5, P@10 and MAP, respectively.
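For readers unfamiliar with the reported measures, the following is a minimal sketch of how P@k and MAP are conventionally computed from a ranked result list and a set of relevance judgments. It is not the authors' evaluation code. The function names, the document identifiers, and in particular the reading of "percentage gain" as relative improvement over a baseline score are assumptions made for illustration.

```python
from typing import Dict, Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values taken at each rank where a relevant document occurs."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(runs: Dict[str, Sequence[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Average of the per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, qrels.get(query, set()))
               for query, ranked in runs.items()) / len(runs)


def percentage_gain(proposed: float, baseline: float) -> float:
    """Relative improvement over a baseline, in percent (assumed definition)."""
    return 100.0 * (proposed - baseline) / baseline


# Hypothetical example data, not taken from the paper.
ranked_docs = ["d1", "d2", "d3", "d4", "d5"]
relevant_docs = {"d1", "d3", "d6"}
p_at_5 = precision_at_k(ranked_docs, relevant_docs, 5)
ap = average_precision(ranked_docs, relevant_docs)
```

With this hypothetical data, two of the top five documents are relevant, so P@5 is 2/5; the average precision sums the precision at ranks 1 and 3 and divides by the three relevant documents, one of which was never retrieved.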
Original language: English
Article number: 4
Pages (from-to): 47-63
Number of pages: 17
Journal: International Journal of Web Portals
Volume: 11
Issue number: 1
Publication status: Published - Jan 2019
