Enhanced intelligent text categorization using concise keyword analysis

Amir Mohammad Shahi*, Biju Issac, Jashua Rajesh Modapothala

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

Supervised learning is a popular approach to text classification in the research community as well as in the software development industry. It enables intelligent systems to solve various text analysis problems such as document organization, spam detection and report scoring. However, the extremely difficult and time-intensive process of creating a training corpus makes it inapplicable to many text classification problems. In this research, we explored ways of addressing this pitfall by studying the ontological characteristics of document categories and grouping them under virtual super-categories to narrow down the search for a suitable category. Applying this method greatly improved classifier performance despite the relatively small size of the training corpus.
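
The idea of narrowing the search through super-categories can be illustrated with a minimal two-stage sketch. This is not the authors' implementation: the hierarchy, the category names, the keyword sets and the simple keyword-count scoring are all hypothetical assumptions chosen only to show the control flow (pick a super-category first, then a category inside it).

```python
from collections import Counter

# Hypothetical hierarchy: super-category -> category -> keyword set.
# All names and keywords are illustrative assumptions, not taken from the paper.
HIERARCHY = {
    "environmental": {
        "energy": {"energy", "electricity", "fuel"},
        "water": {"water", "withdrawal", "discharge"},
    },
    "social": {
        "labor": {"employee", "training", "turnover"},
        "community": {"community", "donation", "volunteer"},
    },
}


def score(tokens, keywords):
    """Count how many tokens of the document match the keyword set."""
    counts = Counter(tokens)
    return sum(counts[k] for k in keywords)


def classify(text):
    """Return (super_category, category) for a whitespace-tokenized text."""
    # Naive tokenization: lowercase and split on whitespace (no punctuation handling).
    tokens = text.lower().split()

    # Stage 1: score each super-category by the union of its categories' keywords.
    best_super = max(
        HIERARCHY,
        key=lambda s: score(tokens, set().union(*HIERARCHY[s].values())),
    )

    # Stage 2: search only among the categories of the chosen super-category.
    categories = HIERARCHY[best_super]
    best_category = max(categories, key=lambda c: score(tokens, categories[c]))
    return best_super, best_category


print(classify("Total water withdrawal and discharge fell this year"))
```

Because the second stage only compares categories under the selected super-category, each decision involves fewer candidates than a flat search over all categories. Ties are resolved in favor of the first entry in the dictionary, which a real system would need to handle more carefully.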

Original language: English
Title of host publication: ICIMTR 2012 - 2012 International Conference on Innovation, Management and Technology Research
Publisher: IEEE
Pages: 574-579
Number of pages: 6
ISBN (Electronic): 9781467306546
ISBN (Print): 9781467306553
Publication status: Published - 12 Jul 2012
Event: 2012 International Conference on Innovation, Management and Technology Research, ICIMTR 2012 - Malacca, Malaysia
Duration: 21 May 2012 to 22 May 2012

Conference

Conference: 2012 International Conference on Innovation, Management and Technology Research, ICIMTR 2012
Country/Territory: Malaysia
City: Malacca
Period: 21/05/12 to 22/05/12

Keywords

  • Categorization
  • Corporate Sustainability Report
  • Feature Selection
  • Global Reporting Initiative
  • Machine Learning
  • Supervised Learning
  • Text Ontology
