News Category Classification Using Distinctive Bag of Words and ANN Classifier

Amritpal Singh, Sunil Kumar Chhillar

Abstract


News category classification is a multi-label text classification problem: the goal is to assign one or more categories to a news article. A standard technique in multi-label text classification is to use a set of binary classifiers. For each category, one classifier gives a "yes" or "no" answer on whether that category should be assigned to a text. Standard algorithms used for these binary classifiers include Naive Bayes classifiers, Support Vector Machines, and artificial neural networks. In this work, a distinctive bag of words is used as the feature set, built from the high-frequency word tokens found in each individual news category. The algorithm presented here is based on a keyword extraction algorithm that handles English-language text, and it considers several news categories, such as business, entertainment, politics, and sports. Intra-class news classification has also been carried out: Cricket and Football, within the sports category, were selected to verify the performance of the algorithm. Experimental results show a high classification rate in identifying the category of a news document.
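
The feature construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `top_k` cutoff, the rule that a token is "distinctive" when it is a top token in exactly one category, and the toy sentences are all assumptions made for illustration. A simple count threshold stands in for the per-category binary classifier; the paper uses an ANN for that step.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def distinctive_vocab(docs_by_category, top_k=3):
    """Build one keyword list per category (hypothetical helper).

    Step 1: take the top_k most frequent tokens in each category.
    Step 2: keep only tokens that are top tokens in exactly one category,
            so words common to every category (e.g. "the") drop out.
    """
    top = {}
    for category, docs in docs_by_category.items():
        counts = Counter(tok for doc in docs for tok in tokenize(doc))
        top[category] = [word for word, _ in counts.most_common(top_k)]
    owners = Counter(word for words in top.values() for word in set(words))
    return {
        category: [word for word in words if owners[word] == 1]
        for category, words in top.items()
    }


def assign_categories(text, vocab, threshold=1):
    """One yes/no decision per category: "yes" if the text contains at
    least `threshold` occurrences of that category's distinctive tokens.
    This threshold rule is a stand-in for a trained binary classifier."""
    counts = Counter(tokenize(text))
    return [
        category
        for category, words in vocab.items()
        if sum(counts[word] for word in words) >= threshold
    ]


# Toy training data (invented for this sketch).
docs = {
    "cricket": ["the batsman hit the ball over the wicket",
                "the bowler took a wicket"],
    "football": ["the striker scored a goal",
                 "the goalkeeper saved the goal"],
}
vocab = distinctive_vocab(docs, top_k=3)
labels = assign_categories("a late goal won the match", vocab)
```

Because each category is decided independently, a text that mentions both a wicket and a goal would receive both labels, which is what makes the scheme multi-label.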






DOI: https://doi.org/10.23956/ijermt.v6i6.288
