Instant Message Classification in Finnish Cyber Security Themed Free-Form Discussion

International Journal On Cyber Situational Awareness (IJCSA)

ISSN: (Print) 2057-2182 ISSN: (Online) 2057-2182

DOI: 10.22619/IJCSA

Published Semi-annually. Est. 2014


Editor-in-Chief:

Dr Cyril Onwubiko, Chair – Cyber Security & Intelligence, E-Security Group, Research Series, London, UK; IEEE UK & Ireland Section Secretary

Associate Editors:

Professor Frank Wang, Head of School / Professor of Future Computing, Chair IEEE Computer Society, UK&RI, School of Computing, University of Kent, Canterbury, UK

Dr Thomas Owens, Senior Lecturer & Director of Quality, Department of Electronic and Computer Engineering, Brunel University, London, UK

Instant Message Classification in Finnish Cyber Security Themed Free-Form Discussion

Samir Puuska, Matti J. Kortelainen, Viljami Venekoski and Jouko Vankka


Instant messaging enables rapid collaboration between professionals during cyber security incidents. However, monitoring discussions manually becomes challenging as the number of communication channels increases. Failure to identify relevant information in free-form instant messages may lead to reduced situational awareness. In this paper, the problem was approached by developing a framework for classifying the topics of instant messages in cyber security-themed discussion in Finnish. The program utilizes open source software components for morphological analysis, and subsequently converts the messages into Bag-of-Words representations before classifying them into predetermined incident categories. We compared support vector machine (SVM), multinomial naïve Bayes (MNB), and complement naïve Bayes (CNB) classification methods using fivefold cross-validation. A combination of SVM and CNB achieved classification accuracy of over 85%, while multiclass SVM achieved 87% accuracy. The implemented program recognizes cyber security related messages in IRC chat rooms and categorizes them accordingly.
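As a rough illustration of the kind of pipeline described above, the following sketch converts messages into Bag-of-Words counts and compares multinomial and complement naïve Bayes with fivefold cross-validation. It is not the authors' implementation: the toy messages (English placeholder tokens standing in for lemmatized Finnish words), the category names, and the helper names `bag_of_words`, `NaiveBayes`, and `cross_validate` are all hypothetical, and both the morphological analysis step and the SVM classifier are omitted for brevity.

```python
import math
from collections import Counter

def bag_of_words(message):
    """Lower-case and whitespace-tokenize a message, then count tokens.
    (Assumption: a real system would lemmatize Finnish words first.)"""
    return Counter(message.lower().split())

class NaiveBayes:
    """Multinomial (complement=False) or complement (complement=True) naive Bayes."""

    def __init__(self, complement=False, alpha=1.0):
        self.complement = complement
        self.alpha = alpha  # additive (Laplace) smoothing constant

    def fit(self, bags, labels):
        self.classes = sorted(set(labels))
        vocab = sorted({tok for bag in bags for tok in bag})
        per_class = {c: Counter() for c in self.classes}
        for bag, label in zip(bags, labels):
            per_class[label].update(bag)
        total = Counter()
        for counts in per_class.values():
            total.update(counts)
        self.prior = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        self.weights = {}
        for c in self.classes:
            # CNB estimates token weights from all classes *except* c.
            counts = total - per_class[c] if self.complement else per_class[c]
            denom = sum(counts.values()) + self.alpha * len(vocab)
            self.weights[c] = {t: math.log((counts[t] + self.alpha) / denom) for t in vocab}
        return self

    def predict(self, bag):
        def score(c):
            # Tokens never seen during training are ignored.
            s = sum(n * self.weights[c][t] for t, n in bag.items() if t in self.weights[c])
            # MNB: maximize log prior + log likelihood.
            # CNB: pick the class whose complement explains the message worst.
            return -s if self.complement else self.prior[c] + s
        return max(self.classes, key=score)

def cross_validate(messages, labels, make_model, k=5):
    """k-fold cross-validation with a simple deterministic split (index mod k)."""
    bags = [bag_of_words(m) for m in messages]
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(bags)) if i % k != fold]
        test = [i for i in range(len(bags)) if i % k == fold]
        model = make_model().fit([bags[i] for i in train], [labels[i] for i in train])
        correct += sum(model.predict(bags[i]) == labels[i] for i in test)
    return correct / len(bags)

# Hypothetical toy data: five messages per invented category.
messages = [
    "phishing email link password", "suspicious email link login",
    "phishing link password reset", "email asks password login",
    "fake login page email",
    "malware infected workstation trojan", "trojan found server malware",
    "ransomware encrypted server files", "malware sample trojan files",
    "infected files ransomware workstation",
    "lunch meeting today noon", "coffee break meeting room",
    "lunch today coffee anyone", "meeting room booked noon",
    "anyone lunch break today",
]
labels = ["phishing"] * 5 + ["malware"] * 5 + ["other"] * 5

mnb_acc = cross_validate(messages, labels, lambda: NaiveBayes(complement=False))
cnb_acc = cross_validate(messages, labels, lambda: NaiveBayes(complement=True))
print(f"MNB accuracy: {mnb_acc:.2f}, CNB accuracy: {cnb_acc:.2f}")
```

The toy categories here have disjoint vocabularies, so the accuracies this sketch prints say nothing about the results reported in the paper; it only shows how the representation, the two naïve Bayes variants, and the fold-wise evaluation fit together.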

Keywords: natural language processing, machine learning, language technology, text classification, classifiers, Finnish, instant messaging, cyber security. 

ISSN: 2057-2182

Volume 1. No. 1

DOI: 10.22619/IJCSA.2016.100105

Date: Nov. 2016

Reference to this paper should be made as follows: Puuska, S., Kortelainen, M.J., Venekoski, V., & Vankka, J. (2016). Instant Message Classification in Finnish Cyber Security Themed Free-Form Discussion. International Journal on Cyber Situational Awareness, Vol. 1, No. 1, pp. 97-109.
