Keystroke Authentication on Enhanced Needleman Alignment Algorithm

Abstract

Identifying users for authentication is an important requirement for computer systems. One identification method is keystroke dynamics, a biometric measurement based on keystroke press duration and keystroke latency. However, several problems arise, such as similarity between users and limited identification accuracy. In this paper, we propose an innovative model that helps to solve the problem of similar users by classifying each user's data based on a membership function. We then employ sequence alignment as a way of discovering patterns in the user's typing behaviour. Experiments were conducted to evaluate the accuracy of the proposed model. The results show high performance compared to standard classifiers in terms of accuracy and precision.

Share and Cite:

Bamatraf, S. , Bamatraf, M. and Hegazy, O. (2014) Keystroke Authentication on Enhanced Needleman Alignment Algorithm. Intelligent Information Management, 6, 211-221. doi: 10.4236/iim.2014.64021.

The same data set was applied to the Weka [38] experimental package. Various classifiers from different categories (trees, decision tables, decision rules, ANN, etc.) were tested so that they could be compared with the proposed method under identical conditions.

Table 1 reports the results in terms of accuracy and precision, computed as given in equations (1) and (2), respectively. The same results are plotted in Figure 9 for easier comparison.
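As a minimal sketch of how these two measures are commonly computed, the following assumes the standard definitions (accuracy as the share of correct decisions, precision as the share of accepted samples that are genuine). Equations (1) and (2) are not reproduced in this section, so the formulas, the function names, and the counts below are assumptions for illustration and are not taken from Table 1.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all decisions that were correct (assumed standard definition)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp):
    """Fraction of accepted samples that were genuine (assumed standard definition)."""
    accepted = tp + fp
    return tp / accepted if accepted else 0.0


# Hypothetical counts, for illustration only (not results from the paper).
tp, tn, fp, fn = 40, 45, 5, 10
print("accuracy:", accuracy(tp, tn, fp, fn))
print("precision:", precision(tp, fp))
```

Here a true positive is a genuine user who is accepted and a false positive is an impostor who is accepted; a multi-class evaluation would typically average the per-user values.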

In general, the experimental results indicate that the proposed system (Enhanced NM-W) outperforms the other classifiers in terms of accuracy and precision. The accuracy of our proposed system is approximately 30% higher than the nearest accuracy, which is achieved by BayesNet. Similarly, the precision of our proposed system is about 40% higher than the nearest precision, which is also achieved by BayesNet.

Figure 8. Timing information types.

Table 1. Classifier results using Weka tool vs enhanced approach.

Figure 9. Classifier results of Weka and our enhanced approach.

6. Discussion and Limitations

Several factors explain why the proposed model performs far better than the other classifiers. The first is the nature of the data, in terms of both quantity and problem domain. In tree-, table-, and rule-based classifiers, entropy is calculated over data with more than 40 attributes and about 50 exemplars and is used as the basis for the tree root or decision. In most cases this leads to unbalanced judgments, because several users are highly similar when most of the attributes are used to construct the model, and the result is a high misclassification rate. The second is that such classifiers perform worse on nominal values than on continuous data; when the raw data was applied to the classifiers, they produced results close to those of the proposed model. Moreover, we used only Weka classifiers that handle nominal data; other classifiers (outside our scope or knowledge) may produce similar results. Nevertheless, the results demonstrate the applicability of the proposed model to similar domains, and more datasets can be tested with it in future work.

However, our approach has some limitations that must be considered. One limitation is the nature of the data the technique can handle: it cannot be applied directly to continuous data. Another lies in the nature of this classification problem, namely the relation between attributes: in some cases certain keystrokes must be ignored for some users and kept for the rest. Sequence alignment skips such cases with penalties that do not affect the judgment relating a sequence to a user, whereas other classifiers usually apply the selection process to the whole data. Although this feature is an advantage, it is on the other hand a limitation for other types of nominal data. Moreover, the Needleman alignment itself is prone to the local-minima trap.
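To illustrate how a global alignment can skip a keystroke at the cost of a penalty, the following is a minimal sketch of Needleman-Wunsch scoring over letter sequences. It is not the enhanced method of this paper: it uses a flat match/mismatch score in place of the reconstructed Blosum matrix, and the function name, the score values, and the example sequences are all assumptions chosen for illustration.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    Assumed scoring: a flat match/mismatch value and a linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # skip a symbol of a
                score[i][j - 1] + gap,      # skip a symbol of b
            )
    return score[-1][-1]


# Hypothetical letter signatures: the second one lacks the keystroke "C".
print(needleman_wunsch_score("ABCD", "ABD"))  # 2: three matches, one gap
```

In this example the missing letter costs a single gap penalty while the remaining three letters still match, so the sequence stays close to the reference signature.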

7. Conclusion

This work addresses the problem of how to authenticate users efficiently based on their keystroke behaviour. The method uses a membership function to create a unique signature for each user in the form of a sequence of letters. We then apply the Needleman-Wunsch sequence alignment algorithm to obtain a more accurate authentication decision. Furthermore, the Blosum matrix is reconstructed to increase the degree of similarity based on the degree of convergence of letters on the keyboard. The experiments showed that Needleman alignment is very promising for extracting user patterns, with an accuracy of 80% and a precision of 90.3%. A comparison with other classifiers showed that the proposed approach achieves significantly better results.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Revett, K. (2007) A Bioinformatics Based Approach to Behavioural Biometrics. Proceedings of the 2007 Frontiers in the Convergence of Bioscience and Information Technologies, Jeju City, 11-13 October 2007, 665-670.
http://dx.doi.org/10.1109/FBIT.2007.143
[2] Joyce, R. and Gupta, G.K. (1990) Identity Authentication Based on Keystroke Latencies. Communications of the ACM, 33, 168-176.
http://dx.doi.org/10.1145/75577.75582
[3] Chaudhari, R.D., Pawar, A.A. and Deore, R.S. (2013) The Historical Development of Biometric Authentication Techniques: A Recent Overview. International Journal of Engineering Research & Technology (IJERT), 2, 3921-3928.
[4] Nanavati, S., Thieme, M. and Nanavati, R. (2002) Biometrics: Identity Verification in a Networked World. John Wiley & Sons.
[5] Voth, D. (2003) Face Recognition Technology. IEEE Intelligent Systems, 3, 4-7.
[6] Shu, W. and Zhang, D. (1998) Automated Personal Identification by Palmprint. Optical Engineering, 37, 2359-2362.
http://dx.doi.org/10.1117/1.601756
[7] O'Shaughnessy, D. (1986) Speaker Recognition. IEEE ASSP Magazine, 3, 4-17.
http://dx.doi.org/10.1109/MASSP.1986.1165388
[8] Tappert, C. (1984) Adaptive On-Line Handwriting Recognition. Proceedings of Seventh International Conference on Pattern Recognition, Montreal, 30 July-2 August 1984, 1004-1007.
[9] Herbst, N.M. and Liu, C.N. (1977) Automatic Signature Verification Based on Accelerometry. IBM Journal of Research and Development, 21, 245-253.
http://dx.doi.org/10.1147/rd.213.0245
[10] Lin, D.-T. (1997) Computer-Access Authentication with Neural Network Based Keystroke Identity Verification. Proceedings of the International Conference on Neural Networks, Houston, 9-12 June 1997, 174-178.
[11] Kumar, P. and Sahoo, G. (2013) Survey On Bioinformatics And Computational Biology. International Journal of Engineering Research & Technology (IJERT), 2, 108-114.
[12] Dayhoff, M.O., Schwartz, R.M. and Orcutt, B.C. (1978) A Model of Evolutionary Change in Proteins. Atlas of Protein Sequence and Structure, 5, 345-351.
[13] Henikoff, S. and Henikoff, J.G. (1992) Amino Acid Substitution Matrices from Protein Blocks. Proceedings of the National Academy of Sciences of the United States of America, 89, 10915-10919.
[14] Needleman, S.B. and Wunsch, C.D. (1970) A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins. Journal of Molecular Biology, 48, 443-453.
http://dx.doi.org/10.1016/0022-2836(70)90057-4
[15] Eger, S. (2013) Sequence Alignment with Arbitrary Steps and Further Generalizations, with Applications to Alignments in Linguistics. Information Sciences, 237, 287-304.
http://dx.doi.org/10.1016/j.ins.2013.02.031
[16] Wangsuk, K. and Anusas-amornkul, T. (2013) Trajectory Mining for Keystroke Dynamics Authentication. Procedia Computer Science, 24, 175-183.
http://dx.doi.org/10.1016/j.procs.2013.10.041
[17] Stefan, D., Shu, X. and Yao, D. (2012) Robustness of Keystroke-Dynamics Based Biometrics against Synthetic Forgeries. Computers & Security, 31, 109-121.
[18] Alpar, O. (2014) Keystroke Recognition in User Authentication Using ANN Based RGB Histogram Technique. Engineering Applications of Artificial Intelligence, 32, 213-217.
[19] Smith, T.F. and Waterman, M.S. (1981) Identification of Common Molecular Subsequences. Journal of Molecular Biology, 147, 195-197.
http://dx.doi.org/10.1016/0022-2836(81)90087-5
[20] Higgins, D.G. and Sharp, P.M. (1988) CLUSTAL: A Package for Performing Multiple Sequence Alignment on a Microcomputer. Gene, 73, 237-244.
http://dx.doi.org/10.1016/0378-1119(88)90330-7
[21] Dayhoff, M.O. and Foundation, N.B.R. (1979) Atlas of Protein Sequence and Structure. National Biomedical Research Foundation.
[22] Karnan, M. and Krishnaraj, N. (2010) BioPassword—Keystroke Dynamic Approach to Secure Mobile Devices. IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), Coimbatore, 28-29 December 2010, 1-4.
[23] Gaines, R.S. (1980) Authentication by Keystroke Timing: Some Preliminary Results. Rand, Santa Monica.
[24] Young, J.R. and Hammon, R.W. (1989) Method and Apparatus for Verifying an Individual’s Identity. United States Patent.
[25] Hu, J., Gingrich, D. and Sentosa, A. (2008) A k-Nearest Neighbor Approach for User Authentication through Biometric Keystroke Dynamics. Proceedings of IEEE International Conference on Communications, Beijing, 19-23 May 2008, 1556-1560.
[26] Rodrigues, R.N., Yared, G.F.G., do N. Costa, C.R., Yabu-Uti, J.B.T., Violaro, F. and Ling, L.L. (2006) Biometric Access Control through Numerical Keyboards Based on Keystroke Dynamics. Proceedings of the 2006 International Conference on Advances in Biometrics, Hong Kong, 5-7 January 2006, 640-646.
[27] Bleha, S.A., Knopp, J. and Obaidat, M.S. (1992) Performance of the Perceptron Algorithm for the Classification of Computer Users. Proceedings of the 1992 ACM/SIGAPP Symposium on Applied Computing: Technological Challenges of the 1990’s, Kansas City, 1992.
[28] Bleha, S.A. and Obaidat, M.S. (1993) Computer Users Verification Using the Perceptron Algorithm. IEEE Transactions on Systems, Man, and Cybernetics, 23, 900-902.
http://dx.doi.org/10.1109/21.256563
[29] Bleha, S., Slivinsky, C. and Hussien, B. (1990) Computer-Access Security Systems Using Keystroke Dynamics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12, 1217-1222.
http://dx.doi.org/10.1109/34.62613
[30] Bleha, S.A. and Obaidat, M.S. (1991) Dimensionality Reduction and Feature Extraction Applications in Identifying Computer Users. IEEE Transactions on Systems, Man and Cybernetics, 21, 452-456.
http://dx.doi.org/10.1109/21.87093
[31] Obaidat, M.S. (1995) A Verification Methodology for Computer Systems Users. Proceedings of the 1995 ACM Symposium on Applied Computing, Nashville, 26-28 February 1995, 258-262.
http://dx.doi.org/10.1145/315891.315976
[32] Haider, S., Abbas, A. and Zaidi, A.K. (2000) A Multi-Technique Approach for User Identification through Keystroke Dynamics. IEEE International Conference on Systems, Man, and Cybernetics, 2, 1336-1341.
[33] Gutiérrez, F.J., Lerma-Rascón, M.M., Salgado-Garza, L.R. and Cantú, F.J. (2002) Biometrics and Data Mining: Comparison of Data Mining-Based Keystroke Dynamics Methods for Identity Verification. Proceedings of the 2nd Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence, Yucatán, 22-26 April 2002, 460-469.
[34] Krause, M. and Tipton, H.F. (2011) Handbook of Information Security Management. Vol. 5, CRC Press LLC, Boca Raton.
[35] Eisner, R., Poulin, B., Szafron, D., Lu, P. and Greiner, R. (2005) Improving Protein Function Prediction Using the Hierarchical Structure of the Gene Ontology. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, San Diego, 14-15 November 2005, 1-10.
[36] Lu, Z., Szafron, D., Greiner, R., Lu, P., Wishart, D.S., Poulin, B., Anvik, J., Macdonell, C. and Eisner, R. (2004) Predicting Subcellular Localization of Proteins Using Machine-Learned Classifiers. Bioinformatics, 20, 547-556.
http://dx.doi.org/10.1093/bioinformatics/btg447
[37] Killourhy, K. and Maxion, R. (2009) Keystroke Dynamics-Benchmark Data Set.
www.cs.cmu.edu/~keystroke
[38] Holmes, G., Donkin, A. and Witten, I.H. (1994) WEKA: A Machine Learning Workbench. Proceedings of the 1994 2nd Australian and New Zealand Conference on Intelligent Information Systems, Brisbane, 29 November-2 December 1994.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.