Applied Machine Learning Techniques on Selection and Positioning of Human Resources in the Public Sector
1. Introduction
Strategic Human Resources (HR) Management theory is based on the principle that human capital is a strategic asset leading to competitive advantage (Becker & Huselid, 2006); hence, any HR-related process must rely on justified and accurate decision-making systems in order to draw the most from employees’ innovation and creativity capabilities. In this framework, proper selection and allocation of employees require an accurate and timely outline of the expected qualifications in order to establish long-term benefits for the organization (Avdimiotis, 2016). Positioning the right people in the right places is the real benefit of long-term successful organizations (Collins, 2001). Computer science technology can be a strategic tool for effective management. A Human Resources Information System (HRIS) is able to acquire, store, handle, analyze, retrieve and distribute data regarding an organization’s human resources. An HRIS includes hardware and software, but also people/employees, forms, policies, procedures and, mainly, data. Artificial intelligence (AI) is defined as the ability of systems to interpret external data correctly, to learn from such data and to use these lessons to achieve specific goals and tasks through flexible adaptation (Kaplan & Haenlein, 2019). A related survey found that 78% of managers would trust AI advice in the decision-making process (Kolbjørnsrud et al., 2016).
The ability to predict the best matching of human resources to appropriate positions is very important for the strategic administration of human resources. The aim of this paper is to use machine learning algorithms to predict the proper selection and positioning of human resources in the public sector, which is a recent area of study. The main contribution is the development of a framework, based on machine learning algorithms, that provides a reliable tool both for recruitment selection and for evaluation for proper positioning and promotion of employees. This tool will support human resources departments in making accurate and objective decisions about employees’ allocation.
The paper is organized as follows. In Section 2, studies that used machine learning for human resources selection are reviewed. Section 3 presents the research methodology for the specific experiment. In Section 4, various machine learning algorithms are compared and measures of model performance are presented, in order to find the best solution for matching employees to appropriate positions. Section 5 provides the overall results and discussion, and the model with the best total performance is proposed. Finally, in Section 6, conclusions and proposals for further research are presented.
2. Literature Review
Data mining is defined as the process of discovering patterns in large volumes of data and using them to solve problems. Machine learning (ML) is a field of scientific study that gives computers the ability to learn from data, through the study and construction of algorithms that build models and highlight interrelationships learned from historical relationships in the data, leading to data-based predictions. In machine learning, we have a set of input variables (x) used to determine an output variable (y). There is a relationship between the input variables and the output variable, and the goal of ML is to quantify this relationship (Witten et al., 2017). Classification is one of the machine learning tasks that aims to predict a target variable in previously unseen data (Mohammad et al., 2015). Classification is based on examining an uncategorized object and assigning it to one of a predefined set of classes. The items to be classified are represented by the inputs in the database, and each input is assigned to one of the predefined categories. This technique is used to create models that can classify new data whose class is unknown. To do this, all available data is divided into a training set and a validation set. In the first step, a classification algorithm analyzes the training data in order to construct the model. In the second stage, the model is applied to the test data to estimate its accuracy. With this procedure we can predict a class field with the help of other fields, which are the calculation parameters. The most basic phase of any algorithm is training, where the algorithm uses a set of training data (training set) as input to achieve its purpose, the creation of new knowledge (Witten et al., 2017).
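As a minimal sketch of this two-stage procedure (assuming a Python/scikit-learn setting and synthetic placeholder data, not the dataset of this study):

```python
# Minimal sketch of the two-stage classification procedure described above (scikit-learn).
# The data here are synthetic placeholders.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=500, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)

# Divide the available data into a training set and a validation (test) set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)            # first step: analyze the training data and build the model
y_pred = model.predict(X_test)         # second step: apply the model to unseen test data
print("estimated accuracy:", accuracy_score(y_test, y_pred))
```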
Human resources (HR) management and employee recruitment are analyzed in many research papers. These works consider the specific topics mainly in the private sector, while for the public sector the research literature is very scarce. Private-sector procedures involve modern tools such as appraisal visualization, chatbots, data collection from social platforms (LinkedIn, Facebook, Twitter, etc.) and external-channel data pools (Faliagka et al., 2012; Kulkarni & Che, 2019). In the public sector, evaluation and recruitment assessment procedures should be based upon criteria that are indisputable and common to all candidates; therefore, any social media platforms should be excluded in order to ensure reliability, meritocracy, equality, transparency and the privacy of recruitment and allocation processes.
A knowledge-driven decision support system (DSS) supports decision makers by incorporating techniques from the field of Artificial Intelligence. It uses rules that lead inductively to a conclusion based on previously stored facts. Such systems can use ad hoc procedures that make the behavior of the system comparable to that of a human expert (Mitakos, 2015).
Machine learning has been used in workforce analytics for predicting candidate suitability for a particular job position with classification methods. Kulkarni and Che (2019) reported the great importance of AI implementation in talent recruitment, concluding that proper machine learning techniques should be used extensively to support HR managers. A model was created providing a talent acquisition framework, based on a performance indicator that measures and monitors the development of talent over time. In that model, the profiles of candidates who refused a job position were taken into account, as well as the reasons for rejecting a job offer, which are related to the salary, the place of work, the specific job position, the offer of a better job by competitors, etc. Using machine learning techniques for classification, such as decision trees, support vector machines (SVM) and Naïve Bayes, and knowing the candidates’ qualifications, salaries and performance indicator scores, it was possible to predict whether a selected candidate is likely to accept or reject a job offer, as well as the reason why a selected candidate rejected it (Deshpande et al., 2007).
Varshney et al. (2014) made use of data mining, classification methods and post-processing algorithms (linear regression with normalization, linear support vector machines and Naïve Bayes), based on features derived from the job title (resulting from the job description given by the employees themselves), HR information (job address, experience, payment), social tags (status updates, blogs, communities, forums, bookmarks, calendars) and work products, to predict the job role and match each employee to the best position, so that the specialized skills required of employees ensure success in their work. The Naïve Bayes algorithm was also used to create a framework for the objective selection and placement of human resources, according to defined criteria related to training, interview, age and experience relative to the company’s quality standards (Khairina et al., 2017). In order to extract rules to assist personnel selection decisions, an empirical study was also conducted to create a data mining framework exploring the association rules between personnel characteristics and work behavior, based on decision trees. More precisely, the CHAID and C4.5 algorithms were used to make predictions about work behavior, including performance as well as retention in the company, using as input data available at the selection stage, such as age, gender, marital status, educational background, work experience and recruitment channels. The results showed that variables related to school tiers, academic titles and experience were the main characteristics related to the predicted goals. Employees with higher qualifications and a higher educational level, such as postgraduate or doctoral degrees, and employees with one or more years of work experience performed better than other employees, but their resignation rate was also higher. Furthermore, the results showed that employees hired through internal channels were more likely to perform better than those hired through external channels (Chien & Chen, 2008). In order to overcome the lack of a proper framework for recruiting new employees in a software industry, Thakur et al. (2015) proposed machine learning techniques based on rules produced by Random Forest classification into three main outcome classes (Good, Average, Poor). They used many parameters related to candidates’ skills as inputs to determine the performance outcome as output. They put forward the idea of setting a proper selection framework and concluded that attributes like programming skills, domain-specific knowledge and analytical skills must be tested, as they have significant prediction power for the performance evaluation of a person. In another work, Azar et al. (2013) tried to connect working performance to some critical skills (effective factors) of employees, using machine learning algorithms, without obvious success, but they identified five features valuable for promotion purposes: area of employment, education level, exam score, interview score and work experience. More specifically, they tried both to classify and to connect employees’ personal performance, based on several parameters like exams or job characteristics (either quantitative or qualitative), to final work performance, using three different algorithms (QUEST, CHAID, CART). They concluded that input variables must be carefully selected, and that the selection of a proper classification algorithm is also important for increasing accuracy.
Machine learning models have also been used (Luo et al., 2019) to predict the performance of candidates for a particular job, using unsupervised latent models to estimate the relation between employees’ skills and work, either for initial candidate selection or for assessing job performance. The aim of that research was to create a framework that can estimate an employee’s ability based on a set of activities he/she should perform, compared to the time other employees need for the same set of activities.
3. Research Methodology
The initial goal of the research was to identify the most appropriate algorithm to support the employees’ evaluation for the selection and promotion decision-making process. Towards this, a primary quantitative survey was conducted based on a questionnaire divided into two parts. The first part sought to identify the desirable qualifications of the following position types: Director, Head of Department, Employee type A (the entry position anyone can take over in an organization) and Employee type B (senior employees). The second part sought to identify the existing qualifications of the current position holders. Regarding the methodology of questionnaire building, this was based on the set of criteria used by the Supreme Staff Selection Board, which is in charge of recruitment for the public sector in Greece. The questionnaire went through a validation process to determine whether it accurately measures the variables examined (Pampouktsi et al., 2020). The design strategy of the research instrument was based on the Garcia et al. (2009) principles of simplicity, accuracy, clarity, feasibility and construct coherence. Regarding the validation process, among the variety of statistical tools and techniques available, it was based in this particular case on a scheme including:
1) Cronbach’s Alpha test (which reached a value higher than 0.7),
2) Exploratory Factor Analysis (EFA) and
3) Confirmatory Factor Analysis (CFA).
The researchers decided that it was most appropriate to follow the general validation methodology used by several researchers such as Yu and Hsu (2013) and Moreno et al. (2014), who conducted an EFA and a Differential Item Functioning (DIF) analysis verifying the statistical significance of the uploaded items. The primary research provided all the data required to shape the dataset upon which the classification algorithms were run.
3.1. Data Collection and Employees’ Qualifications
3.1.1. Data and Features Collection
The dataset consists of data collected through a survey conducted in the public sector in Greece, regarding employees’ qualifications as well as their current job position. The sample of employees represented almost all education levels and specialties, such as agricultural engineers, mechanical engineers, computer engineers, administration officers, etc. More specifically, the data concerned:
➢ Degree level: graduation from university or technological institute.
➢ Degree value: the grade received in the basic degree title.
➢ MSc holders: MSc diploma, relevant to their job.
➢ PhD holders: PhD diploma, relevant to their job.
➢ Graduation from National School of Public Administration.
➢ Seminars attended during the last 10 years.
➢ European languages: How many (and at which level) European languages they know.
➢ Appraisal score: The achieved score of the employees in the previous appraisal procedure. The appraisal score is the mean of scores each employee achieved in his/her evaluation for job knowledge, job relations and behavior, job effectiveness and leadership abilities.
➢ Computer knowledge certification.
➢ Total experience: How many years of working experience employees have.
➢ Years of experience in an authority position.
➢ Interview score: What score employees achieved in the interview for their position.
➢ Participation in committees: in how many committees employees participated.
➢ Research work: how many published research papers employees have.
➢ Position held in the hierarchy (Director, Head of Department, new or senior employee).
The conceptual experimental framework is depicted in Figure 1.
3.1.2. Simulation of Employees’ Qualifications to Job Specification
The collected data are related to the main types of positions in the public sector, namely Director, Head of Department, Employee A and Employee B. Employee A is the entry and basic position a candidate can take over in an organization, and the succeeding position levels are Employee B, Head of Department and Director. The skills (required or desirable) taken into account for these positions, according to the job specifications, are:
1) Director: university degree level, degree value, Master diploma, PhD diploma (desirable), national school for public administration (desirable), number of seminars, knowledge of European languages, appraisal mean score, experience in years, authority position in years, interview score, participation in group work and committees, number of research work (desirable) and computer certification.
2) Head of Department: university degree level, degree value, Master diploma (desirable), PhD diploma (desirable), national school for public administration (desirable), number of seminars, knowledge of European languages, appraisal mean score, experience in years, authority position in years (desirable), interview score, participation in team work and committees (desirable), number of research works (desirable) and computer certification.
Figure 1. Conceptual experimental framework of the performed tests.
3) Employee A: university degree level, degree value, Master diploma (desirable), PhD diploma (desirable), national school for public administration (desirable), number of seminars, knowledge of European languages (desirable), experience in years (desirable) and computer certification.
4) Employee B: university degree level, degree value, Master diploma (desirable), PhD diploma (desirable), national school for public administration (desirable), number of seminars, knowledge of European languages, experience in years (desirable) and computer certification.
3.2. Database Preparation
3.2.1. Data Import
The database was based on data imported into an MS-Excel spreadsheet. After a first-cycle evaluation comparing employees’ qualifications to the position specifications, we chose 1010 instances with a good pairing to act as a true training set. The employees’ profiles were represented through required and desirable qualifications that formed the attributes (traits) of the datasets.
3.2.2. Questionnaire Assessment and Weighting of Selected Criteria
Employees’ opinions on the selected criteria were collected in the first part of the questionnaire, and the means of their answers acted as weighting coefficients. We collected 196 complete responses, an acceptable number for reliable statistical procedures. The weighting coefficients differed for each position (Director, Head of Department and employee positions). All coefficients were then standardized.
3.2.3. Total Score Calculation
The total score of each employee was calculated using a linear model in which each qualification of the employee is multiplied by the corresponding weighting coefficient for the specific job position and the products are summed. Then, all personal data were recorded and exported to a CSV (Comma Separated Values) file for further processing.
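A minimal sketch of this weighted-sum calculation is given below (Python/pandas; the file name, column names and weight values are hypothetical placeholders, not the questionnaire results):

```python
# Illustrative total-score calculation: weighted sum of qualification values per position.
import pandas as pd

df = pd.read_csv("employees.csv")   # hypothetical file: one row per employee, one column per qualification

# Questionnaire-derived weighting coefficients for one position type (placeholder values),
# standardized so that they sum to 1.
raw_weights = {"degree_value": 2.0, "msc": 3.0, "phd": 2.0, "seminars": 1.0,
               "languages": 1.0, "appraisal_score": 4.0, "experience_years": 3.0,
               "authority_years": 2.0, "interview_score": 2.0}
weights = {k: v / sum(raw_weights.values()) for k, v in raw_weights.items()}

# Linear model: total score = sum of each qualification value times its weighting coefficient.
df["total_score"] = sum(df[col] * w for col, w in weights.items())

# Export for further processing.
df.to_csv("employees_scored.csv", index=False)
```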
3.3. Machine Learning Implementation
Supervised machine learning schemata were employed for predicting the matching degree between the employees’ qualifications and the job profile. Each employee’s profile was represented as a concatenated feature-value vector, and classifiers were trained on the employees’ data to predict the employee-position matching degree. The WEKA (Waikato Environment for Knowledge Analysis) machine learning workbench (Witten et al., 2017) was used for running the classification experiments on the selected dataset. The inputs consisted of the above-mentioned attributes plus the achieved total score of each employee; as output, the position type among the four main positions was selected.
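The sketch below illustrates what such a concatenated feature-value vector and its class label might look like for a single employee; the attribute names, encodings and values are hypothetical placeholders:

```python
# Hypothetical employee profile as a feature-value vector; the class label is the position type.
employee = {
    "degree_level": 1,          # e.g. university = 1, technological institute = 0
    "degree_value": 7.8,
    "msc": 1, "phd": 0, "public_admin_school": 0,
    "seminars": 6, "languages": 2,
    "appraisal_score": 8.9, "computer_cert": 1,
    "experience_years": 14, "authority_years": 3,
    "interview_score": 8.2, "committees": 4, "research_papers": 0,
    "total_score": 0.52,        # weighted sum computed as in Section 3.2.3
}

feature_vector = list(employee.values())   # concatenated feature-value vector (classifier input)
label = "Head of Department"               # class label: one of the four main position types (output)
```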
3.4. Machine Learning Algorithms Classification
From a large pool of available classification algorithms, we selected a subset based on the following criteria:
· forecasting performance: i.e. the ability of the system to correctly identify the position level given the input attributes,
· model explainability: i.e. the ability of a human domain expert (usually from the Human Resources area) to understand the indicators that led the system towards providing one over another prediction,
· academic recognition: i.e. the existence of the algorithm for at least 10 years, as an indicator of reliability and wide acceptance.
Amongst many candidate classification algorithms based on various theoretical foundations, such as statistics, neuroscience and kernel functions, four algorithms prevailed in this study according to the aforementioned criteria; these are also widely known in the machine learning community. An illustrative configuration of all four classifiers is sketched after the list below.
1) J48 is a decision tree induction algorithm initially developed as C4.5 (Quinlan, 1996). C4.5 generates a decision tree based on the information gain of the attributes in the available training dataset. More specifically, the attribute or attributes whose values most clearly discriminate the training examples according to their class label are identified in each iteration (training cycle). The algorithm stops when there are no further attributes to explore or when all the training examples are separated satisfactorily. Additionally, J48 incorporates two tree pruning methodologies: a) Subtree replacement, which replaces a node in a decision tree with the corresponding leaf if the given subtree does not help classification accuracy; this pruning process starts from the higher leaves of the formed tree and moves bottom-up toward the root. b) Subtree raising, in which a node may replace other nodes while it is moved towards the root; this type of pruning usually has an insignificant effect on decision tree models (Witten et al., 2017).
2) Random Forest is a meta-learning classification algorithm that runs iteratively; in each iteration a decision tree is induced from a randomly selected subset of the features. The input vector is run through multiple decision trees, and the number of iterations is pre-defined. The final classification error is the mean error over all iterations. Random Forest is an extension of the decision tree classifier: many classification trees are grown, each tree classifies a new object from its input vector, and the forest chooses the class with the maximum number of votes. Random Forests usually achieve high performance. Furthermore, although the final trained model can learn complex relationships, the decision boundaries built during training are easy to understand (Breiman, 2001).
3) Naïve Bayes is a probabilistic classifier based on the assumption of conditional independence, i.e. that the appearance of a particular feature given the class value is unrelated to the appearance of any other feature in the dataset. Though not valid in reality, this assumption has been proven to cope well with several classification problems. The reason that Naïve Bayes often works so well is that it simplifies predictive modeling problems, can be coded easily and makes quick predictions. The algorithm needs only a small amount of training data to determine the parameters necessary for classification. Thanks to the independence assumption, there is no need to estimate the entire covariance matrix but only the variances of the variables for each class (Russell & Norvig, 2009).
4) Sequential Minimal Optimization (SMO) is an improved algorithm for training support vector machines. It breaks a large quadratic programming (QP) optimization problem into smaller problems and solves the smallest possible optimization problem at every step. The inner loop of the algorithm can be expressed in a short amount of C code, rather than invoking an entire library QP routine. Even though more optimization sub-problems are solved, each sub-problem is solved so quickly that the overall QP problem is handled very fast (Platt, 1998). WEKA’s SMO classifier uses this sequential minimal optimization algorithm to train a support vector classifier with polynomial or Gaussian kernels (Keerthi et al., 2001).
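As a rough illustration of how these four classifiers can be configured outside WEKA, the sketch below sets up approximate scikit-learn counterparts (a CART tree with entropy-based splits standing in for J48/C4.5, RandomForestClassifier, GaussianNB, and SVC, whose libsvm backend uses an SMO-type solver). The file name, column names and parameter values are hypothetical placeholders; the actual experiments used the WEKA implementations with the parameters reported in Tables 1-4.

```python
# Approximate scikit-learn counterparts of the four classifiers discussed above.
# File/column names and parameters are illustrative, not the tuned WEKA settings.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

df = pd.read_csv("employees_scored.csv")
X = df.drop(columns=["position"])       # qualification attributes plus the total score
y = df["position"]                      # Director / Head of Department / Employee A / Employee B

classifiers = {
    # CART with entropy-based splits and pruning, standing in for J48 (C4.5).
    "J48-like tree": DecisionTreeClassifier(criterion="entropy", min_samples_leaf=2, ccp_alpha=0.001),
    # Ensemble of trees grown on random feature subsets; the majority vote decides the class.
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # Naive Bayes under the conditional-independence assumption (per-class variances only).
    "Naive Bayes": GaussianNB(),
    # Support vector classifier; the libsvm backend uses an SMO-type decomposition solver.
    "SMO-like SVM": SVC(kernel="poly", degree=1, C=1.0),
}

for name, clf in classifiers.items():
    clf.fit(X, y)
    print(name, "training accuracy:", clf.score(X, y))
```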
It is important to note here that various other algorithms (MLP, KStar, AdaBoost, etc.) were tested on our dataset for their accuracy and predictive ability, without satisfactory results.
A machine learning model takes inputs and makes predictions. In our research, data on 1010 employees’ qualifications were supplied. Our dataset consisted of thirty-five (35) Directors, two hundred and six (206) Heads of Department, seven hundred and twenty-one (721) Employees B and forty-eight (48) Employees A. The percentage of each position in the training dataset, representing class balance, is presented in Figure 2.
Classification of attributes was based on decision trees (simple or boosted), Bayesian algorithms (Naïve Bayes), and support vector machines (Figure 3), in order to develop a new framework, not only for human resources selection but also for proper positioning, including authority positions.
Figure 2. Class labels’ distribution diagram.
The goal of a classifier is to learn the decision boundaries between the various class labels (position types) from the given training data, and then to use the learned model to predict the class of each instance in a previously unseen dataset, usually referred to as the test set. To determine the generalization ability of a model, one would need to measure the average risk over the set of all possible data objects. In real-life applications this is not feasible, so we estimate the risk by computing it on a test set. Model selection based on testing trained models on a single test set does not avoid the risk of overfitting, which means that the learned model has adjusted its behavior towards accurately predicting all training instances but fails when new examples are given to it. According to best practices, a more accurate estimation of the empirical risk can be obtained with k-fold cross-validation (CV) (Jankowski & Grabczewski, 2008).
In our main experiments, in order to optimize the algorithms’ performance, we ran the appropriate meta-learning modules. This led us to choose either the default values or the automatically proposed parameter values. The optimization refers to improved accuracy and error estimates (especially the root mean squared error) and, in general, improved classification performance. We should note, however, that in some cases the default parameters of an algorithm outperformed any other combination, which can be expected, since these settings were decided by the WEKA contributors upon experimenting with many datasets. The cross-validation parameter selection meta-learner of WEKA showed a slight improvement only for the J48 algorithm.
4. Classification Results
In our main experiments, the 10-fold cross-validation training strategy was used. In this technique we split the set of available data into n parts and perform n training and test processes (each time the test set is one of the parts and the training set consists of the rest of the data). The average test risk can be a good estimate of the real generalization ability of the tested algorithm, especially when the whole cross-validation is performed several times (each time with a different data split) and n is appropriately chosen (Jankowski & Grabczewski, 2008). In our experiments, the original sample was randomly partitioned into ten subsamples, so the dataset was separated into training and validation sets in a ratio of 90% to 10%. For testing the model, one of the ten subsamples was kept as validation data and the other nine were used as training data. The cross-validation process was repeated ten times, with each of the ten subsamples being used as validation data exactly once. Results were averaged across the ten experiments. In general, we provided all of the training data to the following learning algorithms and let them discover the mapping between the inputs and the output class label that minimizes the prediction error.
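A sketch of this 10-fold protocol, using scikit-learn as a stand-in for WEKA’s cross-validation (hypothetical file and column names), is shown below:

```python
# 10-fold cross-validation sketch: ten stratified 90%/10% splits, accuracy averaged over the folds.
import pandas as pd
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("employees_scored.csv")                 # hypothetical file name
X, y = df.drop(columns=["position"]), df["position"]

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(DecisionTreeClassifier(criterion="entropy", random_state=0), X, y, cv=cv)

print("accuracy per fold:", scores)
print("mean accuracy over the ten folds:", scores.mean())
```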
4.1. Decision Trees
We use decision trees mainly for classification and forecasting. They can be represented by IF-THEN-ELSE rules, starting from the root of the tree and ending in its leaves. The characteristics of the problem are included in the tree nodes, which are described by logical conditions on single features. The nodes of a tree are characterized by the names of the features, while the edges are labeled with the possible values that a feature can take and the leaves with the different classes (Murthy, 1998).
4.1.1. Results of Classification for J48 Algorithm
Table 1 presents the basic optimized parameters used for the J48 algorithm, with subtree raising and branch pruning enabled. These parameters were considered satisfactory given the nature of our data.
Figure 4 provides the results for the J48 classification, including precision and recall of the training (learning) for this algorithm, leading to four classes corresponding to the four types of main job positions. The best result was obtained for the Director position, reaching 1.0 for recall, with very good values (over 0.95) for Head of Department and Employee B. Employee A showed satisfactory recall at 0.85. Precision of the true predicted values was high, at least 0.94 for all cases. According to the J48 classifier output, the correctly classified instances were 979 and the incorrectly classified were 31, concerning employees of B level, A level and Head of Department, thus forming a grey zone in the classification. The mean absolute error was 0.0244 and the root mean squared error was 0.1173. High recall is very useful for prediction purposes; in our findings, it was very high for the Director, Head of Department and Employee B positions, and satisfactory for Employee A. In combination with high precision, this algorithm is considered very useful for classification and prediction purposes, and suitable for interpretation of the forecasting process by a human expert.
Figure 4. Precision and recall for J48 classification algorithm.
Table 1. J48 algorithm parameter values.
According to the J48 tree (Figure 5), the most important attributes were the years of experience in authority positions and the total score based on the employee’s qualifications for the specific position. The total score was considered by the algorithm the most important attribute of all. According to the algorithm’s estimations, the total score can easily distinguish the Director (>0.6825) and Employee A (≤0.2474) positions, while the experience in authority positions is useful for distinguishing Employee B from Head of Department. Authority years > 0 may distinguish between Head of Department and Employee B, but when the authority years are zero, the total score proved to be a safe criterion again (Head of Department requires > 0.4805 for promotion reasons). When the authority years are above 2, the combination of authority years and a total score > 0.3493 easily distinguishes the two classes (positions). The total score of an employee is always very important for the position he/she will take over. The accuracy of the constructed model reached 96.93%.
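The main branches described above can be paraphrased as the following simplified rule set (illustrative only; the full J48 tree in Figure 5 contains additional splits, which are collapsed here):

```python
def predict_position(total_score: float, authority_years: float) -> str:
    """Simplified paraphrase of the dominant splits of the J48 tree (Figure 5)."""
    if total_score > 0.6825:
        return "Director"
    if total_score <= 0.2474:
        return "Employee A"
    # Intermediate total scores: experience in authority positions separates the two middle classes.
    if authority_years == 0:
        return "Head of Department" if total_score > 0.4805 else "Employee B"
    if authority_years > 2 and total_score > 0.3493:
        return "Head of Department"
    # Remaining cases are treated as Employee B in this simplified sketch;
    # the full tree contains further splits for them.
    return "Employee B"
```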
4.1.2. Results of Classification for Random Forest Algorithm
Table 2 tabulates the basic parameters for the Random Forest algorithm (default values of WEKA were accepted as satisfactory).
Figure 5. The tree produced by classification algorithm J48.
Table 2. Random forest parameter values.
Figure 6 provides the results for the Random Forest classification (precision and recall) of the training of this algorithm. Precision was again high, over 0.90 for all classes (positions), while recall was over 0.90 only for Head of Department and Employee B and around 0.80 for the two other classes. The lower recall values for these two classes weaken its predictive ability. This classification algorithm managed to distinguish the different positions satisfactorily, leading to 960 correctly classified instances. The incorrectly classified instances were 50, concerning 7 Directors, 18 Heads of Department, 16 Employees B and 9 Employees A. The mean absolute error was 0.0559 and the root mean squared error was 0.1465, while the accuracy of the constructed model reached 95.04%.
Figure 6. Precision and recall for random forest classification algorithm.
According to the produced model, the most important attributes, based on average impurity decrease and the number of nodes using those attributes, are: the total score, the bachelor degree grade, the experience in authority positions, the PhD degree, the seminars, the appraisal score and the total years of experience.
4.2. Bayesian Algorithms
Bayesian classification is a traditional and widely used machine learning technique, based on the application of Bayes’ theorem. In principle, it computes the probability of an event occurring given the probability of another event that has already occurred. In a Bayesian classifier, the goal is to choose the most likely class from a set of possible labels, not to determine the actual probability of a single one (Province, 2015).
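In symbols, under the conditional-independence assumption used by Naïve Bayes, such a classifier selects the class label with the highest posterior probability over the feature values x_1, ..., x_n:

```latex
\hat{c} \;=\; \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```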
Results of Classification for Naïve Bayes Algorithm
Table 3 presents the basic parameters for Naïve Bayes algorithm (default values of WEKA were accepted as satisfactory).
Figure 7 provides the results for the classification (precision and recall) of the training of the NB algorithm. Precision was over 0.85 for almost all classes (positions), except for Employee A (0.712). The highest value was achieved for Employee B (0.982). Recall was generally high and reached 1.0 for Director. Only Employee A showed a recall value lower than 0.9 (0.875). The predictive power of this algorithm is also satisfactory, since the recall values are very high. This classification algorithm managed to distinguish the different positions satisfactorily, leading to 941 correctly classified instances (69 incorrectly). The mean absolute error (MAE) was 0.0401 and the root mean squared error (RMSE) was 0.1742, while the accuracy of the classification estimations reached 93.16%.
Figure 7. Precision and Recall for Naïve Bayes (NB) classification algorithm.
Table 3. Naïve Bayes Parameter values.
4.3. Support Vector Machines
Support vector machines (SVM) were introduced by Boser et al. (1992) to solve both classification and regression problems. In classification problems, depending on the parameters used, they can produce models with different types of decision margins, which can be linear or non-linear. A linear SVM is a hyperplane that separates a set of positive examples from a set of negative ones, maximizing the margin in the feature space, i.e. the distance of the hyperplane from the nearest positive or negative examples (Schölkopf et al., 1998). The complexity of the margins does not lead to poor generalization, because the margin optimization takes care of their correct placement. Support vector machines minimize the empirical risk both for classification and regression problems (Jankowski & Grabczewski, 2008).
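For the linearly separable case, the maximal-margin hyperplane described above is the solution of the standard optimization problem (with class labels y_i ∈ {−1, +1}):

```latex
\min_{\mathbf{w},\, b} \; \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \mathbf{w} \cdot \mathbf{x}_i + b \right) \ge 1, \qquad i = 1, \dots, m
```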
Results of Classification for SMO Algorithm
Table 4 presents the basic parameters for the SMO algorithm (default values of WEKA). Figure 8 provides the results of the classification (precision and recall) of the training of the SMO algorithm. Precision was again very good (between 0.82 for Employee A and 0.93 for Director and Employee B), while recall was over 0.70 for all classes (positions) except Employee A, where it was low, reaching only 0.48. SMO showed disappointing recall results and weak predictive ability for Employee A. This classification algorithm managed to distinguish the different positions satisfactorily, and the correctly classified instances were 934. The mean absolute error was 0.2565 and the root mean squared error was 0.3221, while the accuracy of the constructed model reached 92.47%.
Figure 8. Precision and Recall for SMO classification algorithm.
5. Experimental Results and Discussion
In this work, supervised machine learning techniques (based on decision trees, support vector machines and Bayesian algorithms) were used to predict the future best selection and positioning of human resources. In order to find the best employees’ position, a general-purpose ability model learning framework was developed for each algorithm, which facilitated the learning process from multiple dimensions by combining observable variables as well as hidden patterns embedded in the employees’ formal qualifications. All classification algorithms led to four classes, one for every potential positioning in the organization (Director, Head of Department, Employee A and Employee B). The mean accuracy of classification, using the 10-fold cross-validation method, was very high: 96.93% for J48, 95.04% for Random Forest, 93.16% for Naïve Bayes and 92.47% for SMO. Concerning the error estimations, the J48 algorithm showed the lowest error values; more precisely, its mean absolute error (MAE) was 0.0244 and its root mean squared error (RMSE) was 0.1173. This finding may lead to better predictions using the J48 algorithm in comparison to the rest of the algorithms. The highest MAE and the highest RMSE were calculated for the SMO algorithm (0.2561 and 0.3221 respectively). Even though in many previous research works SVM seems to outperform base classifiers, in our case SVM did not manage to behave like this. Upon applying different kernel choices and regularization parameters (which resulted in a very long training time), the lower performance indicated that a very large number of the training data ended up as support vectors. Furthermore, the no free lunch theorem says that there is no a priori superiority of any classifier system over the others, so the best classifier for a particular task is itself task-dependent.
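For reference, the two error measures reported above are defined in the usual way over the n evaluated instances, with y_i the actual and ŷ_i the predicted value:

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert,
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}}
```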
Figure 9. F-Measure values and comparisons for J48, Random Forest, Naïve Bayes and SMO algorithms.
The F-Measure metric provides an overall estimate of the models, as it combines two other metrics, precision and recall; it is essentially the harmonic mean of precision and recall (Gaber et al., 2007). Figure 9 displays the F-Measure classification results of the main aforementioned machine learning algorithms (J48, Random Forest, Naïve Bayes and SMO). The J48 algorithm produces the best results in both precision and recall, followed by Random Forest and Naïve Bayes. On the other hand, SMO produces the worst results in comparison to the other algorithms, exhibiting lower values for recall.
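In symbols, the F-Measure reported in Figure 9 is the standard harmonic mean:

```latex
F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```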
Our results indicated that J48 gives the most accurate classification and thus computes the proper work positions best, followed by Random Forest. The corresponding measures for the rest of the algorithms were similar and also satisfactory, but with lower pairing accuracy.
As previously mentioned, other methods were also studied. We focused on implementing neural networks, and more specifically a Multilayer Perceptron (MLP). However, its F-Measure value failed to reach the levels of the other algorithms; it was below 0.9 (0.8822), resulting in the poorest performance. This was mainly attributed to the low recall values for the Head of Department position, which reached a significant low of 0.625. The mean recall was 0.844, exhibiting low predictive ability, very poor in comparison to the rest of the algorithms. Precision was satisfactory (close to 0.93) and better than Naïve Bayes and SMO, but lower than J48 and similar to Random Forest. Because the Multilayer Perceptron showed poor predictive ability in our experiment in comparison to the tree-based algorithms, we chose not to present its results in detail. Another reason is that an MLP operates as a “black box”, in the sense that upon performing a prediction there is nothing to be provided to a human expert in order to reason about the factors on which the prediction was based.
In similar experiments, Chien and Chen (2008), using machine learning techniques based on decision trees and more specifically on the CHAID classifier algorithm, tried to set objective rules for the recruitment of the best personnel for a high-technology industry. They described five classes of different work positions (job or work descriptions) based on certain qualifications and demands, with satisfactory confidence (63% - 96%). Another work used the Naïve Bayes algorithm to classify methods of selection of new personnel by Human Resources departments, with encouraging results (Khairina et al., 2017); however, its effectiveness was unclear and its precision seems relatively low. Varshney et al. (2014), in their work on IBM salesmen’s data, reported accuracy around 80% using HR information and job title as the main attributes. Azar et al. (2013) reported accuracy between 60% and 80%, depending on the chosen algorithm and the parameters used (with main features: province of employment, education level, exam score, interview score and work experience). All prior experiments were conducted in the private sector.
In our analysis, the most significant attributes, with the highest prediction power in estimating the suitability of a person either for authority (higher) positions or for recruitment, are associated with the total score, which combines all of his/her qualifications (required and desirable) with the job description requirements. For promotion purposes, the most important attribute is the total score of the employee in association with previous experience in such positions. For recruitment, it is the total score based on the employees’ qualifications considered in the present work (a set of formal and objective qualifications, most of them being the selection basis for civil servants in the Greek public sector). Our proposal is to place the right people in the right position, rapidly and with high accuracy, in order to save resources and assist the decision support system.
The precision of our experiments was very high (over 0.9) for all classifiers and especially for J48, which also showed the highest F-measure (combining precision and recall), over 0.90. Accuracy for J48 was also very high, reaching 97% of correctly classified instances. Furthermore, the time for building the model is much shorter using the J48 algorithm (only 0.01 seconds). As a result, the model proposed here, called the Employees’ Evaluation for Recruitment and Promotion Algorithm Model (EERPAM), was constructed with the J48 algorithm. J48 showed more than promising classification results and high predictive ability for personnel selection and re-allocation purposes. High precision, recall and accuracy allowed us to classify the 1010 employees successfully into four different classes (job positions) and make predictions for future personnel selection. Our results showed higher accuracy and predictive power than previous research on machine learning selection of human resources (Varshney et al., 2014; Azar et al., 2013; Khairina et al., 2017). The practical significance of our research, through the proposed learning-based model, is the prediction of proper selection and positioning of employees among many candidates, horizontally (job position matching) or vertically (in authority positions), by evaluating employees’ formal qualifications against the job description specifications.
6. Conclusion and Future Research
In this work, supervised machine learning is used to predict the best matching of employees to positions in the public sector. The results produced by the conducted experiments were promising, with the J48 algorithm providing the best classification results. Of all the attributes used, the most important are the total score of each employee and his/her experience in authority positions.
Our novel approach differs from previous work because:
1) We used only fully measurable criteria (features).
2) The total score, used as a basic feature, was calculated from the feature vector of formal qualifications (employee’s skills) in accordance with the specific job description requirements.
3) The specific weight of each criterion used was defined by the opinions of public sector employees, separately for each main job position.
4) This research was conducted in the public sector as a novel approach for selection and positioning of civil servants.
5) The proposed machine learning model determined the boundaries of each of the four main positions, leading to classification prediction accuracy over 90% for better positioning and matching of candidates.
The main contribution of our research is that machine learning managed to successfully predict the proper matching between employees and positions, providing a reliable tool for personnel evaluation and for the proper selection, positioning and promotion of candidates, thus supporting human resources departments in making accurate and objective decisions. The overall scheme proposed here may be the basis of a framework for proper personnel recruitment and positioning (horizontal positioning) and also for promotion to authority positions (vertical positioning). The limitations of our research are related to the sample of 1010 instances in our dataset, which included only University and Technological Institution graduates; from this viewpoint, the proposed model may not be appropriate for personnel with secondary or lower-level education. Regarding further research opportunities, we will evaluate the performance of the proposed model by using its outcomes to compare the expected classes with existing employees’ datasets, in order to predict the best-fitting candidacy and evaluate the appropriateness of each employee’s current position according to the job descriptions.