1. Introduction
The blood of an organism contains many components. One task of biomedical research is to identify the cells in the blood, for instance by rapidly and accurately screening red blood cells, white blood cells, and platelets [1] [2] [3]. Artificial intelligence provides an important theoretical basis for progress in biomedicine. AI-based recognition is one way to screen the different types of cells in blood, for example by applying machine learning and deep learning methods [4] [5]. Existing approaches are valuable for screening and identifying cell types, but they are not perfect. How to screen diverse cells quickly, efficiently, and accurately remains an open research question in biomedicine.
Activation functions are a critical part of the design of a neural network. They help the network learn complex patterns in the data, much like the neuron-based model of the human brain. An activation function takes the signal from the preceding neuron and passes it on to the next neuron [6]. As shown in Figure 1, the inputs to a neuron are weighted and summed, and the result is passed through a function; this function is the activation function. It is introduced to add nonlinearity to the neural network model. Without an activation function, each layer is equivalent to a matrix multiplication, so stacked layers collapse into a single linear map and cannot pass nonlinear information from one layer to the next. The activation function is therefore an indispensable part of an artificial neural network. To improve the computational performance of neural networks, many activation functions have been studied. Common ones include Sigmoid [7], Tanh [8], SiLU [9], Hardswish [10], Mish [11], and MemoryEfficientMish.
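The point that layers without an activation function reduce to matrix multiplication can be illustrated with a small sketch. The weights `W1`, `W2` and the input `x` are hypothetical values chosen only for illustration, and ReLU stands in for any nonlinear activation.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def relu(v):
    """Element-wise ReLU, used here as an example nonlinearity."""
    return [max(0.0, x) for x in v]


# Hypothetical weights and input, for illustration only.
W1 = [[1.0, -2.0], [0.5, 1.0]]
W2 = [[2.0, 1.0], [-1.0, 3.0]]
x = [1.0, 2.0]

# Two layers with no activation in between ...
two_linear_layers = matvec(W2, matvec(W1, x))
# ... give the same output as one layer with the merged weights W2 @ W1.
single_layer = matvec(matmul(W2, W1), x)
# With a nonlinearity between the layers, the result no longer collapses.
with_activation = matvec(W2, relu(matvec(W1, x)))
```

Here `two_linear_layers` and `single_layer` agree, while `with_activation` differs, which is why the nonlinearity is needed for depth to add expressive power.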
The SiLU activation function is also called the Swish activation function. Swish was proposed by the Google team in recent years and is built from an earlier activation function. Its expression is $f(x) = x \cdot \sigma(x)$, where $\sigma(x) = 1/(1 + e^{-x})$ is the Sigmoid. Because saturation of the Sigmoid tends to make the gradient vanish, Swish borrows from the behaviour of ReLU: when $x \to +\infty$, $f(x) \to x$, and when $x \to -\infty$, $f(x) \to 0$. The general trend of the function is similar to ReLU but more complicated [12]. Adding a hyperparameter lets the function show more characteristics: the expression becomes $f(x) = x \cdot \sigma(\beta x)$, where $\beta$ can be a constant or a trainable parameter.
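A minimal sketch of Swish with its hyperparameter follows, written in plain Python rather than a deep learning framework. The parameter name `beta` and the helper `sigmoid` are choices made for this illustration; `beta = 1` corresponds to SiLU.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x). With beta = 1 this is SiLU."""
    return x * sigmoid(beta * x)
```

For large positive inputs `swish(x)` is close to `x`, for large negative inputs it is close to zero, and increasing `beta` pushes the curve toward ReLU.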
Figure 1. Summary diagram of activation function.
Although the Swish nonlinearity improves accuracy, its cost is non-zero in an embedded environment, because the Sigmoid function is much more expensive to compute on a mobile device. The authors of MobileNetV3 therefore replaced ReLU6 with Hardswish and the Sigmoid layer in the SE block with Hardsigmoid. However, ReLU6 was replaced with Hardswish only in the latter half of the network, because the authors found that Swish shows its advantages only in deeper layers.
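The hard variants avoid the exponential by using a piecewise-linear approximation of the Sigmoid. The sketch below follows the commonly cited MobileNetV3 definitions, Hardsigmoid(x) = ReLU6(x + 3) / 6 and Hardswish(x) = x · Hardsigmoid(x); the function names are chosen for this illustration.

```python
def relu6(x):
    """ReLU clipped at 6."""
    return min(max(x, 0.0), 6.0)


def hardsigmoid(x):
    """Piecewise-linear approximation of the Sigmoid: ReLU6(x + 3) / 6."""
    return relu6(x + 3.0) / 6.0


def hardswish(x):
    """Hardswish: x * Hardsigmoid(x). No exponential is evaluated."""
    return x * hardsigmoid(x)
```

The function is exactly zero for inputs at or below -3 and exactly the identity for inputs at or above 3, with a smooth-looking transition in between that needs only additions, comparisons, and one multiplication.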
The Mish activation function is a newer activation function proposed by Diganta Misra et al. It surpasses ReLU in some tasks. From YoLov1 to YoLov5, accuracy has kept improving; for example, YoLov4 achieves a marked improvement in mAP over YoLov3, and one of the reasons is the replacement of LeakyReLU with Mish. In the paper, Diganta Misra notes that the widely used activation functions are ReLU, Tanh, Sigmoid, Leaky ReLU, and Swish. For instance, with Squeeze Excite Net-18 on CIFAR 100 classification, the network with Mish increased Top-1 test accuracy by 0.494% and 1.671% compared with the same network using Swish and ReLU, respectively. This shows that Mish has certain advantages over the others. The formula of the Mish activation function is $f(x) = x \cdot \tanh(\mathrm{softplus}(x))$. Softplus was proposed by Yoshua Bengio. In the paper [13], $\mathrm{softplus}(x)$ is $\ln(1 + e^{x})$, and its value range is $(0, +\infty)$. Softplus can be regarded as a smooth version of ReLU. The Mish activation function is plotted in Figure 2.
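Mish and Softplus as described above can be sketched in a few lines of plain Python. The overflow-safe form of `softplus` is an implementation choice for this illustration, not something stated in the text.

```python
import math


def softplus(x):
    """Softplus ln(1 + e^x), rearranged so that exp never overflows."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))
```

Like Swish, Mish is close to the identity for large positive inputs and takes small negative values for negative inputs.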
The MemoryEfficientMish activation function is a memory-efficient implementation of Mish that uses the first derivative of Mish explicitly; its formulas are
(1)
(2)
(3)
Figure 2. Image of Mish activation function.
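Because MemoryEfficientMish relies on the first derivative of Mish, a sketch that checks a closed form of that derivative may be useful. The closed form used here, tanh(softplus(x)) + x · σ(x) · (1 − tanh²(softplus(x))), is obtained by the chain rule and is an assumption about what such an implementation computes, not a transcription of the paper's equations.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softplus(x):
    """Softplus ln(1 + e^x) in an overflow-safe form."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def mish_grad(x):
    """Chain-rule derivative of Mish; uses d/dx softplus(x) = sigmoid(x)."""
    t = math.tanh(softplus(x))
    return t + x * sigmoid(x) * (1.0 - t * t)


def numeric_grad(f, x, h=1e-5):
    """Central finite difference, used only to cross-check mish_grad."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```

Comparing `mish_grad(x)` with `numeric_grad(mish, x)` at a few points is a quick sanity check before using a hand-written backward pass.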
In this paper, the activation function is the main research object. Two neural network variants of YoLov5 serve as the experimental basis for testing different activation functions and exploring how they affect the accuracy of cell recognition.
2. Methods
2.1. Mish_PLUS Activation Function
Given the excellent performance of the Mish activation function in YoLo, this article extends Mish to obtain the Mish_PLUS activation function and applies it to the YoLov5s and YoLov5m neural network structures. The formula of the Mish_PLUS activation function is given in (6). In the Mish_PLUS activation function
, the original Mish formula is changed, and
is addition, where
means
. The image of
is Figure 3.
(4)
(5)
(6)
It can be seen from the comparison of Figure 2 and Figure 3 that, as x approaches zero, the Mish activation function approaches zero from negative values, whereas the Mish_PLUS activation function approaches zero from positive values.
2.2. Sigmoid_Tanh Activation Function
Figure 3. Image of Mish_PLUS activation function.
At present, dozens of activation functions are widely used in neural network structures. Different activation functions play different roles in neural networks, and each has its own advantages and disadvantages. Commonly used activation functions include Sigmoid, Tanh, ReLU [14], Leaky ReLU [15], ELU [16], SELUs [17], GELUs [18], PreLU [19], MaxOut [20], RReLU [21], etc. Some activation functions are variants obtained by extending an original activation function. This article mainly studies the classic Sigmoid and Tanh.
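The Sigmoid formula and derivative are:
$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{7}$$
$$\sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr) \tag{8}$$
The Tanh formula and derivative are:
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{9}$$
$$\tanh'(x) = 1 - \tanh^{2}(x) \tag{10}$$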
Formulas (7)-(10) follow references [7] [8]. The Sigmoid function was proposed by Jun Han et al. It is a logistic function, given in formula (7). Because its curve is S-shaped, it is called an S-curve, as shown in Figure 4. It is used in neural networks for the output of hidden-layer neurons, and its value range is (0, 1). Since it maps any real number to the interval (0, 1), it suits binary classification, and it works well when the differences between image features are complex or not especially large. The advantage of Sigmoid as an activation function is that its curve is relatively smooth and easy to differentiate, as shown in formula (8). The disadvantage is that it is computationally expensive: finding the error gradient during backpropagation involves division. In addition, the gradient easily vanishes during backpropagation, so a deep network may fail to train. The hyperbolic tangent function, called Tanh, is one of the hyperbolic functions. In a neural network, Tanh is used as the activation function of a neuron to transmit information, and it is nonlinear. Its formula is shown in (9) and its derivative in (10). Tanh maps its input to the interval (−1, 1): large negative inputs approach −1 and large positive inputs approach 1. Tanh solves the problem that the output of Sigmoid is not zero-centered, but it still suffers from saturation (Figure 5).
Figure 4. Image of Sigmoid activation function.
Figure 5. Image of Tanh activation function.
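The saturation and vanishing-gradient behaviour of Sigmoid and Tanh can be made concrete with a short sketch using the standard derivative identities σ'(x) = σ(x)(1 − σ(x)) and tanh'(x) = 1 − tanh²(x). The ten-layer product at the end is a simplified, hypothetical illustration that ignores the weights.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_grad(x):
    """Derivative of the Sigmoid; its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of Tanh; its maximum is 1 at x = 0."""
    t = math.tanh(x)
    return 1.0 - t * t


# Best case for a chain of ten Sigmoid layers (weights ignored):
# even at the peak of the derivative the gradient shrinks geometrically.
best_case_chain = sigmoid_grad(0.0) ** 10
```

Both derivatives are close to zero for inputs of large magnitude, which is the saturation problem mentioned above; the Sigmoid additionally caps the per-layer factor at 0.25.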
Given the characteristics of the Sigmoid and Tanh activation functions, this article combines them for experimentation. That is, the two functions are multiplied to obtain a new activation function, as shown in formula (11). Its values lie within the interval (−1, 1), and its curve is shown in Figure 6. We can see from Figure 6 that as $x \to +\infty$, $f(x) \to 1$; as $x \to -\infty$, $f(x) \to 0$.
$$f(x) = \sigma(x) \cdot \tanh(x) = \frac{1}{1 + e^{-x}} \cdot \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{11}$$
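The product of Sigmoid and Tanh described above can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' code from Table 1; the function name `sigmoid_tanh` is chosen here to mirror the name used in the text.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_tanh(x):
    """Sigmoid_Tanh: the element-wise product sigmoid(x) * tanh(x)."""
    return sigmoid(x) * math.tanh(x)
```

In a PyTorch model, the same product could presumably be written as `torch.sigmoid(x) * torch.tanh(x)` inside a module's `forward` method; that form is an assumption, since the original listing is not reproduced here.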
3. Results
This paper, on the one hand, extends the Mish activation function, inspired by its performance; on the other hand, it selects the Sigmoid and Tanh activation functions from the widely used activation functions as research objects and multiplies them to obtain a new activation function for the experiments. The code of the Mish_PLUS and Sigmoid_Tanh activation functions is given in Table 1.
This paper uses YoLov5s and YoLov5m as the neural network structures, and the task is the recognition of white blood cells, red blood cells, and platelets. The dataset consists of blood cell photos, originally open-sourced at https://github.com/cosmicad/dataset. The dataset used here was exported via roboflow.ai on February 23, 2021. It contains 874 images across three classes: WBC (white blood cells), RBC (red blood cells), and Platelets. The following sections present the results obtained with the different activation functions.
Figure 6. Image of Sigmoid_Tanh activation function.
Table 1. Mish_PLUS activation function and Sigmoid_Tanh activation function code. (a) Mish_PLUS activation function code; (b) Sigmoid_Tanh activation function code.
Table 2. Parameter results based on the structure of YoLov5s neural network.
Table 2 shows the results of different activation functions under the neural network structure YoLov5s. It can be seen from Table 2 that, with YoLov5s as the network structure, the neural network has a total of 283 layers, and the activation functions are SiLU, Hardswish, Mish, MemoryEfficientMish, Mish_PLUS, and Sigmoid_Tanh. Each model has a total of 7,068,936 parameters and requires 16.4 GFLOPs. We compare the results of the different activation functions under YoLov5s.
We use the different activation functions to identify cell types, and the recall and precision are represented by curves of different colors. Each activation function is trained for 200 epochs. Background on recall and precision can be found in [22]. Figure 7 and Figure 8 show the recall and precision curves under the different activation functions. It can be seen from Figure 7 that the green curve, for the Hardswish activation function, shows the worst result, and the curve is unstable. Epoch 128 marks a turning point: the recall increases before epoch 128, then drops sharply and changes little afterwards. The blue curve represents the Sigmoid_Tanh activation function, which gives the best result. The curve maintains a steady upward trend. The recall fluctuates at around epoch 140 but continues to rise afterwards. Overall, the recall is better than under the other five activation functions. It can be seen from Figure 8 that the green curve, for Hardswish, again shows the worst result and is unstable. Epoch 128 marks a turning point in the precision: the precision increases before epoch 128, then drops and fluctuates afterwards. The blue curve, for Sigmoid_Tanh, gives the best result and maintains a steady upward trend. The precision fluctuates at around epoch 140 but continues to rise afterwards. Overall, the precision is better than under the other five activation functions.
Figure 7. The recall curve changes under different activation functions based on YoLov5s.
Figure 8. Precision curve changes under different activation functions based on YoLov5s.
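For readers less familiar with the two metrics plotted in these curves, the standard definitions are precision = TP / (TP + FP) and recall = TP / (TP + FN). The sketch below uses hypothetical detection counts chosen only for illustration; they are not results from this paper.

```python
def precision(tp, fp):
    """Fraction of predicted detections that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of ground-truth objects that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts for one class at one confidence threshold.
true_positives, false_positives, false_negatives = 80, 20, 10
p = precision(true_positives, false_positives)
r = recall(true_positives, false_negatives)
```

A model that misses many cells has low recall even if every detection it makes is correct, which is why the two curves are reported separately.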
Figure 9 shows the mean average precision (mAP) of the different activation functions for the different cell types under the YoLov5s neural network structure. It can be seen from Figure 9 that the blue curve, for the Sigmoid_Tanh activation function, gives the best result, with the highest mAP across the cell types. The second best is the orange curve, for the Mish_PLUS activation function. The green curve, for the Hardswish activation function, gives a poor result, with a low mAP across the cell types.
Figure 9. Comparison of mean average precision (mAP) under different activation functions based on YoLov5s.
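As a reminder of how mAP is typically formed, the sketch below computes an average precision (AP) per class as the area under a precision-recall curve and then averages over classes. The all-point interpolation used here is one common convention and is an assumption; the exact variant used by the training tool is not stated in the text, and the per-class values are hypothetical.

```python
def average_precision(recalls, precisions):
    """Area under the precision-recall curve with a monotone precision envelope.

    `recalls` must be sorted in increasing order, with matching `precisions`.
    """
    r = [0.0] + list(recalls) + [1.0]
    p = [1.0] + list(precisions) + [0.0]
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall points.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_per_class):
    """mAP: the unweighted mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical per-class AP values, for illustration only.
example_map = mean_average_precision({"WBC": 1.0, "RBC": 0.75, "Platelets": 0.5})
```

Because mAP is an unweighted mean over classes, a class with few instances, such as platelets, influences the score as much as a frequent class.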
3.1. Cell Recognition Results Based on YoLov5s
We conduct experiments on the identification of cell types under the different activation functions. Figure 10 shows the cell recognition results based on the YoLov5s neural network structure. It can be seen from the sparsity of the cell-type detections in Figure 10 that recognition under the Hardswish activation function is the worst and recognition under the Sigmoid_Tanh activation function is the best. Recognition under the Mish_PLUS activation function also achieves good results, but it is not as good as recognition under the MemoryEfficientMish activation function.
Table 3 shows the parameter results of different activation functions under the neural network structure YoLov5m. It can be seen from Table 3 that, with YoLov5m as the network structure, the neural network has a total of 391 layers, and the activation functions are SiLU, Hardswish, Mish, MemoryEfficientMish, Mish_PLUS, and Sigmoid_Tanh. Each model has 21,064,488 parameters and requires 50.4 GFLOPs. We compare the results of the different activation functions under YoLov5m.
Figure 10. Cell recognition results under different activation functions based on YoLov5s.
Table 3. Parameter results based on the structure of YoLov5m neural network.
For the recognition of the different cell types, each activation function was trained for 200 epochs to obtain the recall and precision. Figure 11 and Figure 12 show the recall and precision curves under the different activation functions. It can be seen from Figure 11 that the red curve, for the Mish activation function, gives the worst and least stable result. The green curve, for the Sigmoid_Tanh activation function, gives the best result and maintains a steady upward trend. The recall fluctuates strongly at around epoch 90 but continues to rise afterwards. In general, the recall is better than under the other five activation functions. It can be seen from Figure 12 that the red curve, for the Mish activation function, again gives the worst result and is unstable. The green curve, for the Sigmoid_Tanh activation function, gives the best result: it maintains a steady upward trend, and its precision is generally better than under the other five activation functions.
Figure 11. The change of recall curve under different activation functions based on YoLov5m.
Figure 12. Precision curve changes under different activation functions based on YoLov5m.
Figure 13 shows the mAP of the different activation functions for the different cell types under the YoLov5m neural network structure. It can be seen from Figure 13 that the green curve, for the Sigmoid_Tanh activation function, gives the best result, with the highest mAP. The red curve, for the Mish activation function, gives a poor result, with a low mAP across the cell types.
Figure 13. Comparison of mAP under different activation functions based on YoLov5m.
3.2. Cell Recognition Results Based on YoLov5m
Figure 14 shows the cell recognition results under different activation functions based on the YoLov5m neural network structure. It can be seen from the sparsity of the cell-type detections in Figure 14 that recognition under the Mish activation function is the worst and recognition under the Sigmoid_Tanh activation function is the best. Recognition under the Mish_PLUS activation function also achieves good results, but it is not as good as recognition under the Sigmoid_Tanh activation function.
Figure 14. Cell recognition results under different activation functions based on YoLov5m.
4. Discussion
As a necessary component of the neural network structure, the activation function directly affects the results on certain tasks. In recent years, research on activation functions has intensified. Common activation functions include Sigmoid, Tanh, ReLU, LReLU, PReLU, and Swish. Activation functions are also widely applied. For example, they are used in mobile robots as part of the algorithms for recognition and scene understanding; in drone technology, where the drone must cope with complex environments, the activation function in machine learning algorithms plays an important role in scene understanding.
This paper mainly studies the activation function, but it has limitations with respect to cell recognition, mainly as follows:
1) This paper takes cells as the recognition objects and tests recognition performance with different activation functions. With other recognition objects, the desired effect may not be achieved.
2) There are many common activation functions. This paper compares and tests the extended activation functions against only four types of activation functions and does not compare them with other common ones.
3) Due to the limitations of the experimental equipment, only simple training was carried out, which constrains the training results.
4) The research in this paper focuses more on recognition; in practical application, the experiments need to be carried out under a high-power microscope.
Finding an efficient and suitable activation function remains a subject for future research.
5. Conclusion
This paper focuses on finding a suitable activation function. Using YoLov5s and YoLov5m as the basic neural network structures, the activation functions SiLU, Hardswish, MemoryEfficientMish, Mish, Mish_PLUS, and Sigmoid_Tanh were tested. Mish_PLUS is an improved form of Mish, and Sigmoid_Tanh combines the Sigmoid and Tanh activation functions into a new activation function. The dataset used in this article is the BCCD.v4-416x416 dataset, and the task is identifying red blood cells, white blood cells, and platelets. The test results show that the Sigmoid_Tanh activation function proposed in this paper plays a positive role in cell recognition.