1. Introduction
The SIR model [1] - [6] is a simple compartmental model of infectious diseases developed by Kermack and McKendrick [1] in 1927. It considers three compartments:
S, the set of susceptible individuals;
I, the set of the infectious (or currently positive) individuals, who have been infected and are capable of infecting susceptible individuals;
R, the set of the removed individuals, namely people who recovered (healed, H subset) from the disease or deceased due to the disease (D subset), the former assumed to remain immune afterwards.
The SIR model does not consider at all the sub-compartments H and D; instead the SIRD model simply assumes them to constitute a partition of R, fractionally fixed over time, so that, actually compared to the SIR model, nothing substantially changes in the dynamics of the epidemic progression.
It is assumed that births and non-epidemic-related deaths can be neglected in the epidemic timescale and that the incubation period is negligible, too. Indicating with letters not in bold the cardinality of each of the compartments, it is taken
(1)
where
is an initial time, usually with
.
The model introduces two parameters, β and γ, having dimension of a frequency. Saying t the time variable, γ is defined as the fractional removal rate
of individuals from the infectious compartment. Since SI is understood as the number of possible contacts among the infectious and the susceptible individuals, β/N is defined as the fractional decrease rate
of the number of individuals in the susceptible compartment: it expresses therefore the fractional increment rate of the number of infectious individuals, that is the increment rate of the infectious compartment I, after subtraction of the rate of people entering the removed compartment R.
Usually one introduces the following non-dimensional variable and new functions:
(2)
called basic reproduction ratio. Then the basic equations given by Kermack and McKendrick [1] are written as
(3a)
(3b)
(3c)
with
(4)
and
(5)
Using Equation (3c) in Equation (3a) and formally integrating, one gets
; using this and Equation (3c) again, from Equation (3b) one easily finds
; then from Equation (3c) she/he will obtain
(6)
This is a transcendental equation, whose solutions one cannot give explicitly in closed analytic form by elementary functions. In their original paper Kermack and McKendrick themselves [1] gave approximate solutions, however without any exhaustive discussion of applicability for various values of the basic reproduction ratio. Quite recently various authors have approached the problem in different ways, but with the same incompleteness [7] [8] [9] [10] [11]. In the sequel, on the basis of simple, analytic geometry considerations, a novel method is introduced, producing approximate but accurate solutions, given explicitly, piece-wisely, from generalized logistic function (see [12] for a description of the origin of the logistic function and its adoption in bio-assay); due attention is paid for the method to be robust over the whole range of possible known values of
, from just above 1 as for influenza, to 1.4 - 3.9 as for Covid-19, to 3 - 5 as for SARS, to 5 - 7 as for polio, to 10 - 12 as for varicella, to 12 - 18 as for measles (see for instance [13] and references therein).
2. Getting the Key Differential Equation
For the epidemic to spread, the increment rate of the newly infectious individuals must be higher than the increment rate of the newly removed individuals. Dividing Equation (3a) by Equation (3c), one finds that it must be
(7)
As a matter of fact this condition implies that
increases over time due to Equation (3b). The functions
,
and
are all defined positive and less or equal to 1; consequently it must be
for the epidemic to spread and
turns to be monotonic decreasing according to Equation (3a), while
monotonic increasing according to Equation (3c). It follows that the function
starts growing due to (7), reaching necessarily a maximum at a time
such that
(8)
then asymptotically decreasing to zero. This implies that the bounded monotonically increasing function
must exhibit a point of inflection at
, after which it bends, increasing slower and slower, finally flattening to some limiting value
(9)
So one must have
(10)
thus getting a transcendental equation for
.
Conveniently for the following developments, a new function is introduced, namely
(11)
in terms of which Equation (6) is re-written as
(12a)
(12b)
(12c)
Clearly
(13)
must be solution of the equation
(14)
for Equation (10) and the fact that
so that
The functional
is null in
, but
cannot be 1 because
and
is not null (see Equation (11)); thus
must be solution of the equation
(15)
which is nothing but Equation (10), as can be easily verified. Equation (15) is transcendental and is to be solved numerically; the interval
is the range of
as x runs from
to
.
The second derivative of F, namely
(16)
starts and remains negative from
, until it reaches the point of inflection
, given by
(17)
then it becomes positive: thus
starts and remains concave until
; then it becomes convex. Of course, in an interval around its inflection point,
is nearly straight. Figure 1 shows how
and
vary as a function of
: for
one has
and consequently
is always concave in the domain
; otherwise it changes from concave to convex
Figure 1. Point of inection and
as a function of
.
after
. It is worth noting that as
increases,
(together with
) approaches more and more the limiting value 1, namely a region where the log term in
becomes important: this fact is relevant here because such log term, with its argument approaching zero, rises complications in searching for an effective approximation.
3. Approximating the Key Differential Equation
The idea is to approximate
by few stretches of up to second order polynomials, joining continuously each other with the first derivative. Then in each stretch the obtained approximate differential equation becomes analytically and explicitly solvable by a generalized logistic function. For
, it is taken
(18)
so that
(19)
Figure 2 shows on the left, in red, this
segment against
(black curve) for
and (consequently)
, extending to its maximum point, which is rather close to the maximum of
. Clearly
is a parabola with axis along the ordinate line, so that the maximum is its vertex.
Denoting by
and
the roots of
, one can write
(20a)
(20b)
with
(21)
The vertex is located in
(22)
A new parabola is chosen as the second approximation stretch, tangent to
on its descending side, with axis along the ordinates and the vertex coincident with that of the first segment
:
(23a)
(23b)
(23c)
(23d)
(23e)
Equations (23b) and (23c) impose that the two stretches have in common their vertexes, located in
; the system of the last two equations states the conditions for
to be tangent to
. It is convenient expressing
, appearing in Equation (23c), in terms of the unknown tangency point
using Equation (23e), so that consequently one solves Equation (23d) for
.
Namely, introducing
(24a)
(24b)
due to Equation (23c) one can write
(25)
while from Equation (23e) and Equation (23d) one has
(26a)
(26b)
Using this expression for
in Equation (26a), one obtains a transcendental ordinary equation for
, to be solved numerically:
(27)
Using
so obtained, one gets
from Equation (26b) and finally
and
via Equation (25) and Equation (23b). In Figure 2, on the left, the second segment for
is shown in blue, extending from
to the point of tangency of the successive approximation segment still to be chosen.
Figure 2. Examples of two cases, with three approximation stretches on the left (red, blue, green) and four approximation stretches on the right (red, blue, green, magenta).
With reference to the discussion before the end of Section 2, it should be noted that
remains concave up to
when
, while it happens that the root
of
(see Figure 3) remains very close to
: this suggests in that range of
values replacing the above
by a different arc of parabola
, keeping its vertex in common with
as
does, but just ending in
, thus imposing the constraint
instead of the tangency to
.
Then for
(28a)
(28b)
(28c)
(28d)
(28e)
For
is almost always concave, ending roughly as a straight line when approaching
. In this range of α’s one keeps
, but completes the approximation through a new parabola, requiring it to be tangent to
and to reach
along the tangent to
in
; an alternative is the ray originating in
, tangent to
. The latter is settled by
(29a)
(29b)
Figure 3.
and
as functions of
.
where
is the discriminant of the second order algebraic equation
, set to zero to assure
to be tangent to
. The appropriate solution for u is
(30)
The problem with this approximation is that, looking for instance at the function
obtained from
, it gets unacceptably overestimated in the region where it bends to reach the asymptotic value as
: this is because
necessarily remains below
due to the concavity of the latter.
The quadratic alternative is defined by
(31a)
(31b)
(31c)
where “prime” stands for derivative and
is the discriminant of the second order algebraic equation
, set to zero so to assure
to be tangent to
. In this case, however, with respect to using
, one has the opposite effect on
, because the given choice for
forces
to stay somewhat above
.
The solution is to keep the quadratic alternative, but replacing the previous value of
by a compromise one, defined through
(32)
Then the parameter
in (31a) is set by means of the condition (31c):
(33a)
(33b)
(33c)
where
is the tangency point of
to
.
So, for
the third and last approximation segment is given by (31a), with
replaced by
, extending from
to
.
For
the convexity trait of
, following the almost straight stretch around
, gets more and more included in the domain
, because
increases with
. Then, the solution adopted is to introduce a linear segment
parallel to the tangent in
to
and tangent to
in a point that will be denoted
; this linear segment will be continued by a new parabola
, which is similar to
, thus ending in
, but tangent to
. Namely
(34a)
(34b)
(34c)
giving
(35a)
(35b)
Then the
approximation stretch, constrained to end in
and to be tangent to
in a point
chosen by trial and error optimization, is given by:
(36a)
(36b)
(36c)
giving
(37a)
(37b)
4. The Approximate Analytic Solution
For each of the above approximation segments a differential equations is defined of the type
(38)
where
is one of
or
or
, with given
and β parameters (or βand γ) and initial conditions. For
, from the definition in Equation (11), the initial condition is
(
without loss of generality), while for each of the remaining approximation segments it is given by the value of the respective preceding segment at the junction point. Since
is at most a second order polynomial, Equation (38) is indeed quite trivially solved, giving a generalized logistic function.
For
:
(39a)
(39b)
For
, thus
:
(40a)
(40b)
For
, thus
:
(41a)
(41b)
For
, thus
:
(42a)
(42b)
(42c)
For
, thus
(see (34) and (35)):
(43a)
(43b)
Finally for
thus
:
(44a)
(44b)
(44c)
It is convenient to introduce
(45)
Then, from Equation (11) one has
(46)
so that
On the other hand Equation (12) implies
and consequently (see Equation (46))
(47)
Finally, of course, due to (4):
(48)
In the case of the SIRD model one defines
(49a)
(49b)
Figure 4 shows a comparison between the numerical “exact” solutions of the SIRD model and the approximate solutions of this work with
and
.
1
? then
: otherwise
.
Imitating a formal expression typical of computing languages1, the result for w can be summarized as follows:
Figure 4. Comparison of “exact” numerical solutions and approximate solutions for the SIRD model.
for
(50a)
(50b)
for
:
(50c)
for
:
(50d)
Similarly for
,
,
and
.
In practice one does:
· solve numerically the transcendental ordinary Equation (15) to get
;
· use Equation (21) and Equation (39) to get
as in Equation (39);
· for
use Equation (22), Equations (28d) and (28e) to get
as in Equation (40);
· for
use Equation (27), Equation (26b), Equation (22), Equation (24b), Equation (25) and Equation (41) to get
as in Equation (41);
· for
use Equation (32), Equation (33a) and Equation (33b), Equation (33c) and finally Equation (42) to get
as in Equation (42);
· for
use Equation (34b), Equation (35) and Equation (43) to get
as in Equation (43);
· for
use Equation (17), Equation (36b), Equation (37) and Equation (44) to get
as in Equation (44);
· eventually use Equation (46), Equation (47), Equation (48), Equation (49).
The four plots in Figure 4 are produced by a C++ code implementing the above steps, then sending the produced analytic function to the graphing utility “gnuplot”: the C++ code could be re-used easily to fit-study data.
5. A Useful Feature
The equation of the first approximation segment can be re-written as
(51)
Using the explicit solution Equation (39), one has
(52)
and consequently
(53)
Typically
(54)
but anyway with
greater than some
’s, in the end one can write
(55)
Analogous results hold for all the approximation stretches in the different
intervals as summarized in Equation (50); for instance, with
greater enough then
, one has
(56)
These piecewise linear behaviors can be seen in Figure 5 for
. The plot on the left shows the numerical solution of the exact equation, compared with the corresponding approximate analytic solution: it is worth recalling (see Equation (47)) that
, so that w is directly related to the data. The plot on the right shows that the function
of the removed individuals exhibits an analogous behavior: since in the SIRD model the
function is a fraction of
, then one has the analogous behavior for the function of the deceased individuals. Figure 6 refers to the data of the deceased individuals during the winter-spring 2020 first wave of Covid-19 in Italy: it remarkably confirms this model feature. One important point here is that the slopes of the straight segments, that are inversely proportional to the related time constants
, are completely determined by the parameters
and
(besides the initial conditions) and so is the angle between such straight segments: consequently one can compare that angle with the theoretically predicted one and argue about the effects of social measures to reduce the pandemic, of course within the trustworthiness of the model.
Figure 5. A couple of checks of the particular piece-wise linearity disclosed by the approximation approach of the present work against the numerical solution of the exact equations.
Figure 6. The special piece-wise linearity disclosed in the present work as exhibited by the real data for Covid-19 originated deaths during the winter-spring 2020 wave in Italy.
6. Conclusions
In this paper the equations of the SIR(D) epidemiological model are replaced by approximate ones, whose solutions are totally defined uniquely by the basic reproduction ratio α and the fractional removal rate γ (alternatively by
). These solutions are continuous (with the first derivative) chains of two or three or four generalized logistic related functions, the number depending on the value of
only; they are summarized in Equation (50) and easily implementable and usable, for instance, to fit-study data.
The analytic geometry based approximation method used here is novel and set stably at least over the range of the measured values of the basic reproduction ratio for several known pandemic diseases. A useful feature of the SIR(D) model, never disclosed before, is also given.