Predicting Parotid Dose Changes in Head-and-Neck Radiotherapy Using Machine Learning: Leveraging Anatomical Variations ()
1. Introduction
Clinically significant changes in patients’ anatomy during fractionated radiotherapy, e.g., body shrinkage due to weight loss or target/organ shape and position variations, cause the actual dose delivered to patients to varying considerably different from the originally planned. Adaptive radiotherapy (ART) is a method to account for those changes and adjust the dose [1]-[8]. However, the process of ART is time-consuming and may not benefit those patients whose anatomy has limited variations [9]-[12]. Therefore, a quantitative method that considers anatomical changes to predict target/organ dosimetric changes for initiating adaptive planning decisions is highly desirable. This study focuses on the patients’ anatomical changes in predicting parotid dosimetric changes for head-and-neck cancer since xerostomia caused by radiation side effects to the parotid is a major cause of reduced quality of life after radiotherapy [13]-[15].
This study hypothesizes that the following three anatomical changes during treatment sessions cause changes in parotid dosimetry: (1) parotid volumetric shrinkage, (2) distance changes from the parotid to the PTV, e.g., displacement of the parotid into (or away from) the high dose region, and (3) length of beam path changes between the parotid and skin, caused by patient’s weight loss near the neck. Anatomical changes in (1) is well-documented [12] [16]-[20]. For example, our previous study involving 185 CT scans (including a planning CT and several weekly follow-up CTs) of 26 patients shows that on average the parotid has a volumetric reduction of ~10 cc (or ~30% in terms of the fractional volume) by the 7th week [16] [18]. However, the anatomical changes in hypotheses (2) and (3) have not been studied due to a lack of geometrical features for quantifying the distance and beam path length. Therefore, this study has two aims:
To propose a set of geometrical features for quantifying patients’ anatomical changes relevant to hypotheses (2) and (3).
To use a machine-learning approach to determine a feature subset producing maximal predictive power to estimate parotid dosimetric changes.
2. Methods
2.1. Geometrical Feature Quantification
Signed Euclidean distance [21]-[23] between two objects (main object and voxelized, secondary object as shown in Figure 1) is applied to quantify the distance and beam path length in hypotheses (2) and (3). As illustrated in Figure 1, the shortest Euclidean distance d(i, j) from voxel (i, j) of the secondary object to the main object’s surface is calculated: if voxel (i, j) is inside the main object, d(i, j) < 0; if voxel (i, j) is on the surface of the main object, d(i, j) = 0; if voxel (i, j) is outside the main object, d(i, j) > 0. The distance calculation is repeated for every secondary object’s voxel. Finally, a cumulative histogram v = Hist (d) relating the shortest Euclidean distance d and the secondary object’s absolute volume v is formed. v is then normalized by the absolute volume of the secondary object as the fractional volume in the range of [0, 100%]. Hereinafter v refers to the fractional volume.
Figure 1. d(i, j) is the shortest Euclidean distance from the secondary object voxel (i, j) to the main object’s surface. v = Hist (d) is a cumulative histogram in relating distance d and fractional volume v; dv = Hist−1 (v) is the inverse function of v = Hist (d), representing distance d at fractional volume v.
Next, we define dv = Hist−1 (v) as the inverse function of v = Hist (d), where dv represents distance d at fractional volume v. For example, d0, d50, and d100 represent the minimal, median and maximal distance from the secondary object to the main object’s surface;
represents the mean distance from the secondary object to the main object’s surface, where N is the total number of voxels inside the secondary object. Finally, dv and dmean are applied to quantify the distance and beam path length in hypotheses (2) and (3), where the parotid is the secondary object and the PTV and skin are the main objects.
65 geometrical features categorized into four sets in Table 1 are defined a priori for quantifying patients’ anatomical changes and predicting parotid dosimetric changes:
SET1 is the parotid absolute volume in cc; hypothesis (1).
SET2 is the distance between the parotid and PTV; hypothesis (2).
SET3 is the distance between the parotid and skin near the neck; hypothesis (3).
SET4 is the distance of the parotid to the two bony landmarks in Figure 2: the dens of C2 (Cervical Vertebra; Bone1) and tip of the basilar part of the occipital bone (Bone2).
In Table 1, distance features dmean and dv are calculated from dv = Hist−1 (v) (inverse function of v = Hist(d) defined in Figure 1), where the parotid is the secondary object, where PTV, skin and bony landmarks are the main objects.
Table 1. Proposed 65 geometrical features (4 sets) quantifying patients’ anatomical changes.
SET1: parotid volume |
SET2: distance between the parotid and PTV (cm) |
SET3: distance between the parotid and skin (cm) |
SET4: distance between the parotid and two bony landmarks* (cm) |
volume in cc |
dmean and {dv}# |
dmean and {dv}# |
dmean and {dv}# |
Feature No.: 1 |
Feature No.: 2 - 17 |
Feature No.: 18 - 33 |
Feature No.: 34 - 65 |
#v as the fractional volume in %: 0, 0.5, 1.0, 2.5, 5, 20, 35, 50, 65, 80, 95, 97.5, 99, 99.5 and 100. *: surrogates for the PTV in SET2; Bone1 landmark: the dens of C2; Bone2 landmark: the tip of the basilar part of the occipital bone. See Figure 2 for the details.
Figure 2. Two bony landmarks (Bone1 and Bone2; coronal view) as surrogates of the PTV in SET 2 for quantifying the position of the parotid relevant to the high dose region.
As discussed in Section 2.2, the two introduced bony landmarks in SET4 are used as surrogates for the PTV in SET2 to determine the parotid’s position with respect to the high-dose region. Figure 3 provides a simplified geometrical illustration of the four geometric sets.
2.2. Study Design
Eighteen patients with head-and-neck cancer previously treated with ART are retrospectively enrolled. Each patient has two CTs: initial planning CT (initialCT) and adaptive planning CT (adaptiveCT), which are acquired in the middle of the treatment. Physicians contour respective targets and organs in both CTs. Each patient has two VMAT plans: Plan1 is the initial plan based on initialCT with its associated contours; Plan2 is the forward-dose calculation plan, applying Plan1’s beam configuration on the adaptiveCT with its associated contours. In plan evaluation, our institution uses the mean dose to the parotid as the primary dosimetric metric. Therefore, this study focuses on predicting parotid mean dose changes.
Figure 3. Simplified illustration of the 4-SET geometrical features in predicting parotid dosimetric changes. SET 1: the parotid absolute volume in cc; SET 2: the distance of the parotid to the PTV; SET 3: the distance of the parotid to the skin near the neck area; SET 4: the distance of the parotid to the two bony landmarks (Bone1 and Bone2) shown in Figure 2. Bone1: the dens of C2; Bone2: the tip of the basilar part of the occipital bone.
The two bony landmarks introduced in SET4 of Table 1 are used as surrogates of the PTV in SET2 for the following reason: in this retrospective study, the PTV contours in adaptiveCT (or Plan2) are available since all enrolled patients have been treated using ART. However, in clinical application, the PTV contours in adaptiveCT may not be available since target contouring is time-consuming and not always completed until a decision to use ART is made [11]. Since the two bony landmarks in SET4 are rigid, both provide localization reference shared by initialCT and adaptiveCT. Additionally, they are easily contoured as only two slices are needed in the transverse plane. Thus, the introduced bony landmarks allow us to investigate whether the parotid is being pushed into (or away from) the high dose region without using PTV contours in initialCT and adaptiveCT.
For each enrolled patient, corresponding DICOM RT-dose, RT-structure and CTs of Plan1 and Plan2 are extracted from the treatment planning system. An in-house developed Matlab (The MathWorks, Inc., Natick, MA, US) program is used to calculate Table 1’s described features from the DICOM data. Below, we define the anatomical feature changes between the initialCT (Plan1) and adaptiveCT (Plan2) as:
Anatomical ratio <feature> = (Plan1 <feature> − Plan2 <feature>)/|Plan1 <feature>| (1)
where “| |” is the absolute magnitude. A positive ratio means Plan2’s feature (volume or distance) is smaller than Plan1. Next, we define the parotid mean dose changes as the ratio of the relative parotid mean dose of Plan1 to Plan2:
Dose ratio = (Plan1’s relative mean dose in %)/(Plan2’s relative mean dose in %) (2)
where “%” is the organ’s absolute mean dose normalized by the target prescription. Dose ratio < 1.0 means that the Plan2’s relative mean dose in % is larger than Plan1. The introduced relative mean dose in % allows us to compare the parotid dose from the plans with the different target prescriptions.
2.3. A Machine-Learning Approach
A decision tree classifier for predicting x% parotid mean dose increase over Plan1 is built from the data of the 36 parotids of the 18 enrolled patients. The predictors are the 65 anatomical ratios calculated from Equation (1) using the features in Table 1. The response is the binarized dose ratio calculated from Equation (2): if the parotid’s relative mean dose in % from Plan2 is x% larger than its value in Plan1, the response is “1”; otherwise, it is “0”. x is set at 0%, 5%, 10%, 15% and 20% to generate the corresponding binary response. Since only the data of the 36 parotids are available, leave-one-out cross-validation is applied in model training and verification. Additionally, in the feature selection, enumerating k-combination (k = 1, 2, 3 and 4) of the 65 predictors combined with the leave-one-out cross-validation is used to find a feature subset maximizing the classifier’s accuracy under various x% parotid mean dose increase. Gini-diversity index and maximal number of decision splits = 10 are used in the decision tree algorithm. Note that in this retrospective study, SET2’s PTV features from adaptiveCT (Plan2) are available, while in clinical application, those PTV features would not. Therefore, SET4 with the two bony landmarks are introduced as surrogates for the SET2’s PTV features. Accordingly, the classifier’s accuracy with/without SET2’s PTV features is evaluated to determine the feasibility of the surrogates.
3. Results
Figure 4 shows the found feature subset yielding maximal predicting accuracy with/without SET2 (e.g., PTV features from the 2nd to 17th in Table 1) under various x% parotid mean dose increases. The results show that when SET2’s PTV features are excluded from feature selection, the accuracy drops by ~5.5% to 86.1% in x = 0%, 5% and 10% cases. In x = 15% case, SET2’s PTV features inclusion does not improve accuracy. In x = 20% case, SET2’s PTV features inclusion improves accuracy by ~2.7% to 97.2%. In addition, Figure 4 shows that the parotid volumetric change (SET1) is an irrelevant feature in x = 5%, 10%, 15% and 20% cases.
Table 2 illustrates the confusion matrix and predicting accuracy for various x% when SET2’s PTV features are excluded. The last column shows the improvement of the accuracy over prevalence (maximal prior probability of true class of the 32 parotids in this study).
Figure 5 shows an example of the decision boundary of a classifier for predicting x = 10% parotid mean dose increase, excluding SET2’s PTV features in the feature selection: feature 28th (d80) in SET3 and feature 57th (d35) in SET4 yield a maximal accuracy of 86.1% with 19.4% improvement over prevalence = 66.7%; confusion matrix parameters are: TP = 9, FP = 3, FN = 3, TN = 22, TPR = 75%, TNR = 91.7%, PPV = 81.8% and NPV = 88%. The definitions of confusion matrix and TP, FP, FN, TPR, TNR, PPV and NPV are discussed in Wikipedia [24].
Figure 4. The found feature subset yielding maximal predicting accuracy under various x% parotid mean dose increases. “Red circle”: feature selection excluding SET 2 (PTV features) of Table 1; “Blue diamond”: feature selection including all SETs.
Table 2. Classifier performance evaluation excluding SET2’s PTV features in feature selection: confusion matrix, predicting accuracy and prevalence.
x%: dose increase |
True class |
Predicted class |
Predicting accuracy |
Prevalence: maximal prior probability of true class |
Improvement # |
“0” |
“1” |
“1”: >x = 0%& |
“0” |
11 (73.3%) |
4 (26.6%) |
86.1% |
True class “1”: (20 + 1)/36 = 58.3% |
27.8% |
“1” |
1 (4.7%) |
20 (95.23%) |
“1”: >x = 5% |
“0” |
17 (85%) |
3 (15%) |
86.1% |
True class “0”: (17 + 3)/36 = 55.6% |
30.5% |
“1” |
2 (12.5%) |
14 (87.5%) |
“1”: >x = 10% |
“0” |
22 (91.7%) |
2 (8.3%) |
86.1% |
True class “0”: (22 + 2)/36 = 66.7% |
19.4% |
“1” |
3 (25%) |
9 (75%) |
“1”: >x = 15% |
“0” |
31 (100%) |
0 (0%) |
91.7% |
True class “0”: (31 + 0)/36 = 86.1% |
5.5% |
“1” |
3 (60%) |
2 (40%) |
“1”: >x = 20% |
“0” |
32 (100%) |
0 (0%) |
94.4% |
True class “0”: (32 + 0)/36 = 88.9% |
5.5% |
“1” |
2 (50%) |
2 (50%) |
&Class “1”: parotid mean dose increase > x%; #: difference between accuracy and prevalence. Prevalence: maximal prior probability of true class of the 32 parotids in this study.
Figure 5. The decision boundary of a classifier in predicting x = 10% parotid mean dose increase, excluding SET2’s PTV features in the feature selection. Feature 57th in SET4 from the bony landmarks and feature 28th in SET3 from the skin features are found to yield maximal accuracy of 86.1%. “1”: means parotid mean dose increase is larger than 10%.
4. Discussion
This study proposes three hypotheses related to anatomical changes in predicting parotid mean dose increase in head-and-neck fractioned radiotherapy. Surprisingly, Hypothesis 1, parotid volumetric changes (Feature 1 in Table 1), is found not to be among the top three features in predicting parotid mean dose increase, as shown in Figure 4. Both Hypotheses 2 (the distance changes from the parotid to the PTV) and 3 (the length of beam path changes between the parotid and skin near the neck) are found to be strong predictors.
Before discussing the findings in detail, we identified the following limits: 1) We also applied two other machine learning methods, Support Vector Machine and Ensemble Methods, for predicting parotid dose changes. However, those methods generate inferior results to the proposed decision tree method, and it is not worth discussing. We do not apply Neural Network approach since there is not enough patient data for training and testing. 2) The clinical decisions for adaptive planning of the 18 enrolled patients in this study vary (e.g., mask unfitting due to weight loss, or tumor size variations observed from on-board imaging): none of the adaptive planning decisions are triggered due to parotid dose concerns. 3) The 4-SET geometrical features, as shown in Table 1, are pre-defined, which could limit the searching space of the feature selection and performance of the classifier. 4) The goal of the study is to find a subset of the proposed geometric features in predicting parotid dose changes. How to clinically incorporate those findings in adaptive planning decisions is beyond the study.
However, the confusion matrix in Table 2 gives some hints in implementing the findings in clinics:
Although the predicting accuracy in the cases of x = 15% and x = 20% parotid mean dose increase is high at 91.7% and 94.4% respectively, the classifiers do not help in adaptive planning decisions since the prevalence (maximal prior probability of the true class in our data) is also high—86.1% for x = 15% and 88.9% for x = 20%. The accuracy improvement over prevalence is marginal at 5.5% for both x = 15% and x = 20% cases.
Although the accuracy improvement over prevalence in the case of x = 0% is 27.8%, it is not clinically useful. The output “1” in the case of x = 0% merely implies that the estimated parotid mean dose in Plan2 is larger than that of the initial plan (Plan1), providing no quantitative information for parotid mean dose increase.
The classifiers under x = 5% and x = 10% produce 30.5% and 19.4% accuracy improvement over prevalence. Both are clinically meaningful if x = 5% (or x = 10%) parotid mean dose increase over the initial plan would be the clinical endpoint for adaptive planning decisions.
To our knowledge, the findings in Hypothesis 2 are the first to use the bony landmarks shown in Figure 2 as surrogates for the PTV in quantifying the motion of the parotid relative to the high dose region. Several studies show that the parotid tends to shift medially towards the PTV areas during treatment, potentially jeopardizing parotid sparing [12] [16]-[20]. Since target and organ contours in both initial and adaptive CTs are available in this study, the classifier performance using PTV features (or SET2 in Table 1) is evaluated as a benchmark. We find that without PTV feature inclusion, the classifier’s accuracy at x = 5% and x = 10% decreases by only 5.6% from the 91.7% benchmark, demonstrating that the proposed bony landmarks are feasible surrogates for the PTV. In addition, Figure 5 shows that 9 out of 11 parotids predicting as “1”, e.g., parotid’s mean dose increased by 10%, have the reduced distance (e.g., Feature 57’s anatomic ratio in Equation (1) being positive) from the parotid to the bony landmarks. Since the bony landmarks are the rigid localization reference between initial and adaptive CTs, our findings indicate that parotids being shifted closer to the bony landmarks (e.g., reduced distance as mentioned above) would receive higher dose during treatment sessions, which is in line with the reports using PTV [12] [16]-[20].
The benefits of not having to include the PTV contours in predicting parotid mean dose increase are obvious: in clinical application, target contouring is time-consuming and generally not performed for the adaptive CT (or Plan2) unless the decisions to adapt planning have been made. However, the bony landmark surrogates introduced here only require 2-slice contours in the transverse plane, which can be easily segmented by therapists/dosimetrists/physicists. How to incorporate those segmented bony landmarks as surrogates of PTV in adaptive planning decisions is our future research.
Another factor influencing the parotid’s mean dose is weight loss during treatment sessions. De Bari et al. found that external body contour changes could serve as a flag to trigger adaptive planning [25]-[27]. Our study focuses on the skin contour near the neck. Figure 5 shows that Feature 28 quantifying the beam path length between the skin and parotid (SET3 in Table 1) is a key feature in predicting parotid mean dose increase: 9 out of 11 parotids predicted as “1”, e.g., parotid’s mean dose increased by 10%, have an enlarged distance (e.g., Feature 28’s anatomic ratio in Equation (1) being negative) from the skin to the parotid. At present, we are not able to explain the above observed correlation between enlarged distance and parotid mean dose increase. Future research is necessary to investigate the issue.
The findings in Hypothesis 1 do not fully align with the published literature [12] [16]-[20]: we do not find parotid shrinkage (or volumetric changes in general) to be a key factor in correlating with parotid dose increase. Indeed, our study finds that parotid medially shift towards the PTV areas (or the proposed bony landmarks as PTV surrogates) is a key factor in predicting parotid dose increase during treatment sessions. Further study is necessary in this direction.
5. Conclusion
We have presented a machine-learning approach that utilizes a decision-tree classifier to predict parotid mean dose changes caused by patients’ anatomical changes in fractionated radiotherapy for head-and-neck cancer. In the current 18 enrolled patients, we found that if parotid mean dose increase is a major concern, the contours of the parotid, skin near the neck area and proposed bony landmarks in initial and adaptive CTs are sufficient predictors. Further study with a larger sample size of patients is necessary to provide a robust basis for generalization.