Gardasil on Twitter: A Content Mining Study Examining Message, Context, and Source Characteristics of Human Papilloma Virus (HPV) Vaccine-Related Tweets
1. Introduction
Genital human papillomavirus (HPV) is the most common sexually transmitted virus in the United States. It is estimated that more than 90% of all sexually active men and 80% of sexually active women will be infected with at least one type of HPV [1]. HPV is associated with cervical, vaginal, and vulvar cancers in women; penile cancer in men; and anal and oropharyngeal cancers in both men and women [2] [3] [4] [5] [6]. HPV can also cause genital warts and warts in the throat [7] [8] [9].
In 2014, the Food and Drug Administration (FDA) approved a vaccine that has the potential to prevent approximately 90% of anal, cervical, vaginal, and vulvar cancers [10]. The recombinant HPV 9-valent vaccine (Gardasil 9, Merck) covers 9 HPV types—5 more than Gardasil and 7 more than Cervarix (2 other FDA-approved HPV vaccines) [10]. Gardasil vaccines are frequently delivered in 3 separate doses over a short period of time. HPV vaccines have the potential to either reduce or exacerbate disparities in HPV-related diseases, depending on vaccine uptake which varies based on racial/ethnic group, income, and language [11].
While Gardasil 9, Gardasil, and Cervarix all have demonstrated positive effects in reducing multiple types of cancer, the vaccines remain controversial. Reasons include the fact that the vaccines prevent a virus that is transmitted only via sexual contact, differentiating it from most other recommended vaccines [12]. The cervical cancer prevalence and death rate in the US are also both relatively low. Furthermore, some parents believe giving their children this vaccine implies consent to engage in sexual activity, confers a false sense of protection from sexually transmitted infections (STIs), and will lead to earlier first sex [12]. There is also a vocal anti-vaccine movement in which individuals believe that children already have too many vaccinations on the immunization schedule; vaccines in general are “bad”; HPV vaccines are not safe; and/or long-term side effects are unknown [12].
Given these controversies, HPV vaccine uptake remains low, especially in key priority populations. Such health inequities remain a critical adolescent health concern. For example, in 2014, 56% of 13-year-old and 78% of 17-year-old girls reported completing the 3-dose vaccine series [13]. However, fewer males—47% of 13-year-olds and 62% of 17-year-olds—reported series completion [13]. Moreover, important racial disparities exist. Compared to their non-Hispanic White peers, non-Hispanic Black females remain less likely to report 3-dose series completion [13]. Although HPV vaccination coverage among females has increased nationally, HPV coverage lags behind tetanus-diphtheria-acellular pertussis (Tdap) vaccine and meningococcal conjugate coverage [13].
Social media represent a potential vaccine promotion modality, as Blacks, Hispanics, and adolescents and young adults who are within the indicated age range for receiving the HPV vaccine are heavy users of social media [14]. Social media are interactive Web sites and applications that enable users to create, share, comment on, and modify content [15]. One type of social media, the microblog, consists of the sharing of short pieces of information to the public. One of the best known microblogs is Twitter, started in 2006. Ten years later, Twitter averages 310 million (66 million US) monthly active users [16] [17], who post content in the form of “tweets”, or messages of ≤140 characters. Twitter is the social media platform of choice for 18% of teens, beating out Facebook (17%) [18]. Hispanic and Black individuals use Twitter more frequently than their White peers [19]. These young people represent important priority populations for HPV vaccine promotion efforts given both the generally inequitable promotion and uptake within these populations and the critical importance of effectively disseminating evidence-based information to those who are eligible to receive the vaccination.
In order to analyze social media posts, several studies have utilized tweets to examine myriad health issues such as influenza surveillance [20] [21] [22] and tobacco use [23], as well as to assess dissemination of information on antibiotics [24] and to examine attitudes toward HPV and the HPV vaccine [25] [26] [27] [28] [29]. Many of these studies utilized a popular method for analyzing the content of tweets known as sentiment analysis, which entails using machine learning methods to analyze tweets in terms of the opinions they express, capturing public opinion about a concept or issue [30]. Investigators have theorized that knowledge gained about HPV and HPV vaccines through sentiment analysis could be harnessed in developing HPV vaccine promotion strategies and messaging through social media. In an effort to examine this potential application, tweets about influenza vaccinations were mined to assess the sentiments, or opinions, toward vaccinations as well as how those sentiments could be mapped onto large networks for potential public health outreach [31]. That study found that while sentiments toward influenza vaccination vary, they are often shared within a virtual network of like-minded individuals, which would make messages targeting vaccination behaviors difficult to disseminate effectively. Several studies found similar results when examining the content of tweets that discuss the HPV vaccine: sentiments toward the vaccine are frequently shared within a social community that harbors those same beliefs [26] [32] [33]. These communities could be reached through several distinct sources such as doctors, organizations, celebrities, and fellow community members. This further highlights a research gap between the HPV vaccine information available on Twitter and its adequate dissemination to the audiences that should consume this information.
The aim of this study was to examine message, context, and source characteristics related to original Twitter posts regarding the HPV vaccine over a 3-month time period. Specifically, our aims were to describe the source characteristics of Twitter users (people and organizations) who post original messages on Twitter about HPV and the HPV vaccine, examine the types of HPV- and HPV vaccine-related information shared on Twitter, and assess the general sentiment toward the HPV vaccine among social media users who tweet about the vaccine. This study will add to the current literature by reporting sentiments expressed by organizations as well as by individuals.
2. Materials and Methods
2.1. Data Source
To address our study aims, we purchased social media data from Twitter. Those data included all tweets matching our search criteria posted in December 2014-February 2015. A total of 47,944 tweets met our criteria: 32,019 tweets matched the keywords HPV AND vaccine; 17,656 tweets matched the keyword Gardasil; 564 tweets matched the keyword Cervarix; and 5951 tweets matched the following criteria: (cervical OR vulvar OR vaginal OR anal OR genital warts) AND (vaccine). From this sample, we removed all non-English tweets, leaving 45,260 for this study.
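The four search rules above can be sketched as a small Boolean filter. This is a minimal illustration only: the data were purchased from Twitter, and how Twitter matched keywords (case handling, plurals, hashtags) is not described here, so the whole-word, case-insensitive matching and the names `CANCER_TERMS`, `matched_rules`, and the rule labels are assumptions made for this sketch.

```python
import re

# Hypothetical labels for the four search rules described in the text.
# Anatomical-site terms from the fourth rule.
CANCER_TERMS = ("cervical", "vulvar", "vaginal", "anal", "genital warts")


def _has(term, text):
    # Assumption: whole-word, case-insensitive matching, so that "anal"
    # does not fire on "analysis". Plurals such as "vaccines" do not match.
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def matched_rules(text):
    """Return the set of rule labels a tweet satisfies (possibly several)."""
    rules = set()
    if _has("hpv", text) and _has("vaccine", text):
        rules.add("hpv_and_vaccine")
    if _has("gardasil", text):
        rules.add("gardasil")
    if _has("cervarix", text):
        rules.add("cervarix")
    if _has("vaccine", text) and any(_has(t, text) for t in CANCER_TERMS):
        rules.add("site_and_vaccine")
    return rules
```

Because a single tweet can satisfy more than one rule under this kind of filter, per-rule counts need not sum to the number of unique tweets; the four counts reported above sum to 56,190, which is more than the 47,944 total and would be consistent with such overlap.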
2.2. Coding
Despite the growing use of sentiment analysis through machine learning methods, there is evidence that machine learning methods do not always align well with social science objectives [34]. For this study, investigators opted to use a qualitative coding methodology whereby we developed a coding document, adapted from sentiment analysis and other content analysis research (e.g. HPV vaccine information on YouTube [35] [36] [37]), to manually extract information from this sample of tweets. To pilot test the coding document, we collected data via the Twitter streaming API between September 11 and November 20, 2014. Using 3 sets of keywords, tweets were searched within a 20-mile radius of each of the 100 largest US cities. We identified 5178 tweets: 3156 matching the keywords HPV and vaccine; 1190 matching the keyword Gardasil; and 842 matching the keywords “cervical cancer” and vaccine. Two researchers independently coded 4020 characteristics in a nonprobability sample of 96 tweets. Inter-rater reliability was high: 0.97 on 42 original tweets and 0.98 on 54 retweets.
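The text does not state which reliability statistic produced the 0.97 and 0.98 values. As an illustration only, the sketch below computes two common candidates for a pair of coders: simple percent agreement and Cohen's kappa, which corrects agreement for chance. The function names and the example labels are hypothetical and are not taken from the study data.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items on which two coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty item list")
    return sum(x == y for x, y in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two coders."""
    n = len(coder_a)
    p_o = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: product of each coder's marginal share per category.
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (p_o - p_e) / (1 - p_e)


# Made-up codes for six tweets, for demonstration only.
coder_a = ["pos", "pos", "neg", "neu", "pos", "neg"]
coder_b = ["pos", "pos", "neg", "neu", "neg", "neg"]
agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
```

With many coded characteristics per tweet, as in the pilot, one would typically apply such a function per characteristic or to the pooled list of coding decisions; which of these the authors did is not specified.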
After piloting the coding document, we then randomly selected 1000 original tweets, from the 45,260 tweets obtained from Twitter, for manual coding and analysis. Two researchers independently coded all tweets for source and message characteristics. Coded source characteristics included: whether the Twitter profile represented an individual, celebrity, verified user (indicated by a blue verified Twitter badge), and/or organization. Tweets posted by individuals were coded using information from the user’s profile page indicating whether the individual self-identified as a parent, child, and/or spiritual/religious person; if they had a particular political affiliation; and their reported profession (e.g. journalist, physician). If the tweet was posted by an organization, organization type was coded using information from the user’s profile (e.g. business, media outlet, nonprofit, government). Since we pulled data from the poster’s profile, these data are not considered anonymous. For example, some “personal” information was recorded, such as poster name (e.g. Kinsey Institute) and affiliation.
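A simple random draw of original tweets, as described above, can be sketched with Python's standard library. The record layout (a dict with an `is_retweet` flag), the function name, and the fixed seed are assumptions for illustration; the text does not say how retweets were identified or which tool performed the selection.

```python
import random


def sample_original_tweets(tweets, k=1000, seed=2015):
    """Draw k tweets without replacement from the non-retweets.

    `tweets` is assumed to be a list of dicts with a boolean
    "is_retweet" key (a hypothetical field name).
    """
    originals = [t for t in tweets if not t["is_retweet"]]
    if len(originals) < k:
        raise ValueError("fewer original tweets than the requested sample size")
    # A seeded generator makes the draw reproducible for later audits.
    return random.Random(seed).sample(originals, k)
```

Fixing the seed is a design choice rather than something reported in the study: it lets a second team regenerate exactly the same coding sample from the same purchased data.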
All tweets were coded for sentiment using the following categories: “neutral”, “negative”, “ambiguous”, “positive”, and “other”. “Neutral” tweets were defined as those using language indicating neither approval nor disapproval of the vaccine. “Negative” tweets used language disapproving of the HPV vaccine, whereas “positive” tweets used language approving of the vaccine. “Ambiguous” tweets contained both approving and disapproving language. Tweets coded as “other” did not fit into any of the other sentiment categories. Message tone was also coded for whether tweets expressed concerns about civil liberties, claimed that the vaccine is a hoax, or claimed that the vaccine is dangerous.
Guided by previous research [35], we coded tweets for specific information about the HPV vaccine. Message frames were coded as yes/no categories indicating whether the tweet mentioned the vaccine as a cancer prevention, STI prevention, or genital wart prevention tool. We also coded for other vaccine-related information (e.g. whether the tweet mentioned vaccine eligibility criteria, safety concerns, side effects) and the source that the tweet credited (e.g. Centers for Disease Control and Prevention [CDC], medical doctor). We coded for whether tweets gave a firsthand account of an experience with HPV or the vaccine, whether the person writing about the personal experience was a child or adult, and whether the personal experience came from a parent/relative of someone who had received or would receive the HPV vaccine. Finally, we coded messages for other HPV-related information (e.g. how HPV is transmitted, how to protect against infection, symptoms).
2.3. Analysis
We conducted all analyses using IBM® SPSS® Statistics (Release 22.0.0.0). Means and standard deviations were calculated for continuous variables. Frequencies and percentages were calculated for all categorical variables. We also conducted a descriptive temporal analysis of tweets, organized by date of original message posting and by tweet sentiment/tone. Lastly, a chi-square analysis was conducted to determine statistical differences between source types (i.e. individual versus organization) on tweet sentiment.
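The analyses were run in SPSS; purely as an illustration of the chi-square comparison described above, the sketch below computes a Pearson chi-square statistic (without continuity correction) for a 2 × 2 table of source type by sentiment, with its p-value for 1 degree of freedom. The function names are hypothetical, and the example counts are made up for demonstration; they are not the study's cell counts, which are not reported here.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For df = 1, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2))


# Made-up counts: rows = organization / individual,
# columns = positive / not positive.
example = [[30, 10], [20, 40]]
stat = chi_square_2x2(example)
p = p_value_df1(stat)
```

Statistical packages may apply a continuity correction to 2 × 2 tables by default, so a statistic computed this way can differ slightly from package output.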
3. Results
3.1. Source Characteristics
Of the 1000 manually coded tweets, just over half (n = 548, 54.8%) were posted by individuals, whereas organizations posted 404 tweets (40.4%). The remaining tweets came from profiles that were unclear in terms of representing an individual or organization. Among the 548 tweets posted by individuals, the largest identified types were parents (9.5%), journalists (8.2%), and physicians (7.3%; Table 1). Only one tweet came from a dental professional, and no tweets from children/teens were identified. Among the 404 tweets from organizations, the largest identified types were health information providers (43.6%), businesses (37.4%), non-profit/advocacy groups (16.6%), and healthcare organizations (13.4%).
Only 3.6% of all tweets (n = 36) were posted by a Twitter-verified user. The median number of followers at the time of tweet was 433 (M = 2559.43, SD = 12,150.74), ranging from a low of 0 to a high of 179,112 followers for The Kinsey Institute (http://www.twitter.com/kinseyinstitute). The median number of profiles the Twitter users followed at the time of tweet was 413 (M = 3518.19, SD = 17,279.85), ranging from a low of 0 to a high of 369,846 profiles for The Toronto Star (https://twitter.com/TorontoStar), a news source that published a high-profile story on the HPV vaccine, which was later retracted and removed from its website.
Table 1. Source characteristics of tweets regarding the HPV vaccine.
a. Source characteristics do not add up to 100% due to information being unavailable on the individual poster’s Twitter profile; b. Source characteristics exceed 100% because an organization may be included under multiple categories (e.g. a business and a healthcare organization).
3.2. Tweet Sentiment
More than half of tweets (n = 571, 57.1%) were coded as having a positive tone. Less than one-fifth of tweets were coded as having a negative tone (n = 188, 18.8%), whereas 195 (19.5%) were coded as neutral. Forty-six tweets were coded as either ambiguous (1.4%) or other tone (3.2%). Table 2 displays examples of tweets that were coded across the various sentiment categories. Only a small percentage of tweets contained specific negative content about the vaccine, such as the vaccine being dangerous or a hoax, or civil liberties concerns (Table 3). Figure 1 contains a temporal display of tweets, by sentiment/tone, across each day of the 3-month time period. The figure also marks important contextual events that occurred on or near tweet spikes and shows that positive tweets can increase during newsworthy events, such as the FDA approval of Gardasil 9.
Table 2. Examples of tweets coded by sentiment.
Table 3. Message characteristics and HPV-related content included in tweets regarding the HPV vaccine (n = 1000).
Figure 1. Number of tweets by coded sentiment/tone and date of posting.
Tweets posted by organizations were more likely to be positive than those posted by individuals (χ² [1, n = 955] = 60.89, p < 0.001). Almost three-quarters (71.3%) of tweets from organizations were positive, compared with 57% from individuals. Conversely, 23% of tweets posted by individuals were negative, compared with 13% of tweets from organizations.
3.3. Message Characteristics
The largest message frame related to the HPV vaccine as a cancer prevention tool, as 18% of tweets mentioned that the vaccine can prevent cancer (Table 3). Very few tweets in this sample included HPV vaccine-related information (e.g. girls are vaccine-eligible; Table 3). Attributing credit to sources of information in tweets was very uncommon in this sample, with the most commonly cited sources in tweets being medical doctors (3.0%; Table 3). Few tweets were delivered from a first-person perspective. Of tweets posted by individuals, 23 (4.2%) included a personal account from individuals who discussed a firsthand experience with HPV or the vaccine. Only 0.6% originated from the perspective of a parent/relative of a potential or actual HPV vaccine recipient. Finally, few tweets included information about HPV, such as its link with cancers and how HPV can be transmitted (Table 3).
4. Discussion
The discussion surrounding the HPV vaccine in original Twitter posts appears to be positive, as more than half of tweets examined in our research contained language approving of the vaccine. These findings differ from some of the existing research on social media and the vaccine, including 2 content analyses of YouTube videos [37] [38], a qualitative study [39] and content analysis [40] of online news stories, and a content analysis of top search engine results [36]. In more recent research, Bahk et al. [41] found that mainstream media were dominated by positive HPV vaccine sentiment (over a one-year period, the weekly average was 75.5% positive), but on Twitter the predominant sentiment toward the HPV vaccine was negative (over a one-year period, the weekly average was 74.9% negative). The Bahk et al. study included retweets in analyses, however, and it is possible that negative material is retweeted more frequently. In addition, our study indicates that the quantity, sentiment, and tone of tweets vary over time and are linked to “newsworthy” events, such as FDA approval for a new vaccination, research results being disseminated, and political debates.
Although HPV vaccine sentiment seems to vary across time, studies, and platforms, research indicates that negative sentiments seem to be more influential, at least regarding HPV vaccine sentiment. In a controlled experiment, for example, Nan and Madden [42] found that college students exposed to a negative blog post about the vaccine “perceived the vaccine as less safe, held more negative attitudes toward the vaccine, and had reduced intentions to receive the vaccine” compared to those who had no blog exposure. However, those exposed to the positive blog had no changes in perceptions, intentions, or attitudes regarding the vaccine [42]. Twitter-specific data on HPV vaccine sentiment suggest that we could stratify tweets based on social network connectedness to target large swaths of consumers that share specific sentiments regarding the vaccine [28] [43]. Targeted tweets or posts could then be disseminated to those community members to improve their sentiment regarding HPV vaccination. While this is one potential approach to effectively disseminating messages regarding HPV vaccination, there is still a critical need to optimize methods to counteract tweets providing negative and incorrect information, especially since HPV-related tweets often occur in quick spikes.
On Twitter, the HPV vaccine is framed largely as a cervical cancer prevention tool, with less focus on its potential to prevent genital warts or other cancers. In addition, there were more tweets describing how the HPV vaccine could help females, potentially making it seem less relevant for males. According to Bigman et al. [44], participants who learned about HPV vaccine effectiveness believed more in its effectiveness for preventing cervical cancer and felt more positively toward the vaccine for cervical cancer prevention. Adolescent medicine specialists and public health educators should expand communication regarding the HPV vaccine’s ability to prevent other types of cancers and warts, which can occur in males and females.
Very few tweets in this sample included HPV- and HPV vaccine-related information. Calloway et al. [45] and Habel et al. [40] found that US newspaper coverage and online news stories of the HPV vaccine lacked detailed information on HPV, which could contribute to people misunderstanding the complexity of cervical cancer and HPV. While the 140-character limit for tweets constricts the amount of information that can be shared, professionals can share detailed information via Twitter by including links to information about HPV and the vaccine.
Individuals (most commonly parents, journalists, and/or physicians) shared most (54.8%) tweets, but few tweets were written in the first person. Organizations (mostly health information providers, non-profit/advocacy groups, and healthcare organizations) shared about two-fifths of tweets. Additionally, the majority of negative sentiments were tweeted by individuals instead of by organizations. Thus, organizations may be a potential source of information that would reach more followers outside of social networks that share homogeneous beliefs about the vaccine. Other types of healthcare personnel (nurses, allied health professionals, health educators, and dental professionals) might consider communicating correct and positive information about the HPV vaccine via Twitter. These practitioners could also be targeted as recipients of positive messages in the hopes that they would then disseminate the information to patients.
Study Limitations
This research should be considered within the context of its limitations. First, this research only examined data generated by Twitter users. Inclusion of data from other platforms is warranted, especially since the social media platforms of choice for 28% and 27% of teens are Snapchat and Instagram, respectively [18]. Second, our research did not examine the relations between Twitter users and their HPV vaccine sentiment. Salathé and Khandelwal [31] found that users with similar sentiments toward an issue share information more often with each other than users with dissimilar sentiments. Future research should explore how HPV vaccine information flows through and across social networks. Such research may guide adolescent health and medicine interventions and facilitate diffusion of information about the HPV vaccine across networks. Third, since only 0.4% of tweets contained exact geo-coordinates, we did not analyze these data from a geospatial perspective. According to Twitter, only 1% - 2% of all tweets are geo-tagged [46], so this limitation is inherent in most Twitter-based research studies that do not utilize a location inferencing technique [47] [48]. Fourth, our choice of keywords may have limited the population from which our sample emerged. For example, we searched for “vaccine” instead of “vax”, a term that may be more likely used among individuals opposing vaccination. Fifth, we only examined original tweets, excluding retweeted material. Because of this decision, we may have excluded potentially important information. However, according to Sysomos, 71% of tweets receive no reaction (compared to 23% of tweets that get a reply and 6% that get retweeted) [49]. Sixth, our research is limited to English language tweets and cannot be generalized to non-English tweets.
Parents, adolescents, and young adults all play a role in deciding whether or not to receive the HPV vaccine. Dempsey and Zimet [50] found that social media may influence adolescents’ intentions and decisions regarding vaccinations. It may be that different approaches to promoting the vaccine are necessary for different population segments. In addition, sentiment toward the vaccine appears to be swayed by media and political events, but it is not clear how people move back and forth from having a positive or negative attitude toward the HPV vaccine. It is also unclear how Twitter can be used to change that or motivate parents, adolescents, or young adults to choose to get the vaccine. Future research should investigate tweets and Twitter-users to determine how reflective tweets are of underlying beliefs and knowledge about the HPV vaccine and whether these behavioral constructs can be changed via Twitter.
While it is clear that some healthcare providers—mainly physicians—disseminate HPV vaccine information via Twitter, our study reveals an opportunity for a greater variety of healthcare personnel to provide accurate and positive HPV vaccination information on this platform. In addition, healthcare providers should be aware of messages that parents, adolescents, and young adults are receiving about the vaccine, especially in times of substantial coverage of an HPV vaccine news topic. Future directions could also target these various healthcare practitioners to then disseminate positive messages regarding HPV vaccinations to appropriate recipients. Finally, it is critical that public health professionals learn more about how to harness the power of social media like Twitter to better inform the decision making of parents, adolescents, and young adults related to the HPV vaccine.
Declarations
Funding: This work was supported in part by the Center for Human Dynamics in the Mobile Age (for use of its SMART Dashboard), and seed money provided by San Diego State University.
Ethical approval: None required.
Guarantor: EWB.
Contributorship: EWB, JRH, and JLF researched literature and/or conceived the study. EWB, AN, JLF, and KJW were involved in protocol development and data analysis. EWB and JRH wrote the first draft of the manuscript. All authors reviewed and edited the manuscript and approved the final version of the manuscript.
Acknowledgements
We thank Ming-Hsiang (Ming) Tsou, Ph.D. and the Center for Human Dynamics in the Mobile Age for use of the SMART Dashboard, and San Diego State University for the seed money that funded the acquisition of Twitter data and supported personnel efforts. We also thank Marcus Lewis, Roxana Rezai, and Pegah Sabzi for their work as tweet coders.