Studying Texas Public Health Agencies’ Twitter Messages about COVID-19 using Natural Language Processing

Texas represents a unique case among US states in dealing with COVID-19. It was among the first states to reopen in the spring of both 2020 and 2021, and state and local government offices sued each other over COVID-19 control measures.

In this collaborative study involving authors from four universities in Texas (Texas A&M, University of Houston, UT Health, and Rice), we examined the Twitter messages sent by all the public health agencies and emergency management organizations in Texas during the first six months of 2020. We used BERT, a natural language processing model developed by Google, to automatically classify these tweets in terms of the functions they served, the prevention behaviors they mentioned, and the health beliefs they discussed. We also explored the relationship between tweet content and public engagement (in terms of likes and retweets).

Here are some of our findings:
• Information was the most prominent function, followed by action and community.
• Susceptibility, severity, and benefits were the most frequently covered health beliefs.
• Tweets serving the information or action functions were more likely to be retweeted, while tweets performing the action and community functions were more likely to be liked. Tweets communicating susceptibility information generated the most public engagement in terms of both retweeting and liking.
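The engagement comparison behind these findings can be sketched as a simple aggregation: group classified tweets by function and average their retweets and likes. This is a minimal illustration with made-up numbers, not the study's actual data or statistical models.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample of classified tweets: (function label, retweets, likes).
# The labels mirror the study's coding scheme; the counts are invented.
tweets = [
    ("information", 12, 30),
    ("information", 8, 22),
    ("action", 15, 45),
    ("action", 11, 38),
    ("community", 5, 40),
    ("community", 7, 36),
]

def mean_engagement(tweets):
    """Average retweets and likes per tweet function."""
    by_function = defaultdict(list)
    for label, retweets, likes in tweets:
        by_function[label].append((retweets, likes))
    return {
        label: (mean(r for r, _ in pairs), mean(l for _, l in pairs))
        for label, pairs in by_function.items()
    }

stats = mean_engagement(tweets)
```

In the real study, differences like these were tested statistically rather than eyeballed from raw means.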


Tang, L., Liu, W., Thomas, B., Tran, M., Zou, W., Zhang, X., & Zhi, D. (In press). Texas public agencies’ tweets and public engagement during the COVID-19 Pandemic: Natural language processing approach. Journal of Medical Internet Research: Public Health and Surveillance. [Preprint here]

Mickey Mouse got the measles!

Do you remember the 2015 measles outbreak that originated in Disneyland in California? It was one of the biggest outbreaks of an emerging infectious disease in the United States before the COVID-19 pandemic.

In one study, we examined the semantic networks of Twitter content about measles based on a corpus of 1 million tweets.

Semantic networks represent the semantic relationships among a set of words. In a semantic network, the frequencies and co-occurrences of the most commonly used words represent shared meanings and common perceptions. For instance, the cluster of purple words in the lower-left corner of the network represents the political frame, where people discuss the causes of and solutions to the outbreak in political terms: Was measles brought to the US by immigrants? What role should the government play in preventing such outbreaks?
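The core of a semantic network is a word co-occurrence count: two words are linked, and their edge weighted, by how often they appear in the same tweet. Here is a minimal sketch with a toy corpus; the actual study worked from about 1 million tweets and applied clustering on top of such a network to identify frames.

```python
from collections import Counter
from itertools import combinations

# Toy corpus standing in for the measles tweet dataset.
tweets = [
    "measles outbreak disneyland california",
    "measles vaccine safe effective",
    "vaccine measles outbreak news",
]

def cooccurrence_network(tweets, min_count=1):
    """Edge weight = number of tweets in which a pair of words co-occurs.

    Pairs are stored as sorted tuples so (a, b) and (b, a) are the
    same undirected edge.
    """
    edges = Counter()
    for text in tweets:
        words = sorted(set(text.split()))
        for a, b in combinations(words, 2):
            edges[(a, b)] += 1
    return {pair: w for pair, w in edges.items() if w >= min_count}

net = cooccurrence_network(tweets)
```

In practice one would raise `min_count` (and restrict to the most frequent words) so that only strong, shared associations remain visible in the network.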

We identified four major frames: news update frame, public health frame, vaccine frame, and political frame.  

We also mapped the longitudinal changes of the frames during different stages of the outbreak.

The news update frame appeared to be the most dominant frame during the initial and resolution stages.

The public health frame was 1 of the 2 most dominant frames in the pre-crisis stage; however, its use decreased during the initial stage and was lowest during the maintenance stage.

The use of the vaccine frame increased from the pre-crisis stage to the initial stage, and the vaccine frame became the most dominant frame during the maintenance stage.

The political frame was the least often used frame in all four stages of the outbreak and appeared most frequently during the maintenance stage.

Tang, L., Bie, B., Zhi, D. (2018). Tweeting about measles during an outbreak: A semantic network approach to the framing of emerging infectious diseases. American Journal of Infection Control, 46(12), 1375-1380. doi: 10.1016/j.ajic.2018.05.019

The way we use YouTube influences our chance of getting vaccine misinformation.

Have you ever wondered how you stumbled upon that anti-vaccine video on YouTube? How come one innocent click on a video might or might not lead you to an avalanche of more videos that not only tell you that vaccines cause autism but also try to sell you vitamins and magical herbs?

Recently, I completed a study that examined how user behavior and YouTube’s ranking and recommendation algorithms influence exposure to different kinds of vaccine information.

We explored the different patterns of exposure on YouTube when a person starts with a keyword-based search on the YouTube platform (goal-oriented browsing) and when a person starts with an anti-vaccine video on another website such as Facebook (direct navigation). We simulated the patterns of exposure by creating four networks of videos based on YouTube recommendations.

First, we created two search networks, one from a set of pro-vaccine keywords and the other from a set of anti-vaccine keywords. For each network, we collected the first six videos; for each of these six videos, we collected six recommended videos; and for each video in this second layer, another six. Videos collected from pro-vaccine keywords and anti-vaccine keywords were then placed in separate networks. The same procedure was used to create two additional networks starting from two sets of anti-vaccine seed videos. (In the four networks below, green dots represent pro-vaccine videos, red dots represent anti-vaccine videos, and gray dots represent non-vaccine videos.)
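The collection procedure above is essentially a breadth-first crawl: take six recommendations per video, repeated for a fixed number of layers. The sketch below illustrates the shape of that crawl with a stand-in recommender function; `fake_recs` is hypothetical and would be replaced by actual calls to the YouTube platform in a real study.

```python
def crawl(seeds, get_recommendations, breadth=6, depth=3):
    """Breadth-first collection: for each video in a layer, take the top
    `breadth` recommendations, repeated for `depth` layers. Returns the
    list of layers, starting with the seed layer."""
    layers = [list(seeds)]
    seen = set(seeds)
    for _ in range(depth):
        next_layer = []
        for video in layers[-1]:
            for rec in get_recommendations(video)[:breadth]:
                if rec not in seen:
                    seen.add(rec)
                    next_layer.append(rec)
        layers.append(next_layer)
    return layers

# Stub recommender for illustration: video "v" recommends "v-0" ... "v-5".
def fake_recs(video):
    return [f"{video}-{i}" for i in range(6)]

layers = crawl(["seed"], fake_recs)
```

With one seed and no duplicate recommendations, the layers grow 6, 36, 216 — which is why even a three-layer crawl produces a sizable network to classify.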

What we found was that viewers are more likely to encounter anti-vaccine videos through direct navigation starting from an anti-vaccine seed video (the two networks on the bottom) than through goal-oriented browsing (the two networks on the top).

Why is it the case?

I think it is because YouTube intentionally suppresses the rankings of anti-vaccine videos in its search results, so you are unlikely to find anti-vaccine videos even if you start with anti-vaccine keywords. However, when you start with an anti-vaccine seed video that you encountered on another platform, such as Facebook, Reddit, or your crazy uncle’s email, YouTube’s protection mechanism no longer works. Once you click one anti-vaccine video, YouTube’s recommendation algorithm will show you more and more anti-vaccine videos. This is also called the “filter bubble.”

Tang, L., Fujimoto, K., Amith, M., Cunningham, R., Costantini, R.A., York, F., Xiang, G., Boom, J., & Tao, C. (2020). Going down the rabbit hole? An exploration of network exposure to vaccine misinformation on YouTube. Journal of Medical Internet Research. doi: 10.2196/23262