The research findings presented below are from a completed weight-loss trial in which 400 college students were randomized to a treatment (n = 200) or control (n = 200) group. The treatment group had access to study-designed apps, a study-specific Facebook page, a website, emails, and a text messaging platform. The control group received paper materials on healthy living, including quarterly newsletters. There were no in-person sessions other than measurement visits every six months for 2 years. For more on Project SMART, please see the published literature on my Google Scholar page. An R Pubs document is also available here:

Threats to causal inference in an increasingly connected world

A basic premise of intervention work is that group assignment is controlled and that those assigned to the control group are not exposed to the treatment (e.g., in a two-group parallel design). Combined with randomization, this is what makes randomized trials the "gold standard" of causal inference. However, as technology-mediated interventions proliferate and the world becomes increasingly connected, treatment delivered digitally is likely to spill over into the control group via online connections. In a completed weight-loss trial, we demonstrated that people randomized to the control group may have been exposed to the treatment. Here I map the in-study network (n = 315) using bi-directional Facebook friendships and show how many participants randomized to the control group were friends with those in the treatment group. Overall, 56% of participants were connected to at least one other participant. Given that one of the intervention's treatment tools was weight-loss campaigns on Facebook, it is quite probable that participants in the control group were exposed to and/or engaged with treatment content.
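
The network summary described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual pipeline: the participant IDs, the toy friendship list, and the function names build_network and exposure_summary are all hypothetical. It assumes friendships arrive as unordered pairs of in-study participant IDs and that each participant's arm is known.

```python
from collections import defaultdict


def build_network(arm, friendships):
    """Build an undirected adjacency map from bi-directional friendship pairs.

    Pairs involving anyone outside the study (not a key of `arm`) are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in friendships:
        if a == b or a not in arm or b not in arm:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def exposure_summary(arm, friendships):
    """Summarize connectivity and possible spillover in the in-study network."""
    neighbors = build_network(arm, friendships)
    # Share of all participants with at least one in-study friend.
    connected = sum(1 for p in arm if neighbors.get(p))
    # Controls with at least one friend randomized to treatment.
    controls = [p for p in arm if arm[p] == "control"]
    exposed = sum(
        1
        for p in controls
        if any(arm[f] == "treatment" for f in neighbors.get(p, ()))
    )
    return {
        "share_connected": connected / len(arm) if arm else 0.0,
        "share_controls_exposed": exposed / len(controls) if controls else 0.0,
    }


# Hypothetical toy data, for illustration only.
arm = {
    "p1": "treatment", "p2": "treatment", "p6": "treatment",
    "p3": "control", "p4": "control", "p5": "control",
}
friendships = [("p1", "p3"), ("p3", "p4"), ("p2", "p6")]
summary = exposure_summary(arm, friendships)
print(summary)
```

In this toy network only p3 is a control participant with a friend in the treatment group, so the second share counts possible spillover rather than confirmed exposure, which mirrors the caution in the text above.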


Talking about health on Facebook:
Does exposure to healthy content on Facebook change future posting behavior?

Participants in the treatment group of a remotely delivered weight-loss trial were exposed to Facebook campaigns designed to help them lose weight. For example, they could access a study-specific page that asked them to pledge to eat more mindfully over the holidays and to take part in competitions to increase their daily step count. I examine whether exposure to this content changed what participants posted to their broader Facebook network (i.e., real-world friends). For example, do participants in the treatment group post more about healthy food over time than those in the control group? Broadcasting to your friends about diet and exercise may elicit social support and accountability for weight-related behavior change, and may ultimately be associated with weight loss. Further, similar to the emotional contagion that is known to spread across online networks, healthy posts may spread and elicit behavior change in those not directly enrolled in the weight-loss trial. We are testing three natural language processing techniques to evaluate unstructured Facebook posts, and I present results from the first and crudest approach below.

Two unigram dictionaries were iteratively built by experts in the field, one containing exercise words (N = 112) and one containing diet words (N = 119). Sample words include "workout" and "5K" (exercise), and "recipe" (diet). Participants' Facebook posts were classified as being about healthy-active-living (HAL) if they contained >= 1 diet/exercise word. Approximately 7% of posts analyzed (8,405/119,144) were classified as HAL. Posts with no text, and posts containing phrases such as "happy birthday" or "tagged in a photo", were excluded from analyses. Posts and dictionaries were stemmed. Human coding of a random sample of baseline posts showed that the dictionaries had excellent (98%) specificity (i.e., they correctly identified posts that were not about HAL) but very poor (55%) sensitivity (i.e., they missed many posts that were truly about HAL). Visualizing the percent of posts about HAL over time, we see that posts with exercise words are more common than posts with diet words. There also appears to be an increase in diet words used by the treatment group, relative to the control group, from baseline to six months. These differences are also observed when visualizing the frequency of the most commonly used dictionary words over time (i.e., words used > 10 times at baseline).
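
The dictionary approach described above can be sketched as follows. This is a rough illustration under stated assumptions, not the study's code: the dictionaries here hold only the three sample words quoted in the text, crude_stem is a simplistic stand-in for whichever stemmer was actually used, and the function names are hypothetical.

```python
import re

# Only the sample words quoted in the text; the full study dictionaries
# (112 exercise words, 119 diet words) are not reproduced here.
EXERCISE_WORDS = {"workout", "5K"}
DIET_WORDS = {"recipe"}
EXCLUDED_PHRASES = ("happy birthday", "tagged in a photo")


def crude_stem(token):
    """Very rough suffix stripping; an assumed stand-in for a real stemmer."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


# Stem the dictionaries the same way as the posts so the two can match.
EXERCISE_STEMS = {crude_stem(w.lower()) for w in EXERCISE_WORDS}
DIET_STEMS = {crude_stem(w.lower()) for w in DIET_WORDS}


def classify_post(text):
    """Return None for excluded posts, else diet/exercise/HAL flags."""
    if not text or not text.strip():
        return None  # no text
    lowered = text.lower()
    if any(phrase in lowered for phrase in EXCLUDED_PHRASES):
        return None  # excluded phrase
    stems = {crude_stem(t) for t in re.findall(r"[a-z0-9]+", lowered)}
    diet = bool(stems & DIET_STEMS)
    exercise = bool(stems & EXERCISE_STEMS)
    # A post counts as HAL if it contains >= 1 diet or exercise word.
    return {"diet": diet, "exercise": exercise, "hal": diet or exercise}


def sensitivity_specificity(predicted, human):
    """Compare dictionary labels with human codes (parallel lists of booleans)."""
    pairs = list(zip(predicted, human))
    tp = sum(1 for p, h in pairs if p and h)
    fn = sum(1 for p, h in pairs if not p and h)
    tn = sum(1 for p, h in pairs if not p and not h)
    fp = sum(1 for p, h in pairs if p and not h)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical posts, for illustration only.
for post in ["Great workout this morning!", "Two new recipes tonight",
             "Happy birthday Sam!", "Stuck in traffic again"]:
    print(repr(post), classify_post(post))
```

A unigram match of this kind cannot see context, which is consistent with the low sensitivity reported above: a post can be about diet or exercise without using any listed word.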


Linear mixed effects models (R package nlme) were used to test whether participants in the treatment group posted more about HAL over time compared to those in the control group. A random intercept was used to address the nesting of repeated posts over time within person. Significant covariates included in the models were age and number of non-HAL posts. Considering the dictionaries combined, there were no significant differences between groups over time in posting about HAL. However, when stratified by diet/exercise, the number of posts containing diet words increased from baseline to six months among those in the treatment group. Drilling down further, we observe that the increase in posts about diet was driven by females' posting behavior (Beta = 1.01, p = 0.01). The effect is small and the dictionary approach is extremely crude, so we are currently completing analyses using structural topic modeling and a machine learning approach. We will compare and contrast these results in a forthcoming paper.
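
For readers who want the model in symbols, one plausible form of the random-intercept model described above is written out below. The exact specification is not given in the text, so the group-by-time interaction, the variable coding, and the symbol names are assumptions made for illustration.

```latex
Y_{ij} = \beta_0
       + \beta_1\,\mathrm{Group}_i
       + \beta_2\,\mathrm{Time}_{j}
       + \beta_3\,(\mathrm{Group}_i \times \mathrm{Time}_{j})
       + \beta_4\,\mathrm{Age}_i
       + \beta_5\,\mathrm{NonHAL}_{ij}
       + u_i + \varepsilon_{ij},
\qquad u_i \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
```

Here Y_{ij} is the number of HAL posts for participant i at measurement period j, u_i is the participant-level random intercept that accounts for repeated observations within person, and, under this assumed form, \beta_3 is the term that captures whether the treatment and control groups diverge over time.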
