A massive two-year study reveals fitness apps can help people take more steps, but the improvements are modest and uneven, raising big questions about who benefits most and how to make digital health tools sustainable.

Study: Can fitness apps work long term? A 24-month quasi-experiment of 516,818 Canadian fitness app users. Image Credit: Nan_Got / Shutterstock
In a recent study published in the British Journal of Sports Medicine, researchers evaluated the long-term effectiveness of fitness apps.
Despite over 100,000 commercial fitness apps across app stores, there is limited knowledge about their long-term effects. Systematic reviews of randomized controlled trials (RCTs) have found no study examining the effect of fitness apps beyond one year. Understanding the long-term effects of fitness apps is crucial to designing them effectively.
However, implementing longitudinal RCTs in digital environments can be challenging due to factors such as software development costs and retention issues. Moreover, relying on traditional RCT methods may limit the benefits of fitness apps. As such, robust quasi-experiments integrating strategies for improving internal validity may help uncover the long-term efficacy of fitness apps.
About the study
In the present study, researchers investigated whether a multi-component fitness app could increase physical activity (PA) over a two-year period. The “Carrot Rewards” app was a commercial fitness app with micro financial incentives developed as a public-private collaboration in Canada. Participants downloaded the app between December 2016 and December 2018.
The app was discontinued on June 19, 2019, due to insufficient funding. Data were collected until June 18, 2019. Users could start the key feature of Carrot Rewards, ‘Steps,’ upon download. There was a one- to two-week pre-intervention or baseline period (i.e., no PA incentives or personalized daily step goals) before the intervention began in earnest, when users were asked to wear their device daily.
Users earned very small (“micro”) daily incentives for achieving adaptive daily step goals, and the withdrawal of daily rewards in December 2018 created a natural experiment to assess durability.
At least five days (before July 26, 2017) and three days (after July 26, 2017) with valid step counts were required for a baseline step count. Carrot Rewards staff established these criteria, aligning with the minimum days needed for a valid weekly mean daily step count. Thereafter, the team calculated the weekly mean daily step counts, ensuring at least four valid days per week.
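The validity rules described above can be illustrated with a minimal Python sketch. It is not the study's code: the function names are hypothetical, the article does not say what makes a single day's count "valid" (here an invalid day is simply passed in as `None`), and it does not say which date the July 26, 2017 cutoff is compared against or how that exact date is handled (here the date itself falls under the later rule, which is an assumption).

```python
from datetime import date
from typing import Optional, Sequence

# Thresholds taken from the article.
MIN_VALID_DAYS_PER_WEEK = 4
BASELINE_RULE_CHANGE = date(2017, 7, 26)


def weekly_mean_daily_steps(
    daily_steps: Sequence[Optional[int]],
    min_valid_days: int = MIN_VALID_DAYS_PER_WEEK,
) -> Optional[float]:
    """Mean of the valid daily step counts in one week.

    Invalid days are represented as None (an assumption of this sketch).
    Returns None when the week has fewer valid days than required.
    """
    valid = [steps for steps in daily_steps if steps is not None]
    if len(valid) < min_valid_days:
        return None
    return sum(valid) / len(valid)


def baseline_min_valid_days(reference_date: date) -> int:
    """Valid days required for a baseline step count.

    Five days before July 26, 2017 and three days afterwards, per the
    article. Which date is compared, and the treatment of July 26 itself,
    are assumptions of this sketch.
    """
    return 5 if reference_date < BASELINE_RULE_CHANGE else 3


# One week with four valid days: the mean is taken over those four days.
week = [6000, None, 7000, 5000, None, 6000, None]
print(weekly_mean_daily_steps(week))  # 6000.0
```

A week with only three valid days returns `None` and would be left out of the weekly series rather than averaged over too few observations.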
The primary outcome was the weekly mean daily step count. The analyses included users with a valid baseline step count and at least one other valid study week. A multiple linear regression model was used to assess the effect of Carrot Rewards over a 24-month period. The secondary outcome was the influence of select covariates on longitudinal effects, examined using multiple linear regression; these were baseline PA, start season, geographic location, and app engagement.
Results
The analytic sample comprised 516,818 users, with an average baseline step count of 6,035. Approximately 47.1% of users were categorized as low-active at baseline. Twelve-month retention rates ranged from 48% to 68% across start seasons. At 24 months, the rates were 47% for Winter 2016/17 and 38% for Spring 2017. The team observed slight increases in the weekly mean daily step count from baseline at all time points.
In particular, a 464-step increase per day was observed 12 months after baseline, and a 242-step/day increase was noted at 24 months. The increases at six months from baseline were largely maintained at 12 and 18 months. Users who started using the app in Winter 2016/17 showed reductions in weekly mean daily step counts around study weeks 52 and 104 during the Winters of 2017/18 and 2018/19.
At 12 months, about 106,726 users increased their step count by ≥1,000 per day, and 24,937 had a 1,000-step/day increase at 24 months. At 12 months, about 41% of users increased their daily steps by 1,000 or more, while about 25% decreased by that amount. At 24 months, about 39% of users increased their daily steps, and about 27% reduced them.
Users with earlier start seasons, who had more prolonged exposure to rewards, experienced slight increases in their weekly mean step counts from baseline to 12 months. Moreover, among those with earlier start seasons, the step count increases from baseline diminished at 24 months.
Notably, increases in weekly mean daily step counts were larger for baseline “low active” users (+1,986 steps/day at 24 months). Conversely, individuals with very high baseline PA had substantially large reductions (−3,969 steps/day at 24 months, possibly linked to motivational “crowding out” or regression to the mean). PA increases were minimal across geographic locations, with the densest region, Metropolitan Toronto, showing the smallest increases from baseline. All app engagement levels showed small or minimal increases, and at 12 months, those with the highest engagement showed slightly smaller gains, a pattern that attenuated by 24 months.
Conclusions
Taken together, there were very small increases in the weekly mean daily step count from baseline at all time points. However, average gains did not reach the commonly cited 1,000-step threshold; nonetheless, more users improved than declined by ≥1,000 steps per day at both 12 and 24 months. Therefore, modest population-level increases were maintained over two years, with more people exhibiting clinically important increases than reductions.
Interpretation is limited by the non-randomized design, attrition at 12 and 24 months, and potential measurement error in step counts under free-living conditions. The authors also highlight that micro-incentives proved financially unsustainable, underscoring the need for alternative models such as lotteries or AI-personalized goals.
Journal reference:
- Nguyen L, Faulkner G, Taylor C, Mitchell MS (2025). Can fitness apps work long term? A 24-month quasiexperiment of 516 818 Canadian fitness app users. British Journal of Sports Medicine. DOI: 10.1136/bjsports-2025-109901, https://bjsm.bmj.com/content/early/2025/09/22/bjsports-2025-109901