Effects of stimulant medication and special education on school performance

In this project we try to answer an important but hard question with observational data: Do ADHD medication and special education improve school performance? Estimating causal effects from observational data is hard; some would say it is rarely possible at all. To make progress nonetheless, this project uses large data sets from national registries together with structural models and Bayesian estimation of those models.

Here is my related research about selection bias and estimation of causal effects with observational data.[1]

Some background

Most children with ADHD already have minor or severe reading and writing problems in the third grade. A substantial proportion of children and youth receive special educational assistance and/or ADHD medication. This raises the question: Do medication and special educational assistance improve school performance? The question is important because the few papers that have examined long-term effects of medication found only very small effects, and we know even less about the long-term effects of special educational assistance.

Special education, ADHD medication and early reading difficulties in Norway. Special education and the use of ADHD medication increase with child age and are more prevalent among boys. Third graders with ADHD are 10 times more likely to have severe reading difficulties than those without ADHD.

Some methods

This project uses data from Norwegian national registries and the Norwegian Mother, Father and Child Cohort Study (MoBa) to estimate the effects of medication and special education on school performance of children with ADHD or other mental health or developmental problems.

Using observational data to estimate treatment effects is difficult (some would say impossible), because individuals are not randomized into treatment conditions. We are still doing this project for two reasons. First, we are interested in effects of longer-term treatments, but conducting randomized controlled trials (RCTs) is considered unethical if one wants to estimate long-term effects of treatments that one assumes to be beneficial. Second, well-designed placebo-controlled randomized trials have the strongest internal validity (if we see outcome differences between treatment conditions, those are due to the treatment), but they can still have sub-optimal external validity, i.e. we remain uncertain whether the treatment effect generalizes to treatment that is not implemented in the context of an RCT. Reasons for limited external validity include less formalized implementation of treatments outside RCTs and non-random selection of participants into RCTs. Therefore, carefully implemented observational studies can be a useful complement to RCTs.

Our approach to a careful implementation of an observational study is to formulate and check a structural model that describes the assumed data-generating process, and to use Bayesian estimation of hierarchical models to obtain estimates of causal effects. Here is an article in which we implemented such an approach.

Directed acyclic graphs (DAGs) of design and causal assumptions. (a) School performance at time t2 (SPt2, results of national tests in grades 8 and 9) depends on medication (MED), individual special education (ISE), and on confounders like earlier school performance (SPt1, national tests grade 5), parental education (EDU), and comorbid disorders (COM). Additional potential confounders like gender or parental mental health are omitted here for clarity. (b) Bias from observed confounders (here SPt1, which predicts exposure MED and outcome SPt2) can be controlled by adjusting for them. (c) When SPt1 is a collider between UPA (unobserved parental attention) and USC (unobserved school quality), adjustment for SPt1 leads to conditioning on a collider and cannot correct bias. Instead, inverse probability of treatment weights need to be used.
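To make panel (b) concrete, here is a minimal simulation sketch. It is not the project's analysis code: the data are simulated, the effect sizes (0.2 for medication, 0.7 for earlier performance) are arbitrary assumptions, and the function names are hypothetical. It only illustrates how an observed confounder such as SPt1 biases a naive group comparison and how adjusting for it removes that bias in this simple linear setting.

```python
import numpy as np


def simulate(n=100_000, true_effect=0.2, seed=1):
    """Simulate panel (b): SPt1 influences both MED and SPt2 (assumed values)."""
    rng = np.random.default_rng(seed)
    sp_t1 = rng.normal(size=n)                      # earlier school performance
    p_med = 1.0 / (1.0 + np.exp(sp_t1))             # weaker pupils are medicated more often
    med = rng.binomial(1, p_med)                    # medication (0/1)
    sp_t2 = true_effect * med + 0.7 * sp_t1 + rng.normal(size=n)
    return sp_t1, med, sp_t2


def naive_estimate(med, sp_t2):
    """Unadjusted difference in mean SPt2 between medicated and unmedicated."""
    return sp_t2[med == 1].mean() - sp_t2[med == 0].mean()


def adjusted_estimate(sp_t1, med, sp_t2):
    """Coefficient of MED in a least-squares regression that adjusts for SPt1."""
    X = np.column_stack([np.ones_like(sp_t1), med, sp_t1])
    beta, *_ = np.linalg.lstsq(X, sp_t2, rcond=None)
    return beta[1]


if __name__ == "__main__":
    sp_t1, med, sp_t2 = simulate()
    print("naive:   ", round(naive_estimate(med, sp_t2), 3))
    print("adjusted:", round(adjusted_estimate(sp_t1, med, sp_t2), 3))
```

In this toy setting the naive comparison even gets the sign wrong, because medicated children started out with lower performance, while the adjusted estimate is close to the simulated effect. The sketch does not cover the situation in panel (c).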


We are starting with a base population of around 600,000 children in Norway who were born between 2000 and 2010. The study population consists of children with ADHD or other mental health or developmental problems (up to maybe 60,000), and these children’s parents and siblings. Parents and siblings are an important part of children’s environment, which influences school performance generally and can also influence how effectively treatments are implemented. Special education data is only available from MoBa, so sample sizes will be smaller for these analyses.

Age 0 5 6,7 8 10 (t1) 11-14 15 (t2)
ADHD symptoms MoBa MoBa
Medication NorPD NorPD NorPD NorPD NorPD NorPD NorPD
Special education MoBa MoBa MoBa
School perform. MoBa SSB MoBa SSB
Parental edu & income SSB SSB SSB SSB SSB SSB SSB
Parental diagnoses NPR NPR NPR NPR NPR NPR NPR
NPR: Norwegian Patient Registry, MoBa: Norwegian Mother, Father and Child Cohort Study, KUHR: Norway Control and Payment of Health Reimbursement Database, NorPD: Norwegian Prescription Database, SSB: Statistics Norway.

Pilot analyses

To check the general feasibility of the project I did some pilot analyses with data from MoBa, NPR, and NorPD (N ~ 1,500 children with ADHD). I used hierarchical Bayesian ordered logistic regressions to estimate the effect of special education in kindergarten or ADHD medication on reading problems in the 3rd grade. I used inverse probability of treatment weights to adjust for confounding by indication. These analyses show that treatment reduces difficulties on average, but also that effect sizes vary substantially between sub-groups characterised by maternal education and child gender.
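The weighting step can be sketched as follows. This is a simplified illustration under stated assumptions, not the pilot analysis itself: the data are simulated, the covariates (`severity`, `mat_edu`) and all effect sizes are hypothetical, the outcome is treated as continuous rather than ordinal, and a plain maximum-likelihood logistic regression replaces the hierarchical Bayesian model. It shows how stabilized inverse probability of treatment weights can counteract confounding by indication.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    X must already contain an intercept column.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def stabilized_iptw(X, treated):
    """Stabilized weights: P(T = t) / P(T = t | X) for each child's observed t."""
    beta = fit_logistic(X, treated)
    ps = 1.0 / (1.0 + np.exp(-X @ beta))            # estimated propensity scores
    p_marginal = treated.mean()
    return np.where(treated == 1, p_marginal / ps, (1.0 - p_marginal) / (1.0 - ps))


def weighted_mean_difference(outcome, treated, weights):
    """Weighted difference in mean outcome, treated minus untreated."""
    t = treated == 1
    return (np.average(outcome[t], weights=weights[t])
            - np.average(outcome[~t], weights=weights[~t]))


def simulate(n=50_000, true_effect=-0.3, seed=2):
    """Simulated confounding by indication (all values are assumptions)."""
    rng = np.random.default_rng(seed)
    severity = rng.normal(size=n)                   # hypothetical symptom severity
    mat_edu = rng.binomial(1, 0.5, size=n)          # hypothetical maternal education (0/1)
    logit = 1.0 * severity - 0.5 * mat_edu
    treated = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    problems = (true_effect * treated + 0.8 * severity
                - 0.4 * mat_edu + rng.normal(size=n))
    X = np.column_stack([np.ones(n), severity, mat_edu])
    return X, treated, problems


if __name__ == "__main__":
    X, treated, problems = simulate()
    w = stabilized_iptw(X, treated)
    print("unweighted:", round(weighted_mean_difference(problems, treated, np.ones_like(w)), 3))
    print("IPTW:      ", round(weighted_mean_difference(problems, treated, w), 3))
```

Because children with more severe symptoms are treated more often in this simulation, the unweighted comparison suggests that treatment increases reading problems, whereas the weighted comparison is close to the simulated beneficial effect.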

Pilot analysis results. Top left: Effect estimates for ADHD medication. Top right: Effect estimates for special education. Bottom left: Bivariate distribution of medication years and dosage for 10-12 year old children. Darker shading indicates more frequent combinations. Bottom right: More hours of special education per week are associated with fewer reading problems. Shaded areas show 25-95% credible intervals.

Working packages

WP0 provides input for and guides the working packages that implement the research; it also organizes the acquisition of registry data and communication with users. WP1-WP3 follow a similar approach: treatment patterns and predictors are identified in a first step, and effects are then estimated in a second step. WP1 focuses on medication and uses data from Norway and Sweden. WP2 focuses on special education. WP3 focuses on interaction effects of special education and medication.

Working packages and their dependencies


The project will be implemented in collaboration with colleagues in Norway, Sweden, and the UK.

  1. This paper does not address one issue that complicates Bayesian estimation of structural models: In some circumstances, bias cannot be reduced through adjustment or multilevel regression and poststratification, and one has to use weighting. The problem with weighting is that, for a fully Bayesian approach, one would like to estimate weights and treatment effects in one model. However, this is impossible because of model coupling. This raises the question of how one could still propagate uncertainty about the weights into the estimation of treatment effects. As far as I know, there are currently no perfect answers to this question.

Guido Biele

My research interests include statistical and cognitive modeling around ADHD.