HLI’s Mental Health Programme Evaluation Project (MHPEP) aims to recommend highly impactful programmes, run by nonprofit organisations in low- and middle-income countries (LMICs), that improve the lives of people with mental disorders. This page outlines the eligibility and evaluation criteria for organisations and the evaluation process. Originally, we aimed to evaluate interventions (including those not necessarily being implemented by a nonprofit); but because of Charity Entrepreneurship’s work in the area, we shifted towards recommending existing donation opportunities. We still plan to evaluate one or two mental health interventions as part of ‘Project 2’ (read more). This page also sets out further motivation for the project, as well as our approach to, and the results of, the first step of our evaluation process.
While there are potentially many ways in which we can make lives happier, improving mental health currently stands out as a particularly promising area, given the scale of suffering attributed to mental disorders (Happiness Research Institute 2020, Chapter 4), as well as the relatively low governmental expenditure globally allocated to improve it (Mental Health Atlas 2017).
The main aim of this project is to identify and direct donations to the implementation of highly impactful mental health programmes. Another aim is to investigate the cost-effectiveness of these programmes in-depth. Relatively little is known about the cost-effectiveness of mental health programmes in LMICs, at least compared to physical health (Horton, 2016), so we think that there is high value of information in cost-effectiveness analysis in this area.
A broader aim of HLI is to study cost-effectiveness in terms of subjective well-being (read more here [link]), and MHPEP is an important step towards this. When measured via the conventional methods used in health economics, programmes targeting mental health mostly seem less cost-effective than, for example, GiveWell-recommended programmes (Levin and Chisholm, 2016; Founders Pledge, 2019). However, as Michael Plant (2019, Chapter 7) argues, if the cost-effectiveness of mental health programmes is assessed using subjective well-being (SWB) scores – individuals’ reports of their happiness and/or life satisfaction – mental health programmes appear relatively more cost-effective than they do on conventional metrics.
We follow a three-step approach for MHPEP:
Step 1 - Longlist: Identify programmes targeting mental disorders in LMICs and make an initial assessment.
Step 2 - Shortlist: Assess the longlisted programmes against relevant criteria to create a shortlist for detailed evaluation.
Step 3 - Recommendations: Carry out in-depth evaluations of the shortlisted programmes, potentially resulting in a list of recommended donation opportunities.
We presented our latest findings at the EAGxVirtual conference in June, followed by a Q&A session.
Step 1 - longlist [complete]
1.1. Selecting mental health programmes for screening
As a starting point for our investigation, we chose the database provided by the Mental Health Innovation Network (MHIN). In several conversations, experts in the field of global mental health described it as the most comprehensive overview of mental health projects and organisations, particularly those working in low- and middle-income countries (LMICs). We appreciated the focus on LMICs because the treatment gap for mental health conditions is especially large in these countries (WHO Mental Health Atlas, 2017), particularly in low-resource (e.g. rural) settings; furthermore, treatment costs are usually lower than in high-income countries. We assessed only those innovations targeting depression, anxiety, or stress-related disorders. This is because (a) they are responsible for most of the global burden of disease caused by mental disorders (Whiteford et al., 2013), and because of our prior beliefs that they (b) are very bad for well-being per person (World Happiness Report, 2017) and (c) are relatively cheap and easy to treat (compared to, say, schizophrenia; Levin and Chisholm, 2016).
1.2. The screening process
Screenings were conducted in May and June 2019, based only on information from the MHIN database – no additional literature search on the programmes was conducted at this point.
76 innovations were randomly assigned to eight screeners with relevant academic backgrounds. Each innovation was screened by three screeners independently and blind to the ratings of others.
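As an illustration, the assignment described above can be sketched in a few lines of Python. This is a hypothetical sketch only: the function and parameter names are ours, and HLI’s actual procedure (including any workload balancing across screeners) is not specified here.

```python
import random

def assign_screeners(n_innovations=76, n_screeners=8, per_innovation=3, seed=0):
    """Assign each innovation to 3 of the 8 screeners at random.

    Hypothetical sketch; names and the fixed seed are illustrative.
    Returns a dict mapping innovation index -> list of 3 distinct screener indices.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    return {i: rng.sample(range(n_screeners), per_innovation)
            for i in range(n_innovations)}
```

Each innovation then receives three independent screenings, with screeners seeing only their own assignments, consistent with rating blind to others.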
All screeners used the same standardised framework, which we developed for this purpose. The inter-rater reliability of our screening tool was tested in two rounds; overall, we found it to be sufficient (here and here).
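As a simple illustration of what such a reliability check involves (the specific statistic HLI used is reported in the linked documents, not here), one could compute the mean pairwise Pearson correlation between raters’ scores over the same set of programmes:

```python
from statistics import mean, pstdev

def pairwise_agreement(ratings):
    """Mean pairwise Pearson correlation across raters.

    Illustrative sketch only; `ratings` is a list of equal-length score
    lists, one per rater (scores must vary within each rater's list).
    """
    def pearson(xs, ys):
        mx, my = mean(xs), mean(ys)
        cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
        return cov / (pstdev(xs) * pstdev(ys))

    # Correlate every pair of raters and average the results.
    rs = [pearson(ratings[i], ratings[j])
          for i in range(len(ratings)) for j in range(i + 1, len(ratings))]
    return mean(rs)
```

Values near 1 indicate that raters rank programmes similarly; in practice one would use a dedicated reliability statistic (e.g. an intraclass correlation), but the idea is the same.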
The screening framework
All screening data can be found in the master file (see especially the tab “Screening Outcomes Summary”). The screening framework includes the following parameters:
1.3. Identifying programmes to investigate in more detail
We chose to base our decision on a combined rule using the ‘mechanical estimate’ and the ‘intuitive estimate’: if a programme crossed the cut-off point for either of the two, it would be investigated in more detail regardless of its score on the other.
We defined the cut-off points based on the screening data, taking into account HLI’s limited resources for investigating programmes in more detail. As no clear clustering could be identified, we stipulated that, to be considered in round 2, a programme needed a mean intuitive estimate ≥7 and/or a mean mechanical estimate ≥13. Additionally, we included programmes where there was high disagreement between screeners (i.e. a relatively wide range of either the intuitive or the mechanical estimate) and where repeating the highest intuitive or mechanical estimate twice (i.e. adding two hypothetical screenings with that score) resulted in a mean score above the threshold.
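The rule above can be sketched in a few lines of Python. The cut-offs (≥7 intuitive, ≥13 mechanical) come from the text; the function name is hypothetical, and for simplicity this sketch applies only the ‘two hypothetical screenings’ part of the disagreement condition, without separately checking the range of scores:

```python
from statistics import mean

def include_in_round_2(intuitive, mechanical,
                       intuitive_cutoff=7, mechanical_cutoff=13):
    """Sketch of the round-2 inclusion rule (names hypothetical).

    `intuitive` and `mechanical` are the three screeners' scores
    for a single programme.
    """
    # Main rule: the mean crosses either cut-off.
    if mean(intuitive) >= intuitive_cutoff or mean(mechanical) >= mechanical_cutoff:
        return True
    # Disagreement rule: append two copies of the highest score
    # (two hypothetical screenings) and re-check the mean.
    if mean(intuitive + [max(intuitive)] * 2) >= intuitive_cutoff:
        return True
    if mean(mechanical + [max(mechanical)] * 2) >= mechanical_cutoff:
        return True
    return False
```

For example, intuitive scores of (5, 6, 9) fall just below the ≥7 threshold on their mean (≈6.7), but the high-disagreement clause pulls the programme in, since adding two more 9s lifts the mean to 7.6.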
This decision rule resulted in a total of 25 programmes, which are listed in Table 1 of a separate document, along with their mean mechanical and intuitive estimates.
76 mental health innovations were screened as a first step in finding the most effective programmes targeting mental disorders worldwide. Using our screening procedure and decision rule, we identified 25 promising programmes for further evaluation.
A relatively high proportion of screenings (shown in yellow in Figure 1) could not be given even a rough ‘mechanical’ cost-effectiveness estimate, because the necessary cost and effectiveness data were unavailable. This illustrates the challenge of identifying cost-effective mental health programmes. Cost data were particularly likely to be missing.
Our inability to even roughly estimate the cost-effectiveness of particular programmes may mean either that the information exists but is not listed on MHIN, or that it has not been collected at all. This lack of information is reflected in the considerable disagreement between raters on both the intuitive and mechanical estimates (see Table 1 in this separate document), and constitutes a major limitation of our analysis.
Our decision rule defining which programmes will be investigated in more detail imposed necessarily arbitrary cut-offs. While we currently believe that the mechanical estimate and the intuitive estimate offer the most promising combination to identify the most cost-effective programmes, this choice is debatable and so are the respective cut-off points. Hence, we do not have high confidence that all of the programmes we screened out are less cost-effective than those we included in the second round.
There are several other noteworthy limitations. First, screening was based on information from the MHIN database, and the amount of information provided varied greatly across programmes. This may have biased screeners towards giving higher ratings to programmes with more available information. Second, we relied on the intuitive estimate as one of two central indicators determining whether a programme would be investigated in more detail in the second round. This score, while presumably aggregating much more information than the mechanical estimate, may be prone to bias. Nonetheless, we believe that incorporating this judgment is important because it reflects the subject-matter knowledge of our screeners as well as all the other information collected via the framework. In addition, our impression was that the overall quality of data on costs and effectiveness was relatively poor for most programmes, which adds further value to the intuitive estimate compared to the mechanical one. Third, because we relied on the MHIN database, which has not been regularly updated since 2015, we will have missed any programme not included in it. To counter this flaw, we are currently conducting additional expert interviews to identify further promising programmes.
Step 2 - shortlist [in progress]
For the second step, we are narrowing down the list of 25 programmes from step 1 and searching for organisations implementing one of them. There are currently 13 priority programmes, selected using the following criteria in addition to those in step 1: whether a controlled trial has been conducted on the programme; whether an organisation is implementing the programme and can accept donations; and whether an organisation has chosen to be evaluated by HLI, sharing information as outlined on this page.
The screening team
We consist of volunteers drawn from the effective altruism community who are committed to promoting human happiness. Our academic backgrounds include psychology, psychotherapy, public health, health economics, law, and philosophy, and our team members are graduates of Harvard, Cambridge, Northeastern University, and the London School of Hygiene and Tropical Medicine.