How not to do an evidence check
It's troubling that Australia's health minister appears to regard a low-quality Sax Institute review as a model of best practice
In July 2023, the NSW state government in Australia unveiled its blueprint for gender treatment services for children and youth.
The document, titled “NSW Framework for the Specialist Trans and Gender Diverse Health Service for People Under 25 Years”, outlined the expansion of the “affirmation model” for young people suffering gender distress, in line with guidelines endorsed by the Australian Professional Association for Trans Health (AusPATH) and those of the World Professional Association for Transgender Health (WPATH).
The framework’s foundations were laid in an “evidence check” rapid review that was conducted by the Sax Institute for the NSW Ministry of Health in September 2020. This review supported the use of puberty blockers and cross-sex hormones, considering them effective and safe.
However, in October 2020, the National Institute for Health and Care Excellence (NICE) in the UK published two systematic reviews that cast doubt on the clinical value of puberty blockers and questioned the safety and effectiveness of gender-affirming hormones.
The disparity between the Sax and NICE reviews highlighted critical differences in approach to evaluating the quality of research literature.
The NICE reviews considered the research to be of too poor quality to guide clinical practice. The Sax Institute acknowledged the poor quality and high risk of bias in the research studies, but still cited their findings to justify their supportive statements regarding puberty blockers and cross-sex hormones.
In July 2023, following an ABC Four Corners episode on gender treatments for children, the Sax Institute was commissioned by the NSW government to update the evidence check.
In April 2024, whilst the Sax Institute evidence check was in preparation, the UK’s Cass review final report was published. This report was the culmination of a comprehensive four-year examination of all relevant literature, including eight systematic reviews, and extensive community consultation.
It found that, due to lack of evidence of benefit and the certainty of harmful side-effects, puberty blockers should be restricted to ethics-approved research trials, and that cross-sex hormones should be used with extreme caution in people under the age of 18, with approval from an independent expert panel required.
It concluded: “the evidence does not adequately support the claim that gender-affirming treatment reduces suicide risk”. The review found no convincing evidence that the “gender-affirming” model improved the mental health of minors and provided detailed recommendations for an alternative model prioritising psychosocial interventions.
On 6 September 2024, the Sax Institute published their evidence check (“the Sax report”) which examined the literature between 2019 and September 2023 and concluded that the research supports the government’s gender-affirming services. The Sax report did not refer to the Cass review.
On 31 January this year, Australia’s Federal Health Minister, Mark Butler, held a press conference following his announcement that he had tasked the National Health and Medical Research Council (NHMRC) with developing new guidelines for the care of children with gender distress.
At the conference, the Health Minister told journalists he had been reading the Sax Institute’s evidence check over his summer break.
We therefore seek to alert all stakeholders to the limitations of this document.
In this article, we consider how the Sax report arrived at contrasting conclusions to the Cass review. Our analysis indicates that the Sax report’s conclusions resulted from labelling narrative reviews of poor-quality research studies as “Level 1” evidence; inaccurately reporting the findings of studies; and quoting unreferenced sentences within articles as evidence.
The Sax Institute
The Sax Institute is a registered charity with the listed purpose of “promot[ing] the prevention or control of diseases in human beings”. Charitable organisations can be registered as public companies limited by guarantee, meaning that the company’s members are protected by limited financial liability.
The Sax Institute’s revenue for the 2022/23 financial year was $14.268 million, with 67 per cent of its revenue derived from “government”. Its board has one director appointed by the NSW Minister for Health.
The Sax Institute’s financial dependence on government is acknowledged in its 2023 full financial statement—
“The Sax Institute is dependent on the NSW Ministry of Health (the ‘Ministry’) for a significant contribution to fund corporate costs. The Ministry provides funding on a quarterly basis. It is anticipated that adequate funding will be provided to enable the Institute to pay its debts as and when they fall due. Funding agreements are entered into for five-year periods… The Ministry has formally agreed to extend this funding agreement to 30 June 2028”.
It is a concerning conflict of interest that the Sax Institute’s ongoing viability depends on continuing financial support from the NSW Ministry of Health.
The Sax Institute commissioned the Monash University Behaviour Works group to conduct the evidence check. The process for selecting this research group was not explained. Individual contributors to the report are listed by surname and initial; however, no qualifications or conflicts of interest are disclosed.
Monash University’s organisational conflict of interest related to its affiliation with the Monash Health Gender Clinic—a Victorian statewide gender-affirming service for people aged 16 and over—was not acknowledged.
A note of ambivalence
While other credible independent reviews of the evidence have recommended winding down gender-affirming services due to a lack of evidence of benefit and a certainty of harmful side-effects, the Sax report cautiously supported the affirmation model of care, promoted by AusPATH and WPATH, and employed within NSW Health gender services.
The Sax report, however, provided a disclaimer on its title page that: “It is not necessarily a comprehensive review of all literature relating to the topic area”. It warned the report is: “for general information and third parties rely upon it at their own risk”.
It states: “This report does not make recommendations for policy and clinical practice” and: “This report is not designed to support policy or clinical practice decision making in isolation from other inputs and consultation activities”.
The Sax Institute produced two related documents on interventions for children and young people with gender dysphoria: a full report, called the “evidence check”, and a plain language statement, called an “evidence brief” summary.
The evidence brief summary is unreferenced and includes multiple positive statements about the benefits of gender-affirming interventions which appear suitable for quoting by media and politicians. It is an accessible document that is likely to reassure the public of the benefits of gender-affirming interventions.
For those reading the full report, the Sax Institute uses more cautious language. It laments the poor quality of research in this area and states: “Overall, the evidence about gender dysphoria interventions remains weak due to poor study designs, low participant numbers and single-centre recruitment”.
Warnings about the poor quality of the research are subsequently littered throughout the text. The warnings direct readers to appraise the individual studies for themselves, which raises the question of what purpose an independent review of the evidence serves.
Failure to consider the impact of publication bias
A key finding of the Sax report is that although “studies in this evidence check update generally report favourable outcomes for gender-affirming care initiatives, the limitations in the evidence need to be borne in mind when interpreting these findings”. And yet there is no consideration of the broader issue of publication bias.
This has been shown to be a significant problem. Examples include suppressed negative studies, and failure to include a priori outcomes in published studies. The impact of activism in the research literature was acknowledged by Dr Cass—
“It often takes many years before strongly positive research findings are incorporated into practice … Quite the reverse happened in the field of gender care for children … Although some think the clinical approach should be based on a social justice model, the [UK] National Health Service works in an evidence-based way”.
Including poor quality research as Level 1 evidence
The Sax report claims to have assessed the research literature according to the criteria of the NHMRC and states that it included 16 studies (20 per cent of the 82 included studies) that met the highest quality “Level I evidence”.
According to NHMRC criteria, Level I evidence includes systematic reviews of Level II studies (i.e., systematic reviews of randomised controlled trials). In contrast, the Sax report states: “although this is the technical NHMRC classification, the included systematic reviews did NOT review Level II studies”.
The Sax report has, in some cases, classified as the highest level of evidence articles in which the authors undertook a narrative review of poor-quality, small, methodologically flawed studies located through a systematic search. The studies included in these reviews were discounted as too unreliable to guide clinical practice by the NICE and Cass reviews.
This approach has the consequence of devaluing the Sax report’s recommendations by introducing unreliable findings from historical studies that are at odds with the Sax report’s stated intention to derive its findings from literature published between 2019 and September 2023. For example, in one such review, eight of the nine studies were published prior to 2019.
The inclusion of these reviews is also at odds with Sax’s claim to exclude articles with “mixed-age populations with no sub-analysis of people ≤ 18 years of age or where the proportion of participants ≤ 18 cannot be determined”. Several of the articles included in these reviews had mixed-age subjects with no sub-group analysis.
The Sax report acknowledges that 48 studies (57 per cent) included in its review are of poor quality. That is, they are Level IV evidence (case series or cross-sectional studies). It also notes that the level of evidence for gender medicine for children is deteriorating, with proportionally fewer comparative studies than in its previous evidence check.
Puberty blockers
Following the release of the Sax report, ABC News advised the public that puberty blockers are “a safe, effective and reversible form of gender-affirming care”. According to the plain language evidence brief summary—
“The research shows that these medications are safe and work well to delay puberty, and their effects can be reversed if stopped. Some studies also suggest that this treatment can help reduce the distress young people with gender dysphoria feel during puberty”.
This statement appears to avoid addressing a major clinical concern: that puberty blockers prevent children from recovering from gender distress and so place them on a pathway that progresses to cross-sex hormones and potentially surgeries.
It is well established that more than 90 per cent of children prescribed puberty blockers go on to take cross-sex hormones, posing major risks to fertility and sexual function and possible risks to cognitive development.
The Sax report’s statement that puberty blockers reduce distress in young people with gender dysphoria and are “safe”, “effective”, and “reversible” is supported by two references, both claimed to constitute Level 1 evidence. The first is a narrative account of 11 poor-quality studies, including the original highly criticised Dutch studies.
The relevance of these articles to Australia is debatable, with pubertal suppression provided overseas at age 15 to 16 rather than from Tanner stage II (age ~9 to 11) as recommended by the AusPATH-endorsed Royal Children’s Hospital Melbourne guidelines that underpin all Australian paediatric gender clinics. In addition, the articles included in the Sax review include mixed-age populations with no sub-analysis of people ≤ 18 years of age, which was a stated exclusion criterion for the Sax report.
The second Level 1 evidence article similarly summarises nine poor-quality studies, including two single case reports. The studies were published between 2011 and 2020 and subjects were aged between 9 and 35 years.
The Sax Institute’s claim that puberty blockers are reversible stems from quoting unreferenced sentences from two articles—a sentence from the conclusion of one and a sentence from the abstract of the other. These sentences reflect the authors’ opinion and do not derive from any findings of the studies. There are no studies that examine the impact of puberty blocker cessation in children with gender dysphoria.
Moreover, the Sax report misquotes the findings of the NICE 2020 puberty blocker review. It states that this review found that “GnRHa [puberty blockers] had positive effects on psychosocial functioning and may reduce depression”. In fact, the NICE review concluded:
“The results of the studies that reported impact on the critical outcomes of gender dysphoria and mental health (depression, anger and anxiety), and the important outcomes of body image and psychosocial impact (global and psychosocial functioning) in children and adolescents with gender dysphoria are of very low certainty [when rated using the system] modified GRADE.
“They suggest little change with GnRH analogues [puberty blockers] from baseline to follow-up. Studies that found differences in outcomes could represent changes that are either of questionable clinical value, or the studies themselves are not reliable, and changes could be due to confounding, bias or chance.”
The methodological problems encountered in the research literature related to gender-affirming care for children have been documented in detail.
Gender-affirming hormone therapy (GAHT)
The Sax plain-language evidence brief summary regarding GAHT states: “Research indicates that this treatment can improve the mental health and well-being of young people with gender dysphoria”.
Within the full Sax report, it is further stated that: “The largest volume of new evidence pertains to the psychological benefits of GAHT. The identified studies reported positive results across the domains of body image, gender dysphoria, depression, anxiety, suicide risk, quality of life and cognitive function”.
Whilst analysis of all the methodological flaws of the small studies cited by Sax to claim positive benefits from GAHT is beyond the scope of this article, we seek to illustrate the weakness of the report’s methodology by providing these examples.
One study used in the Sax report to claim findings of improvements in body image and gender dysphoria with GAHT was published in 2023 in the Journal of LGBT Health by Tavistock gender clinic researchers, including clinic director Polly Carmichael.
In this retrospective study, 109 subjects were eligible for inclusion, but only 38 had completed questionnaires after one year of either puberty blocker or cross-sex hormone intervention. Excluding puberty blocker subjects, 27 subjects completed questionnaires for GAHT at 12 months. Only 21 subjects completed the Body Image Scale. The study found no change in the overall Body Image Scale score.
The Sax report’s incorrect claim of an improvement in body image may relate to a statistically significant, but not clinically significant, improvement in one sub-scale that was outweighed by deterioration in the two other sub-scales.
This same UK study was also cited to support the Sax report’s claim of a reduction in gender dysphoria on GAHT. This finding appears to derive from the 19 subjects who completed the Utrecht Gender Dysphoria Scale (a 12-item, 5-point Likert scale) after 12 months of GAHT. There was a clinically insignificant 0.73-point reduction in gender dysphoria (from 4.7 to 3.97, out of a maximum score of 5).
The Sax report also misquotes the NICE 2020 review of GAHT as concluding that GAHT reduces depression, suicidality, and behavioural problems, and increases quality of life. However, this NICE review actually concluded—
“This evidence review found limited evidence for the effectiveness and safety of gender-affirming hormones in children and adolescents with gender dysphoria, with all studies being uncontrolled, observational studies, and all outcomes of very low certainty. Any potential benefits of treatment must be weighed against the largely unknown long-term safety profile of these treatments”.
Another article, falsely classified as Level 1 evidence and cited in the Sax report to further justify the claim that GAHT reduces depression in children and young people, was a review that included only studies of adult subjects.
Gender-affirming chest surgery
The Sax report’s conclusions regarding chest surgery have similar shortcomings. To make claims of sexual wellbeing following surgery and low rates of regret and dissatisfaction, the Sax report cites a review article that included adult subjects who had undergone gender-affirming chest, genital, facial, vocal cord, and Adam’s apple removal surgeries with a follow-up period of at least one year.
This contradicts a statement in bold in the Sax report that “only studies pertaining to chest and other ‘top’ surgery were eligible for inclusion in this update”. Studies which assess outcomes after only a couple of years may result in cases of regret being missed, with research indicating regret typically peaks around 8 years.
Conclusion
We scrutinise a government-funded research body which produced a report supportive of a health policy favoured by the government department that commissioned it. The health policy under consideration is one that affects children, adolescents and young people with gender distress in NSW.
European countries such as Sweden, Finland, and the UK have recognised the lack of evidence underpinning gender-affirming interventions, which have profound implications for the future health and wellbeing of children, including those who belong to minorities such as same-sex attracted and neurodiverse youth.
These countries have responded with independent inquiries—inquiries that have been done with sensitivity, have included the input of lived experience, and employed the most robust methodology. They have resulted in detailed recommendations, including outlines of necessary resources, pathways and timelines, and outcome measures, with a shift in the primary modality of treatment to ethical psychotherapeutic approaches.
Harms in healthcare mostly comprise major incidents of clinical negligence or malpractice, rather than healthcare policy-making failures as described above. The health policy implications for Australia centre on the independence of research bodies, and the need to break the nexus that can exist between government and matters of public health administration.
Conflicts of interest are likely to be fewer in the case of public health policies not subject to powerful and well-funded lobby groups. Any independent national agency charged with an evidence review or guideline development would need to have measures embedded to protect itself from such influences.
Dr Spencer, a critic of the gender-affirming treatment model, is a child and adolescent psychiatrist in Queensland. Dr Clarke is a South Australian psychiatrist with an interest in gender dysphoria. GCN reported on the Sax Institute report here.
The evidence that is claimed to support the safety and efficacy of the ‘Affirmative’ model of gender care does not exist. In addition, there would appear to be a clear conflict of interest tempered by fiscal issues in relation to the role of the Sax Institute and the ‘evidence’ supporting the Affirmative ‘model of care’.
That aside, the fact that our Minister of Health and Aged Care, Mark Butler, could be commenting on the status of the evidence supporting the model is inappropriate. The Parliamentary website lists, among other things, ‘The occupation of politicians prior to entering Federal Parliament’. Minister Butler’s only work prior to entering parliament is cited therein as:
Union official from 1992 to 2007.
Hardly exposure that would lead to an understanding of 'Evidence Base'.
Depressing. And Mr Butler declared publicly that he read it. Possibly a defence at a Royal Commission down the track. “I was misinformed..”