Published online: May 18, 2017
The year 2016 witnessed the anniversaries of several key events related to the prevention of neural tube defects (NTD) with folate supplementation. However, the road leading up to this achievement was full of stumbling blocks, both in terms of research ethics and researcher ethics. First, the decisions of ethics review boards differed with respect to allowing placebo groups in folate trials, thus reducing the level of evidence obtained from the earliest studies. Second, statisticians insisted on analysing the outcome of a trial by intention-to-treat – which turned out to be non-significant – rather than by treatment received, which was statistically significant. Third, the recognition of positive results was stymied by the reluctance of some researchers to recognise and quote others’ contributions. All this needlessly delayed the recognition of the NTD-preventive effects of folate by a decade. The story of the prevention of NTD thus offers insights into research inadequacies that have the potential to impede the advance of medical science, with the ethical aspects having the most immediate impact. Efficient ethics review boards play a major role worldwide and if they play safe, they may risk disallowing high-quality studies of great public health import.
It has been a great achievement to be able to prevent spina bifida (neural tube defect [NTD]) with folate (folic acid) supplementation. Nonetheless, this journey illustrates some of the difficulties of clinical research, not least how the working of ethics committees can stymie research and delay the introduction of evidence-based novel therapies. Getting recognition for one’s discovery can also become problematic.
In 2016, the world celebrated a number of the steps that led to the discovery of folate as a key factor in the prevention of NTD. These are summarised in Figure 1. Half a century ago, Richard Smithells and Elizabeth Hibbard noted that the metabolism of folate among women who gave birth to children with serious congenital malformations, such as NTD, was disturbed, compared to the mothers of unaffected children (1). It has been 40 years since the publication of a case series showing that compared to controls, mothers who gave birth to a child with NTD had lower intracellular folate in their erythrocytes and leucocytes in the first trimester (2). It has been 35 years since the first study that identified folate deficiency as a major preventable cause of NTD, and 25 years since the definitive study by the British Medical Research Council (MRC) (3, 4). These were all secondary prevention studies: the women had already given birth to an affected child. The first primary prevention trial (no previous baby with NTD) was conducted in 1992 (5), that is 25 years ago.
Al-Gailani (6) has given a fine account of the controversy surrounding the MRC study in the UK. Here, we expand on several ethical factors left out of Al-Gailani’s account and address its failure to mention the contributions of Michael Laurence and his team. We also place in context the ethical review boards’ decisions that led up to a failed secondary prevention study, and we explain the ethical aspects of why the MRC’s decisive secondary prevention trial on folate became so controversial. A full understanding of this controversy requires an insight into the different clinical trials that Laurence and Smithells were allowed – or not allowed – to conduct by their ethics review boards, and of the different formulations and doses they used.
We expand in particular on two features. First, why is Laurence’s contribution generally downplayed, minimised and often completely ignored? We especially examine the role of statisticians in this respect. Second, why did the ethics review board not allow Smithells to have a placebo control group?
Today, it is difficult to imagine the fear that parents once felt about giving birth to a baby with NTD. Less than 40 years ago, the incidence of NTD was as high as 1 in 100 births in parts of the UK, the country where most of the research to conquer this scourge was conducted. NTD is caused by the failure of the neural tube to close early in embryonal development (days 24 to 26). A piece of the spinal cord is exposed on the surface of the back (a meningomyelocele). The clinical problems depend on the level of the lesion, but generally, the lesion is located at around L1 and thereby affects the legs, brain, bladder and anus (7, 8, 9, 10). Briefly, there is a loss of motor and sensory nerve function in the legs. The patient may develop hip dislocation and other orthopaedic malformations. The brain may be affected by hydrocephalus and mental development can be damaged in a variety of ways. A neurogenic bladder may lead to overflow incontinence and renal failure. Often there is no control over bowel movements.
Improvement in surgical techniques after World War II led to early operations to cover and protect the meningomyelocele. The landmark publication 45 years ago by the paediatric surgeon, John Lorber, on the appalling quality of life of a series of 524 patients with NTD stimulated research into methods of preventing NTD (8). Briefly, only 7% of those who survived had less than a “crippling disability”, while the quality of life of the vast majority was “inconsistent with self-respect, learning capacity, happiness, and even marriage”. To save children and families from prolonged suffering and stress, Lorber proposed selection criteria for which newborns to operate on and which to leave alone (11). Some of the teams around the world that had been managing NTD without applying any form of selection, treating everyone instead, later regretted that they had not dared to follow the lead of Lorber and others (12, 13). In 1975, one of us (LHB) heard Lorber deliver a lecture, which was unforgettable. The long-term follow-up (up to 50 years) of another non-selected cohort confirmed the poor outlook for NTD patients in that era (9, 10).
The principle of the Lorber selection criteria was to guide decision-making in the immediate neonatal period on whether or not to operate. Delay was not an option because it would mean an increase in the risk of infection, resulting in a worse handicap. The criteria were based on Lorber’s views on how disabilities translated into an unacceptable quality of life, related both to the baby with NTD and the family unit. Both Lorber and Laurence (14) had found that newborns with severe scores could be expected to become mentally retarded and unable to engage in the usual social activities, besides putting the family unit under stress. When not treated, such babies died quickly. Lorber based his criteria on the outcome of his extensive patient series. The medical team took decisions about the management of the baby in consultation with the parents.
As a result of the introduction of screening with serum α-fetoprotein measurements and ultrasound visualisation, NTD is today detected early in pregnancy in most affected foetuses in many countries. Following counselling, the vast majority of people in these countries (95%) elect to terminate the pregnancy (13, 15, 16), and the clinical panorama of NTD has, therefore, changed completely. Those born with NTD today have such minor defects that they go undetected by routine ultrasound examinations. Nevertheless, it is criminal not to ensure that the need for an adequate intake of folic acid pre- and peri-conceptionally is impressed upon women before they get pregnant. Informing women about this would obviate the need to make an extremely difficult ethical decision on the termination of pregnancy. The legislation being introduced in various countries curtailing the right to abortion, including in the case of handicaps, is not in keeping with the views of pregnant women, as indicated by the high rate of termination. Worldwide, many countries still do not allow the termination of pregnancy, whatever the reason, compelling families to care for children with NTD.
In present-day western Europe, the majority of NTD cases born can be described as mild. Gone are the days when 75% of cases were severe, as described by Lorber (8, 11). That proportion had already fallen to less than 50% a decade later, though the outcome was still poor (17). Lorber and Laurence reported that in the USA, fear of litigation meant that the criteria for surgery were not applied consistently, and contrasted this with the UK. A recent study in the Netherlands found that less than 30% of cases were severe according to Lorber’s criteria (17). The long-term management of NTD has also made great strides (19, 20). For instance, the process of clean intermittent catheterisation has resulted in the preservation of renal function and avoidance of incontinence and infections.
Not only NTD, but also various other deformities at birth, including orocranial/orofacial defects and septum defects of the heart, can be prevented by folate supplementation (5, 21). A couple of years ago, we reported a new relationship between low intake of folate and the cognitive function of the adolescent brain (22, 23): the plasma homocysteine concentration (as a marker of poor folate status) correlated negatively with the school grades of young teenagers. It is, therefore, possible that an inadequate intake of folate even while growing up can result in deficient development of the brain. This has enormous public health implications for the world.
Many pregnancies are still unplanned, so women must have adequate levels of folate before fertilisation. Since 1992, women have been encouraged to take supplements at least during the first trimester (pre- and peri-conceptionally). Some countries (including Australia and many Latin American countries) have followed in the footsteps of the USA and Canada and fortify foodstuffs, such as milk, flour and flour products like pasta, with folate. Other countries, such as Sweden, prefer to target only young fertile women, who are encouraged to take folate before becoming pregnant. That this policy has largely failed to reach its goals so far (23) is another matter. It should be noted that the incidence of NTD in Sweden was never as high as in the UK. Today, most cases of NTD occur in countries with high birth rates but diets lacking in folate, such as in the poor but populous Third World countries. Even in the BRICS and MINT countries (primarily China and India because of the numbers), NTD cannot be considered an insignificant problem.
In retrospect, it is clear that Smithells was the main driving force behind the pursuit of folate as the cause of spina bifida, as recognised by Czeizel in his autobiography (24). He was the one to make the original observation, together with Hibbard, besides carrying out the large survey of various micronutrients in the blood, plasma and red cells of women. Nevertheless, Laurence also pursued the subject and by 1968, had already published the first of his papers on the social misery (financial distress, divorce, psychosocial problems among siblings, etc.) faced by those who have a baby with spina bifida (14, 26). This was followed by other clinical observations, including Lorber’s seminal papers (8, 11; see above).
An odd feature of the clinical research on NTD in the UK was that Laurence and his team in Wales were allowed to have a placebo control group in their trial, whereas Smithells, who had been the first high-profile researcher in this field, was not (27). It could be argued that by not allowing a placebo group, Smithells’ ethics committee delayed the implementation of pre- and peri-conceptional folate supplementation by a decade. One could speculate that the ethics committee (strictly speaking, a set of committees) considered that the logic in favour of folate was so strong that they could not allow randomisation to a placebo group for comparison (27). Yet, at the time, there was no proper evidence of the preventive role of folate; it was only a hypothesis. Unfortunately, we do not know why the committees did not grant approval, or why Smithells’ team was unable to persuade them of the correctness of its approach. As far as we know, Smithells has not provided any detailed information on this in his various communications.
In Wales, Laurence conducted a study with a placebo group. Not only was it placebo-controlled, it was a full-scale randomised double-blind trial. Presumably because he and his team were successful in their quest, they had no reason to publish their thoughts on why their ethics review board approved their study. The fact that Laurence’s was a placebo-controlled randomised double-blind trial made it more in line with what is today called evidence-based medicine (EBM). In an interview in 2004 (28), Laurence stated that Archie Cochrane was not involved in the research on NTD, even though Cochrane had already presented his classic paper on EBM (29).
Cochrane was working in the same university hospital as Laurence and went on to set up the Cochrane Institute of EBM in Oxford. Fortunately, Laurence had had the foresight to measure the folate levels in the blood of the participants in his trial so he could analyse the data not only by intention-to-treat (ITT) and treatment received, but also by the folate levels achieved. (ITT means that all randomised patients are analysed in the group to which they were randomised, even if they did not take the allocated treatment or were mixed up and given the wrong one. The intention behind this is to avoid the creation of various misleading artefacts, such as non-random attrition of the participants in the study. ITT analysis gives information on the potential effects of a treatment policy rather than a specific treatment. Some argue that treatment effects are better judged by per-protocol or treatment-received analysis.)
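The difference between the two analyses can be made concrete with a small sketch. The cohort below is entirely invented for illustration (it is not Laurence’s data), and the names `Participant`, `itt_risks` and `treatment_received_risks` are hypothetical. The sketch assumes a trial in which some women randomised to folate did not take it, and shows how the same outcomes are grouped differently under the two approaches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    assigned: str      # arm at randomisation: "folate" or "placebo"
    took_folate: bool  # treatment actually received
    ntd: bool          # True if the pregnancy was affected by NTD


def risk(group):
    """Proportion of a group with an NTD-affected pregnancy."""
    return sum(p.ntd for p in group) / len(group)


def itt_risks(cohort):
    """Intention-to-treat: group by the arm assigned at randomisation."""
    return {
        arm: risk([p for p in cohort if p.assigned == arm])
        for arm in ("folate", "placebo")
    }


def treatment_received_risks(cohort):
    """Treatment received: group by what each woman actually took."""
    return {
        "took folate": risk([p for p in cohort if p.took_folate]),
        "no folate": risk([p for p in cohort if not p.took_folate]),
    }


def make(n, assigned, took_folate, n_ntd):
    """Build n hypothetical participants, the first n_ntd of them affected."""
    return [Participant(assigned, took_folate, i < n_ntd) for i in range(n)]


# Invented numbers, chosen only to make the contrast visible.
cohort = (
    make(40, "folate", True, 0)      # compliant women in the folate arm
    + make(10, "folate", False, 2)   # non-compliant women in the folate arm
    + make(50, "placebo", False, 4)  # placebo arm
)

print(itt_risks(cohort))
print(treatment_received_risks(cohort))
```

In this invented cohort the ITT comparison is 4% versus 8%, whereas the treatment-received comparison is 0% versus 10%: the non-compliant women dilute the apparent effect under ITT, which is the kind of situation described for Laurence’s trial. The price of the sharper contrast is that the treatment-received groups are no longer protected by randomisation.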
An important issue in Laurence’s study is the placebo group. A randomised, double-blind placebo-controlled study is considered to have the most rigorous design and, therefore, to provide the highest form of evidence. In contrast, the observational study design that was forced on Smithells by his ethics committee is much weaker. There are a number of systems of assessing levels of evidence in EBM, including the original Canadian system (30), the US Preventive Services Task Force system (31) and the Oxford system (32). These would accord Laurence’s study levels I, I and 2b, respectively, and Smithells’ study levels II-1, II-1 and 3b, respectively.
One must nevertheless acknowledge the pressure on ethics review boards and on the committees that monitor studies. It is unethical to undertake studies that are unnecessary. Likewise, if a study yields a clear answer at a planned interim analysis, it must be closed ahead of time, even if the numbers then are smaller than they would have been had the investigator continued to the planned end (33, 34).
One reason that Laurence was allowed a placebo group could be that he was professor of medical genetics as well as a paediatrician, whereas Smithells was “only” a paediatrician. Also, it may have been relatively easy for Laurence to convince ethics committee members in Wales, where NTD was particularly common. As for Smithells and the researchers collaborating with him, they were scattered around various parts of the UK, some in areas of low incidence. Although at the time, women who planned to get pregnant were strongly advised to stop smoking and to live healthily, from our base in Sweden, we have not been able to document specific instructions to women in the UK to take vitamins in preparation for pregnancy. A study in the USA, covering the period 1968–1980, shows that overall, as many as 14% of US women planning to become pregnant took multivitamin preparations (35). These women tended to be better educated and more affluent than those who did not. The authors noted that the proportion of women who took multivitamins increased during the 1970s, but gave no detailed breakdown over time. It could be that this pattern of behaviour was prevalent in the UK as well. Members of ethics review boards come from the educated middle classes, so one cannot discount the possibility that such attitudes influenced the decision process in the ethics committees that refused the teams around Smithells the opportunity to conduct a placebo-controlled trial. Perhaps the committees felt that having a placebo control group was as good as denying women access to something they had already started to utilise of their own accord.
Laurence was convinced that folate deficiency alone was the culprit and, to maximise the chance of success, he used a high dose – 4 mg per day. Smithells opted for a readily available commercial cocktail of vitamins, in which the daily dose of folate was about a tenth of Laurence’s – 0.36 mg/day. Given today’s knowledge, Smithells’ choice of dose was more physiological than Laurence’s. The commercial vitamin mixture used by him contained a number of ingredients – not just folic acid, but also ascorbic acid (vitamin C) and riboflavin – that might have been thought to play a role. It also contained a teratogen, retinoic acid (vitamin A). In 1980, Smithells reported that only 1 of 176 women who took the mixture, versus 13 of 260 who did not, gave birth to offspring with NTD (36). Further updates and reports followed, until more than 500 births had been analysed. Since Smithells’ study was an observational one, purists rightly considered that it lacked strength of evidence. He also could not say which agent in the cocktail was the crucial one, and the presentation of the data in the updates was difficult to follow, even for experts in the field. This made the message hard to grasp.
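To give a sense of the size of the contrast in the 1980 figures quoted above (1 of 176 versus 13 of 260), the following sketch computes the two risks, their ratio and a two-sided Fisher exact test using only the Python standard library. This is our illustration, not the analysis Smithells reported; the function name `fisher_exact_two_sided` is hypothetical, and because the comparison was observational, even a small p-value says nothing about which ingredient was responsible or whether the groups were comparable.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Probability of x events in row 1, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for rounding
    )


# Counts as quoted in the text (36).
ntd_suppl, n_suppl = 1, 176      # women who took the vitamin mixture
ntd_unsuppl, n_unsuppl = 13, 260  # women who did not

risk_suppl = ntd_suppl / n_suppl        # about 0.57%
risk_unsuppl = ntd_unsuppl / n_unsuppl  # 5.0%
risk_ratio = risk_suppl / risk_unsuppl  # about 0.11

p_value = fisher_exact_two_sided(
    ntd_suppl, n_suppl - ntd_suppl,
    ntd_unsuppl, n_unsuppl - ntd_unsuppl,
)

print(f"risk, supplemented:   {risk_suppl:.4f}")
print(f"risk, unsupplemented: {risk_unsuppl:.4f}")
print(f"risk ratio:           {risk_ratio:.3f}")
print(f"Fisher exact p:       {p_value:.4f}")
```

The roughly nine-fold difference in risk explains why the result was so persuasive to clinicians, while the lack of randomisation explains why purists still ranked it below a placebo-controlled trial.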
The findings of Laurence and his team, on the other hand, strongly implicated folate alone as being the active factor (3). Non-compliance with the medication complicated this study, but when the female doctors and midwives interviewed the women, they were able to establish who had not taken their trial medication. In addition, since Laurence had measured the erythrocyte folate levels, his team could show that the folate levels of the non-compliant women were indistinguishable from those of the placebo group. When analysed by treatment received and by folate levels achieved, Laurence’s study was statistically significant (p = 0.04 and p < 0.0001, respectively), though in the stricter ITT analysis, it did not achieve statistical significance. Unfortunately, statistical purists classified his study as a failure for this very reason. From a biomedical perspective, however, the aim of the study was qualitative: to prove that folate was indeed the sole preventive agent. Such an aim would have been more appropriately evaluated by the treatment-received analysis or, even better, by the highly significant folate-levels-achieved endpoint. This statistical practice contributed to a one-decade delay in the introduction of efficient vitamin prophylaxis for NTD.
Neither Smithells’ nor Laurence’s study was perfect, even though both were highly persuasive. Moreover, the failure of Smithells’ group to cite Laurence’s work did not help matters. Laurence – who did quote Smithells – was probably too critical of his own trial, instead of emphasising its strengths (37). Overall, today one is left with the impression that the establishment viewed Smithells as the diligent worker in the vineyard and looked upon Laurence as a latecomer who nearly robbed the other of his due (cf. Bible, Matthew 20:1-14). It should be noted that Laurence, like Lorber, was a refugee who had escaped Nazi persecution – Laurence’s family fled from Berlin in 1938, while Lorber came to England from Budapest, also in 1938. Thus, some bias may have crept in against him on account of his being a foreigner. In the interview cited above (28), Laurence spoke of a certain negativity towards him in the early years of his medical career. He ascribed this to his origin, in spite of the fact that he had studied at the University of Cambridge. However, Laurence also freely acknowledged that others had supported him, not least the leading clinical geneticist in the UK, Cedric Carter, who helped him analyse the data of his trial.
The incomplete results obtained by both pioneers paved the way for further research by a group of epidemiologists, who wanted to launch a definitive study. We must be grateful that these researchers were willing to undertake this task because it is due to their work that we now have a definitive answer (4), further underscored by Endre (Andrew) Czeizel’s positive primary prevention study of 1992 (5). Czeizel was also able to show that folate could prevent other birth defects, such as heart defects, and later, cleft palate – something not commented upon in the MRC study but now confirmed by many recent studies (38). When one contemplates the controversy surrounding the launch of the MRC study – as witnessed by us at the time and summarised by Al-Gailani (6) – one is left with the impression that maybe the study was partially driven by the consideration that the successful completion of a new randomised, double-blind, placebo-controlled study would at least give the organisers a share in the achievements of Smithells and Laurence. Instead, it became a bitter slugfest in the media (6).
The letters written separately by Laurence and Smithells et al in the Lancet, commenting on the 1991 MRC study, are revealing (39, 40). Laurence gives a simple account of what happened to the 255 women (234 pregnancies) at his centre. They refused to take part in the MRC trial but were encouraged to take the 4 mg folate supplement. Laurence was able to recruit 64 women (40 pregnancies) to the study. This indicates that the women voted 4 to 1 with their feet to simply take folate without joining the study. Laurence put the results in context, and the tone was measured. Smithells and his team were prevented by their ethics review board from taking part in the MRC study because it was placebo-controlled. Though they disagreed with some of the discussion in the MRC paper, their conclusion was much the same as Laurence’s, and they went further to exhort all women to take folate, whether or not they were at high risk.
In 1990, that is, a whole year before the MRC study’s report, a secondary prevention study which showed that folic acid could prevent NTD was reported from Cuba (41). However, this relatively small study has been criticised for inadequate randomisation (4). The Cuban investigators used an even larger dose of folate supplementation than Laurence – 5 mg – but they had also taken the precaution of measuring folate concentrations both in serum and red cells. Finally, in 1992 came Czeizel and Dudas’s primary prevention study (5), which used a much lower dose of 0.8 mg of folate.
Oddly enough, though Brian Hibbard published eight papers together with Laurence, he never published with Smithells. Nor have we been able to find any paper that names both Laurence and Smithells as authors; the two even wrote separate letters to the editor of the Lancet in 1991 about the MRC trial. The names on the key papers and the persons acknowledged are revealing. In the paper on the MRC study (4), Smithells’ name does not appear anywhere, not even in the Acknowledgements. Laurence is listed as a recruiter (presumably Smithells’ ethics committee had not changed its mind, so he could not be included). It must have been frustrating, if not galling, for Smithells to have his hands tied in this manner and it is intriguing that he seems not to have been involved in the study in some functional capacity, even if his clinic could not recruit. In Czeizel and Dudas’s primary prevention study, Smithells is listed as a member of the “Scientific Advisory Committee”, while N J Wald, who was the principal investigator of the MRC study, is listed among the “External Experts”, though the difference between these two categories seems unclear.
A final ethical aspect is the aggressive discussions of how best to treat newborns with NTD. More often than not, these discussions have taken place in journals of ethics (42, 43, 44, 45) and even legal fora (46). They took place well before the 40- and 50-year follow-ups by the Cambridge group (9, 10) that confirmed the poor outcomes reported by Lorber (8). The 40-year update by the Cambridge group is not cited in Pruitt’s discussion (47), an omission which may be of some significance. Pruitt’s contention is that although those who live successfully with spina bifida today are seen as having miraculously overcome negative odds, “in reality, the odds have been misrepresented in ways that have cost countless born and unborn lives and sometimes negatively shaped the experiences of those who live with spina bifida”. She could not honestly have advanced this argument if she had referred to the Cambridge group’s follow-ups (9, 10). It is a standard practice in medical publications today to provide a statement of conflict of interest, which usually means listing pharmaceutical companies that have provided financial support, lecture fees, etc. It seems that in publications relating to medical research ethics, it may be necessary to request the authors to make a statement of their religion and life views.
Despite the great medical and scientific advances, children are still born with malformations that could be prevented with adequate folate intake, which is especially important in the case of populations whose intake of folate is insufficient. Other mechanisms, such as developmental gene mutations, also play a part, but the promotion of an adequate pre-conceptional intake of folate is an important primary public health aim. The termination of foetuses with gross malformations should be a second-line strategy.
Conflicts of interest
The authors have no conflicts of interest.