Guest post by Kevin Kilty
Some weeks ago, Pat Frank suggested that I might consider writing an essay about the efficacy of masks and mandates to wear masks during this pandemic. I hesitated to do so at first, but on March 8th I noticed another research effort on the part of the CDC to justify masks as a prophylactic strategy. This effort seems very deficient in my view, and so this essay resulted. What I write here is a summary of a much larger work in progress.
Lincoln Moses and Frederick Mosteller long ago suggested that public policy be organized as experiments so that we might learn of its effectiveness, or lack thereof, and avoid successive failures. When the COVID-19 pandemic arrived last spring, I wrote that we didn’t need to go through successive battles with exponential processes, but that we appeared not ready to gather useful data and evidence about the effectiveness of social distancing and other advice in this battle. The tendency of people to don a mask against all sorts of bad air is so universal that even screenwriters employ it to add realism to a disaster scene, so one would think we would know something about the effectiveness of masks. We do and we don’t. While I am told by some people employed in medicine, along with many amateurs, that masks are essential to controlling the spread of SARS-CoV-2, highly reputable authorities, thousands of them, make much more modest and even opposite claims.
How might we analyze these competing claims? I see three avenues of attack: First, we can examine theoretical reasons for and against masks from a mechanical perspective. Second, there are limited experiments known as randomized clinical trials available, all of which have some deficiencies and limited pertinence. Third, we can examine observations of the progress of this epidemic as shown by cases in the light of local mandates. These observations and the methods used to evaluate them are quite deficient in many ways, but they do tend toward similar conclusions.
The CDC, WHO, and local departments of health have issued a variety of advisories about masks which they update periodically. A typical advisory begins as follows:
“Because the virus is transmitted predominantly by inhaling respiratory droplets from infected persons, universal mask use can help reduce transmission.”
As a rationale for masks this fails because it does not mention a necessary prior element. In order to work, masks have to attenuate the guilty aerosols. The individual aerosols involved could be only a micrometer or a few micrometers in size. The individual virions are in the range of 50-130 nanometers. I have looked at a number of cloth masks that one can purchase and found their pore sizes to be 0.05 to 0.15 millimeters. This is roughly 1000 times larger than virions and on the order of 100 times larger than one-micrometer aerosols. No wonder these packages of masks come with disclaimers. Adding to this issue of excessive pore size is that cloth masks are not made of certified materials, are manufactured to no standard, are often ill-fitting with gaps beside the nose and on the cheeks, are often pulled down below the nose, and are sometimes placed over a beard. Flat surgical masks at times do better with the excessive pore size problem but still present issues with poor fit and gaps.
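The size comparison in the preceding paragraph is simple arithmetic and can be checked directly. The sketch below uses only the sizes quoted in the text; the 3-micrometer upper value standing in for "a few micrometers" is an assumption made for illustration, and the variable names are hypothetical.

```python
# Sizes quoted in the text, all expressed in nanometers.
pore_nm = (50_000, 150_000)    # cloth-mask pores: 0.05 to 0.15 mm
virion_nm = (50, 130)          # individual virions
aerosol_nm = (1_000, 3_000)    # "a micrometer or a few"; 3 um is an assumed upper value

def ratio_range(big, small):
    """Smallest and largest ratio of pore size to particle size."""
    return big[0] / small[1], big[1] / small[0]

pore_vs_virion = ratio_range(pore_nm, virion_nm)
pore_vs_aerosol = ratio_range(pore_nm, aerosol_nm)

print("pore / virion : %.0f to %.0f" % pore_vs_virion)
print("pore / aerosol: %.0f to %.0f" % pore_vs_aerosol)
```

Under these assumptions the pores come out hundreds to thousands of times wider than a virion, and tens to about 150 times wider than a small aerosol.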
There is a mask that corrects most of these deficiencies. The N-95 mask is made of qualified materials and manufactured to a standard. These masks attenuate 95% of particles in the size range of 0.3 to 0.5 micrometers. However, they still require attention to fit to reduce gaps, and they are not guaranteed to halt very small aerosols the size of individual virions. A news article last summer in the Japanese newspaper The Asahi Shimbun, which summarized measurements that researchers made of particle attenuation by cloth, gauze, and N-95 masks, supports what I have summarized here. Cloth and gauze masks have zero effectiveness, while N-95 masks perform to specification, but only if fitted and worn properly. And even then there is no guarantee they prevent the transmission of disease.
There is one more mechanical aspect to ponder. Often in a crisis people will offer what expertise they can – they recycle their expertise. Something I am doing here. Recently a number of researchers in the field of fluid dynamics have weighed in with measurements and simulations (as one would expect) using computational fluid dynamics (CFD). The AIP journal Physics of Fluids produced a special issue in October 2020 highlighting the physics of masks. One study uses CFD to model persons wearing masks inside and outside, in various conditions of air flow, to address the ability of masks to attenuate aerosols ejected from a cough or a sneeze. They state in conclusion…
“…our results suggest that, while in indoor environments wearing a mask is very effective to protect others, in outdoor conditions with ambient wind flow present wearing a mask might be essential to protect ourselves from pathogen-carrying saliva particulates escaping from another mask wearing individual in the vicinity.”
This means, I presume, that masks are useful in a situation when all around are sick, and sneezing, wheezing, and coughing; in other words, in a Covid ward of a health care facility. What does “very effective” mean? If it means a very great attenuation of particles, greater than 95% say, then this still has to be interpreted in the light of findings that as few as 300 virions can lead to disease. However, one would think that if coughing and sneezing are the issue, then covering a cough or sneeze should do as well, or perhaps even better when one considers the problem of ill fit and aerosol escaping through gaps. My experience since March 2020 is that I never encounter anyone in public who is so sick that they are simply sneezing and coughing with abandon.
This computational fluid dynamics approach to determining the efficacy of masks resembles the equivalent modeling approach to climate change. Both imply that models define reality when, in fact, observations and measurements should. There is no means to turn CFD models into clinical outcomes.
In summary, there are mechanical reasons to suppose that masks could reduce the spread of virus in some settings, but none appear pertinent to the materials used to construct masks, or to the ways the public wear them in about 98% of situations. Opposed to supposing that masks might work, or modeling how they might work, we can only learn what efficacy they have by making experiments or observations.
The closest thing I have found to true experiments regarding masks is a small number of randomized clinical trials (RCTs). Surprisingly few RCTs involving masks and respirators have been done. I will summarize only two of these. One is pre-COVID-19 and not controversial; the other is post-COVID-19 and subject to controversy and censorship.
There are many respiratory diseases which circulate in the human population. The recent epidemics of MERS, SARS, Ebola and influenza provoked a search for effective non-pharmaceutical interventions. In one example, a group of doctors became interested in how well cloth masks performed for preventing infection in hospitals because such masks are in wide use in the developing world. This trial involved 1607 volunteers working in high-risk wards at 14 hospitals in Hanoi, Vietnam. There were three arms in this RCT: cloth masks, surgical masks, and a control arm of “standard practice” which involved some mask usage but at about one-half the compliance rate of the two treatment arms. The study took place over a four-week period and was, to the authors’ knowledge, the first RCT involving cloth masks. Among their findings were that particle attenuation was virtually nil in the cloth masks (97% infiltration), and surprisingly poor in these particular medical masks (44% infiltration). The rate of infection in the cloth mask wearers was double that in the medical mask wearers; medical masks showed some effectiveness, but this contradicted earlier studies showing no efficacy for medical masks. The researchers conclude that cloth masks should not be advocated for health-care workers, at least until a much better design is produced.
The second RCT was performed in Denmark last spring and was subject to censorship by our social media as well as facing some publication resistance. It involved 4862 participants who completed the study. It is more pertinent to this essay because it addressed the efficacy of masks outside of a health care setting. Participants were divided into a control group asked to refrain from wearing masks when out of their home and a treatment arm asked to wear a mask when out of the home for three hours per day. Both groups were asked to follow other social distancing guidelines in order to prevent confounding of masks and distancing, which have similar if not identical effects. The primary measured outcome was the number of participants showing SARS-CoV-2 or other respiratory viral infections after one month as determined from PCR testing or hospital diagnosis.
The outcome was an infection rate of 2.1% in the control arm against 1.8% in the treatment arm. However, the confidence interval of the odds ratio (CI of 0.53 to 1.23) comfortably included a value of 1.0, suggesting no significant difference in outcomes. If one were nonetheless to insist that the small difference in attack rate (42/2392=1.8% versus 53/2470=2.1%) is an important risk reduction, the absolute risk reduction implied (0.003), combined with about 90 hours of mask wearing per participant (three hours per day for a month), translates into 30,000 hours (90 hours/0.003) of mask wearing to prevent one case of COVID-19 when community prevalence is around 2.0%. Take that as you may.
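The arithmetic in the preceding paragraph can be reproduced from the four counts quoted there. The sketch below is a plain unadjusted calculation with a Wald interval on the log odds ratio; that choice of interval is an assumption made for illustration and is not necessarily the method the trial authors used. The 90 hours per participant is read as three hours per day for about thirty days.

```python
import math

# Counts quoted in the text (infected / participants who completed the study).
mask_cases, mask_n = 42, 2392
ctrl_cases, ctrl_n = 53, 2470
hours_per_participant = 90.0   # assumed: 3 hours/day for about 30 days

risk_mask = mask_cases / mask_n
risk_ctrl = ctrl_cases / ctrl_n
arr = risk_ctrl - risk_mask    # absolute risk reduction, unrounded

# Unadjusted odds ratio with a Wald 95% interval computed on the log scale.
a, b = mask_cases, mask_n - mask_cases
c, d = ctrl_cases, ctrl_n - ctrl_cases
odds_ratio = (a / b) / (c / d)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

hours_rounded = hours_per_participant / 0.003   # using the rounded difference 2.1% - 1.8%
hours_unrounded = hours_per_participant / arr   # same calculation without rounding first

print(f"attack rates: {risk_mask:.4f} (mask) vs {risk_ctrl:.4f} (control)")
print(f"odds ratio {odds_ratio:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
print(f"mask-hours per case avoided: {hours_rounded:.0f} rounded, {hours_unrounded:.0f} unrounded")
```

Carrying the unrounded rates through gives an absolute risk reduction nearer 0.004 and roughly 23,000 hours per case avoided, which is the same order of magnitude as the rounded figure. Either way the interval spans 1.0, so the point estimate itself cannot be distinguished from no effect.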
There is an interesting series of response letters to this study that are published along with it. These make some legitimate points about design deficiencies. It is certainly true that a study involving masks cannot be a “true RCT” because one cannot blind a study involving masks to a clinical end. The wearer knows they are wearing a mask, and so does the rest of the public. I won’t belabor this point by describing what can go wrong in an unblinded study. Another criticism focuses on using PCR tests, with their false positives and negatives, to measure outcome – a problem which will return in the next section about observations. However, despite some criticism, one might note that the CHAMP study, in which U.S. Marine Corps recruits were subjected to rigorous social distancing, hygiene and mask wearing, resulted in just about the same attack rate as found in this study. I doubt it is possible in the present politicized and hysterical atmosphere to do an RCT on any non-pharmaceutical intervention that could satisfy critics, but none that I know of has shown significant effectiveness of masks.
Before launching into a discussion of what observations concerning the epidemic may mean, a brief digression into the incubation period and other influences on reporting is instructive. The incubation period of SARS-CoV-2 is probably as long as ten to fourteen days. Following exposure there is a probability on each successive day of someone becoming a case, with half of the ultimate cases developing by day five or six. The process behaves like a low pass filter with a delay. Figure 1 shows this. One hundred exposures on day zero, presuming all result in cases, produce rising numbers of cases until 19 occur on day five. Then the numbers decline toward zero.
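A curve of the kind Figure 1 describes can be sketched with a log-normal incubation distribution. The median of 5 days is the value note 16 says was used for Figure 1; the log-normal form and the dispersion factor of 1.5 (Sartwell's figure) are assumptions made here, so the numbers need not match the figure exactly.

```python
import math

MEDIAN_DAYS = 5.0   # median incubation used for Figure 1 (note 16)
DISPERSION = 1.5    # Sartwell's dispersion factor; assumed to apply here
EXPOSURES = 100     # everyone exposed on day zero, all assumed to become cases

mu = math.log(MEDIAN_DAYS)
sigma = math.log(DISPERSION)

def lognormal_cdf(t):
    """Fraction of exposures that have become cases by day t."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

# Cases counted on day d are those whose onset falls in the interval (d-1, d].
daily_cases = [EXPOSURES * (lognormal_cdf(d) - lognormal_cdf(d - 1))
               for d in range(1, 31)]
peak_day = 1 + daily_cases.index(max(daily_cases))
by_day_five = sum(daily_cases[:5])

for day, n in enumerate(daily_cases[:14], start=1):
    print(f"day {day:2d}: {n:5.1f} " + "#" * round(n))
```

A single day of exposure is spread over roughly two weeks of case onsets, which is the smoothing and delay that the filter analogy refers to.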
This has two important consequences. First, it smooths the results of any factor producing a change to R, the reproductive ratio, and makes such changes harder to detect. That is, it reduces resolution. Second, it produces a correlation of cases day to day, so that counts of cases on successive days are not independent of one another, and this has the effect of reducing the degrees of freedom in observational data.
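The loss of degrees of freedom can be given a rough size. A common approximation, assumed here rather than taken from the essay, treats the daily counts as a first-order autoregressive series with lag-one correlation rho, in which case n correlated days carry about as much information about a mean as n(1 - rho)/(1 + rho) independent days. The correlation values below are hypothetical.

```python
def effective_sample_size(n_days, lag1_corr):
    """Approximate count of independent observations in n_days of AR(1)-correlated data."""
    return n_days * (1.0 - lag1_corr) / (1.0 + lag1_corr)

# Hypothetical lag-one correlations applied to a 60-day record.
for rho in (0.0, 0.5, 0.8, 0.9):
    print(f"rho = {rho:.1f}: about {effective_sample_size(60, rho):.1f} independent days")
```

With strong day-to-day correlation, two months of daily counts may be worth only a handful of independent observations, which is why apparent statistical power in such series can be illusory.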
Add to this the distortions resulting from common graphing options like 7- to 21-day averaging done with one-sided (causal) filters, and the distortions which resulted from switching from clinical diagnosis to “lab confirmed” cases resting on PCR tests, and what one has is a mess. It is easy to reach a point where what a graph shows today is what might have happened three weeks earlier.
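The delay introduced by a one-sided average is easy to demonstrate. The sketch below passes a hypothetical one-day burst of cases through trailing averages of several lengths and measures how far the centre of the smoothed curve moves; an N-day trailing average shifts it by (N - 1)/2 days.

```python
def trailing_average(series, window):
    """One-sided (causal) moving average: each day averages itself and the
    preceding window-1 days, with days before the start counted as zero."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / window)
    return out

def centroid(series):
    """Mean day of the series, weighted by its values."""
    return sum(i * v for i, v in enumerate(series)) / sum(series)

spike = [0.0] * 60
spike[10] = 100.0   # hypothetical one-day burst of cases on day 10

delays = {w: centroid(trailing_average(spike, w)) - centroid(spike)
          for w in (7, 14, 21)}
for window, delay in delays.items():
    print(f"{window:2d}-day trailing average delays the curve by {delay:.1f} days")
```

A 21-day trailing average alone accounts for ten days of lag; added to an incubation delay of about five days and whatever reporting lag applies, a total near three weeks is plausible.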
One does not have to search extensively to find evidence suggesting that epidemics proceed unhindered despite all sorts of mandates. I know of no epicurve showing a clear effect. Figure 2, using data drawn from the Covid Tracking Project, for example, shows a comparison among Colorado, New Mexico, and Utah. Despite mandates of various rigor, introduced at different times, the epicurves are virtually the same. The Swiss Policy Research Group produced a nice twelve-paned panel, found here, which makes comparisons among various countries, with the same result – masks have no obvious benefit. A more detailed time series of cases in four German cities during April, 2020 also shows no benefit; however, I would criticize these time series as being of such short duration following the mandatory mask order as to have possibly missed the period of greatest effect, if there is one, just beyond the incubation delay.
The global data firm Dynata reported that by the first of July mask wearing in Houston and south Florida was likely to be 80% even before mandates; yet these places saw multiple large waves of infection thereafter. California and New York applied rigorous mask mandates, yet still went through several large waves in the summer and autumn. The USA as a whole, in which 39 states imposed mask mandates in April or before, exhibits an epicurve almost identical, except for vertical scale, to that of Wyoming, the least populous state, even though Wyoming applied no state-wide mandate until November 9. The CDC reported that most people contracting COVID had worn masks, although self-reporting is notoriously inaccurate.
There are many problems with our observational data. Death counts have been biased by incentives provided to hospitals over payments for COVID-19 deaths. While many states tried to build useful epicurves by placing cases on date of symptom onset, many publicly available data sets were built by date of case report and became dominated by the cycle of bureaucratic testing and reporting rather than by characteristics of the disease. To see how these differ, Figure 3 shows Colorado data from 08/02/20. The difference is stark: the date-of-report rendition has a dominant seven-day cycle, which some people have confused with a dynamic of the disease and which disappears in the date-of-onset rendition. A subtle effect like mask usage is likely to be lost in these extraneous influences.
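How a reporting schedule alone can stamp a seven-day cycle onto smooth data can be shown with a toy model. Everything in the sketch below is hypothetical: the onset counts are a made-up smooth ramp, and the rule that weekend cases are reported on the following Monday is an assumed stand-in for whatever the actual bureaucratic cycle was.

```python
DAYS = 28
onset = [50 + d for d in range(DAYS)]   # hypothetical smooth counts by date of onset

# Assumed reporting rule: nothing is reported on Saturday or Sunday;
# those cases are carried forward to the following Monday.
reported = [0] * DAYS
for day, n in enumerate(onset):
    weekday = day % 7                   # 0 = Monday ... 5 = Saturday, 6 = Sunday
    if weekday == 5:
        report_day = day + 2
    elif weekday == 6:
        report_day = day + 1
    else:
        report_day = day
    if report_day < DAYS:               # reports falling past the window are dropped
        reported[report_day] += n

for day in range(DAYS):
    print(f"day {day:2d}  onset {onset[day]:3d}  reported {reported[day]:3d}")
```

The onset series has no weekly structure at all, yet the reported series shows empty weekends and a spike every Monday. A cycle of this kind belongs to the reporting process, not to the disease.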
The case data is a mess because, when it began early in 2020, cases were confirmed through symptoms or at least a probable contact with another case, but it eventually became dominated by mass testing of people without symptoms using PCR tests. Once this mass testing took hold, even states trying to maintain an epicurve by date of onset could no longer do so. Figure 4 shows the curve for the state of Wyoming, which became dominated by the weekly cycle of PCR testing which began at the University in Laramie in mid-August, but really took effect with the return of students around September 1. Because so many of the “lab confirmed” cases had no associated symptoms, a full one-third of cases remained always under investigation and the date of report became the de facto date of onset.
This university provides an interesting case study in itself. The total number of cases from the start of the epidemic to the 31st of August in the entire county was 134 – less than one case per day. The university instituted a very rigorous set of rules for reopening including mask wearing in all settings inside and out, rules for limiting the number of persons in university vehicles, foot traffic patterns inside buildings, dedicated entrances and exits, periodic sanitation of all surfaces, social distance guidelines and even a web site to report persons not following rules. I did a few informal surveys around campus in September and October and thought mask compliance was between 80 and 90%.
Nevertheless by October 15, six weeks later, the county had added 780 cases of which 551 (71%) were connected to the U.W. campus. The rules and masks appeared to present no barrier to the spread of our mini-epidemic.
Evidence provided to support mask mandates consisted mainly of a single study. There have been many criticisms of this study, including one which suggested it be retracted. However, ignoring its controversy for the moment, let’s just focus on what the authors have to say.
They state, first of all, that masks may have effectiveness as large as 85%, but that this estimate has low confidence – a precise number but a wide confidence interval. Second, they note that effectiveness diminishes from N95 respirators on the one hand to cloth masks with 12 to 16 plies on the other. No one wears cloth masks with even one-fourth as many plies. Thus, this can’t be an endorsement of cloth masks. No one has unlimited access to N95 respirators, and no one could, because there is not enough manufacturing capacity to supply them to the public in general. Thus, this “essential” study does no more than reiterate what the other sources of information, including the measurements of particle attenuation reported in the Asahi Shimbun article, have to say. Its recommendations are not pertinent to the reality of mask wearing by the general public. This is an unscientific rationale.
A more recent effort to promote masks as essential to controlling the pandemic appears to me to have many shortcomings. This is a retrospective study of the history of the epidemic on a county level, referenced to timing of mask mandates and orders to close or limit restaurant traffic between March 2020 and October 2020. It is what economists would call an “event study”. Problems with the study include:
- The event involved in an event study should be independent of the data. It is not in this case. Mask mandates were generally applied through political pressure during a pandemic wave, often when the wave had already begun to wane.
- Mask mandates are probably hopelessly confounded with other orders such as closure of restaurants. According to the researchers themselves, the mask mandates began in April in 39 states, and restaurant closures began in 49 states in March and April. Two influences atop one another. The claim to having a mask measurement unconfounded by closures cannot be true, or there was a lot of data sorting involved which becomes another confounder.
- The paper is missing details about the statistical methods and calculation of significance.
- Even if significant in a statistical sense, the effect seems very small.
The worst flaw seems to me to be a subtle one. The underlying data of the CDC study are curves of cumulative cases and deaths, which I have already explained are flawed to begin with. However, the typical cumulative curve, being a logistic curve, has a particular shape that begins as an almost exponential rise but then passes through an inflection, after which its slope diminishes steadily as it approaches a horizontal asymptote. Such a curve will display a long sequence of days in which the case rate declines. An average of daily changes over segments of this decline, even with noise added, which are then referred to an earlier time period, will produce results just like those in the CDC study. No matter what the cause of the limit to an epidemic, the result is the same. What has happened is that the CDC has chosen a statistic whose expected behavior is almost entirely determined by the shape of a logistic curve, whatever the limiting influence, and which therefore cannot distinguish the null hypothesis from a particular alternative. The reasoning is, in effect, circular.
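The point about the logistic curve can be illustrated numerically. The sketch below builds a cumulative-case curve with no intervention of any kind in it, then compares the mean daily growth rate in windows after an arbitrarily placed "event" with the mean in the 20 days before it. The curve parameters, the event dates, and the window lengths are all hypothetical choices made for illustration, not values taken from the CDC study.

```python
import math

K, R, T0 = 10_000.0, 0.08, 60.0   # hypothetical final size, rate constant, inflection day

def cumulative(t):
    """Logistic curve of cumulative cases; it contains no intervention."""
    return K / (1.0 + math.exp(-R * (t - T0)))

def growth_rate(t):
    """Daily growth rate of cumulative cases: difference of logs on successive days."""
    return math.log(cumulative(t)) - math.log(cumulative(t - 1))

def mean_growth(start, stop):
    """Mean daily growth rate over days start .. stop-1."""
    days = range(start, stop)
    return sum(growth_rate(t) for t in days) / len(days)

def change_after_event(event_day):
    """Mean growth rate 1-20 and 21-40 days after an 'event',
    each minus the mean over the 20 days before it."""
    reference = mean_growth(event_day - 20, event_day)
    return (mean_growth(event_day + 1, event_day + 21) - reference,
            mean_growth(event_day + 21, event_day + 41) - reference)

# Arbitrary event dates: early in the wave, near the inflection, and late.
results = {day: change_after_event(day) for day in (30, 50, 70, 90)}
for day, (first, second) in results.items():
    print(f"event on day {day}: change {first:+.4f} (days 1-20), {second:+.4f} (days 21-40)")
```

Wherever the event is placed, the growth rate afterward is lower than before and continues to fall, because the growth rate of a logistic curve declines at every point. A statistic of this form would report an apparent benefit for any event dated during the wave.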
There are situations, health care settings mainly or situations of extreme community prevalence with a lot of coughing and sneezing in public, where masks serve a useful purpose. Yet, people who insisted last spring that the epidemic would go away with mask mandates could not have been more wrong. Every consideration shows this.
Nearly all the masks we see people wearing are constructed to no standard, made of varying sorts of cloth, are poorly fitting, are worn with near complete disregard for effectiveness, reused who knows how many times, used for what else we know not, and are often completely open at the cheeks, nose, chin and beard. They appear mainly useful for making a person touch their face constantly.
How about experimental or observational evidence from the present pandemic? The only experimental evidence is consistent with the benefits being so small they cannot be distinguished from occurrence by chance. Probably no new experimental evidence will become available for the following reason: People have probably changed their behavior drastically during this pandemic, leading to too many confounding factors to identify the effect of just one. As the epidemic wanes, recruiting sufficient subjects for RCTs becomes difficult.
Mask mandates are not a risk-free intervention. They have a poor effect on civil society, they absorb resources, they possibly carry health risks of their own, and they certainly contribute to mistaken notions of safety and risk. Masks seem to me like a solution to a political problem, which alone should raise skepticism about all claims.
1- Gery P. Guy,Jr. et al, Association of State-issued Mask Mandates and Allowing On-Premises Dining with County-level COVID-19 Case and Death Growth Rates, https://www.cdc.gov/mmwr/volumes/70/wr/mm7010e3.htm?s_cid=mm7010e3_w, last accessed 3/8/2021.
2-Lincoln Moses and Frederick Mosteller, Experimentation: Just do it!, In Statistics and Public Policy, Bruce D. Spencer Ed., Oxford U Press, 1997.
3-Futile Fussings: A history of Graphical Failure from Cattle to #coronavirus https://wattsupwiththat.com/2020/03/31/futile-fussings, last accessed 03/13/2021.
4-Close Encounters of the Third Kind, for example.
5-I have a collection including about three dozen essays, opinion pieces, and research papers discussing the topics of social distancing, mask mandates, lockdowns, and school closures. These include contributions by Drs. Scott Atlas, John Ioannidis, Paul Alexander, Donald Henderson, Jay Bhattacharya, Sunetra Gupta, Carl Heneghan, Tom Jefferson, Martin Kulldorff, and others; and almost all of these have been ignored, scorned, or censored in some way.
6-Individual virions are mentioned as having various sizes ranging from 50 to 130 nanometers in various internet sources. Coronaviruses are pleomorphic, which means they have a variety of shapes.
7- Cloth face masks offer zero shield against virus, a study shows, Nayon Kon, The Asahi Shimbun, July 7, 2020.
8-Ali Khosronejad, et al, Fluid Dynamics simulations show that facial masks can suppress the spread of COVID-19 in indoor environments, AIP Advances 10, 125109, (2020); https://doi.org/10.1063/5.0035414;
9-Referenced in Imke Schroeder, COVID-19: A Risk Assessment Perspective, J Chem Health Saf., 2020 May 11: acs:chas.0c00035
10-Tom Jefferson, and Carl Heneghan, Masking lack of evidence with politics, Center for Evidence Based Medicine, July 23, 2020. In particular the authors note the surprisingly small number of RCTs considering the great importance of controlling respiratory disease.
11-C. Raina MacIntyre, et al, A cluster randomized trial of cloth masks compared with medical masks in healthcare workers. BMJ Open 2015;5;e006577. doi.org/10.1136/bmjopen-2014-006577. Two earlier studies conducted in China by same group found no effectiveness for medical masks.
12-By significant in this context the authors mean a 95% confidence interval that does not enclose a relative risk of infection of 1.0, but is entirely above or below 1.0.
13-Henning Bundgaard, et al, Effectiveness of adding a mask recommendation to other public health measures to prevent SARS-CoV-2 infection in Danish mask wearers, Annals of Internal Medicine, 18 November 2020. https://doi.org/10.7326/M20-6817
14-Andrew G. Letizia, et al, SARS-CoV-2 Transmission among Marine Recruits during Quarantine, N Engl J Med 2020; 383:2407-2416. DOI: 10.1056/NEJMoa2029717
15- Not finding significant protection, significant in the statistical sense, does not mean masks are completely ineffective, or counter-effective, but rather that their effect was not so large that it could be distinguished from a chance outcome at some level, usually 95%, of confidence.
16-P.E. Sartwell, The distribution of incubation periods of infectious disease, Amer. Jour. Hyg., 1950, 51:310-318. Sartwell lists coronaviruses as having a log mean of 0.4 (2.5 days) and dispersion of 1.5. However, a recent training class stated a median of 5-6 days for SARS-CoV-2. I used 5 days for purposes of producing Figure 1.
17-swprs.org/2018/10/01/covid-19-intro/ search for the English language version.
18- This panel of four German city graphs can be found at swprs.org/face-masks-evidence/ last accessed on 3/12/2021
19-This is well known, but see for example, chaamjamal, Illusory Statistical Power in Time Series Analysis, April 30, 2019, https://tambonthongchai.com/2019/40/30/illusory-statistical-power-in-time-series-analysis/ last accessed 1/18/2020
20-WSJ July 29, 2020.
21-CDC report referenced in article at The Federalist, CDC Study Finds Overwhelming Majority Of People Getting Coronavirus Wore Masks, October 12, 2020 https://thefederalist.com/2020/10/12/cdc-study-finds-overwhelming-majority-of-people-getting-coronavirus-wore-masks/
22-Payments for covid deaths, but not for others is incentive enough to bias results.
23-My attempts to learn how many cycles were being employed to report PCR results revealed that no one at any responsible agency in my state knew. All they would do is refer me to a misleading and wrong page at the supplier of the tests. However, a news item reported that, according to researchers at Wayne State University, a variety of cycle numbers are used to report results nationally, including numbers from 25 to above 37. Viral Loads In COVID-19 Infected Patients Drop, Along With Death Rate, Study Finds: Researchers find “a downward trend in the amount of virus detected.” Joseph Curl, DailyWire.com, Sep 27, 2020
24-UW to implement enhanced covid-19 testing program Monday, UW press release, Oct. 15. This release also mentions that the university expects to perform 15000 tests per week. Yet my asking questions revealed that no one seemed to know what to expect from false positive and negative results. Amazingly few people recognize that interpreting the outcomes of PCR tests is a matter of conditional probability and cannot be done reliably without other information. Even one-half of the faculty and students at Harvard medical school did not know this, according to an example from Julian L. Simon in his book “Resampling: The New Statistics,” 1997.
25-Derek K Chu, MD, et al, Physical distancing, face masks, and eye protection to prevent person to person transmission of SARS-CoV-2 and COVID-19: a systematic review and meta-analysis, The Lancet, v 395, issue 10242, p1973-1987, June 27, 2020 https://doi.org/10.1016/S0140-6736(20)31142-9
26-For example, the Center for Evidence Based Medicine (CEBM) at Oxford University objects to its social distancing conclusions.
27-The term “N95 Respirator” is ambiguous. These respirators are designed to be tight fitting, but most N95s are manufactured for construction, while there are N95s specifically manufactured to prevent disease transmission. Unfortunately the studies cited do not present a clear picture of which N95s were employed.
28-Refer to note #1 above. But in addition to my concerns listed here more were raised in Paul E. Alexander, The CDC’s Mask Mandate Study: Debunked, AIER, March 4, 2021 https://www.aier.org/article/the-cdcs-mask-mandate-study-debunked/ last accessed 3/13/2021
29-John Staddon, Scientific Research: How Science Works, Fails to Work, and Pretends to Work, Routledge, 2018, p. 124.