The United States Food and Drug Administration (FDA) for many years has been trying to increase the participation of minorities in clinical trials to help ensure that regulated products are tested and labeled in an appropriate cross-section of Americans.  Clinical evidence has shown that there are significant differences among the races that impact the safety and effectiveness we can expect from a particular drug or device, and consequently FDA has concluded testing and labeling to identify those racial differences are important.  The question for today is, how are we doing in achieving racial diversity in clinical trials involving drugs?

The data come from ClinicalTrials.gov, a repository to which sponsors of controlled clinical investigations (other than phase 1 investigations) of any FDA-regulated drug or biological product for any disease or condition must submit certain information.  This website is managed by the U.S. National Library of Medicine, with support from FDA.  Data on race come from the results section of the repository, which was launched in September 2008 to implement Section 801 of the Food and Drug Administration Amendments Act of 2007 (FDAAA).  Specifically, race data are found in the Baseline Characteristics section of the results.

When looking at the data over the last 10 years, it is important to keep in mind that the policy in this area has been evolving.  For example,

These policies on racial data reporting are on top of efforts that FDA has made to generally encourage sponsors to diversify their clinical trials.


This is an interactive chart, so if you place your cursor over the chart, and use the scroll function on your mouse, you can zoom in on the data.  For example, it is useful to zoom in on the lower portion of the chart for 2020 to see the differences more clearly in the clinical trial racial profile versus the population racial profile.


As a preliminary matter, we filtered the clinical trials data set to include only those trials where, in the oversight module, the sponsor indicated that the subject of the trial is an FDA-regulated drug.  We then filtered and sorted the data by the Results First Submitted Date.  The database includes many dates, and we picked this particular date because the rule for sharing data on race applies only to the submission of results, so the date the results are first submitted is the first date on which the obligation to share racial data would be triggered.
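For readers who want to reproduce this first step, the following is a minimal sketch of the filter and sort, not the code used for the chart.  The records and the field names (`is_fda_regulated_drug`, `results_first_submitted`) are assumptions made for illustration and may not match the repository's actual schema.

```python
from datetime import date

# Hypothetical trial records; the field names are illustrative assumptions,
# not the repository's actual schema.
trials = [
    {"id": "T1", "is_fda_regulated_drug": True, "results_first_submitted": date(2019, 3, 14)},
    {"id": "T2", "is_fda_regulated_drug": False, "results_first_submitted": date(2018, 7, 2)},
    {"id": "T3", "is_fda_regulated_drug": True, "results_first_submitted": date(2016, 11, 30)},
    {"id": "T4", "is_fda_regulated_drug": True, "results_first_submitted": None},
]


def select_drug_trials(records):
    """Keep FDA-regulated drug trials that have submitted results, oldest first."""
    kept = [
        r for r in records
        if r["is_fda_regulated_drug"] and r["results_first_submitted"] is not None
    ]
    # Sort by the date results were first submitted, since that is when the
    # obligation to share race data would first apply.
    return sorted(kept, key=lambda r: r["results_first_submitted"])


selected = select_drug_trials(trials)
print([r["id"] for r in selected])
```

Trials without a results submission date are dropped here because they cannot be placed in a calendar year; that handling is a choice made for this sketch.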

We then sorted the clinical trials by whether they included data on race in the format specified by OMB, or only in a custom format that the sponsor would have created.  In the custom format, the sponsor can essentially define their own categories of race which are different from the OMB categories.  While a sponsor may feel that their own customized approach better reflects the racial composition of the subjects in the trial, that approach frustrates efforts to aggregate the data.

In the chart above, the bars in the background show, for each calendar year, the percentage of clinical trials that submitted race data using the OMB methodology (orange), followed by the percentage that submitted data only in a customized format (blue).  The line graphs reflect only the data contained in the standardized OMB reports.  They are thus incomplete, both because they do not reflect the customized reports and because they do not reflect data that sponsors did not submit.  Notice that in later years, about 10% of sponsors are not submitting any data on race.  If a particular sponsor submitted both the standardized format and the custom format, we used the standardized data.
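The bar calculation can be sketched roughly as follows.  This is an illustration under stated assumptions: the flags `has_omb` and `has_custom` and the sample records are hypothetical, and the only rule taken from the text is that a trial reporting in both formats is counted under the standardized OMB format.

```python
from collections import Counter, defaultdict

# Hypothetical per-trial flags; made up purely to exercise the logic.
trials = [
    {"year": 2015, "has_omb": True, "has_custom": False},
    {"year": 2015, "has_omb": False, "has_custom": True},
    {"year": 2015, "has_omb": False, "has_custom": False},
    {"year": 2015, "has_omb": True, "has_custom": True},
    {"year": 2020, "has_omb": True, "has_custom": False},
    {"year": 2020, "has_omb": False, "has_custom": False},
]

FORMATS = ("omb", "custom_only", "none")


def reporting_format(trial):
    """Classify a trial; the OMB format wins when both formats are present."""
    if trial["has_omb"]:
        return "omb"
    if trial["has_custom"]:
        return "custom_only"
    return "none"


def format_shares_by_year(records):
    """Return, per year, the percentage of trials in each reporting format."""
    counts = defaultdict(Counter)
    for trial in records:
        counts[trial["year"]][reporting_format(trial)] += 1
    shares = {}
    for year, counter in counts.items():
        total = sum(counter.values())
        shares[year] = {fmt: 100.0 * counter[fmt] / total for fmt in FORMATS}
    return shares


shares = format_shares_by_year(trials)
for year in sorted(shares):
    print(year, shares[year])
```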

To give the population benchmark, we used the 2020 Census data, which follow the same OMB framework for categorizing race.  We did not provide census data for earlier years because it made the chart too busy and, frankly, the figures did not change that much over the 10-year period.
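One plausible way to turn per-trial counts into the percentages plotted as lines, and to compare them with a population benchmark, is sketched below.  The category names (`group_a` and so on) and every number are placeholders invented for this example; they are not Census figures or trial data, and pooling participants across trials is an assumption about the aggregation rather than a description of how the chart was built.

```python
# Placeholder participant counts per trial, keyed by a generic category name.
trial_counts = [
    {"group_a": 80, "group_b": 10, "group_c": 10},
    {"group_a": 60, "group_b": 30, "group_c": 10},
]

# Placeholder benchmark percentages; NOT actual Census data.
benchmark = {"group_a": 60.0, "group_b": 25.0, "group_c": 15.0}


def race_shares(counts_per_trial):
    """Pool participants across trials and return each category's percentage."""
    totals = {}
    for counts in counts_per_trial:
        for category, n in counts.items():
            totals[category] = totals.get(category, 0) + n
    grand_total = sum(totals.values())
    return {category: 100.0 * n / grand_total for category, n in totals.items()}


def gap_to_benchmark(shares, population):
    """Percentage-point difference: positive means over-represented in trials."""
    return {c: shares.get(c, 0.0) - population[c] for c in population}


shares = race_shares(trial_counts)
gaps = gap_to_benchmark(shares, benchmark)
print(shares)
print(gaps)
```

Pooling weights large trials more heavily than small ones; averaging each trial's percentages instead would give every trial equal weight and could produce different lines.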

It is important to understand that a basic premise of collecting race data is to allow a citizen to self-identify race, rather than have a race assigned to them.


Among sponsors submitting results, it seems reasonably clear that over the years more sponsors are sharing their race data.  The policy timeline above explains that significant agency announcements were made in 2014, 2016 and 2017.  It took a while to kick in, but the last four years have seen reasonably high compliance rates.

The lines reflecting participation by the different races do not, overall, exhibit a trend over the last 10 years.  We are not seeing, for example, a major decline in the percentage of clinical trial subjects who are white versus those who fit the minority definition.

Selecting 2020 for benchmarking against the overall US population, and using the interactive feature to zoom in on the lower part of the chart to see the finer differences among the various minority groups, the following summarizes clinical trial participation versus population percentage:

  • White and Asian populations have greater clinical trial participation than their overall population percentage.
  • Blacks, Native Americans, and Pacific Islanders all participate roughly in proportion to their population percentages.
  • People of more than one race, and those who do not identify with any of the choices given (therefore unknown), are both significantly lower in their clinical trial participation than their overall percentage in the US population.

While the policy changes that FDA made seem clearly to have encouraged increases in data sharing, those policy changes do not seem to have had much of an impact on the diversity of clinical trial subjects.

Attorney Bradley Merrill Thompson is the Chairman of the Board and Chief Data Scientist for EBG Advisors and a Member of the Firm at Epstein Becker Green.

The opinions expressed in this publication are those of the author.