Over the spring and summer, I did a series of posts on extracting quality information from FDA enforcement initiatives like warning letters, recalls, and inspections. But FDA enforcement actions are obviously not the only source of quality data that FDA maintains. FDA now has a massive data set of Medical Device Reports (or “MDRs”) that can be mined for quality data. Medical device companies can, in effect, learn from their competitors' experiences about what can go wrong with medical devices.
The problem, of course, is that the interesting data in MDRs is what a data scientist would call unstructured data: in this case, English-language text describing a product problem. Given the sheer volume of reports, the information and insights in that text cannot be easily extracted. In calendar year 2021, for example, FDA received almost 2 million MDRs. It just isn’t feasible for a human to read all of them.
That’s where a form of machine learning, natural language processing, or more specifically topic modeling, comes in. I used topic modeling last November for a post about major trends over the course of a decade in MDRs. Now I want to show how the same topic modeling can be used to find more specific experiences with specific types of medical devices to inform quality improvement.