This is the first post in a series about the challenges and unlocked potential in analyzing pharmaceutical data at scale. Feel free to email me or post in the comments section about topics you’d like to see covered in this series.
Today, nearly 60% of Americans over the age of 20 take at least one prescription drug. According to a recent study, 4.4 billion prescriptions were dispensed in the United States in 2015. Spending on medicines in the U.S. increased by 12.2% that year, reaching $424.8 billion (based on invoice prices). A key driver of this continued growth is the establishment of new brands, which contributed over half of the total spending growth in 2015. As the pharma industry continues to grow, so too does the amount of data: data that can and will inform the development of drugs that shape the future of medicine.
Among the numerous disparate streams of data flowing through the healthcare industry, adverse event (AE) data has the potential to make a profound impact on patient health. An AE is an untoward medical event (think side effect) that occurs while a patient is taking a drug (e.g., you took ibuprofen and got a skin rash). However, an AE report does not indicate that a given drug is responsible for the side effect. Rather, it signals that someone has reported an outcome after a patient took a drug. In other words, it notes a co-occurrence, but not necessarily causation. Patients, relatives, doctors, and nurses (or anyone, really) can report these events to pharmaceutical manufacturers or regulatory agencies like the FDA. Such events range from non-serious instances like drowsiness to more serious events resulting in hospitalization or even death. Tracking this information is critical to patient health and safety.
Beyond uncovering unknown (and potentially life-threatening) side effects of a drug, AE data can reveal troubling issues about a step in a manufacturer’s supply chain (e.g. a drug being tainted by a harmful chemical), issues with the drug’s mechanism of action, or even new adverse drug-to-drug interactions. Surfacing this information in a timely manner would enable manufacturers to change medication warnings and labelling, restrict usage, discard specific affected batches, or completely recall hazardous medications.
While AE data has long been considered a compliance resource for monitoring drug-related side effects, it also has the power to predict, identify, and prevent the dangers that pharmacological interventions can pose to patients. Analyzing this data in aggregate could produce real-world findings not revealed in clinical trials, reflecting how specific patient profiles or populations respond to certain interventions. Unfortunately, AE data is largely fragmented and unstructured, making timely reporting and holistic discovery very difficult. Four challenges stand out:
Volume: With reports numbering in the millions, it is hard to review each one effectively. This volume also delays reporting, as case managers need additional time to adjudicate and manage each case. In all this noise, it’s often difficult to pick out the true signals of side effects.
Fragmentation: Today, AEs are fragmented across the different country offices of a global pharma company, and public AE data from the FDA or similar national agencies lives in its own silos. There is no standardized schema for merging this data.
Standardization: Much like real-world health data from claims and EMRs, AE data is messy and largely unstructured. Most AEs are reported through a manual process, as doctors or patients commonly file reports over the phone or by mail. The data often consists of non-standardized descriptions of the interventions used or the side effects produced. While there are coding mechanisms to describe the type of AE (e.g., rash, stomach pains, headache), coding standards vary over time and by country. The sheer volume of data makes it difficult to normalize at a scale that “old world” data infrastructure could manage.
Duplication: AE reports are often submitted multiple times by patients, nurses, or doctors to different regulatory bodies or manufacturers. The FDA may get the same report that the pharmaceutical company gets. Perhaps the doctor calls it in to the FDA and a patient reports it to the pharma company. This leads to a massive decentralization of data both publicly and within the enterprise, making it difficult to analyze from a single source of truth.
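To make the standardization and duplication problems a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `AEReport` fields, the tiny `REACTION_SYNONYMS` map, and the exact-match key are hypothetical and do not reflect any real FDA or manufacturer schema or coding dictionary. It simply maps free-text reaction wording to one canonical term, then groups reports that appear to describe the same event even though they arrived from different sources.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical synonym map: free-text wording -> one canonical term.
# Real coding dictionaries are far larger and change over time.
REACTION_SYNONYMS = {
    "skin rash": "rash",
    "rash": "rash",
    "stomach pains": "abdominal pain",
    "stomach ache": "abdominal pain",
    "headache": "headache",
    "head ache": "headache",
}


@dataclass(frozen=True)
class AEReport:
    """Illustrative report shape; the field names are assumptions."""
    source: str        # e.g. "FDA" or "manufacturer"
    drug: str
    reaction: str      # free text as reported
    patient_age: int
    patient_sex: str
    event_date: date


def normalize_reaction(text: str) -> str:
    """Lowercase, collapse whitespace, then apply the synonym map."""
    cleaned = " ".join(text.lower().split())
    return REACTION_SYNONYMS.get(cleaned, cleaned)


def match_key(report: AEReport) -> tuple:
    """Assumed matching rule: same drug, reaction, patient, and date."""
    return (
        report.drug.strip().lower(),
        normalize_reaction(report.reaction),
        report.patient_age,
        report.patient_sex.strip().upper(),
        report.event_date,
    )


def group_duplicates(reports: list[AEReport]) -> dict[tuple, list[AEReport]]:
    """Group reports that share a match key, whatever their source."""
    groups: dict[tuple, list[AEReport]] = {}
    for report in reports:
        groups.setdefault(match_key(report), []).append(report)
    return groups


if __name__ == "__main__":
    sample = [
        AEReport("FDA", "Ibuprofen", "Skin rash", 54, "F", date(2015, 3, 2)),
        AEReport("manufacturer", " ibuprofen", "rash", 54, "f", date(2015, 3, 2)),
        AEReport("FDA", "ibuprofen", "Stomach pains", 31, "M", date(2015, 4, 9)),
    ]
    for key, members in group_duplicates(sample).items():
        print(key[0], key[1], sorted(r.source for r in members))
```

An exact-match key like this is deliberately crude. Real reports may disagree on dates, ages, or spelling, so any practical approach would likely need fuzzier matching and human review of the candidate groups.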
So, where do we go from here? In the next post we’ll walk through how manufacturers and regulators can normalize and manage data to gain a more holistic view of the drug development and patient health landscapes to drive proactive decision making.