Largest Opioid Dataset of its Kind in US History

The first-ever “The Opioid Hackathon” demonstrated a new, rapid-turnaround approach to solving the opioid crisis through citizen engagement and led to 20 potential solutions in fewer than 24 hours.

On October 14th-15th, 2018, teams of computer and data scientists, public health officials, researchers, and patients and families affected by the opioid crisis traveled from across the country to compete in finding software and big data-based solutions to the opioid crisis at the University of California Institute for Prediction Technology’s (UCIPT’s) “The Opioid Hackathon.”

Hackathon participants received the largest opioid dataset of its kind in US history, with more than 150 opioid-related datasets. Some of them are publicly available and include links; others are proprietary (e.g., provided by companies under non-disclosure agreements), and for those we have listed only the name of the dataset. For all of you data nerds out there like me, I’m putting up the list of datasets for you to enjoy!
An HHS open data resource providing 254 health-related datasets, some of which might help inform modeling/visualization solutions to the opioid crisis.


For this event, we partnered with Socrata to give hackathon participants special access to an API covering more than 120 opioid-related datasets, the largest dataset on opioids of its kind in history. All teams should have received a link explaining how to access the data. For assistance/developer help, see
(We’re awaiting a response on whether we can share these datasets with the general public.)
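For readers who have not used a Socrata-hosted API before, here is a minimal sketch of how a query URL is typically put together with Socrata's SODA conventions (a `/resource/<dataset id>.json` path plus `$limit` and `$where` parameters). The domain `data.example.gov`, the dataset id `abcd-1234`, and the `year` column are placeholders I made up for illustration; they are not the hackathon's actual endpoint, which has not been published here. The code only builds the URL and does not send a request.

```python
from urllib.parse import urlencode


def build_soda_url(domain, dataset_id, where=None, limit=100):
    """Build a SODA-style query URL for a Socrata-hosted dataset.

    Nothing is sent over the network; this only assembles the string.
    """
    params = {"$limit": limit}
    if where:
        params["$where"] = where
    # Keep "$" unescaped so the parameter names stay readable.
    query = urlencode(params, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


# Placeholder domain, dataset id, and column name (assumptions, not real).
url = build_soda_url("data.example.gov", "abcd-1234", where="year = 2017", limit=5)
print(url)
```

The resulting URL could then be fetched with any HTTP client, adding whatever access token the data provider issues.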

California Open Data Portal


The California Open Data portal provides hundreds of open datasets on California, including 334 datasets related to health and human services.

California Opioid Overdose Surveillance Dashboard

This site is the result of ongoing collaboration between the California Department of Public Health (CDPH), Office of Statewide Health Planning and Development (OSHPD), Department of Justice, and the California Health Care Foundation. The goal is to provide a data tool with enhanced data visualization and integration of statewide and geographically-specific non-fatal and fatal opioid-involved overdose and opioid prescription data.


OpenJustice is an open source data portal provided by the California Department of Justice, including datasets that may help inform opioid-related solutions, such as drug-related arrests.


The California Department of Justice manages the data collected under California Health and Safety Code section 11165(d), which requires dispensing pharmacies, clinics, or other dispensers to report prescriptions dispensed for Schedule II, Schedule III, or Schedule IV controlled substances to the Department of Justice (DOJ).

Drug Enforcement Administration Datasets


The DEA provides data on a variety of drug-related activities, including drug labs in the US and drug seizure data.

Medi-Cal Drug Utilization Data

Medi-Cal drug utilization data is available for fee-for-service outpatient drugs reimbursed on or after July 1, 1996, by Medi-Cal to pharmacies. The data are available in ASCII fixed-length record format only.
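If you have not worked with fixed-length records before, the idea is that each field sits at a known column range in every line, so you parse by slicing rather than by splitting on a delimiter. Below is a minimal sketch of that technique. The field names, column positions, and sample line are all invented for illustration; the real Medi-Cal layout comes from the documentation that accompanies the files.

```python
# Hypothetical record layout: (field name, start, end), 0-based, end exclusive.
# The real Medi-Cal layout is defined in the file's own documentation.
LAYOUT = [
    ("drug_code", 0, 11),
    ("period", 11, 16),
    ("units", 16, 24),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-length line into a dict of stripped string fields."""
    return {name: line[start:end].strip() for name, start, end in layout}


# Build a made-up 24-character sample line so the column widths are explicit.
sample = "00406012301" + "20171" + "1250".rjust(8)
record = parse_record(sample)
record["units"] = int(record["units"])  # numeric fields arrive as padded text
print(record)
```

For a full file, the same function can be applied line by line; pandas users may prefer `pandas.read_fwf`, which takes the column ranges directly.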

Google Dataset Search

Google has recently launched a tool to help researchers find datasets. This tool is in Beta.

Inter-university Consortium for Political and Social Research (ICPSR)

ICPSR maintains a data archive of more than 250,000 files of research in the social and behavioral sciences. It hosts 21 specialized collections of data in education, aging, criminal justice, substance abuse, terrorism, and other fields, including treatment episode data.

The State of the USA

Provides data on overall health and economic status at the individual community level.

Postmarket Drug Surveillance Programs

Produced by the FDA; shows reports of adverse drug effects.

The Office of Women’s Health – Quick Health Data Online

Health indicators with a focus on women (e.g., mammogram rates).

Behavioral Risk Factor Surveillance System (CDC)

Survey of behavioral factors

Cannabis Dispensary and Sales Information from BDS Analytics
Dataset available here. (We’re awaiting a response on whether we can hyperlink to share this dataset with the general public.)

Headquartered in Boulder, Colo., BDS Analytics provides businesses with comprehensive, actionable, and accurate cannabis market intelligence and consumer research. The company provides a holistic understanding of the cannabis market by producing insights from dispensary point-of-sale systems through its market-leading GreenEdge™ platform, driving consumer research with its Cannabis Insights Group, and generating market-wide cannabis industry financial projections. To learn more about how you can utilize BDS Analytics’ comprehensive market research, please visit the company’s website. BDS Analytics data is sourced from dispensary point-of-sale reporting. Through their relationships with hundreds of dispensaries across the country, BDS Analytics captures detailed daily sales information, cleanses and standardizes data to the individual product level, and projects sales for defined markets in multiple states.

Cannabis and Integrative Health Survey Data from

Dataset available here. (We’re awaiting a response on whether we can hyperlink to share this dataset with the general public.)

Healer is a trusted, doctor-developed medical cannabis brand and provider of internationally acclaimed educational content. Founded to address the challenge of educating patients on how to best use cannabis, Healer’s educational materials are based on the work of leading cannabis clinician Dr. Dustin Sulak, D.O., an expert and educator on medical cannabis and pioneer of clinical applications. The data here are from a survey sent out to Dr. Sulak's medical practices in 2016 to research opioid use among cannabis patients, and include opioid patients, cannabis patients, and patients who use both.

Sean Young PhD

Sean Young, PhD, MS, is the Executive Director of the UCLA Center for Digital Behavior. I'm a scientist, innovator, and UCLA medical school professor. I study the science behind human digital behavior (see for more info about this field of research). I also assemble technology teams and solutions to improve UCLA Family Medicine patient care. For more info or to contact me: