Submission

introduction
title
CURE AI: a tool for Rapid Pandemic Response
short description
CURE AI facilitates rapid pandemic response through AI-assisted analysis of pandemic clinical trial data to accelerate therapy selection.
Phase 1 Submission Form
Overview / Abstract

We developed a new machine learning platform called CURE AI (Clinical trials Uncovering Real Efficacy Artificial Intelligence). The CURE AI foundation model uses a deep learning architecture to understand clinicogenomic relationships from clinical trials. We have successfully applied CURE AI to over 12 different clinical trials (shortening clinical trial success detection by 6-18 months), but we have not tested the platform with infectious diseases.

Our proposal is to test how well CURE AI can be trained on COVID-19 patient-level data and then used to predict therapeutic benefit in three completed COVID-19 clinical trials. We will identify, share, and publish (1) clinical variables that predict patient response, (2) patient subsets who benefit from trial therapy, and (3) whether our Accelerated Approval Modeling detects clinical trial success earlier. If successful, CURE AI will function as a rapid pandemic response tool that could be deployed during the next pandemic.

Secondary Analysis: Research Aims

CURE AI is a deep learning technology developed by our team. It consists of two main parts: (1) a base neural architecture trained on a large cohort of clinicogenomic datasets (>100,000 patients) and (2) a finetuning step in which the foundation model is trained on COVID-19 patient-level data to learn clinical patterns that are predictive of patient outcome (risk of lower respiratory tract infection, hospitalization, or mortality).

We will first train CURE AI with patient data from the COLCORONA randomized clinical trial (doi.org/10.25934/00007133). This dataset includes over 4,500 patients, half of whom received colchicine and half placebo. Endpoints included mortality, hospitalization, and pneumonia due to COVID-19. Finetuning CURE AI on a large COVID-19 dataset will optimize how the model selects the clinical variables (gender, race, age, smoking status, BMI, other medical comorbidities) that are most predictive of clinical outcome for patients with COVID-19. Our model, trained on COVID-19 data, will be called CURE COVID.

Next, we will use CURE COVID to perform a secondary analysis of three separate randomized clinical trials, which we will refer to as the Losartan Trial (n=580, doi.org/10.25934/00007231), the hydroxychloroquine post-exposure prophylaxis trial (HPE Trial, n=829, doi.org/10.25934/00006837), and the hydroxychloroquine-azithromycin trial (HA Trial, n=289, doi.org/10.25934/00007132). For each of these trials, we will (1) identify clinical variables that predict patient response, (2) identify patient subsets who benefit from each trial therapy, and (3) assess whether CURE COVID can detect clinical trial success earlier through our Accelerated Approval Modeling. To do this, we model clinical trial significance based on simulated enrollment of a CURE COVID-enriched trial population. Finally, we will pool data and perform a meta-analysis to identify any additional trends.
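As a rough illustration of the idea behind step (3), the sketch below checks significance at repeated interim looks as patients enroll and reports the first look at which the treatment effect reaches p < 0.05. It is not the CURE COVID implementation. The function names, the pooled two-proportion z-test, the look interval, and the synthetic enrollment data are all assumptions made for this example.

```python
import math


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for a difference in event rates (pooled z-test)."""
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation in outcomes, so no evidence of a difference
    z = (events_a / n_a - events_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def first_significant_look(patients, look_every=100, alpha=0.05):
    """Return the enrollment count at which p first drops below alpha, or None.

    `patients` is a list of (arm, event) tuples in enrollment order, where
    arm is "treatment" or "control" and event is 1 if the endpoint occurred.
    """
    for n in range(look_every, len(patients) + 1, look_every):
        enrolled = patients[:n]
        treated = [e for arm, e in enrolled if arm == "treatment"]
        control = [e for arm, e in enrolled if arm == "control"]
        if not treated or not control:
            continue
        p = two_proportion_p_value(sum(treated), len(treated),
                                   sum(control), len(control))
        if p < alpha:
            return n
    return None


def toy_trial(n, treat_event_every, control_event_every):
    """Synthetic, deterministic enrollment sequence (hypothetical, not trial data)."""
    patients = []
    for i in range(n):
        k = i // 2  # index of this patient within their arm
        if i % 2 == 0:
            patients.append(("treatment", int(k % treat_event_every == 0)))
        else:
            patients.append(("control", int(k % control_event_every == 0)))
    return patients


# Unselected population: small treatment effect (about 17% vs 20% event rate).
unselected = toy_trial(400, treat_event_every=6, control_event_every=5)
# Hypothetical enriched population: larger effect (5% vs 20% event rate).
enriched = toy_trial(400, treat_event_every=20, control_event_every=5)

print(first_significant_look(unselected))
print(first_significant_look(enriched))
```

In this toy setup the enriched population reaches significance at an earlier look than the unselected one. Repeated interim looks inflate the false-positive rate, so a real analysis would need a correction such as an alpha-spending rule, which this sketch omits.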

COVID-19 trial data are of sufficient quality and completeness to be handled by CURE AI, which we designed to tolerate incomplete patient records. We have successfully applied CURE AI to over 12 different clinical trials and have detected trial significance 6-18 months earlier, even with incomplete data. A recent use case implementation of CURE AI, including methods, can be reviewed at docsend.com/v/vcf4p/dw24 and docsend.com/v/vcf4p/dw24methods.

Results from the analysis of all GREI data will be made publicly available. Raw patient data will remain available only through Vivli. Data will be utilized in an ethical and non-discriminatory manner.

Our proposal is ambitious. However, we are able to consistently complete clinical trial analyses with the same methodology in 1 month. Therefore, we believe that our timeline is realistic.

 

Timeline (by month):

1: Train model on COLCORONA Trial

2: Analyze Losartan Trial 

3: Analyze HPE Trial

4: Analyze HA Trial

5-6: Meta-analysis & any necessary re-analyses. Prepare data for public presentations.
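The pooling method for the meta-analysis in months 5-6 is not specified here. As one common possibility, the sketch below pools per-trial log risk ratios with fixed-effect inverse-variance weighting. The function names and the choice of a fixed-effect model are assumptions made for illustration, not the method the team has committed to.

```python
import math


def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Log risk ratio (treatment vs control) and its standard error for one trial."""
    lrr = math.log((events_t / n_t) / (events_c / n_c))
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    return lrr, se


def pooled_fixed_effect(estimates):
    """Inverse-variance weighted mean of (estimate, standard_error) pairs."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se
```

Each trial contributes in proportion to the precision of its estimate, so larger trials carry more weight. If the trials differ substantially in population or intervention, a random-effects model may be more appropriate.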

GREI Repository Data Sets
Vivli
DOI (Digital Object Identifier) of GREI Repository Dataset
We have a signed Data Use Agreement with Vivli (9/12/2024) and will analyze and publish our results on the analysis of the following datasets:
https://doi.org/10.25934/PR00007133.0
https://doi.org/10.25934/00006837.0
https://doi.org/10.25934/PR00007231.1
https://doi.org/10.25934/PR00007132.0
Outcomes and Outputs

CURE COVID will generate a list of clinical variables, combinations of which define the patient subgroups that benefit from trial therapies. We represent these data as t-SNE plots, bar graphs of patient subgroups, and tabulated data with statistical analyses. We also present our findings visually through the CURE Curve and Accelerated Approval Modeling plots. These are demonstrated in the supplemental documents.

We will share our research findings on medRxiv.org, in academic medical journals, and at scientific conferences such as ICID. Our results will be transparent and publicly available for download at standard data repositories. We also plan to publish our research findings on our website, numenos.ai.

Our team has experience with prioritizing FAIR principles. The FAIR data principles - Findability, Accessibility, Interoperability, and Reusability - play a crucial role in the effective use of AI-analyzed clinical data. By adhering to these principles, we will enhance the transparency and robustness of clinical research and serve as role models for the use of AI in clinical trials. Our data will be easy to find and access, will use persistent identifiers and standard, intuitive formatting and indexing, and will remain straightforward for independent teams to reuse and validate.

We acknowledge that replicability and reproducibility are critical challenges in the field of artificial intelligence. We have integrated measures to minimize the risk of overfitting and bias within CURE AI, including internal trial validation using held-out validation data, a permutation test to assess our model on ‘fake’ outcome-randomized data, and a randomized enrollment test (part of the CURE Curve), as described in our supporting documents.
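To make the permutation test concrete, the sketch below shuffles the outcome labels many times and counts how often a score computed on the shuffled data matches or beats the score on the real data. It is a generic illustration rather than the CURE AI procedure. `permutation_p_value`, `rate_gap`, and the toy data are hypothetical, and `rate_gap` stands in for whatever performance metric the model reports.

```python
import random


def permutation_p_value(features, outcomes, score_fn, n_permutations=1000, seed=0):
    """Fraction of outcome shuffles that score at least as well as the real data."""
    rng = random.Random(seed)
    observed = score_fn(features, outcomes)
    shuffled = list(outcomes)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between features and outcomes
        if score_fn(features, shuffled) >= observed:
            at_least_as_good += 1
    # The add-one correction keeps the estimate from being exactly zero.
    return (at_least_as_good + 1) / (n_permutations + 1)


def rate_gap(features, outcomes):
    """Stand-in metric: absolute gap in outcome rate between patients
    with and without a binary feature."""
    with_f = [y for x, y in zip(features, outcomes) if x]
    without = [y for x, y in zip(features, outcomes) if not x]
    return abs(sum(with_f) / len(with_f) - sum(without) / len(without))


# Toy data: 20 patients with the feature, 20 without.
features = [1] * 20 + [0] * 20
linked = [1] * 20 + [0] * 20   # outcome fully determined by the feature
unlinked = [1, 0] * 20         # outcome rate identical in both groups

print(permutation_p_value(features, linked, rate_gap))
print(permutation_p_value(features, unlinked, rate_gap))
```

A small p-value means shuffled outcomes rarely reproduce the observed score, which argues against the score being an artifact of overfitting. A value near 1 means the score is no better than chance.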

In addition to our internal validation tests, we provide replicability through analysis of 3 independent COVID-19 clinical trials, where we would expect to identify similar patterns. If not, it will be difficult to conclude that we found a replicable result. If we advance to Phase 2, we will iterate the training process to train CURE AI on each test trial (the Losartan Trial, HPE Trial, and HA Trial), and then test on the three other trials. This will provide additional evidence that our models are internally valid.
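The Phase 2 rotation described above can be written out as a simple schedule, in which each test trial takes a turn as the training set and the remaining three trials serve as test sets. The snippet below only enumerates that schedule. The function name is hypothetical and the trial labels are the shorthand used in this proposal.

```python
TRIALS = ["COLCORONA", "Losartan", "HPE", "HA"]


def rotation_schedule(trials, train_on):
    """Pair each training trial with the list of all other trials to test on."""
    return [(train, [t for t in trials if t != train]) for train in train_on]


for train, tests in rotation_schedule(TRIALS, ["Losartan", "HPE", "HA"]):
    print(f"train on {train}; test on {', '.join(tests)}")
```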

We can also address reproducibility, since we will be publishing our research findings. First, we will assess if our findings of clinical relationships that predict response are consistent with those published in the literature. If we find novel clinical parameters that predict outcome, then we will encourage and assist other researchers with validating our findings through traditional clinical research methodologies. Lastly, we, or others, can take our findings and apply them directly to additional COVID-19 datasets to see if our findings could have been used to shorten the time to significance of COVID-19 clinical trials.

Impact/ Scientific Significance

The CURE AI foundation model is an innovative example of how complex neural networks can be applied to clinical trials. As part of the 2024 DataWorks! Challenge, we have been able to access important COVID-19 clinical trial datasets. As part of this challenge, we hope to provide evidence that AI can be effectively applied to clinical trials to guide more rapid identification of treatments that provide real clinical benefit. 

As pioneers in this field, we are eager to lead the way with implementation of the best practices for data utilization, re-utilization, and re-analyses including open collaborations, transparency, and data sharing. 

When the next pandemic occurs, we hope to deploy the CURE COVID architecture to rapidly detect which therapies - studied in clinical trials - show a true effect. Through this challenge, we anticipate that we will demonstrate that CURE COVID can identify patient subsets who do and who do not benefit from specific COVID-19 interventions. This will provide strong evidence for the use of AI during the next pandemic.

The most important aspect of this technology is our implementation of Accelerated Approval Modeling, in which we can detect early trial success. In 2021, there were over 460,000 deaths from COVID-19 in the US (CDC data). When we encounter another pandemic, the impact that we could have from earlier implementation of effective treatments - even by 3 months - is immense. 

To implement CURE COVID during the next pandemic, we hope to partner with clinical research organizations conducting clinical trials. We will be able to provide interim trial analyses and guide adaptive trial design to quickly sort through therapies that are more or less effective. Through this, we will also be able to identify subpopulations who may receive outsize benefit from certain therapies over others.

In the near future, AI may enable ‘virtual control arms’ for trials, since standard control arms raise growing ethical concerns when they withhold a treatment that is predicted to provide a large benefit over standard of care. Regulatory bodies will need to collaborate with academic, biotechnology, and patient advocacy leaders to ensure safe, transparent, and straightforward implementation of technologies like CURE AI, as the clinical translation of transformative patient treatments could be rapidly expedited with the assistance of these technologies. We are optimistic that special implementations of AI, such as CURE COVID for rapid pandemic response, will demonstrate consistent successes and lead to a revolution in healthcare, including a major step toward curtailing the next pandemic.

Team

Vitalay Fomin and Amit Weiss co-founded Numenos and have worked together for 2 years. Vitalay and Neil Pfister have worked together for 10 years. Neil and Amit have worked together for 16 months.

 

Vitalay Fomin, PhD from Columbia University. Has a strong background in data science, statistics and biomarker discovery, having worked at Roche/Genentech to develop and apply advanced machine learning architectures to optimize trial design. He had a senior role at Otsuka Pharmaceuticals, leading translational medicine efforts in cardiorenal and neuropsychiatric disorders. Has extensive experience in clinical trial statistics and designing, testing and implementing machine learning models.

 

Amit Weiss, MSc from the Hebrew University. Has developed machine learning algorithms since 2011. Amit founded and led Israel's Ministry of Defense AI R&D group in multiple domains including voice, seismic, radar, tabular data, time series and text. He transitioned to clinical trial and biomarker machine learning as co-founder of Numenos. Has extensive statistical background through real-world implementation of AI models and has published work on COVID-19. 
 

Neil Pfister, MD/PhD from Columbia University. Runs an NIH-funded research lab and has published clinical work in infectious disease research (HIV). Neil is the head of the AI in Precision Medicine Research Group and head of GI oncology in his department at the University of Alabama at Birmingham. Neil directs the research mission of Numenos.

 

Considerations

Advanced Computational Infrastructure: Cutting-edge computational resources to power data-intensive analyses.

Domain Expertise: Deep knowledge in clinical science, data science/statistics, and biomarker discovery.

Proven Track Record in COVID-19 Data Analysis: Extensive experience analyzing COVID-19 data, as evidenced by published work with Amit Weiss.

Clinical Trial Data Expertise: In-depth understanding of clinical trial data, including potential pitfalls and limitations.

FAIR Data Principles Implementation: Significant experience (at Roche/Genentech) in implementing FAIR (Findable, Accessible, Interoperable, Reusable) data principles, enabling the development of accessible and reproducible solutions.

Democratizing Machine Learning: Expertise in making complex machine learning approaches and their results accessible to a broader scientific community.

High-Performing Team: A cohesive team with a proven history of delivering high-quality results in short timelines.

Supporting Documents
Provide up to 10 resources for the evaluation of your secondary research project including but not limited to:
● The persistent identifier of the dataset(s), other than GREI dataset DOIs already listed above, to be used in the proposed project (where available)
● Tools/workflows or resources to be utilized in the proposed project
● Relevant references or scientific publications that directly relate to the proposed project
Non Scored Criteria
Please complete this information. It will not be scored by the evaluation panel.
Entity Participation
Participate as an Entity (i.e., registering as a group of individuals competing together on behalf of a legally established organization, institution, or corporation)
Legal Entity Organization Name
Company name: Numenos
Address: 1007 N. Orange St., 10th Fl., Wilmington, Delaware, 19801
Research Discipline (non-scored criteria)
Biomarker Discovery
Multi-omics data analysis
Artificial Intelligence
Infectious Diseases
IDeA State (non-scored criteria)
No
All Team Member Information - Name, Organization, Job Title, and Email address
Vitalay Fomin, PhD
Co-founder & CEO at Numenos.
vitalay.fomin@numenos.ai

Amit Weiss, MSc
Co-founder & CTO of Numenos
amit.weiss@numenos.ai

Neil Pfister, MD/PhD
Global Director of Medical Strategy at Numenos
neil.pfister@numenos.ai

Entity Website: Numenos.ai
MSI (non-scored criteria)
No
Participation in prior DataWorks! Prizes (non-scored criteria)
No
Team Point of Contact Eligibility
Yes
Eligibility (non-scored criteria)
Yes, I confirm that I have read and meet the terms of eligibility for this challenge