SandboxAQ Launches Dataset for Training AI Models in Drug Discovery

SandboxAQ has launched a dataset designed to help researchers advance artificial intelligence (AI) models in drug discovery.

    The SAIR (Structurally Augmented IC50 Repository) is a detailed dataset of protein-ligand pairs with annotated experimental potency data designed to enhance the speed and accuracy of binding affinity predictions, the company said in a Wednesday (June 18) press release.

    SAIR includes 5.2 million synthetic 3D molecular structures across 1 million protein-ligand systems, according to the release.

    The SAIR dataset was generated with the use of SandboxAQ’s AI large quantitative model (LQM) capabilities and Nvidia’s development platform for AI training and fine-tuning, DGX Cloud, the release said.

    With this dataset, researchers can train AI models to accurately predict protein-ligand binding affinities at least 1,000 times faster than traditional physics-based methods, per the release.

    “This achievement marks a pivotal moment in drug discovery, demonstrating our capacity to fundamentally transform the traditional trial-and-error process into a rapid, data-driven approach,” Nadia Harhen, general manager of AI simulation at SandboxAQ, said in the release. “By putting five-plus million, affinity-labeled protein-ligand structures into the public domain, we’re handing every scientist the raw fuel to train breakthrough models overnight, setting a new pace for drug discovery.”

    SandboxAQ said in April that it raised over $450 million in a Series E round to support its development of large quantitative models that help enterprises leverage AI to solve scientific and quantitative challenges.

    The company’s newest investors participating in that round included Nvidia, Google, BNP Paribas, Horizon Kinetics and Ray Dalio, the founder of Bridgewater Associates.

    Since spinning out from Alphabet in 2022, SandboxAQ has raised over $950 million.

    The merger of AI with quantum computing could have significant implications for many verticals, Chris Hume, senior director of business operations for SandboxAQ, told PYMNTS in an interview posted in February 2024.

    “The physical world is defined by quantum mechanics,” Hume said. “The more effectively we can understand those interactions and then model those interactions, the more efficiently and effectively you can build predictive models.”

    PYMNTS reported in October that the wave of AI breakthroughs in the medical field was reflected in financial markets, where HealthTech stocks rose 12% in 2024 and AI healthcare companies commanded valuations up to five times higher than their non-AI counterparts.