Recently, there has been a flurry of announcements from AI-powered biotech companies about the potential of large language models for early drug discovery. In the second part of a three-part series, Dr. Raminderpal Singh provides an example of using ChatGPT, which shows how accessible large language models are to lab scientists.
In our previous article, we summarized the role and challenges of LLMs in early drug discovery. In this article, we provide a simple case study to download and practice using ChatGPT or other accessible LLM systems. This illustrates the power that LLMs have to improve the daily tasks of scientists, despite their caveats and challenges. You can download all the source files for your own use.1- (See simple GPT exerciseThank you Nina Trotter.2 To support her in building this example. The example should work with any LLM program but has been tested with ChatGPT.3
About the example
- Example objective: Use the metrics extracted from 10 research papers on acarbose-treated rats to refine the recommendations made from the initial study results.
- Key outputs required from the example: Recommend dose, participants, and measurements based on the results of the initial study.4.5 Articles on acarbose-treated mice, with supporting data points.
- Challenges I faced when implementing the example: Creating prompts to accurately extract information to support recommendations, and accurately describing the content of multiple files and sheets.
It is important to be aware that publicly accessible LLM systems often share the input you provide, so it is recommended that you do not enter confidential information.
To help ChatGPT provide useful insights, there needs to be some “claims engineering.” This is a technical term for best practices in how to write claims. For example, the first claim in this example is just meant to provide background and context for ChatGPT:
“You are a drug discovery scientist looking to make decisions about dosage, participants, and measurements when administering a current diabetes drug to age-related diseases. You have experimental results from a study in mice showing the effects of acarbose on human lifespan, body weight, body composition, fat pads, glucose, grip strength, grip duration, rotation, and pathology. You also have several relevant scientific publications from studies investigating the effects of acarbose on various measurements in mice. You now want to interrogate your study results (in Excel files and images) and publications separately for insights, and then together to come up with the best set of recommendations for your colleagues looking to conduct early-stage clinical trials using acarbose in age-related diseases. To do this, you will now process a series of specific ChatGPT prompts entered by the user.”
The screenshot below shows the results from the last prompt. There are some nuances that ChatGPT didn’t pick up. For example, in female mice, lifespan isn’t extended as much compared to male mice, but their physical measurements are improved. Improved prompts will help generate more accurate results.
Photo of Dr. Raminderpal Singh, showing rapid results.
Please comment below to share your findings from the example. Let us know if you were able to improve the results, and if so, how?
The next article in this series, published on Monday 24 July, will discuss the key challenges in effectively using the LLM for early drug discovery, and offer some practical approaches to addressing them.
References
1 Reading. HitchhikersAI.org. Available at: https://www.hitchhikersai.org/reading
2 Nina Trotter. LinkedIn. Available at: https://www.linkedin.com/in/nina-trotter/
3 ChatGPT. Available at: https://chatgpt.com/
4 Alaves S, and othersAcarbose improves health and lifespan in aged HET3 mice. aging cell. 18(2) (April 2019). Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6413665/
5 Harrison D, and othersITP: Intervention Testing Program: Effect of Different Therapies on Human Lifespan and Related Phenotypes in Genetically Heterogeneous (UM-HET3) Mice (2004-2023). Mouse Phenomenon DatabaseAvailable at: https://phenome.jax.org/projects/ITP1
About the author
Dr. Raminderpal Singh
Dr. Raminderpal Singh is a recognized key opinion leader in the biotechnology industry. He has over 30 years of global experience leading and advising teams on building cost-effective, high-value computational modeling systems. His passion is helping early- and mid-stage life science companies achieve novel biological breakthroughs through the effective use of computational modeling.
Raminderpal currently leads the open source community HitchhikersAI.org, accelerating the adoption of AI technologies in early stage drug discovery. He is also the CEO and co-founder of Incubate Bio, a biotechnology company that serves life science companies looking to accelerate their research and reduce lab costs through computational modeling.
Raminderpal has extensive experience building companies in both Europe and the United States. As a business executive at IBM Research in New York, Dr. Singh led the market launch of IBM Watson Genomics Analytics. He was also Vice President and Head of Microbiome at Eagle Genomics Ltd, Cambridge. Raminderpal received his PhD in semiconductor modeling in 1997. He has published numerous research papers, two books, and has twelve issued patents. In 2003, he was named by EE Times as one of the 13 most influential people in the semiconductor industry.
For more: http://raminderpalsingh.com ; http://hitchhikersAI.org ; http://incubate.bio