The Virtual Lab: AI-human Collaboration in Medical Research


The Virtual Lab facilitates interdisciplinary research through collaboration between AI agents and human researchers. Initially, the human researcher defines two primary agents—a Principal Investigator (PI) and a Scientific Critic. The PI agent automatically assembles a specialized team of scientific agents tailored to the specific research topic. Research in the Virtual Lab occurs through two types of meetings: team meetings and individual meetings. In both cases, the human researcher provides an initial agenda to guide the discussion, and then the agents discuss how to address the agenda. After running multiple team or individual meetings in parallel, the PI agent conducts a final individual aggregation meeting. With assistance from the human researcher, the PI synthesizes the previously generated tool scripts and summarizes the outcomes of earlier discussions, constructing a comprehensive workflow tailored to the initial research topic. Finally, the human researcher leverages this finalized workflow, employing the generated tools to complete the research project.

Figure: Agents collaborating inside the Virtual Lab.

a, The workflow for designing agents in the Virtual Lab. Each agent is specified with four criteria: Title, Expertise, Goal and Role. The human researcher in the Virtual Lab specifies these criteria to define the PI agent and the Scientific Critic agent. Then, given a short description of the project by the human researcher, the PI agent automatically creates several scientist agents to work on the project by specifying their Title, Expertise, Goal and Role, using its own prompt as an example.

b, The workflow for a team meeting in the Virtual Lab. The human researcher writes an agenda for the meeting, specifying the topic of discussion. The PI agent begins the meeting by providing initial thoughts and agenda questions as a guide for the remaining agents. Then, over the course of N rounds of discussion, each scientist agent provides its response, followed by a critique by the Scientific Critic agent, with the PI agent then synthesizing the discussion and asking follow-up questions. Finally, after the N rounds of discussion, the PI agent summarizes the discussion and provides an answer regarding the meeting agenda.

c, The workflow for an individual meeting. The human researcher writes an agenda for the meeting specifying the topic of discussion. Then, the scientist agent tasked with the individual meeting provides a response to the agenda, which is critiqued by the Scientific Critic. In each round, the scientist agent improves its answer based on feedback from the Scientific Critic. Finally, after the N rounds, the scientist agent provides its final, improved answer.
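The parallel-then-aggregate pattern from the overview (several meetings on the same agenda, followed by a final aggregation meeting led by the PI) can be sketched roughly as follows. This is a minimal illustration, not the Virtual Lab's actual code: `Meeting`, `run_parallel_meetings`, `build_aggregation_agenda`, and the wording of the merge task are all hypothetical, and `fake_meeting` stands in for a real LLM-backed meeting.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical stand-in for a real meeting: takes an agenda, returns a summary.
Meeting = Callable[[str], str]


def run_parallel_meetings(meeting: Meeting, agenda: str, n_runs: int) -> List[str]:
    """Run the same agenda n_runs times in parallel and collect the summaries."""
    with ThreadPoolExecutor(max_workers=n_runs) as pool:
        return list(pool.map(meeting, [agenda] * n_runs))


def build_aggregation_agenda(agenda: str, summaries: List[str]) -> str:
    """Compose the agenda for the PI's final individual aggregation meeting."""
    numbered = "\n\n".join(
        f"[Meeting {i}]\n{summary}" for i, summary in enumerate(summaries, start=1)
    )
    return (
        f"Original agenda: {agenda}\n\n"
        f"Summaries of {len(summaries)} parallel meetings:\n\n{numbered}\n\n"
        # Assumed task wording, for illustration only.
        "Task: Merge the best components of these summaries into a single answer."
    )


def fake_meeting(agenda: str) -> str:
    """Placeholder meeting that returns a canned summary instead of calling an LLM."""
    return f"Summary for: {agenda}"


summaries = run_parallel_meetings(fake_meeting, "Select computational tools", n_runs=3)
aggregation_agenda = build_aggregation_agenda("Select computational tools", summaries)
print(aggregation_agenda)
```

Threads are used here on the assumption that a real meeting is dominated by waiting on network calls to a language model; the aggregation agenda would then be passed to an individual meeting with the PI agent.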

We will now use examples and prompt templates to illustrate how the Virtual Lab leverages role-play, reflection, cooperation, and tool use to guide AI–human collaboration in research.

Role-play

The Virtual Lab introduces two key roles: a Principal Investigator (PI) agent acting as the primary coordinator, and a Scientific Critic agent that provides critical oversight and identifies potential risks. Together, they ensure high-quality decision-making and maintain workflow transparency. Below we use the PI agent as an example to illustrate the key elements of an agent role:

Title: Principal Investigator

Expertise: Running a science research lab.

Goal: Perform research in your area of expertise that maximizes the scientific impact of the work.

Role: Lead a team of experts to solve an important scientific problem, make key decisions about the project direction based on team member input, and manage the project timeline and resources.
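One way to picture the four-field agent definition is as a small record that renders into a system prompt. The sketch below is an assumption-laden illustration: the `AgentSpec` class and the exact sentence template in `system_prompt` are hypothetical and are not taken from the Virtual Lab code; only the four field names and the PI's field values come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical container for the four criteria that define an agent."""

    title: str
    expertise: str
    goal: str
    role: str

    def system_prompt(self) -> str:
        # Assumed prompt template; the real wording may differ.
        return (
            f"You are a {self.title}. "
            f"Your expertise is in {self.expertise}. "
            f"Your goal is to {self.goal}. "
            f"Your role is to {self.role}."
        )


pi = AgentSpec(
    title="Principal Investigator",
    expertise="running a science research lab",
    goal=(
        "perform research in your area of expertise that maximizes "
        "the scientific impact of the work"
    ),
    role=(
        "lead a team of experts to solve an important scientific problem, "
        "make key decisions about the project direction based on team member "
        "input, and manage the project timeline and resources"
    ),
)
print(pi.system_prompt())
```

Keeping the definition as structured data rather than a free-form string makes it straightforward for the PI agent to emit new team members in the same four-field shape.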

Reflection

Once the human researcher defines a high-level goal, the PI agent holds an individual team generation meeting to determine which specialist scientist agents are needed (such as Sequence Designers, Structure Predictors, Energy Evaluators, and Evidence Retrievers). Individual meetings allow an agent to focus on a specific task: the Scientific Critic provides targeted feedback after each response, guiding the agent toward improved outcomes. During the team generation meeting, the PI outlines each agent's title, expertise, goal, and role; the Scientific Critic then reviews these assignments and provides feedback to refine the roles before execution begins. The following example illustrates how a PI agent might generate a suitable team for a given research task and how the reflection process operates in practice.

INPUT:

Project: You are working on a research project to use machine learning to develop antibodies or nanobodies for the newest variant of the SARS-CoV-2 spike protein...

Task: Generate a team of 3 scientists to help you with this project.

OUTPUT:

Agent 1:

Title: Computational Biologist

Expertise: Protein structure prediction and modeling.

Goal: Develop and refine predictive models for protein-antibody interactions.

Role: Provide insights into molecular dynamics and structural constraints impacting antibody design.

Agent 2:...

...

Chat History:...

Task: Please critique (agent)'s most recent answer.

OUTPUT:

As the Scientific Critic, I would like to address ...
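The answer, critique, revise cycle of an individual meeting can be sketched as a short loop. This is a minimal sketch under stated assumptions: `Respond` is a hypothetical interface to a language model, `individual_meeting` is an illustrative name, and the wording of the revision task is assumed; only the critique task is taken from the example above.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker title, message)
# Hypothetical LLM interface: (speaker title, transcript so far, task) -> reply.
Respond = Callable[[str, List[Turn], str], str]


def individual_meeting(
    respond: Respond, scientist: str, agenda: str, num_rounds: int
) -> Tuple[str, List[Turn]]:
    """Run an individual meeting: initial answer, then num_rounds of critique and revision."""
    transcript: List[Turn] = [("Human researcher", agenda)]
    answer = respond(scientist, transcript, agenda)
    transcript.append((scientist, answer))
    for _ in range(num_rounds):
        critique = respond(
            "Scientific Critic",
            transcript,
            f"Please critique {scientist}'s most recent answer.",
        )
        transcript.append(("Scientific Critic", critique))
        # Assumed wording for the revision task.
        answer = respond(
            scientist,
            transcript,
            "Revise your answer to address the Scientific Critic's feedback.",
        )
        transcript.append((scientist, answer))
    return answer, transcript


def stub_respond(speaker: str, transcript: List[Turn], task: str) -> str:
    """Placeholder for an LLM call; reports who spoke and how long the transcript was."""
    return f"{speaker} reply after {len(transcript)} turns"


final_answer, history = individual_meeting(
    stub_respond,
    "Principal Investigator",
    "Generate a team of 3 scientists to help you with this project.",
    num_rounds=2,
)
print(final_answer)
```

Passing the full transcript on every call is one simple way to give both agents the chat history that appears in the prompts above.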

Cooperation

Team meetings involve collaborative discussions among the PI, scientist agents, and the Scientific Critic. The human researcher sets an agenda, and the PI initiates the conversation by outlining key questions. Each agent then provides their insights in turn. After several discussion rounds, the PI synthesizes the input, summarizes important decisions, and identifies next steps. The human researcher can quickly understand outcomes by reviewing only the PI's concise summary. An example of a team cooperation scenario is provided below.

INPUT:

Agenda (Provided by human researcher): You are working on a research project to use machine learning to develop antibodies or nanobodies for the newest variant of the SARS-CoV-2 spike protein. Please create ...

Agenda questions (Provided by human researcher):
1. Will you design standard antibodies or nanobodies?
2. ...

Team members: Computational Biologist, Scientific Critic, ...

Task: Provide your initial thoughts on the agenda as well as any questions you have to guide the discussion among the team members.

OUTPUT:

Thank you, everyone, for joining this important meeting. Our task is to ...

INPUT:

Chat History:...

Task: Provide your thoughts on the discussion (round 2 of 5).

OUTPUT:

As the Computational Biologist, I will focus on ...

INPUT:

Chat History: ...

Task: Summarize the meeting in detail for future discussions, provide a specific recommendation regarding the agenda, and answer the agenda questions (if any) based on the discussion while strictly adhering to the agenda rules (if any).

OUTPUT:

Summarization of each team member: Computational Biologist: ...
...

Recommendation: Based on the input from the team, I recommend focusing on ...

Answers to agenda questions:
1. Will you design standard antibodies or nanobodies?
Answer: Nanobodies.
Justification: Nanobodies offer superior stability, tissue penetration, and ...
2. ...

Next Steps:
1. Data Acquisition and Verification: Collaborate with partners and use automated alerts to acquire and verify the latest cross-reactivity data for selected nanobodies.
2. Model Development: ...
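The turn order of a team meeting (PI opening, then N rounds of scientists, Scientific Critic, and PI synthesis, then a final PI summary) can be sketched as follows. This is an illustration under assumptions, not the Virtual Lab's implementation: `Respond` and `team_meeting` are hypothetical names, and the wording of the synthesis task is assumed; the other task strings are taken from the prompts shown above.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker title, message)
# Hypothetical LLM interface: (speaker title, transcript so far, task) -> reply.
Respond = Callable[[str, List[Turn], str], str]


def team_meeting(
    respond: Respond, scientists: List[str], agenda: str, num_rounds: int
) -> Tuple[str, List[Turn]]:
    """Run a team meeting and return the PI's final summary plus the transcript."""
    pi, critic = "Principal Investigator", "Scientific Critic"
    transcript: List[Turn] = [("Human researcher", agenda)]

    def speak(speaker: str, task: str) -> str:
        reply = respond(speaker, transcript, task)
        transcript.append((speaker, reply))
        return reply

    speak(pi, "Provide your initial thoughts on the agenda as well as any "
              "questions you have to guide the discussion among the team members.")
    for round_no in range(1, num_rounds + 1):
        task = f"Provide your thoughts on the discussion (round {round_no} of {num_rounds})."
        for scientist in scientists:
            speak(scientist, task)
        speak(critic, task)
        # Assumed wording for the PI's per-round synthesis.
        speak(pi, "Synthesize the points raised and ask follow-up questions.")
    summary = speak(pi, "Summarize the meeting in detail for future discussions, "
                        "provide a specific recommendation regarding the agenda, "
                        "and answer the agenda questions (if any).")
    return summary, transcript


def stub_respond(speaker: str, transcript: List[Turn], task: str) -> str:
    """Placeholder for an LLM call."""
    return f"{speaker} (turn {len(transcript)})"


summary, history = team_meeting(
    stub_respond,
    scientists=["Computational Biologist", "Immunologist"],
    agenda="Will you design standard antibodies or nanobodies?",
    num_rounds=2,
)
print(summary)
```

Only the returned summary needs to reach the human researcher; the full transcript is kept so later meetings can be given the earlier discussion as context. The second scientist title in the usage example is made up for illustration.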

Tool use

After finalizing project details in the team meeting, the agents move forward with the help of human researchers to complete the steps of tool generation and tool implementation. In the tool generation phase, the agents hold another team meeting to collaboratively choose the computational tools they will use. Here is an example showing how the team generates appropriate tools during the meeting.

INPUT:

Project: You are working on a research project to ...

Summaries of the previous meetings: Your team previously decided to modify existing nanobodies to improve their binding to the newest variant of the SARS-CoV-2 spike protein ...

Chat History of current meeting: ...

Task: Please list several tools (5-10) that would be relevant.

OUTPUT:

... Apply AlphaFold or Rosetta to model the 3D structures of the top-ranked nanobody variants ...

Once the toolset is agreed upon, the implementation phase begins. First, the PI holds an individual meeting to assign specific computational tools to each specialist agent. Next, each specialist agent conducts their own individual meetings, guided and assisted by the human researcher, to generate structured scripts for the assigned tools. Under the guidance of the human researcher, the agents iteratively refine these scripts through feedback. The human researcher then manually stores the finalized scripts for subsequent implementation. The following example illustrates how a specialist agent generates a structured Rosetta script under the guidance of a human researcher:

INPUT:

Project: You are working on a research project to ...

Decision: Your team previously decided to modify existing nanobodies to improve their binding to the newest variant of the SARS-CoV-2 spike protein ...

Task (Provided by human researcher): Now you must use Rosetta to calculate the binding energy of nanobody-antigen complexes. You must do this in 3 parts. First, write a complete RosettaScripts XML file ...

OUTPUT:

Part 1: RosettaScripts XML for Binding Energy Calculation:
<xml>
...
</xml>

Part 2: Rosetta Command Example
```bash
rosetta_scripts.default.linuxgccrelease ...
```

Part 3:Python Script to Process Score Files
import os
import pandas as pd
...
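The agent's own script is elided above. As a separate, independent illustration of what processing a Rosetta score file can involve, the sketch below parses the whitespace-separated `SCORE:` lines that Rosetta score files commonly use (the first `SCORE:` line is the header) and ranks complexes by a binding-energy column. The function names, the choice of `dG_separated` as the ranking column, and the sample values are all assumptions made up for this example; it uses only the standard library rather than pandas.

```python
from typing import Dict, List


def parse_score_file(text: str) -> List[Dict[str, str]]:
    """Parse 'SCORE:' lines; the first one is the header, the rest are data rows."""
    header = None
    rows: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue  # skip 'SEQUENCE:' and any other non-score lines
        fields = line.split()[1:]  # drop the leading 'SCORE:' token
        if header is None:
            header = fields
        else:
            rows.append(dict(zip(header, fields)))
    return rows


def rank_by(rows: List[Dict[str, str]], column: str) -> List[Dict[str, str]]:
    """Sort rows by a numeric column, lowest (most favorable energy) first."""
    return sorted(rows, key=lambda row: float(row[column]))


# Made-up sample content, for illustration only (not real results).
example = """SEQUENCE:
SCORE: total_score dG_separated description
SCORE: -410.2 -31.5 nanobody_A_0001
SCORE: -398.7 -42.1 nanobody_B_0001
"""

ranked = rank_by(parse_score_file(example), "dG_separated")
print([row["description"] for row in ranked])
```

In practice the column to rank on depends on which movers and filters the RosettaScripts XML actually runs, so the column name should be checked against the header of the generated score file.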
