Design your plate

In this step you apply your trained model to design the plate(s) you will take to the lab. The inputs you need depend on your initial round configuration: protein modality, whether you have experimental data, and whether you want to generate novel sequences or downselect from a large set of sequences. They may include some of the following:

  1. Template sequence(s)
  2. Objective and constraints
  3. Plate size
  4. Mutation load
  5. Requirements on blocked mutations, motifs / liabilities, allowed regions

Once you have provided the requested inputs, start the design; generation runs in the background. After completion, a generation report is posted to your round, and you can analyze and compare the runs to confirm which plate(s) go to the lab.

Note: To better understand why Cradle generates whole plates, not single sequences, see Importance of plates vs sequences.

Template sequence(s)

Template selection is very important in rounds with and without experimental data:

  • In rounds where no experimental data is available (yet), the models will be exploring the evolutionary neighborhood of the template, so a template with serious weaknesses e.g., poor expression, very weak binding, may lead to a higher fraction of variants you can't learn from — sequences that don't express or have no measurable signal.
  • In rounds with experimental data, use one or a few of the strongest-performing variants as template(s), since variants will likely retain that strong performance while improving secondary properties.

Tips for template selection
  1. If you don't have plate-based assay data yet, but have screening data from a phage or yeast display campaign, use the latter to pick your best available starting point.
  2. Pick templates that already maximize your most important property, and use subsequent rounds to improve the weaker properties.
  3. If in doubt, prioritize sequences with strong expression — a sequence that doesn't express can't produce useful training data.
  4. If you have several sequences that clearly stand out across your properties, consider using several templates to hedge the risk. For a 96-sequence plate, 2 to 4 templates are recommended. Choose templates that are diverse enough (>3-4 mutations away from each other).
  5. Choosing an earlier-round template can cause the model to concentrate heavily on mutations that showed a positive signal in that round, over-representing them across the plate. If you see that, picking a later-round template may help.
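
The diversity guideline in tip 4 can be checked before you submit templates. The following is a minimal sketch, not part of the platform: the function names and example sequences are hypothetical, it assumes the templates are aligned and of equal length (substitutions only), and it interprets ">3-4 mutations away" as a minimum pairwise distance of 4.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length (align them first)")
    return sum(x != y for x, y in zip(a, b))


def too_similar_pairs(templates: dict[str, str], min_distance: int = 4):
    """Return (name_a, name_b, distance) for every pair closer than min_distance."""
    return [
        (name_a, name_b, d)
        for (name_a, seq_a), (name_b, seq_b) in combinations(templates.items(), 2)
        if (d := hamming_distance(seq_a, seq_b)) < min_distance
    ]


# Hypothetical candidate templates, for illustration only.
candidates = {
    "T1": "MKTAYIAKQRQISFVKSHFSRQ",
    "T2": "MKTAYIAKQRQISFVKSHFSRA",  # differs from T1 at one position
    "T3": "MRTAWIAKQKQLSFVRSHFSRQ",  # differs from T1 at five positions
}

for name_a, name_b, d in too_similar_pairs(candidates):
    print(f"{name_a} and {name_b} are only {d} mutation(s) apart")
```

Any pair the check reports is a candidate for dropping one of its two templates, since both would explore largely the same neighborhood.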

When the best templates aren't obvious, or when you want us to pick them, you can use the Select Template helper. It scores your variants with the trained models and proposes template sequences. Make sure to review the proposed templates: check their actual assay values and any quality issues.

Template selection can be hard, especially in multi-property optimization, where there may be fundamental trade-offs between properties. If there is a clear hierarchy between objectives ("maximize yield; I don't care about stability as long as Tm > 65 °C"), pick the template that maximizes the primary objective while respecting (or approaching) the constraints (e.g., the best yield above the Tm cutoff). When there is no clear hierarchy, use the Select Template helper to assess the trade-offs and find the optimal template sequences.
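
When the hierarchy is clear, the manual selection described above amounts to a filter followed by a maximum. The sketch below illustrates that logic on made-up assay values; the record layout, property names, and the `pick_template` function are assumptions for illustration and do not describe how the Select Template helper works.

```python
# Hypothetical measured values for four variants.
variants = [
    {"name": "v1", "yield": 12.0, "tm": 62.0},
    {"name": "v2", "yield": 10.5, "tm": 67.5},
    {"name": "v3", "yield": 9.0, "tm": 71.0},
    {"name": "v4", "yield": 11.0, "tm": 65.5},
]


def pick_template(rows, objective, constraint, floor):
    """Return the row with the highest objective among rows above the floor."""
    eligible = [row for row in rows if row[constraint] > floor]
    if not eligible:
        return None  # nothing clears the cutoff; consider a variant approaching it
    return max(eligible, key=lambda row: row[objective])


best = pick_template(variants, objective="yield", constraint="tm", floor=65.0)
print(best)
```

Here v1 has the highest yield but falls below the Tm cutoff, so the best eligible template is v4.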

Note: Before running the Select Template helper, filter your dataset so it only contains sequences with measurements for all the properties that you care about. The algorithm uses predictions to pick the templates; if sequences are missing key assay values, you cannot double-check their quality.
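
If you prepare the dataset yourself, the filtering step might look like the following. This is a sketch under assumptions: the records are plain dictionaries, the property names are made up, and a missing measurement is represented as an absent key, `None`, or NaN.

```python
import math


def has_all_measurements(row: dict, properties: list[str]) -> bool:
    """True if every required property has a usable (non-missing) value."""
    for prop in properties:
        value = row.get(prop)
        if value is None:
            return False
        if isinstance(value, float) and math.isnan(value):
            return False
    return True


# Hypothetical dataset with some gaps.
dataset = [
    {"name": "v1", "yield": 12.0, "tm": 62.0},
    {"name": "v2", "yield": 10.5, "tm": None},
    {"name": "v3", "yield": float("nan"), "tm": 71.0},
    {"name": "v4", "tm": 65.5},
]
required = ["yield", "tm"]

complete = [row for row in dataset if has_all_measurements(row, required)]
print([row["name"] for row in complete])
```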

Primary objective(s) and constraint(s)

Learn about objectives and constraints and how they are used in generation in Optimizing for multiple properties at once.

For your primary objective, choose the property you want to actively optimize in this round, and its direction (for example, decrease binding affinity as much as possible).

For each remaining property, set a constraint as a floor (stay above), a ceiling (stay below), or a window (between a floor and a ceiling, e.g. for an affinity range). Constraints are gates, not targets: the model won't sacrifice the primary objective to push a constrained property further than it needs to go.
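
One way to picture a gate is as a pass/fail check on a predicted value, with no reward for exceeding the bound. The sketch below models the three constraint shapes; the `Constraint` class, the property names, and the strict treatment of boundary values are all assumptions for illustration, not the platform's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Constraint:
    """A gate on one property: a floor, a ceiling, or both (a window)."""

    prop: str
    floor: Optional[float] = None
    ceiling: Optional[float] = None

    def __post_init__(self) -> None:
        if self.floor is None and self.ceiling is None:
            raise ValueError("set a floor, a ceiling, or both")
        if (
            self.floor is not None
            and self.ceiling is not None
            and self.floor >= self.ceiling
        ):
            raise ValueError("floor must be below ceiling")

    def passes(self, value: float) -> bool:
        """True if the value lies strictly above the floor and below the ceiling."""
        above_floor = self.floor is None or value > self.floor
        below_ceiling = self.ceiling is None or value < self.ceiling
        return above_floor and below_ceiling


# Hypothetical constraints and predictions for one candidate.
constraints = [
    Constraint("tm_celsius", floor=65.0),            # floor
    Constraint("aggregation_pct", ceiling=5.0),      # ceiling
    Constraint("kd_nm", floor=1.0, ceiling=10.0),    # window
]
predicted = {"tm_celsius": 68.2, "aggregation_pct": 3.1, "kd_nm": 12.0}

failed = [c.prop for c in constraints if not c.passes(predicted[c.prop])]
print(failed)
```

A candidate that clears a gate by a wide margin scores no better on that property than one that barely clears it, which is what distinguishes a constraint from an objective.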

We recommend setting thresholds using a reference sequence ("be better than this sequence") rather than an absolute value. If no sequence sits exactly at the value you want to beat, add a margin. For a multiplicative assay the margin is a fold improvement (margin 1.5 = 50% improvement).
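
As a worked example of the margin arithmetic, the sketch below converts a reference measurement and a fold margin into the implied threshold. It covers only the multiplicative case described above and assumes a property where higher is better; the function name and the reference value are hypothetical.

```python
def threshold_from_reference(reference_value: float, margin: float = 1.0) -> float:
    """Threshold implied by 'beat the reference by this fold margin'."""
    if margin <= 0:
        raise ValueError("a fold margin must be positive")
    return reference_value * margin


reference_activity = 40.0  # hypothetical measured value of the reference sequence
margin = 1.5

threshold = threshold_from_reference(reference_activity, margin)
improvement_pct = (margin - 1.0) * 100.0
print(f"threshold = {threshold}, i.e. a {improvement_pct:.0f}% improvement over the reference")
```

With a margin of 1.0 the threshold is the reference value itself, which corresponds to "be better than this sequence" with no extra margin.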

Pro Tip: Resist anchoring to your single best-performing sequence. Extreme values at the top of your distribution are more likely to be one-offs or measurement noise, and they anchor the constraint to a region where the model has little data, making predictions less reliable. Prefer a reference that sits within the bulk of your historical data — clearly good but not an outlier — and add a larger margin if needed.

Plate size

Pick the total number of candidates from your assay throughput and your control wells. If you're filling two 96-well plates (192 wells) and need 4 control wells, request 188 designs. A common single-plate setup is 96 wells minus controls. For instance, a good layout for a 96-well plate is 3 controls × 2 technical replicates + 90 variants.
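
The arithmetic above is simple enough to do by hand, but a small helper makes the replicate accounting explicit. This is an illustrative sketch; the function and parameter names are hypothetical.

```python
def designs_to_request(
    n_plates: int,
    wells_per_plate: int = 96,
    n_controls: int = 0,
    replicates_per_control: int = 1,
) -> int:
    """Wells left for designs after reserving control wells across all plates."""
    total_wells = n_plates * wells_per_plate
    control_wells = n_controls * replicates_per_control
    remaining = total_wells - control_wells
    if remaining <= 0:
        raise ValueError("controls take up every available well")
    return remaining


# Two 96-well plates with 4 control wells in total.
print(designs_to_request(n_plates=2, n_controls=4))

# One 96-well plate with 3 controls, each in 2 technical replicates.
print(designs_to_request(n_plates=1, n_controls=3, replicates_per_control=2))
```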

If your experimental budget shrinks, re-run generation instead of manually filtering the plate. Read more on the Importance of plates vs sequences.

Mutation load

The mutation-load range is a floor and a ceiling on how many mutations a variant carries relative to the template.

The default range is 1 to 8 and is our general recommendation.

For generations without experimental data and with a fragile template, we advise lowering the ceiling to 4 mutations. This is because the models work from evolutionary signal alone, and that signal weakens the further you stray from the template. If your template sequence is robust, you can use the default ceiling (8).
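
To see what the range means for a given variant, you can count its substitutions relative to the template. The sketch below assumes substitutions only (equal-length sequences, no insertions or deletions); the function names and sequences are hypothetical.

```python
def list_mutations(template: str, variant: str) -> list[str]:
    """Substitutions in 'T3S' notation (template residue, 1-based position, new residue)."""
    if len(template) != len(variant):
        raise ValueError("this sketch only handles substitutions (equal lengths)")
    return [
        f"{t}{pos}{v}"
        for pos, (t, v) in enumerate(zip(template, variant), start=1)
        if t != v
    ]


def within_mutation_load(template: str, variant: str, floor: int = 1, ceiling: int = 8) -> bool:
    """True if the number of substitutions falls inside [floor, ceiling]."""
    return floor <= len(list_mutations(template, variant)) <= ceiling


template = "MKTAYIAKQR"
variant_a = "MKSAYIAKQK"  # 2 substitutions
variant_b = "AKSAYLAKEK"  # 5 substitutions

print(list_mutations(template, variant_a))
print(within_mutation_load(template, variant_b))             # default range 1 to 8
print(within_mutation_load(template, variant_b, ceiling=4))  # lowered ceiling
```

With the lowered ceiling of 4, the five-substitution variant falls outside the range, while it is allowed under the default of 1 to 8.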

Note: Setting a minimum mutation count does not guarantee more diversity. It only ensures the diversity that exists is farther from the template in sequence distance; the models consider diversity across the predicted performance landscape, not just sequence distance.

Requirements on blocked mutations, motifs / liabilities, allowed regions

You can define positions / ranges and the following rules for them:

  • Fix a position so no mutations occur there.
  • Block residues or motifs you don't want at a site (e.g. block alanine, cysteine, aspartic acid at position 90).
  • Force a substitution at a site without specifying the replacement, to engineer out an existing pattern.
  • Force a specific mutation when you know exactly what you want.
  • (Antibody-specific) Discourage mutations in a region. This rule is highly recommended for CDRs.

The Prevent common liabilities shortcut prepopulates rules for the motifs most relevant to therapeutics and antibodies:

| Liability | Default behavior | Where you usually apply it |
| --- | --- | --- |
| Cysteines | Block any new cysteines; preserve existing ones (the platform finds existing cysteines, e.g. at positions 23 and 97, and keeps them so stabilizing disulfides aren't lost) | Whole sequence |
| Oxidation | Block new methionines | CDRs (edit the prepopulated range to your CDRs) |
| N-linked glycosylation | Block new glycosylation motifs | Whole sequence length |
| Isomerization / deamidation / hydrolysis motifs (e.g. DG, NG) | Block selected motifs | CDRs / most problematic regions |

After inserting, review the prepopulated ranges — the default range for methionine, isomerization, deamidation, etc. is usually not the one you care about, so replace it with your CDR annotations.

Important: Liabilities are hard constraints: no sequence carrying a flagged motif appears in the output. Configuring many strict liabilities narrows the space the model can work in. If your generated plates are sparse, try relaxing or removing a liability and running a parallel generation to compare.
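
To get a feel for what "blocking new motifs" means, the sketch below scans a variant for liability motifs that are absent from the template at the same position. It is an illustration only and does not describe the platform's implementation: the pattern set, function names, and sequences are assumptions, it handles substitutions only, and it uses the standard N-X-S/T sequon (X not proline) for N-linked glycosylation.

```python
import re

# Hypothetical pattern set covering the liabilities named in the table above.
LIABILITY_PATTERNS = {
    "cysteine": re.compile(r"C"),
    "methionine_oxidation": re.compile(r"M"),
    "n_glycosylation": re.compile(r"N[^P][ST]"),
    "deamidation_NG": re.compile(r"NG"),
    "isomerization_DG": re.compile(r"DG"),
}


def find_sites(sequence, pattern, region=None):
    """0-based start indices of motif matches lying fully inside region [start, end)."""
    start, end = region if region is not None else (0, len(sequence))
    return {i for i in range(start, end) if pattern.match(sequence, i, end)}


def new_liabilities(template, variant, regions=None):
    """Map liability name to 1-based positions present in the variant but not the template."""
    if len(template) != len(variant):
        raise ValueError("this sketch only handles substitutions (equal lengths)")
    regions = regions or {}
    found = {}
    for name, pattern in LIABILITY_PATTERNS.items():
        region = regions.get(name)  # None means the whole sequence
        introduced = find_sites(variant, pattern, region) - find_sites(template, pattern, region)
        if introduced:
            found[name] = sorted(i + 1 for i in introduced)
    return found


template = "EVQLCAASGFTFSDYW"
variant = "EVQLCAASGFTFNGSW"  # S13N, D14G, Y15S introduce an NGS motif

print(new_liabilities(template, variant))
# Restrict the NG check to the first eight residues (a stand-in for a CDR range).
print(new_liabilities(template, variant, regions={"deamidation_NG": (0, 8)}))
```

The existing cysteine at position 5 is not reported because it is already present in the template. Restricting a pattern to a region mirrors the advice to replace the prepopulated ranges with your own CDR annotations.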