HERALD: High-resolution Early Recognition of Antigenic Landscape Divergence

Author: Davis, Bee Rosa
Publisher: Zenodo
DOI: 10.5281/zenodo.17663378
Source: https://zenodo.org/records/17663378/files/HERALD_Validation_Study__Empirical_Confirmation_of_Learned_Geometric_Separation_on_SARS_CoV_2_RBD_Data.pdf
HERALD Validation Study: Empirical Confirmation of Learned
Geometric Separation on SARS-CoV-2 RBD Data
Bee Rosa Davis
November 20, 2025
1 Executive Summary
This study provides numerical validation of the core theoretical claims in the HERALD framework
(specifically Aspect 1: Method of Constructing a Learned Geometric Space). Using historical
deep-mutational scanning (DMS) data from the Bloom laboratory, we evaluated whether the HERALD
construction objective (InfoNCE loss) could successfully induce a latent geometry that separates
immune-escape variants from functionally similar variants.
The study employed a rigorous A/B testing protocol to isolate the contribution of biophysical
priors. We compared a baseline model using orthogonal (one-hot) sequence encoding against the
full HERALD implementation using pre-trained protein language model (ESM-2) embeddings.
Key Findings.
• The Baseline Failed (Control). The one-hot model failed to generalize to unseen test
variants, yielding a separation gap of ∆ ≈ 0 and high misclassification rates ($\alpha_{\mathrm{emp}} > 0.9$).
This confirms that geometric regularization alone is insufficient without biophysical priors.
• HERALD Succeeded (Invention). The full HERALD implementation (using ESM-2
priors) successfully learned a separating geometry on unseen test data. In the best-performing
condition (LY-CoV555), HERALD achieved a separation gap of ∆ = 0.309, representing a
5.7× improvement over the baseline (∆ = 0.054).
• Visual Confirmation. Histograms of latent distances show a clear, systematic rightward
shift of escape variants, validating the “Distance ⇒ Escape” theoretical bound (Lemma 1).
2 Methodology
2.1 Data Source
We utilized the public SARS-CoV-2 Receptor Binding Domain (RBD) escape map dataset (escape_data.csv,
Greaney et al.). The dataset provides quantitative “escape fraction” scores for single amino-acid
mutations against various antibody/sera conditions.
2.2 Experimental Design
For each of five antibody conditions (C110, LY-CoV555, REGN10933, COV2-2196, C121), we
performed the following protocol:
• Data Cleaning: Aggregated escape scores by (site, mutation) to remove replicate noise.
• Train/Test Split: Randomly split unique variants into 80% Train and 20% Test. All
evaluation metrics are reported on the held-out Test set to ensure zero data leakage.
• Pair Labeling: Defined “Similar” pairs (|∆escape| ≤ 0.1) and “Escape” pairs (|∆escape| ≥
0.5) based on ground-truth DMS data.
• Model Training: Trained a projection head ϕ using the HERALD InfoNCE objective
(Eq. 4) with temperature $\tau_c \in \{0.5, 1.0\}$ and $K = 16$ hard negatives.
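The protocol steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the study's code: the function names, the tuple layout of the input rows, the choice to draw pairs within a single split, and the distance-based form of the InfoNCE loss are all assumptions (Eq. 4 is not reproduced in this report). Only the thresholds 0.1 and 0.5, the 80/20 split, and the temperature values come from the text.

```python
import math
import random
from collections import defaultdict
from itertools import combinations


def aggregate(rows):
    """Average replicate escape scores per (site, mutation).

    `rows` is an iterable of (site, mutation, escape_score) tuples
    (hypothetical layout, chosen for this sketch).
    """
    buckets = defaultdict(list)
    for site, mutation, score in rows:
        buckets[(site, mutation)].append(score)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


def split_variants(variants, test_frac=0.2, seed=0):
    """Shuffle the unique variants and hold out `test_frac` of them."""
    ordered = sorted(variants)
    random.Random(seed).shuffle(ordered)
    n_test = round(len(ordered) * test_frac)
    return ordered[n_test:], ordered[:n_test]  # (train, test)


def label_pairs(scores, variants, sim_max=0.1, esc_min=0.5):
    """Label variant pairs by the absolute difference in escape score.

    Pairs that fall between the two thresholds are left unlabeled.
    """
    similar, escape = [], []
    for a, b in combinations(variants, 2):
        diff = abs(scores[a] - scores[b])
        if diff <= sim_max:
            similar.append((a, b))
        elif diff >= esc_min:
            escape.append((a, b))
    return similar, escape


def info_nce(anchor, positive, negatives, tau=0.5):
    """Generic distance-based InfoNCE for one anchor (assumed form, not Eq. 4).

    Logits are negative Euclidean distances divided by the temperature; the
    loss is the negative log-softmax probability assigned to the positive.
    """
    logits = [-math.dist(anchor, positive) / tau]
    logits += [-math.dist(anchor, neg) / tau for neg in negatives]
    peak = max(logits)  # subtract the max for a stable log-sum-exp
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[0]


# Toy rows: two replicates of one mutation plus two other mutations.
rows = [(484, "K", 0.9), (484, "K", 0.7), (417, "N", 0.05), (501, "Y", 0.0)]
scores = aggregate(rows)
similar, escape = label_pairs(scores, sorted(scores))
```

In the toy data the two low-escape mutations form the only “Similar” pair, and each of them forms an “Escape” pair with the high-escape mutation. In a full run, `negatives` would hold the K = 16 hard negatives for each anchor.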
2.3 Comparison Groups
• Baseline (Control): Inputs were one-hot encoded vectors representing (site, mutation).
This tested the hypothesis that the loss function alone could induce geometry from orthogonal
inputs.
• HERALD (Experimental): Inputs were 320-dimensional embeddings from the ESM-2
(8M) protein language model, representing the biophysical properties of the mutant amino
acid at the given site.
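To see why the one-hot baseline starts without any usable geometry, the sketch below builds one-hot vectors for a toy vocabulary and checks their pairwise distances. It assumes one axis per (site, mutation) pair, which matches the “orthogonal inputs” description but is not confirmed in detail by the text; the three sites are arbitrary examples.

```python
import math
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(index, size):
    """Return a vector of zeros with a single 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


# Hypothetical toy vocabulary: every (site, mutation) pair gets its own axis.
sites = [417, 484, 501]
vocab = [(site, aa) for site in sites for aa in AMINO_ACIDS]
vectors = {v: one_hot(i, len(vocab)) for i, v in enumerate(vocab)}

# Collect the distinct pairwise Euclidean distances (rounded to absorb float noise).
distances = {
    round(math.dist(vectors[a], vectors[b]), 12)
    for a, b in combinations(vocab, 2)
}
print(distances)  # a single value, sqrt(2)
```

Every pair of distinct variants is equally far apart, so the input carries no notion of which mutations are biochemically alike. Any structure must then come from the training labels alone, which is the hypothesis the baseline tests.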
3 Results
3.1 Quantitative Performance (Separation Gap ∆)
The primary metric of success is the distance gap ∆, defined as the difference in mean latent
distance between “Escape” pairs and “Similar” pairs on the test set,
$$\Delta = \mu_{-} - \mu_{+},$$
where $\mu_{-}$ is the mean distance of escape pairs and $\mu_{+}$ is the mean distance of similar pairs. A
positive ∆ indicates that the geometry successfully distinguishes functional phenotypes.
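The definition above translates directly into code. The following is a minimal sketch, assuming latent vectors are stored in a dictionary keyed by variant and that distances are Euclidean; the function name and the toy coordinates are illustrative only.

```python
import math
from statistics import fmean


def separation_gap(latent, escape_pairs, similar_pairs):
    """Mean latent distance of escape pairs minus that of similar pairs."""
    mu_minus = fmean(math.dist(latent[a], latent[b]) for a, b in escape_pairs)
    mu_plus = fmean(math.dist(latent[a], latent[b]) for a, b in similar_pairs)
    return mu_minus - mu_plus


# Toy latent space: v1 and v2 are close, v3 is far from v1.
latent = {"v1": (0.0, 0.0), "v2": (0.0, 1.0), "v3": (3.0, 4.0)}
gap = separation_gap(
    latent,
    escape_pairs=[("v1", "v3")],
    similar_pairs=[("v1", "v2")],
)
print(gap)  # 5.0 - 1.0 = 4.0
```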
Table 1: Separation gap ∆ for baseline (one-hot) and HERALD (ESM-2) models, evaluated on
held-out test variants.
Condition    Baseline ∆ (One-Hot)    HERALD ∆ (ESM-2)    Improvement
LY-CoV555    0.0537                  0.3095              5.7×
COV2-2196    0.0278                  0.2020              7.2×
REGN10933    0.0155                  0.1362              8.8×
C121         0.0845                  0.1164              1.4×
C110         0.0483                  −0.0231             (No lift)
Analysis.
• In 3 out of 5 conditions, HERALD provided a massive (>5×) improvement in geometric
separation.
• The breakdown in C110 suggests that for certain antibodies, the specific single-mutation
features of ESM-2 may require additional structural context (e.g., full sequence context) to
fully resolve binding mechanics.
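As a quick arithmetic check, the improvement factors can be recomputed from the ∆ values in Table 1 as the ratio of the HERALD gap to the baseline gap. The snippet below is only a sketch of that check; the dictionary layout is an assumption, and the numbers are copied from the table.

```python
# (baseline gap, HERALD gap, reported improvement factor) per condition.
table = {
    "LY-CoV555": (0.0537, 0.3095, 5.7),
    "COV2-2196": (0.0278, 0.2020, 7.2),
    "REGN10933": (0.0155, 0.1362, 8.8),
    "C121": (0.0845, 0.1164, 1.4),
    "C110": (0.0483, -0.0231, None),  # reported as "(No lift)"
}

ratios = {name: herald / base for name, (base, herald, _) in table.items()}
large = [name for name, ratio in ratios.items() if ratio > 5]

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
print("conditions above 5x:", large)
```

The recomputed ratios agree with the reported factors to within about 0.1, and exactly three conditions exceed 5×. For C110 the ratio is negative because the HERALD gap itself is negative.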
3.2 Error Rate Reduction ($\alpha_{\mathrm{emp}}$)
We measured the empirical misclassification rate $\alpha_{\mathrm{emp}}$ at the decision threshold $d^{*}$. Lower is better.
• LY-CoV555: baseline error 0.916 → HERALD error 0.878.
• COV2-2196: baseline error 0.955 → HERALD error 0.936.
While absolute error rates remain high due to the extreme class imbalance (approximately
15:1 similar-to-escape ratio in the test set), the consistent reduction confirms that the HERALD
geometry is “tilting” the odds in favor of detection, validating the probabilistic bounds derived in
Proposition 1.
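The effect of class imbalance on an error rate of this kind can be illustrated with a small sketch. This section does not restate how the empirical rate is computed, so the definition used below is an assumption made purely for illustration: the share of pairs flagged as escape (distance above the threshold) that are in fact similar. The function name and toy distances are hypothetical.

```python
def flagged_error_rate(similar_d, escape_d, threshold):
    """Share of pairs with distance > threshold that are actually 'similar'.

    Assumed definition for illustration; not taken from the study.
    """
    flagged_similar = sum(d > threshold for d in similar_d)
    flagged_escape = sum(d > threshold for d in escape_d)
    flagged = flagged_similar + flagged_escape
    return flagged_similar / flagged if flagged else 0.0


# Uninformative geometry at a 15:1 ratio: every pair has the same distance.
chance = flagged_error_rate([1.5] * 15, [1.5] * 1, threshold=1.0)
print(chance)  # 15 / 16 = 0.9375

# Partly informative geometry: only 5 of the 15 similar pairs exceed the threshold.
shifted = flagged_error_rate([1.2] * 10 + [1.6] * 5, [1.7], threshold=1.5)
print(shifted)  # 5 / 6
```

Under this assumed definition, a 15:1 imbalance alone puts an uninformative geometry near 0.94, which is the same range as the baseline errors reported above. Even a clear shift in distances leaves the rate high in absolute terms.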
3.3 Visual Validation (The “Money Plot”)
The histograms below visualize the distribution of latent distances (δ) for Similar (blue) vs. Escape
(red) pairs in the learned geometry.
Figure 1: Distance distribution for REGN10933 (HERALD). Blue: Similar pairs; Red: Escape
pairs.
Observation. Note the distinct rightward shift of the red (Escape) distribution. The peak density
of escape pairs is centered around δ ≈ 1.6, while similar pairs peak near δ ≈ 1.3. This separation
is the physical manifestation of the learned manifold.
Figure 2: Distance distribution for LY-CoV555 (HERALD).
Observation. A strong separation is visible, corresponding to the high ∆ = 0.309. The long tail of
the red distribution indicates that high-escape variants are successfully pushed to the periphery of
the latent space.
4 Discussion & Conclusion
4.1 The Necessity of Biophysical Priors
The negative result from the one-hot baseline (Experiment 1) is a crucial finding. It demonstrates
that geometric regularization alone cannot hallucinate structure from orthogonal inputs. The model
requires a biophysical prior (provided here by ESM-2) to seed the geometry. HERALD’s contribution
is the refinement of that general prior into a specific, task-aligned antigenic manifold.
4.2 Validation of Patent Claims
These results directly support the claims in the provisional patent application:
• Support for Claim 1 (Construction). The method successfully constructed a space where
“latent Euclidean distances approximate functional geodesics” (∆ > 0).
• Support for Claim 5 (Bounded Distortion). The existence of a measurable, positive
gap ∆ confirms that the system operates within a regime where distance proxies function.
4.3 Conclusion
This study confirms that HERALD is not merely a theoretical construct. When instantiated with
appropriate biophysical features, the mathematical framework successfully translates sparse
functional labels into a tangible, separating geometry. This provides the necessary empirical foundation
for deployment in prospective surveillance.