Alec East

Tokenization to Transfer: Do Genomic Foundation Models Learn Good Representations?

Do genomic AI models actually learn anything useful?

Published at ICLR 2026, this paper from M42 and ADIA Lab tests seven Genomic Foundation Models (GFMs) across 52 tasks, comparing them against models with completely random, untrained weights. The results are uncomfortable: randomly initialised models are surprisingly competitive, and the biggest performance driver turns out to be tokenisation strategy, not pretraining. More critically, all evaluated models show near-complete insensitivity to single-nucleotide variants, with clinically relevant mutations in BRCA2 and CFTR producing AUROC scores barely above chance. The authors argue that simply importing NLP pretraining into genomics isn't sufficient, and call for biologically informed tokenisation and variant-aware training objectives before further compute is committed.

The Permutation Test for Event Studies When the Number of Firms Is Small

This paper introduces a nonparametric permutation test for event studies designed to remain statistically valid even when the number of firms is extremely small, addressing a key limitation of standard tests for average and cumulative abnormal returns. The method is evaluated through Monte Carlo simulations and illustrated using real-world financial data.
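To give a feel for why resampling-based tests can stay valid with very few firms, here is a minimal sketch of a generic exact sign-flip randomization test on cumulative abnormal returns (CARs). It is an illustration only and is not taken from the paper, whose construction may differ. The function name `sign_flip_p_value`, the input values, and the null hypothesis (each firm's CAR is symmetric about zero) are assumptions made for this example.

```python
from itertools import product


def sign_flip_p_value(cars):
    """Exact two-sided sign-flip p-value for the mean of a small sample.

    Hypothetical helper: assumes that, under the null, each firm's
    cumulative abnormal return is symmetric about zero, so every
    pattern of sign flips is equally likely.
    """
    n = len(cars)
    observed = abs(sum(cars) / n)
    extreme = 0
    total = 0
    # Enumerate all 2**n sign patterns; feasible because n is small.
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * c for s, c in zip(signs, cars)) / n)
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Made-up CARs for four firms, all positive.
print(sign_flip_p_value([0.02, 0.03, 0.01, 0.04]))  # → 0.125
```

With four firms there are only 16 sign patterns, so the smallest attainable two-sided p-value is 2/16 = 0.125. The discreteness of the reference distribution is the price of an exact test when the sample is this small.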

How to Use the Sharpe Ratio

“How to Use the Sharpe Ratio” by Marcos López de Prado, Alexander Lipton, and Vincent Zoonekynd (ADIA Lab) re-examines the world’s most widely used measure of investment efficiency. The authors expose common statistical errors in Sharpe ratio analysis and present a rigorous framework for correcting bias, non-Normality, and multiple testing, ensuring the metric remains reliable for both researchers and practitioners.
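As a rough illustration of why non-Normality matters here, the sketch below computes a per-period Sharpe ratio together with a widely used asymptotic standard error that adjusts for skewness and kurtosis. This is a textbook approximation and is not claimed to be the framework the paper presents. The function name `sharpe_with_se` and the sample returns are hypothetical, and the formula assumes independent, identically distributed excess returns.

```python
import math
import statistics


def sharpe_with_se(excess_returns):
    """Per-period Sharpe ratio and an approximate standard error.

    Hypothetical helper. The standard error uses the asymptotic
    approximation sqrt((1 - skew*SR + (kurt - 1)/4 * SR**2) / (T - 1)),
    which reduces to sqrt((1 + SR**2 / 2) / (T - 1)) for Normal returns
    (skew = 0, kurt = 3).
    """
    t = len(excess_returns)
    mean = statistics.fmean(excess_returns)
    sr = mean / statistics.stdev(excess_returns)  # sample std (ddof = 1)

    # Standardized third and fourth moments (population versions).
    dev = [r - mean for r in excess_returns]
    pop_sd = math.sqrt(sum(d * d for d in dev) / t)
    skew = sum(d ** 3 for d in dev) / t / pop_sd ** 3
    kurt = sum(d ** 4 for d in dev) / t / pop_sd ** 4

    se = math.sqrt((1 - skew * sr + (kurt - 1) / 4 * sr * sr) / (t - 1))
    return sr, se


# Made-up monthly excess returns.
sr, se = sharpe_with_se([0.01, 0.02, -0.01, 0.03, 0.0, 0.015])
print(f"Sharpe ratio: {sr:.3f}, standard error: {se:.3f}")
```

Negative skewness and fat tails both widen the standard error relative to the Normal case, so a Sharpe ratio that looks significant under Normality may not be once these are accounted for.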

Transactions of ADIA Lab

This inaugural volume of Transactions of ADIA Lab features peer-reviewed research from ADIA Lab on finance, AI, and data science, highlighting breakthroughs in asset modelling, ethics, and high-performance computing.

Forecasting Inflation With the Hedged Random Forest

This paper explores inflation forecasting using a hedged random forest (HRF) model. Extensive empirical analysis shows that the proposed approach consistently outperforms the standard random forest.

The Hedged Random Forest

Enhance random forest regression with optimized weighting inspired by portfolio selection. Discover a new method for improved forecasting accuracy across datasets.

The Case for Causal Factor Investing

Factor investing models are often misspecified, leading to biased risk premia estimates. Learn why causal inference, not associational methods, is key to accurate factor modeling.

The Three Types of Backtests

Improving backtesting reliability in systematic investing. Research into walk-forward testing, resampling, and Monte Carlo simulations to avoid biases and false discoveries.
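For readers unfamiliar with the first of these, walk-forward testing fits a strategy on one window of history and evaluates it on the period immediately after, then rolls forward. A minimal sketch of the index bookkeeping follows. The generator name `walk_forward_splits`, the fixed-length rolling window, and the window sizes are assumptions for illustration and are not taken from the paper.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train, test) index ranges for a rolling walk-forward backtest.

    Hypothetical helper: each test window starts right after its training
    window, and the whole scheme advances by one test window per step,
    so test windows never overlap and never precede their training data.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


# Ten observations, train on four, test on the next two.
for train, test in walk_forward_splits(10, train_size=4, test_size=2):
    print(list(train), list(test))
```

Because the historical path is traversed only once, a walk-forward backtest yields a single performance estimate. That limitation is one motivation for complementing it with resampling and Monte Carlo approaches.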
