MatFinProtocol • Live backtest leaderboard

Financial Asset Forecasting & Trading Leaderboard

Protocol‑compliant backtests evaluated under consistent, cost‑aware conditions. Each row corresponds to a published or reproducible AI‑based strategy benchmarked under MatFinProtocol.

Higher is better: annualised profit, Sharpe ratio.
Lower is better: max drawdown %.
All strategies are net of costs and slippage (per protocol).
Maintained by Dr Matloob Khushi.

This leaderboard lists published papers, identified by manual scanning, that report backtesting results and are at least loosely MatFinProtocol‑compliant. As the list matures, only fully compliant papers will be kept. To propose an additional compliant paper for inclusion, please email the full reference and backtest details to matloob.khushi@brunel.ac.uk.
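The three ranking metrics named in the legend (annualised profit, Sharpe ratio, maximum drawdown) can be illustrated with a minimal Python sketch. This page does not state MatFinProtocol's exact formulas, so the conventional definitions below are an assumption: geometric annualisation, a Sharpe ratio based on the sample standard deviation of excess returns, and peak-to-trough drawdown on a compounded equity curve. The function names and the `periods_per_year` parameter are hypothetical, and the input returns are assumed to be already net of costs and slippage.

```python
import math
from typing import Sequence


def annualised_return(returns: Sequence[float], periods_per_year: int) -> float:
    """Geometric annualised return from per-period net returns (assumed definition)."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    years = len(returns) / periods_per_year
    return growth ** (1.0 / years) - 1.0


def sharpe_ratio(returns: Sequence[float], periods_per_year: int,
                 risk_free_per_period: float = 0.0) -> float:
    """Annualised Sharpe ratio using the sample standard deviation of excess returns."""
    excess = [r - risk_free_per_period for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    if variance == 0.0:
        raise ValueError("Sharpe ratio is undefined for zero-variance returns")
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def max_drawdown(returns: Sequence[float]) -> float:
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst
```

For daily data, `periods_per_year=252` is a common choice. Papers differ on the annualisation factor and the risk-free rate, which is one reason Sharpe ratios from different sources are not always directly comparable.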

Last updated: 2026‑04‑01
| Reference | Asset | Initial balance | Final balance | % annualised profit | % max drawdown | Sharpe ratio | Backtest period | Total trades | Win rate % | Model / notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Uygun et al. (2025). DOI: 10.1007/s00521-025-11586-8 | FX & Crypto | 100 | 170 | 35.70% | 27.49% | 0.76 | 104 weeks (ending 2023) | Not given | 54.95% | MTGNN (Multivariate Graph Neural Network with temporal convolutions) |
| Wangchailert et al. (2025). DOI: 10.37936/ecti-cit.2025191.256994 | 8 pairs EUR/USD | Not given | +$10,999.00 | $916 per year | Not given | Not given | 2010–2022 (13 years) | 28,525 | 51.3% | Candle pattern based trading |
| Papatsimpas et al. (2025). DOI: 10.1007/s10898-025-01505-5 | EUR/USD | Not given | Not given | +2198.00 pips (profit) | -95.90 pips | N/A | July–Oct 2022 (64 days) | 36 | 86.11% | Hybrid EMD + Stacked LSTM + PSO aggregation optimization |
| López-Herrera et al. (2025). DOI: 10.1007/s44163-025-00424-4 | USD vs EUR, CNY, JPY, AUD, CHF, MXN, ZAR, and TRY | Not given | Not given | 31.1% (TRY/USD max) | Near-zero | 3.38 (median bootstrap) | 2018–2023 | 35 (avg per period) | 71.1% | Logistic Regression with Mean Absolute Directional Loss (MADL) optimization |
| Iswara et al. (2025). DOI: 10.1109/ISCT66099.2025.11297372 | XAU/USD | Not given | Not given (net profit: +$1.53) | Not given | $0.41 | Not given | 4 to 5 hours | 26 | 57.69% | LSTM + Inductive Conformal Prediction (ICP) uncertainty filter |
| Zhang & Khushi (2020), “GA‑MSSR: Genetic Algorithm Maximizing Sharpe and Sterling Ratio…” | EUR/USD | 100,000 | 187,900 | +13.4 | ‑19.7 | 1.68 | 2010‑01‑01 → 2019‑12‑31 | 2,305 | 61.0 | SS Ratio‑optimised GA |
| Arabha, Sarani & Rashidi-Khazaee (2024). arXiv:2411.01456 | EUR/USD | Not given | Not given | 42.22% (DS2 OOS, 7 months) | Not given | 0.47 (DS2 OOS) | DS1: Jan–Jul 2017; DS2: Jan–Jul 2023 | Not given | Not given | PPO + Auxiliary Task (PPO+AXT); LSTM Actor-Critic; DRL with OHLC + auto-encoder features; qualitative cost mention only; no slippage stated; short OOS windows (7 months each) |
| Pillai, Ajith & Sumesh K J (2026). arXiv:2601.19504 | S&P 500 (100 stocks) | $100,000 | $235,492.83 | 53.46% CAGR | 15.60% | 1.68 | Jan 2023 – Jan 2025 (2 years OOS) | Not given | 61.5% | Hybrid XGBoost + FinBERT sentiment + EMA/MACD/RSI regime filter; Backtrader simulation; 70/30 train/test split; no transaction costs stated; long-only daily |
| Nyo et al. (2026). arXiv:2603.15848 | S&P 500 (equities) | $100,000 | Not given | Not given (189.10% total return on validation 2018–2024) | 23.50% (validation) | 2.04 (validation); 2.07 (test) | Dev: 2000–2017; Validation: 2018–2024; held-out test set | Not given | 38.50% (validation) | Enhanced momentum + FinBERT sentiment; EMA50/200, ATR14 trailing stop, top-10 cross-sectional momentum; strict 3-way train/val/test split; no transaction costs stated; student project paper – treat with caution |
| Saly-Kaufmann et al. (2026). arXiv:2603.01820 | Multi-asset futures & FX (bonds, commodities, energy, equity indices, FX – 60+ instruments) | Not given | Not given | 26.32% CAGR (VLSTM, best model) | 22.90% (VLSTM) | 2.40 (VLSTM, 2010–2025) | 2010–2025 (15 years, rolling OOS) | Not given (turnover: ~967 ann.) | 58.8% hit rate (VLSTM) | Large-scale benchmark of 16 DL architectures; Sharpe-ratio optimisation objective; rolling OOS; gross returns with breakeven cost analysis per asset; statistically significant vs. passive (HAC t=8.81); Oxford-Man Institute |
| Azevedo et al. (2024). SSRN:4702406 | US equities (stock anomaly long-short) | Not given | Not given | Not given | Not given | 0.84 net (LSTM, 1 hidden layer) | Not given (post-2000 implied) | Not given | Not given | LSTM anomaly-based long-short; net Sharpe 0.84 after all frictions; 57% cumulative performance reduction from full cost modelling; transaction costs, post-publication decay, post-decimalization all modelled |
| Buchanan & Benhamou (2026). arXiv:2603.14453 | Top 30 S&P 500 stocks | Not given | Not given | Not given | Not given | 1.10 (OOS 2019–2025; baseline 0.85) | In-sample: 2005–2018; OOS: 2019–2025 | Not given | Not given | E-TRENDS LSTM with Sharpe-ratio training loss; 70/15/15 train/val/test split; 2–5 bps round-trip transaction costs modelled; OOS Sharpe +0.25 vs. baseline; no total return or MDD stated |
| Cohen, Aiche & Eichel (2025). DOI: 10.3390/e27060550 | NASDAQ-100 stocks | Not given | Not given | 24.99% avg annual return (best: technical-quarterly framework) | Not given | 1.2967 (technical-quarterly, best model) | Jan 2020 – Jan 2025 (5 years, rolling-window OOS) | Not given | Not given | ChatGPT-4o semantic intelligence + ML (technical/fundamental/entropy frameworks); top-10 equal-weighted portfolio; cumulative return 573.37% (best); rolling-window OOS retraining; no transaction costs or MDD stated; peer-reviewed MDPI Entropy |
| Ghatak et al. (2025). arXiv:2509.16707 | 814 US equities (walk-forward production system) | Not given | Not given | 26.38% cumulative (Jun 2021 – Jun 2025) | 3.04% | 2.54 | Jun 2021 – Jun 2025 (4-year production walk-forward) | 8,859 | 56.6% | Deep learning (feed-forward + recurrent), six-quarter rolling calibration; production system (not pure academic backtest); trading commissions/slippage not stated; very low MDD 3.04% notable; Zanista AI industry paper |
| Holzer et al. (2024). arXiv:2501.10709 | Stock task: DJIA 30 stocks; Crypto task: BTC LOB data | Not given | Not given | 63.37% cumulative (stock task, PPO best, Jan 2021 – Dec 2023) | Not given | 1.55 (stock task, PPO); 0.28 (crypto task, ensemble) | Stock: Jan 2021 – Dec 2023; Crypto: Apr 7–19 2021 (very short LOB window) | Not given | Not given (win/loss ratio 1.62 crypto task) | PPO, SAC, DDPG ensemble; rolling 30-day training / 5-day test windows; Sortino (stock) 2.44; costs mentioned but no numeric value; crypto window 13 days only; ACM ICAIF competition paper |
| Hajdini, Nuhiu & Leka (2025). PeerJ Computer Science. DOI: 10.7717/peerj-cs.3630 | US equities: AAPL, MSFT, AMZN, BAC, NVDA | $1,000,000 per security | Not given | 53.87% avg annualised return (LLM-MAS-DRL framework) | 12.54% avg | 1.702 avg (LLM-MAS-DRL) | Jul 2024 – Jun 2025 (OOS, 12 months) | 25 avg per security | 71.30% avg | Three-layer LLM multi-agent + DRL framework (market analysis, risk management, execution); 0.1% transaction cost per trade modelled; Sortino, Calmar, CVaR also reported; comparison vs. Buy&Hold and PPO/A3C baselines; peer-reviewed PeerJ Computer Science |
| Fan et al. (2025). DOI: 10.1145/3766918.3766922 | Ethereum (ETH) – multi-factor quantitative model | Not given | Not given | 97% annualised return (main result, threshold ±1.0) | 22% | 2.5 (threshold ±1.0); 2.2 at ±0.5; 2.3 at ±1.5 | Q4 2024 (Oct–Jan 2025) bull test; Q1 2025 (Jan–Apr 2025) bear test; sensitivity range Oct 2024–Apr 2025 | Not given (1.7 trades/week at ±1.0) | 59% | ML-driven multi-factor model (RSI, MACD + on-chain gas/active address metrics); Information Ratio 1.2; avg holding period 4.5 days; IC mean 0.12; no transaction costs stated; note: 6-month backtest window is short; bull and bear sub-period tests reported separately |
| Huang et al. (2025). arXiv:2502.17493 | US stocks (daily rebalancing, public stock data) | Not given | Not given | 61.73% p.a. (2019–2024 test period) | Not given | 1.18 (2019–2024) | Test 1: 2019–2024 (1,340 days); Test 2: 2005–2010 (1,360 days) | Not given (daily rebalancing) | Not given | Return-weighted loss function for deep learning stock selection; 37.61% p.a. on 2005–2010 OOS test (Sharpe 0.97); two disjoint test windows including bear-market 2005–2010; no transaction costs or MDD stated; arXiv preprint 2025 |
| Nguyen (2026). arXiv:2602.11708 | Crypto perpetual swaps (150+ pairs, Binance Futures; top 20 by market cap) | Not given | Not given | 40.5% annualised (70/30 long-short) | 12.70% | 2.41 | Jan 2022 – Dec 2024 (OOS, 36 months; in-sample Jan–Dec 2021) | ~142 trades/month portfolio-wide | 54.2% (bull regime); overall not stated | AdaptiveTrend: 6-hour momentum + dynamic trailing stop + Sharpe-based asset selection; 4 bps taker fee + slippage + funding modelled; Sharpe retained >2.0 at 8 bps; strict OOS separation; bootstrap significance tests |
| Zarattini, Pagani & Barbon (2025). SSRN:5209907 | Crypto (Bitcoin + top-20 liquid altcoin rotation) | Not given | Not given | +10.8% annualised alpha vs. BTC | Not given | >1.5 | 2015 onwards (exact end date not stated) | Not given | Not given | Ensemble Donchian channel trend models (multi-lookback) + volatility position sizing; rotational portfolio; net-of-fees returns; transaction cost impact assessed; limited metric disclosure – backtest end date not stated |
| Jay & Berlanga (2024). SSRN:4987237 | Cryptocurrency (BTC / crypto market) | Not given | Not given | Not given (annualised return reported per model) | Reported (model-specific) | Reported per model (DQN best profit; LSTM best consistency; RF best drawdown control) | Not explicitly stated | Not given | Not given | Benchmarks DQN, LSTM, RF agents on crypto TA indicators; full suite: Total Return, Ann. Return, Volatility, Sharpe, Sortino, MDD, Calmar; no transaction costs stated; no single extractable top-line; exact period not published |
| Sattarov & Choi (2024). DOI: 10.1038/s41598-024-51408-w | Bitcoin (BTC-USD) | Not given | Not given | 29.93% annualised | Not given | 2.74 | Training: Oct 2014 – Oct 2018; Test: last 30 days (~720 hrs) within Oct 2018 – Mar 2019 | Not given | Not given | M-DQN (Multi-level Deep Q-Network) + Twitter sentiment; 3-module architecture; 1.5% round-trip transaction fee explicitly modelled; caution: test window only 30 days; Nature Scientific Reports (peer-reviewed) |
| Nguyen et al. (2025). DOI: 10.1080/23322039.2025.2594873 | Bitcoin (BTC-USD) | $1,000,000 | $125,903,751.60 | Not given (total return: +12,490.38% over Jan 2022 – May 2025) | Not given | Not given | Train: Jan 2012 – Dec 2021; Test (OOS): Jan 2022 – May 2025 | 1,240 | Not given | DQN meta-strategy selector (chooses among RSI, SMA Crossover, Bollinger Bands, Momentum-20d, VWAP Reversion); ⚠ No Sharpe or MDD stated; no transaction costs stated; 12,490% return is unvalidated by risk metrics – credibility caution; peer-reviewed journal |
| Ni, Zhang & Fu (2025). arXiv:2412.18202 | Cryptocurrency (BTC, ETH and major crypto assets) | Not given | Not given | Not given (outperforms buy-and-hold benchmark) | Not given | Not given | Not stated (dataset ends 2024) | Not given | Not given | Denoising autoencoder + CNN + GAN pipeline for crypto trading signal generation; outperforms buy-and-hold; Sharpe/return/drawdown numerics not stated in abstract – insufficient for main leaderboard row; listed for transparency |
| Al-Waked & Al-Zoubi (2025). DOI: 10.14569/IJACSA.2025.0161181 | Gold (XAU/USD) | Not given | Not given (cumulative: 80.21%) | 27.10% CAGR (PPO + Kalman filtering) | 0.48% | 12.10 (PPO + Kalman) ⚠ | Jan 2017 – Jan 2025 (8 years, hourly data; N=47,304) | Not given | Not given | PPO + Kalman filter for noise-resilient DRL trading; Kalman reduces microstructure noise; raw PPO baseline: Sharpe 0.45, CAGR 3.46%; ⚠ Sharpe 12.10 is unusually high – likely driven by very low realised volatility in filtered series and no transaction costs stated; no train/test OOS split stated; IJACSA peer-reviewed |
| Fatouros et al. (2024). arXiv:2412.19245 | US common stocks (news-based long-short portfolio) | Not given | Not given | Not given | Not given | 3.05 (OPT long-short); 2.11 (BERT); 2.07 (FinBERT) | Aug 2021 – Jul 2023 (24 months OOS) | Not given | Not given | Sentiment trading with LLMs (OPT, BERT, FinBERT, Loughran-McDonald); daily long-short strategy on next-day stock returns using news sentiment; Loughran-McDonald baseline Sharpe 1.23; no transaction costs or MDD stated; Sharpe only metric extractable; arXiv preprint 2024 |
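Where a row reports both an initial and a final balance, the annualised figure can be sanity-checked with the standard compound annual growth rate (CAGR) formula. The sketch below is a minimal Python illustration; the helper name `cagr` is hypothetical, and the formula is the conventional one rather than anything this page prescribes. It uses the balances from the Pillai, Ajith & Sumesh K J (2026) row and assumes the stated "2 years OOS" window is exactly 2.0 years.

```python
def cagr(initial_balance: float, final_balance: float, years: float) -> float:
    """Compound annual growth rate implied by a start balance, end balance and duration."""
    if initial_balance <= 0 or years <= 0:
        raise ValueError("initial_balance and years must be positive")
    return (final_balance / initial_balance) ** (1.0 / years) - 1.0


# Balances from the S&P 500 (100 stocks) row; years=2.0 is an assumption
# based on the stated "Jan 2023 – Jan 2025 (2 years OOS)" window.
rate = cagr(100_000, 235_492.83, 2.0)
print(f"{rate:.2%}")  # prints 53.46%
```

The result agrees with the 53.46% CAGR listed for that row. Rows that report simple (non-compounded) or cumulative returns will not reconcile this way, so the check applies only where the paper states that the figure is a CAGR.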