Labs

Services

  • Development

    Web applications, mobile applications, backend & distributed systems, API design & integration, database design & scaling
  • AI

    Model training & fine-tuning, LLM application design, agentic tooling & knowledge integration
  • Security

    Penetration testing, red team & adversary emulation, attack surface discovery & exposure management
  • Infrastructure

    Cloud architecture, containerization & platform engineering, CI/CD pipelines & release engineering, observability & SRE

Analysis

CryptoMamba Under the Hood

The arXiv abstract page identifies CryptoMamba as a daily Bitcoin price forecaster built on selective state space models, with v1 posted on 2025-01-02 and v2 on 2025-05-04; the v2 listing notes publication at IEEE ICBC 2025 and preserves the original four-author roster from the University of Southern California. The conference anchor for that claim is the official site of the IEEE International Conference on Blockchain and Cryptocurrency (ICBC) 2025, which ran in Pisa from 2025-06-02 to 2025-06-06 and supplies the program context in which the work appeared. The foundational mechanism CryptoMamba builds on is Mamba, the selective state space model introduced in "Mamba: Linear-Time Sequence Modeling with Selective State Spaces," which replaces attention with input-dependent state updates to keep complexity linear in sequence length.

The implementation commitments behind the paper are explicit in the CryptoMamba GitHub repository, which offers training scripts, evaluation utilities, trading simulators, configuration files, and checkpoints under an MIT license, along with a results table and a brief set of announcements about the work's presentation venues. The public code exposes clear entry points for training, evaluation, and one-day-ahead prediction and corresponds closely to the experimental protocol described in the paper; this strengthens the reproducibility signal even though the paper itself omits compute-hardware details and fixed seeds beyond mentioning reproducibility safeguards.

At the data layer the authors source BTC‑USD bars from the Yahoo Finance BTC‑USD historical data page and explicitly define their splits as train from 2018‑09‑17 to 2022‑09‑17, validation from 2022‑09‑17 to 2023‑09‑17, and test from 2023‑09‑17 to 2024‑09‑17, all at a daily cadence. The feature set is limited to open, high, low, close, and a toggle for including or excluding volume, with the open and close timestamps tracked in UTC as per the Yahoo schema. The choice to use a contiguous, recent six‑year window addresses a recurrent problem in the Bitcoin forecasting literature noted by the authors themselves: prior studies often confounded evaluation by mixing early illiquid years with modern high‑liquidity regimes, which makes like‑for‑like comparisons difficult across works.
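
For readers who want to make the data layer concrete, the sketch below pulls the same daily BTC-USD bars and cuts them at the paper's split boundaries. The use of the yfinance package, and the handling of the shared boundary dates, are assumptions for illustration, not the authors' own loader.

```python
# Minimal data-layer sketch: daily BTC-USD bars from Yahoo Finance, split at the
# paper's stated dates. yfinance is an assumed tool, not the authors' pipeline.
import yfinance as yf  # pip install yfinance

bars = yf.Ticker("BTC-USD").history(start="2018-09-17", end="2024-09-17", interval="1d")
features = bars[["Open", "High", "Low", "Close", "Volume"]]  # volume is the optional extra feature

splits = {
    "train":      features.loc["2018-09-17":"2022-09-17"],
    "validation": features.loc["2022-09-17":"2023-09-17"],
    "test":       features.loc["2023-09-17":"2024-09-17"],
}
for name, frame in splits.items():
    print(f"{name}: {frame.index.min().date()} -> {frame.index.max().date()} ({len(frame)} days)")
```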

CryptoMamba's architecture is presented as a hierarchy of C-Blocks, each made of stacked CMBlocks and an end-of-block linear projection, where each CMBlock is a normalization layer followed by a single Mamba block operating on the previous hidden state and the current day's multi-feature input. The outputs of several C-Blocks are merged through a final linear map to produce a one-step-ahead forecast of the next day's close, with sequence lengths at different depths set to 14, 16, and 32 and a Mamba state dimension of 64. Against this bespoke stack, the principal baseline sharing the same architectural lineage is S-Mamba, introduced in the "Is Mamba Effective for Time Series Forecasting?" study, which augments Mamba with a bidirectional block and a feed-forward module to better capture inter-variate correlations. Alongside S-Mamba, the paper compares against conventional recurrent baselines: a three-layer LSTM, a three-layer bidirectional LSTM, and a three-layer GRU, each with hidden size 100, all trained under identical data splits and supervision.
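
That description maps naturally onto a small PyTorch module. The sketch below assumes the public mamba_ssm package and treats the block depth, the time-axis projection, and the final merge as illustrative guesses rather than the authors' exact code.

```python
# Architecture sketch only: a CMBlock is LayerNorm + one Mamba block, a C-Block is a
# stack of CMBlocks closed by a linear projection, and a final linear head merges the
# C-Block outputs into a next-day close. Depths, projections, and the merge are assumptions.
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # pip install mamba-ssm (CUDA required)


class CMBlock(nn.Module):
    def __init__(self, d_model: int, d_state: int = 64):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.mamba = Mamba(d_model=d_model, d_state=d_state)

    def forward(self, x):                       # x: (batch, seq_len, d_model)
        return self.mamba(self.norm(x))


class CBlock(nn.Module):
    def __init__(self, d_model: int, in_len: int, out_len: int, n_cmblocks: int = 2):
        super().__init__()
        self.blocks = nn.Sequential(*[CMBlock(d_model) for _ in range(n_cmblocks)])
        self.proj = nn.Linear(in_len, out_len)  # end-of-block linear map along the time axis

    def forward(self, x):                       # (batch, in_len, d_model) -> (batch, out_len, d_model)
        x = self.blocks(x)
        return self.proj(x.transpose(1, 2)).transpose(1, 2)


class CryptoMambaSketch(nn.Module):
    """Chains C-Blocks across the 14 / 16 / 32 sequence lengths and merges their
    outputs into a single next-day close (the merge layout is an assumption)."""
    def __init__(self, n_features: int = 5, lengths=(14, 16, 32)):
        super().__init__()
        self.cblocks = nn.ModuleList(
            [CBlock(n_features, i, o) for i, o in zip(lengths[:-1], lengths[1:])]
        )
        self.head = nn.Linear(sum(lengths[1:]) * n_features, 1)

    def forward(self, x):                       # x: (batch, 14, n_features) OHLC(+V) window
        outs = []
        for cb in self.cblocks:
            x = cb(x)
            outs.append(x.flatten(1))
        return self.head(torch.cat(outs, dim=1)).squeeze(-1)


# model = CryptoMambaSketch().cuda()
# next_close = model(torch.randn(32, 14, 5, device="cuda"))
```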

The experimental setup follows a one-day-ahead protocol using a 14-day input window, with RMSE, MAE, and MAPE as the reported metrics and a common optimizer across models. For optimization the authors specify the Adam optimizer with an RMSE loss and a batch size of 32, together with a learning-rate scheduler, weight decay, early stopping, and model selection on the best validation loss. The data-pipeline guardrail against leakage is explicit: predictions within each split begin 14 days after the split's start, so the lookback window never crosses into the previous regime. This setup is modest by modern deep-learning standards but is consistent with the forecasting horizon and the daily granularity chosen.
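
The leakage guard is easy to make concrete: within each split, the first target is the fifteenth day, so every 14-day lookback stays inside the split. A small sketch of that windowing, with names chosen for illustration rather than taken from the repository:

```python
import numpy as np

def make_windows(split_values: np.ndarray, lookback: int = 14, close_col: int = 3):
    """split_values: (T, n_features) array for ONE split, ordered by day.
    Returns (X, y): X[i] covers days i..i+lookback-1 and y[i] is the close on day
    i+lookback, so the first prediction falls 14 days after the split starts and
    no window reaches back into the previous split."""
    X, y = [], []
    for i in range(len(split_values) - lookback):
        X.append(split_values[i : i + lookback])
        y.append(split_values[i + lookback, close_col])
    return np.stack(X), np.asarray(y)

# Training then follows the paper's stated recipe: Adam, an RMSE objective, batch
# size 32, weight decay, a learning-rate schedule, and early stopping on the best
# validation loss (hyperparameter values beyond those stated are not given).
```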

The quantitative picture on the test interval tilts decisively toward the state-space family. Without volume, the Mamba-based methods cluster at the top: CryptoMamba reports RMSE 1713.0, MAPE 2.171%, MAE 1200.9, while S-Mamba records RMSE 1717.4, MAPE 2.248%, MAE 1239.9; the recurrent baselines lag farther behind, with GRU at RMSE 1892.4 and LSTM at RMSE 2672.7. When the single additional feature of daily volume is switched on, every model improves but CryptoMamba's advantage widens: CryptoMamba-v achieves RMSE 1598.1, MAPE 2.034%, MAE 1120.7, compared to S-Mamba-v at RMSE 1651.6, MAPE 2.215%, MAE 1209.7, while the best recurrent model remains above those error levels. Because those figures are reported on the held-out 2023-09-17 to 2024-09-17 test span and mirror the pattern of the validation results on the earlier year, the margins are attributable to out-of-sample generalization rather than to overfitting to a specific volatility regime captured in training.
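
For clarity about what those numbers measure, assuming the conventional definitions: RMSE and MAE are in US dollars, while MAPE is scale-free and reported in percent.

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def mape(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)  # percent
```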

The paper further sizes the parameter counts to argue for efficiency, reporting the proposed model at roughly 136k parameters versus 330k for S-Mamba, 569k for Bi-LSTM, 204k for LSTM, and 153k for GRU. Given that sequence models are often memory-bandwidth bound on long contexts, the selective-state updates of Mamba provide an architectural path to linear scaling in sequence length; in that respect, CryptoMamba's design banks on the same property that the Mamba paper emphasizes and converts it into a smaller, more deployable footprint without ceding the top spot in accuracy on this dataset. A direct implication is that a well-tuned SSM can be both the most accurate and the smallest model in a regime where Transformers typically pay a quadratic tax on sequence length, which matters when deploying predictors in latency-sensitive or resource-constrained trading systems.
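
Those counts are straightforward to check against any re-implementation; in PyTorch the usual one-liner suffices.

```python
def count_parameters(model) -> int:
    """Trainable parameter count for any torch.nn.Module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# e.g. count_parameters(CryptoMambaSketch()) for the illustrative sketch above;
# the ~136k figure refers to the authors' actual model, not this sketch.
```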

To stress test predictive value beyond regression metrics, the authors embed the forecasts in two mechanical trading simulators. The Vanilla simulator uses an absolute change‑ratio threshold of 1% between today’s close and the predicted next‑day close to decide all‑in buy, all‑in sell, or no‑trade actions, intended as a coarse proxy for transaction cost avoidance. The Smart simulator wraps a simple uncertainty buffer around the prediction using a risk parameter of 2%—calibrated from validation MAPE—to define an upper and lower band and then scales trade aggressiveness by the distance of the current price to that band. Starting from a notional 100‑dollar balance, the ending balances after the one‑year test period show CryptoMamba at the top in both simulators, with CryptoMamba‑v ending at $246.58 in Vanilla and $213.20 in Smart, followed by S‑Mamba‑v at $182.63 and $203.17 respectively, while recurrent baselines cluster close to flat or modestly positive. While these backtests are intentionally simple and omit commissions, slippage, and market impact, the ranking consistency between Vanilla and Smart is an informative external check on the regression metrics: the models that reduce RMSE and MAPE the most also dominate under naive, rules‑based application to trading decisions.
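
The two rule sets reduce to a few lines each. In the sketch below, the 1% change-ratio threshold and the 2% risk band follow the paper's description, while the exact position-sizing formula inside the Smart band is an assumption made for illustration.

```python
def vanilla_step(balance, holdings, close_today, pred_close, threshold=0.01):
    """All-in buy / all-in sell / hold, driven by the predicted change ratio."""
    change = (pred_close - close_today) / close_today
    if change > threshold and balance > 0:        # predicted rise above threshold: buy all-in
        holdings += balance / close_today
        balance = 0.0
    elif change < -threshold and holdings > 0:    # predicted drop below threshold: sell all-in
        balance += holdings * close_today
        holdings = 0.0
    return balance, holdings                      # otherwise: no trade

def smart_step(balance, holdings, close_today, pred_close, risk=0.02):
    """Trade a fraction of cash or holdings scaled by distance to the +/- risk band
    around the prediction (the sizing rule here is an assumption)."""
    upper, lower = pred_close * (1 + risk), pred_close * (1 - risk)
    if close_today < lower and balance > 0:       # price below the band: buy proportionally
        frac = min(1.0, (lower - close_today) / (risk * pred_close))
        spend = balance * frac
        holdings += spend / close_today
        balance -= spend
    elif close_today > upper and holdings > 0:    # price above the band: sell proportionally
        frac = min(1.0, (close_today - upper) / (risk * pred_close))
        balance += holdings * frac * close_today
        holdings -= holdings * frac
    return balance, holdings

# Backtest loop (illustrative; test_closes and predicted_next_closes are hypothetical
# arrays): start at $100, step through the test year, then mark to market.
# balance, holdings = 100.0, 0.0
# for close_today, pred_close in zip(test_closes[:-1], predicted_next_closes):
#     balance, holdings = vanilla_step(balance, holdings, close_today, pred_close)
# final_value = balance + holdings * test_closes[-1]
```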

One practical detail that works in CryptoMamba's favor is the handling of regime isolation and lookback contamination. By starting the prediction stream 14 days after the start of each split, the authors ensure that no prediction's context includes data from an earlier regime, which is particularly important during the 2020–2021 pandemic shock and the 2023–2024 ETF-driven rally. The small, fixed two-week lookback also makes the model sensitive to local trend and momentum without depending on long memory; the longer receptive fields at deeper C-Blocks then provide the capacity to absorb medium-range dependencies without exploding the parameter count. The combined effect, a short local window fused with deeper selective state, tracks the intuition behind why SSMs are competitive on volatile sequences while remaining computationally lightweight.

The repository’s scope adds a useful comparator that the paper’s Table IV does not include explicitly: iTransformer, a channel‑inverted Transformer encoder introduced in the iTransformer paper. In the maintained results table within the public codebase, iTransformer and its volume‑augmented variant appear below the two Mamba‑based models and ahead of recurrent baselines in RMSE, MAPE, and MAE, which aligns with the broader time‑series evidence that channel‑first attention improves over vanilla time‑first attention but still cedes ground to selective SSMs on one‑step‑ahead forecasts of highly volatile univariate series. This addition in the code reinforces the authors’ central thesis without changing the basic ordering reported in the manuscript.

At the level of lineage and intellectual ancestry, CryptoMamba sits squarely in the modern SSM arc catalyzed by Mamba’s selective updates and hardware‑aware recurrence, but it also acknowledges and tests against a direct time‑series descendant in S‑Mamba. The Mamba codebase captures the broader engineering thrust behind SSMs as efficient drop‑in sequence modules that avoid quadratic costs, while the S‑Mamba study explored whether bidirectionality and explicit inter‑variate mixing could close the gap to Transformers on multivariate benchmarks. CryptoMamba’s choice to remain unidirectional and to focus on univariate next‑day price (optionally with volume) is deliberate and reflects the trading‑style use case: tomorrow’s close is causally determined by today and prior, and cross‑series correlations are unavailable in a single‑asset setup. That design decision keeps the structure simple enough to see incremental gains from volume yet complex enough to capture regime transitions on the price itself.

As a piece of empirical work, the paper is careful with several pitfalls that plague financial ML. It delineates absolute dates for dataset splits, avoids leakage by construction, and runs head‑to‑head comparisons under the same optimizer, window, and loss across baselines. It also reports both scale‑dependent and scale‑free metrics, which is useful when the asset traverses wide price ranges inside the test year. Where it is less complete is in operational realism: the trading simulators do not include commission schedules, spread costs, overnight financing for leveraged exposures, or exchange‑specific fee tiers, and the use of daily bars elides overnight gaps and intraday volatility that may dominate realized slippage. The backtests do, however, set conservative decision thresholds and bands intended to mute trading when predicted deltas are too small to matter net of fees, which blunts the most obvious criticisms about micro‑trade overactivity.

The claims about venue placement are broadly corroborated but deserve a precise reading. The arXiv v2 notes publication at ICBC 2025 and the conference's public site documents the event scope and dates, while the Financial AI workshop at ICLR lists CryptoMamba by title and authors on its accepted-papers page, indicating workshop-level dissemination on 2025-04-28 shortly before the conference presentation window at ICBC. The precise ICBC session slot fluctuated between the program's public view and a test page during the event's scheduling updates, but that discrepancy affects neither the technical content of the paper nor its code; it only situates how and where the community first encountered the work.

In light of the surrounding literature, the contribution is not that selective SSMs exist, nor that they are competitive on long contexts; those points are established in the Mamba paper and in follow‑on time‑series studies. The new information is that a small, domain‑tuned Mamba stack trained on recent daily BTC‑USD performs at or near the top against strong recurrent and Mamba‑family baselines, that adding a single additional feature—volume—yields consistent incremental gains across models, and that these gains translate into dominance under two simple, behaviorally distinct trading rules. It is also notable that the model wins on both accuracy and parameter count, suggesting that Mamba’s linear scaling lets very small models compete effectively on noisy univariate targets. What remains unknown from the paper is how this architecture fares in multi‑asset, cross‑sectional settings, at intraday frequencies, or under transaction‑cost models that penalize turnover; these are left to future work.

The practical lesson for practitioners is straightforward. If the task is next‑day forecasting of a single crypto asset on daily bars, a compact, selective state space model purpose‑built for volatility and trained with clean splits and small lookbacks is a sensible default, and adding volume is likely to help. If the task shifts to multivariate or cross‑sectional forecasting, bidirectional SSM variants and channel‑inverted Transformers are still worth testing head‑to‑head, but the burden of proof now lies with them to beat a well‑tuned Mamba‑based baseline of the sort CryptoMamba represents on both accuracy and efficiency.

References

More Analysis

Past Work

Companies We've Worked For & Who Use Our Software

Google Fairfax ASRC Mandriva Linux Mozilla

Contact

Our schedule's currently full, but drop us a line and we'll see what we can do.