From my experience with cryptocurrencies and the Intelligent Trading Bot [0], I would say that transformers will not provide significant benefits when applied to traditional statistical (numeric) forecasting problems. Models for such problems assume that old events have little influence on current ones.
Yet there are problems where even old events retain their strength. An example is taking discrete events (the analogue of tokens in an LLM) into account when predicting stock prices. These events might be explicitly defined (holidays, company announcements, important economic figures, etc.) or derived from the data, like technical patterns. The strength of transformers is their ability to ignore the order of events and the distances between them. More precisely, transformers can learn when order and distance matter. In language models, this is used to generate output sequences in which semantically equivalent tokens appear in a completely different order than in the input sequence. Something similar can be done in time series forecasting if we define "tokens" accordingly, for example as technical patterns. Then rising stock prices can be explained (and predicted) not only by recent numeric behavior but also by the fact that "something happened" two weeks ago.
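To make the "distance does not matter" point concrete, here is a minimal sketch of plain scaled dot-product attention over event tokens, with no positional encoding. The event vocabulary, the random embeddings and the helper names are all made up for illustration (they are not taken from the bot), and a real transformer would add positional information and learn when to use it.

```python
import numpy as np

def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    scores = keys @ query / np.sqrt(query.shape[0])
    scores = scores - scores.max()  # subtract max for numerical stability
    w = np.exp(scores)
    return w / w.sum()

rng = np.random.default_rng(0)
dim = 8

# Hypothetical event vocabulary; random vectors stand in for learned embeddings.
vocab = ["flat", "holiday", "announcement", "breakout"]
emb = {name: rng.normal(size=dim) for name in vocab}

def weight_on_event(history, event="announcement"):
    """Attention weight that a query 'looking for' `event` puts on it in `history`."""
    keys = np.stack([emb[token] for token in history])
    query = emb[event]  # stand-in for a learned query vector
    w = attention_weights(query, keys)
    return w[history.index(event)]

# Same 14-day history content, but the event sits at different distances.
recent = ["flat"] * 13 + ["announcement"]  # event happened yesterday
old = ["announcement"] + ["flat"] * 13     # event happened two weeks ago

w_recent = weight_on_event(recent)
w_old = weight_on_event(old)
print(w_recent, w_old)  # the two weights agree up to floating-point error
```

Because the scores depend only on token content, the event gets the same weight whether it happened yesterday or two weeks ago. A recurrent or exponentially weighted model would have to carry that information through every intermediate step.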
[0] https://github.com/asavinov/intelligent-trading-bot Intelligent Trading Bot: Automatically generating signals and trading based on machine learning and feature engineering