Article View - TPM - Testing, Psychometrics, Methodology in Applied Psychology

CHEN HENG, WEI XIANHUA

This paper reproduces and extends the study “Can ChatGPT and DeepSeek Predict Stock Markets and Macroeconomics?” Using news texts and A-share market data, it constructs features that incorporate sentiment scores from large language models and uses them to evaluate linear models, ensemble learning, ARIMA, and several combination forecasting methods. The results indicate that sentiment features derived from ChatGPT outperform those derived from DeepSeek in directional accuracy and out-of-sample R²; that the sentiment score emerges as the most significant predictor; and that Iterative Weighted Combination (IWC) achieves the best out-of-sample performance. The study points to considerable potential for LLM applications in financial forecasting.
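The abstract does not spell out how its two headline metrics are computed. The sketch below follows common conventions in return forecasting and should be read as an assumption rather than the authors' implementation: directional accuracy is taken as the share of periods in which the forecast and the realized return have the same sign, and out-of-sample R² as one minus the ratio of the model's squared forecast errors to those of a benchmark forecast. The function names, the benchmark choice, and the sample numbers are all hypothetical.

```python
from typing import Sequence


def directional_accuracy(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Share of periods where the forecast and the realized value agree in sign.

    Assumption: a value counts as "up" only if it is strictly positive.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for a, p in zip(actual, predicted) if (a > 0) == (p > 0))
    return hits / len(actual)


def out_of_sample_r2(
    actual: Sequence[float],
    predicted: Sequence[float],
    benchmark: Sequence[float],
) -> float:
    """Return 1 - SSE(model) / SSE(benchmark).

    A positive value means the model's forecasts have lower squared error
    than the benchmark's over the evaluation window.
    """
    if not (len(actual) == len(predicted) == len(benchmark)) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    if sse_bench == 0:
        raise ValueError("benchmark forecasts are perfect; R2 is undefined")
    return 1.0 - sse_model / sse_bench


# Hypothetical illustration data (not taken from the paper).
actual = [0.01, -0.02, 0.015, -0.005]
predicted = [0.005, -0.01, -0.002, -0.004]
benchmark = [0.0, 0.0, 0.0, 0.0]  # assumed naive "no change" forecast

da = directional_accuracy(actual, predicted)
r2 = out_of_sample_r2(actual, predicted, benchmark)
print(f"directional accuracy: {da:.2f}")
print(f"out-of-sample R2: {r2:.4f}")
```

In this toy data the forecast gets three of four directions right. A zero-return benchmark is used only to keep the example short; an expanding-window historical mean is another frequently used choice.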