An updated version of qt_str2_pkg that adds pre-training and retraining modules and removes survivorship bias at the data level.
This is the GitHub Pages site for STR2 Plus, a quantitative trading algorithm that trains a TabNet model and a fine-tuned GPT-3 model on a classification task: predicting short-term stock trends. Its inputs combine technical indicators such as RSI with historical daily close prices.
Specifically, we first detect an RSI bullish divergence pattern, then gather 53 technical indicators together with the historical daily close prices from the pattern formation period. These are the key ingredients of the training and validation datasets.
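As a rough illustration of the pattern-detection step, the sketch below computes RSI and flags a bullish divergence, meaning the price makes a lower low while RSI makes a higher low. It is not taken from tip.py: Wilder smoothing, the 14-day default period, the swing-low rule with a 2-bar window, and the function names `rsi`, `local_lows` and `bullish_divergences` are all assumptions made for this example.

```python
from typing import List, Optional, Tuple


def rsi(closes: List[float], period: int = 14) -> List[Optional[float]]:
    """Wilder-smoothed RSI. The first `period` entries are None (no history yet)."""
    if len(closes) <= period:
        return [None] * len(closes)
    changes = [closes[i] - closes[i - 1] for i in range(1, len(closes))]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period

    def to_rsi(gain: float, loss: float) -> float:
        # Convention assumed here: no losses in the window means RSI = 100.
        if loss == 0.0:
            return 100.0
        return 100.0 - 100.0 / (1.0 + gain / loss)

    out: List[Optional[float]] = [None] * period
    out.append(to_rsi(avg_gain, avg_loss))
    for i in range(period, len(changes)):
        avg_gain = (avg_gain * (period - 1) + gains[i]) / period
        avg_loss = (avg_loss * (period - 1) + losses[i]) / period
        out.append(to_rsi(avg_gain, avg_loss))
    return out


def local_lows(values: List[float], window: int = 2) -> List[int]:
    """Indices whose value is the unique minimum within +/- `window` bars."""
    lows = []
    for i in range(window, len(values) - window):
        hood = values[i - window:i + window + 1]
        if values[i] == min(hood) and hood.count(values[i]) == 1:
            lows.append(i)
    return lows


def bullish_divergences(closes: List[float], period: int = 14,
                        window: int = 2) -> List[Tuple[int, int]]:
    """Pairs of consecutive swing lows (i, j) with a lower price low
    but a higher RSI low."""
    r = rsi(closes, period)
    lows = [i for i in local_lows(closes, window) if r[i] is not None]
    return [(i, j) for i, j in zip(lows, lows[1:])
            if closes[j] < closes[i] and r[j] > r[i]]
```

Calling `bullish_divergences(closes)` on a list of daily closes returns index pairs marking the start and end of each candidate pattern formation period, which is the window over which the indicators and close prices would then be collected.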
The algorithm aims to generate profitable trading signals for short-term investments. For each entry, I want to find out which of the upcoming 4 days has the highest probability of the stock rising by more than 1% (this threshold is a hyperparameter and may change, but I’ll settle for it for now), and then label the entry's target with the number of that day.
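The labeling rule can be sketched as follows. This is one possible reading rather than the repository's actual code: it assumes returns are measured against the close on the entry day, that the label is the day (1 to 4) with the largest return above the threshold, and that 0 means no day cleared the threshold. The function name `label_entry` and its parameters are hypothetical.

```python
from typing import List


def label_entry(closes: List[float], entry: int,
                horizon: int = 4, threshold: float = 0.01) -> int:
    """Return the day offset (1..horizon) with the largest return above
    `threshold` relative to closes[entry], or 0 if no day qualifies."""
    base = closes[entry]
    best_day, best_ret = 0, threshold
    for day in range(1, horizon + 1):
        if entry + day >= len(closes):
            break  # not enough future data for this offset
        ret = closes[entry + day] / base - 1.0
        if ret > best_ret:
            best_day, best_ret = day, ret
    return best_day
```

With this encoding the task becomes a 5-class classification problem (classes 0 to 4), and both `horizon` and `threshold` stay tunable.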
With help from TabNet and the fine-tuned GPT-3 model, the algorithm provides a daily stock pick list with the predicted uptrend day and its probability. We have also integrated the Kelly Criterion for multi-asset portfolio allocation.
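For intuition on the allocation step, here is a minimal sketch of the unconstrained continuous-time Kelly approximation, where the weight vector solves `cov * f = mean_excess`. It is not the KellyPortfolio implementation, which may use a constrained optimizer instead. Clipping negative weights and rescaling to a total of at most 1 is a simple heuristic assumed here, and the names `solve_linear` and `kelly_weights` are hypothetical.

```python
from typing import List


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def kelly_weights(mean_excess: List[float], cov: List[List[float]],
                  max_total: float = 1.0) -> List[float]:
    """Kelly fractions from expected excess returns and their covariance.

    Heuristic (an assumption, not the exact constrained optimum):
    short positions are clipped to 0 and the result is scaled down
    so that the weights sum to at most `max_total`.
    """
    raw = solve_linear(cov, mean_excess)
    weights = [max(w, 0.0) for w in raw]
    total = sum(weights)
    if total > max_total:
        weights = [w * max_total / total for w in weights]
    return weights
```

For example, two uncorrelated assets with expected excess returns of 2% and 1% and variances of 0.04 and 0.01 give raw fractions of 0.5 and 1.0, which the rescaling turns into one third and two thirds of the portfolio.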
The picked symbols, Sharpe Ratio, predictions from both models, and the allocation percentage suggested by the Kelly Criterion are sent to the user by email on a daily basis.
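The Sharpe Ratio included in the email can be computed along these lines. This is a sketch under stated assumptions, not the repository's code: it takes daily returns as input, annualizes with 252 trading days, defaults the risk-free rate to zero, and uses the sample standard deviation. The function name `sharpe_ratio` is hypothetical.

```python
import math
from typing import Sequence


def sharpe_ratio(daily_returns: Sequence[float],
                 risk_free_daily: float = 0.0,
                 periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio from a series of daily returns."""
    excess = [r - risk_free_daily for r in daily_returns]
    n = len(excess)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(excess) / n
    var = sum((r - mean) ** 2 for r in excess) / (n - 1)  # sample variance
    if var == 0.0:
        raise ValueError("returns have zero variance")
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)
```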
Repository Contents
The repository contains the following main components:
data/: a directory with sample data files for training and testing the algorithm.
tests/: a directory with Python scripts for testing.
models/: a directory with Python scripts for data wrangling and model training. Scripts: wrangle.py, tabnet.py, gpt3.py (both models are currently integrated in one Python script and will be separated later).
qt_str2_plus/: a directory containing all the key ingredients for catching the RSI Bullish Divergence Pattern. Scripts: tip.py (technical indicator parsing), hbt.py (historical back test), lwr.py (latest winner result), utils.py (utility functions for data preprocessing, feature extraction, performance evaluation, and other features shared across scripts).
References
The algorithm is based on several research papers and books. Some of the main references are:
• Chan, E.P. (2021) Quantitative trading: How to build your own algorithmic trading business. S.l.: John Wiley & Sons.
• Brown, C.M. (2012) Technical analysis for the trading professional: Strategies and techniques for today’s turbulent Global Financial Markets. New York: McGraw-Hill.
• Arik, S.O. and Pfister, T. (2020) TabNet: Attentive interpretable tabular learning, arXiv.org. Available at: https://arxiv.org/abs/1908.07442 (Accessed: April 10, 2023).
• Vaswani, A. et al. (2017) Attention is all you need, arXiv.org. Available at: https://arxiv.org/abs/1706.03762 (Accessed: April 10, 2023).
• Brown, T.B. et al. (2020) Language models are few-shot learners, arXiv.org. Available at: https://arxiv.org/abs/2005.14165 (Accessed: April 10, 2023).
• Thorp, E.O., MacLean, L.C. and Ziemba, W.T. (2011) The Kelly Capital Growth Investment Criterion: Theory and Practice. Singapore: World Scientific Pub. Co.
I also used code from the following open-source GitHub repositories:
• thk3421-models/KellyPortfolio, GitHub. Available at: https://github.com/thk3421-models/KellyPortfolio (Accessed: April 10, 2023).
• dreamquark-ai/tabnet, GitHub. Available at: https://github.com/dreamquark-ai/tabnet (Accessed: April 10, 2023).
Contact Information
My name is Mark. If you have any questions, feedback, or suggestions regarding the algorithm or the repository, please feel free to contact me at xiucatwithmark@gmail.com.