In this repository, I have developed the complete server-side architecture for real-time stock market prediction with machine learning. I used TensorFlow.js to build the ML model and Kafka for real-time data streaming and pipelining.
Start the ZooKeeper service.
$ bin/zookeeper-server-start.sh config/zookeeper.properties
Start the Kafka broker service.
$ bin/kafka-server-start.sh config/server.properties
Here, I have used the .csv files in the dataset folder as the data source. The data source is piped through two Kafka topics: the first topic delivers logs to MongoDB, and the second topic delivers logs to the TensorFlow.js model for real-time prediction.
Because logs stream from the data source through a producer and consumers, this architecture supports real-time analysis, ML model training, and model prediction in parallel.
Start the producer with:
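The fan-out to two topics described above can be sketched as a pure function that turns one CSV row into one payload per topic. This is a minimal sketch, not the code in producer.js: the topic names, the column names other than Date and Open, and the sample row are hypothetical, and the `{ topic, messages }` payload shape is an assumption that may need adapting to whichever Kafka client the repository uses.

```javascript
// Hypothetical topic names; the repository's actual names may differ.
const TOPICS = { storage: "stock-logs-db", prediction: "stock-logs-ml" };

// Turn one CSV line into a plain object keyed by the header columns.
function parseCsvLine(header, line) {
  const values = line.split(",");
  const record = {};
  header.forEach((name, i) => {
    record[name.trim()] = (values[i] ?? "").trim();
  });
  return record;
}

// Build one payload per topic so the same log reaches both consumers.
function fanOut(record, topics = TOPICS) {
  const value = JSON.stringify(record);
  return Object.values(topics).map((topic) => ({ topic, messages: [{ value }] }));
}

// Illustrative header and row (made-up values, not taken from the dataset).
const header = "Date,Open,High,Low,Close".split(",");
const payloads = fanOut(parseCsvLine(header, "2020-01-02,101.5,103.2,100.9,102.7"));
console.log(JSON.stringify(payloads, null, 2));
```

Keeping the row parsing and payload building separate from the actual send call makes the routing logic easy to check without a running broker.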
$ node producer.js
# or
$ start.sh
Streaming producer logs.
Start the consumer with:
$ node consumer.js
Streaming consumer logs.
In the consumer (consumer.js), incoming logs are written to MongoDB for later model training and analysis.
The machine learning model is built with TensorFlow.js. It is trained on 80% of the stored data and validated against the remaining 20%. Because this is a time-series problem, the data must be pre-processed before training. Here it is normalized with min-max scaling.
Training the ML model.
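As a rough illustration of the pre-processing step, the sketch below splits a series 80/20 and applies min-max scaling, which maps each value x to (x - min) / (max - min). It is written under assumptions: the split is taken chronologically, the scaler is fitted on the training portion only, and the helper names and sample prices are hypothetical. The repository's tf_train.js may do any of this differently.

```javascript
// Fit a min-max scaler: remember min and max, then map values into [0, 1].
function fitMinMax(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const range = max - min || 1; // avoid division by zero for a constant series
  return {
    min,
    max,
    scale: (x) => (x - min) / range,
    invert: (y) => y * range + min, // map a scaled value back to a price
  };
}

// Chronological split: the first part trains, the rest validates.
function splitSeries(values, trainFraction = 0.8) {
  const cut = Math.floor(values.length * trainFraction);
  return { train: values.slice(0, cut), validation: values.slice(cut) };
}

// Made-up "Open" prices, for illustration only.
const open = [10, 12, 11, 15, 14, 18, 20, 17, 16, 19];
const { train, validation } = splitSeries(open);
const scaler = fitMinMax(train);
const scaledTrain = train.map(scaler.scale);
const scaledValidation = validation.map(scaler.scale);
console.log(scaledTrain, scaledValidation);
```

Fitting the scaler on the training portion alone keeps validation values from influencing the scaling, at the cost that validation values can fall outside [0, 1].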
$ node tf_train.js
# or
$ server.sh
Validating the model.
$ node tf_validate.js
After validation, the actual and predicted values, along with the date and the attribute of the stock-market time series the model was trained on, are written to MongoDB.
The weights of the trained model are saved and then loaded by the consumer that subscribes to the second topic of the Kafka stream, where the model predicts the next time-series event in real time. Because both topics of the Kafka pipeline run in parallel and Kafka streams the logs in real time, the machine learning model can be trained and can predict the target in real time.
$ node ml_consumer.js
Prediction [attribute] [predicted value].
Example: Prediction Open 0.12453
This line in the above image (ml_consumer.js output) shows the model's real-time prediction. The model takes the 7 preceding time-series logs as input and predicts the 8th time-series event.
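The 7-in, 1-out framing can be sketched as a sliding window over an already scaled series. This is a minimal sketch under assumptions: `makeWindows` is a hypothetical helper, the sample values are made up, and the repository may build its training tensors differently.

```javascript
// Build (input, target) pairs: each input holds `windowSize` consecutive
// values and the target is the value that immediately follows them.
function makeWindows(series, windowSize = 7) {
  const inputs = [];
  const targets = [];
  for (let i = 0; i + windowSize < series.length; i++) {
    inputs.push(series.slice(i, i + windowSize));
    targets.push(series[i + windowSize]);
  }
  return { inputs, targets };
}

// Made-up scaled values, for illustration only.
const series = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9];
const { inputs, targets } = makeWindows(series);
console.log(inputs.length, inputs[0], targets[0]);
```

With 9 values and a window of 7, this yields two pairs: the first 7 values predict the 8th, and values 2 through 8 predict the 9th. At prediction time, the consumer would only need the 7 most recent logs to form one input.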