Java deep learning algorithms and deep neural networks with GPU acceleration
Update
This is a newer version of the framework, which I developed while working at ExB Research. Currently you can build the project, but some of the tests are not working. If you want to access the previous version, it is available in the old branch.
This is a Java implementation of some of the algorithms for training deep neural networks. GPU support is provided via OpenCL and Aparapi.
The architecture is designed with modularity, extensibility and pluggability in mind.
I’m using the git-flow model. The most stable (but older) sources are available in the master branch, while the latest ones are in the develop branch.
If you want to use the previous Java 7 compatible version you can check out this release.
All the algorithms support GPU execution.
Out of the box supported datasets are MNIST, CIFAR-10/CIFAR-100, IRIS and XOR, but you can easily implement your own.
Experimental support for RGB image preprocessing operations - affine transformations, cropping, and color scaling (see GeneralTest.java -> testImageInputProvider).
All the activation functions support GPU execution. They can be applied to all types of networks and all training algorithms. You can also implement new activations.
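To illustrate what adding a new activation involves conceptually, here is a minimal sketch; the interface name and method signature are assumptions made for the example and are not the framework's actual API.

```java
// Hypothetical element-wise activation interface (sketch only, not the real API).
interface ActivationFunction {
    void apply(float[] values); // applied in place to a flattened array of activations
}

// Leaky ReLU as an example of a custom activation.
class LeakyReLU implements ActivationFunction {
    private final float slope;

    LeakyReLU(float slope) {
        this.slope = slope;
    }

    @Override
    public void apply(float[] values) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0) {
                values[i] *= slope;
            }
        }
    }
}
```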
The samples are organized as unit tests. If you want to see examples on various popular datasets, you can go to nn-samples/src/test/java/com/github/neuralnetworks/samples/.
There are two projects: the core framework and the samples project (nn-samples, referenced above).
The software design is tiered, each tier depending on the previous ones.
This is the first “tier”. Each network is defined by a list of layers. Each layer has a set of connections that link it to the other layers of the network, making the network a directed acyclic graph. This structure can accommodate simple feedforward nets, but also more complex architectures like the one described in http://www.cs.toronto.edu/~hinton/absps/imagenet.pdf. You can build your own specific network.
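To make the layer/connection structure concrete, here is a rough sketch of how a small feedforward network might be assembled from layers and fully connected connections; the class names, constructor arguments, and addConnections call are assumptions for the sketch and may differ from the actual API.

```java
// Sketch only: names and signatures are assumptions, not necessarily the real API.
Layer input = new Layer();
Layer hidden = new Layer();
Layer output = new Layer();

// Connections link layers to each other, so the network forms a directed acyclic graph.
Connections inputToHidden = new FullyConnected(input, hidden, 784, 300);
Connections hiddenToOutput = new FullyConnected(hidden, output, 300, 10);

NeuralNetworkImpl network = new NeuralNetworkImpl();
network.addConnections(inputToHidden, hiddenToOutput);
```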
This tier propagates data through the network, taking advantage of its graph structure. There are two main base components: a layer calculator, which determines the order in which the layers are computed, and the ConnectionCalculator, which computes the values passing through each connection.
Most of the ConnectionCalculator implementations are optimized for GPU execution. There are two implementations - native OpenCL and Aparapi. Aparapi imposes some important restrictions on the code that can be executed on the GPU; most significantly, only one-dimensional arrays and variables of primitive data types are supported.
Therefore, before each GPU calculation all the data is converted to one-dimensional arrays and primitive-type variables. Because of this, all Aparapi neuron types use either AparapiWeightedSum (for fully connected layers and weighted sum input functions), AparapiSubsampling2D (for subsampling layers), or AparapiConv2D (for convolutional layers).
Most of the data is represented as one-dimensional arrays by default (for example the Matrix class).
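As an illustration of why the flattening matters, here is a simplified Aparapi-style kernel that computes a weighted sum over a row-major, one-dimensional weight array; it sketches the general pattern and is not the framework's actual AparapiWeightedSum implementation.

```java
import com.amd.aparapi.Kernel;

// Simplified sketch: weights is a row-major flattened matrix of size
// outputCount x inputCount, and only one-dimensional primitive arrays are used,
// as Aparapi requires.
public class WeightedSumKernel extends Kernel {
    private final float[] input;    // inputCount elements
    private final float[] weights;  // outputCount * inputCount elements
    private final float[] output;   // outputCount elements
    private final int inputCount;

    public WeightedSumKernel(float[] input, float[] weights, float[] output, int inputCount) {
        this.input = input;
        this.weights = weights;
        this.output = output;
        this.inputCount = inputCount;
    }

    @Override
    public void run() {
        int row = getGlobalId(); // one work item per output neuron
        float sum = 0f;
        for (int col = 0; col < inputCount; col++) {
            sum += weights[row * inputCount + col] * input[col];
        }
        output[row] = sum;
    }
}

// Usage: new WeightedSumKernel(in, w, out, inputCount).execute(outputCount);
```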
The native OpenCL implementation does not have these restrictions.
All the trainers are based on the Trainer base class. They are optimized to run on the GPU, but you can plug in other implementations and new training algorithms. The training procedure has training and testing phases. Each Trainer receives its parameters (for example learning rate, momentum, etc.) via Properties (a HashMap). For the supported properties of each trainer, please check the TrainerFactory class.
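As a rough illustration of the configuration flow, the wiring might look like the sketch below; the property keys, factory method, and signatures are assumptions made for the example, so check TrainerFactory for the actual supported properties and methods.

```java
// Sketch only: key names, factory method, and signatures are assumptions.
Properties props = new Properties();   // Properties is a HashMap of training parameters
props.put("learningRate", 0.01f);
props.put("momentum", 0.5f);

// Hypothetical factory call wiring the network, the input providers, and the properties.
Trainer trainer = TrainerFactory.backpropagation(network, trainInputProvider, testInputProvider, props);
trainer.train();   // training phase
trainer.test();    // testing phase
```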
Input is provided to the neural network by the trainers via the TrainingInputProvider interface. Each TrainingInputProvider provides training samples in the form of TrainingInputData (the default implementation is TrainingInputDataImpl). The input can be modified by a list of modifiers - for example MeanInputFunction (for subtracting the mean value) and ScalingInputFunction (for scaling within a range). Currently MnistInputProvider and IrisInputProvider are implemented.
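For example, the MNIST provider might be wired up with input modifiers roughly as in the sketch below; the constructor arguments and the modifier-registration call are assumptions, not the exact API.

```java
// Hypothetical wiring (sketch only): constructor arguments and addInputModifier
// are assumptions; check MnistInputProvider and the input functions for the real API.
MnistInputProvider trainInput = new MnistInputProvider("train-images.idx3-ubyte", "train-labels.idx1-ubyte");
trainInput.addInputModifier(new ScalingInputFunction(255)); // intended to scale pixel values into a fixed range
trainInput.addInputModifier(new MeanInputFunction());       // intended to subtract the mean value
```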
Ivan Vasilev (ivanvasilev [at] gmail (dot) com)