Computer Science and Computer Engineering

The article is an overview. We carry out the comparison of actual machine learning libraries that can be used the neural networks development. The first part of the article gives a brief description of TensorFlow, PyTorch, Theano, Keras, SciKit Learn libraries, SciPy library stack. An overview of the scope of these libraries and the main technical characteristics, such as performance, supported programming languages, the current state of development is given. In the second part of the article, a comparison of five libraries is carried out on the example of a multilayer perceptron, which is applied to the problem of handwritten digits recognizing. This problem is well known and well suited for testing different types of neural networks. The study time is compared depending on the number of epochs and the accuracy of the classifier. The results of the comparison are presented in the form of graphs of training time and accuracy depending on the number of epochs and in tabular form.


Introduction
Due to the vast development of machine learning and data science, it is not possible to review the diversity of available software solutions. In this section we will consider only the most popular libraries and frameworks for neural networks development and machine learning.
The most common language for building neural networks at the moment is the Python language [1]. There is a number of reasons why this language has occupied this domain.
-Python is easy-to-learn language actively used in the field of school and university education. Because of this, it has gained popularity not only in industrial programming, but also among professionals who use programming as a research tool. -The standard cpython interpreter makes it easy to create bindings for C-function calls, allowing Python to be used as a convenient interface for low-level libraries. -The community has created a wide range of tools for interactive Python code execution and data visualization (e.g. [2]- [4]). Especially it is useful for scientific research, where almost always there is no original clear algorithm of solutions and it is necessary to conduct a scientific search. A significant disadvantage of Python is its low performance, which can be overcome by writing critical parts of software in a compiled language (it is usually C or C++) or by using cython [5] translator.
Many machine learning libraries are also written in two or more languages. The part of software, which handles main part of computations, is usually implemented in C or C++ (the core part or backend). Pure Python is used for bindings to organize a convenient and easy to use interface (interface part or frontend). So, if a library is implemented in pure Python, then in most cases: -It is an add-on to another, lower-level library and provides a more user-friendly and easy-to-learn interface; -It is designed for educational purposes or for prototyping. Note also that all the libraries in this review are free open source software. Also note that Python 2 support will be discontinued from the beginning of 2020 for vast majority of python libs.

Overview of machine learning (ML) libraries
From all reviewed libraries, only TensorFlow and PyTorch directly compete with each other. Other libraries complement each other's functionality and specialize in their own area.

Scientific Python (SciPy)
Scientific Python libraries set is not directly related to machine learning, but many machine learning libraries rely on Scientific Python components in their work. Let us briefly describe the main components included in this set.
-NumPy [6] is the library which implements high performance arrays and tools for them. The computational core is written in C (52% of code base) with the interface part in Python (48% of code base). Linear algebra functions heavily rely on LAPACK library. NumPy implements variety of linear algebra methods for working with vectors, matrices and tensors (multidimensional arrays in this case). It also supports parallel computing by utilizing vector capabilities of modern CPUs. -SciPy [7] is the library that implements many mathematical methods, such as algebraic equations and differential equations solvers, polynomial interpolation, various optimization methods, etc. For a number of methods the Fortran (about 23% of the total code base) and C (20% of the code base) libraries are used. -Pandas [8] is the library designed to work with time series and table data (DataFrame data structure). It is written almost entirely in pure Python using NumPy arrays and is often used in machine learning to organize training and test samples. In addition to these three main libraries, the scientific Python stack also includes Matplotlib [3] for data visualization and plotting, and a set of interactive shells, such as iPython and Jupyter [2].

TensorFlow
TensorFlow [9], [10] is an open source library used primarily for deep machine learning. It was originally developed on by Google's divisions, but in 2015 it was released as free open source software under the Apache License 2.0. The current stable version is 2.1.0.
The computational core is written in C++ (60% of all code) using CUDA technology, which allows one to utilize graphics cards in calculations. The interface part is implemented in Python (30% of all code base). There are also unofficial bindings for other languages, but only C++ and Python interfaces are officially supported.
The library is based on the principle of data flows (dataflow), according to which the program is organized in the form of computational blocks associated with each other in the form of a directed graph which is called computational graph. Data is processed by passing from one block to another.
Such application architecture makes it easy to use parallel calculations on both multi-core CPUs and distributed cluster systems. In addition, it is well suited for building neural networks in which each neuron is presented by an independent component.
In addition to the computational graph, TensorFlow uses a data structure called tensor. It is similar to the tensor from differential geometry in the sense that it is a multidimensional array.

PyTorch
The PyTorch [11] library was created on the basis of Torch [12]. The original Torch library was developed in C and used Lua as the interface. With the growth of Python popularity in machine learning, Torch has been rewritten in C++11/CUDA (60% code) and Python (32% code). Initial development was conducted in the company of Facebook, but currently PyTorch is an OpenSource library, distributed under a BSD-like license. The current version is 1.3.1.
PyTorch, as well as TensorFlow, is built on the basis of dataflow concept. The main difference from TensorFlow is that in TensorFlow computational graph is static, then in PyTorch the graph is dynamic. This means that one can modify the graph on the fly, adding or removing nodes as needed. In TensorFlow, the entire graph must be specified before the model run.
The developers of PyTorch emphasize that Python is tightly integrated into the library (library is more pythonic). This makes it easier to use than TensorFlow, as the user does not have to dive into low-level parts written in C++.
It is worth noting, however, that TensorFlow surpasses PyTorch in popularity, as it appeared earlier and is used in many educational courses on machine learning.

Theano
Theano [13] library is a Python interface for the optimizing compiler. It allows user to specify functions and after that translates them to C++. Then Theano compiles C++ code to run it on the CPU (using g++ for compilation), or on the graphics accelerator (using nvcc to utilize CUDA). In addition, automatic differentiation algorithms are built into the library.
After optimization and compilation, the functions become available as regular python functions, but have high performance. Vector, matrix, and tensor operations are supported and efficiently parallelized on available hardware (multi-core processor or graphics accelerator).
With support for multidimensional array operations and automatic differentiation, Theano is widely used as a backend for building neural networks. In particular, it can be used by the Keras library.
Theano is written almost entirely in Python, but requires NumPy, SciPy, py-CUDA and BLAS, as well as g++ or NVIDIA CUDA compilers (recommended for optimal performance).
Development of the library was suspended in 2017, but resumed in 2018. The current version is 1.0.4.

Keras
The Keras [14] library provides a high-level programming interface for building neural networks. It can work on top of TensorFlow, Microsoft Cognitive Toolkit (CNTK) [15] or Theano [13]. The library is written entirely in Python and is distributed under the MIT license. Current version 2.3. 1 The library is based on the following principles: ease of use, modularity, extensibility.
The modularity principle allows you to separately describe the neural layers, optimizers, activator functions, etc, and then combine them into a single model. The model is fully described in Python. The created model can be saved to disk for further use and distribution.

SciKit Learn
SciKit Learn [16] is the library for data processing. It implements various methods of classification, regression analysis, clustering and other algorithms related to classical machine learning. It is written almost entirely in Python (98% of all code base), but uses NumPy and SciPy for algorithms implementation. Despite the fact that number of current version is 0.21.1, the project is very stable, as it has been developing since 2007.
SciKit Learn is suitable for traditional machine learning and data preprocessing tasks. This library does not support the concept of dataflow and does not allow one to create his own models. The absence of a computational graph does not allow flexible scale of models for multi-core processors and graphics accelerators and forces to limit the degree of parallelism that is implemented in NumPy.

Description and architecture of neural network
For the comparative analysis of deep machine learning libraries, we choose the problem of handwritten digit recognition from the MNIST database and the neural network [17] to solve it.
The MNIST database is contained in a CSV file, where comma-separated digits are written. In a CSV file, the first value is a marker that represents the corresponding digit. Next value is the size of the digit image in pixels, consisting of 784 values and having a dimension of square 28x28.
The training file consists of 60 thousand copies, and the test file of 10 thousand copies. To solve the problem we choose the MLP (multilayer perceptron) architecture. Perceptron was one of the first models of neural networks, which was supposed to simulate the neural processes in human mind. This model was proposed by Frank Rosenblatt in 1957 and first implemented in 1960 [18]. A multilayer perceptron according to Rosenblatt differs from a single layer in that it contains additional hidden layers.
The neural network (Figure 1) consists of an input layer, two hidden layers, and one output layer. The input layer contains 784 neurons, the hidden layers contain 256 neurons, and the output layer 10, according to the number of features. The activation function in hidden layers is ReLU, which has the form ( ) = max(0, ) [19]. Stochastic gradient descent (SGD) [20] is used as the optimization algorithm.

Software implementation of a neural network using various libraries
With each library from the overview above, we built perceptron models. Each neural network was trained, and training time and accuracy were measured for a different number of training eras. We used these measurements for comparative analysis of libraries. The constructed graphs describe the dependence of learning time and accuracy on the number of eras (Figures 2-6).  Below is the summary table of the results of neural network training at 50 epochs for different libraries (see Table 1).
On the diagram (Figure 7     The accuracy of the TensorFlow library does not exceed 0.9, which is low compared to other libraries. Scikit-learn also showed a low accuracy result. The PyTorch library shows a good accuracy result only with a large number of epochs, but with the growth of the number of epochs, the learning time increases greatly. The minimum accuracy was 98.07. The Keras and Theano libraries are the most accurate and their accuracy is kept at 0.98.

Conclusion
Based on the comparison of different libraries, a number of conclusions can be drawn. Almost all libraries except PyTorch show approximately the same learning time. In the case of PyTorch, the longer learning time can be explained by the support of a dynamic computational graph, which apparently imposes additional computational costs. In turn, the TensorFlow library showed an average accuracy result, behind PyTorch and Theano.