Discrete and Continuous Models and Applied Computational ScienceDiscrete and Continuous Models and Applied Computational Science2658-46702658-7149Peoples' Friendship University of Russia2219310.22363/2658-4670-2019-27-1-21-32Research ArticleParallel algorithm for numerical solution of heat equation in complex cylindrical domainAyriyanAlexander S<p>researcher of the Laboratory of Information Technologies</p>ayriyan@jinr.ruBuša JrJán<p>PhD in Mathematics, Senior Researcher of the Laboratory of Information Technologies</p>busa@jinr.ruJoint Institute for Nuclear ResearchInstitute of Experimental Physics, Slovak Academy of Sciences15122019271213220112019Copyright © 2019, Ayriyan A.S., Buša Jr J.2019<p>In this article we present a parallel algorithm for simulation of the heat conduction process inside the so-called pulse cryogenic cell. This simulation is important for designing the device for portion injection of working gases into ionization chamber of ion source. The simulation is based on the numerical solving of the quasilinear heat equation with periodic source in a multilayered cylindrical domain. For numerical solution the Alternating Direction Implicit (ADI) method is used. Due to the non-linearity of the heat equation the simple-iteration method has been applied. In order to ensure convergence of the iteration process, the adaptive time-step has been implemented. The parallelization of the calculation has been realized with shared memory application programming interface OpenMP and the performance of the parallel algorithm is in agreement with the case studies in literature.</p>quasilinear heat equationmultilayer cylindrical geometrical structurepulse periodic sourceparallel algorithmthermal gatesквазилинейное уравнение теплопроводности, многослойная цилиндрическая геометрическая структура, импульсный периодический источник, параллельный алгоритм, тепловые вентили<p>Introduction The purpose of this work is to develop algorithms for simulation of the heat conduction process inside the so-called pulse cryogenic cell [1, 2]. Such simulations are important for designing the cell that implements the thermal gates of a portion injection of working gases into the ionization chamber of a multiply charged ion source [3]. While reliable operation of mechanical valves for pulsed injection of gaseous mixtures in the millisecond range at cryogenic temperatures is practically impossible, the use of gas temperature properties at cryogenic temperatures can be a real alternative. Indeed, the vapor pressures of various gases have strong dependency on the temperature [4] in the interval between temperatures of liquid helium 4.2 K and liquid nitrogen Ayriyan A., Bua Jr. J., 2019 This work is licensed under a Creative Commons Attribution 4.0 International License http://creativecommons.org/licenses/by/4.0/ 78 K [5], typical temperature terminals in the cryogenic technique used as thermostats with large capacity. The cryogenic cell is a multilayer cylinder (see Fig. 1) placed inside a vacuum chamber. The thermal process in the cylinder starts by passing an electrical current through its conductive layer. It allows heating the outer layer of the cylinder to upper desired temperature (maximal critical temperature). After switching the current off, the cylinder is let to cool down to lower desired temperature (minimal critical temperature). This process periodically repeats for a given period of time based on the requirements of the experiment. A copper core of the cylinder serves as a cooler for outer layers during and after the heating process. It is connected to a liquid helium temperature terminal. The core is separated by the electrical insulator from the conductive layer. It is made to avoid the flow of the electrical current into the core. The last layer is a thin coating which prevents molecules of working gases from binding to the conductive layer made of graphite. 3 rmax 2 4 0 1 r 2 r 1 r 0 z0 zmax Figure 1. A schematic view of the quarter of the cell slice through the axis. The bottom line is the cylinder axis (the symmetry axis, r = 0). The cooler (the copper core rod) cools the cell by contact with the temperature terminal - 4 (liquid helium). The heater - 2 (the conductive layer) heats the cell up by the way of passing of the electrical current. The inner insulator - 1 - is needed to prevent the electrical current outflow from the heater to the cooler. The outer insulator - 3 - prevents molecules of working gases from binding to the conductive layer Initial-boundary value problem Let us consider the heat equation describing thermal evolution in the closed cylindrical domain Ω = {(r, z) | r [0, rmax(z)], z [0, zmax(r)]}: T 1 1 T \ 1 T \ (r)cV (T, r) t = r r r(T, r) r + (T, r) + z z + X(T, t, r; tper, tsrc), (1) r where the thermal coefficients are the non-linear function of the temperature and they have discontinuities of the first kind along the radial direction at m (m = 0, 1, 2, 3, see Fig. 1). The source function producing the periodic process of heating can be expressed in the form I2(r) S X(T, t, r; tper, tsrc) = (T ) C p(t; tper, tsrc), (2) where p(t; tper, tsrc) is the periodic normalized function with parameters tper (time of period) and tsrc (time of heating less than or equal to tper); n N0 is the index of a period; (T, r) is temperature depended specific resistivity with discontinuities at the given values of r; I(r) is the electrical current amplitude which has a finite value I0 only in the source layer (see Fig. 1), everywhere else it is zero; SC is the cross-section of the source layer. The periodic normalized function p(t; tper, tsrc) is given as a rectangular pulse one [6]: u(t; tper, tsrc) = [ (t - n tper) - (t - n tper - tsrc)] , (3) n=0 where (t) is Heaviside step function [7, 8], however, in order to make the processes of turn on and turn off more realistic, the model of the transient process (4) (see Fig. 2) has been implemented: v(t; tper, tsrc, ttrs, , ) = = 1 2 n=0 Г 1 erf 1 t - ntper \\ t - 1 trs 1 - erf 1 t - n tper - tsrc t - 1 trs \\l . (4) Here erf(t) is the error function [7, 8]. 1.0 0.8 u(t), (t) 0.6 0.4 0.2 ttrs 0.0 n.t per n.t t per +tsrc Figure 2. Dashed line: the periodic step function u(t) as a combination of the Heaviside ones. Solid line: the realistic source function for = 4 and = 2 representing the transient model. The dot-dashed lines show the ends of the transient process at turn on and turn off Such smoothing of the processes of turn on and turn off of the source also helps to stabilize numerical simulations. The initial condition can be set by the assumption that at t = 0 the cryogenic cell has been already cooled down by the temperature terminal: T (r, z, t = 0) = T0, (5) where T0 4.2 K is the temperature of liquid helium. The boundary conditions can be expressed after the assumption that the temperature flow across the boundary of the domain is zero except for the right side where a connection to the temperature terminal exists (see Fig. 1): T n = 0 (r, z) Ω \ {(r, z) : z = zmax}, (6) T = T0 (r, z) {(r, z) : z = zmax}, where Ω is the boundary of the domain Ω, n is the normal vector of the boundary Ω. This assumption is motivated by the following statements: the cryonics cell is installed in the vacuum chamber, therefore, there is no convective heat transfer; the working temperature is too low for the appearance of thermal radia- tion; we neglect the energy for evaporation of gas molecules; there is no temperature flow through the axis r = 0 due to the axial symmetry. Discretization and numerical method Numerical solution of Eq. (1) can be obtained by using a shifted non-uniform grid (see Fig. 3): = {(t, r, z) | 0 � t , tk+1 = tk + k+1, k N0; 0.5h1 � r � rmax - 0.5hNj -1, ri+1 = ri + hi+1, i = 0, . . . , Nj - 1; 0.51 � z � zmax - 0.5Mi-1, zj+1 = zj + j+1, j = 0, . . . , Mi - 1}. (7) r z Figure 3. The discretization of the 2D domain. The non-uniform grid is shifted in order to have points at the centers of boxes. Different layers (materials) have different step size (they are differently colored) It seems usual to make uniform steps in the subdomain corresponding one layer. The shifted grid has no points at the boundary of discontinuity. One can also use a special grid - the grid with points at the boundary of materials, in this case one has to take care of approximation of the thermal coefficients and the source function at the boundary. The initial boundary value problem Eqs. (1)-(6) has been approximated on the grid (7) by the alternating direction implicit (ADI) schemes [9-12]: T i,j - Ti,j i,j cV i,j 0.5 = r [ T i,j ] + z [ Ti,j ] + Xi,j, (8) Ti,j - T i,j i,j cV i,j � 0.5 = r [ T i,j ] + z [ T�i,j ] + Xi,j, (9) where T�i,j is the temperature distribution on the next time layer, T i,j is the temperature distribution on the half layer (in between the next and current time layers), Ti,j is the value on the current time layer and is the time step, i,j = (T i,j ), cV i,j = cV (T i,j ), Xi,j = X(T i,j ). The spatial finite difference operators in Eqs. (8) and (9) are: 1 1 Г Ti+1,j - Ti,j Ti,j - Ti-1,j l r [Ti,j ] = r h ri+ 1 i+ 1 ,j h - ri- 1 i- 1 ,j h , (10) i i 2 2 i+1 2 2 i i,j z [ T ] = 1 j Г 2 i,j+ 1 Ti,j+1 - Ti,j j+1 2 - i,j- 1 Ti,j - Ti,j-1 l j , (11) where i = 1, . . . , Nj - 1, j = 1, . . . , Mi - 1, hi = ri - ri-1, j = zj - zj-1, hi = (hi+1 + hi) /2, j = (j+1 + j ) /2, i,j = m(Ti,j ), cV i,j = cV m(Ti,j ), Xi,j = Xm(Ti,j ), ri1/2 = (ri + ri1)/2, i1/2,j = m(Ti,j + Ti1,j )/2, i,j1/2 = m(Ti,j + Ti,j1)/2. Due to the non-linearity of Eqs. (8)-(9) (when the thermal coefficients and the source function depend on temperature) the simple-iteration method has been applied for calculation of the sought-for function on the half and next time layers. The recursive forms for (8)-(9) are expressed as follows [11-13]: Ts+1 s s i,j - Ti,j i,j cV i,j = 0.5 1 1 г Ts+1 s+1 Ts+1 s+1 l = ri+ 1 s 1 i+1,j - Ti,j - ri 1 s 1 i,j - Ti-1,j + ri hi 2 i+ 2 ,j hi+1 - 2 i- 2 ,j hi i,j + z [ Ti,j ] + Xs , (12) here it is assumed that when s , then Ts T , s , cV s cV , s , and Xs X. i,j The iteration process starts with the initial condition that Ts=0 = Ti,j , and stops after fulfilling the following criteria ||T s+1 - Ts||C = max lTs+1 - Tsl . (13) The values of the sought-for function on next time-layer are obtained as this T s+1 i,j - T i,j i,j cV i,j = 0.5 1 г Ts+1 s+1 Ts+1 s+1 l s i,j+1 - Ti,j s i,j - Ti,j-1 = r [ T i,j ] + j 2 i,j+ 1 j+1 2 j - i,j- 1 + Xi,j, (14) i,j with the initial condition Ts=0 = T i,j . Same as before Ts T� and s . This iteration process stops after fulfilling the same criteria (13). The systems of linear algebraic equations (12) and (14) are solved by the Thomas method [13-15]. In order to ensure convergence of the iteration process (12)-(14), the adaptive time-step has been implemented. If the counter of iterations s exceeds some maximal value, meaning that the process does not converge fast enough, the time-step is divided by 2 and the iteration process is restarted (back to the Eq. (12)). Parallel algorithm Main computational complexity comes from repetitive calculation of Eqs. (12) and (14) across each time layer. A solution of the first of them is needed as a start for solving the second one and it is again used as a start for calculation of the sought-for function on the next time layer. Due to rather low complexity of solving of one system of linear equations ((12) or (14)) it is better to use parallelization based on shared memory since for distributed memory parallelization the cost of data transfer would be too high. Therefore, we opted for OpenMP [16, 17]. In Fig. 4 one can see the flowchart of the algorithm. After initializing the solution - setting tprd, tsrc, I, and t0 = 0, the program repeats iterations until the requested time is reached. In one step of evolution it first initializes the estimate of the time step as being equal ttrs/1000 if we are in the transition process or tsrc/100 elsewhere. After this it starts the iteration process in order to obtain the solution T i,j at the time tk + /2 using (12). If the obtained solution is precise enough, it alternates direction and continues by the way of (14) to the solution T�i,j at the time tk + . If this solution is again precise enough, the algorithm sets the total time to tk = tk-1 + and the actual solution Ti,j = T�i,j . If in any of the previous tests the number of iterations exceeds the number of maximum iterations (iter - see Fig. 4), then the time- step is divided by 2 and the calculation returns to the beginning of the evolution step. The whole algorithm terminates when the total time t reaches the desired value or when we realize that we have entered into the periodic process (temperature changes periodically depending on switching the current in conductive layer on and off). A. Ayriyan, J. Bua Jr., Parallel algorithm Load tprd, tsrc, I; Initialize solution Ti,j ; Set k = 0, tk = 0 Initialize s = ttrs/1000 inside transition process = tsrc/100 elswhere s = 0 0 T i,j = Ti,j T i,j = T i,j s = 0 T�0 i,j = T i,j OpenMP for each j OpenMP for each i s = s + 1 tk+1 = tk + s (12) s-1 s = s + 1 T i,j (T i,j , Ti,j, /2) T� s i,j (T�s 1, T i,j , ) (14) - i,j k = k + 1 i,j T�i,j = T�s - s s 1 lT i,j - T i,j l yes s s-1 yes Ti,j = T�i,j ? no lT�i,j - T�i,j l ? no Termination no condition? yes s iter ? yes yes Return Ti,j s iter ? no = /2 Terminate no 27 Figure 4. Schematics of the algorithm with parts running using OpenMP highlighted by the dashed line Numerical results The calculations have been performed on the HybriLIT computational platform [18, 19] using the Skylake processor Intel Xeon Gold 6154 [20] containing 18 CPU cores providing two threads per core (36 threads in total) under OS Scientific Linux 7.5 (Nitrogen) [21]. As an example we have studied the case when tper = 10-1, tsrc = 10-2, ttrs = 10-4, and I0 = 0.5742. These values of parameters were chosen to realize the thermal gates for maximal and minimal critical temperatures 42 K and 37 K, respectively [22]. The cell size was selected as follows: z0 = 4, zmax = 5, r = 0.24, r = 0.245, r = 0.25, rmax = 0.2501. The whole domain 0 1 2 was split into 100 parts along the z axis at the first layer (core) and to 80 in the other layers. Along the r axis, individual layers were divided (starting from the core) into 800, 200, 200, and 10 parts respectively. This discretization of the domain was chosen in order not to split the domain into too many parts and at the same time have enough information about the solution. It is noteworthy that we have much more steps in the radial direction. It is so because we expect the flow in-between the layers to be more active than the relaxation towards the terminal. Nevertheless, in our experiments we have also densified the grid to see the impact of the grid on both the precision of the solution and the calculation time. The results are given in Fig. 5, it shows the temperature fields at different moments of evolution inside the required steady periodic temperature regime. 50 50 40 40 40 Temperature, K 30 30 30 20 20 20 10 10 0 1 2 3 4 0.15 0.2 0 1 4 0.25 0.3 2 3 0.15 0.2 10 0.25 0.3 z, cm 5 0 0.05 0.1 r, cm z, cm 5 0 0.05 0.1 r, cm Figure 5. Temperature fields when the periodic temperature regime is achieved. Left panel: the temperature field at the moment before turning on the source. Right panel: the temperature field at the moment when the source was just turned off The field is shown in the whole cylindrical domain right before turning on the source (at the left panel). The field is in a state of maximum relaxation in that moment. At the right side the temperature filed is shown in the moment when source is turning off. The results demonstrate that temperature significantly changes only in outer layers, the core rode is practically not heated, it works as thermostat. The evolution of the temperature at the surface of the cryogenic cell at the z = 0 is given in Fig. 6, it is evident that is takes time to achieve the required steady periodic temperature regime. A. Ayriyan, J. Bua Jr., Parallel algorithm 50 50 50 40 40 40 Temperature, K 30 30 30 20 20 20 10 10 10 0 10 20 30 40 50 60 70 0 0.5 1 1.5 2 2.5 3 3.5 66.5 67 67.5 68 68.5 69 69.5 70 Time, s Time, s Time, s Figure 6. Temperature evolution on the surface at the z = 0 (black line). Red colored upper dashed line represents the maximal critical temperature (42 K) and blue colored lower dashed line represents the minimal critical temperature (37 K). Left panel: Temperature dependence on the time from 0 up to 70 sec. Central panel: the beginning of the evolution (the first 3.5 sec.). Right panel: the last 3.5 sec. of the evolution when the periodic steady temperature regime is achieved 20 18 16 Time [hour] 14 12 10 8 6 4 2 0 0 3 6 9 12 15 18 21 24 27 30 33 36 omp threads 20 18 16 14 Speedup 12 10 8 6 4 2 0 0 3 6 9 12 15 18 21 24 27 30 33 36 omp threads 1.0 0.8 Efficiency 0.6 0.4 0.2 0.0 0 3 6 9 12 15 18 21 24 27 30 33 36 omp threads 29 Figure 7. The experimental performance of the algorithm. Left panel: the calculation time in hours. Central panel: the speed up of calculations. Right panel: the efficiency of parallelization Summary and conclusions We have developed the algorithm for numerical simulations of heat evolution inside the multilayered cylindrical domain with the periodic source. One can see from Fig. 6 that the required periodical temperature regime cannot be obtained immediately, it has a setup mode which is much longer than one period, thus it has to be taken into account when designing the pulse cryogenic cell. The simulations show the possibility of the realization of thermal gates for a particular set of parameters. The algorithm has been integrated to the hybrid algorithm MPI+OpenMP for solving the optimization problem of the heat source characteristics (tper, tsrc, and I0) of the pulse cryogenic cell [22]. The performance of the parallel algorithm (see Fig. 7) is in agreement with the case studies in literature, e.g., [23, 24]. As it is shown in the picture that the saturation calculation time been achieved at 18 threads, after acting the hyper-threading the speedup stops. Thus there is no reason to involve in calculations more than this number of threads for the considered problem with the given grid size (1210 100).</p>[D. E. Donets, E. E. Donets, T. Honma, K. Noda, A. Y. Ramzdorf, V. V. Salnikov, V. B. Shutov, E. D. Donets, Physics research and technology developments of electron string ion sources, Review of Scientific Instruments 83 (2012) 02A512. doi:10.1063/1.3678660][K. Katagiri, A. Noda, K. Suzuki, K. Nagatsu, A. Y. Boytsov, D. E. Donets, E. D. Donets, E. E. Donets, A. Y. Ramzdorf, M. Nakao, S. Hojo, T. Wakui, K. Noda, Cryogenic molecular separation system for radioactive 11C ion acceleration, Review of Scientific Instruments 86 (12) (2015) 123303. doi:10.1063/1.4937593][A. Ayriyan, J. Buša Jr., E. E. Donets, H. Grigorian, J. Pribiš, Algorithm and simulation of heat conduction process for design of a thin multilayer technical device, Applied Thermal Engineering 94 (2016) 151-158. doi:10.1016/j.applthermaleng.2015.10.095][R. E. Honig, H. O. Hook, Vapor pressure data for some common gases, David Sarnoff Research Center, RCA, Princeton, 1960][D. B. Chelton, D. B. Mann, Cryogenic data book (1956)][P. Symons, Digital waveform generation, Cambridge University Press, New York, 2013][M. Abramowitz, I. A. Stegun, Handbook of mathematical functions, Tenth Edition, Dover Publications, New York, 1972][G. A. Korn, T. M. Korn, Mathematical handbook for scientists and engineers, Revised Edition, Dover Publications, New York, 2013.][D. W. Peaceman, H. H. Rachford Jr., The numerical solution of parabolic and elliptic differential equations, Journal of the Society for Industrial and Applied Mathematics 3 (1955) 28-41.][N. N. Yanenko, Fractional step methods for solution of multidimensional problems of mathematical physics [Metod drobnykh shagov resheniya mnogomernykh zadach], Nauka, Moscow, 1967, in Russian.][A. A. Samarskii, P. N. Vabishchevich (Eds.), Additive difference schemes for problems of mathematical physics [Additivnyye skhemy dlya zadach matematicheskoy fiziki], Nauka, Moscow, 1999, in Russian.][A. A. Samarsky, The theory of difference schemes, Marcel Dekker Inc., New York, 2001.][N. N. Kalitkin, Numerical methods [Chislennyye metody], Second Edition, BHV-Peterburg, Saint Petersburg, 2013, in Russian.][L. H. Thomas, Elliptic problems in linear differential equations over a network, Tech. rep., Columbia University, New York (1949).][A. A. Samarskii, P. N. Vabishchevich, Computational heat transfer, Vol. 1, John Wiley & Sons Ltd., Chichester, England, 1995.][L. Dagum, R. Menon, OpenMP: an industry standard API for shared-memory programming, Computational Science & Engineering, IEEE 5 (1) (1998) 46-55.][OpenMP Architecture Review Board, OpenMP application programming interface version 5.0 (November 2018). URL https://www.openmp.org/wp-content/uploads/OpenMP-API- Specification-5.0.pdf][HybriLIT group, Heterogeneous platform “HybriLIT” [cited 14.01.2019]. URL http://hlit.jinr.ru/en/][G. Adam, M. Bashashin, D. Belyakov, M. Kirakosyan, M. Matveev, D. Podgainy, T. Sapozhnikova, O. Streltsova, S. Torosyan, M. Vala, L. Valova, A. Vorontsov, T. Zaikina, E. Zemlyanaya, M. Zuev, IT-ecosystem of the HybriLIT heterogeneous platform for high-performance computing and training of IT-specialists, in: V. Korenkov, A. Nechaevskiy, T. Zaikina, E. Mazhitova (Eds.), CEUR Workshop Proceedings, Vol. 2267, 2018, pp. 638-644.][Intel corporation, Intel Xeon Gold 6154 Processor [cited 14.01.2019]. URL https://www.intel.com/content/www/us/en/products/ processors/xeon/scalable/gold-processors/gold-6154.html][Fermilab, CERN, Scientific Linux [cited 14.01.2019]. URL http://scientificlinux.org][A. Ayriyan, J. B. Jr., H. Grigorian, E. E. Donets, Solving the optimization problem for designing a pulsed cryogenic cell, Physics of Particles and Nuclei Letters 16 (5) (2019 (Accepted to print)) 300-309. doi:10.1134/S1547477119030026.][V. A. Tokareva, O. I. Streltsova, M. I. Zuev, Parallel realizations of locally one-dimensional difference schemes for solving the initial-boundary value problems for the multidimensional heat equation, in: A. V. Friesen, V. Chudoba (Eds.), Proceedings of the XX International Scientific Conference of Young Scientists and Specialists (AYSS-2016), 2016, pp. 142-147.][M. D. Schuster, The heat equation: high-performance scientific computing case study, Computing in Science & Engineering 20 (5) (2018) 114-127. doi:10.1109/MCSE.2018.05329820]