Evaluation of firewall performance metrics with ranging the rules for Poisson incoming packet flow and exponential filtering time
- Authors: Botvinko A.Y.1, Samouylov K.E.1
-
Affiliations:
- RUDN University
- Issue: Vol 31, No 4 (2023)
- Pages: 345-358
- Section: Articles
- URL: https://journals.rudn.ru/miph/article/view/37515
- DOI: https://doi.org/10.22363/2658-4670-2023-31-4-345-358
- EDN: https://elibrary.ru/DYDLCY
Cite item
Full Text
Abstract
The given article is a continuation of a number of works devoted to the development of models and methods for ranging the filtration rules to prevent a decrease in the firewall performance caused by the use of a sequential scheme for checking packet compliance with the rules, as well as by the heterogeneity and variability of network traffic. The article includes a description of a firewall mathematical model given in the form of a complex system and a queuing system with a phase-type discipline for request servicing, which formalizes the network traffic filtering process with the functionality of ranging the rules. The purpose of modeling is to obtain estimates for major firewall performance metrics for various network traffic behavior scenarios, as well as to evaluate an increase in the firewall performance due to ranging a filtration rule set. Calculation of estimates for the firewall (FW) performance metrics was made using the analytical method for a Poisson request flow. Based on the analysis of the modeling results, conclusions were drawn on the effectiveness of ranging the filtration rules in order to improve the firewall performance for traffic scenarios that are close to real ones.
Full Text
1. Introduction Sustainable operation of information infrastructure, including for specialpurpose automated systems (AS), the uninterrupted functioning of which is critical for ensuring the security and defense capability of any state, given the avalanche-like growth in the volume of information flows in public networks, high heterogeneity and variability of network traffic parameters, widespread use of multimedia protocols (that are quite sensitive to the length of data transmission delay), as well as a significant increase in the number of various computer attacks, requires firewalls to provide really high performance. A firewall is a local or functional distributing tool that provides control over the incoming and/or outgoing information in the automated system, and ensures the protection for the AS by filtering the information, i.e., providing analysis of the information by the criteria set and making a decision on its distribution [1]. One of the major factors affecting the search time for filtration rules, and therefore the FW performance, is the order in which the filtration rules are arranged in sets that are linear lists of large dimensions. This is due to the fact that the search time for any rule corresponding to the data under filtration, is in proportion to the number of checked rules. And the filtering time for information flow that meets the conditions contained at the end of a large dimension set, will be much longer than the time required to filter data that meet the conditions contained at the beginning of the rule set [1, 2]. The papers [1, 3, 4] published earlier by the authors, describe the developed method for optimizing a filtration rule set (method for ranging the rules). An increase in the efficiency of traffic filtration is achieved by periodically ranging the filtration rules in descending order of their weights, obtained in accordance with the estimates of the parameters of the filtered information flows. A particularity of the developed approach is the use of the nonparametric method of local approximation (MLA) [4, 5] to evaluate the parameters of filtered information flows. In the ranging process for a rule set, the current characteristics and dynamics of changes in the parameters of information flows are considered. At the same time, there is no need to select a parametric model that is acceptable for all evaluated parameters of information flows. The implementation of MLA has provided the adaptability of the method, as well as a high response speed for changes in the parameters of filtered information flows thanks to the specifics of MLA estimates. In earlier works [1-3], the effectiveness of using methods for optimizing a filtration rule set was evaluated with the help of a simulation modeling method. To evaluate the effectiveness of methods for set optimization, the presented paper proposes an analytical solution, as well as an algorithm for calculating the probabilistic and temporal characteristics of the queuing system (QS) that formalizes the network traffic filtering process. 2. Firewall model with ranging the filtration rules As the FW model we chose the previously developed one [2, 3] that reflects the basic patterns and factors of the FW functioning when processing the network traffic. The traffic processing in this model includes two stages that are an initial processing stage and a stage of checking the packet filtration rules. At the initial processing stage, a packet, which is transmitted in the communication network, is received by the network interface card of the FW. After decoding a sequence of electrical or optical signals and verifying the correctness of the delivered information, the packet is written to the input buffer memory of the network interface controller (NIC). After that, the packet is transferred to a common software buffer allocated in RAM for further extraction and processing by the central processing unit. In the proposed model, the process of decoding a sequence of signals and the process of receiving and transferring the packet from the buffer memory of the network card to the random access memory (RAM) of the FW are considered 1. Y. Botvinko, K. E. Samouylov, Evaluation of firewall performance … 347 as a single initial processing of packets with a given service intensity. It’s considered that the packet immediately arrives at the RAM of the firewall. The waiting time of the packet in the buffer memory of the network card isn’t taken into account, as well as the packet losses due to distortions in information transmission. At the stage of checking the packet filtration rules, the central processing unit (CPU), if computing resources are available, provides the sequential check of the compliance of the incoming packet parameters with the conditions of the filtration rules. The remaining incoming packets await the start of servicing in the buffer in the order of their arrival (FCFS, First-Come, First-Served). A similar approach was used in papers [6, 7]. If the packet parameters comply with the conditions of the filtration rules, the packet processing is completed with encoding and transferring the packet to the physical medium. The packets that don’t comply with the filtration rules are discarded. The following operations aren’t considered within the traffic filtering process: fragmentation and reassembly of transmitted packets, reassembly of fragmented packets, network address translation, and packet routing. The level of detail of this scheme doesn’t include the architecture and operating algorithms of individual components of the microprocessor system, as well as the interface lines between them, commands and control signals. Therefore, the scheme isn’t complete, but is sufficient to develop a mathematical model, neglecting (to make things simpler) unimportant secondary factors. Only permissive rules are considered as filtration ones. The logical structure of the rule set is a linear list. When the first match of a packet to a rule is found, the packet is considered to be processed by the FW. Packets that don’t comply with the rules are rejected. One set of filtration rules, implementing the default deny policy, is used. Additional rule sets aren’t considered. Therefore, the traffic filtering process includes the initial processing of the packet and checking if the packet complies with the filtration rules. The packet checking time is considered a random variable distributed according to an exponential law for the initial processing and checking the filtration rules. The time required to calculate weights and to range the rule set is considered negligible. Ranging the filtration rule set is considered as ordering the rules in descending order of their weights in accordance with estimates of the parameters of information flows. We suppose that traffic is filtered at the network and transport levels of the reference model of interaction of open OSI systems. A model with ranging the filtration rules is a complex stochastic system, and, to build it, an aggregative approach was used, which represents the system as an aggregate that has input and output signals. To demonstrate the operation of subsystems, approaches to describing systems adopted in queuing theory, were used [8]. Let’s represent the FW model in the form of a system M (k) = {Z, L, T, Φ, G, X, Y } [3], the moment of transition of which from one state to another one is shown in figure 1: 1. Z = {z0, . . . , zk , . . .} is a set of system states; L = (µ0, µ, N, C) is set of system parameters; T is a time interval of system operation. Changes k in system states occur at discrete time points t- = tk - 0, tk ∈ T , k ?: 1; Φ is a system state transition operator, G is a system output 348 DCM&ACS. 2023, 31 (4) 345-358 operator; X = {x0, . . . , xk , . . .} is a set of input signals entering the system; Y = {y1, . . . , yk , . . .} is a set of system output signals. 2. µ0 is the intensity of the service during initial processing of the packet by the network card of the FW; µ is the intensity of the service during the checking stage (whether a packet complies with one filtration rule in a set); C is the system storage capacity; N is the number of rules in the filtration set. 3. zk = (rk , dk ) ∈ Z is the state of the system on the time interval [tk-1, tk ), where rk = (rk , . . . , rk ) is a rule set, in which the rk component is a rule 1 N i in the i-th position in the set; dk = (dk , . . . , dk ) is a vector of packet 1 N i servicing times, in which dk corresponds to the processing time for i-type packets on the interval [tk-1, tk ). 4. xk = (xk , . . . , xk ) ∈ X is the input signal; xk , i = 1, . . . , N is random 1 N i value that characterizes the number of i-type packets corresponding to i the rk rule, entering the system on the interval [tk -1, tk ). 5. pk = (pk , . . . , pk ) is a vector of rule weights, set in accordance with 1 N pk MLA estimates of the parameters of information flows; the component i , i = 1, . . . , N corresponds to the weight of the rule that takes the i-th position on the interval. 6. yk = (qk , vk , wk , uk ) ∈ Y is an output signal, the components of which correspond to the estimates of performance metrics on the interval [tk-1, tk ); qk corresponds to the average number of packets in a drive, vk corresponds to the average time needed to service the packet in the system, wk corresponds to the average waiting time before the start of servicing the packet in the system, and uk corresponds to the average residence time of the packet in the system. k+1 7. Φ (L, zk , xk ) = zk+1. At the time point t- , this operator calculates the weights of the pk rules in accordance with the estimates of the parameters k of the x information flows. It also determines the state of the system zk+1 = (rk+1, dk+1), where the rk+1 vector is calculated by ranging the rk rule set according to the pk weights, and the dk+1 vector is calculated according to the resulting set rk+1 and intensities µ0, µ. k+1 8. G (L, zk , xk ) = yk+1. At the time point t- , this operator determines the performance metrics on the time interval [tk-1, tk ). Figure 1. Scheme of the FW model with ranging the filtration rules Considering the operation of M (k) on the interval [tk-1, tk ), let’s present the aggregate in the form of a single-line QS with a storage of limited capacity C, A. Y. Botvinko, K. E. Samouylov, Evaluation of firewall performance … 349 heterogeneous Poisson incoming flow, and a request service with a distribution function (DF) Bk (t) for the duration of servicing phase-type requests, which depends on the order of filtration rules. The QS receives a request flow that is a superposition of N independent Poisson flows of requests. According to the Basharin-Kendall classification, this QS is designated as MN /PH/1/C [8]. Representing the system in the form of the QS makes it possible to calculate the performance metrics using an analytical approach for the Poisson flow of requests. Hereinafter, the packets entering the model will be considered as requests, and the QS reflecting the operation of the system on the interval [tk-1, tk ), k ?: 1 will be designated as M (k). 3. Algorithm for calculating the performance metrics for the exponential distribution of service time To calculate the performance metrics, let’s consider the operation of M (k) on the interval [tk-1, tk ), k ?: 1. On this interval, the filtration of one batch of packets takes place, there is no ranging for the rule set, and the state vector zk = (rk , dk ) remains unchanged. Let’s imagine that a system receives a Poisson flow of requests with intensity N i λ.(k) = λk , which is the sum of independent Poisson flows of requests of i=1 N different types. Let’s define a rule set rk = (rk , . . . , rk ) in such a way that 1 N the i-type request with the λk intensity will correspond to the rk filtration i 1 rule for all i = 1, 2, . . . , N . Let’s define the process of servicing the requests as a sequential transition of phases, starting with the first one. Servicing the request in the 0-th phase corresponds to the stage of the initial processing of FW packets, servicing the request from the 1-st to the N -th phases corresponds to checking the filtration rules. Only one request can be served at a time. If there is no free space in the drive, then the incoming request exits the system without being serviced. If the request matches the rule, its service in the system gets completed; otherwise, the request is transferred to the next phase. After the N -th phase, the service gets completed. Let’s consider the case in which the request service times in each phase are independent of each other and distributed exponentially. The process of request servicing is schematically shown in figure 2. Figure 2. Phase representation of the request servicing process in the firewall model According to lemma 3 from [8], the probability of a request transition from the i-th phase to the next i + 1-th phase for the k-th time interval can be 350 DCM&ACS. 2023, 31 (4) 345-358 given as follows: ji(k) = (1, i = 0, (1) 1 - λ (k)/λ (k), i ∈ 1, . . . , N - 1. i . Let’s designate αi(k) = 1-ji(k) as the probability of completing the request service in the i-th phase for the k-th time interval. The distribution function (DF) for the time of servicing the request in the QS M (k) will be as follows: N -1 Bk (t) = j0(k)E(1, µ0) + '\" ji(k)E(i, µ), k ∈ 1, . . . , N - 1, i=1 E(N, µ), k = N, (2) where E(i, µ0) is the Erlang distribution of the i-th order. To analyze the probabilistic and temporal characteristics of the QS, the algorithmic approach proposed by P. Bocharov in [8] was implemented. His main idea is to obtain a solution to the system of global balance equations for the Markov process that describes the system, and to find the parameters of the QS in the form of matrix-recurrent formulas. The use of such an approach makes it possible to effectively calculate the QS parameters. Let’s define a random process (RP) {η(t), t ?: 0} on the set of states X = {(0) (h, i), h = 1, . . . , C + 1, i = 0, . . . , N }. The states of the X set have the following meaning. If at some point in time η(t) = 0, then there are no requests in the system. If η(t) = (h, i), then there are h requests in the system, and the serviced request is in the i-th phase. The RP build in such a way is a homogeneous Markov process (MP). All states of the RP are interconnected, their number is finite and equal to (C + 1)(N + 1) + 1. Therefore, according to the ergodic theorem for the MP with a finite set of states [8], the RP η(t) is ergodic. Using a matrix representation, below we give the DF Bk (t) of the request service time. Hereinafter, for brevity, the index k of the time interval of the M (k) aggregate is omitted. Bk (t) = 1 - βTeMt1, t > 0, βT1 = 1, (3) where the pair of (βT, M) is a PH representation of order N + 1; βT = (β0, . . . , βN ) is the vector of probabilities of sending a request for service to phases 0, 1, . . . , N at the time point tk , the component βi, i ∈ 0, . . . , N corresponds to the probability of starting the request service in the i-th phase at the time point tk ; and M is an infinitesimal matrix that has the following form: 1. Y. Botvinko, K. E. Samouylov, Evaluation of firewall performance … 351 -µ0 µ0 0 0 0 0 0 -µ j1µ 0 0 0 . . M = 0 0 -µ 0 0 0 . . . 0 0 . jN -2µ 0 . (4) 0 0 0 0 -µ jN -1µ 0 0 0 0 0 -µ Vector β, in accordance with the fact that the request service in the considered QS always starts from the zero phase, has the following form: βT = (1, 0 . . . 0). (5) Let’s use the following designations for the probabilities of {η(t), t ?: 0} process states: 1. p0 = P {η(t) = 0, t ?: 0} is the stationary probability of the absence of requests in the system. 2. pi,j = P {η(t) = (i, j), t ?: 0} is the stationary probability of servicing the request in the j-th phase and presence of i requests within the system. h 3. pT = (ph,0, ph,1, . . . , ph,N ), h = 1, . . . , C + 1 is a vector of stationary probabilities. 4. ph is the stationary probability of presence of h requests within the system. So, the system of global balance equations in matrix form (for the QS), which describes the process of the FW operation and takes filtration into account, is as follows: 1 - λ.p0 + pTµ = 0, pT T 1 (-λ.I + M) + λ.β 2 p0 + pTµβT = 0T, (6) T T T T T . . ph (-λ I + M) + λ ph-1 + ph+1µβ = 0 , h = 2, 3 . . . , C, pT T T C+1M + λ.pC = 0 , where µT = (0, α1µ, . . . , αN -1µ, µ) is the vector of intensities for the completion of request servicing in the QS of N + 1 dimension; and I is the identity matrix of N + 1 dimension/degree. Similarly to [8], for the convenience of solving the system of equations (SoE), let’s introduce an extra vector p˜h: T p˜T = ph . (7) h p0 The solution of the SoE (6) allows us to calculate the performance metrics for the stationary operating mode of the QS (see Algorithm 1). Algorithm 1. Algorithm for calculating efficiency metrics for exponential distribution of service time Algorithm for calculating the performance metrics for the exponential distribution of service time. 352 DCM&ACS. 2023, 31 (4) 345-358 Step 1 Calculate matrices and vectors: D = -λ.I + M, (8) β(λ) = 1 + λ.βT D-11, (9) -1 M~ -1 = D-1 - λ.D 1βT D-1 , (10) β(λ) W = -λ.M~ -1, (11) ∗ WR = -λ M-1, (12) ωT 0 = - λ.βT D-1 . (13) Step 2 Calculate the probabilities p˜h: ωT h-1 β(λ.) p˜T = 0 W , h = 1, 2, . . . , C, (14) h ωT C-1 Step 3 0 W WR, h = C + 1. Calculate p0, using the normalization conditions for the system of global balance equations and formula (7): Step 4 Calculate vector: pT p0 = T / C+1 1 + '\" p˜h h=1 \-1 . (15) h = p0p˜h , h = 1, 2, . . . , C + 1. (16) h The calculated vector pT is the desired matrix-geometric solution for the system of global balance equations (6). Step 5 Calculate stationary probabilities ph: h ph = pT 1, h = 1, 2, . . . , C + 1. (17) Step 6 Using ph and formulas (18)-(23), calculate the performance metrics for the QS stationary operating mode. Average number of requests in the QS: C+1 l = '\" hph. (18) h=1 A. Y. Botvinko, K. E. Samouylov, Evaluation of firewall performance … 353 Average queue length: C+1 q = '\" (h - 1)ph. (19) h=2 Probability of losing the requests: π = p T 1. (20) Average residence time: C+1 u = l/λ.(1 - π). (21) Average waiting time for service: w = q/λ.(1 - π). (22) Average service time: v = u - w. (23) 4. Evaluation of firewall performance metrics 0 The initial data selected for calculating the performance metrics are as follows: system storage capacity C = 10; max number of filtration rules in a set N = 1500; initial packet processing time µ-1 = 2.7 · 10-5 [ms]; time for checking one rule µ-1 = 5 · 10-5 [ms]. The packet processing times were taken from papers [6, 7]. The incoming flow is the sum of N independent Poisson flows of requests. The request service time is exponential. To calculate the performance metrics, a program code has been developed in the MATLAB system language. To check the correctness of the model, the following graphs were constructed: 1. Graphs illustrating the dependence of the performance metrics on the total request flow. The intensities of requests entering the system increase at time points tk ∈ T, k ?: 1; the initial value of the total flow intensity is λ. = 25 [ms-1]. The number of filtration rules is constant N = 100. 2. Graphs illustrating the dependence of the performance metrics on the number of filtration rules. The number of rules increases at time points tk . The total intensity of the request flow remains constant λ. = 50 [ms-1]. The values of the performance metrics depend on the types and intensities of the corresponding incoming packets, the rule set and structural parameters of the system. Obviously, the maximum values (given the same system parameters) will be observed when the packets match the last filtration rule, and the minimum ones will be obtained when the packets match the first rule. That’s why, when constructing graphs illustrating the performance metrics, we considered the following cases: 1. Graphs of the performance metrics constructed for the case when incoming requests comply only with the first rule in the rule set, and the time for servicing the request is the least possible. 354 DCM&ACS. 2023, 31 (4) 345-358 2. Graphs of the performance metrics constructed for the case when incoming requests comply with the last rule in the rule set, and the time for servicing the request is the largest possible. 3. Graphs of the performance metrics constructed for the case when incoming requests comply with a random rule. Further in the description, despite the fact that in all three cases the service time is random, for convenience, the graphs will be called as follows: graph of the first rule, graph of the last rule, graph of the random rule, respectively. The average service time (see figures 3-4) depends on the types of incoming packets and the order of filtration rules and doesn’t depend on the total intensity of received requests and the operating mode of the QS. Therefore, when the total request flow increases, the average service time remains constant. Meanwhile, if the number of filtration rules increases, the average service time grows linearly. Figure 3. Average service time (constant number of rules) Figure 4. Average service time (constant total intensity of requests received) Thus, the efficiency of reducing the average service time when ranging the filtration rule set for constant values of the probability of receiving of each type of request will grow with the increase of the rule set and won’t depend on the total flow intensity. The figure 5 shows the dynamics of changes in the average residence time of requests with an increase in the total intensity of the incoming request flow. The average residence time of requests increases with a growth in the flow intensity. For values of the total request flow [ms-1], the type of dependence for the graph of the random rule will change, which corresponds to functioning of the QS in overload mode. Such an overload in the system doesn’t result in an unlimited increase in the QS parameters due to the limited storage capacity. The value of the average residence time tends to 0.056 [ms]. At the same time, the difference between the values of the average residence time on the graph of the random rule and the graph of the last rule indicates the possibility of obtaining a significant reduction in the average residence time when ranging the filtration rule set. For larger rule sets, the difference between the values of the average residence time of requests for the graph with a minimum service time and the graph with a random service time increases with the number of rules (see figure 6). A. Y. Botvinko, K. E. Samouylov, Evaluation of firewall performance … 355 Figure 5. Average residence time in the system (constant number of rules) Figure 6. Average residence time in the system (constant total intensity of requests received) This complies with the peculiarities of the functioning of the FW, since the search time for a rule that matches the filtered data is proportional to the number of checked rules, and indicates the advisability of ranging the filtration rule set. Overload in the system doesn’t lead to an unlimited increase in the QS parameters. Thus, the value of the decrease in the average residence time when ranging the filtration rule set will grow with an increase of the set itself. Also, it won’t depend on the intensity of the total request flow starting from the moment when the system gets overloaded. Figures 7-8 show graphs of changes in the average queue length. Increase of the rule set results in faster filling of the storage, which corresponds to the logic of the FW, because checking a rule set of a large dimension takes more time. Figure 7. Average queue length (constant number of rules) Figure 8. Average queue length (constant total intensity of requests received) When the QS gets significantly overloaded, the average queue length will be equal to the storage capacity. And the average queue length in the graph of the random rule will approach the values given in the graph of the last rule (see figure 7). 356 DCM&ACS. 2023, 31 (4) 345-358 Therefore, the efficiency of ranging the rule set (in order to reduce the queue length) will grow with increasing the load on the system until the system operates in the mode of losing the requests. 5. Conclusion The obtained estimates of the firewall performance metrics allow us to draw a conclusion on the adequacy of the built analytical model of the FW. We can also draw a conclusion about the possibility to increase the firewall performance by implementing the method for ranging the rule set.About the authors
Anatoly Yu. Botvinko
RUDN University
Author for correspondence.
Email: botvinko_ayu@rudn.ru
ORCID iD: 0000-0003-1412-981X
Candidate of Physical and Mathematical Sciences, assistant professor of Department of Probability Theory and Cyber Security
6 Miklukho-Maklaya St., Moscow, 117198, Russian FederationKonstantin E. Samouylov
RUDN University
Email: samuylov_ke@rudn.ru
ORCID iD: 0000-0002-6368-9680
Professor, Doctor of Technical Sciences, Head of the Department of Probability Theory and Cyber Security
6 Miklukho-Maklaya St., Moscow, 117198, Russian FederationReferences
- A. Y. Botvinko and K. E. Samouylov, “Evaluation of firewall performance when ranging a filtration rule set,” Discrete and Continuous Models and Applied Computational Science, vol. 29, no. 3, pp. 230-241, 2013. doi: 10.22363/2658-4670-2021-29-3-230-241.
- A. Y. Botvinko and K. E. Samouylov, “Firewall simulator development for performance evaluation of ranging a filtration rules set,” Distributed Computer and Communication Networks: Control, Computation, Communications. DCCN 2022. Lecture Notes in Computer Science. Lecture Notes in Computer Science, vol. 13766, no. 3, pp. 221-229, 2022. doi: 10.1007/978-3-031-23207-7_15.
- A. Y. Botvinko and K. E. Samouylov, “Firewall simulation model with filtering rules ranking,” Distributed Computer and Communication Networks: Control, Computation, Communications. DCCN 2020. Communications in Computer and Information Science, vol. 1337, pp. 533- 545, 2020. doi: 10.1007/978-3-030-66242-4_42.
- V. Katkovnik, Non-parametric data identification and smoothing: local approximation method [Neparametricheskaya identifikaciya i sglazhivanie danny‘x: metod lokal‘noj approksimacii]. The science. Main editorial office of physical and mathematical literature Publ., 1985, 336 pp., in Russian.
- W. Hardle, Applied nonparametric regression. Cambridge: Cambridge university press, 1990, 349 pp.
- M. Cheminod, L. Durante, L. Seno, and A. Valenzano, “Performance evaluation and modeling of an industrial application-layer firewall,” IEEE Transactions on Industrial Informatics, vol. 14, no. 5, pp. 2159- 2170, 2018. doi: 10.1109/TII.2018.2802903.
- K. Salah, K. Elbadawi, and R. Boutaba, “Performance modeling and analysis of network firewalls,” IEEE Transactions on network and service management, vol. 9, no. 1, pp. 12-21, 2011. doi: 10.1109/TNSM.2011.122011.110151.
- P. P. Bocharov and A. V. Pechenkin, Queuing theory [Teoriya massovogo obsluzhivaniya]. Moscow: RUDN, 1995, 529 pp., in Russian.