The Problem of Data Placement in Distributed Systems
- Authors: Zhipa AV
- Affiliations: Peoples’ Friendship University of Russia
- Issue: No 2 (2015)
- Pages: 46-54
- Section: Articles
- URL: https://journals.rudn.ru/miph/article/view/8272
Abstract
The effectiveness of a distributed system depends heavily on how it manages incoming tasks and data against the limited computational resources at its disposal. Because the amount of incoming data is ever increasing, distributed systems are required to manage its storage and processing efficiently. Today, the design of a distributed system is significantly influenced by how it handles high-load scenarios, provides data storage functionality, and uses the underlying resources. Effective resource management in a distributed system has to balance the trade-off between resource consumption on a single node and the overall loss of data locality, which is inevitable because of data fragmentation. In this article we formalize the problem of data placement as maximizing data storage locality in distributed data systems, which turns out to be an NP-complete task. We then describe a polynomial-time algorithm that provides a solution within an additive constant of the optimal one.
About the authors
A V Zhipa
Peoples’ Friendship University of Russia, Department of Information Technology