A Data-Placement Strategy Based on Genetic Algorithm in Cloud Computing


With the growth of computerized business applications, the amount of data is increasing exponentially. Cloud computing provides high-performance computing resources and mass storage for processing massive data. In distributed cloud computing systems, data-intensive computing can require data to be scheduled, that is, moved, between data centers. A reasonable data placement can effectively reduce data scheduling between data centers and improve the efficiency with which users acquire data. In this paper, a mathematical model of data scheduling between data centers is built. Using the global optimization ability of the genetic algorithm, generational evolution produces progressively better approximate solutions and finally yields the best approximate data placement found. The experimental results show that the genetic algorithm can effectively find an approximately optimal data placement and keep data scheduling between data centers close to the minimum.
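The idea summarized above can be illustrated with a minimal sketch. It is not the authors' model, which is not reproduced in this abstract. The sketch assumes a chromosome that maps each dataset to a data center, a hypothetical cost in which each task runs where most of its input data already resides and pulls the rest from elsewhere, and a hypothetical capacity penalty. Tournament selection, one-point crossover, and elitism are generic choices made for illustration.

```python
import random


def transfer_cost(placement, tasks, sizes):
    """Total data moved between centers under an ASSUMED cost model:
    each task runs in the center holding the largest share of its inputs,
    and all of its other input datasets are transferred there."""
    cost = 0
    for datasets in tasks:
        held = {}  # data center -> total size of this task's inputs stored there
        for d in datasets:
            held[placement[d]] = held.get(placement[d], 0) + sizes[d]
        if held:
            cost += sum(held.values()) - max(held.values())
    return cost


def fitness(placement, tasks, sizes, capacity, penalty=1000):
    """Transfer cost plus a hypothetical penalty for exceeding storage capacity."""
    load = [0] * len(capacity)
    for d, c in enumerate(placement):
        load[c] += sizes[d]
    overflow = sum(max(0, used - cap) for used, cap in zip(load, capacity))
    return transfer_cost(placement, tasks, sizes) + penalty * overflow


def evolve(tasks, sizes, capacity, pop_size=40, generations=200,
           crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Return (best placement found, its fitness). Lower fitness is better."""
    rng = random.Random(seed)
    n, k = len(sizes), len(capacity)

    def score(p):
        return fitness(p, tasks, sizes, capacity)

    def pick(pop):
        # Binary tournament selection: the better of two random individuals.
        a, b = rng.sample(pop, 2)
        return a if score(a) <= score(b) else b

    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    best = min(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: the best individual survives unchanged
        while len(nxt) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            child = p1[:]
            if n > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            # Mutation: reassign a dataset to a random center with small probability.
            child = [rng.randrange(k) if rng.random() < mutation_rate else g
                     for g in child]
            nxt.append(child)
        pop = nxt
        best = min(pop, key=score)
    return best, score(best)


if __name__ == "__main__":
    sizes = [1, 2, 3, 4]              # dataset sizes (illustrative values)
    tasks = [[0, 1], [2, 3], [1, 2]]  # datasets read by each task
    capacity = [7, 7]                 # storage capacity per data center
    print(evolve(tasks, sizes, capacity))
```

Without the capacity term the problem is trivial, since storing every dataset in one data center gives zero transfer. The penalty is what forces the search to trade co-location against storage limits.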

Share and Cite:

Xu, Q., Xu, Z. and Wang, T. (2015) A Data-Placement Strategy Based on Genetic Algorithm in Cloud Computing. International Journal of Intelligence Science, 5, 145-157. doi: 10.4236/ijis.2015.53013.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2023 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.