Wednesday, March 6, 2019
What is Big Data?
Hadoop is an open-source platform managed under the Apache Software Foundation. It deals with big data and returns results in a very short time. It can work with structured and unstructured data sets ranging from 10 to 100 GB and even more (V. Burunova).

Hadoop runs on a set of servers called a cluster. Its structure is one cluster or a group of clusters, each containing a group of nodes. Each cluster has two types of node: the name node and the data nodes. The name node is a unique node in the cluster and knows the location of every data block in the cluster. The data nodes are the remaining nodes in the cluster.

Hadoop has two layers that cooperate. The first layer is MapReduce, whose task is to divide data processing across multiple servers. The second is the Hadoop Distributed File System (HDFS), whose task is to store data across multiple machines in the cluster, with the data separated into a set of blocks. Hadoop makes sure the work on the cluster is correct, and it can detect and recover from an error or failure in one or more of the connected nodes. In this way Hadoop offers more processing power, more storage capacity, and high availability.
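To make the MapReduce idea concrete, here is a minimal sketch in plain Python. It is not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample chunks are made up for illustration, and everything runs in one process. Each chunk stands in for the part of the input that one server would process.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_outputs):
    """Group the emitted values by key, between the map and reduce steps."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(chunks):
    # Each chunk is mapped independently, as it would be on a separate server.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))


# Hypothetical input, already divided into three chunks.
chunks = ["big data needs big clusters", "clusters store data", "data data"]
counts = word_count(chunks)
print(counts)
```

Because map_phase only looks at its own chunk, the chunks could be processed on different machines at the same time, which is the point of the first layer.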
Hadoop is usually used on a large cluster or a public cloud service, at organizations such as Yahoo, Facebook, Twitter, and Amazon (Hadeer Mahmoud, 2018).

Hadoop's Features

Scalable
Hadoop can work with very large applications. It can run, analyze, store, process, and distribute large amounts of data across thousands of nodes and servers that handle thousands of terabytes of data or more. Extra nodes can be added to a cluster, and the servers work in parallel. This makes Hadoop a better fit for big data than traditional relational database systems (RDBMS), which cannot expand in the same way.

Single write, multiple read
Data on the cluster is written once and can be read from multiple sources at the same time.

Data availability
When data is sent to a data node, Hadoop creates multiple copies of that data on other nodes in the cluster, so the data stays available if one of the nodes fails.
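The block storage and data availability ideas can be sketched the same way. This is a toy model in plain Python, not HDFS code: TinyCluster, its methods, the node names, and the round-robin placement are assumptions made for illustration (real HDFS uses its own placement policy). The default of three copies matches HDFS's default replication factor.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class TinyCluster:
    """Toy model holding a single file: data nodes keep blocks, a table tracks them."""

    def __init__(self, node_names, replication=3):
        # Data nodes: node name -> {block_id: block bytes}
        self.storage = {name: {} for name in node_names}
        # Name node's table: block_id -> names of the nodes holding a copy
        self.block_map = {}
        self.replication = min(replication, len(node_names))

    def write(self, data, block_size):
        names = list(self.storage)
        for block_id, block in enumerate(split_into_blocks(data, block_size)):
            # Assumed placement: round-robin over the nodes, one copy per node.
            targets = [names[(block_id + k) % len(names)]
                       for k in range(self.replication)]
            for name in targets:
                self.storage[name][block_id] = block
            self.block_map[block_id] = targets

    def fail(self, name):
        """Simulate losing a data node together with every block it held."""
        self.storage.pop(name)

    def read(self):
        out = b""
        for block_id in sorted(self.block_map):
            # Use any surviving node that still holds a copy of this block.
            live = [n for n in self.block_map[block_id] if n in self.storage]
            if not live:
                raise IOError(f"block {block_id} has no surviving copy")
            out += self.storage[live[0]][block_id]
        return out


cluster = TinyCluster(["dn1", "dn2", "dn3", "dn4"], replication=3)
payload = b"hadoop stores data as replicated blocks"
cluster.write(payload, block_size=8)
cluster.fail("dn2")
recovered = cluster.read()
print(recovered == payload)
```

With three copies spread over four nodes, losing one node still leaves at least two copies of every block, so the read succeeds.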