- HDFS is a spread across multiple machines (Simple with commodity Hardware)
- Nothing unique about individual machine but unique part is a cluster as a whole is highly fault tolerant
- Well Suited for large Batch Jobs
- Not a low latency system
- Data is HDFS is very very large (Semi-Structured)
- Any data in HDFS in split across multiple disks where each disk in present on a diff machine in a cluster
- File system manage machine and space
- Setup by Master-Salve Nodes
- Master Node (Name Node) coordinates with Slave Nodes(Data Node)
- One Namenode/Cluster
- For Example - Name note is like a table of content of a book and data node are the actual chapters
- NameNode has 2 responsibilities
- Manage the overall file system
- Stores (Directory Structure)
- Other File metadata
- DataNode
- Physically stores the data
0 Comments
|
Join DFIR Global Slack ChannelMac Forensics
|