Mastering Hadoop 3

Disadvantages of erasure coding

Erasure coding can help us save a lot of storage space, but it still has some limitations, as follows:

  • Data locality: Erasure coding keeps only one replica of each data block, so programs such as MapReduce that rely on data locality need to run on the machine where the block resides. If they don't, the data block has to be transferred across the network.
  • Encoding and decoding operations are computationally expensive.
  • Expensive copy operations: Erasure coding keeps only one replica and encodes the data. Encoded data can't be read without moving most of the data over the network. The computation time for decoding, the network transfer, and the limited read parallelism when only one replica of a block exists make copy operations expensive.
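
To make the network cost in the last point concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of Hadoop: the function name `rebuild_traffic_mb` is hypothetical, and the numbers assume a Reed-Solomon layout with 6 data units (as in the RS-6-3 policy) and a 128 MB block size. It compares how much data has to be read to restore one lost block under replication and under erasure coding.

```python
def rebuild_traffic_mb(block_mb, data_units):
    """Estimate the MB read over the network to restore one lost block.

    Returns a tuple (replication_mb, erasure_coding_mb).
    Hypothetical helper for illustration only, not a Hadoop API.
    """
    # Replication: copy the block from any one surviving replica.
    replication_mb = block_mb
    # Reed-Solomon: decoding needs `data_units` surviving blocks
    # from the same block group before the lost one can be rebuilt.
    erasure_coding_mb = block_mb * data_units
    return replication_mb, erasure_coding_mb


if __name__ == "__main__":
    # Assumed values: 128 MB blocks, 6 data units per block group.
    rep, ec = rebuild_traffic_mb(block_mb=128, data_units=6)
    print(f"replication: {rep} MB, erasure coding: {ec} MB")
```

Under these assumptions, restoring a single block reads 128 MB with replication but 768 MB with erasure coding, before counting the CPU time spent on decoding.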