How to move a data block from one DataNode to another during MapReduce?

I set up a 4-node Hadoop cluster by following this guide: https://www.linode.com/docs/guides/how-to-install-and-set-up-hadoop-cluster/

I want to move a data block to another DataNode, either after a map task or while a map or reduce task is running.

Is there any way to do this? And once the data blocks have been moved, can the MapReduce job still complete?

I tried using scp to move a data block to another DataNode.

However, I am only running the stock mapreduce-examples jar (Hadoop 3.1.2) through yarn.

I don't know how to modify the code.

Also, after a data block is moved, does the NameNode automatically update the metadata for that block?

There is 1 solution below

Answer by OneCricketeer

You can't just use scp or similar tools.

The NameNode tracks the location of every block, so manually moving block files around will corrupt any file that is not replicated. For replicated files, HDFS may simply re-replicate the block if you move one copy. Either way, the NameNode's metadata will not be updated automatically to reflect a manual move.

Stick to hadoop fs -mv commands, which are also available through the FileSystem Java API. MapReduce is not required.
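
For illustration, here is a minimal sketch of what going through the FileSystem Java API could look like. The class name HdfsMoveSketch, the helper methods, and both paths are hypothetical, and it assumes hadoop-client is on the classpath with a core-site.xml that points fs.defaultFS at the cluster. Note that in HDFS a rename changes the file's path in the namespace; where the blocks physically live stays under HDFS's control. The second helper only reads the block locations the file system reports, which is a safe way to see where blocks are without touching the DataNode directories.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsMoveSketch {

    // Rename (move) a path through the FileSystem API, the programmatic
    // counterpart of `hadoop fs -mv`. Returns true if the rename succeeded.
    static boolean moveFile(FileSystem fs, Path src, Path dst) throws IOException {
        return fs.rename(src, dst);
    }

    // Print the hosts that the file system reports for each block of a file.
    static void printBlockHosts(FileSystem fs, Path file) throws IOException {
        FileStatus status = fs.getFileStatus(file);
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.println("offset=" + block.getOffset()
                    + " length=" + block.getLength()
                    + " hosts=" + String.join(",", block.getHosts()));
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical paths; replace them with real ones from your cluster.
        Path src = new Path("/user/hadoop/input/data.txt");
        Path dst = new Path("/user/hadoop/archive/data.txt");

        // Reads fs.defaultFS from the core-site.xml found on the classpath.
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf)) {
            printBlockHosts(fs, src);
            if (!moveFile(fs, src, dst)) {
                throw new IOException("rename failed: " + src + " -> " + dst);
            }
            printBlockHosts(fs, dst);
        }
    }
}
```

Nothing here involves MapReduce or YARN: it is a plain client program that talks to the NameNode, so the metadata stays consistent with what HDFS actually stores.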