Changing replication of existing files in HDFS


I tried changing the replication factor from 3 to 1 and restarting the services, but the replication factor remains the same.

Can anyone suggest how to change the replication factor of existing files?

This is the fsck report:

 Minimally replicated blocks:   45 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       45 (100.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    1
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              45 (33.333332 %)
 DecommissionedReplicas:        45
 Number of data-nodes:          2
 Number of racks:               1

2 Answers

Abhinav (Best answer)

For anyone facing the same issue, just run this command:

hdfs dfs -setrep -R 1 /

When the blocks are under-replicated and you change the replication factor from 3 to 1 (or make any other change), the new value applies only to files created in HDFS after the change, not to the old ones.

You have to change the replication factor of the old files yourself.
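
To confirm the change took effect, one option is to query the replication factor recorded for a file with the stat command, or simply re-run fsck (the path below is just an example):

    # Print the replication factor recorded for a single file
    hdfs dfs -stat %r /tmp/logs/file.txt

    # Or re-run fsck and check that "Average block replication" drops to 1.0
    hdfs fsck /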

siddhartha jain

There are two scenarios when changing the replication factor of a file in HDFS:

  1. When the file is already present. In that case you need to change the replication factor of that particular file or directory yourself. To change the replication factor of a directory:

    hdfs dfs -setrep -R -w 2 /tmp 
    

    Or, to change the replication factor of a particular file:

    hdfs dfs -setrep -w 3 /tmp/logs/file.txt
    
  2. When you want the change to apply to new files that do not exist yet and will be created in the future. For those, you need to set the replication factor in hdfs-site.xml (a way to verify the value is sketched after this list):

    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
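
After editing hdfs-site.xml and restarting the relevant services, a quick way to check which default the configuration actually carries (assuming the hdfs client reads the same configuration files as the cluster) is getconf:

    # Print the default replication factor from the loaded configuration
    hdfs getconf -confKey dfs.replication

Keep in mind that, as the accepted answer notes, this default only applies to files created after the change.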