I want to copy a large volume of data from Elasticsearch to HDFS. My index grows by about 1 TB per day. I tried NiFi with the ScrollElasticsearchHTTP processor, but throughput is very low (about 32 MB in 5 minutes). I need to copy about 1 TB per day, so this is far too slow.
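To put the gap in numbers, here is a rough calculation. It is only a sketch: it assumes decimal units and that the 32 figure is in megabytes rather than megabits, and the variable names are just for illustration.

```python
# Rough throughput check for the numbers in the question.
# Assumptions: decimal units (1 TB = 10**12 bytes) and "32" measured in megabytes.
TB = 10**12
MB = 10**6

required_bytes_per_s = 1 * TB / (24 * 60 * 60)  # target: 1 TB per day
observed_bytes_per_s = 32 * MB / (5 * 60)       # measured: 32 MB in 5 minutes

required_mb_per_s = required_bytes_per_s / MB
observed_mb_per_s = observed_bytes_per_s / MB
speedup_needed = required_bytes_per_s / observed_bytes_per_s

print(f"required: {required_mb_per_s:.2f} MB/s")
print(f"observed: {observed_mb_per_s:.2f} MB/s")
print(f"speedup needed: about {speedup_needed:.0f}x")
```

So under these assumptions I need a sustained rate of a bit under 12 MB/s, which is roughly 100 times what I am getting now.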
What can I do to speed things up?
What I've tried so far:
- NiFi: I used the ScrollElasticsearchHTTP processor, with the slow results described above.
- Logstash: I am currently trying this as an alternative.