How to fetch data in a batch from HBase in GeoMesa?

184 Views Asked by luway At 23 November 2018 at 01:07

The GeoTools API is one way to get data out of HBase with GeoMesa. However, when I use org.geotools.data.simple.SimpleFeatureCollection, it seems the results can only be accessed through the iterator returned by SimpleFeatureCollection.features(). The problem is that when I traverse the results, iterator.hasNext() takes too much time. Can I fetch data from HBase in GeoMesa in batches, rather than only through the iterator?

There is 1 best solution below

Emilio Lahr-Vivaz On 26 November 2018 at 14:01

Behind the scenes, there is some batching being done, but the batches are fetched lazily (i.e. on a call to hasNext, if there isn't any local data, it will do a remote fetch). You can control the HBase read-ahead through the system property geomesa.hbase.client.scanner.caching.size. The GeoTools API doesn't provide any batch mechanisms per se, however.
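As a rough illustration of the read-ahead setting, the property can be set like any other JVM system property. The class name, helper method, and the value 1000 below are placeholders chosen for the example, not recommendations, and the sketch assumes the property needs to be in place before the data store is created and queried.

```java
final class ScannerCachingConfig {

    // Property name as quoted in the answer above.
    static final String CACHING_PROPERTY = "geomesa.hbase.client.scanner.caching.size";

    // Hypothetical helper: sets the number of rows HBase reads ahead per remote fetch.
    // Assumption: call this before creating the data store and running queries.
    static void setScannerCaching(int rows) {
        System.setProperty(CACHING_PROPERTY, Integer.toString(rows));
    }

    public static void main(String[] args) {
        setScannerCaching(1000); // arbitrary example value
        System.out.println(System.getProperty(CACHING_PROPERTY));
    }
}
```

The same property can also be passed on the command line as a -D JVM flag instead of being set in code.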

For simple use cases, if you just want to fetch everything up front, you can pull the iterator into an ArrayList, then operate on it afterwards. To avoid waiting for the entire result set to be fetched, you could set up producer/consumer threads, so that one thread is continuously pre-fetching data and the second thread is operating on the results that have come back.
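Both approaches can be sketched roughly as follows. To keep the example self-contained it works on a plain java.util.Iterator rather than GeoTools types; with GeoTools you would feed it from the SimpleFeatureIterator returned by features() (which has its own hasNext()/next() methods and should be closed when you are done). The class name, method names, and buffer size are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

final class FeaturePrefetch {

    // Marker object the producer enqueues to tell the consumer it has finished.
    private static final Object DONE = new Object();

    /** Simple case: drain the whole iterator into memory, then work on the list. */
    static <T> List<T> fetchAll(Iterator<T> source) {
        List<T> all = new ArrayList<>();
        while (source.hasNext()) {
            all.add(source.next());
        }
        return all;
    }

    /**
     * Producer/consumer case: a background thread keeps calling hasNext()/next()
     * (absorbing the remote-fetch latency) while the calling thread handles
     * items as they arrive. bufferSize bounds how far the producer may run ahead.
     */
    static <T> void prefetch(Iterator<T> source, int bufferSize, Consumer<T> handler) {
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(bufferSize);

        Thread producer = new Thread(() -> {
            try {
                while (source.hasNext()) {
                    queue.put(source.next()); // blocks while the buffer is full
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                // The consumer stopped early; just let this thread end.
                Thread.currentThread().interrupt();
            }
        }, "feature-prefetch");
        producer.setDaemon(true);
        producer.start();

        try {
            for (Object item = queue.take(); item != DONE; item = queue.take()) {
                @SuppressWarnings("unchecked")
                T value = (T) item;
                handler.accept(value);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } finally {
            producer.interrupt(); // harmless if the producer has already finished
        }
    }
}
```

For example, FeaturePrefetch.prefetch(iterator, 1000, feature -> process(feature)) would let the hypothetical process(...) run on the calling thread while the next results are fetched in the background. Note that this sketch does not propagate exceptions thrown by the producer; a real implementation would need to handle that.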

For more advanced use cases, you can use Spark (or map/reduce directly) to load an entire result set at once.
