Group by and Count(*) in Datastax Search/Solr

445 Views Asked by At

Hi we have a solr index with diff fields in it like business,businessType, regionName, StateName, .....

Now I need a solr query to get the number of business of type businessType ='event' group by regionName.

if I want to write a sql query for this it would be select region_name , Count(business) from solr where businessType='event' group by region_name

Any pointer would be helpful

2

There are 2 best solutions below

2
Uttkarsh Jain On

Use facets. Your solr query will look like, q=:&fq=businessType:event&facet=true&facet.field=region_name&rows=0

if want to group by on multiple fields then we need to do facet.pivot=state,region_name

0
quickreaction On

I finally figured out how to do this. Note, if you need to query on a field with a space or a special character, you need to put the search term in quotes, e.g. businessType:"(fun) event".

curl http://localhost:8983/solr/yourCollection/query -d 
{ "query"="*:*",
  "fq"="businessType:event",
  "rows"=0,
  "json.facet"= { "category" : {
    "type": "terms",
    "field" : "region_name",
    "limit" : -1 }}
}

One more Note: if you want to count over 2 fields, you have to do a nested facet.

curl http://localhost:8983/solr/yourCollection/query -d 
{ "query"="*:*",
  "fq"="businessType:event",
  "rows"=0,
  "json.facet"= { "category1" : {
    "type": "terms",
    "field" : "regionName",
    "limit" : -1,
      "facet" : { "category2" : {
        "type": "terms",
        "field" : "stateName",
        "limit" : -1
  }}}}
}

Add another facet chunk after the "limit":-1 item if you need to group by a third dimension. I tried this on my company's Solr and it hung, never returning anything but a timeout error. In general, working with Solr isn't very easy... and the documentation, IMO, is pretty terrible. And absolutely nothing about the syntax or names of the commands seem intuitive at all...