This is a simple benchmark of performing a couple of aggregations on a data set of 50 million <date>:<value> pairs.
Original articles:
- http://vladmihalcea.wordpress.com/2013/12/19/mongodb-facts-lightning-speed-aggregation/
- http://www.javacodegeeks.com/2013/12/mongodb-lightning-fast-aggregation-challenged-with-oracle.html
- https://www.evernote.com/shard/s17/sh/f90752d1-7903-4d1c-a042-29a8ddbd3eeb/1df9e175dd1f218ced9f1608a85bdd35
Scripts:
- 01_prepare.rb: Set up the index
- 02_load.rb: Load the data from CSV file(s) into Elasticsearch. See source for notes on parallelization etc.
- 03_aggregate.rb: Perform the search request using aggregations
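As a rough illustration of the loading step, the sketch below turns CSV rows into a newline-delimited body for the Elasticsearch Bulk API. It is not the code from 02_load.rb: the two-column CSV layout, the sample rows, and the `bulk_body` helper are assumptions made for the example, while the `random_data` index and the `created_on` and `value` fields are taken from the search request in this README. Older Elasticsearch versions also expect a `_type` in the action line, which is left out here.

```ruby
require 'csv'
require 'json'

# Made-up sample rows; assumed layout is "<date>,<value>" with no header row.
SAMPLE_ROWS = <<~ROWS
  2012-03-01T10:15:00,0.4549
  2012-03-01T11:45:00,0.9183
ROWS

# Build a Bulk API body: one action line followed by one document line
# per CSV row, each terminated by a newline.
def bulk_body(csv_text, index: 'random_data')
  lines = CSV.parse(csv_text).flat_map do |created_on, value|
    [
      { 'index' => { '_index' => index } }.to_json,
      { 'created_on' => created_on, 'value' => Float(value) }.to_json
    ]
  end
  lines.join("\n") + "\n"
end

puts bulk_body(SAMPLE_ROWS)
```

The resulting string could be sent to the `_bulk` endpoint in batches; splitting the CSV into several batches is one way to parallelize the load.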
curl -X GET 'http://localhost:9200/random_data/_search?pretty&search_type=count' -d '{
  "query":{
    "match_all":{}
  },
  "aggregations":{
    "years":{
      "terms":{
        "script":"doc[\"created_on\"].date.getYear()",
        "size":1000,
        "order":{
          "_term":"asc"
        }
      },
      "aggregations":{
        "stats":{
          "stats":{
            "field":"value"
          }
        }
      }
    },
    "days":{
      "terms":{
        "script":"doc[\"created_on\"].date.getDayOfYear()",
        "size":1000,
        "order":{
          "_term":"asc"
        }
      },
      "aggregations":{
        "stats":{
          "stats":{
            "field":"value"
          }
        }
      }
    },
    "hours":{
      "terms":{
        "script":"doc[\"created_on\"].date.getHourOfDay()",
        "size":1000,
        "order":{
          "_term":"asc"
        }
      },
      "aggregations":{
        "stats":{
          "stats":{
            "field":"value"
          }
        }
      }
    }
  }
}'
TOOK: 5ms
========================================================================================================================================================== DAYS
1 count: 36 min: 0.04614763706922531 max: 0.9183247685432434 avg: 0.4549943816123737 sum: 16.379797738045454
2 count: 24 min: 0.06304323673248291 max: 0.9934419393539429 avg: 0.5252716839313507 sum: 12.606520414352417
3 count: 22 min: 0.02263391949236393 max: 0.9755882024765015 avg: 0.5638084034858779 sum: 12.403784876689315
...
0 count: 406 min: 0.0016236789524555206 max: 0.9998151063919067 avg: 0.4859336178647721 sum: 197.28904885309748
1 count: 425 min: 0.0008579888381063938 max: 0.998132586479187 avg: 0.4998712091975133 sum: 212.44526390894316
2 count: 442 min: 0.0011103407014161348 max: 0.9958078861236572 avg: 0.5092608069061255 sum: 225.09327665250748
3 count: 397 min: 0.0015195843297988176 max: 0.9939184188842773 avg: 0.5028650308132256 sum: 199.63741723285057
...
2011 count: 1 min: 0.998817503452301 max: 0.998817503452301 avg: 0.998817503452301 sum: 0.998817503452301
2012 count: 9998 min: 0.00018035457469522953 max: 0.9999924898147583 avg: 0.5038798629001606 sum: 5037.7908692758065
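For readers wondering how lines like the ones above can be derived from a search response, here is a minimal sketch. It assumes the response has been parsed into a Ruby hash with `took`, and with each aggregation exposing `buckets` that carry a `key` and the nested `stats` object. The `SAMPLE_RESPONSE` hash is hand-written for illustration (its numbers are copied from the 2011 line above), and `bucket_line` and `report` are hypothetical helpers, not code from 03_aggregate.rb.

```ruby
# Hand-written stand-in for a parsed search response; only the parts the
# report needs are present.
SAMPLE_RESPONSE = {
  'took' => 5,
  'aggregations' => {
    'years' => {
      'buckets' => [
        { 'key' => 2011, 'doc_count' => 1,
          'stats' => { 'count' => 1, 'min' => 0.998817503452301, 'max' => 0.998817503452301,
                       'avg' => 0.998817503452301, 'sum' => 0.998817503452301 } }
      ]
    }
  }
}.freeze

# Format one bucket as "<key> count: .. min: .. max: .. avg: .. sum: ..".
def bucket_line(bucket)
  stats = bucket['stats']
  fields = %w[count min max avg sum].map { |name| "#{name}: #{stats[name]}" }
  "#{bucket['key']} #{fields.join(' ')}"
end

# Collect the timing line, then a header and one line per bucket
# for every aggregation in the response.
def report(response)
  lines = ["TOOK: #{response['took']}ms"]
  response['aggregations'].each do |name, aggregation|
    lines << name.upcase
    aggregation['buckets'].each { |bucket| lines << bucket_line(bucket) }
  end
  lines
end

puts report(SAMPLE_RESPONSE)
```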
See the output.txt file for the output of running the request on a real data set.
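The curl request shown earlier repeats the same terms-plus-stats structure for years, days and hours. A minimal sketch of generating that body from Ruby follows; the `DATE_PARTS` table and the `date_part_aggregation` and `request_body` helpers are names invented for this example and are not necessarily how 03_aggregate.rb is written. Building the body in Ruby also sidesteps shell quoting around the script strings.

```ruby
require 'json'

# Aggregation name => date accessor used in the script of the request.
DATE_PARTS = {
  'years' => 'getYear',
  'days'  => 'getDayOfYear',
  'hours' => 'getHourOfDay'
}.freeze

# A terms aggregation bucketed by a scripted date part, ordered by term,
# with a nested stats aggregation on the `value` field.
def date_part_aggregation(accessor)
  {
    'terms' => {
      'script' => "doc['created_on'].date.#{accessor}()",
      'size'   => 1000,
      'order'  => { '_term' => 'asc' }
    },
    'aggregations' => {
      'stats' => { 'stats' => { 'field' => 'value' } }
    }
  }
end

# Full search body: match everything, aggregate by each date part.
def request_body
  {
    'query' => { 'match_all' => {} },
    'aggregations' => DATE_PARTS.transform_values { |accessor| date_part_aggregation(accessor) }
  }
end

puts JSON.pretty_generate(request_body)
```

Adding another grouping (for example by month) would then be a one-line change to the table rather than another copied block of JSON.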