Here’s the non-blocking map reduce class that I was talking about. It uses at most 75% of the CPU, so that the browser remains responsive while your job is processed. You can destroy the job at any stage.
class MapReduceJob
constructor: (inputsOriginal, mapFunc, reduceFunc) ->
utilisation = 75
inputs = inputsOriginal.slice()
# todo - replace with a better clone function
inputs = for input in inputsOriginal
input
outputs = []
@interval = setInterval( =>
tzero = (new Date).getTime()
# Process at least one
if inputs.length > 0
outputs.push mapFunc(inputs.pop())
# Process more
while ((new Date).getTime() - tzero < utilisation) and (inputs.length > 0)
outputs.push mapFunc(inputs.pop())
if inputs.length == 0
clearInterval(@interval)
reduceFunc(outputs)
, 100)
destroy: ->
clearInterval(@interval)
delete @interval
You use it like this:
new MapReduceJob(
['a', 'b', 'c', 'd'],
function(input){
// do something computationally expensive here, like uppercasing the input
return input.toUpperCase();
},
function(outputs){
console.log(outputs);
}
);
This is excellent for things like clustering markers on a map, or doing a backbone filtering on large collections. You need to structure your code to be asyncronous.
I’ve got a few things I might do with this code:
- Port underscore.js to this continuations style, so that you can use all the existing functions. I’ve already ported the
collect
andinject
functions. - Add a progress() method to the class.
- Add support for webworkers for desktop browsers.
- Port backbone.js to use these continuations, so that for example, the sort function doesn’t block when sorting large arrays.
- Assuming the mapFunction is constant time, measure the number of inputs to process per interval, then stop checking the date function.
It’s pretty interesting stuff, we’ll see how we go.