These are some notes I am compiling on map/reduce implementations, the most common of which is Hadoop. There are a lot of options beyond Hadoop, though, and many of them perform much better on high-performance computing resources.

Hadoop

Spark

Disco

MapReduce-MPI

MARIANE (MApReduce Implementation Adapted for HPC Environments)

MARISSA (MApReduce Implementation for Streaming Science Applications)

Phoenix