Tuesday, June 10, 2008

Sample programs in DryadLINQ

A new technical report out of Microsoft Research, "Some sample programs written in DryadLINQ" (PDF), shows off some examples of the large-scale distributed computations possible with Dryad.

The paper provides code for the EM and PCA algorithms, computing PageRank, and mining astronomical data, among many other things. There are also fairly detailed descriptions of how the computations are executed across the cluster.

Dryad is a programming infrastructure designed to support large-scale computations over clusters. It is currently available only inside Microsoft, but it "is now widely used internally by Microsoft product groups" [1].

Please see also my earlier post, "Yahoo, Hadoop, and Pig Latin".
