
Distributed Launch Protocol

FlexGP implements a robust, decentralized, peer-to-peer (P2P) startup algorithm. Every FlexGP instance can launch other FlexGP instances. Immediately after booting, each instance retrieves its parameters from the node that started it.

Below we illustrate how FlexGP would launch 7 instances when the fan-out is k=2. Node A is launched and runs NodeStart(7, []), where [] indicates an empty list. A then boots nodes B and X, each of which goes on to boot 2 more nodes.
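The launch tree described above can be simulated in a few lines of Python. This is only a sketch under stated assumptions, not FlexGP's actual code: it assumes the first argument of NodeStart is the number of instances a node is responsible for (itself included), that the list argument carries the nodes already known at boot time, and that the remaining instances are split as evenly as possible among at most k children. The names `node_start`, `split_evenly`, and `launch` are hypothetical.

```python
import itertools


def split_evenly(total, parts):
    """Split `total` into at most `parts` positive shares differing by at most 1."""
    parts = min(parts, total)
    if parts == 0:
        return []
    base, extra = divmod(total, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


def node_start(n, known, k, nodes, ids):
    """Simulate one node that is responsible for `n` instances, itself included.

    ASSUMPTION: `known` is the list of nodes this node learns about from its
    launcher; the real meaning of the list argument may differ.
    """
    me = next(ids)
    nodes[me] = {"known": list(known), "children": []}
    # One instance is this node; hand the rest to at most k children.
    for share in split_evenly(n - 1, k):
        child = node_start(share, known + [me], k, nodes, ids)
        nodes[me]["children"].append(child)
    return me


def launch(n, k):
    """Launch `n` simulated instances with fan-out `k` and return the node table."""
    nodes = {}
    node_start(n, [], k, nodes, itertools.count())
    return nodes


if __name__ == "__main__":
    for node_id, info in launch(7, 2).items():
        print(node_id, info)
```

With n=7 and k=2, the root (node 0, playing the role of A) starts two children, and each of those starts two leaves, for seven instances in total. The simulation is sequential; in the real protocol the children boot concurrently on separate machines.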


The next figure illustrates the timeline of two nodes during startup, showing the concurrency present in the FlexGP startup. As soon as node A has finished executing NodeStart and has started nodes B and X, it starts a new thread to begin running the learner MRGP.
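The ordering within a single node can be sketched with Python threads. This is a hedged illustration only: threads stand in for remote instances, the `learner` function is a placeholder rather than MRGP, and `run_node` and the event labels are hypothetical names. The point it shows is that each node begins learning right after starting its children, without waiting for them to finish their own startup.

```python
import threading


def learner(name, events, lock):
    # Placeholder for the real learner; it only records that it started.
    with lock:
        events.append((name, "learner started"))


def run_node(name, subtree, events, lock):
    """Start this node's children, then start its learner in a new thread.

    `subtree` maps each child name to that child's own subtree (nested dicts).
    """
    # Launch phase: each child thread stands in for a remote instance.
    children = [
        threading.Thread(target=run_node, args=(child, grandchildren, events, lock))
        for child, grandchildren in subtree.items()
    ]
    for t in children:
        t.start()
    with lock:
        events.append((name, "children started"))

    # Learning phase: begins without waiting for the children to finish.
    worker = threading.Thread(target=learner, args=(name, events, lock))
    worker.start()

    # Joins are only here so the simulation exits cleanly.
    worker.join()
    for t in children:
        t.join()


events, lock = [], threading.Lock()
run_node("A", {"B": {}, "X": {}}, events, lock)
for event in events:
    print(event)
```

The interleaving between different nodes varies from run to run, but within each node "children started" is always recorded before "learner started".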



This page has been created by the Any-Scale Learning For All (ALFA) group at MIT. Please contact us at: flexgp@csail.mit.edu