TetriSched: Space-Time Scheduling for Heterogeneous Datacenters

Friday, November 7, 2014 - 11:00am to 12:30pm
Faculty, Staff, Grad, Undergrad, Visitors

SPEAKER:  Alexey Tumanov, Carnegie Mellon University

HOST:     Andrew Warfield 

TITLE:    TetriSched: Space-Time Scheduling for Heterogeneous Datacenters

ABSTRACT:

I'm planning to talk about TetriSched, a cluster scheduler that
understands spatial (which types of resources) and temporal (when to
run) job preferences. It leverages the flexibility and placement
choices afforded by those preferences to come up with efficient
allocations of heterogeneous resources. It supports
complex combinatorial soft constraints in a general way through an
algebraic expression language that captures arbitrary forms of
placement preferences, including the commonly mentioned locality and
anti-affinity. We recently integrated TetriSched into the Hadoop YARN
framework and ran experiments on a real system. They show that
TetriSched significantly outperforms the YARN Capacity Scheduler,
thanks to its combination of gang scheduling, plan-ahead (the ability
to reason about deferred resource allocation), and soft constraints.
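
The space-time idea in the abstract can be illustrated with a toy sketch. This is not TetriSched's expression language or its solver: the `Option` record, the `best_schedule` function, the brute-force search, and all job data below are hypothetical and chosen only for illustration. Each job lists alternative placements (a resource type, a start slot, a gang size, and a value), and the search picks at most one option per job so that total value is maximized without exceeding capacity in any time slot.

```python
from itertools import product
from typing import NamedTuple


class Option(NamedTuple):
    """One acceptable placement for a job (hypothetical toy model)."""
    node_type: str  # spatial choice, e.g. "gpu" or "cpu"
    start: int      # temporal choice: first time slot used
    duration: int   # number of consecutive slots occupied
    nodes: int      # gang size: the job gets all of these nodes or none
    value: float    # how much the job values this option (soft constraint)


def feasible(choice, capacity, horizon):
    """Check that the chosen options fit within capacity in every slot."""
    used = {}
    for opt in choice:
        if opt is None:  # job left unscheduled
            continue
        if opt.start + opt.duration > horizon:
            return False
        for t in range(opt.start, opt.start + opt.duration):
            key = (opt.node_type, t)
            used[key] = used.get(key, 0) + opt.nodes
            if used[key] > capacity.get(opt.node_type, 0):
                return False
    return True


def best_schedule(jobs, capacity, horizon):
    """Brute-force search: pick at most one option per job, maximize value."""
    names = list(jobs)
    best_value, best_choice = 0.0, tuple(None for _ in names)
    for choice in product(*[[None] + jobs[n] for n in names]):
        if not feasible(choice, capacity, horizon):
            continue
        value = sum(o.value for o in choice if o is not None)
        if value > best_value:
            best_value, best_choice = value, choice
    return best_value, dict(zip(names, best_choice))


# Two jobs that both prefer the two GPU nodes but can fall back to CPUs.
capacity = {"gpu": 2, "cpu": 4}
jobs = {
    "A": [Option("gpu", 0, 2, 2, 10.0),  # GPUs now
          Option("gpu", 2, 2, 2, 8.0),   # GPUs later (deferred start)
          Option("cpu", 0, 3, 2, 6.0)],  # slower CPU fallback
    "B": [Option("gpu", 0, 2, 2, 9.0),
          Option("cpu", 0, 4, 2, 4.0)],
}
value, schedule = best_schedule(jobs, capacity, horizon=4)
print(value, schedule)
```

In this toy instance the highest-value plan defers job A to the GPUs at slot 2 and runs job B on them first, instead of pushing either job onto its less-preferred CPU option. Exhaustive search is used here only because the instance is tiny.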

BIO:

I'm a PhD Candidate at Carnegie Mellon, pursuing my PhD in Systems
under the supervision of Greg Ganger. At CMU, I've also been fortunate
to collaborate closely with Onur Mutlu, Mor Harchol-Balter, and Michael
Kozuch. At a high level, my research interests revolve around systems
support for large-scale and data-intensive distributed computing. For
more detail, please refer to my publication list:
http://www.cs.toronto.edu/~atumanov/

At the University of Toronto, I worked as a full-time research assistant
under the supervision of Eyal de Lara and Michael Brudno, contributing
to SnowFlock and SnowFlock-related projects. I received my
research-based M.Sc. in Computer Science from York University in
Toronto, working with advisors affiliated with the Centre for Vision
Research, Robert Allison and Wolfgang Stuerzlinger. My thesis focused on
mitigating and compensating for delay and its variability, which are
inherent in distributed interactive virtual reality applications.

Lastly, I have also worked on distributed cluster technology R&D in
industry. I was involved in the development of cluster middleware
responsible for distributed datacenter resource management, allocation,
and scheduling. I was also one of the key contributors to the
development of the Intel Cluster Ready Open Cluster Stack (OCS).