Tools and software
ClusterSchedSim: A Cluster Scheduling Strategy Simulation Package
ClusterSchedSim is a unified simulation
framework that models a wide range of scheduling strategies for
cluster systems. At the core of this framework is a detailed
cluster simulation model, ClusterSim, which
simulates the nodes across the cluster and the interconnect. Based on
this core, we have built the following modules: (1) a set of
parallel workloads that are often hosted on clusters; (2)
scheduling strategies, including space sharing, exact
co-scheduling and dynamic co-scheduling strategies; (3) detailed
instrumentation patches that can profile the executions at
different levels; and (4) a complete set of configurable
parameters, both for the scheduling schemes and the system
settings.
Download
clusterschedsim.tar.gz
STF: A Spatio-Temporal Filter for Failure Logs
STF is designed to remove redundancy and identify unique fatal events
in RAS event logs collected from IBM BlueGene/L, but
it can easily be extended to work with event logs from other parallel
systems. Its compression
rate is shown to be above 99.6%. STF consists of three steps: (1)
extracting failure events and categorizing them based on the subsystem
in which the failures occur, (2) compressing failures at the same
location (temporal compression), and (3) compressing failures across
multiple locations (spatial compression).
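The three steps above can be illustrated with a minimal Python sketch. This is not the STF code from the download. The `Event` record layout, the severity labels treated as fatal, the rule of matching events across locations by message text, and the 300-second windows are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record layout; real RAS logs carry more fields.
Event = namedtuple("Event", "time location subsystem severity message")

# Assumption: these severity labels mark failure events.
FATAL_LEVELS = ("FATAL", "FAILURE")


def extract_fatal(events):
    """Step 1: keep failure events and group them by subsystem."""
    groups = {}
    for ev in events:
        if ev.severity in FATAL_LEVELS:
            groups.setdefault(ev.subsystem, []).append(ev)
    return groups


def _compress(events, window, key):
    """Drop an event if another event with the same key occurred
    no more than `window` seconds before it."""
    last_seen = {}
    kept = []
    for ev in sorted(events, key=lambda e: e.time):
        prev = last_seen.get(key(ev))
        if prev is None or ev.time - prev > window:
            kept.append(ev)
        last_seen[key(ev)] = ev.time
    return kept


def temporal_compress(events, window):
    """Step 2: merge repeated failures at the same location."""
    return _compress(events, window, key=lambda e: e.location)


def spatial_compress(events, window):
    """Step 3: merge the same failure reported from multiple locations
    (assumed here to mean identical message text)."""
    return _compress(events, window, key=lambda e: e.message)


def stf(events, t_window=300, s_window=300):
    """Run the three steps per subsystem; window sizes are illustrative."""
    result = []
    for evs in extract_fatal(events).values():
        result.extend(spatial_compress(temporal_compress(evs, t_window), s_window))
    return sorted(result, key=lambda e: e.time)


log = [
    Event(0, "R00-M0", "memory", "FATAL", "parity error"),
    Event(10, "R00-M0", "memory", "FATAL", "parity error"),
    Event(20, "R01-M1", "memory", "FATAL", "parity error"),
    Event(30, "R00-M0", "memory", "INFO", "boot"),
    Event(1000, "R00-M0", "network", "FATAL", "link failure"),
]
unique_failures = stf(log)
```

In this toy log, the second memory event is dropped by temporal compression, the third by spatial compression, and the informational event by the extraction step, leaving one memory failure and one network failure.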
Download
clusterschedsim.tar.gz