Optimizing Cache Performance for Graph Analytics
A series of efficient cache optimizations for graph analytics. We currently have the fastest shared-memory PageRank performance, significantly outperforming GraphMat, Galois, and Ligra.
https://arxiv.org/abs/1608.01362
Optimizing Spark for Multi-core Systems
Applies some of the shared-memory integration ideas in Spark to reduce the memory footprint of K-means by 2.5x on a 6-core node and speed up PageRank by 15% on a local computer. Project report for 6.824 at MIT.
Optimizing Spark for Multi-core Systems
Habanero-Hadoop: An Optimized MapReduce Runtime for Multi-core Systems
I am currently working to further improve the implementation and plan to develop it into a full conference publication. More information can be found in the blog posts. Below are the best presentations, posters, and workshop papers on the research project.
Tech report for the research: socc14-draft
Master's thesis on Optimized Runtime Systems for MapReduce Applications in Multi-core Clusters
ACM Student Research Competition at SPLASH 13, 3rd Place, Undergraduate, Oct 2013
(The improved poster and slides with more applications and results)
https://wiki.rice.edu/confluence/download/attachments/4425835/HJ-Hadoop-presentation-v3.pdf [slides]
HJ-Hadoop-posterv4 [poster]
The original HotPar 2013 poster accompanying the workshop paper, June 2013
https://wiki.rice.edu/confluence/download/attachments/4425835/HabaneroJava-Hadoop.pdf [workshop paper submission, accepted as a poster]
HJ-MPI (Building ArrayView based MPI APIs in Habanero Java Library)
YunmingZhang-COMP 522-ProjectReport