March 30th, Weekly Meeting Notes

This is a summary of this week’s meetings, outlining a few to-do items and potential issues.

  • Analysis of the data
    • Plot a separate line for each configuration of sequential mappers running in the sequential JVM
    • Rerun a few surprisingly fast data points
    • Look into possible lock contention in the parallel JVM implementation, since the parallel version is consistently slower
    • Use top scripts instead of the ps script and rerun the benchmarks
      • the overall utilization should then never exceed 800%
      • the graphs should then show a sharp drop in CPU utilization (the wcpu value in ps is currently a weighted average, which smooths out the drop)
    • Need automated scripts to generate the data for the CPU graphs
  • Finalizing the set of graphs we will present for each application
    • Throughput over different table sizes
      • show the impact of the memory wall
      • show the impact of an increasing number of mappers for IO-intensive applications
    • Memory footprint over different table sizes
      • show the improvement of the parallel JVM
    • CPU utilization over time
      • show that IO is not a problem at first: the distributed cache loads the table into memory once (which explains why CPU spiked to 800%)
      • show that Hashjoin is an IO-intensive application and KMeans is a compute-intensive application
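
The automated script for the CPU graphs could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name CpuSeries, the input format of one "<seconds> <pid> <cpuPercent>" line per process per sample (already extracted from the top output), and the 800% limit for an 8-core machine are all hypothetical choices, not the scripts actually used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: sums per-process CPU samples into one total per timestamp.
class CpuSeries {
    // Assumed limit: 8 cores, so 800% is the highest plausible total.
    static final double MAX_TOTAL_PERCENT = 800.0;

    // Each line is assumed to be "<seconds> <pid> <cpuPercent>", whitespace separated.
    public static TreeMap<Long, Double> totalsByTime(List<String> lines) {
        TreeMap<Long, Double> totals = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines between samples
            }
            String[] fields = trimmed.split("\\s+");
            long seconds = Long.parseLong(fields[0]);
            double cpu = Double.parseDouble(fields[2]);
            totals.merge(seconds, cpu, Double::sum);
        }
        return totals;
    }

    // Produces CSV rows ordered by time, flagging totals above the assumed limit.
    public static List<String> toCsv(TreeMap<Long, Double> totals) {
        List<String> rows = new ArrayList<>();
        rows.add("seconds,total_cpu_percent,over_limit");
        for (Map.Entry<Long, Double> entry : totals.entrySet()) {
            boolean overLimit = entry.getValue() > MAX_TOTAL_PERCENT;
            rows.add(entry.getKey() + "," + entry.getValue() + "," + overLimit);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "0 101 400.0", "0 102 395.5", "1 101 410.0", "1 102 420.0");
        for (String row : toCsv(totalsByTime(sample))) {
            System.out.println(row);
        }
    }
}
```

The over_limit column makes samples above 800% easy to spot when checking whether the switch from ps to top removed them.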

Next steps

Set up the test scripts for KMeans and run them to collect the same data.

Set up KNN to run and collect the data for that application.

Design a new mapper interface with synchronized setup methods, automatic state checking, and possibly static data structures.
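
One possible shape for such an interface is sketched below. This is not Hadoop's Mapper API and not the final design: the names SharedStateMapper, LookupMapper, loadSharedData and doMap are hypothetical, and the shared table is simplified to a string map. It only illustrates the three ideas from the note: a synchronized setup that runs once per JVM, a state check before mapping, and a static data structure shared by all mappers.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not a Hadoop class: one-time shared setup for mappers in one JVM.
abstract class SharedStateMapper<K, V> {
    // Static, so every mapper instance in this JVM sees the same table.
    private static final Map<String, String> SHARED_TABLE = new HashMap<>();
    private static boolean initialized = false;

    // Subclasses fill the shared table; invoked at most once per JVM.
    protected abstract void loadSharedData(Map<String, String> table);

    // Subclasses implement the actual mapping against the read-only table.
    protected abstract V doMap(K key, Map<String, String> table);

    // Synchronized on the class so concurrent mappers cannot load the table twice.
    public final void setup() {
        synchronized (SharedStateMapper.class) {
            if (!initialized) {
                loadSharedData(SHARED_TABLE);
                initialized = true;
            }
        }
    }

    // Automatic state check: refuse to map before setup has completed.
    public final V map(K key) {
        synchronized (SharedStateMapper.class) {
            if (!initialized) {
                throw new IllegalStateException("setup() has not run");
            }
        }
        return doMap(key, Collections.unmodifiableMap(SHARED_TABLE));
    }
}

// Hypothetical example mapper that looks keys up in the shared table.
class LookupMapper extends SharedStateMapper<String, String> {
    static int loads = 0; // counts how often the table was actually loaded

    @Override
    protected void loadSharedData(Map<String, String> table) {
        loads++;
        table.put("k1", "v1");
    }

    @Override
    protected String doMap(String key, Map<String, String> table) {
        return table.get(key);
    }

    public static void main(String[] args) {
        LookupMapper first = new LookupMapper();
        LookupMapper second = new LookupMapper();
        first.setup();
        second.setup(); // no second load: the table is already initialized
        System.out.println(second.map("k1") + " loads=" + loads);
    }
}
```

Holding the class lock only for the short state check keeps the map calls themselves from serializing, which matters given the lock contention suspected in the parallel JVM.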

Thesis writing plan

  • introduction
    • mostly done, 3 pages
  • related work
    • mostly done, 7 pages
  • background (by April 7th)
    • focus on the application
  • implementation (by April 7th)
  • results
  • conclusions