Paper Summary: The Road Ahead for Mining Software Repositories

This paper summarizes a large body of work from the software engineering community on mining source code repositories.

The Road Ahead for Mining Software Repositories

  1. Focus / Problem to be solved
    1. leverage large code bases to help programmers design and write code better
  2. Importance
  3. Method
    1. Understanding Software Systems
      1. understand rationale for certain unexpected designs
    2. Propagating Changes
      1. changes to interface (Kathyrin’s paper)
      2. automating change propagation can help avoid bugs
      3. code that frequently changed together in the past is likely to change together in the future (based on historical repositories)
    3. Predicting and Identifying Bugs
      1. the best bug predictors are prior bugs and prior changes, i.e., code that had bugs in the past is likely to have bugs in the future
    4. Understanding Team Dynamics
    5. Improving the User Experience
      1. prevent users from performing actions that other users have reported as “buggy”
    6. Reusing Code
      1. locate uses of code such as library APIs, and attempt to match these uses to the needs of a developer
    7. Automating Empirical Studies
  4. Unique contributions (overview)
    1. creation of techniques to automate and improve the extraction of information from repositories
    2. discovery and validation of novel techniques and approaches to mine important information from these repositories
  5. Possible applications
    1. Use repository data to guide developers in making important decisions, such as resource allocation for development and testing
    2. Predict the resolution time of a bug
  6. Future directions
    1. Taming the complexity of mining source code repositories
      1. building good shared infrastructure for MSR
      2. Simplify the Extraction of High-Quality Data
      3. dealing with skew in repository data
    2. Going beyond Code and Bugs
      1. Exploring Non-Structured Data
      2. Linking Data Between Repositories
      3. Seeking Non-Traditional Repositories
      4. Understanding the Limitations of Repository Data
    3. Showing the Value of Repositories
      1. Understand the needs of practitioners
      2. Studying the Performance of Techniques in Practice
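
Two of the ideas in the Method section (code that changed together in the past tends to change together again, and prior bugs are good predictors of future bugs) can be illustrated with a small sketch. This is not the paper's implementation, only a minimal example under stated assumptions: the commit history in `COMMITS` is made up, the function names are hypothetical, and bug-fix commits are detected with a naive keyword match on the commit message.

```python
from collections import Counter
from itertools import combinations

# Hypothetical commit history (an assumption for illustration):
# each commit is (message, set of changed files).
COMMITS = [
    ("add parser", {"parser.py", "lexer.py"}),
    ("fix bug in parser", {"parser.py", "lexer.py"}),
    ("update docs", {"README.md"}),
    ("fix crash in lexer", {"lexer.py"}),
    ("refactor parser", {"parser.py", "lexer.py", "ast.py"}),
]


def co_change_counts(commits):
    """Count how often each pair of files was changed in the same commit."""
    pairs = Counter()
    for _message, files in commits:
        # Sort so that each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(files), 2):
            pairs[(a, b)] += 1
    return pairs


def suggest_co_changes(commits, changed_file, min_support=2):
    """List files that changed with `changed_file` at least `min_support` times."""
    suggestions = []
    for (a, b), count in co_change_counts(commits).items():
        if count >= min_support and changed_file in (a, b):
            other = b if a == changed_file else a
            suggestions.append((other, count))
    # Most frequent co-change partners first, ties broken by file name.
    return sorted(suggestions, key=lambda item: (-item[1], item[0]))


def prior_fix_counts(commits, keyword="fix"):
    """Count bug-fix commits per file, using a naive keyword match on the message."""
    fixes = Counter()
    for message, files in commits:
        if keyword in message.lower():
            fixes.update(files)
    return fixes


if __name__ == "__main__":
    print(suggest_co_changes(COMMITS, "parser.py"))
    print(prior_fix_counts(COMMITS).most_common())
```

A developer editing `parser.py` could be prompted to also check the files returned by `suggest_co_changes`, and files with the highest counts from `prior_fix_counts` could be given priority for testing. Real histories would need better bug-fix detection than a keyword match, for example links to a bug tracker.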