Research Summary for Test-Driven Code Reuse Tools

This post summarizes my research on test-driven code reuse tools. These tools are usually built on a core code search engine. The user supplies unit tests (or test functions), comments, and method signatures, and the search engine looks up code snippets in the repository that match the signatures. Finally, the candidate snippets are transformed so that the unit tests pass.

 

  1. Semantics-based code search
    1. Citation:
      1. @INPROCEEDINGS{5070525, 
        author={Reiss, S.P.}, 
        booktitle={Software Engineering, 2009. ICSE 2009. IEEE 31st International Conference on}, 
        title={Semantics-based code search}, 
        year={2009}, 
        month={May}, 
        pages={243-253}, 
        keywords={Java;formal specification;program compilers;programming language semantics;software reusability;Java;Web interface;class signature;contract;keyword;method signature;open source code repository;program transformation;security constraint;semantics-based code search;software reuse;user specification;Computer science;Contracts;Java;Open source software;Programming profession;Prototypes;Search engines;Security;Testing;Writing}, 
        doi={10.1109/ICSE.2009.5070525}, 
        ISSN={0270-5257},} 
    2. Design: the user specifies
      1. a natural language description
      2. a method signature
      3. test cases
      4. optionally, contracts, security constraints, and other constraints
    3. Implementation
      1. uses text matching
      2. uses abstract syntax tree to perform program transformation
      3. test driven
        1. first find candidate files using natural language description
        2. transform the files in a few ways to pass the user specified test cases (heavily leverages structural information)
          1. Signature transform
          2. Generative transform
          3. Compilation transforms
    4. Evaluation and usefulness
      1. Example tasks used in the evaluation:
        1. simple tokenizer
        2. quote tokenizer
        3. log2
        4. toRoman
        5. primes
        6. day of week
      2. Appears to work well for simple functions such as toint and string parsing
    5. Future Work
      1. extract dependencies
      2. additional transforms
      3. using context information
      4. additional semantics
      5. improved user interface
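
The test-driven step described under Implementation above (find candidates, then transform them until the user's tests pass) can be illustrated with a minimal Python sketch. This is not S6's implementation: S6 transforms Java source using abstract syntax trees, whereas this stand-in only tries different argument orders on Python callables. The names `adapt_candidates`, `passes`, `CANDIDATES`, and `TESTS` are hypothetical and exist only for this example.

```python
from itertools import permutations


def passes(func, tests):
    """Return True if func gives the expected result for every test case."""
    for args, expected in tests:
        try:
            if func(*args) != expected:
                return False
        except Exception:
            # A candidate that raises is treated as failing the test.
            return False
    return True


def adapt_candidates(candidates, arity, tests):
    """Try to adapt each candidate to the requested signature.

    candidates: dict mapping a candidate's original name to a callable
    arity:      number of parameters in the requested signature
    tests:      list of (args, expected) pairs supplied by the user

    Returns a list of (name, argument_order, adapted_callable) for the
    first argument order of each candidate that passes all tests.
    """
    passing = []
    for name, func in candidates.items():
        # Stand-in for a signature transform (assumption): try every
        # ordering of the arguments until the tests pass.
        for order in permutations(range(arity)):
            def adapted(*args, _f=func, _o=order):
                return _f(*(args[i] for i in _o))
            if passes(adapted, tests):
                passing.append((name, order, adapted))
                break
    return passing


# Hypothetical candidates, as if returned by a keyword search for "power".
CANDIDATES = {
    "power": lambda base, exp: base ** exp,
    "exponent_first": lambda exp, base: base ** exp,
    "multiply": lambda a, b: a * b,
}

# User-supplied test cases for the requested signature power(base, exp).
TESTS = [((2, 3), 8), ((3, 2), 9)]

results = adapt_candidates(CANDIDATES, 2, TESTS)
for name, order, _ in results:
    print(name, order)
```

Here `exponent_first` only passes after its arguments are swapped, and `multiply` is discarded because no argument order satisfies the tests. A real tool would apply many more kinds of transforms than this single one.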
  2. Code Conjurer
    1. @ARTICLE{4602673, 
      author={Hummel, O. and Janjic, W. and Atkinson, C.}, 
      journal={Software, IEEE}, 
      title={Code Conjurer: Pulling Reusable Software out of Thin Air}, 
      year={2008}, 
      month={Sept}, 
      volume={25}, 
      number={5}, 
      pages={45-52}, 
      keywords={object-oriented programming;software reusability;IT industry;code conjurer;component-based reuse;programming language;reusable software;software assets;software development;software functionality;Acceleration;Application software;Assembly;Computer industry;Computer languages;Java;Programming;Software reusability;Software systems;Web services;Eclipse plug-in;component-based development;open source software;reuse recommendation;software reuse;software search engines;test-driven search}, 
      doi={10.1109/MS.2008.110}, 
      ISSN={0740-7459},}
    2. What it does:
      1. An Eclipse plugin that automatically finds and presents reusable software components
      2. Input: the user describes how a certain class or interface should be used
      3. Output: code snippets that have been tested to match the user’s intention
    3. Implementation
      1. Code search engine (Merobase)
        1. Identifies the basic abstraction implemented by a module and stores it in a language-agnostic format
        2. The repository is compiled from SourceForge, Google Code, and Apache projects
        3. All Java files
        4. 10 million code modules
        5. Keyword-, signature-, and interface-based matching
        6. Test-driven retrieval
          1. Merobase automatically tests the candidates in a secured virtual machine to filter out those that don’t pass the tests
        7. Eclipse plugin, source code available
        8. Merobase is indexed by Lucene
        9. Data is available
    4. How useful is it? (Evaluation)
      1. Not very useful: it is very difficult to demonstrate that it is of any use, although the ideas are good.
      2. A later paper indicates that its results are not as good as those of S6 (Semantics-based Code Search).
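
The test-driven retrieval idea noted above (run each candidate against the user's tests and drop the ones that fail) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python rather than Java, and it runs each candidate in a child process with a timeout, which is not a security sandbox and is not how Merobase's secured virtual machine works. The names `passes_tests`, `filter_candidates`, `CANDIDATE_SOURCES`, and `TEST_SOURCE` are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def passes_tests(candidate_source, test_source, timeout_s=5.0):
    """Run candidate code plus tests in a child interpreter.

    Returns True only if the child exits with status 0 within the timeout.
    Note: a child process isolates crashes and hangs, but it is NOT a
    security boundary for untrusted code.
    """
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "candidate_under_test.py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(candidate_source + "\n" + test_source + "\n")
        try:
            result = subprocess.run(
                [sys.executable, path],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # A candidate that hangs is treated as failing.
            return False
        return result.returncode == 0


def filter_candidates(candidate_sources, test_source):
    """Keep the names of candidates whose code passes the user's tests."""
    return [
        name
        for name, source in candidate_sources.items()
        if passes_tests(source, test_source)
    ]


# Hypothetical candidates, as if returned by a signature-based search.
CANDIDATE_SOURCES = {
    "correct": "def is_even(n):\n    return n % 2 == 0\n",
    "wrong": "def is_even(n):\n    return n % 2 == 1\n",
    "crashes": "def is_even(n):\n    raise RuntimeError('boom')\n",
}

# The user's test cases, written against the requested signature.
TEST_SOURCE = "assert is_even(4)\nassert not is_even(7)\n"

surviving = filter_candidates(CANDIDATE_SOURCES, TEST_SOURCE)
print(surviving)
```

Only the candidate that satisfies both assertions survives the filter. The failing and crashing candidates exit with a non-zero status and are dropped.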