Paper Summary: Phrase-Based Statistical Translation of Programming Languages

This is a summary of the paper “Phrase-Based Statistical Translation of Programming Languages” by Svetoslav (ETH Zurich). The paper is to be presented at SPLASH 2014, in the Onward! program. The link to the paper presentation is at

The paper can be found at

This post is currently under development and will be updated frequently over the next few days.


  • Problem/Focus

Migrating (translating) software from one programming language to another is difficult.

  • Importance

Enabling easy, automated translation of software between programming languages.

  • Context

Applying statistical methods to programming languages was first proposed in “On the Naturalness of Software”.

Nguyen et al. published work on a phrase-based SMT (statistical machine translation) system for code. However, that work has high rates of syntactic errors in its output (49.5% – 58.6%).

“Code Completion with Statistical Language Models” (PLDI 2014).


  • Contribution

“An optimized procedure for translating a context-free grammar into a prefix grammar that is suitable for phrase-based statistical machine translation. 

An implementation which extends phrase-based translation to take language grammar into account. 

A detailed experimental evaluation indicating that with our most advanced configuration, about 60% of the translated methods compile and many of the translated methods are semantically equivalent to the reference solution.”
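
To make the contributions a bit more concrete, here is a minimal sketch, in Python, of phrase-based translation over code tokens combined with a prefix check. This is not the paper's implementation. The phrase table entries and the functions `translate` and `is_viable_prefix` are hypothetical names invented for illustration. Greedy longest-match substitution stands in for a real probabilistic decoder, and a bracket-balance check stands in for a real prefix grammar.

```python
# Toy phrase table: source token phrases (C#-like) -> target token phrases (Java-like).
# All entries are made up for illustration; a real system learns them,
# with probabilities, from parallel code.
PHRASE_TABLE = {
    ("Console", ".", "WriteLine"): ("System", ".", "out", ".", "println"),
    ("string",): ("String",),
    ("bool",): ("boolean",),
}
MAX_PHRASE_LEN = max(len(src) for src in PHRASE_TABLE)


def translate(tokens):
    """Greedy longest-match phrase substitution; unknown tokens are copied through."""
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            phrase = tuple(tokens[i:i + n])
            if phrase in PHRASE_TABLE:
                out.extend(PHRASE_TABLE[phrase])
                i += n
                break
        else:
            # No phrase matched at position i: copy the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


def is_viable_prefix(tokens):
    """Assumed stand-in for a grammar check: True if the token prefix can
    still be extended to a sequence with balanced brackets."""
    closers = {")": "(", "]": "[", "}": "{"}
    openers = set(closers.values())
    stack = []
    for tok in tokens:
        if tok in openers:
            stack.append(tok)
        elif tok in closers:
            if not stack or stack.pop() != closers[tok]:
                return False
    return True


source = ["Console", ".", "WriteLine", "(", "name", ")", ";"]
target = translate(source)
print(" ".join(target))
```

A grammar-aware decoder could use a check like `is_viable_prefix` to discard partial translations that can no longer become syntactically valid, which is the kind of failure behind the high syntactic error rates mentioned in the Context section.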

  • Implementation
  • Evaluation 


