The Taub Faculty of Computer Science Events and Talks

Automatic Feature Generation for Predicting Program Properties
Uri Alon (M.Sc. Thesis Seminar)
Thursday, 17.08.2017, 10:00
Taub 601
Advisor: Prof. E. Yahav
We present a novel approach to automatic feature generation for predicting program properties. Our approach automatically produces features that capture long-distance syntactic relationships between program elements. The features are purely syntactic, and the method is applicable to any programming language. Inspired by Parse Tree Paths in Natural Language Processing (NLP), we generate features that capture relationships in an Abstract Syntax Tree (AST). We show that these features are general and can: (i) cover a number of different prediction tasks, (ii) drive two different learning algorithms (for both generative and discriminative models), and (iii) work across different programming languages. We evaluate our approach on the tasks of predicting variable names, method names, and types of expressions. We use the generated features to drive both CRF-based and word2vec-based learning, for programs written in four languages: JavaScript, Java, Python, and C#. Our evaluation shows that the automatically generated features capture semantic similarities and produce better results than existing methods. Because program elements are represented using path features, we believe that our approach can be used in a variety of other machine learning tasks for programming languages, including different applications and different learning models.
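
To make the idea of a path feature in an AST concrete, here is a minimal illustrative sketch. It is not the implementation from the thesis. It assumes Python's standard `ast` module, treats only identifier (`Name`) nodes as program elements, and connects each pair of identifiers by the sequence of node types on the tree path between them. The helper names (`name_chains`, `path_between`, `extract_path_features`) and the `^` (up) and `_` (down) separators are hypothetical choices made for this example.

```python
import ast
from itertools import combinations


def name_chains(tree):
    """Collect the root-to-node chain for every identifier (ast.Name) node."""
    chains = []

    def walk(node, ancestors):
        chain = ancestors + [node]
        if isinstance(node, ast.Name):
            # Treat identifiers as leaves; do not descend into their context.
            chains.append(chain)
            return
        for child in ast.iter_child_nodes(node):
            walk(child, chain)

    walk(tree, [])
    return chains


def path_between(chain_a, chain_b):
    """Describe the tree path from the end of chain_a to the end of chain_b."""
    # Find the first index at which the two root-to-node chains diverge.
    i = 0
    while i < min(len(chain_a), len(chain_b)) and chain_a[i] is chain_b[i]:
        i += 1
    # Go up from the first identifier to the lowest common ancestor (inclusive),
    # then down to the second identifier.
    up = [type(n).__name__ for n in reversed(chain_a[i - 1:])]
    down = [type(n).__name__ for n in chain_b[i:]]
    return "^".join(up) + "_" + "_".join(down)


def extract_path_features(source):
    """Return (identifier, path, identifier) triples for every identifier pair."""
    chains = name_chains(ast.parse(source))
    return [
        (a[-1].id, path_between(a, b), b[-1].id)
        for a, b in combinations(chains, 2)
    ]


print(extract_path_features("x = y + 1"))
# prints [('x', 'Name^Assign_BinOp_Name', 'y')]
```

Triples of this kind could then be passed as features to a learner. The sketch makes no attempt to reproduce the CRF-based or word2vec-based setups mentioned in the abstract, and it handles only Python source, whereas the abstract describes a method that works across languages.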