BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20190719T085744Z
LOCATION:HG F 3
DTSTART;TZID=Europe/Stockholm:20190613T164500
DTEND;TZID=Europe/Stockholm:20190613T171500
UID:submissions.pasc-conference.org_PASC19_sess161_msa150@linklings.com
SUMMARY:Neural Code Comprehension: A Learnable Representation of Code Sema
 ntics
DESCRIPTION:Minisymposium\nComputer Science and Applied Mathematics\, Emer
 ging Application Domains\n\nNeural Code Comprehension: A Learnable Repres
 entation of Code Semantics\n\nBen-Nun\n\nIn the era of "Big Code"\, rese
 arch is being conducted into automating the understanding of comput
 er programs. Most current work builds on recently successful techniques f
 rom Natural Language Processing and Deep Learning\, processing the code e
 ither directly or through syntactic representations (e.g.\, ASTs and A
 ST paths). However\, to comprehend program semantics robustly\, structura
 l features of code must be taken into account as well\, including functi
 on calls\, branching\, and interchangeable order of statements. In this t
 alk\, we present a novel processing technique to use Machine Lea
 rning for code semantics\, and show how it applies to a variety of progra
 m analysis tasks. In particular\, we define an embedding space\, inst2v
 ec\, based on an Intermediate Representation (IR) that is independent o
 f the source programming language. We provide a novel definition of cont
 extual flow for this IR\, leveraging both the underlying data and contro
 l flow of the program. We then analyze the embeddings quantitatively usi
 ng analogies and clustering\, and evaluate the representation on three h
 igh-level tasks. We show that even without fine-tuning\, a single RNN ar
 chitecture and fixed inst2vec embeddings outperform specialized approach
 es for performance prediction and algorithm classification from raw cod
 e\, setting a new state of the art.
END:VEVENT
END:VCALENDAR