17-355/17-665/17-819 Program Analysis

Course Description

This course covers both foundations and practical aspects of the automated analysis of programs, which is becoming increasingly critical to find software errors and assure program correctness. The theory of abstract interpretation captures the essence of a broad range of program analyses and supports reasoning about their correctness. Building on this foundation, the course will describe program representations, data flow analysis, alias analysis, interprocedural analysis, dynamic analysis, and symbolic execution. Through assignments and projects, students will design and implement practical analysis tools that find bugs and verify properties of software.

This course fulfills the Logic and Languages constrained elective category for the Computer Science major as well as the Theoretical Foundations requirement of the Computer Science master's degree.

Why take this course?

Logistics and People

(all times below in US Eastern Time; see Canvas for Zoom meeting links)

Class: Tue/Thu 10:40 a.m. — 12:00 noon in PH 100 and via Zoom (in-person + remote)
Recitation: Fri 2:10 p.m. — 3:00 p.m. via Zoom (remote only)
Spring 2021
12 units

Professor Jonathan Aldrich
TCS 422
Office hours: Tue 5-6pm via Zoom
Email: aldrich@cs.cmu.edu
Headshot of Professor Jonathan Aldrich
Professor Rohan Padhye
TCS 325
Office hours: Thu 5-6pm via Zoom
Email: rohanpadhye@cmu.edu
Headshot of Professor Jonathan Aldrich
TA: Priya Varra

Office hours: Mon 8-9pm via Zoom
Email: pvarra@andrew.cmu.edu
Headshot of teaching assistant Priya Varra

Course Syllabus and Policies

The syllabus covers course learning objectives, supplemental textbooks, assessments, late work policy, and policies. Consult Canvas and especially Piazza for up-to-date announcements and discussion (links in header).

Schedule

As of January 2021, we have endeavored to make this schedule, including assignment due dates, as accurate as possible. However, it is subject to change as the semester advances. "Hybrid" modality indicates in-person + remote.

Date Topic Reading/Material Assignments Due Modality Optional Reading
Feb 2 Introduction, Program Representation, and Syntactic Analysis Text ch. 1-2, slides Remote PPA ch. 1
Feb 4 Program Semantics Text ch. 3, slides Remote
Feb 5 recitation CodeQL notes Remote
Feb 9 Dataflow Analysis and Abstract Interpretation Text ch. 4, slides Remote PPA ch. 2
Feb 11 Dataflow Analysis and Abstract Interpretation (contd.) Text ch. 4, slides hw1 hw1 Remote PPA ch. 6
Feb 12 recitation Semantics exercises, notes Remote
Feb 16 Dataflow Analysis examples Text ch. 5, slides Hybrid
Feb 18 Dataflow Analysis examples (contd.) (see previous) hw2 pdf mathpartir Hybrid
Feb 19 recitation Specifying/Implementing Dataflow Analysis Remote
Feb 23 No class
Feb 25 Dataflow Analysis termination and correctness Text ch. 6, slides Hybrid PPA ch. 4
Feb 26 recitation Correctness of Program Analysis exercises, notes Remote
Mar 2 Widening Operators and Collecting Semantics Text ch. 7, slides hw3 pdf Hybrid
Mar 4 Interprocedural analysis Text ch. 8, slides, simplified algorithm Hybrid
Mar 5 recitation Introduction to Soot notes Remote
Mar 9 Context-Sensitive Analysis Text ch. 8 & ch. 10.2 slides, simplified algorithm hw4 pdf Hybrid
Mar 11 Pointer Analysis Text ch. 10 slides Hybrid
Mar 12 recitation Midterm review Midterm Study Guide Remote
Mar 15 midterm
8am take-home exam released
Mar 16 Hoare-style Verification Text ch. 11, slides hw5 (checkpoint) pdf Hybrid PPA ch. 3
Mar 18 Control-Flow Analysis for Functional Languages Text ch. 8 slides midterm
take-home exam due
Hybrid
Mar 19 no recitation
Mar 23 Hoare-style Verification (contd.) Text ch. 11, slides Hybrid
Mar 25 Symbolic Execution Text ch. 13, slides hw5 pdf Hybrid
Mar 26 recitation exercises, notes Remote
Mar 30 Concolic Testing Text ch. 14, slides Hybrid
Apr 1 Fuzz Testing Text ch. 16, slides hw6 pdf Hybrid
Apr 2 recitation repository Remote
Apr 6 Satisfiability Modulo Theories (SMT) Text ch. 12, slides Hybrid
Apr 8 Program Synthesis Text ch. 15, slides Hybrid
Apr 9 recitation repository Remote
Apr 13 Oracle-Guided Synthesis Text ch. 15, slides hw7 pdf Hybrid
Apr 15 No class
Apr 16 no recitation
Apr 20 Introduction to Program Repair slides, CACM article Hybrid
Apr 22 Beyond Program Repair slides hw8 (incl. project proposal) pdf Hybrid
Apr 23 recitation material Remote
Apr 27 Model Checking slides Hybrid
Apr 29 Software Model Checking slides Hybrid
Apr 30 recitation Remote
May 4 Dynamic Analysis for Data Race Detection slides checkpoint for project Hybrid
May 6 Scaling Up Verification slides Hybrid
May 7 recitation Remote
May 14, 1—4pm Project presentations Project presentations Remote
May 17 project final report due