Who, when and where?
The seminar will be held as a block seminar on Jan. 10th and 11th in room A1.02. We will start at 10am.
The seminar will be jointly held by Profs. Carsten Binnig, Kristian Kersting, Andreas Koch, and Mira Mezini.
Prior knowledge of artificial intelligence is not required, but prior knowledge of software/hardware systems and machine learning is helpful. Participation is limited to 20 students.
For further questions, feel free to send an email to firstname.lastname@example.org. No prior registration is needed; however, please still send us an email so that we can estimate the number of participants beforehand and have your e-mail address for possible announcements. Also make sure that you are registered in TUCaN.
This seminar serves the purpose of discussing new research papers at the intersection of hardware/software systems and machine learning. The seminar aims to foster new connections between these fields and discusses important systems questions in machine learning, including topics such as hardware accelerators for ML, distributed scalable ML systems, novel programming paradigms for ML, automated ML approaches, and using ML for systems.
The topics will be assigned based on an online bidding process, which will open after the kick-off. The final assignment will be made a week later.
What is “Extended” about this seminar? Students are not only expected to give a short talk, but also to prepare a small write-up. The write-up will be prepared in groups; each group covers one theme consisting of four topics. The final write-up must be concise and should give a short overview of the theme (not necessarily limited to the studied papers).
In addition, we will also run a peer-reviewing process, as is usually done at scientific conferences. This means that you also have to read some of the other write-ups and provide feedback by filling out a review form.
Because Extended Seminars involve more work, students receive 4 CPs for them (instead of 3 CPs for regular seminars).
Although each topic is typically associated with a single paper, the point of the talk is not to reproduce the entire contents of the paper exactly, but to communicate the key ideas of the methods it introduces. Thus, the content of the talk should exceed the scope of the paper and demonstrate that a thorough understanding of the material was achieved. See also our general advice on giving talks.
Students are expected to give a 20 (!) minute talk on the material they are assigned, followed by 10 minutes of questions. Note that the comparatively short period of time forces you to get the most important points of your topic across. You are not expected to present everything.
The talks are expected to be accompanied by slides. In case you do not own a laptop, please send us the slides in advance so that we can prepare and test them. The talk and the slides should be in English.
The talks will be presented in a block on 10 and 11 January 2019.
The talks are organized in topical groups. Each group must prepare one short write-up of their work.
Content: The papers are related to each other. Your task is to use these papers to create a mini-survey that combines the results of all papers, and possibly other papers. The contribution of each individual paper can be limited to the most important points that this paper contributes to the topic. There must be a clear common thread within each survey; a mere concatenation of individual paper summaries is not enough. A possible outline consists of an introduction that sets the stage and outlines the cross-cutting themes of all papers, multiple sections on the individual contributions with respect to these themes and a comparison of the different approaches, a joint related-work section, and a summary and outlook.
Format: The format for the write-up is predefined and follows conventions typically used for publications in computer science. In particular, we require each paper to be formatted according to the Template for Proceedings in ACM Conferences (2-column layout). Each paper should have at most 6 pages in this format (the bibliography is not counted and can be as long as necessary). The format must not be changed in order to generate more space. Each paper also must, of course, have a title, authors, and an abstract. The templates are available in Word and LaTeX, but we strongly recommend that you use LaTeX. Environments such as MiKTeX and TeXstudio make local LaTeX editing quite easy, and websites like Overleaf offer collaborative working environments for LaTeX.
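For orientation, a minimal LaTeX skeleton for such a write-up might look as follows. This is only a sketch, assuming the current acmart class with the sigconf (2-column) option; the title, author, and the references.bib file are hypothetical placeholders, and the section names simply mirror the outline suggested above:

\documentclass[sigconf]{acmart}

\begin{document}

% Title, authors, and abstract are mandatory; the names below are placeholders.
\title{A Mini-Survey on Learned Index Structures}
\author{Jane Doe}
\affiliation{\institution{TU Darmstadt}}
\email{jane.doe@example.org}

\begin{abstract}
One short paragraph summarizing the cross-cutting themes of the surveyed papers.
\end{abstract}

\maketitle

\section{Introduction}        % set the stage, outline the cross-cutting themes
\section{Approaches}          % individual contributions w.r.t. the themes
\section{Comparison}          % strengths and weaknesses of the approaches
\section{Related Work}        % joint related-work section
\section{Summary and Outlook}

% The bibliography is not counted towards the 6-page limit.
\bibliographystyle{ACM-Reference-Format}
\bibliography{references}     % assumes a references.bib file next to the .tex

\end{document}

On Overleaf, searching the template gallery for the ACM proceedings template gives you this setup without any local installation.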
Deadline: The write-ups are due 29 January 2019.
Reviews are required for all three other write-ups. A reviewing form will be provided by then. The deadline for the students’ reviews is 19 February 2019.
The slides, the presentation, and the answers given to questions during your talk will influence the overall grade, as will the write-up and the reviews. Furthermore, students are expected to actively participate in the discussions, and this will also be part of the final grade.
To achieve a grade in the 1.x range, the talk and write-up need to exceed a recitation of the given material and include your own ideas, your own experience, or even examples/demos. An exact recitation of the papers will lead to a grade in the 2.x range. A weak presentation and a lack of engagement in the discussions may lead to a grade in the 3.x range, or worse. For the write-ups, it is important that they provide a coherent view (like a survey paper) and do not simply consist of a concatenation of four paper summaries.
Day 1: Machine Learning and Hardware/Machine Learning and Databases (Thursday, Jan. 10th)
Session 1: Hardware for Machine Learning (Jan. 10th, 10:00h)
- Prabhash Kumar Jha: (D1) C. Farabet, B. Martini, B. Corda, P. Akselrod, E. Culurciello and Y. LeCun, “NeuFlow: A runtime reconfigurable dataflow processor for vision,” CVPR 2011 WORKSHOPS, Colorado Springs, CO, 2011, pp. 109-116.
- Isheeta Jha: (D2) Jouppi, Norman P., et al. “In-Datacenter Performance Analysis of a Tensor Processing Unit.” 44th Annual International Symposium on Computer Architecture (ISCA), 2017.
Session 2: Machine Learning for Databases (Jan. 10th, 11:00h)
- Lasse Beck Thostrup: (A3) Krishnan, S., Yang, Z., Goldberg, K., Hellerstein, J., & Stoica, I. Learning to optimize join queries with deep reinforcement learning. arXiv 2018.
- Andreas Zimpfer: (A1) Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, Neoklis Polyzotis: The Case for Learned Index Structures. SIGMOD Conference 2018.
- Markus Stefani: (A4) Kipf, A., Kipf, T., Radke, B., Leis, V., Boncz, P., & Kemper, A. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. arXiv 2018.
Lunch break (Jan. 10th, 12:30h-14:00h)
Session 3: Distributed Machine Learning (Jan. 10th, 14:00h)
- Moritz Nottebaum: (A6) Mu Li, David G. Andersen, Jun Woo Park, Alexander J. Smola, Amr Ahmed, Vanja Josifovski, James Long, Eugene J. Shekita, Bor-Yiing Su: Scaling Distributed Machine Learning with the Parameter Server. OSDI 2014
- Julian Keßel: (A8) Jiawei Jiang, Fangcheng Fu, Tong Yang, Bin Cui: SketchML: Accelerating Distributed Machine Learning with Data Sketches. SIGMOD Conference 2018
Day 2: Automated Machine Learning/Machine Learning and Software Engineering (Friday, Jan. 11th)
Session 4: Automated Machine Learning (Jan. 11th, 10:00h)
- Benedikt Gross: (B9) James Robert Lloyd, David K. Duvenaud, Roger B. Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani: Automatic Construction and Natural-Language Description of Nonparametric Regression Models. AAAI 2014: 1242-1250.
- Frederik Wegner: (B7) Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Matthew Botvinick, Nando de Freitas: Learning to Learn without Gradient Descent by Gradient Descent. ICML 2017: 748-756.
Session 5: Machine Learning and Software Engineering (Jan. 11th, 11:00h)
- Mohammad Braei: (C12) Proksch, Lerch, Mezini. Intelligent code completion with Bayesian networks. TSE 2015.
- Julian Haas: (C8) Baudart, Hirzel, Mandel. Deep probabilistic programming languages: A Qualitative Study.
- Sakshi Goyal: (C5) Gelman, Lee, Guo. Stan: A probabilistic programming language for Bayesian inference and optimization. Journal of Educational and Behavioral Statistics 40.5 (2015): 530-543.
All papers should be available on the internet or in the ULB. Note that SpringerLink often only works on campus networks (sometimes not even via VPN). If you cannot find a paper, contact us.
Machine Learning and Data Management
Machine Learning to enhance Database Systems (Binnig)
- (A1) Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, Neoklis Polyzotis: The Case for Learned Index Structures. SIGMOD Conference 2018.
- (A2) Ma, Lin, et al. Query-based Workload Forecasting for Self-Driving Database Management Systems. SIGMOD Conference 2018.
- (A3) Krishnan, S., Yang, Z., Goldberg, K., Hellerstein, J., & Stoica, I. Learning to optimize join queries with deep reinforcement learning. arXiv 2018
- (A4) Kipf, A., Kipf, T., Radke, B., Leis, V., Boncz, P., & Kemper, A. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. arXiv 2018.
- (A5) Li, T., Xu, Z., Tang, J., & Wang, Y. Model-free control for distributed stream data processing using deep reinforcement learning. PVLDB 2018
Machine Learning for Knowledge Base Construction (Kersting)
- (B1) Ismail Ilkan Ceylan, Adnan Darwiche, Guy Van den Broeck: Open-World Probabilistic Databases. KR 2016: 339-348.
- (B2) Benny Kimelfeld, Christopher Ré: A Relational Framework for Classifier Engineering. SIGMOD Record 47(1): 6-13 (2018).
- (B3) Ce Zhang, Christopher Ré, Michael J. Cafarella, Jaeho Shin, Feiran Wang, Sen Wu: DeepDive: declarative knowledge base construction. Commun. ACM 60(5): 93-102 (2017).
- (B4) Parisa Kordjamshidi, Dan Roth, Kristian Kersting: Systems AI: A Declarative Learning Based Programming Perspective. IJCAI 2018: 5464-5471.
Machine Learning Systems
Distributed Machine Learning (Binnig)
- (A6) Mu Li, David G. Andersen, Jun Woo Park, Alexander J. Smola, Amr Ahmed, Vanja Josifovski, James Long, Eugene J. Shekita, Bor-Yiing Su: Scaling Distributed Machine Learning with the Parameter Server. OSDI 2014
- (A7) Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, William Paul, Michael I. Jordan, Ion Stoica: Ray: A Distributed Framework for Emerging AI Applications. arXiv 2017
- (A8) Jiawei Jiang, Fangcheng Fu, Tong Yang, Bin Cui: SketchML: Accelerating Distributed Machine Learning with Data Sketches. SIGMOD Conference 2018
- (A9) Hantian Zhang, Jerry Li, Kaan Kara, Dan Alistarh, Ji Liu, Ce Zhang: ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning. ICML 2017
- (A10) Anthony Thomas, Arun Kumar: A Comparative Evaluation of Systems for Scalable Linear Algebra-based Analytics. PVLDB 2018
- (A11) Tian Li, Jie Zhong, Ji Liu, Wentao Wu, Ce Zhang: Ease.ml: Towards Multi-tenant Resource Sharing for Machine Learning Workloads. PVLDB 2018
Automating Machine Learning (Kersting)
- (B5) Alexander J. Ratner, Christopher De Sa, Sen Wu, Daniel Selsam, Christopher Ré: Data Programming: Creating Large Training Sets, Quickly. NIPS 2016: 3567-3575.
- (B6) Matthias Feurer, Aaron Klein, Katharina Eggensperger, Jost Tobias Springenberg, Manuel Blum, Frank Hutter: Efficient and Robust Automated Machine Learning. NIPS 2015: 2962-2970.
- (B7) Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Matthew Botvinick, Nando de Freitas: Learning to Learn without Gradient Descent by Gradient Descent. ICML 2017: 748-756.
- (B8) Antonio Vergari, Alejandro Molina, Robert Peharz, Zoubin Ghahramani, Kristian Kersting, Isabel Valera: Automatic Bayesian Density Analysis. CoRR abs/1807.09306 (2018).
- (B9) James Robert Lloyd, David K. Duvenaud, Roger B. Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani: Automatic Construction and Natural-Language Description of Nonparametric Regression Models. AAAI 2014: 1242-1250.
Machine Learning and Software Engineering
Programming Abstractions for Machine Learning (Mezini)
- (C1) Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.
- (C2) Tran, Hoffman, Saurous, Brevdo, Murphy, Blei. Deep probabilistic programming. arXiv preprint arXiv:1701.03757 (2017).
- (C3) PyTorch Machine Learning Library.
- (C4) Pyro Programming Language.
- (C5) Gelman, Lee, Guo. Stan: A probabilistic programming language for Bayesian inference and optimization. Journal of Educational and Behavioral Statistics 40.5 (2015): 530-543.
- (C6) Gordon, Henzinger, Nori, Rajamani. Probabilistic programming. In International Conference on Software Engineering (ICSE, FOSE track), 2014.
- (C7) Gordon et al. Probabilistic Programs as Spreadsheet Queries. ESOP 2015.
- (C8) Baudart, Hirzel, Mandel. Deep probabilistic programming languages: A Qualitative Study.
- (C9) Wang, Wu, Essertel, Decker, Rompf. Demystifying Differentiable Programming: Shift/Reset the Penultimate Backpropagator.
- (C10) Hur, Nori, Rajamani, Samuel. Slicing Probabilistic Programs. PLDI 2014.
Machine Learning for Software Engineering (Mezini)
- (C11) Raychev, Vechev, Krause. Predicting Program Properties from “Big Code”. POPL 2015.
- (C12) Proksch, Lerch, Mezini. Intelligent code completion with Bayesian networks. TSE 2015.
- (C13) Raychev, Vechev, Yahav. Code completion with statistical language models. PLDI 2014.
- (C14) Bichsel, Raychev, Tsankov, Vechev. Statistical deobfuscation of android applications. CCS 2016.
- (C15) Amann, Nguyen, Nadi, Nguyen, Mezini. A Systematic Evaluation of Static API-Misuse Detectors. TSE 2018.
- (C16) Amann, Nguyen, Nadi, Nguyen, Mezini. MuDetect: The Next Step in Static API-Misuse Detection (Available on request).
Hardware for Machine Learning (Koch)
- (D1) C. Farabet, B. Martini, B. Corda, P. Akselrod, E. Culurciello and Y. LeCun, “NeuFlow: A runtime reconfigurable dataflow processor for vision,” CVPR 2011 WORKSHOPS, Colorado Springs, CO, 2011, pp. 109-116.
- (D2) Jouppi, Norman P., et al. “In-Datacenter Performance Analysis of a Tensor Processing Unit.” 44th Annual International Symposium on Computer Architecture (ISCA), 2017.
- (D3) Chen, T., Du, Z., Sun, N., Wang, J., Wu, C., Chen, Y., & Temam, O. (2014). Diannao: A small-footprint high-throughput accelerator for ubiquitous machine-learning. ACM Sigplan Notices, 49(4), 269-284.
- (D4) Y. Chen, T. Krishna, J. S. Emer and V. Sze, “Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks,” in IEEE Journal of Solid-State Circuits, vol. 52, no. 1, pp. 127-138, Jan. 2017.
- (D5) S. Han et al., “EIE: Efficient Inference Engine on Compressed Deep Neural Network,” 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), Seoul, 2016, pp. 243-254.
- (D6) A. Parashar et al., “SCNN: An accelerator for compressed-sparse convolutional neural networks,” 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), Toronto, ON, 2017, pp. 27-40.
- (D7) H. Sharma, J. Park, D. Mahajan, E. Amaro, J. K. Kim, C. Shao, A. Mishra, H. Esmaeilzadeh, “From High-Level Deep Neural Models to FPGAs”, in the Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2016.
- (D8) Kwon, Hyoukjun, Ananda Samajdar, and Tushar Krishna. “MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Reconfigurable Interconnects.” Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 2018.