NeurIPS 2018 — Conference Summary & Reading List

Krishna Sankar
Dec 10, 2018

The goal of this blog is three-fold:

  1. For those who didn’t attend, give a quick impression of the conference;
  2. Provide a list of curated papers to read (I have links to ~60 papers).
  • The conference accepted ~1000 papers, so it is impossible to go through all of them.
  3. Finally, this blog is also for my own reference — otherwise I will lose the thoughts as life’s more urgent trifles overcome the important academic pursuits.

P.S. (Dec 12, 2018): The blog is in a rough form. The graphics do not translate well from Word to Medium. I will clean up the blog; if I wait too long, I will lose the motivation to publish, and an ugly blog will motivate me to clean it up! Also, I am leaving Montreal tomorrow and a stack of work awaits me on the other end, so I need to publish before I take off.

Down below, I have included links to all the papers by category, Medium links to selected papers and, finally, Facebook videos of the sessions. Again, this is meant as a single reference point for me and others.

There are two ways of looking at a massive, intellectually packed conference like NeurIPS.

  • If one has a research (or an implementation) focus, it is relatively easy to parse through all the relevant papers.
  • But if one tries to absorb the conference from a broader perspective, it is a herculean task. The breadth and depth will overwhelm you — at least that is what I experienced.

Suggestions on how to get the most out of NeurIPS or a similarly packed conference:

  1. Go in a group — I really missed my colleagues!!
  2. Read and discuss papers before going in.
  3. Have people with different focus areas — one person can’t cover them all.

Initial impressions

Everybody in Montreal knew about the “AI Conference”. From the airport to the ice sculptures, everyone welcomed us! Of course, they knew the nerds are not going to conquer Canada! And Montreal is an AI startup powerhouse!

The conference is Yuuuge! 1011 papers, ~8000 attendees, posters and workshops covering a spectrum of algorithms, concepts, practices, experiments and ideas — all well researched with a due amount of theory.

To match the Yuuuge-ness of the conference, the blog is also Yuuuge! Read Part 1 and you will get a good idea. Part 2 has links to papers that you can go through at your convenience. I plan to do the same.

When looking for papers and posters, one main question is the proverbial exploit vs. explore, i.e. exploit already-known domains and algorithms vs. look for new and interesting ideas (the toy bandit sketch below makes the analogy concrete). Moreover, w.r.t. new ideas, we won’t know their influence until later, sometimes only 1–2 years from now. The algorithms and ideas need to be studied, applied, tweaked, combined and finally deployed to solve problems. And, of course, implemented in frameworks like TensorFlow and PyTorch.
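As a playful aside (my own toy illustration, not from any conference material), an epsilon-greedy bandit captures the trade-off: with probability epsilon you explore a random arm (a new topic), otherwise you exploit the best arm seen so far.

```python
# Toy epsilon-greedy bandit, purely illustrative of the explore/exploit trade-off.
import random

def epsilon_greedy(true_rewards, epsilon=0.1, steps=1000):
    n_arms = len(true_rewards)
    counts = [0] * n_arms          # pulls per arm
    estimates = [0.0] * n_arms     # running mean reward per arm
    total = 0.0
    for _ in range(steps):
        if random.random() < epsilon:
            arm = random.randrange(n_arms)                        # explore a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit the best-looking arm
        reward = random.gauss(true_rewards[arm], 1.0)             # noisy payoff
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm] # incremental mean update
        total += reward
    return total, estimates

# e.g. epsilon_greedy([0.2, 0.5, 0.9]) — the 0.9 arm should end up with the highest estimate.
```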

NeurIPS’18 Conference Theme:

From what I saw, the broader themes of the conference were accountability and algorithmic bias, trust, robustness and, most importantly, diversity. You can see this in the keynotes and workshops.

These questions resonated well with me:

  • Is ML truly ready for real-world deployment?
  • Can we truly rely on ML?
  • ML predictions are (mostly) accurate but brittle! (see the small sketch after this list)
  • We need to understand the “failure modes” of ML.
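To make “brittle” concrete, here is a minimal sketch of the classic fast gradient sign method (FGSM). It is my own illustration, not taken from any specific NeurIPS’18 paper, and it assumes hypothetical names `model`, `images`, `labels` for a PyTorch classifier and a batch of correctly classified inputs.

```python
# A tiny perturbation in the direction of the loss gradient is often enough
# to flip an otherwise-correct prediction — "accurate but brittle".
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Return an FGSM-perturbed copy of x; model/x/y are placeholder names."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step each input a small amount in the direction that increases the loss,
    # then clamp back to a valid [0, 1] image range.
    x_adv = (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0)
    return x_adv.detach()

# Usage sketch (hypothetical names):
#   perturbed = fgsm_attack(model, images, labels)
#   acc = (model(perturbed).argmax(dim=1) == labels).float().mean()
#   # accuracy typically drops sharply even though `perturbed` looks unchanged to the eye
```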

My focus:

I was looking for work in the following areas:

  1. DRL
  2. Conversational AI
  3. Machine Reasoning
  4. Structured Memory & Long term memory
  5. Capsule Networks
  6. Smart mobility (orchestration and fleet management using AI/RL)
  7. Object detection (always my favorite)
  8. Embedded ML (to try out on my new iPad Pro’s neural chip and the Swift language)
  9. Integration of knowledge and knowledge graphs

NeurIPS’18 Topics Summary:

I did find all of my topics in some measure, though not equally represented. The actual topics were varied:

  1. Embeddings, DRL, GANs: all represented very well, probably the majority of papers
  • Many papers on other adversarial topics, incl. Wasserstein methods (a quick refresher follows this list)
  • Graphical GANs to represent structured data
  2. Question-and-answer algorithms were covered somewhat well
  3. VAEs (Variational Autoencoders)
  4. Another important topic stream was training, optimization, scalability (i.e., training on massive-dimensional data) and speedup, a sign of maturing domains
  • Learning from noisy data, the ability to learn the noise in data, learning from smaller and more diverse data sets — a topic that can make models robust to changing data distributions
  5. Lots of papers on Bayesian methods at multiple levels
  6. Causal models — many papers and one very good tutorial
  7. Interesting topics — a few papers on each:
  • Relational reasoning and relational networks — interesting; moving towards knowledge graphs?
  • SACNNs (structure-aware CNNs) — an interesting concept
  • Even some time-series papers!
  • Attention mechanisms — even a paper on nagging DRL networks (kind of)!
  • Spiking Neural Networks — a few papers on this interesting concept
  • Langevin dynamics and associated methods
  • Even a paper on Runge–Kutta discretization!
  8. Many papers on optimization, clustering algorithms, quantifying uncertainty in clustering and so forth
  9. Only 3 papers on CapsuleNet (judging from the titles). I was expecting more.
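Since Wasserstein methods came up repeatedly (and I have a soft spot for WGANs, see paper #5 in the reading list), here is a quick refresher. This is just the standard textbook formulation, not tied to any particular paper above: the Kantorovich–Rubinstein dual form of the Wasserstein-1 distance, which a WGAN critic approximates with a Lipschitz-constrained network.

```latex
% Wasserstein-1 distance between the data distribution P_r and the generator
% distribution P_g, in its Kantorovich–Rubinstein dual form; the supremum is
% over all 1-Lipschitz functions f, which the WGAN critic approximates.
W(P_r, P_g) \;=\; \sup_{\lVert f \rVert_{L} \le 1}
  \mathbb{E}_{x \sim P_r}\!\left[ f(x) \right]
  \;-\; \mathbb{E}_{x \sim P_g}\!\left[ f(x) \right]
```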

For the detail-oriented, I have good links to follow (in addition to my reading list of 58 papers at the end of this blog, … yes there is an end …)

NeurIPS Workshops:

There were multiple excellent workshops and I wanted to attend them all! I attended two.

DRL Workshop — Yuuuge! Room capacity ~3000, with ~2000 attendees in and out.

NeurIPS Keynotes:

Some interesting keynotes and talks.

The videos are published on Facebook; the link is down below. I have captured the time and date of the events so that you can match them with the Facebook videos. It is not ideal, but at least there is a way.

  • This talk was well received. A view from the biological world.
  • The adversarial tutorial was interesting.
  • Counterfactual Inference — definitely requires a second look.
  • I missed the one on financial services; I plan to ask the organizers for the materials.
  • Interesting invited talks.
  • Finally, the Montreal Declaration, presented by Prof. Yoshua Bengio.

NeurIPS’18 — Awards

  1. Best Paper Awards:
  • Non-delusional Q-learning and Value-iteration
    By: Tyler Lu · Dale Schuurmans · Craig Boutilier
  • Optimal Algorithms for Non-Smooth Distributed Optimization in Networks
    By: Kevin Scaman · Francis Bach · Sebastien Bubeck · Laurent Massoulié · Yin Tat Lee
  • Nearly Tight Sample Complexity Bounds for Learning Mixtures of Gaussians via Sample Compression Schemes
    By: Hassan Ashtiani · Shai Ben-David · Nick Harvey · Christopher Liaw · Abbas Mehrabian · Yaniv Plan
  • Neural Ordinary Differential Equations
    By: Tian Qi Chen · Yulia Rubanova · Jesse Bettencourt · David Duvenaud
  • Coverage of the Neural ODE paper: https://www.technologyreview.com/s/612561/a-radical-new-neural-network-design-could-overcome-big-challenges-in-ai/

  2. Test of Time Award: “The Tradeoffs of Large Scale Learning” by Léon Bottou and Olivier Bousquet

Part 2: NeurIPS’18 — The Gory Details & Reading List

Now let me dig into the core of the conference — the papers, posters, discussions and the rest. I have listed the links for all 1011 papers, followed by my reading list of 58 papers.

My Reading List

  • Finally, the reading list from my explorations of the 1011 papers!

I ran out of time to do it by last week, unfortunately. I have to pack, sleep and catch a plane. I have the list in Word, but it doesn’t translate well to Medium, so I have to do it one by one. I will finish it on Tuesday. [11/16/18: Finished!]

Also, many times the materials are distributed, i.e. the talk session shows spotlight slides while the poster session has links to the paper and video. I have tried to hunt down all the materials and keep them together in one place along with the paper.

  1. Dendritic cortical microcircuits approximate the backpropagation algorithm by João Sacramento · Rui Ponte Costa · Yoshua Bengio · Walter Senn
  2. Spectral Filtering for General Linear Dynamical Systems by Elad Hazan · Holden Lee · Karan Singh · Cyril Zhang · Yi Zhang
  3. DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors by Arash Vahdat · Evgeny Andriyash · William Macready
  4. GumBolt: Extending Gumbel trick to Boltzmann priors by Amir H Khoshaman · Mohammad Amin
  5. Banach Wasserstein GAN by Jonas Adler · Sebastian Lunz, in Tue Poster Session A (P.S.: I have an interest in Wasserstein Generative Adversarial Networks (WGANs))
  6. Are GANs Created Equal? A Large-Scale Study by Mario Lucic · Karol Kurach · Marcin Michalski · Sylvain Gelly · Olivier Bousquet
  7. Fast and Effective Robustness Certification by Gagandeep Singh · Timon Gehr · Matthew Mirman · Markus Püschel · Martin Vechev (Interesting idea - certifying neural network robustness based on abstract interpretation !)
  8. FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction by Shuyang Sun · Jiangmiao Pang · Jianping Shi · Shuai Yi · Wanli Ouyang
  9. Pelee: A Real-Time Object Detection System on Mobile Devices by Jun Wang · Tanner Bohn · Charles Ling (Plan to try on apple’s neural chip — work for my new iPad Pro !)
  10. Kalman Normalization: Normalizing Internal Representations Across Network Layers by Guangrun Wang · Jiefeng Peng · Ping Luo · Xinjiang Wang · Liang Lin
  11. CapProNet: Deep Feature Learning via Orthogonal Projections onto Capsule Subspaces By Liheng Zhang · Marzieh Edraki · Guo-Jun Qi (One of the three papers on Capsule Networks — I was expecting more)
  12. HitNet: Hybrid Ternary Recurrent Neural Network by Peiqi Wang · Xinfeng Xie · Lei Deng · Guoqi Li · Dongsheng Wang · Yuan Xie (Interesting approach to balance accuracy and quantization)
  13. The Importance of Sampling in Meta-Reinforcement Learning by Bradly Stadie · Ge Yang · Rein Houthooft · Peter Chen · Yan Duan · Yuhuai Wu · Pieter Abbeel · Ilya Sutskever
  14. Variational Memory Encoder-Decoder by Hung Le · Truyen Tran · Thin Nguyen · Svetha Venkatesh (Conversational)
  15. On the Dimensionality of Word Embedding by Zi Yin · Yuanyuan Shen
  16. Mesh-TensorFlow: Deep Learning for Supercomputers by Noam Shazeer · Youlong Cheng · Niki Parmar · Dustin Tran · Ashish Vaswani · Penporn Koanantakool · Peter Hawkins · HyoukJoong Lee · Mingsheng Hong · Cliff Young · Ryan Sepassi · Blake Hechtman
  17. Robot Learning in Homes: Improving Generalization and Reducing Dataset Bias by Abhinav Gupta · Adithyavairavan Murali · Dhiraj Prakashchand Gandhi · Lerrel Pinto
  18. Evolved Policy Gradients by Rein Houthooft · Yuhua Chen · Phillip Isola · Bradly Stadie · Filip Wolski · Jonathan Ho (OpenAI) · Pieter Abbeel
  19. Bias and Generalization in Deep Generative Models: An Empirical Study by Shengjia Zhao · Hongyu Ren · Arianna Yuan · Jiaming Song · Noah Goodman · Stefano Ermon
  20. How Does Batch Normalization Help Optimization? by Shibani Santurkar · Dimitris Tsipras · Andrew Ilyas · Aleksander Madry (A Medium post on this topic)
  21. Step Size Matters in Deep Learning by Kamil Nar · Shankar Sastry (Slides)
  22. Precision and Recall for Time Series By Nesime Tatbul · Tae Jun Lee · Stan Zdonik · Mejbah Alam · Justin Gottschlich (Spotlight Slides)
  23. Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation by Matthew O’Kelly · Aman Sinha · Hongseok Namkoong · Russ Tedrake · John Duchi
  24. Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding By Nan Rosemary Ke · Anirudh Goyal · Olexa Bilaniuk · Jonathan Binas · Michael Mozer · Chris Pal · Yoshua Bengio (Spotlight Slides)
  25. Chain of Reasoning for Visual Question Answering by Chenfei Wu · Jinlai Liu · Xiaojie Wang · Xuan Dong
  26. Distilled Wasserstein Learning for Word Embedding and Topic Modeling By Hongteng Xu · Wenlin Wang · Wei Liu · Lawrence Carin
  27. Exploration in Structured Reinforcement Learning By Jungseul Ok · Alexandre Proutiere · Damianos Tranos
  28. Recurrent Transformer Networks for Semantic Correspondence by Seungryong Kim · Stephen Lin · SANG RYUL JEON · Dongbo Min · Kwanghoon Sohn (Spotlight Slides)
  29. Hamiltonian Variational Auto-Encoder By Anthony L Caterini · Arnaud Doucet · Dino Sejdinovic
  30. How to Start Training: The Effect of Initialization and Architecture By Boris Hanin · David Rolnick
  31. Revisiting (ϵ,γ,τ)-similarity learning for domain adaptation By Sofiane Dhouib · Ievgen Redko (Spotlight Slides)
  32. Is Q-Learning Provably Efficient? By Chi Jin · Zeyuan Allen-Zhu · Sebastien Bubeck · Michael Jordan
  33. Monte-Carlo Tree Search for Constrained POMDPs By Jongmin Lee · Geon-hyeong Kim · Pascal Poupart · Kee-Eung Kim
  34. Policy Optimization via Importance Sampling By Alberto Maria Metelli · Matteo Papini · Francesco Faccio · Marcello Restelli
  35. Reducing Network Agnostophobia By Akshay Raj Dhamija · Manuel Günther · Terrance Boult (A very real problem when deploying models: Agnostophobia, the fear of the unknown, can be experienced by deep learning engineers while applying their networks to real-world applications. Unfortunately, network behavior is not well defined for inputs far from a network’s training set)
  36. Are ResNets Provably Better than Linear Predictors? By Ohad Shamir
  37. Reinforcement Learning for Solving the Vehicle Routing Problem By MohammadReza Nazari · Afshin Oroojlooy · Lawrence Snyder · Martin Takac
  38. Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning By Tom Zahavy · Matan Haroush · Nadav Merlis · Daniel J Mankowitz · Shie Mannor
  39. Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents By Edoardo Conti · Vashisht Madhavan · Felipe Petroski Such · Joel Lehman · Kenneth Stanley · Jeff Clune (The concept of novelty seeking agents somehow seems wrong ;o))
  40. Dual Policy Iteration By Wen Sun · Geoffrey Gordon · Byron Boots · J. Bagnell (Dual Policy Iteration looks very interesting. Might be able to solve class of problems that a single policy layer can’t solve)
  41. Online Robust Policy Learning in the Presence of Unknown Adversaries By Aaron Havens · Zhanhong Jiang · Soumik Sarkar
  42. Learning to Navigate in Cities Without a Map By Piotr Mirowski · Matt Grimes · Mateusz Malinowski · Karl Moritz Hermann · Keith Anderson · Denis Teplyashin · Karen Simonyan · Koray Kavukcuoglu · Andrew Zisserman · Raia Hadsell
  43. Fighting Boredom in Recommender Systems with Linear Reinforcement Learning By Romain WARLOP · Alessandro Lazaric · Jérémie Mary (Interesting concept. I am always partial to adding serendipity to recommendations)
  44. Actor-Critic Policy Optimization in Partially Observable Multiagent Environments By Sriram Srinivasan · Marc Lanctot · Vinicius Zambaldi · Julien Perolat · Karl Tuyls · Remi Munos · Michael Bowling
  45. Learning to Share and Hide Intentions using Information Regularization By Daniel Strouse · Max Kleiman-Weiner · Josh Tenenbaum · Matt Botvinick · David Schwab
  46. Teaching Inverse Reinforcement Learners via Features and Demonstrations By Luis Haug · Sebastian Tschiatschek · Adish Singla (Different world view between a teacher and student is an interesting problem)
  47. Why Is My Classifier Discriminatory? By Irene Chen · Fredrik Johansson · David Sontag (Spotlight Slides)
  48. Wasserstein Variational Inference By Luca Ambrogioni · Umut Güçlü · Yağmur Güçlütürk · Max Hinne · Marcel A. J. van Gerven · Eric Maris
  49. FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network By Aditya Kusupati · Manish Singh · Kush Bhatia · Ashish Kumar · Prateek Jain · Manik Varma
  50. Understanding Batch Normalization By Nils Bjorck · Carla P Gomes · Bart Selman · Kilian Weinberger
  51. Towards Deep Conversational Recommendations By Raymond Li · Samira Ebrahimi Kahou · Hannes Schulz · Vincent Michalski · Laurent Charlin · Chris Pal
  52. Non-delusional Q-learning and value-iteration By Tyler Lu · Dale Schuurmans · Craig Boutilier (Google’s paper, one of the best paper awards)
  53. Fully Understanding The Hashing Trick by Lior Kamma · Casper B. Freksen · Kasper Green Larsen (Spotlight Slides)
  54. When do random forests fail? By Cheng Tang · Damien Garreau · Ulrike von Luxburg
  55. Norm matters: efficient and accurate normalization schemes in deep networks By Elad Hoffer · Ron Banner · Itay Golan · Daniel Soudry (Spotlight Slides)
  56. Out-of-Distribution Detection using Multiple Semantic Label Representations by Gabi Shalev · Yossi Adi · Joseph Keshet
  57. A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks By Kimin Lee · Kibok Lee · Honglak Lee · Jinwoo Shin (Spotlight Slides)
  58. Recurrent World Models Facilitate Policy Evolution by David Ha · Jürgen Schmidhuber
