NIPS 2018 most cited papers

Below are NIPS 2018 papers ranked by number of citations. The citation counts were collected by hand from Google Scholar on September 25, 2019, so they may be outdated or contain transcription errors.

Rank	Cited by	Paper name
0	210	Glow: Generative Flow with Invertible 1×1 Convolutions
1	186	Are GANs Created Equal? A Large-Scale Study
2	180	Neural Ordinary Differential Equations
3	176	Visualizing the Loss Landscape of Neural Nets
4	123	How Does Batch Normalization Help Optimization?
5	114	Isolating Sources of Disentanglement in Variational Autoencoders
6	110	Video-to-Video Synthesis
7	98	Natasha 2: Faster Non-Convex Optimization Than SGD
8	95	PointCNN: Convolution On X-Transformed Points
9	93	Adversarially Robust Generalization Requires More Data
10	84	Realistic Evaluation of Deep Semi-Supervised Learning Algorithms
11	81	Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
12	77	Scaling provable adversarial defenses
13	76	Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
14	74	Derivative Estimation in Random Design
15	73	Neural Tangent Kernel: Convergence and Generalization in Neural Networks
16	70	An intriguing failing of convolutional neural networks and the CoordConv solution
17	70	Neural Architecture Optimization
18	70	Data-Efficient Hierarchical Reinforcement Learning
19	69	Sanity Checks for Saliency Maps
20	68	Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
21	67	Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
22	65	Neural Architecture Search with Bayesian Optimisation and Optimal Transport
23	62	TADAM: Task dependent adaptive metric for improved few-shot learning
24	58	Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels
25	57	Probabilistic Model-Agnostic Meta-Learning
26	57	On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
27	56	Searching for Efficient Multi-Scale Architectures for Dense Image Prediction
28	56	SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator
29	55	Playing hard exploration games by watching YouTube
30	55	Recurrent World Models Facilitate Policy Evolution
31	55	Conditional Adversarial Domain Adaptation
32	54	CatBoost: unbiased boosting with categorical features
33	54	Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
34	53	Co-teaching: Robust training of deep neural networks with extremely noisy labels
35	52	Neural Voice Cloning with a Few Samples
36	52	Adversarial vulnerability for any classifier
37	51	Hierarchical Graph Representation Learning with Differentiable Pooling
38	50	Gradient Sparsification for Communication-Efficient Distributed Optimization
39	49	Stochastic Cubic Regularization for Fast Nonconvex Optimization
40	48	Bilinear Attention Networks
41	47	SNIPER: Efficient Multi-Scale Training
42	46	NEON2: Finding Local Minima via First-Order Oracles
43	44	First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time
44	43	DropBlock: A regularization method for convolutional networks
45	43	Visual Reinforcement Learning with Imagined Goals
46	42	Empirical Risk Minimization Under Fairness Constraints
47	41	Link Prediction Based on Graph Neural Networks
48	41	Learning to Navigate in Cities Without a Map
49	41	PacGAN: The power of two samples in generative adversarial networks
50	41	Gradient Descent for Spiking Neural Networks
51	40	Implicit Bias of Gradient Descent on Linear Convolutional Networks
52	40	Learning to Infer Graphics Programs from Hand-Drawn Images
53	40	Is Q-Learning Provably Efficient?
54	38	Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks
55	38	Meta-Reinforcement Learning of Structured Exploration Strategies
56	37	A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks
57	37	Pelee: A Real-Time Object Detection System on Mobile Devices
58	36	Understanding Batch Normalization
59	36	Unsupervised Text Style Transfer using Language Models as Discriminators
60	36	DeepProbLog: Neural Probabilistic Logic Programming
61	36	Recurrent Relational Networks
62	35	Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures
63	35	Predictive Uncertainty Estimation via Prior Networks
64	35	Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks
65	35	Why Is My Classifier Discriminatory?
66	35	Non-Local Recurrent Network for Image Restoration
67	35	Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
68	34	Generalisation in humans and deep neural networks
69	34	Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator
70	34	Discrimination-aware Channel Pruning for Deep Neural Networks
71	34	Long short-term memory and Learning-to-learn in networks of spiking neurons
72	34	Implicit Reparameterization Gradients
73	33	Joint Autoregressive and Hierarchical Priors for Learned Image Compression
74	33	Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise
75	33	Efficient Neural Network Robustness Certification with General Activation Functions
76	33	Multi-Task Learning as Multi-Objective Optimization
77	32	Constrained Graph Variational Autoencoders for Molecule Design
78	32	A Probabilistic U-Net for Segmentation of Ambiguous Images
79	32	Assessing Generative Models via Precision and Recall
80	31	Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
81	31	Randomized Prior Functions for Deep Reinforcement Learning
82	31	LF-Net: Learning Local Features from Images
83	31	Adversarial Examples that Fool both Computer Vision and Time-Limited Humans
84	31	Meta-Gradient Reinforcement Learning
85	31	Image-to-image translation for cross-domain disentanglement
86	31	Large Margin Deep Networks for Classification
87	30	Semidefinite relaxations for certifying robustness to adversarial examples
88	30	Reinforcement Learning for Solving the Vehicle Routing Problem
89	30	Evolved Policy Gradients
90	30	Byzantine Stochastic Gradient Descent
91	30	Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization
92	29	A Unified View of Piecewise Linear Neural Network Verification
93	29	Sparsified SGD with Memory
94	29	Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization
95	29	Tree-to-tree Neural Networks for Program Translation
96	28	Unsupervised Attention-guided Image-to-Image Translation
97	28	Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning
98	27	3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data
99	27	The challenge of realistic music generation: modelling raw audio at scale
100	27	Speaker-Follower Models for Vision-and-Language Navigation
101	27	Entropy and mutual information in models of deep neural networks
102	27	FRAGE: Frequency-Agnostic Word Representation
103	26	Fast and Effective Robustness Certification
104	26	Flexible neural representation for physics prediction
105	26	Does mitigating ML’s impact disparity require treatment disparity?
106	26	Verifiable Reinforcement Learning via Policy Extraction
107	25	Balanced Policy Evaluation and Learning
108	25	Reinforcement Learning of Theorem Proving
109	25	Learning Plannable Representations with Causal InfoGAN
110	25	A Lyapunov-based Approach to Safe Reinforcement Learning
111	25	Neural Arithmetic Logic Units
112	25	Training Deep Neural Networks with 8-bit Floating Point Numbers
113	25	Relational recurrent neural networks
114	25	ResNet with one-neuron hidden layers is a Universal Approximator
115	25	Optimal Algorithms for Non-Smooth Distributed Optimization in Networks
116	25	How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD
117	25	Which Neural Net Architectures Give Rise to Exploding and Vanishing Gradients?
118	25	Learning to Decompose and Disentangle Representations for Video Prediction
119	24	Deep State Space Models for Time Series Forecasting
120	24	Towards Robust Interpretability with Self-Explaining Neural Networks
121	24	Learning Attentional Communication for Multi-Agent Cooperation
122	24	The Convergence of Sparsified Gradient Methods
123	24	Task-Driven Convolutional Recurrent Models of the Visual System
124	24	SimplE Embedding for Link Prediction in Knowledge Graphs
125	24	How to Start Training: The Effect of Initialization and Architecture
126	24	IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis
127	23	Bayesian Model-Agnostic Meta-Learning
128	23	Memory Replay GANs: Learning to Generate New Categories without Forgetting
129	23	LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning
130	23	Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
131	23	Online Learning with an Unknown Fairness Metric
132	23	Neural Nearest Neighbors Networks
133	22	ATOMO: Communication-efficient Learning via Atomic Sparsification
134	22	GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration
135	22	End-to-End Differentiable Physics for Learning and Control
136	22	Efficient Formal Safety Analysis of Neural Networks
137	22	Probabilistic Matrix Factorization for Automated Machine Learning
138	22	Re-evaluating evaluation
139	22	Delta-encoder: an effective sample synthesis method for few-shot object recognition
140	22	GIANT: Globally Improved Approximate Newton Method for Distributed Optimization
141	22	Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate
142	22	Neighbourhood Consensus Networks
143	22	Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search
144	21	Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks
145	21	Insights on representational similarity in neural networks with canonical correlation
146	21	Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
147	21	Direct Runge-Kutta Discretization Achieves Acceleration
148	21	SLAYER: Spike Layer Error Reassignment in Time
149	20	Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
150	20	Phase Retrieval Under a Generative Prior
151	20	Differentiable MPC for End-to-end Planning and Control
152	20	On gradient regularizers for MMD GANs
153	20	To Trust Or Not To Trust A Classifier
154	20	Fairness Through Computationally-Bounded Awareness
155	20	Learning to Optimize Tensor Programs
156	20	Evidential Deep Learning to Quantify Classification Uncertainty
157	20	Moonshine: Distilling with Cheap Convolutions
158	20	A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem
159	20	Deep Attentive Tracking via Reciprocative Learning
160	20	A^2-Nets: Double Attention Networks
161	19	Clebsch–Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network
162	19	Latent Alignment and Variational Attention
163	19	Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects
164	19	Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion
165	19	Reward learning from human preferences and demonstrations in Atari
166	19	Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces
167	19	Banach Wasserstein GAN
168	19	Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching
169	19	Amortized Inference Regularization
170	19	MetaGAN: An Adversarial Approach to Few-Shot Learning
171	19	Reinforced Continual Learning
172	19	Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives
173	18	Theoretical Linear Convergence of Unfolded ISTA and Its Practical Weights and Thresholds
174	18	Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization
175	18	Constructing Unrestricted Adversarial Examples with Generative Models
176	18	Hybrid Macro/Micro Level Backpropagation for Training Deep Spiking Neural Networks
177	18	Dimensionally Tight Bounds for Second-Order Hamiltonian Monte Carlo
178	18	Spectral Filtering for General Linear Dynamical Systems
179	18	Adaptive Sampling Towards Fast Graph Representation Learning
180	18	On the Dimensionality of Word Embedding
181	18	Are ResNets Provably Better than Linear Predictors?
182	18	Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced
183	17	Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies
184	17	Adaptive Methods for Nonconvex Optimization
185	17	Learning to Play With Intrinsically-Motivated, Self-Aware Agents
186	17	Communication Compression for Decentralized Training
187	17	Masking: A New Perspective of Noisy Supervision
188	17	Hyperbolic Neural Networks
189	17	Faster Neural Networks Straight from JPEG
190	17	Online Structured Laplace Approximations for Overcoming Catastrophic Forgetting
191	17	Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization
192	17	Actor-Critic Policy Optimization in Partially Observable Multiagent Environments
193	17	The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal
194	17	Norm matters: efficient and accurate normalization schemes in deep networks
195	17	Generalized Zero-Shot Learning with Deep Calibration Network
196	17	FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification
197	16	Deep Generative Models with Learnable Knowledge Constraints
198	16	Deepcode: Feedback Codes via Deep Learning
199	16	The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization
200	16	Watch Your Step: Learning Node Embeddings via Graph Attention
201	16	FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network
202	16	Adversarial Multiple Source Domain Adaptation
203	16	Bayesian Control of Large MDPs with Unknown Dynamics in Data-Poor Environments
204	16	Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples
205	16	Multi-Agent Generative Adversarial Imitation Learning
206	16	A Bayes-Sard Cubature Method
207	16	Generalizing to Unseen Domains via Adversarial Data Augmentation
208	16	On Learning Intrinsic Rewards for Policy Gradient Methods
209	16	Towards Robust Detection of Adversarial Examples
210	16	Adding One Neuron Can Eliminate All Bad Local Minima
211	16	Unsupervised Learning of Shape and Pose with Differentiable Point Clouds
212	16	Representation Balancing MDPs for Off-policy Policy Evaluation
213	16	A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation
214	16	A Linear Speedup Analysis of Distributed Deep Learning with Sparse and Quantized Communication
215	16	Embedding Logical Queries on Knowledge Graphs
216	16	Learning Deep Disentangled Embeddings With the F-Statistic Loss
217	15	Mesh-TensorFlow: Deep Learning for Supercomputers
218	15	A Stein variational Newton method
219	15	Learning Conditioned Graph Structures for Interpretable Visual Question Answering
220	15	cpSGD: Communication-efficient and differentially-private distributed SGD
221	15	Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction
222	15	Constrained Generation of Semantically Valid Graphs via Regularizing Variational Autoencoders
223	15	On the Convergence and Robustness of Training GANs with Regularized Optimal Transport
224	15	Multimodal Generative Models for Scalable Weakly-Supervised Learning
225	15	RetGK: Graph Kernels based on Return Probabilities of Random Walks
226	15	Multi-Layered Gradient Boosting Decision Trees
227	15	Domain-Invariant Projection Learning for Zero-Shot Recognition
228	15	Soft-Gated Warping-GAN for Pose-Guided Person Image Synthesis
229	15	MetaAnchor: Learning to Detect Objects with Customized Anchors
230	15	Visual Object Networks: Image Generation with Disentangled 3D Representations
231	14	Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters
232	14	Robust Learning of Fixed-Structure Bayesian Networks
233	14	Generalizing Point Embeddings using the Wasserstein Space of Elliptical Distributions
234	14	Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing
235	14	Robot Learning in Homes: Improving Generalization and Reducing Dataset Bias
236	14	Dendritic cortical microcircuits approximate the backpropagation algorithm
237	14	Spectral Signatures in Backdoor Attacks
238	14	VideoCapsuleNet: A Simplified Network for Action Detection
239	14	Simple, Distributed, and Accelerated Probabilistic Programming
240	14	Learning towards Minimum Hyperspherical Energy
241	14	On GANs and GMMs
242	14	Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model
243	14	Scalable methods for 8-bit training of neural networks
244	14	Adversarial Text Generation via Feature-Mover’s Distance
245	14	Importance Weighting and Variational Inference
246	14	Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering
247	14	Learning to Reconstruct Shapes from Unseen Classes
248	14	Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation
249	14	Where Do You Think You’re Going?: Inferring Beliefs about Dynamics from Behavior
250	14	Fairness Behind a Veil of Ignorance: A Welfare Analysis for Automated Decision Making
251	14	FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction
252	13	Co-regularized Alignment for Unsupervised Domain Adaptation
253	13	RenderNet: A deep convolutional network for differentiable rendering from 3D shapes
254	13	Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization
255	13	Distributed Multi-Player Bandits – a Game of Thrones Approach
256	13	Learning to Teach with Dynamic Loss Functions
257	13	Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance
258	13	Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization
259	13	Minimax Statistical Learning with Wasserstein distances
260	13	Non-monotone Submodular Maximization in Exponentially Fewer Iterations
261	13	Evolution-Guided Policy Gradient in Reinforcement Learning
262	13	Empirical Risk Minimization in Non-interactive Local Differential Privacy Revisited
263	13	Self-Erasing Network for Integral Object Attention
264	13	PAC-learning in the presence of adversaries
265	12	Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound
266	12	Neural Proximal Gradient Descent for Compressive Imaging
267	12	Confounding-Robust Policy Improvement
268	12	Reducing Network Agnostophobia
269	12	DeepPINK: reproducible feature selection in deep neural networks
270	12	Data-dependent PAC-Bayes priors via differential privacy
271	12	Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding
272	12	Knowledge Distillation by On-the-Fly Native Ensemble
273	12	Dual Policy Iteration
274	12	Differentially Private Testing of Identity and Closeness of Discrete Distributions
275	12	On Fast Leverage Score Sampling and Optimal Learning
276	12	How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery?
277	12	A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization
278	12	COLA: Decentralized Linear Learning
279	12	A theory on the absence of spurious solutions for nonconvex and nonsmooth optimization
280	12	DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors
281	12	Simple random search of static linear policies is competitive for reinforcement learning
282	12	Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization
283	12	Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
284	12	KDGAN: Knowledge Distillation with Generative Adversarial Networks
285	12	Do Less, Get More: Streaming Submodular Maximization with Subsampling
286	12	Learning Disentangled Joint Continuous and Discrete Representations
287	12	Dialog-based Interactive Image Retrieval
288	11	Context-aware Synthesis and Placement of Object Instances
289	11	Adversarial Risk and Robustness: General Definitions and Implications for the Uniform Distribution
290	11	Learning with SGD and Random Features
291	11	A Retrieve-and-Edit Framework for Predicting Structured Outputs
292	11	Deep Dynamical Modeling and Control of Unsteady Fluid Flows
293	11	Group Equivariant Capsule Networks
294	11	Adversarial Regularizers in Inverse Problems
295	11	Depth-Limited Solving for Imperfect-Information Games
296	11	Online Adaptive Methods, Universality and Acceleration
297	11	Privacy Amplification by Subsampling: Tight Analyses via Couplings and Divergences
298	11	Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming
299	11	Unsupervised Video Object Segmentation for Deep Reinforcement Learning
300	11	Fast Greedy MAP Inference for Determinantal Point Process to Improve Recommendation Diversity
301	11	Can We Gain More from Orthogonality Regularizations in Training Deep Networks?
302	11	Unsupervised Learning of Object Landmarks through Conditional Image Generation
303	11	Data center cooling using model-predictive control
304	11	Adversarial Attacks on Stochastic Bandits
305	11	Multivariate Convolutional Sparse Coding for Electromagnetic Brain Signals
306	11	Deep Reinforcement Learning of Marked Temporal Point Processes
307	11	One-Shot Unsupervised Cross Domain Translation
308	11	Distilled Wasserstein Learning for Word Embedding and Topic Modeling
309	11	Deep Defense: Training DNNs with Improved Adversarial Robustness
310	11	Sparse DNNs with Improved Adversarial Robustness
311	11	Learning long-range spatial dependencies with horizontal gated recurrent units
312	10	The Price of Fair PCA: One Extra dimension
313	10	Domain Adaptation by Using Causal Inference to Predict Invariant Conditional Distributions
314	10	Human-in-the-Loop Interpretability Prior
315	10	Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation
316	10	Large Scale computation of Means and Clusters for Persistence Diagrams using Optimal Transport
317	10	Deep Anomaly Detection Using Geometric Transformations
318	10	Hardware Conditioned Policies for Multi-Robot Transfer Learning
319	10	Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes
320	10	Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation
321	10	Plug-in Estimation in High-Dimensional Linear Inverse Problems: A Rigorous Analysis
322	10	Chaining Mutual Information and Tightening Generalization Bounds
323	10	Causal Inference with Noisy and Missing Covariates via Matrix Factorization
324	10	Scalable Hyperparameter Transfer Learning
325	10	Generative Probabilistic Novelty Detection with Adversarial Autoencoders
326	10	BRITS: Bidirectional Recurrent Imputation for Time Series
327	10	Compact Generalized Non-local Network
328	10	Recurrent Transformer Networks for Semantic Correspondence
329	10	A Dual Framework for Low-rank Tensor Completion
330	10	Policy Optimization via Importance Sampling
331	10	Boolean Decision Rules via Column Generation
332	10	MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare
333	10	Leveraging the Exact Likelihood of Deep Latent Variable Models
334	10	The committee machine: Computational to statistical gaps in learning a two-layers neural network
335	10	Paraphrasing Complex Network: Network Compression via Factor Transfer
336	10	Learning Hierarchical Semantic Image Manipulation through Structured Representations
337	10	Leveraged volume sampling for linear regression
338	10	3D-Aware Scene Manipulation via Inverse Graphics
339	10	On Oracle-Efficient PAC RL with Rich Observations
340	10	Adaptive Online Learning in Dynamic Environments
341	10	Posterior Concentration for Sparse Deep Learning
342	10	Deep Neural Nets with Interpolating Function as Output Activation
343	10	Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning
344	9	Learning to Share and Hide Intentions using Information Regularization
345	9	Deep Predictive Coding Network with Local Recurrent Processing for Object Recognition
346	9	Reversible Recurrent Neural Networks
347	9	Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition
348	9	Transfer Learning with Neural AutoML
349	9	Inference in Deep Gaussian Processes using Stochastic Gradient Hamiltonian Monte Carlo
350	9	Training Neural Networks Using Features Replay
351	9	On Coresets for Logistic Regression
352	9	Multi-View Silhouette and Depth Decomposition for High Resolution 3D Object Representation
353	9	Deep Generative Models for Distribution-Preserving Lossy Compression
354	9	Learning Task Specifications from Demonstrations
355	9	Deep Generative Markov State Models
356	9	TopRank: A practical algorithm for online stochastic ranking
357	9	Escaping Saddle Points in Constrained Optimization
358	9	Zeroth-order (Non)-Convex Stochastic Optimization via Conditional Gradient and Gradient Updates
359	9	Nearly tight sample complexity bounds for learning mixtures of Gaussians via sample compression schemes
360	9	Multivariate Time Series Imputation with Generative Adversarial Networks
361	9	Toddler-Inspired Visual Object Learning
362	9	Image Inpainting via Generative Multi-column Convolutional Neural Networks
363	8	Learning Temporal Point Processes via Reinforcement Learning
364	8	With Friends Like These, Who Needs Adversaries?
365	8	Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems
366	8	Learning Abstract Options
367	8	Improving Simple Models with Confidence Profiles
368	8	Robustness of conditional GANs to noisy labels
369	8	Blockwise Parallel Decoding for Deep Autoregressive Models
370	8	Persistence Fisher Kernel: A Riemannian Manifold Kernel for Persistence Diagrams
371	8	Maximizing acquisition functions for Bayesian optimization
372	8	Global Non-convex Optimization with Discretized Diffusions
373	8	Towards Understanding Learning Representations: To What Extent Do Different Neural Networks Learn the Same Representation
374	8	Beyond Grids: Learning Graph Representations for Visual Recognition
375	8	Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data
376	8	Efficient High Dimensional Bayesian Optimization with Additivity and Quadrature Fourier Features
377	8	Online Learning of Quantum States
378	8	Automatic differentiation in ML: Where we are and where we should be going
379	8	Generalisation of structural knowledge in the hippocampal-entorhinal system
380	8	Hamiltonian Variational Auto-Encoder
381	8	Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training
382	8	Approximate Knowledge Compilation by Online Collapsed Importance Sampling
383	8	Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering Langevin Monte Carlo
384	8	Distributed k-Clustering for Data with Heavy Noise
385	8	Learning Libraries of Subroutines for Neurally–Guided Bayesian Program Induction
386	8	Learning Loop Invariants for Program Verification
387	8	Towards Text Generation with Adversarially Learned Neural Outlines
388	8	Out-of-Distribution Detection using Multiple Semantic Label Representations
389	8	Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks
390	8	M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search
391	8	Incorporating Context into Language Encoding Models for fMRI
392	8	Approximating Real-Time Recurrent Learning with Random Kronecker Factors
393	8	Turbo Learning for CaptionBot and DrawingBot
394	8	L4: Practical loss-based stepsize adaptation for deep learning
395	8	Online convex optimization for cumulative constraints
396	8	Stacked Semantics-Guided Attention Model for Fine-Grained Zero-Shot Learning
397	8	CapProNet: Deep Feature Learning via Orthogonal Projections onto Capsule Subspaces
398	8	Content preserving text generation with attribute controls
399	8	On the Local Minima of the Empirical Risk
400	8	End-to-end Symmetry Preserving Inter-atomic Potential Energy Model for Finite and Extended Systems
401	8	Mean-field theory of graph neural networks in graph partitioning
402	8	Differentially Private Uniformly Most Powerful Tests for Binomial Data
403	8	Heterogeneous Bitwidth Binarization in Convolutional Neural Networks
404	8	Acceleration through Optimistic No-Regret Dynamics
405	8	Bayesian Inference of Temporal Task Specifications from Demonstrations
406	8	BinGAN: Learning Compact Binary Descriptors with a Regularized GAN
407	8	Neural Code Comprehension: A Learnable Representation of Code Semantics
408	8	Inequity aversion improves cooperation in intertemporal social dilemmas
409	8	Batch-Instance Normalization for Adaptively Style-Invariant Neural Networks
410	8	Local Differential Privacy for Evolving Data
411	8	Attention in Convolutional LSTM for Gesture Recognition
412	8	Symbolic Graph Reasoning Meets Convolutions
413	8	Collaborative Learning for Deep Neural Networks
414	8	Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners
415	8	Global Geometry of Multichannel Sparse Blind Deconvolution on the Sphere
416	8	MetaReg: Towards Domain Generalization using Meta-Regularization
417	8	Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks
418	8	LinkNet: Relational Embedding for Scene Graph
419	8	Nonlocal Neural Networks, Nonlocal Diffusion and Nonlocal Modeling
420	8	Deep Functional Dictionaries: Learning Consistent Semantic Structures on 3D Models from Functions
421	8	Self-Supervised Generation of Spatial Audio for 360° Video
422	8	See and Think: Disentangling Semantic Scene Completion
423	8	Geometrically Coupled Monte Carlo Sampling
424	7	Understanding Regularized Spectral Clustering via Graph Conductance
425	7	Connecting Optimization and Regularization Paths
426	7	Nonparametric Density Estimation under Adversarial Losses
427	7	Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming
428	7	Generalization Bounds for Uniformly Stable Algorithms
429	7	Towards Deep Conversational Recommendations
430	7	Ex ante coordination and collusion in zero-sum multi-player extensive-form games
431	7	Optimal Algorithms for Continuous Non-monotone Submodular and DR-Submodular Maximization
432	7	Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis
433	7	DAGs with NO TEARS: Continuous Optimization for Structure Learning
434	7	Quadrature-based features for kernel approximation
435	7	Differential Privacy for Growing Databases
436	7	HOUDINI: Lifelong Learning as Program Synthesis
437	7	A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks
438	7	How SGD Selects the Global Minima in Over-parameterized Learning: A Dynamical Stability Perspective
439	7	Robust Hypothesis Testing Using Wasserstein Uncertainty Sets
440	7	Streaming Kernel PCA with Õ(√n) Random Features
441	7	Learning Latent Subspaces in Variational Autoencoders
442	7	Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization
443	7	Information Constraints on Auto-Encoding Variational Bayes
444	7	Dual Swap Disentangling
445	7	A Convex Duality Framework for GANs
446	7	ChannelNets: Compact and Efficient Convolutional Neural Networks via Channel-Wise Convolutions
447	7	Neural Networks Trained to Solve Differential Equations Learn General Representations
448	7	Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning
449	7	But How Does It Work in Theory? Linear SVM with Random Features
450	7	Faithful Inversion of Generative Models for Effective Amortized Inference
451	7	Weakly Supervised Dense Event Captioning in Videos
452	7	Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes
453	7	Wasserstein Variational Inference
454	7	BourGAN: Generative Networks with Metric Embeddings
455	7	The Description Length of Deep Learning models
456	7	Trajectory Convolution for Action Recognition
457	7	Distributed Stochastic Optimization via Adaptive SGD
458	7	Bayesian Semi-supervised Learning with Graph Gaussian Processes
459	7	Multi-Class Learning: From Theory to Algorithm
460	7	Hybrid Knowledge Routed Modules for Large-scale Object Detection
461	7	A Game-Theoretic Approach to Recommendation Systems with Strategic Content Providers
462	7	Greedy Hash: Towards Fast Optimization for Accurate Hash Coding in CNN
463	7	A Model for Learned Bloom Filters and Optimizing by Sandwiching
464	7	How Many Samples are Needed to Estimate a Convolutional Neural Network?
465	7	Doubly Robust Bayesian Inference for Non-Stationary Streaming Data with \beta-Divergences
466	6	Forward Modeling for Partial Observation Strategy Games – A StarCraft Defogger
467	6	The Sparse Manifold Transform
468	6	Learning to Solve SMT Formulas
469	6	Bayesian Nonparametric Spectral Estimation
470	6	Thwarting Adversarial Examples: An L_0-Robust Sparse Fourier Transform
471	6	Online Robust Policy Learning in the Presence of Unknown Adversaries
472	6	Object-Oriented Dynamics Predictor
473	6	Improving Explorability in Variational Inference with Annealed Variational Objectives
474	6	Learning Compressed Transforms with Low Displacement Rank
475	6	Orthogonally Decoupled Variational Gaussian Processes
476	6	Wasserstein Distributionally Robust Kalman Filtering
477	6	Teaching Inverse Reinforcement Learners via Features and Demonstrations
478	6	Credit Assignment For Collective Multiagent RL With Global Rewards
479	6	Learning to Repair Software Vulnerabilities with Generative Adversarial Networks
480	6	Generative modeling for protein structures
481	6	Disconnected Manifold Learning for Generative Adversarial Networks
482	6	REFUEL: Exploring Sparse Features in Deep Reinforcement Learning for Fast Disease Diagnosis
483	6	BRUNO: A Deep Recurrent Model for Exchangeable Data
484	6	Manifold-tiling Localized Receptive Fields are Optimal in Similarity-preserving Neural Networks
485	6	Bayesian Alignments of Warped Multi-Output Gaussian Processes
486	6	Sharp Bounds for Generalized Uniformity Testing
487	6	Constructing Fast Network through Deconstruction of Convolution
488	6	Adversarially Robust Optimization with Gaussian Processes
489	6	Bandit Learning in Concave N-Person Games
490	6	Occam’s razor is insufficient to infer the preferences of irrational agents
491	6	The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network
492	6	Unsupervised Adversarial Invariance
493	6	Densely Connected Attention Propagation for Reading Comprehension
494	6	Training deep learning based denoisers without ground truth data
495	6	NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations
496	6	Norm-Ranging LSH for Maximum Inner Product Search
497	6	Learning a High Fidelity Pose Invariant Model for High-resolution Face Frontalization
498	6	Answerer in Questioner’s Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog
499	6	Model Agnostic Supervised Local Explanations
500	6	Modular Networks: Learning to Decompose Neural Computation
501	6	Structured Local Minima in Sparse Blind Deconvolution
502	6	Smoothed analysis of the low-rank approach for smooth semidefinite programs
503	6	Efficient Stochastic Gradient Hard Thresholding
504	6	Random Feature Stein Discrepancies
505	6	Variational Memory Encoder-Decoder
506	6	On Misinformation Containment in Online Social Networks
507	6	Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation
508	6	Sigsoftmax: Reanalysis of the Softmax Bottleneck
509	6	Supervised autoencoders: Improving generalization performance with unsupervised regularizers
510	6	Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language
511	6	Structure-Aware Convolutional Neural Networks
512	6	Efficient Algorithms for Non-convex Isotonic Regression through Submodular Optimization
513	5	Bias and Generalization in Deep Generative Models: An Empirical Study
514	5	Benefits of over-parameterization with EM
515	5	Fully Neural Network Based Speech Recognition on Mobile and Embedded Devices
516	5	Diversity-Driven Exploration Strategy for Deep Reinforcement Learning
517	5	Gaussian Process Prior Variational Autoencoders
518	5	Learning To Learn Around A Common Mean
519	5	Low-Rank Tucker Decomposition of Large Tensors Using TensorSketch
520	5	Blind Deconvolutional Phase Retrieval via Convex Programming
521	5	Coupled Variational Bayes via Optimization Embedding
522	5	Improving Online Algorithms via ML Predictions
523	5	e-SNLI: Natural Language Inference with Natural Language Explanations
524	5	Invariant Representations without Adversarial Training
525	5	SING: Symbol-to-Instrument Neural Generator
526	5	A Structured Prediction Approach for Label Ranking
527	5	Uniform Convergence of Gradients for Non-Convex Learning and Optimization
528	5	Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs
529	5	Deep Network for the Integrated 3D Sensing of Multiple People in Natural Images
530	5	On preserving non-discrimination when combining expert advice
531	5	Algorithms and Theory for Multiple-Source Adaptation
532	5	Variational Bayesian Monte Carlo
533	5	Adversarial Scene Editing: Automatic Object Removal from Weak Supervision
534	5	Non-Adversarial Mapping with VAEs
535	5	Stochastic Chebyshev Gradient Descent for Spectral Optimization
536	5	Implicit Probabilistic Integrators for ODEs
537	5	Provably Correct Automatic Sub-Differentiation for Qualified Programs
538	5	Heterogeneous Multi-output Gaussian Process Prediction
539	5	Contamination Attacks and Mitigation in Multi-Party Machine Learning
540	5	Bayesian Distributed Stochastic Gradient Descent
541	5	Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling
542	5	Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
543	5	Dirichlet-based Gaussian Processes for Large-scale Calibrated Classification
544	5	Automating Bayesian optimization with Bayesian optimization
545	5	Exact natural gradient in deep linear networks and its application to the nonlinear case
546	5	Binary Classification from Positive-Confidence Data
547	5	Learning to Multitask
548	5	Variational Inference with Tail-adaptive f-Divergence
549	5	Learning Others’ Intentional Models in Multi-Agent Settings Using Interactive POMDPs
550	5	Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing
551	5	Estimating Learnability in the Sublinear Data Regime
552	5	Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance
553	5	Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning
554	5	Supervising Unsupervised Learning
555	5	The Physical Systems Behind Optimization Algorithms
556	5	The Price of Privacy for Low-rank Factorization
557	5	Distributed Weight Consolidation: A Brain Segmentation Case Study
558	5	Learning sparse neural networks via sensitivity-driven regularization
559	5	Lipschitz regularity of deep neural networks: analysis and efficient estimation
560	5	A Bandit Approach to Sequential Experimental Design with False Discovery Control
561	5	Optimal Subsampling with Influence Functions
562	5	Modern Neural Networks Generalize on Small Data Sets
563	5	Boosting Black Box Variational Inference
564	5	Single-Agent Policy Tree Search With Guarantees
565	5	Q-learning with Nearest Neighbors
566	5	Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models
567	5	Dialog-to-Action: Conversational Question Answering Over a Large-Scale Knowledge Base
568	5	Mirrored Langevin Dynamics
569	5	Computing Higher Order Derivatives of Matrix and Tensor Expressions
570	5	Gaussian Process Conditional Density Estimation
571	5	Sequential Context Encoding for Duplicate Removal
572	5	Precision and Recall for Time Series
573	5	Partially-Supervised Image Captioning
574	5	Temporal Regularization for Markov Decision Process
575	5	Neural Guided Constraint Logic Programming for Program Synthesis
576	5	Learning Versatile Filters for Efficient Convolutional Neural Networks
577	5	Found Graph Data and Planted Vertex Covers
578	5	Generative Neural Machine Translation
579	5	Revisiting Multi-Task Learning with ROCK: a Deep Residual Auxiliary Block for Visual Detection
580	5	Unsupervised Learning of View-invariant Action Representations
581	5	A flexible model for training action localization with varying levels of supervision
582	5	Solving Large Sequential Games with the Excessive Gap Technique
583	5	On Learning Markov Chains
584	5	Rest-Katyusha: Exploiting the Solution’s Structure via Scheduled Restart Schemes
585	5	Bayesian Pose Graph Optimization via Bingham Distributions and Tempered Geodesic MCMC
586	5	Chain of Reasoning for Visual Question Answering
587	5	Snap ML: A Hierarchical Framework for Machine Learning
588	5	Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation
589	4	GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking
590	4	Smoothed Analysis of Discrete Tensor Decomposition and Assemblies of Neurons
591	4	Autoconj: Recognizing and Exploiting Conjugacy Without a Domain-Specific Language
592	4	Data-Driven Clustering via Parameterized Lloyd’s Families
593	4	Complex Gated Recurrent Neural Networks
594	4	Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior
595	4	Temporal alignment and latent Gaussian process factor inference in population spike trains
596	4	PCA of high dimensional random walks with comparison to neural network training
597	4	Using Large Ensembles of Control Variates for Variational Inference
598	4	Non-delusional Q-learning and value-iteration
599	4	Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models
600	4	Entropy Rate Estimation for Markov Chains with Large State Space
601	4	Invertibility of Convolutional Generative Networks from Partial Measurements
602	4	Multi-objective Maximization of Monotone Submodular Functions with Cardinality Constraint
603	4	Learning and Testing Causal Models with Interventions
604	4	The Global Anchor Method for Quantifying Linguistic Shifts and Domain Adaptation
605	4	Learning Attractor Dynamics for Generative Memory
606	4	PAC-Bayes bounds for stable algorithms with instance-dependent priors
607	4	Learning Safe Policies with Expert Guidance
608	4	Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes
609	4	Exploration in Structured Reinforcement Learning
610	4	Data Amplification: A Unified and Competitive Approach to Property Estimation
611	4	Contextual Stochastic Block Models
612	4	Robust Detection of Adversarial Attacks by Modeling the Intrinsic Properties of Deep Neural Networks
613	4	Diffusion Maps for Textual Network Embedding
614	4	Constrained Cross-Entropy Method for Safe Reinforcement Learning
615	4	Bandit Learning with Implicit Feedback
616	4	Model-Agnostic Private Learning
617	4	Causal Inference via Kernel Deviance Measures
618	4	Scaling Gaussian Process Regression with Derivatives
619	4	A no-regret generalization of hierarchical softmax to extreme multi-label classification
620	4	Deep Structured Prediction with Nonlinear Output Transformations
621	4	Transfer of Value Functions via Variational Methods
622	4	Variational Learning on Aggregate Outputs with Gaussian Processes
623	4	Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks
624	4	Multi-Task Zipping via Layer-wise Neuron Sharing
625	4	Computing Kantorovich-Wasserstein Distances on d-dimensional histograms using (d+1)-partite graphs
626	4	Reparameterization Gradient for Non-differentiable Models
627	4	Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization
628	4	Geometry-Aware Recurrent Neural Networks for Active Visual Recognition
629	4	Bandit Learning with Positive Externalities
630	4	Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections
631	4	Differentially Private Contextual Linear Bandits
632	4	Scalable Coordinated Exploration in Concurrent Reinforcement Learning
633	4	Bilevel Distance Metric Learning for Robust Image Recognition
634	4	An Information-Theoretic Analysis for Thompson Sampling with Many Actions
635	4	GumBolt: Extending Gumbel trick to Boltzmann priors
636	4	Variational PDEs for Acceleration on Manifolds and Application to Diffeomorphisms
637	4	Direct Estimation of Differences in Causal Graphs
638	4	Convergence of Cubic Regularization for Nonconvex Optimization under KL Property
639	4	Tight Bounds for Collaborative PAC Learning via Multiplicative Weights
640	4	Differentially Private Bayesian Inference for Exponential Families
641	4	Representation Learning for Treatment Effect Estimation from Observational Data
642	4	Revisiting Decomposable Submodular Function Minimization with Incidence Relations
643	4	SEGA: Variance Reduction via Gradient Sketching
644	4	Virtual Class Enhanced Discriminative Embedding Learning
645	4	Relating Leverage Scores and Density using Regularized Christoffel Functions
646	4	DifNet: Semantic Segmentation by Diffusion Networks
647	4	Regularization Learning Networks: Deep Learning for Tabular Datasets
648	4	Joint Active Feature Acquisition and Classification with Variable-Size Set Encoding
649	4	Quadratic Decomposable Submodular Function Minimization
650	4	A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents
651	4	Uncertainty-Aware Attention for Reliable Interpretation and Prediction
652	4	Generalizing Graph Matching beyond Quadratic Assignment Model
653	4	Informative Features for Model Comparison
654	4	Training DNNs with Hybrid Block Floating Point
655	4	Learning Pipelines with Limited Data and Domain Knowledge: A Study in Parsing Physics Problems
656	4	An Off-policy Policy Gradient Theorem Using Emphatic Weightings
657	4	Generalized Inverse Optimization through Online Learning
658	4	Kalman Normalization: Normalizing Internal Representations Across Network Layers
659	3	Transfer of Deep Reactive Policies for MDP Planning
660	3	Multiple Instance Learning for Efficient Sequential Data Classification on Resource-constrained Devices
661	3	Point process latent variable models of larval zebrafish behavior
662	3	Differentially Private Change-Point Detection
663	3	Learning Beam Search Policies via Imitation Learning
664	3	Learning a Warping Distance from Unlabeled Time Series Using Sequence Autoencoders
665	3	A Simple Cache Model for Image Recognition
666	3	On Markov Chain Gradient Descent
667	3	Unsupervised Depth Estimation, 3D Face Rotation and Replacement
668	3	Learning convex bounds for linear quadratic control policy synthesis
669	3	The Effect of Network Width on the Performance of Large-batch Training
670	3	The Importance of Sampling in Meta-Reinforcement Learning
671	3	Coordinate Descent with Bandit Sampling
672	3	Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages
673	3	Learning without the Phase: Regularized PhaseMax Achieves Optimal Sample Complexity
674	3	A convex program for bilinear inversion of sparse vectors
675	3	The promises and pitfalls of Stochastic Gradient Langevin Dynamics
676	3	Efficient Online Portfolio with Logarithmic Regret
677	3	Proximal Graphical Event Models
678	3	Learning Signed Determinantal Point Processes through the Principal Minor Assignment Problem
679	3	GILBO: One Metric to Measure Them All
680	3	Bayesian Adversarial Learning
681	3	Extracting Relationships by Multi-Domain Matching
682	3	Unsupervised Learning of Artistic Styles with Archetypal Style Analysis
683	3	The Limits of Post-Selection Generalization
684	3	SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient
685	3	Deep Neural Networks with Box Convolutions
686	3	Graphical Generative Adversarial Networks
687	3	Neural Interaction Transparency (NIT): Disentangling Learned Interactions for Improved Interpretability
688	3	Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation
689	3	Learning a latent manifold of odor representations from neural responses in piriform cortex
690	3	GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training
691	3	Scalable Robust Matrix Factorization with Nonconvex Loss
692	3	Practical exact algorithm for trembling-hand equilibrium refinements in games
693	3	Nonparametric Bayesian Lomax delegate racing for survival analysis with competing risks
694	3	Adaptive Negative Curvature Descent with Applications in Non-convex Optimization
695	3	Fast Rates of ERM and Stochastic Approximation: Adaptive to Error Bound Conditions
696	3	Exponentiated Strongly Rayleigh Distributions
697	3	A Bridging Framework for Model Optimization and Deep Propagation
698	3	Integrated accounts of behavioral and neuroimaging data using flexible recurrent neural network models
699	3	Meta-Learning MCMC Proposals
700	3	The streaming rollout of deep networks – towards fully model-parallel execution
701	3	Solving Non-smooth Constrained Programs with Lower Complexity than \mathcal{O}(1/\varepsilon): A Primal-Dual Homotopy Smoothing Approach
702	3	Learning from discriminative feature feedback
703	3	Bipartite Stochastic Block Models with Tiny Clusters
704	3	Equality of Opportunity in Classification: A Causal Approach
705	3	Sequence-to-Segment Networks for Segment Detection
706	3	Hybrid-MST: A Hybrid Active Sampling Strategy for Pairwise Preference Aggregation
707	3	Step Size Matters in Deep Learning
708	3	From Stochastic Planning to Marginal MAP
709	3	Constructing Deep Neural Networks by Bayesian Network Structure Learning
710	3	Optimization over Continuous and Multi-dimensional Decisions with Observational Data
711	3	Metric on Nonlinear Dynamical Systems with Perron-Frobenius Operators
712	3	Safe Active Learning for Time-Series Modeling with Gaussian Processes
713	3	Processing of missing data by neural networks
714	3	A Practical Algorithm for Distributed Clustering and Outlier Detection
715	3	Dual Principal Component Pursuit: Improved Analysis and Efficient Algorithms
716	3	DeepExposure: Learning to Expose Photos with Asynchronously Reinforced Adversarial Learning
717	3	Regularizing by the Variance of the Activations’ Sample-Variances
718	3	Automatic Program Synthesis of Long Programs with a Learned Garbage Collector
719	3	Nonparametric learning from Bayesian models with randomized objective functions
720	3	Learning Optimal Reserve Price against Non-myopic Bidders
721	3	Enhancing the Accuracy and Fairness of Human Decision Making
722	3	Learning to Exploit Stability for 3D Scene Parsing
723	3	Parsimonious Quantile Regression of Financial Asset Tail Dynamics via Sequential Learning
724	3	Geometry Based Data Generation
725	3	New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity
726	3	Alternating optimization of decision trees, with application to learning sparse oblique trees
727	3	Synthesized Policies for Transfer and Adaptation across Tasks and Environments
728	3	Interactive Structure Learning with Structural Query-by-Committee
729	3	Efficient nonmyopic batch active search
730	3	\ell_1-regression with Heavy-tailed Distributions
731	3	Frequency-Domain Dynamic Pruning for Convolutional Neural Networks
732	3	Visual Memory for Robust Path Following
733	3	Maximum-Entropy Fine Grained Classification
734	3	A Unified Framework for Extensive-Form Game Abstraction with Bounds
735	3	HitNet: Hybrid Ternary Recurrent Neural Network
736	3	Joint Sub-bands Learning with Clique Structures for Wavelet Domain Super-Resolution
737	3	HOGWILD!-Gibbs can be PanAccurate
738	2	Topkapi: Parallel and Fast Sketches for Finding Top-K Frequent Elements
739	2	Multi-value Rule Sets for Interpretable Classification with Feature-Efficient Representations
740	2	Mean Field for the Stochastic Blockmodel: Optimization Landscape and Convergence Issues
741	2	Robust Subspace Approximation in a Stream
742	2	Bayesian Structure Learning by Recursive Bootstrap
743	2	Total stochastic gradient algorithms and applications in reinforcement learning
744	2	Synaptic Strength For Convolutional Neural Network
745	2	A Spectral View of Adversarially Robust Features
746	2	Testing for Families of Distributions via the Fourier Transform
747	2	Scalable Laplacian K-modes
748	2	Learning to Reason with Third Order Tensor Products
749	2	Post: Device Placement with Cross-Entropy Minimization and Proximal Policy Optimization
750	2	Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach
751	2	Identification and Estimation of Causal Effects from Dependent Data
752	2	Representer Point Selection for Explaining Deep Neural Networks
753	2	Learning SMaLL Predictors
754	2	Iterative Value-Aware Model Learning
755	2	Improving Neural Program Synthesis with Inferred Execution Traces
756	2	Estimators for Multivariate Information Measures in General Probability Spaces
757	2	Stochastic Primal-Dual Method for Empirical Risk Minimization with O(1) Per-Iteration Complexity
758	2	Distributionally Robust Graphical Models
759	2	Bilevel learning of the Group Lasso structure
760	2	Graphical model inference: Sequential Monte Carlo meets deterministic approximations
761	2	Learning to Specialize with Knowledge Distillation for Visual Question Answering
762	2	Cluster Variational Approximations for Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data
763	2	A General Method for Amortizing Variational Filtering
764	2	Scalar Posterior Sampling with Applications
765	2	Improved Algorithms for Collaborative PAC Learning
766	2	Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks
767	2	Training Deep Models Faster with Robust, Approximate Importance Sampling
768	2	Efficient Loss-Based Decoding on Graphs for Extreme Classification
769	2	Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies
770	2	Online Structure Learning for Feed-Forward and Recurrent Sum-Product Networks
771	2	Provable Gaussian Embedding with One Observation
772	2	Model-based targeted dimensionality reduction for neuronal population data
773	2	Representation Learning of Compositional Data
774	2	Modeling Dynamic Missingness of Implicit Feedback for Recommendation
775	2	Query K-means Clustering and the Double Dixie Cup Problem
776	2	On the Local Hessian in Back-propagation
777	2	On Controllable Sparse Alternatives to Softmax
778	2	Multi-domain Causal Structure Learning in Linear Systems
779	2	Deep State Space Models for Unconditional Word Generation
780	2	Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer
781	2	Diverse Ensemble Evolution: Curriculum Data-Model Marriage
782	2	Loss Functions for Multiset Prediction
783	2	Efficient inference for time-varying behavior during learning
784	2	Contextual Pricing for Lipschitz Buyers
785	2	Manifold Structured Prediction
786	2	Middle-Out Decoding
787	2	Differentially Private k-Means with Constant Multiplicative Error
788	2	Fully Understanding The Hashing Trick
789	2	Contour location via entropy reduction leveraging multiple information sources
790	2	Why so gloomy? A Bayesian explanation of human pessimism bias in the multi-armed bandit task
791	2	Porcupine Neural Networks: Approximating Neural Network Landscapes
792	2	Non-Ergodic Alternating Proximal Augmented Lagrangian Algorithms with Optimal Rates
793	2	Context-dependent upper-confidence bounds for directed exploration
794	2	Recurrently Controlled Recurrent Networks
795	2	Hunting for Discriminatory Proxies in Linear Regression Models
796	2	Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima
797	2	Explaining Deep Learning Models — A Bayesian Non-parametric Approach
798	2	Semi-Supervised Learning with Declaratively Specified Entropy Constraints
799	2	Maximum Causal Tsallis Entropy Imitation Learning
800	2	Mallows Models for Top-k Lists
801	2	Optimization of Smooth Functions with Noisy Observations: Local Minimax Rates
802	2	Binary Rating Estimation with Graph Side Information
803	2	Inexact trust-region algorithms on Riemannian manifolds
804	2	Differentially Private Robust Low-Rank Approximation
805	2	Probabilistic Neural Programmed Networks for Scene Generation
806	2	Faster Online Learning of Optimal Threshold for Consistent F-measure Optimization
807	2	Sublinear Time Low-Rank Approximation of Distance Matrices
808	2	Scaling the Poisson GLM to massive neural datasets through polynomial approximations
809	2	Infinite-Horizon Gaussian Processes
810	2	Learning Gaussian Processes by Minimizing PAC-Bayesian Generalization Bounds
811	2	Deep, complex, invertible networks for inversion of transmission effects in multimode optical fibres
812	2	Contextual Combinatorial Multi-armed Bandits with Volatile Arms and Submodular Reward
813	2	Learning latent variable structured prediction models with Gaussian perturbations
814	2	Practical Methods for Graph Two-Sample Testing
815	2	Demystifying excessively volatile human learning: A Bayesian persistent prior and a neural approximation
816	2	Causal Discovery from Discrete Data using Hidden Compact Representation
817	2	Contextual bandits with surrogate losses: Margin bounds and efficient algorithms
818	2	Structural Causal Bandits: Where to Intervene?
819	2	Active Learning for Non-Parametric Regression Using Purely Random Trees
820	2	Breaking the Span Assumption Yields Fast Finite-Sum Minimization
821	2	Universal Growth in Production Economies
822	2	High Dimensional Linear Regression using Lattice Basis Reduction
823	2	Fighting Boredom in Recommender Systems with Linear Reinforcement Learning
824	2	Generalizing Tree Probability Estimation via Bayesian Networks
825	2	Global Gated Mixture of Second-order Pooling for Improving Deep Convolutional Neural Networks
826	2	A Block Coordinate Ascent Algorithm for Mean-Variance Optimization
827	2	Boosted Sparse and Low-Rank Tensor Regression
828	2	DropMax: Adaptive Variational Softmax
829	2	Connectionist Temporal Classification with Maximum Entropy Regularization
830	2	A Neural Compositional Paradigm for Image Captioning
831	2	An Efficient Pruning Algorithm for Robust Isotonic Regression
832	2	Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units
833	1	Sparse PCA from Sparse Linear Regression
834	1	Computationally and statistically efficient learning of causal Bayes nets using path queries
835	1	Removing Hidden Confounding by Experimental Grounding
836	1	MixLasso: Generalized Mixed Regression via Convex Atomic-Norm Regularization
837	1	Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning
838	1	Fast deep reinforcement learning using online adjustments from the past
839	1	Streamlining Variational Inference for Constraint Satisfaction Problems
840	1	Convex Elicitation of Continuous Properties
841	1	Learning and Inference in Hilbert Space with Quantum Graphical Models
842	1	Uplift Modeling from Separate Labels
843	1	Dynamic Network Model from Partial Observations
844	1	Theoretical guarantees for EM under misspecified Gaussian mixture models
845	1	Statistical and Computational Trade-Offs in Kernel K-Means
846	1	GLoMo: Unsupervised Learning of Transferable Relational Graphs
847	1	Adaptive Path-Integral Autoencoders: Representation Learning and Planning for Dynamical Systems
848	1	A Statistical Recurrent Model on the Manifold of Symmetric Positive Definite Matrices
849	1	Stein Variational Gradient Descent as Moment Matching
850	1	A Bayesian Nonparametric View on Count-Min Sketch
851	1	Deep Poisson gamma dynamical systems
852	1	Information-theoretic Limits for Community Detection in Network Models
853	1	Online Reciprocal Recommendation with Theoretical Performance Guarantees
854	1	Statistical mechanics of low-rank tensor decomposition
855	1	Modelling and unsupervised learning of symmetric deformable object categories
856	1	Efficient Anomaly Detection via Matrix Sketching
857	1	Improved Expressivity Through Dendritic Neural Networks
858	1	Stochastic Expectation Maximization with Variance Reduction
859	1	Monte-Carlo Tree Search for Constrained POMDPs
860	1	Breaking the Activation Function Bottleneck through Adaptive Parameterization
861	1	Rectangular Bounding Process
862	1	Adaptive Learning with Unknown Information Flows
863	1	A Bayesian Approach to Generative Adversarial Imitation Learning
864	1	Constant Regret, Generalized Mixability, and Mirror Descent
865	1	How to tell when a clustering is (approximately) correct using convex relaxations
866	1	Stimulus domain transfer in recurrent models for large scale cortical population prediction on video
867	1	Efficient online algorithms for fast-rate regret bounds under sparsity
868	1	Gen-Oja: Simple & Efficient Algorithm for Streaming Generalized Eigenvector Computation
869	1	Unorganized Malicious Attacks Detection
870	1	Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss
871	1	rho-POMDPs have Lipschitz-Continuous epsilon-Optimal Value Functions
872	1	Maximizing Induced Cardinality Under a Determinantal Point Process
873	1	Efficient Convex Completion of Coupled Tensors using Coupled Nuclear Norms
874	1	Stochastic Nonparametric Event-Tensor Decomposition
875	1	Diminishing Returns Shape Constraints for Interpretability and Regularization
876	1	Policy Regret in Repeated Games
877	1	Large-Scale Stochastic Sampling from the Probability Simplex
878	1	An Improved Analysis of Alternating Minimization for Structured Multi-Response Regression
879	1	Proximal SCOPE for Distributed Sparse Learning
880	1	The Everlasting Database: Statistical Validity at a Fair Price
881	1	Size-Noise Tradeoffs in Generative Networks
882	1	Exponentially Weighted Imitation Learning for Batched Historical Data
883	1	The Cluster Description Problem – Complexity Results, Formulations and Approximations
884	1	MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models
885	1	Approximation algorithms for stochastic clustering
886	1	Gamma-Poisson Dynamic Matrix Factorization Embedded with Metadata Influence
887	1	Mental Sampling in Multimodal Representations
888	1	Critical initialisation for deep signal propagation in noisy rectifier neural networks
889	1	Learning convex polytopes with margin
890	1	Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames
891	1	Sample Efficient Stochastic Gradient Iterative Hard Thresholding Method for Stochastic Sparse Linear Regression with Limited Attribute Observation
892	1	Horizon-Independent Minimax Linear Regression
893	1	Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models
894	1	Learning in Games with Lossy Feedback
895	1	Learning Confidence Sets using Support Vector Machines
896	1	Fast greedy algorithms for dictionary selection with generalized sparsity constraints
897	1	Non-metric Similarity Graphs for Maximum Inner Product Search
898	1	A Mathematical Model For Optimal Decisions In A Representative Democracy
899	1	Learning Bounds for Greedy Approximation with Explicit Feature Maps from Multiple Kernels
900	1	PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits
901	1	Learning filter widths of spectral decompositions with wavelets
902	1	Lifelong Inverse Reinforcement Learning
903	1	Expanding Holographic Embeddings for Knowledge Completion
904	1	Submodular Field Grammars: Representation, Inference, and Application to Image Parsing
905	1	BML: A High-performance, Low-cost Gradient Synchronization Algorithm for DML Training
906	1	Flexible and accurate inference and learning for deep generative models
907	1	KONG: Kernels for ordered-neighborhood graphs
908	1	Minimax Estimation of Neural Net Distance
909	1	Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization
910	1	Multiplicative Weights Updates with Constant Step-Size in Graphical Constant-Sum Games
911	1	Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization
912	1	Stochastic Spectral and Conjugate Descent Methods
913	1	Semi-crowdsourced Clustering with Deep Generative Models
914	1	Parsimonious Bayesian deep networks
915	1	Asymptotic optimality of adaptive importance sampling
916	1	When do random forests fail?
917	1	Adaptation to Easy Data in Prediction with Limited Advice
918	1	Gradient Descent Meets Shift-and-Invert Preconditioning for Eigenvector Computation
919	1	Continuous-time Value Function Approximation in Reproducing Kernel Hilbert Spaces
920	1	Provable Variational Inference for Constrained Log-Submodular Models
921	1	Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling
922	1	Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning
923	1	Mixture Matrix Completion
924	1	Algorithmic Linearly Constrained Gaussian Processes
925	1	SplineNets: Continuous Neural Decision Graphs
926	1	The Pessimistic Limits and Possibilities of Margin-based Losses in Semi-supervised Learning
927	1	Video Prediction via Selective Sampling
928	1	Stochastic Composite Mirror Descent: Optimal Bounds with High Probabilities
929	1	A loss framework for calibrated anomaly detection
930	1	Designing by Training: Acceleration Neural Network for Fast High-Dimensional Convolution
931	1	Multitask Boosting for Survival Analysis with Competing Risks
932	1	The Lingering of Gradients: How to Reuse Gradients Over Time
933	1	(Probably) Concave Graph Matching
934	1	Fast Similarity Search via Optimal Sparse Lifting
935	0	Contrastive Learning from Pairwise Measurements
936	0	Support Recovery for Orthogonal Matching Pursuit: Upper and Lower bounds
937	0	Sketching Method for Large Scale Combinatorial Inference
938	0	Regret Bounds for Online Portfolio Selection with a Cardinality Constraint
939	0	Improved Network Robustness with Adversary Critic
940	0	Discretely Relaxing Continuous Variables for tractable Variational Inference
941	0	Bounded-Loss Private Prediction Markets
942	0	Lifted Weighted Mini-Bucket
943	0	Predictive Approximate Bayesian Computation via Saddle Points
944	0	Learning Invariances using the Marginal Likelihood
945	0	Variance-Reduced Stochastic Gradient Descent on Streaming Data
946	0	Trading robust representations for sample complexity through self-supervised visual experience
947	0	PAC-Bayes Tree: weighted subtrees with guarantees
948	0	The emergence of multiple retinal cell types through efficient coding of natural movies
949	0	The Sample Complexity of Semi-Supervised Learning with Nonparametric Mixture Models
950	0	Inferring Latent Velocities from Weather Radar Data using Gaussian Processes
951	0	Wavelet regression and additive models for irregularly spaced data
952	0	Distributed Multitask Reinforcement Learning with Quadratic Convergence
953	0	Legendre Decomposition for Tensors
954	0	Compact Representation of Uncertainty in Clustering
955	0	Clustering Redemption–Beyond the Impossibility of Kleinberg’s Axioms
956	0	Dimensionality Reduction has Quantifiable Imperfections: Two Geometric Bounds
957	0	Submodular Maximization via Gradient Ascent: The Case of Deep Submodular Functions
958	0	Dirichlet belief networks for topic structure learning
959	0	A Reduction for Efficient LDA Topic Reconstruction
960	0	Preference Based Adaptation for Learning Objectives
961	0	On Neuronal Capacity
962	0	Revisiting (\epsilon, \gamma, \tau)-similarity learning for domain adaptation
963	0	Deep Homogeneous Mixture Models: Representation, Separation, and Approximation
964	0	A probabilistic population code based on neural samples
965	0	Efficient Gradient Computation for Structured Output Learning with Rational and Tropical Losses
966	0	A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice
967	0	Algebraic tests of general Gaussian latent tree models
968	0	Online Improper Learning with an Approximation Oracle
969	0	Community Exploration: From Offline Optimization to Online Learning
970	0	Learning Concave Conditional Likelihood Models for Improved Analysis of Tandem Mass Spectra
971	0	Experimental Design for Cost-Aware Learning of Causal Graphs
972	0	Exploiting Numerical Sparsity for Efficient Learning: Faster Eigenvector Computation and Regression
973	0	Multi-armed Bandits with Compensation
974	0	Power-law efficient neural codes provide general link between perceptual bias and discriminability
975	0	Learning from Group Comparisons: Exploiting Higher Order Interactions
976	0	Objective and efficient inference for couplings in neuronal networks
977	0	Neural Edit Operations for Biological Sequences
978	0	Measures of distortion for machine learning
979	0	Information-based Adaptive Stimulus Selection to Optimize Communication Efficiency in Brain-Computer Interfaces
980	0	A Smoother Way to Train Structured Prediction Models
981	0	Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making
982	0	Active Matting
983	0	Limited Memory Kelley’s Method Converges for Composite Convex and Submodular Objectives
984	0	Completing State Representations using Spectral Learning
985	0	Cooperative neural networks (CoNN): Exploiting prior independence structure for improved classification
986	0	TETRIS: TilE-matching the TRemendous Irregular Sparsity
987	0	Efficient Projection onto the Perfect Phylogeny Model
988	0	Beauty-in-averageness and its contextual modulations: A Bayesian statistical account
989	0	Early Stopping for Nonparametric Testing
990	0	Inferring Networks From Random Walk-Based Node Similarities
991	0	Communication Efficient Parallel Algorithms for Optimization on Manifolds
992	0	Latent Gaussian Activity Propagation: Using Smoothness and Structure to Separate and Localize Sounds in Large Noisy Environments
993	0	On Binary Classification in Extreme Regions
994	0	Optimistic optimization of a Brownian
995	0	Fast Estimation of Causal Interactions using Wold Processes
996	0	Factored Bandits
997	0	Analytic solution and stationary phase approximation for the Bayesian lasso and elastic net
998	0	Query Complexity of Bayesian Private Learning
999	0	Modelling sparsity, heterogeneity, reciprocity and community structure in temporal interaction data
1000	0	MULAN: A Blind and Off-Grid Method for Multichannel Echo Retrieval
1001	0	Overlapping Clustering Models, and One (class) SVM to Bind Them All
1002	0	Bayesian Model Selection Approach to Boundary Detection with Non-Local Priors
1003	0	Genetic-Gated Networks for Deep Reinforcement Learning
1004	0	Foreground Clustering for Joint Segmentation and Localization in Videos and Images
1005	0	Learning semantic similarity in a continuous space
1006	0	Quantifying Learning Guarantees for Convex but Inconsistent Surrogates
1007	0	Removing the Feature Correlation Effect of Multiplicative Noise
1008	0	Optimization for Approximate Submodularity
