ICML 2019 most cited papers

Below are ICML 2019 papers ranked by citation count (oral presentations only; posters are excluded). The counts were collected by hand from Google Scholar on September 26, 2019, so they may be outdated or contain human error. Ranks start at 0, and papers with equal counts receive consecutive ranks.

Rank Cited by Paper name
0 280 Self-Attention Generative Adversarial Networks
1 95 A Convergence Theory for Deep Learning via Over-Parameterization
2 95 Gradient Descent Finds Global Minima of Deep Neural Networks
3 50 Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
4 49 Learning Latent Dynamics for Planning from Pixels
5 41 Adversarial examples from computational constraints
6 38 Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
7 35 Quantifying Generalization in Reinforcement Learning
8 33 Theoretically Principled Trade-off between Robustness and Accuracy
9 32 Sever: A Robust Meta-Algorithm for Stochastic Optimization
10 27 Invertible Residual Networks
11 27 EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
12 27 AdaGrad stepsizes: sharp convergence over nonconvex landscapes
13 26 TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing
14 26 Certified Adversarial Robustness via Randomized Smoothing
15 25 Graphite: Iterative Generative Modeling of Graphs
16 24 Do ImageNet Classifiers Generalize to ImageNet?
17 22 AReS and MaRS – Adversarial and MMD-Minimizing Regression for SDEs
18 20 Adversarial Examples Are a Natural Consequence of Test Error in Noise
19 20 Simplifying Graph Convolutional Networks
20 19 On the Spectral Bias of Neural Networks
21 18 Optimal Auctions through Deep Learning
22 17 The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects
23 15 Adaptive Neural Trees
24 14 MASS: Masked Sequence to Sequence Pre-training for Language Generation
25 14 Obtaining Fairness using Optimal Transport Theory
26 14 Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?
27 13 NAS-Bench-101: Towards Reproducible Neural Architecture Search
28 13 Rademacher Complexity for Adversarially Robust Generalization
29 12 Multi-Object Representation Learning with Iterative Variational Inference
30 12 Imitating Latent Policies from Observation
31 12 The Evolved Transformer
32 12 SGD: General Analysis and Improved Rates
33 11 Actor-Attention-Critic for Multi-Agent Reinforcement Learning
34 11 Noise2Self: Blind Denoising by Self-Supervision
35 11 Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design
36 11 Stochastic Gradient Push for Distributed Deep Learning
37 11 Optimal Transport for structured data with application on graphs
38 11 On the Universality of Invariant Networks
39 10 Random Shuffling Beats SGD after Finite Epochs
40 10 Analyzing Federated Learning through an Adversarial Lens
41 10 Learning a Prior over Intent via Meta-Inverse Reinforcement Learning
42 10 Online Meta-Learning
43 10 On Efficient Optimal Transport: An Analysis of Greedy and Accelerated Mirror Descent Algorithms
44 9 Learning to Generalize from Sparse and Underspecified Rewards
45 9 Insertion Transformer: Flexible Sequence Generation via Insertion Operations
46 9 CoT: Cooperative Training for Generative Modeling of Discrete Data
47 9 Variational Implicit Processes
48 9 Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
49 9 Emerging Convolutions for Generative Normalizing Flows
50 9 Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously
51 9 Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition
52 9 FloWaveNet: A Generative Flow for Raw Audio
53 9 Policy Certificates: Towards Accountable Reinforcement Learning
54 9 Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
55 9 Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication
56 9 Gauge Equivariant Convolutional Networks and the Icosahedral CNN
57 9 High-Fidelity Image Generation With Fewer Labels
58 9 Safe Policy Improvement with Baseline Bootstrapping
59 9 Off-Policy Deep Reinforcement Learning without Exploration
60 9 Using Pre-Training Can Improve Model Robustness and Uncertainty
61 9 Manifold Mixup: Better Representations by Interpolating Hidden States
62 8 Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning
63 8 Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
64 8 Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
65 8 On the Linear Speedup Analysis of Communication Efficient Momentum SGD for Distributed Non-Convex Optimization
66 8 Open-ended learning in symmetric zero-sum games
67 8 Error Feedback Fixes SignSGD and other Gradient Compression Schemes
68 7 TarMAC: Targeted Multi-Agent Communication
69 7 Latent Normalizing Flows for Discrete Sequences
70 7 Provably Efficient Maximum Entropy Exploration
71 7 Sorting Out Lipschitz Function Approximation
72 7 Understanding Geometry of Encoder-Decoder CNNs
73 7 A Theory of Regularized Markov Decision Processes
74 7 Graph U-Nets
75 7 A Kernel Theory of Modern Data Augmentation
76 7 Learning deep kernels for exponential family densities
77 7 On Learning Invariant Representations for Domain Adaptation
78 7 Towards a Unified Analysis of Random Fourier Features
79 7 Deep Counterfactual Regret Minimization
80 7 Training Neural Networks with Local Error Signals
81 7 HOList: An Environment for Machine Learning of Higher Order Logic Theorem Proving
82 7 ELF OpenGo: an analysis and open reimplementation of AlphaZero
83 6 Geometry and Symmetry in Short-and-Sparse Deconvolution
84 6 Agnostic Federated Learning
85 6 On the Limitations of Representing Functions on Sets
86 6 Parameter-Efficient Transfer Learning for NLP
87 6 Escaping Saddle Points with Adaptive Gradient Methods
88 6 Batch Policy Learning under Constraints
89 6 Understanding the Impact of Entropy on Policy Optimization
90 6 An Instability in Variational Inference for Topic Models
91 6 Understanding the Origins of Bias in Word Embeddings
92 6 Making Convolutional Networks Shift-Invariant Again
93 6 Fast Context Adaptation via Meta-Learning
94 6 SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
95 6 The Odds are Odd: A Statistical Test for Detecting Adversarial Examples
96 6 Complexity of Linear Regions in Deep Networks
97 6 Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints
98 6 Scalable Fair Clustering
99 6 Learning Action Representations for Reinforcement Learning
100 6 An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
101 6 Natural Analysts in Adaptive Data Analysis
102 6 Collaborative Evolutionary Reinforcement Learning
103 6 Katalyst: Boosting Convex Katayusha for Non-Convex Problems with a Large Condition Number
104 6 Nonconvex Variance Reduced Optimization with Arbitrary Sampling
105 5 Loss Landscapes of Regularized Linear Autoencoders
106 5 A Theoretical Analysis of Contrastive Unsupervised Representation Learning
107 5 Guarantees for Spectral Clustering with Fairness Constraints
108 5 Online Control with Adversarial Disturbances
109 5 Width Provably Matters in Optimization for Deep Linear Neural Networks
110 5 Sliced-Wasserstein Flows: Nonparametric Generative Modeling via Optimal Transport and Diffusions
111 5 MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing
112 5 Remember and Forget for Experience Replay
113 5 The advantages of multiple classes for reducing overfitting from test set reuse
114 5 Model-Based Active Exploration
115 5 Efficient Dictionary Learning with Gradient Descent
116 5 Near optimal finite time identification of arbitrary linear dynamical systems
117 5 EDDI: Efficient Dynamic Discovery of High-Value Information with Partial VAE
118 5 On the Impact of the Activation function on Deep Neural Networks Training
119 5 Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits
120 5 Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning
121 5 Variational Inference for sparse network reconstruction from count data
122 5 GEOMetrics: Exploiting Geometric Structure for Graph-Encoded Objects
123 5 SAGA with Arbitrary Sampling
124 5 Robust Decision Trees Against Adversarial Examples
125 5 First-Order Adversarial Vulnerability of Neural Networks and Input Dimension
126 4 On Variational Bounds of Mutual Information
127 4 Differentially Private Fair Learning
128 4 Fair k-Center Clustering for Data Summarization
129 4 Mixture Models for Diverse Machine Translation: Tricks of the Trade
130 4 Non-Monotonic Sequential Text Generation
131 4 Gromov-Wasserstein Learning for Graph Matching and Node Embedding
132 4 Counterfactual Visual Explanations
133 4 Optimal Mini-Batch and Step Sizes for SAGA
134 4 Infinite Mixture Prototypes for Few-shot Learning
135 4 A Dynamical Systems Perspective on Nesterov Acceleration
136 4 On the Complexity of Approximating Wasserstein Barycenters
137 4 SGD without Replacement: Sharper Rates for General Smooth Convex Functions
138 4 Learning interpretable continuous-time models of latent stochastic dynamical systems
139 4 Bayesian Nonparametric Federated Learning of Neural Networks
140 4 BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning
141 4 Stochastic Optimization for DC Functions and Non-smooth Non-convex Regularizers with Non-asymptotic Convergence
142 4 Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
143 4 Provable Guarantees for Gradient-Based Meta-Learning
144 4 Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules
145 4 Generalized Majorization-Minimization
146 4 Simple Black-box Adversarial Attacks
147 4 Parameter efficient training of deep convolutional neural networks by dynamic sparse reparameterization
148 4 NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks
149 4 Are Generative Classifiers More Robust to Adversarial Attacks?
150 4 Information-Theoretic Considerations in Batch Reinforcement Learning
151 4 Provably efficient RL with Rich Observations via Latent State Decoding
152 4 Locally Private Bayesian Inference for Count Models
153 4 Bayesian Joint Spike-and-Slab Graphical Lasso
154 4 Graph Matching Networks for Learning the Similarity of Graph Structured Objects
155 4 Diagnosing Bottlenecks in Deep Q-learning Algorithms
156 4 An Investigation of Model-Free Planning
157 4 Contextual Memory Trees
158 4 Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks
159 4 Data Shapley: Equitable Valuation of Data for Machine Learning
160 4 SelectiveNet: A Deep Neural Network with an Integrated Reject Option
161 3 Multi-Frequency Phase Synchronization
162 3 Sublinear quantum algorithms for training linear and kernel-based classifiers
163 3 Probabilistic Neural Symbolic Models for Interpretable Visual Question Answering
164 3 Similarity of Neural Network Representations Revisited
165 3 What is the Effect of Importance Weighting in Deep Learning?
166 3 Analyzing and Improving Representations with the Soft Nearest Neighbor Loss
167 3 Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints
168 3 A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks
169 3 Geometric Scattering for Graph Data Analysis
170 3 Stable and Fair Classification
171 3 Analogies Explained: Towards Understanding Word Embeddings
172 3 Finding Options that Minimize Planning Time
173 3 Hybrid Models with Deep and Invertible Features
174 3 Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation
175 3 Distributed Learning over Unreliable Networks
176 3 Learning Optimal Fair Policies
177 3 Metropolis-Hastings Generative Adversarial Networks
178 3 Non-Asymptotic Analysis of Fractional Langevin Monte Carlo for Non-Convex Optimization
179 3 Multi-Frequency Vector Diffusion Maps
180 3 Fairwashing: the risk of rationalization
181 3 Finding Mixed Nash Equilibria of Generative Adversarial Networks
182 3 Learning Generative Models across Incomparable Spaces
183 3 Learning-to-Learn Stochastic Gradient Descent with Biased Regularization
184 3 Plug-and-Play Methods Provably Converge with Properly Trained Denoisers
185 3 Control Regularization for Reduced Variance Reinforcement Learning
186 3 The Natural Language of Actions
187 3 Almost surely constrained convex optimization
188 3 Traditional and Heavy Tailed Self Regularization in Neural Network Models
189 3 Self-Supervised Exploration via Disagreement
190 3 Direct Uncertainty Prediction for Medical Second Opinions
191 3 Wasserstein Adversarial Examples via Projected Sinkhorn Iterations
192 3 Conditioning by adaptive sampling for robust design
193 3 Does Data Augmentation Lead to Positive Margin?
194 3 Greedy Layerwise Learning Can Scale To ImageNet
195 3 DL2: Training and Querying Neural Networks with Logic
196 3 The Value Function Polytope in Reinforcement Learning
197 3 Action Robust Reinforcement Learning and Applications in Continuous Control
198 3 Automatic Posterior Transformation for Likelihood-Free Inference
199 3 Rao-Blackwellized Stochastic Gradients for Discrete Distributions
200 3 Subspace Robust Wasserstein Distances
201 3 Importance Sampling Policy Evaluation with an Estimated Behavior Policy
202 3 Lipschitz Generative Adversarial Nets
203 3 Homomorphic Sensing
204 3 A Conditional-Gradient-Based Augmented Lagrangian Framework
205 3 Deep Factors for Forecasting
206 3 Learning to bid in revenue-maximizing auctions
207 3 Molecular Hypergraph Grammar with Its Application to Molecular Optimization
208 3 Topological Data Analysis of Decision Boundaries with Application to Model Selection
209 3 Statistical Foundations of Virtual Democracy
210 3 Lower Bounds for Smooth Nonconvex Finite-Sum Optimization
211 3 Improving Adversarial Robustness via Promoting Ensemble Diversity
212 3 Metric-Optimized Example Weights
213 3 Nonlinear Distributional Gradient Temporal-Difference Learning
214 2 Finite-Time Analysis of Distributed TD(0) with Linear Function Approximation on Multi-Agent Reinforcement Learning
215 2 Domain Adaptation with Asymmetrically-Relaxed Distribution Alignment
216 2 Adaptive and Safe Bayesian Optimization in High Dimensions via One-Dimensional Subspaces
217 2 Robustly Disentangled Causal Mechanisms: Validating Deep Representations for Interventional Robustness
218 2 Guided evolutionary strategies: augmenting random search with surrogate gradients
219 2 Autoregressive Energy Machines
220 2 Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback
221 2 Online Algorithms for Rent-Or-Buy with Expert Advice
222 2 Submodular Maximization beyond Non-negativity: Guarantees, Fast Algorithms, and Applications
223 2 Rates of Convergence for Sparse Variational Gaussian Process Regression
224 2 CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network
225 2 MeanSum: A Neural Model for Unsupervised Multi-Document Abstractive Summarization
226 2 Adaptive Sensor Placement for Continuous Spaces
227 2 Global Convergence of Block Coordinate Descent in Deep Learning
228 2 Repairing without Retraining: Avoiding Disparate Impact with Counterfactual Distributions
229 2 Discovering Context Effects from Raw Choice Data
230 2 Fairness without Harm: Decoupled Classifiers with Preference Guarantees
231 2 POLITEX: Regret Bounds for Policy Iteration using Expert Prediction
232 2 Fair Regression: Quantitative Definitions and Reduction-Based Algorithms
233 2 Flexibly Fair Representation Learning by Disentanglement
234 2 Proportionally Fair Clustering
235 2 Stochastic Beams and Where To Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement
236 2 On the Connection Between Adversarial Robustness and Saliency Map Interpretability
237 2 Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation
238 2 DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression
239 2 Almost Unsupervised Text to Speech and Automatic Speech Recognition
240 2 Target-Based Temporal-Difference Learning
241 2 Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets
242 2 Toward Controlling Discrimination in Online Ad Auctions
243 2 Learning to Infer Program Sketches
244 2 Explaining Deep Neural Networks with a Polynomial Time Algorithm for Shapley Value Approximation
245 2 Classification from Positive, Unlabeled and Biased Negative Data
246 2 Neural Network Attributions: A Causal Perspective
247 2 Learning Discrete Structures for Graph Neural Networks
248 2 Cheap Orthogonal Constraints in Neural Networks: A Simple Parametrization of the Orthogonal and Unitary Group
249 2 CompILE: Compositional Imitation Learning and Execution
250 2 Statistics and Samples in Distributional Reinforcement Learning
251 2 Exploring the Landscape of Spatial Robustness
252 2 EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis
253 2 Provably Efficient Imitation Learning from Observation Alone
254 2 Alternating Minimizations Converge to Second-Order Optimal Solutions
255 2 On the statistical rate of nonlinear recovery in generative models with heavy-tailed data
256 2 Sensitivity Analysis of Linear Structural Causal Models
257 2 Simple Stochastic Gradient Methods for Non-Smooth Non-Convex Regularized Optimization
258 2 Beyond Adaptive Submodularity: Approximation Guarantees of Greedy Policy with Adaptive Submodularity Ratio
259 2 Band-limited Training and Inference for Convolutional Neural Networks
260 2 Multivariate Submodular Optimization
261 2 Domain Agnostic Learning with Disentangled Representations
262 2 Learning a Compressed Sensing Measurement Matrix via Gradient Unrolling
263 2 Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization
264 2 Robust Learning from Untrusted Sources
265 2 Trading Redundancy for Communication: Speeding up Distributed SGD for Non-convex Optimization
266 2 On Connected Sublevel Sets in Deep Learning
267 2 Sum-of-Squares Polynomial Flow
268 2 On the Convergence and Robustness of Adversarial Training
269 2 Active Learning for Decision-Making from Imbalanced Observational Data
270 2 Low Latency Privacy Preserving Inference
271 2 Weak Detection of Signal in the Spiked Wigner Model
272 2 The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study
273 2 Same, Same But Different: Recovering Neural Network Quantization Error Through Weight Factorization
274 2 Graphical-model based estimation and inference for differential privacy
275 2 Differentiable Linearized ADMM
276 2 CapsAndRuns: An Improved Method for Approximately Optimal Algorithm Configuration
277 2 Composable Core-sets for Determinant Maximization: A Simple Near-Optimal Algorithm
278 2 Fingerprint Policy Optimisation for Robust Reinforcement Learning
279 2 Safe Grid Search with Optimal Complexity
280 2 Dynamic Weights in Multi-Objective Deep Reinforcement Learning
281 2 DeepMDP: Learning Continuous Latent Space Models for Representation Learning
282 2 On Symmetric Losses for Learning from Corrupted Labels
283 2 A Kernel Perspective for Regularizing Deep Neural Networks
284 2 Random Matrix Improved Covariance Estimation for a Large Class of Metrics
285 2 Task-Agnostic Dynamics Priors for Deep Reinforcement Learning
286 2 Adversarial Generation of Time-Frequency Features with application in audio synthesis
287 2 Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models
288 2 Correlated Variational Auto-Encoders
289 2 Maximum Likelihood Estimation for Learning Populations of Parameters
290 2 Self-Attention Graph Pooling
291 2 Fast Rates for a kNN Classifier Robust to Unknown Asymmetric Label Noise
292 2 Learning to Prove Theorems via Interacting with Proof Assistants
293 2 A Composite Randomized Incremental Gradient Method
294 2 GMNN: Graph Markov Neural Networks
295 2 Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI
296 2 When Samples Are Strategically Selected
297 2 Processing Megapixel Images with Deep Attention-Sampling Models
298 2 Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models
299 2 PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization
300 2 A Contrastive Divergence for Combining Variational Inference and MCMC
301 2 Adversarial Attacks on Node Embeddings via Graph Poisoning
302 1 Understanding Priors in Bayesian Neural Networks at the Unit Level
303 1 Semi-Cyclic Stochastic Gradient Descent
304 1 Learning Dependency Structures for Weak Supervision Models
305 1 Faster Attend-Infer-Repeat with Tractable Probabilistic Models
306 1 Hierarchical Importance Weighted Autoencoders
307 1 Unsupervised Label Noise Modeling and Loss Correction
308 1 QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
309 1 The information-theoretic value of unlabeled data in semi-supervised learning
310 1 Cross-Domain 3D Equivariant Image Embeddings
311 1 Neural Collaborative Subspace Clustering
312 1 PAC Identification of Many Good Arms in Stochastic Multi-Armed Bandits
313 1 Sequential Facility Location: Approximate Submodularity and Greedy Algorithm
314 1 Good Initializations of Variational Bayes for Deep Models
315 1 Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities
316 1 Nonparametric Bayesian Deep Networks with Local Competition
317 1 Communication-Constrained Inference and the Role of Shared Randomness
318 1 Decentralized Exploration in Multi-Armed Bandits
319 1 Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations
320 1 On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference
321 1 Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians
322 1 DAG-GNN: DAG Structure Learning with Graph Neural Networks
323 1 Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning
324 1 Partially Linear Additive Gaussian Graphical Models
325 1 Learning Context-dependent Label Permutations for Multi-label Classification
326 1 Approximation and non-parametric estimation of ResNet-type convolutional neural networks
327 1 Robust Inference via Generative Classifiers for Handling Noisy Labels
328 1 Robust Estimation of Tree Structured Gaussian Graphical Models
329 1 Graph Resistance and Learning from Pairwise Comparisons
330 1 Coresets for Ordered Weighted Clustering
331 1 Efficient Nonconvex Regularized Tensor Completion with Structure-aware Proximal Iterations
332 1 Zero-Shot Knowledge Distillation in Deep Networks
333 1 Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms
334 1 Spectral Clustering of Signed Graphs via Matrix Power Means
335 1 Adaptive Regret of Convex and Smooth Functions
336 1 Scaling Up Ordinal Embedding: A Landmark Approach
337 1 Understanding and correcting pathologies in the training of learned optimizers
338 1 On Scalable and Efficient Computation of Large Scale Optimal Transport
339 1 A fully differentiable beam search decoder
340 1 Online Variance Reduction with Mixtures
341 1 MIWAE: Deep Generative Modelling and Imputation of Incomplete Data Sets
342 1 A Polynomial Time MCMC Method for Sampling from Continuous Determinantal Point Processes
343 1 Fairness risk measures
344 1 Fairness-Aware Learning for Continuous Attributes and Treatments
345 1 Neural Separation of Observed and Unobserved Distributions
346 1 Reinforcement Learning in Configurable Continuous Environments
347 1 Bit-Swap: Recursive Bits-Back Coding for Lossless Compression with Hierarchical Latent Variables
348 1 Adaptive Scale-Invariant Online Algorithms for Learning Linear Models
349 1 Bridging Theory and Algorithm for Domain Adaptation
350 1 MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement
351 1 Learning Discrete and Continuous Factors of Data via Alternating Disentanglement
352 1 CAB: Continuous Adaptive Blending for Policy Evaluation and Learning
353 1 Learning Structured Decision Problems with Unawareness
354 1 Adaptive Monte Carlo Multiple Testing via Multi-Armed Bandits
355 1 Competing Against Nash Equilibria in Adversarially Changing Zero-Sum Games
356 1 Complementary-Label Learning for Arbitrary Losses and Models
357 1 Neuron birth-death dynamics accelerates gradient descent and converges asymptotically
358 1 Unifying Orthogonal Monte Carlo Methods
359 1 Differentially Private Empirical Risk Minimization with Non-convex Loss Functions
360 1 Towards a Deep and Unified Understanding of Deep Neural Models in NLP
361 1 State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations
362 1 Geometric Losses for Distributional Learning
363 1 Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting
364 1 Co-manifold learning with missing data
365 1 Compositional Fairness Constraints for Graph Embeddings
366 1 Improved Convergence for $\ell_1$ and $\ell_\infty$ Regression via Iteratively Reweighted Least Squares
367 1 Transfer of Samples in Policy Search via Multiple Importance Sampling
368 1 Sample-Optimal Parametric Q-Learning Using Linearly Additive Features
369 1 Bias Also Matters: Bias Attribution for Deep Neural Network Explanation
370 1 Combining parametric and nonparametric models for off-policy evaluation
371 1 Disentangled Graph Convolutional Networks
372 1 Differentiable Dynamic Normalization for Learning Deep Representation
373 1 Relational Pooling for Graph Representations
374 1 Hessian Aided Policy Gradient
375 1 Estimate Sequences for Variance-Reduced Stochastic Composite Optimization
376 1 Addressing the Loss-Metric Mismatch with Adaptive Loss Alignment
377 1 Detecting Overlapping and Correlated Communities without Pure Nodes: Identifiability and Algorithm
378 1 Tensor Variable Elimination for Plated Factor Graphs
379 1 Accelerated Linear Convergence of Stochastic Momentum Methods in Wasserstein Distances
380 1 Position-aware Graph Neural Networks
381 1 How does Disagreement Help Generalization against Label Corruption?
382 1 IMEXnet – A Forward Stable Deep Neural Network
383 1 Inferring Heterogeneous Causal Effects in Presence of Spatial Confounding
384 1 Bayesian Optimization Meets Bayesian Optimal Stopping
385 1 Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity
386 1 Equivariant Transformer Networks
387 1 Submodular Observation Selection and Information Gathering for Quadratic Models
388 1 Conditional Independence in Testing Bayesian Networks
389 1 MONK — Outlier-Robust Mean Embedding Estimation by Median-of-Means
390 1 Improved Parallel Algorithms for Density-Based Network Clustering
391 1 Graph Element Networks: adaptive, structured computation and memory
392 1 Learning Models from Data with Measurement Error: Tackling Underreporting
393 1 Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization
394 1 A Deep Reinforcement Learning Perspective on Internet Congestion Control
395 1 Orthogonal Random Forest for Causal Inference
396 1 Classifying Treatment Responders Under Causal Effect Monotonicity
397 1 On the Generalization Gap in Reparameterizable Reinforcement Learning
398 1 Approximated Oracle Filter Pruning for Destructive CNN Width Optimization
399 1 Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
400 1 Better generalization with less data using robust gradient descent
401 1 Monge blunts Bayes: Hardness Results for Adversarial Training
402 1 Beyond the Chinese Restaurant and Pitman-Yor processes: Statistical Models with double power-law behavior
403 1 Variational Annealing of GANs: A Langevin Perspective
404 1 On the Design of Estimators for Bandit Off-Policy Evaluation
405 1 A Large-Scale Study on Regularization and Normalization in GANs
406 1 Automated Model Selection with Bayesian Quadrature
407 1 Zeno: Distributed Stochastic Gradient Descent with Suspicion-based Fault-tolerance
408 1 Deep Gaussian Processes with Importance-Weighted Variational Inference
409 1 Noisy Dual Principal Component Pursuit
410 1 Transferable Clean-Label Poisoning Attacks on Deep Neural Nets
411 1 Bilinear Bandits with Low-rank Structure
412 1 Structured agents for physical construction
413 1 Estimating Information Flow in Deep Neural Networks
414 1 Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels
415 1 GOODE: A Gaussian Off-The-Shelf Ordinary Differential Equation Solver
416 1 Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
417 1 Optimal Algorithms for Lipschitz Bandits with Heavy-tailed Rewards
418 1 Distribution calibration for regression
419 1 Distributed Learning with Sublinear Communication
420 1 Temporal Gaussian Mixture Layer for Videos
421 1 Stochastic Deep Networks
422 1 Benefits and Pitfalls of the Exponential Mechanism with Applications to Hilbert Spaces and Functional PCA
423 1 Efficient optimization of loops and limits with randomized telescoping sums
424 1 Robust Influence Maximization for Hyperparametric Models
425 1 Communication Complexity in Locally Private Distribution Estimation and Heavy Hitters
426 1 Convolutional Poisson Gamma Belief Network
427 1 SWALP: Stochastic Weight Averaging in Low Precision Training
428 1 Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
429 1 Beyond Backprop: Online Alternating Minimization with Auxiliary Variables
430 1 Discovering Options for Exploration by Minimizing Cover Time
431 1 Static Automatic Batching In TensorFlow
432 1 Rotation Invariant Householder Parameterization for Bayesian PCA
433 1 Fault Tolerance in Iterative-Convergent Machine Learning
434 1 SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver
435 1 Fast Incremental von Neumann Graph Entropy Computation: Theory, Algorithm, and Applications
436 1 Generalized Linear Rule Models
437 1 Optimal Minimal Margin Maximization with Boosting
438 1 GDPP: Learning Diverse Generations using Determinantal Point Processes
439 1 Per-Decision Option Discounting
440 1 Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search
441 1 BayesNAS: A Bayesian Approach for Neural Architecture Search
442 1 Collaborative Channel Pruning for Deep Networks
443 1 Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff
444 1 Learning from a Learner
445 1 Rate Distortion For Model Compression: From Theory To Practice
446 1 Curiosity-Bottleneck: Exploration By Distilling Task-Specific Novelty
447 1 Imitation Learning from Imperfect Demonstration
448 1 Switching Linear Dynamics for Variational Bayes Filtering
449 1 Feature-Critic Networks for Heterogeneous Domain Generalization
450 1 Entropic GANs meet VAEs: A Statistical Approach to Compute Sample Likelihoods in GANs
451 1 Predictor-Corrector Policy Optimization
452 1 EMI: Exploration with Mutual Information
453 1 Wasserstein of Wasserstein Loss for Learning Generative Models
454 1 Learning Optimal Linear Regularizers
455 1 A Statistical Investigation of Long Memory in Language and Music
456 1 Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
457 1 Generative Adversarial User Model for Reinforcement Learning Based Recommendation System
458 1 Inference and Sampling of $K_{33}$-free Ising Models
459 1 CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning
460 1 A Block Coordinate Descent Proximal Method for Simultaneous Filtering and Parameter Estimation
461 1 Learning to Optimize Multigrid PDE Solvers
462 1 LGM-Net: Learning to Generate Matching Networks for Few-Shot Learning
463 1 Combating Label Noise in Deep Learning using Abstention
464 1 On The Power of Curriculum Learning in Training Deep Networks
465 1 Learning to Clear the Market
466 1 Online learning with kernel losses
467 1 Teaching a black-box learner
468 1 Learning to Groove with Inverse Sequence Transformations
469 1 Stable-Predictive Optimistic Counterfactual Regret Minimization
470 1 Faster Stochastic Alternating Direction Method of Multipliers for Nonconvex Optimization
471 1 Improved Zeroth-Order Variance Reduced Algorithms and Analysis for Nonconvex Optimization
472 1 Making Deep Q-learning methods robust to time discretization
473 1 Validating Causal Inference Models via Influence Functions
474 0 Lorentzian Distance Learning for Hyperbolic Representations
475 0 Pareto Optimal Streaming Unsupervised Classification
476 0 LatentGNN: Learning Efficient Non-local Relations for Visual Recognition
477 0 Greedy Orthogonal Pivoting Algorithm for Non-Negative Matrix Factorization
478 0 Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Computation
479 0 Hyperbolic Disk Embeddings for Directed Acyclic Graphs
480 0 Faster Algorithms for Binary Matrix Factorization
481 0 Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model
482 0 ARSM: Augment-REINFORCE-Swap-Merge Estimator for Gradient Backpropagation Through Categorical Variables
483 0 Unsupervised Deep Learning by Neighbourhood Discovery
484 0 Discovering Conditionally Salient Features with Statistical Guarantees
485 0 Dropout as a Structured Shrinkage Prior
486 0 Categorical Feature Compression via Submodular Optimization
487 0 Exploiting structure of uncertainty for efficient matroid semi-bandits
488 0 Non-monotone Submodular Maximization with Nearly Optimal Adaptivity and Query Complexity
489 0 Learning and Data Selection in Big Datasets
490 0 The Wasserstein Transform
491 0 Distributed, Egocentric Representations of Graphs for Detecting Critical Structures
492 0 COMIC: Multi-view Clustering Without Parameter Selection
493 0 Random Walks on Hypergraphs with Edge-Dependent Vertex Weights
494 0 Supervised Hierarchical Clustering with Exponential Linkage
495 0 Scale-free adaptive planning for deterministic dynamics & discounted rewards
496 0 Learning Distance for Sequences by Learning a Ground Metric
497 0 Efficient Training of BERT by Progressively Stacking
498 0 Making Decisions that Reduce Discriminatory Impacts
499 0 On the Long-term Impact of Algorithmic Decision Policies: Effort Unfairness and Feature Segregation through Social Learning
500 0 Kernel Normalized Cut: a Theoretical Revisit
501 0 Humor in Word Embeddings: Cockamamie Gobbledegook for Nincompoops
502 0 Trainable Decoding of Sets of Sequences for Neural Sequence Models
503 0 Spectral Approximate Inference
504 0 Empirical Analysis of Beam Search Performance Degradation in Neural Sequence Models
505 0 LIT: Learned Intermediate Representation Training for Model Compression
506 0 A Better k-means++ Algorithm via Local Search
507 0 Anytime Online-to-Batch, Optimism and Acceleration
508 0 Improving Neural Language Modeling via Adversarial Training
509 0 Fast Algorithm for Generalized Multinomial Models with Ranking Data
510 0 Fast and Stable Maximum Likelihood Estimation for Incomplete Multinomial Models
511 0 Unreproducible Research is Reproducible
512 0 Deep Residual Output Layers for Neural Language Generation
513 0 Online Adaptive Principal Component Analysis and Its extensions
514 0 Meta-Learning Neural Bloom Filters
515 0 Efficient Full-Matrix Adaptive Regularization
516 0 Recursive Sketches for Modular Deep Learning
517 0 Efficient On-Device Models using Neural Projections
518 0 Ladder Capsule Network
519 0 Mallows ranking models: maximum likelihood estimate and regeneration
520 0 Learning to select for a predefined ranking
521 0 Dimensionality Reduction for Tukey Regression
522 0 Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems
523 0 Demystifying Dropout
524 0 Learning to Exploit Long-term Relational Dependencies in Knowledge Graphs
525 0 Concrete Autoencoders: Differentiable Feature Selection and Reconstruction
526 0 Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem
527 0 Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering
528 0 DBSCAN++: Towards fast and scalable density clustering
529 0 Bandit Multiclass Linear Classification: Efficient Algorithms for the Separable Case
530 0 Accelerated Flow for Probability Distributions
531 0 Model Function Based Conditional Gradient Method with Armijo-like Line Search
532 0 Iterative Linearized Control: Stable Algorithms and Complexity Guarantees
533 0 AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
534 0 Adaptive Antithetic Sampling for Variance Reduction
535 0 State-Regularized Recurrent Neural Networks
536 0 Learning What and Where to Transfer
537 0 Adversarial Online Learning with noise
538 0 Replica Conditional Sequential Monte Carlo
539 0 Gaining Free or Low-Cost Interpretability with Interpretable Partial Substitute
540 0 Calibrated Model-Based Deep Reinforcement Learning
541 0 Power k-Means Clustering
542 0 Hierarchically Structured Meta-learning
543 0 Incremental Randomized Sketching for Online Kernel Learning
544 0 Exploring interpretable LSTM neural networks over multi-variable data
545 0 RaFM: Rank-Aware Factorization Machines
546 0 Functional Transparency for Structured Data: a Game-Theoretic Approach
547 0 Projections for Approximate Policy Iteration Algorithms
548 0 Differentially Private Learning of Geometric Concepts
549 0 Online Learning with Sleeping Experts and Feedback Graphs
550 0 Multi-objective training of Generative Adversarial Networks with multiple discriminators
551 0 Bounding User Contributions: A Bias-Variance Trade-off in Differential Privacy
552 0 Model Comparison for Semantic Grouping
553 0 Linear-Complexity Data-Parallel Earth Mover’s Distance Approximations
554 0 Variational Laplace Autoencoders
555 0 Online Convex Optimization in Adversarial Markov Decision Processes
556 0 Stochastic Iterative Hard Thresholding for Graph-structured Sparsity Optimization
557 0 Doubly Robust Joint Learning for Recommendation on Data Missing Not at Random
558 0 On Sparse Linear Regression in the Local Differential Privacy Model
559 0 Matrix-Free Preconditioning in Online Learning
560 0 Data Poisoning Attacks on Stochastic Bandits
561 0 Learning Neurosymbolic Generative Models via Program Synthesis
562 0 Kernel-Based Reinforcement Learning in Robust Markov Decision Processes
563 0 Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning
564 0 Differential Inclusions for Modeling Nonsmooth ADMM Variants: A Continuous Limit Theory
565 0 Stochastic Blockmodels meet Graph Neural Networks
566 0 Quantile Stein Variational Gradient Descent for Batch Bayesian Optimization
567 0 A Recurrent Neural Cascade-based Model for Continuous-Time Diffusion
568 0 Exploration Conscious Reinforcement Learning Revisited
569 0 The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions
570 0 Interpreting Adversarially Trained Convolutional Neural Networks
571 0 Deep Generative Learning via Variational Gradient Flow
572 0 Breaking Inter-Layer Co-Adaptation by Classifier Anonymization
573 0 Bayesian Optimization of Composite Functions
574 0 First-Order Algorithms Converge Faster than $O(1/k)$ on Convex Problems
575 0 Sparse Multi-Channel Variational Autoencoder for the Joint Analysis of Heterogeneous Data
576 0 Open Vocabulary Learning on Source Code with a Graph-Structured Cache
577 0 Toward Understanding the Importance of Noise in Training Neural Networks
578 0 Invariant-Equivariant Representation Learning for Multi-Class Data
579 0 Active Learning with Disagreement Graphs
580 0 Scalable Nonparametric Sampling from Multimodal Posteriors with the Posterior Bootstrap
581 0 Learning to Route in Similarity Graphs
582 0 Active Learning for Probabilistic Structured Prediction of Cuts and Matchings
583 0 The Variational Predictive Natural Gradient
584 0 Deep Compressed Sensing
585 0 Minimal Achievable Sufficient Statistic Learning
586 0 Bayesian Generative Active Deep Learning
587 0 Hierarchical Decompositional Mixtures of Variational Autoencoders
588 0 Efficient learning of smooth probability functions from Bernoulli tests with guarantees
589 0 Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments
590 0 Discriminative Regularization for Latent Variable Models with Applications to Electrocardiography
591 0 Understanding and Accelerating Particle-Based Variational Inference
592 0 Connectivity-Optimized Representation Learning via Persistent Homology
593 0 Nonlinear Stein Variational Gradient Descent for Learning Diversified Mixture Models
594 0 Dead-ends and Secure Exploration in Reinforcement Learning
595 0 Predicate Exchange: Inference with Declarative Knowledge
596 0 Fast Direct Search in an Optimally Compressed Continuous Target Space for Efficient Multi-Label Active Learning
597 0 Adversarially Learned Representations for Information Obfuscation and Inference
598 0 Active Embedding Search via Noisy Paired Comparisons
599 0 A Tree-Based Method for Fast Repeated Sampling of Determinantal Point Processes
600 0 Hiring Under Uncertainty
601 0 On Medians of (Randomized) Pairwise Means
602 0 Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation
603 0 Overcoming Multi-model Forgetting
604 0 Generative Modeling of Infinite Occluded Objects for Compositional Scene Representation
605 0 Phase transition in PCA with missing data: Reduced signal-to-noise ratio, not sample size!
606 0 More Efficient Off-Policy Evaluation through Regularized Targeted Learning
607 0 A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs
608 0 Scalable Training of Inference Networks for Gaussian-Process Models
609 0 Submodular Cost Submodular Cover with an Approximate Oracle
610 0 Riemannian adaptive stochastic gradient algorithms on matrix manifolds
611 0 Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN
612 0 Training CNNs with Selective Allocation of Channels
613 0 Neural Inverse Knitting: From Images to Manufacturing Instructions
614 0 Discovering Latent Covariance Structures for Multiple Time Series
615 0 Transferability vs. Discriminability: Batch Spectral Penalization for Adversarial Domain Adaptation
616 0 Transferable Adversarial Training: A General Approach to Adapting Deep Classifiers
617 0 Adjustment Criteria for Generalizing Experimental Findings
618 0 Kernel Mean Matching for Content Addressability of GANs
619 0 Incorporating Grouping Information into Bayesian Decision Tree Ensembles
620 0 Towards Understanding Knowledge Distillation
621 0 New results on information theoretic clustering
622 0 Anomaly Detection With Multiple-Hypotheses Predictions
623 0 Trajectory-Based Off-Policy Deep Reinforcement Learning
624 0 LegoNet: Efficient Convolutional Neural Networks with Lego Filters
625 0 Lossless or Quantized Boosting with Integer Arithmetic
626 0 Variational Russian Roulette for Deep Bayesian Nonparametrics
627 0 Approximating Orthogonal Matrices with Effective Givens Factorization
628 0 Random Function Priors for Correlation Modeling
629 0 Learning Classifiers for Target Domain with Limited or No Labels
630 0 On the Computation and Communication Complexity of Parallel SGD with Dynamic Batch Sizes for Stochastic Non-Convex Optimization
631 0 Causal Discovery and Forecasting in Nonstationary Environments with State-Space Models
632 0 Composing Value Functions in Reinforcement Learning
633 0 DP-GP-LVM: A Bayesian Non-Parametric Model for Learning Multivariate Dependency Structures
634 0 Distributed Weighted Matching via Randomized Composable Coresets
635 0 Causal Identification under Markov Equivalence: Completeness Results
636 0 Context-Aware Zero-Shot Learning for Object Recognition
637 0 Dynamic Learning with Frequent New Product Launches: A Sequential Multinomial Logit Bandit Problem
638 0 DeepNose: Using artificial neural networks to represent the space of odorants
639 0 Data Poisoning Attacks in Multi-Party Learning
640 0 Screening rules for Lasso with non-convex Sparse Regularizers
641 0 Concentration Inequalities for Conditional Value at Risk
642 0 Characterizing Well-Behaved vs. Pathological Deep Neural Networks
643 0 Dynamic Measurement Scheduling for Event Forecasting using Deep RL
644 0 Taming MAML: Efficient unbiased meta-reinforcement learning
645 0 Online Learning to Rank with Features
646 0 A Wrapped Normal Distribution on Hyperbolic Space for Gradient-Based Learning
647 0 Compressed Factorization: Fast and Accurate Low-Rank Factorization of Compressively-Sensed Data
648 0 SELFIE: Refurbishing Unclean Samples for Robust Deep Learning
649 0 Learning Novel Policies For Tasks
650 0 End-to-End Probabilistic Inference for Nonstationary Audio Analysis
651 0 Trimming the $\ell_1$ Regularizer: Statistical Analysis, Optimization, and Applications to Deep Learning
652 0 Disentangling Disentanglement in Variational Autoencoders
653 0 Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning
654 0 Cognitive model priors for predicting human decisions
655 0 Overcoming Mean-Field Approximations in Recurrent Gaussian Process Models
656 0 A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization
657 0 Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging
658 0 Fast and Flexible Inference of Joint Distributions from their Marginals
659 0 Collective Model Fusion for Multiple Black-Box Experts
660 0 Correlated bandits or: How to minimize mean-squared error online
661 0 On discriminative learning of prediction uncertainty
662 0 A Multitask Multiple Kernel Learning Algorithm for Survival Analysis with Application to Cancer Biology
663 0 Asynchronous Batch Bayesian Optimisation with Improved Local Penalisation
664 0 ME-Net: Towards Effective Adversarial Robustness with Matrix Estimation
665 0 Learning with Bad Training Data via Iterative Trimmed Loss Minimization
666 0 Target Tracking for Contextual Bandits: Application to Demand Side Management
667 0 Efficient Amortised Bayesian Inference for Hierarchical and Nonlinear Dynamical Systems
668 0 Graph Convolutional Gaussian Processes
669 0 Exploiting Worker Correlation for Label Aggregation in Crowdsourcing
670 0 Self-similar Epochs: Value in arrangement
671 0 HyperGAN: A Generative Model for Diverse, Performant Neural Networks
672 0 A Personalized Affective Memory Model for Improving Emotion Recognition
673 0 Memory-Optimal Direct Convolutions for Maximizing Classification Accuracy in Embedded Applications
674 0 Poisson Subsampled Rényi Differential Privacy
675 0 Jumpout: Improved Dropout for Deep Neural Networks with ReLUs
676 0 Geometry Aware Convolutional Filters for Omnidirectional Images Representation
677 0 A Framework for Bayesian Optimization in Embedded Subspaces
678 0 Area Attention
679 0 The Implicit Fairness Criterion of Unconstrained Learning
680 0 Co-Representation Network for Generalized Zero-Shot Learning
681 0 Sublinear Space Private Algorithms Under the Sliding Window Model
682 0 Optimality Implies Kernel Sum Classifiers are Statistically Efficient
683 0 Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator
684 0 Shallow-Deep Networks: Understanding and Mitigating Network Overthinking
685 0 Neurally-Guided Structure Inference
686 0 An Optimal Private Stochastic-MAB Algorithm based on Optimal Private Stopping Rule
687 0 A Quantitative Analysis of the Effect of Batch Normalization on Gradient Descent
688 0 Active Manifolds: A non-linear analogue to Active Subspaces
689 0 Bayesian Counterfactual Risk Minimization
690 0 Compressing Gradient Optimizers via Count-Sketches
691 0 Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks
692 0 White-box vs Black-box: Bayes Optimal Strategies for Membership Inference
693 0 Leveraging Low-Rank Relations Between Surrogate Tasks in Structured Prediction
694 0 Decomposing feature-level variation with Covariate Gaussian Process Latent Variable Models
695 0 Sublinear Time Nearest Neighbor Search over Generalized Weighted Space
696 0 Bayesian leave-one-out cross-validation for large data
697 0 Formal Privacy for Functional Data with Gaussian Perturbations
698 0 Separable value functions across time-scales
699 0 Dirichlet Simplex Nest and Geometric Inference
700 0 Scalable Learning in Reproducing Kernel Krein Spaces
701 0 Heterogeneous Model Reuse via Optimizing Multiparty Multiclass Margin
702 0 HexaGAN: Generative Adversarial Nets for Real World Classification
703 0 Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces
704 0 On Dropout and Nuclear Norm Regularization
705 0 Phaseless PCA: Low-Rank Matrix Recovery from Column-wise Phaseless Measurements
706 0 Understanding and Controlling Memory in Recurrent Neural Networks
707 0 kernelPSI: a Post-Selection Inference Framework for Nonlinear Variable Selection
708 0 Improved Dynamic Graph Learning through Fault-Tolerant Sparsification
709 0 Non-Parametric Priors For Generative Adversarial Networks
710 0 Regularization in directable environments with application to Tetris
711 0 Imputing Missing Events in Continuous-Time Event Streams
712 0 Learning to Convolve: A Generalized Weight-Tying Approach
713 0 Large-Scale Sparse Kernel Canonical Correlation Analysis
714 0 Curvature-Exploiting Acceleration of Elastic Net Computations
715 0 Doubly-Competitive Distribution Estimation
716 0 AUCµ: A Performance Metric for Multi-Class Machine Learning Models
717 0 Neural Joint Source-Channel Coding
718 0 Flat Metric Minimization with Applications in Generative Modeling
719 0 Weakly-Supervised Temporal Localization via Occurrence Count Learning
720 0 Rehashing Kernel Evaluation in High Dimensions
721 0 Learning to Collaborate in Markov Decision Processes
722 0 Dual Entangled Polynomial Code: Three-Dimensional Coding for Distributed Matrix Multiplication
723 0 A Persistent Weisfeiler–Lehman Procedure for Graph Classification
724 0 Neural Logic Reinforcement Learning
725 0 Revisiting precision recall definition for generative modeling
726 0 Acceleration of SVRG and Katyusha X by Inexact Preconditioning
727 0 Look Ma, No Latent Variables: Accurate Cutset Networks via Compilation
728 0 Bayesian Deconditional Kernel Mean Embeddings
729 0 Optimistic Policy Optimization via Multiple Importance Sampling
730 0 Multivariate-Information Adversarial Ensemble for Scalable Joint Distribution Matching
731 0 Learning Hawkes Processes Under Synchronization Noise
732 0 Automatic Classifiers as Scientific Instruments: One Step Further Away from Ground-Truth
733 0 Blended Conditional Gradients
734 0 Boosted Density Estimation Remastered
735 0 Distributional Reinforcement Learning for Efficient Exploration
736 0 Generalized Approximate Survey Propagation for High-Dimensional Estimation
737 0 Projection onto Minkowski Sums with Application to Constrained Learning
738 0 Revisiting the Softmax Bellman Operator: New Benefits and New Perspective
739 0 Voronoi Boundary Classification: A High-Dimensional Geometric Approach via Weighted Monte Carlo Integration
740 0 PROVEN: Verifying Robustness of Neural Networks with a Probabilistic Approach
741 0 Uniform Convergence Rate of the Kernel Density Estimator Adaptive to Intrinsic Volume Dimension
742 0 Circuit-GNN: Graph Neural Networks for Distributed Circuit Design
743 0 Particle Flow Bayes’ Rule
744 0 Multiplicative Weights Updates as a distributed constrained optimization algorithm: Convergence to second-order stationary points almost always
745 0 Generalized No Free Lunch Theorem for Adversarial Robustness
746 0 Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation
747 0 Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations
748 0 Shape Constraints for Set Functions
749 0 Optimal Continuous DR-Submodular Maximization and Applications to Provable Mean Field Inference
750 0 Sparse Extreme Multi-label Learning with Oracle Property
751 0 Stein Point Markov Chain Monte Carlo
752 0 Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates
753 0 Graph Neural Network for Music Score Data and Modeling Expressive Piano Performance
754 0 Policy Consolidation for Continual Reinforcement Learning
755 0 POPQORN: Quantifying Robustness of Recurrent Neural Networks
756 0 Multi-Agent Adversarial Inverse Reinforcement Learning
757 0 Amortized Monte Carlo Integration
758 0 LR-GLM: High-Dimensional Bayesian Inference Using Low-Rank Data Approximations
759 0 PAC Learnability of Node Functions in Networked Dynamical Systems
760 0 TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning
761 0 Adversarial camera stickers: A physical camera-based attack on deep learning systems
762 0 Composing Entropic Policies using Divergence Correction
763 0 TapNet: Neural Network Augmented with Task-Adaptive Projection for Few-Shot Learning
764 0 Improving Model Selection by Employing the Test Data
765 0 Understanding MCMC Dynamics as Flows on the Wasserstein Space
766 0 On Certifying Non-Uniform Bounds against Adversarial Attacks
767 0 Moment-Based Variational Inference for Markov Jump Processes
768 0 Calibrated Approximate Bayesian Inference
769 0 Feature Grouping as a Stochastic Regularizer for High-Dimensional Structured Data
770 0 Game Theoretic Optimization via Gradient-based Nikaido-Isoda Function
771 0 Refined Complexity of PCA with Outliers
772 0 Regret Circuits: Composability of Regret Minimizers