| Rank | Cited by | Paper name |
|---|---|---|
| 0 | 210 | Glow: Generative Flow with Invertible 1×1 Convolutions |
| 1 | 186 | Are GANs Created Equal? A Large-Scale Study |
| 2 | 180 | Neural Ordinary Differential Equations |
| 3 | 176 | Visualizing the Loss Landscape of Neural Nets |
| 4 | 123 | How Does Batch Normalization Help Optimization? |
| 5 | 114 | Isolating Sources of Disentanglement in Variational Autoencoders |
| 6 | 110 | Video-to-Video Synthesis |
| 7 | 98 | Natasha 2: Faster Non-Convex Optimization Than SGD |
| 8 | 95 | PointCNN: Convolution On X-Transformed Points |
| 9 | 93 | Adversarially Robust Generalization Requires More Data |
| 10 | 84 | Realistic Evaluation of Deep Semi-Supervised Learning Algorithms |
| 11 | 81 | Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data |
| 12 | 77 | Scaling provable adversarial defenses |
| 13 | 76 | Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis |
| 14 | 74 | Derivative Estimation in Random Design |
| 15 | 73 | Neural Tangent Kernel: Convergence and Generalization in Neural Networks |
| 16 | 70 | An intriguing failing of convolutional neural networks and the CoordConv solution |
| 17 | 70 | Neural Architecture Optimization |
| 18 | 70 | Data-Efficient Hierarchical Reinforcement Learning |
| 19 | 69 | Sanity Checks for Saliency Maps |
| 20 | 68 | Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents |
| 21 | 67 | Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models |
| 22 | 65 | Neural Architecture Search with Bayesian Optimisation and Optimal Transport |
| 23 | 62 | TADAM: Task dependent adaptive metric for improved few-shot learning |
| 24 | 58 | Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels |
| 25 | 57 | Probabilistic Model-Agnostic Meta-Learning |
| 26 | 57 | On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport |
| 27 | 56 | Searching for Efficient Multi-Scale Architectures for Dense Image Prediction |
| 28 | 56 | SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator |
| 29 | 55 | Playing hard exploration games by watching YouTube |
| 30 | 55 | Recurrent World Models Facilitate Policy Evolution |
| 31 | 55 | Conditional Adversarial Domain Adaptation |
| 32 | 54 | CatBoost: unbiased boosting with categorical features |
| 33 | 54 | Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation |
| 34 | 53 | Co-teaching: Robust training of deep neural networks with extremely noisy labels |
| 35 | 52 | Neural Voice Cloning with a Few Samples |
| 36 | 52 | Adversarial vulnerability for any classifier |
| 37 | 51 | Hierarchical Graph Representation Learning with Differentiable Pooling |
| 38 | 50 | Gradient Sparsification for Communication-Efficient Distributed Optimization |
| 39 | 49 | Stochastic Cubic Regularization for Fast Nonconvex Optimization |
| 40 | 48 | Bilinear Attention Networks |
| 41 | 47 | SNIPER: Efficient Multi-Scale Training |
| 42 | 46 | NEON2: Finding Local Minima via First-Order Oracles |
| 43 | 44 | First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time |
| 44 | 43 | DropBlock: A regularization method for convolutional networks |
| 45 | 43 | Visual Reinforcement Learning with Imagined Goals |
| 46 | 42 | Empirical Risk Minimization Under Fairness Constraints |
| 47 | 41 | Link Prediction Based on Graph Neural Networks |
| 48 | 41 | Learning to Navigate in Cities Without a Map |
| 49 | 41 | PacGAN: The power of two samples in generative adversarial networks |
| 50 | 41 | Gradient Descent for Spiking Neural Networks |
| 51 | 40 | Implicit Bias of Gradient Descent on Linear Convolutional Networks |
| 52 | 40 | Learning to Infer Graphics Programs from Hand-Drawn Images |
| 53 | 40 | Is Q-Learning Provably Efficient? |
| 54 | 38 | Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks |
| 55 | 38 | Meta-Reinforcement Learning of Structured Exploration Strategies |
| 56 | 37 | A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks |
| 57 | 37 | Pelee: A Real-Time Object Detection System on Mobile Devices |
| 58 | 36 | Understanding Batch Normalization |
| 59 | 36 | Unsupervised Text Style Transfer using Language Models as Discriminators |
| 60 | 36 | DeepProbLog: Neural Probabilistic Logic Programming |
| 61 | 36 | Recurrent Relational Networks |
| 62 | 35 | Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures |
| 63 | 35 | Predictive Uncertainty Estimation via Prior Networks |
| 64 | 35 | Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks |
| 65 | 35 | Why Is My Classifier Discriminatory? |
| 66 | 35 | Non-Local Recurrent Network for Image Restoration |
| 67 | 35 | Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding |
| 68 | 34 | Generalisation in humans and deep neural networks |
| 69 | 34 | Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator |
| 70 | 34 | Discrimination-aware Channel Pruning for Deep Neural Networks |
| 71 | 34 | Long short-term memory and Learning-to-learn in networks of spiking neurons |
| 72 | 34 | Implicit Reparameterization Gradients |
| 73 | 33 | Joint Autoregressive and Hierarchical Priors for Learned Image Compression |
| 74 | 33 | Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise |
| 75 | 33 | Efficient Neural Network Robustness Certification with General Activation Functions |
| 76 | 33 | Multi-Task Learning as Multi-Objective Optimization |
| 77 | 32 | Constrained Graph Variational Autoencoders for Molecule Design |
| 78 | 32 | A Probabilistic U-Net for Segmentation of Ambiguous Images |
| 79 | 32 | Assessing Generative Models via Precision and Recall |
| 80 | 31 | Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs |
| 81 | 31 | Randomized Prior Functions for Deep Reinforcement Learning |
| 82 | 31 | LF-Net: Learning Local Features from Images |
| 83 | 31 | Adversarial Examples that Fool both Computer Vision and Time-Limited Humans |
| 84 | 31 | Meta-Gradient Reinforcement Learning |
| 85 | 31 | Image-to-image translation for cross-domain disentanglement |
| 86 | 31 | Large Margin Deep Networks for Classification |
| 87 | 30 | Semidefinite relaxations for certifying robustness to adversarial examples |
| 88 | 30 | Reinforcement Learning for Solving the Vehicle Routing Problem |
| 89 | 30 | Evolved Policy Gradients |
| 90 | 30 | Byzantine Stochastic Gradient Descent |
| 91 | 30 | Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization |
| 92 | 29 | A Unified View of Piecewise Linear Neural Network Verification |
| 93 | 29 | Sparsified SGD with Memory |
| 94 | 29 | Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization |
| 95 | 29 | Tree-to-tree Neural Networks for Program Translation |
| 96 | 28 | Unsupervised Attention-guided Image-to-Image Translation |
| 97 | 28 | Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning |
| 98 | 27 | 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data |
| 99 | 27 | The challenge of realistic music generation: modelling raw audio at scale |
| 100 | 27 | Speaker-Follower Models for Vision-and-Language Navigation |
| 101 | 27 | Entropy and mutual information in models of deep neural networks |
| 102 | 27 | FRAGE: Frequency-Agnostic Word Representation |
| 103 | 26 | Fast and Effective Robustness Certification |
| 104 | 26 | Flexible neural representation for physics prediction |
| 105 | 26 | Does mitigating ML’s impact disparity require treatment disparity? |
| 106 | 26 | Verifiable Reinforcement Learning via Policy Extraction |
| 107 | 25 | Balanced Policy Evaluation and Learning |
| 108 | 25 | Reinforcement Learning of Theorem Proving |
| 109 | 25 | Learning Plannable Representations with Causal InfoGAN |
| 110 | 25 | A Lyapunov-based Approach to Safe Reinforcement Learning |
| 111 | 25 | Neural Arithmetic Logic Units |
| 112 | 25 | Training Deep Neural Networks with 8-bit Floating Point Numbers |
| 113 | 25 | Relational recurrent neural networks |
| 114 | 25 | ResNet with one-neuron hidden layers is a Universal Approximator |
| 115 | 25 | Optimal Algorithms for Non-Smooth Distributed Optimization in Networks |
| 116 | 25 | How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD |
| 117 | 25 | Which Neural Net Architectures Give Rise to Exploding and Vanishing Gradients? |
| 118 | 25 | Learning to Decompose and Disentangle Representations for Video Prediction |
| 119 | 24 | Deep State Space Models for Time Series Forecasting |
| 120 | 24 | Towards Robust Interpretability with Self-Explaining Neural Networks |
| 121 | 24 | Learning Attentional Communication for Multi-Agent Cooperation |
| 122 | 24 | The Convergence of Sparsified Gradient Methods |
| 123 | 24 | Task-Driven Convolutional Recurrent Models of the Visual System |
| 124 | 24 | SimplE Embedding for Link Prediction in Knowledge Graphs |
| 125 | 24 | How to Start Training: The Effect of Initialization and Architecture |
| 126 | 24 | IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis |
| 127 | 23 | Bayesian Model-Agnostic Meta-Learning |
| 128 | 23 | Memory Replay GANs: Learning to Generate New Categories without Forgetting |
| 129 | 23 | LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning |
| 130 | 23 | Hessian-based Analysis of Large Batch Training and Robustness to Adversaries |
| 131 | 23 | Online Learning with an Unknown Fairness Metric |
| 132 | 23 | Neural Nearest Neighbors Networks |
| 133 | 22 | ATOMO: Communication-efficient Learning via Atomic Sparsification |
| 134 | 22 | GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration |
| 135 | 22 | End-to-End Differentiable Physics for Learning and Control |
| 136 | 22 | Efficient Formal Safety Analysis of Neural Networks |
| 137 | 22 | Probabilistic Matrix Factorization for Automated Machine Learning |
| 138 | 22 | Re-evaluating evaluation |
| 139 | 22 | Delta-encoder: an effective sample synthesis method for few-shot object recognition |
| 140 | 22 | GIANT: Globally Improved Approximate Newton Method for Distributed Optimization |
| 141 | 22 | Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate |
| 142 | 22 | Neighbourhood Consensus Networks |
| 143 | 22 | Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search |
| 144 | 21 | Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks |
| 145 | 21 | Insights on representational similarity in neural networks with canonical correlation |
| 146 | 21 | Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation |
| 147 | 21 | Direct Runge-Kutta Discretization Achieves Acceleration |
| 148 | 21 | SLAYER: Spike Layer Error Reassignment in Time |
| 149 | 20 | Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization |
| 150 | 20 | Phase Retrieval Under a Generative Prior |
| 151 | 20 | Differentiable MPC for End-to-end Planning and Control |
| 152 | 20 | On gradient regularizers for MMD GANs |
| 153 | 20 | To Trust Or Not To Trust A Classifier |
| 154 | 20 | Fairness Through Computationally-Bounded Awareness |
| 155 | 20 | Learning to Optimize Tensor Programs |
| 156 | 20 | Evidential Deep Learning to Quantify Classification Uncertainty |
| 157 | 20 | Moonshine: Distilling with Cheap Convolutions |
| 158 | 20 | A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem |
| 159 | 20 | Deep Attentive Tracking via Reciprocative Learning |
| 160 | 20 | A^2-Nets: Double Attention Networks |
| 161 | 19 | Clebsch–Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network |
| 162 | 19 | Latent Alignment and Variational Attention |
| 163 | 19 | Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects |
| 164 | 19 | Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion |
| 165 | 19 | Reward learning from human preferences and demonstrations in Atari |
| 166 | 19 | Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces |
| 167 | 19 | Banach Wasserstein GAN |
| 168 | 19 | Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching |
| 169 | 19 | Amortized Inference Regularization |
| 170 | 19 | MetaGAN: An Adversarial Approach to Few-Shot Learning |
| 171 | 19 | Reinforced Continual Learning |
| 172 | 19 | Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives |
| 173 | 18 | Theoretical Linear Convergence of Unfolded ISTA and Its Practical Weights and Thresholds |
| 174 | 18 | Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization |
| 175 | 18 | Constructing Unrestricted Adversarial Examples with Generative Models |
| 176 | 18 | Hybrid Macro/Micro Level Backpropagation for Training Deep Spiking Neural Networks |
| 177 | 18 | Dimensionally Tight Bounds for Second-Order Hamiltonian Monte Carlo |
| 178 | 18 | Spectral Filtering for General Linear Dynamical Systems |
| 179 | 18 | Adaptive Sampling Towards Fast Graph Representation Learning |
| 180 | 18 | On the Dimensionality of Word Embedding |
| 181 | 18 | Are ResNets Provably Better than Linear Predictors? |
| 182 | 18 | Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced |
| 183 | 17 | Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies |
| 184 | 17 | Adaptive Methods for Nonconvex Optimization |
| 185 | 17 | Learning to Play With Intrinsically-Motivated, Self-Aware Agents |
| 186 | 17 | Communication Compression for Decentralized Training |
| 187 | 17 | Masking: A New Perspective of Noisy Supervision |
| 188 | 17 | Hyperbolic Neural Networks |
| 189 | 17 | Faster Neural Networks Straight from JPEG |
| 190 | 17 | Online Structured Laplace Approximations for Overcoming Catastrophic Forgetting |
| 191 | 17 | Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization |
| 192 | 17 | Actor-Critic Policy Optimization in Partially Observable Multiagent Environments |
| 193 | 17 | The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal |
| 194 | 17 | Norm matters: efficient and accurate normalization schemes in deep networks |
| 195 | 17 | Generalized Zero-Shot Learning with Deep Calibration Network |
| 196 | 17 | FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification |
| 197 | 16 | Deep Generative Models with Learnable Knowledge Constraints |
| 198 | 16 | Deepcode: Feedback Codes via Deep Learning |
| 199 | 16 | The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization |
| 200 | 16 | Watch Your Step: Learning Node Embeddings via Graph Attention |
| 201 | 16 | FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network |
| 202 | 16 | Adversarial Multiple Source Domain Adaptation |
| 203 | 16 | Bayesian Control of Large MDPs with Unknown Dynamics in Data-Poor Environments |
| 204 | 16 | Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples |
| 205 | 16 | Multi-Agent Generative Adversarial Imitation Learning |
| 206 | 16 | A Bayes-Sard Cubature Method |
| 207 | 16 | Generalizing to Unseen Domains via Adversarial Data Augmentation |
| 208 | 16 | On Learning Intrinsic Rewards for Policy Gradient Methods |
| 209 | 16 | Towards Robust Detection of Adversarial Examples |
| 210 | 16 | Adding One Neuron Can Eliminate All Bad Local Minima |
| 211 | 16 | Unsupervised Learning of Shape and Pose with Differentiable Point Clouds |
| 212 | 16 | Representation Balancing MDPs for Off-policy Policy Evaluation |
| 213 | 16 | A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation |
| 214 | 16 | A Linear Speedup Analysis of Distributed Deep Learning with Sparse and Quantized Communication |
| 215 | 16 | Embedding Logical Queries on Knowledge Graphs |
| 216 | 16 | Learning Deep Disentangled Embeddings With the F-Statistic Loss |
| 217 | 15 | Mesh-TensorFlow: Deep Learning for Supercomputers |
| 218 | 15 | A Stein variational Newton method |
| 219 | 15 | Learning Conditioned Graph Structures for Interpretable Visual Question Answering |
| 220 | 15 | cpSGD: Communication-efficient and differentially-private distributed SGD |
| 221 | 15 | Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction |
| 222 | 15 | Constrained Generation of Semantically Valid Graphs via Regularizing Variational Autoencoders |
| 223 | 15 | On the Convergence and Robustness of Training GANs with Regularized Optimal Transport |
| 224 | 15 | Multimodal Generative Models for Scalable Weakly-Supervised Learning |
| 225 | 15 | RetGK: Graph Kernels based on Return Probabilities of Random Walks |
| 226 | 15 | Multi-Layered Gradient Boosting Decision Trees |
| 227 | 15 | Domain-Invariant Projection Learning for Zero-Shot Recognition |
| 228 | 15 | Soft-Gated Warping-GAN for Pose-Guided Person Image Synthesis |
| 229 | 15 | MetaAnchor: Learning to Detect Objects with Customized Anchors |
| 230 | 15 | Visual Object Networks: Image Generation with Disentangled 3D Representations |
| 231 | 14 | Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters |
| 232 | 14 | Robust Learning of Fixed-Structure Bayesian Networks |
| 233 | 14 | Generalizing Point Embeddings using the Wasserstein Space of Elliptical Distributions |
| 234 | 14 | Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing |
| 235 | 14 | Robot Learning in Homes: Improving Generalization and Reducing Dataset Bias |
| 236 | 14 | Dendritic cortical microcircuits approximate the backpropagation algorithm |
| 237 | 14 | Spectral Signatures in Backdoor Attacks |
| 238 | 14 | VideoCapsuleNet: A Simplified Network for Action Detection |
| 239 | 14 | Simple, Distributed, and Accelerated Probabilistic Programming |
| 240 | 14 | Learning towards Minimum Hyperspherical Energy |
| 241 | 14 | On GANs and GMMs |
| 242 | 14 | Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model |
| 243 | 14 | Scalable methods for 8-bit training of neural networks |
| 244 | 14 | Adversarial Text Generation via Feature-Mover’s Distance |
| 245 | 14 | Importance Weighting and Variational Inference |
| 246 | 14 | Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering |
| 247 | 14 | Learning to Reconstruct Shapes from Unseen Classes |
| 248 | 14 | Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation |
| 249 | 14 | Where Do You Think You’re Going?: Inferring Beliefs about Dynamics from Behavior |
| 250 | 14 | Fairness Behind a Veil of Ignorance: A Welfare Analysis for Automated Decision Making |
| 251 | 14 | FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction |
| 252 | 13 | Co-regularized Alignment for Unsupervised Domain Adaptation |
| 253 | 13 | RenderNet: A deep convolutional network for differentiable rendering from 3D shapes |
| 254 | 13 | Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization |
| 255 | 13 | Distributed Multi-Player Bandits – a Game of Thrones Approach |
| 256 | 13 | Learning to Teach with Dynamic Loss Functions |
| 257 | 13 | Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance |
| 258 | 13 | Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization |
| 259 | 13 | Minimax Statistical Learning with Wasserstein distances |
| 260 | 13 | Non-monotone Submodular Maximization in Exponentially Fewer Iterations |
| 261 | 13 | Evolution-Guided Policy Gradient in Reinforcement Learning |
| 262 | 13 | Empirical Risk Minimization in Non-interactive Local Differential Privacy Revisited |
| 263 | 13 | Self-Erasing Network for Integral Object Attention |
| 264 | 13 | PAC-learning in the presence of adversaries |
| 265 | 12 | Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound |
| 266 | 12 | Neural Proximal Gradient Descent for Compressive Imaging |
| 267 | 12 | Confounding-Robust Policy Improvement |
| 268 | 12 | Reducing Network Agnostophobia |
| 269 | 12 | DeepPINK: reproducible feature selection in deep neural networks |
| 270 | 12 | Data-dependent PAC-Bayes priors via differential privacy |
| 271 | 12 | Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding |
| 272 | 12 | Knowledge Distillation by On-the-Fly Native Ensemble |
| 273 | 12 | Dual Policy Iteration |
| 274 | 12 | Differentially Private Testing of Identity and Closeness of Discrete Distributions |
| 275 | 12 | On Fast Leverage Score Sampling and Optimal Learning |
| 276 | 12 | How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery? |
| 277 | 12 | A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization |
| 278 | 12 | COLA: Decentralized Linear Learning |
| 279 | 12 | A theory on the absence of spurious solutions for nonconvex and nonsmooth optimization |
| 280 | 12 | DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors |
| 281 | 12 | Simple random search of static linear policies is competitive for reinforcement learning |
| 282 | 12 | Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization |
| 283 | 12 | Overcoming Language Priors in Visual Question Answering with Adversarial Regularization |
| 284 | 12 | KDGAN: Knowledge Distillation with Generative Adversarial Networks |
| 285 | 12 | Do Less, Get More: Streaming Submodular Maximization with Subsampling |
| 286 | 12 | Learning Disentangled Joint Continuous and Discrete Representations |
| 287 | 12 | Dialog-based Interactive Image Retrieval |
| 288 | 11 | Context-aware Synthesis and Placement of Object Instances |
| 289 | 11 | Adversarial Risk and Robustness: General Definitions and Implications for the Uniform Distribution |
| 290 | 11 | Learning with SGD and Random Features |
| 291 | 11 | A Retrieve-and-Edit Framework for Predicting Structured Outputs |
| 292 | 11 | Deep Dynamical Modeling and Control of Unsteady Fluid Flows |
| 293 | 11 | Group Equivariant Capsule Networks |
| 294 | 11 | Adversarial Regularizers in Inverse Problems |
| 295 | 11 | Depth-Limited Solving for Imperfect-Information Games |
| 296 | 11 | Online Adaptive Methods, Universality and Acceleration |
| 297 | 11 | Privacy Amplification by Subsampling: Tight Analyses via Couplings and Divergences |
| 298 | 11 | Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming |
| 299 | 11 | Unsupervised Video Object Segmentation for Deep Reinforcement Learning |
| 300 | 11 | Fast Greedy MAP Inference for Determinantal Point Process to Improve Recommendation Diversity |
| 301 | 11 | Can We Gain More from Orthogonality Regularizations in Training Deep Networks? |
| 302 | 11 | Unsupervised Learning of Object Landmarks through Conditional Image Generation |
| 303 | 11 | Data center cooling using model-predictive control |
| 304 | 11 | Adversarial Attacks on Stochastic Bandits |
| 305 | 11 | Multivariate Convolutional Sparse Coding for Electromagnetic Brain Signals |
| 306 | 11 | Deep Reinforcement Learning of Marked Temporal Point Processes |
| 307 | 11 | One-Shot Unsupervised Cross Domain Translation |
| 308 | 11 | Distilled Wasserstein Learning for Word Embedding and Topic Modeling |
| 309 | 11 | Deep Defense: Training DNNs with Improved Adversarial Robustness |
| 310 | 11 | Sparse DNNs with Improved Adversarial Robustness |
| 311 | 11 | Learning long-range spatial dependencies with horizontal gated recurrent units |
| 312 | 10 | The Price of Fair PCA: One Extra dimension |
| 313 | 10 | Domain Adaptation by Using Causal Inference to Predict Invariant Conditional Distributions |
| 314 | 10 | Human-in-the-Loop Interpretability Prior |
| 315 | 10 | Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation |
| 316 | 10 | Large Scale computation of Means and Clusters for Persistence Diagrams using Optimal Transport |
| 317 | 10 | Deep Anomaly Detection Using Geometric Transformations |
| 318 | 10 | Hardware Conditioned Policies for Multi-Robot Transfer Learning |
| 319 | 10 | Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes |
| 320 | 10 | Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation |
| 321 | 10 | Plug-in Estimation in High-Dimensional Linear Inverse Problems: A Rigorous Analysis |
| 322 | 10 | Chaining Mutual Information and Tightening Generalization Bounds |
| 323 | 10 | Causal Inference with Noisy and Missing Covariates via Matrix Factorization |
| 324 | 10 | Scalable Hyperparameter Transfer Learning |
| 325 | 10 | Generative Probabilistic Novelty Detection with Adversarial Autoencoders |
| 326 | 10 | BRITS: Bidirectional Recurrent Imputation for Time Series |
| 327 | 10 | Compact Generalized Non-local Network |
| 328 | 10 | Recurrent Transformer Networks for Semantic Correspondence |
| 329 | 10 | A Dual Framework for Low-rank Tensor Completion |
| 330 | 10 | Policy Optimization via Importance Sampling |
| 331 | 10 | Boolean Decision Rules via Column Generation |
| 332 | 10 | MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare |
| 333 | 10 | Leveraging the Exact Likelihood of Deep Latent Variable Models |
| 334 | 10 | The committee machine: Computational to statistical gaps in learning a two-layers neural network |
| 335 | 10 | Paraphrasing Complex Network: Network Compression via Factor Transfer |
| 336 | 10 | Learning Hierarchical Semantic Image Manipulation through Structured Representations |
| 337 | 10 | Leveraged volume sampling for linear regression |
| 338 | 10 | 3D-Aware Scene Manipulation via Inverse Graphics |
| 339 | 10 | On Oracle-Efficient PAC RL with Rich Observations |
| 340 | 10 | Adaptive Online Learning in Dynamic Environments |
| 341 | 10 | Posterior Concentration for Sparse Deep Learning |
| 342 | 10 | Deep Neural Nets with Interpolating Function as Output Activation |
| 343 | 10 | Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning |
| 344 | 9 | Learning to Share and Hide Intentions using Information Regularization |
| 345 | 9 | Deep Predictive Coding Network with Local Recurrent Processing for Object Recognition |
| 346 | 9 | Reversible Recurrent Neural Networks |
| 347 | 9 | Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition |
| 348 | 9 | Transfer Learning with Neural AutoML |
| 349 | 9 | Inference in Deep Gaussian Processes using Stochastic Gradient Hamiltonian Monte Carlo |
| 350 | 9 | Training Neural Networks Using Features Replay |
| 351 | 9 | On Coresets for Logistic Regression |
| 352 | 9 | Multi-View Silhouette and Depth Decomposition for High Resolution 3D Object Representation |
| 353 | 9 | Deep Generative Models for Distribution-Preserving Lossy Compression |
| 354 | 9 | Learning Task Specifications from Demonstrations |
| 355 | 9 | Deep Generative Markov State Models |
| 356 | 9 | TopRank: A practical algorithm for online stochastic ranking |
| 357 | 9 | Escaping Saddle Points in Constrained Optimization |
| 358 | 9 | Zeroth-order (Non)-Convex Stochastic Optimization via Conditional Gradient and Gradient Updates |
| 359 | 9 | Nearly tight sample complexity bounds for learning mixtures of Gaussians via sample compression schemes |
| 360 | 9 | Multivariate Time Series Imputation with Generative Adversarial Networks |
| 361 | 9 | Toddler-Inspired Visual Object Learning |
| 362 | 9 | Image Inpainting via Generative Multi-column Convolutional Neural Networks |
| 363 | 8 | Learning Temporal Point Processes via Reinforcement Learning |
| 364 | 8 | With Friends Like These, Who Needs Adversaries? |
| 365 | 8 | Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems |
| 366 | 8 | Learning Abstract Options |
| 367 | 8 | Improving Simple Models with Confidence Profiles |
| 368 | 8 | Robustness of conditional GANs to noisy labels |
| 369 | 8 | Blockwise Parallel Decoding for Deep Autoregressive Models |
| 370 | 8 | Persistence Fisher Kernel: A Riemannian Manifold Kernel for Persistence Diagrams |
| 371 | 8 | Maximizing acquisition functions for Bayesian optimization |
| 372 | 8 | Global Non-convex Optimization with Discretized Diffusions |
| 373 | 8 | Towards Understanding Learning Representations: To What Extent Do Different Neural Networks Learn the Same Representation |
| 374 | 8 | Beyond Grids: Learning Graph Representations for Visual Recognition |
| 375 | 8 | Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data |
| 376 | 8 | Efficient High Dimensional Bayesian Optimization with Additivity and Quadrature Fourier Features |
| 377 | 8 | Online Learning of Quantum States |
| 378 | 8 | Automatic differentiation in ML: Where we are and where we should be going |
| 379 | 8 | Generalisation of structural knowledge in the hippocampal-entorhinal system |
| 380 | 8 | Hamiltonian Variational Auto-Encoder |
| 381 | 8 | Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training |
| 382 | 8 | Approximate Knowledge Compilation by Online Collapsed Importance Sampling |
| 383 | 8 | Beyond Log-concavity: Provable Guarantees for Sampling Multi-modal Distributions using Simulated Tempering Langevin Monte Carlo |
| 384 | 8 | Distributed k-Clustering for Data with Heavy Noise |
| 385 | 8 | Learning Libraries of Subroutines for Neurally–Guided Bayesian Program Induction |
| 386 | 8 | Learning Loop Invariants for Program Verification |
| 387 | 8 | Towards Text Generation with Adversarially Learned Neural Outlines |
| 388 | 8 | Out-of-Distribution Detection using Multiple Semantic Label Representations |
| 389 | 8 | Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks |
| 390 | 8 | M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search |
| 391 | 8 | Incorporating Context into Language Encoding Models for fMRI |
| 392 | 8 | Approximating Real-Time Recurrent Learning with Random Kronecker Factors |
| 393 | 8 | Turbo Learning for CaptionBot and DrawingBot |
| 394 | 8 | L4: Practical loss-based stepsize adaptation for deep learning |
| 395 | 8 | Online convex optimization for cumulative constraints |
| 396 | 8 | Stacked Semantics-Guided Attention Model for Fine-Grained Zero-Shot Learning |
| 397 | 8 | CapProNet: Deep Feature Learning via Orthogonal Projections onto Capsule Subspaces |
| 398 | 8 | Content preserving text generation with attribute controls |
| 399 | 8 | On the Local Minima of the Empirical Risk |
| 400 | 8 | End-to-end Symmetry Preserving Inter-atomic Potential Energy Model for Finite and Extended Systems |
| 401 | 8 | Mean-field theory of graph neural networks in graph partitioning |
| 402 | 8 | Differentially Private Uniformly Most Powerful Tests for Binomial Data |
| 403 | 8 | Heterogeneous Bitwidth Binarization in Convolutional Neural Networks |
| 404 | 8 | Acceleration through Optimistic No-Regret Dynamics |
| 405 | 8 | Bayesian Inference of Temporal Task Specifications from Demonstrations |
| 406 | 8 | BinGAN: Learning Compact Binary Descriptors with a Regularized GAN |
| 407 | 8 | Neural Code Comprehension: A Learnable Representation of Code Semantics |
| 408 | 8 | Inequity aversion improves cooperation in intertemporal social dilemmas |
| 409 | 8 | Batch-Instance Normalization for Adaptively Style-Invariant Neural Networks |
| 410 | 8 | Local Differential Privacy for Evolving Data |
| 411 | 8 | Attention in Convolutional LSTM for Gesture Recognition |
| 412 | 8 | Symbolic Graph Reasoning Meets Convolutions |
| 413 | 8 | Collaborative Learning for Deep Neural Networks |
| 414 | 8 | Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners |
| 415 | 8 | Global Geometry of Multichannel Sparse Blind Deconvolution on the Sphere |
| 416 | 8 | MetaReg: Towards Domain Generalization using Meta-Regularization |
| 417 | 8 | Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks |
| 418 | 8 | LinkNet: Relational Embedding for Scene Graph |
| 419 | 8 | Nonlocal Neural Networks, Nonlocal Diffusion and Nonlocal Modeling |
| 420 | 8 | Deep Functional Dictionaries: Learning Consistent Semantic Structures on 3D Models from Functions |
| 421 | 8 | Self-Supervised Generation of Spatial Audio for 360° Video |
| 422 | 8 | See and Think: Disentangling Semantic Scene Completion |
| 423 | 8 | Geometrically Coupled Monte Carlo Sampling |
| 424 | 7 | Understanding Regularized Spectral Clustering via Graph Conductance |
| 425 | 7 | Connecting Optimization and Regularization Paths |
| 426 | 7 | Nonparametric Density Estimation under Adversarial Losses |
| 427 | 7 | Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming |
| 428 | 7 | Generalization Bounds for Uniformly Stable Algorithms |
| 429 | 7 | Towards Deep Conversational Recommendations |
| 430 | 7 | Ex ante coordination and collusion in zero-sum multi-player extensive-form games |
| 431 | 7 | Optimal Algorithms for Continuous Non-monotone Submodular and DR-Submodular Maximization |
| 432 | 7 | Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis |
| 433 | 7 | DAGs with NO TEARS: Continuous Optimization for Structure Learning |
| 434 | 7 | Quadrature-based features for kernel approximation |
| 435 | 7 | Differential Privacy for Growing Databases |
| 436 | 7 | HOUDINI: Lifelong Learning as Program Synthesis |
| 437 | 7 | A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks |
| 438 | 7 | How SGD Selects the Global Minima in Over-parameterized Learning: A Dynamical Stability Perspective |
| 439 | 7 | Robust Hypothesis Testing Using Wasserstein Uncertainty Sets |
| 440 | 7 | Streaming Kernel PCA with $\tilde{O}(\sqrt{n})$ Random Features |
| 441 | 7 | Learning Latent Subspaces in Variational Autoencoders |
| 442 | 7 | Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization |
| 443 | 7 | Information Constraints on Auto-Encoding Variational Bayes |
| 444 | 7 | Dual Swap Disentangling |
445 |
7 |
A Convex Duality Framework for GANs |
446 |
7 |
ChannelNets: Compact and Efficient Convolutional Neural Networks via Channel-Wise Convolutions |
447 |
7 |
Neural Networks Trained to Solve Differential Equations Learn General Representations |
448 |
7 |
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning |
449 |
7 |
But How Does It Work in Theory? Linear SVM with Random Features |
450 |
7 |
Faithful Inversion of Generative Models for Effective Amortized Inference |
451 |
7 |
Weakly Supervised Dense Event Captioning in Videos |
452 |
7 |
Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes |
453 |
7 |
Wasserstein Variational Inference |
454 |
7 |
BourGAN: Generative Networks with Metric Embeddings |
455 |
7 |
The Description Length of Deep Learning models |
456 |
7 |
Trajectory Convolution for Action Recognition |
457 |
7 |
Distributed Stochastic Optimization via Adaptive SGD |
458 |
7 |
Bayesian Semi-supervised Learning with Graph Gaussian Processes |
459 |
7 |
Multi-Class Learning: From Theory to Algorithm |
460 |
7 |
Hybrid Knowledge Routed Modules for Large-scale Object Detection |
461 |
7 |
A Game-Theoretic Approach to Recommendation Systems with Strategic Content Providers |
462 |
7 |
Greedy Hash: Towards Fast Optimization for Accurate Hash Coding in CNN |
463 |
7 |
A Model for Learned Bloom Filters and Optimizing by Sandwiching |
464 |
7 |
How Many Samples are Needed to Estimate a Convolutional Neural Network? |
465 |
7 |
Doubly Robust Bayesian Inference for Non-Stationary Streaming Data with \beta-Divergences |
466 |
6 |
Forward Modeling for Partial Observation Strategy Games – A StarCraft Defogger |
467 |
6 |
The Sparse Manifold Transform |
468 |
6 |
Learning to Solve SMT Formulas |
469 |
6 |
Bayesian Nonparametric Spectral Estimation |
470 |
6 |
Thwarting Adversarial Examples: An L_0-Robust Sparse Fourier Transform |
471 |
6 |
Online Robust Policy Learning in the Presence of Unknown Adversaries |
472 |
6 |
Object-Oriented Dynamics Predictor |
473 |
6 |
Improving Explorability in Variational Inference with Annealed Variational Objectives |
474 |
6 |
Learning Compressed Transforms with Low Displacement Rank |
475 |
6 |
Orthogonally Decoupled Variational Gaussian Processes |
476 |
6 |
Wasserstein Distributionally Robust Kalman Filtering |
477 |
6 |
Teaching Inverse Reinforcement Learners via Features and Demonstrations |
478 |
6 |
Credit Assignment For Collective Multiagent RL With Global Rewards |
479 |
6 |
Learning to Repair Software Vulnerabilities with Generative Adversarial Networks |
480 |
6 |
Generative modeling for protein structures |
481 |
6 |
Disconnected Manifold Learning for Generative Adversarial Networks |
482 |
6 |
REFUEL: Exploring Sparse Features in Deep Reinforcement Learning for Fast Disease Diagnosis |
483 |
6 |
BRUNO: A Deep Recurrent Model for Exchangeable Data |
484 |
6 |
Manifold-tiling Localized Receptive Fields are Optimal in Similarity-preserving Neural Networks |
485 |
6 |
Bayesian Alignments of Warped Multi-Output Gaussian Processes |
486 |
6 |
Sharp Bounds for Generalized Uniformity Testing |
487 |
6 |
Constructing Fast Network through Deconstruction of Convolution |
488 |
6 |
Adversarially Robust Optimization with Gaussian Processes |
489 |
6 |
Bandit Learning in Concave N-Person Games |
490 |
6 |
Occam’s razor is insufficient to infer the preferences of irrational agents |
491 |
6 |
The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network |
492 |
6 |
Unsupervised Adversarial Invariance |
493 |
6 |
Densely Connected Attention Propagation for Reading Comprehension |
494 |
6 |
Training deep learning based denoisers without ground truth data |
495 |
6 |
NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations |
496 |
6 |
Norm-Ranging LSH for Maximum Inner Product Search |
497 |
6 |
Learning a High Fidelity Pose Invariant Model for High-resolution Face Frontalization |
498 |
6 |
Answerer in Questioner’s Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog |
499 |
6 |
Model Agnostic Supervised Local Explanations |
500 |
6 |
Modular Networks: Learning to Decompose Neural Computation |
501 |
6 |
Structured Local Minima in Sparse Blind Deconvolution |
502 |
6 |
Smoothed analysis of the low-rank approach for smooth semidefinite programs |
503 |
6 |
Efficient Stochastic Gradient Hard Thresholding |
504 |
6 |
Random Feature Stein Discrepancies |
505 |
6 |
Variational Memory Encoder-Decoder |
506 |
6 |
On Misinformation Containment in Online Social Networks |
507 |
6 |
Deep Non-Blind Deconvolution via Generalized Low-Rank Approximation |
508 |
6 |
Sigsoftmax: Reanalysis of the Softmax Bottleneck |
509 |
6 |
Supervised autoencoders: Improving generalization performance with unsupervised regularizers |
510 |
6 |
Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language |
511 |
6 |
Structure-Aware Convolutional Neural Networks |
512 |
6 |
Efficient Algorithms for Non-convex Isotonic Regression through Submodular Optimization |
513 |
5 |
Bias and Generalization in Deep Generative Models: An Empirical Study |
514 |
5 |
Benefits of over-parameterization with EM |
515 |
5 |
Fully Neural Network Based Speech Recognition on Mobile and Embedded Devices |
516 |
5 |
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning |
517 |
5 |
Gaussian Process Prior Variational Autoencoders |
518 |
5 |
Learning To Learn Around A Common Mean |
519 |
5 |
Low-Rank Tucker Decomposition of Large Tensors Using TensorSketch |
520 |
5 |
Blind Deconvolutional Phase Retrieval via Convex Programming |
521 |
5 |
Coupled Variational Bayes via Optimization Embedding |
522 |
5 |
Improving Online Algorithms via ML Predictions |
523 |
5 |
e-SNLI: Natural Language Inference with Natural Language Explanations |
524 |
5 |
Invariant Representations without Adversarial Training |
525 |
5 |
SING: Symbol-to-Instrument Neural Generator |
526 |
5 |
A Structured Prediction Approach for Label Ranking |
527 |
5 |
Uniform Convergence of Gradients for Non-Convex Learning and Optimization |
528 |
5 |
Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs |
529 |
5 |
Deep Network for the Integrated 3D Sensing of Multiple People in Natural Images |
530 |
5 |
On preserving non-discrimination when combining expert advice |
531 |
5 |
Algorithms and Theory for Multiple-Source Adaptation |
532 |
5 |
Variational Bayesian Monte Carlo |
533 |
5 |
Adversarial Scene Editing: Automatic Object Removal from Weak Supervision |
534 |
5 |
Non-Adversarial Mapping with VAEs |
535 |
5 |
Stochastic Chebyshev Gradient Descent for Spectral Optimization |
536 |
5 |
Implicit Probabilistic Integrators for ODEs |
537 |
5 |
Provably Correct Automatic Sub-Differentiation for Qualified Programs |
538 |
5 |
Heterogeneous Multi-output Gaussian Process Prediction |
539 |
5 |
Contamination Attacks and Mitigation in Multi-Party Machine Learning |
540 |
5 |
Bayesian Distributed Stochastic Gradient Descent |
541 |
5 |
Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling |
542 |
5 |
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models |
543 |
5 |
Dirichlet-based Gaussian Processes for Large-scale Calibrated Classification |
544 |
5 |
Automating Bayesian optimization with Bayesian optimization |
545 |
5 |
Exact natural gradient in deep linear networks and its application to the nonlinear case |
546 |
5 |
Binary Classification from Positive-Confidence Data |
547 |
5 |
Learning to Multitask |
548 |
5 |
Variational Inference with Tail-adaptive f-Divergence |
549 |
5 |
Learning Others’ Intentional Models in Multi-Agent Settings Using Interactive POMDPs |
550 |
5 |
Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing |
551 |
5 |
Estimating Learnability in the Sublinear Data Regime |
552 |
5 |
Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance |
553 |
5 |
Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning |
554 |
5 |
Supervising Unsupervised Learning |
555 |
5 |
The Physical Systems Behind Optimization Algorithms |
556 |
5 |
The Price of Privacy for Low-rank Factorization |
557 |
5 |
Distributed Weight Consolidation: A Brain Segmentation Case Study |
558 |
5 |
Learning sparse neural networks via sensitivity-driven regularization |
559 |
5 |
Lipschitz regularity of deep neural networks: analysis and efficient estimation |
560 |
5 |
A Bandit Approach to Sequential Experimental Design with False Discovery Control |
561 |
5 |
Optimal Subsampling with Influence Functions |
562 |
5 |
Modern Neural Networks Generalize on Small Data Sets |
563 |
5 |
Boosting Black Box Variational Inference |
564 |
5 |
Single-Agent Policy Tree Search With Guarantees |
565 |
5 |
Q-learning with Nearest Neighbors |
566 |
5 |
Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models |
567 |
5 |
Dialog-to-Action: Conversational Question Answering Over a Large-Scale Knowledge Base |
568 |
5 |
Mirrored Langevin Dynamics |
569 |
5 |
Computing Higher Order Derivatives of Matrix and Tensor Expressions |
570 |
5 |
Gaussian Process Conditional Density Estimation |
571 |
5 |
Sequential Context Encoding for Duplicate Removal |
572 |
5 |
Precision and Recall for Time Series |
573 |
5 |
Partially-Supervised Image Captioning |
574 |
5 |
Temporal Regularization for Markov Decision Process |
575 |
5 |
Neural Guided Constraint Logic Programming for Program Synthesis |
576 |
5 |
Learning Versatile Filters for Efficient Convolutional Neural Networks |
577 |
5 |
Found Graph Data and Planted Vertex Covers |
578 |
5 |
Generative Neural Machine Translation |
579 |
5 |
Revisiting Multi-Task Learning with ROCK: a Deep Residual Auxiliary Block for Visual Detection |
580 |
5 |
Unsupervised Learning of View-invariant Action Representations |
581 |
5 |
A flexible model for training action localization with varying levels of supervision |
582 |
5 |
Solving Large Sequential Games with the Excessive Gap Technique |
583 |
5 |
On Learning Markov Chains |
584 |
5 |
Rest-Katyusha: Exploiting the Solution’s Structure via Scheduled Restart Schemes |
585 |
5 |
Bayesian Pose Graph Optimization via Bingham Distributions and Tempered Geodesic MCMC |
586 |
5 |
Chain of Reasoning for Visual Question Answering |
587 |
5 |
Snap ML: A Hierarchical Framework for Machine Learning |
588 |
5 |
Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation |
589 |
4 |
GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking |
590 |
4 |
Smoothed Analysis of Discrete Tensor Decomposition and Assemblies of Neurons |
591 |
4 |
Autoconj: Recognizing and Exploiting Conjugacy Without a Domain-Specific Language |
592 |
4 |
Data-Driven Clustering via Parameterized Lloyd’s Families |
593 |
4 |
Complex Gated Recurrent Neural Networks |
594 |
4 |
Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior |
595 |
4 |
Temporal alignment and latent Gaussian process factor inference in population spike trains |
596 |
4 |
PCA of high dimensional random walks with comparison to neural network training |
597 |
4 |
Using Large Ensembles of Control Variates for Variational Inference |
598 |
4 |
Non-delusional Q-learning and value-iteration |
599 |
4 |
Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models |
600 |
4 |
Entropy Rate Estimation for Markov Chains with Large State Space |
601 |
4 |
Invertibility of Convolutional Generative Networks from Partial Measurements |
602 |
4 |
Multi-objective Maximization of Monotone Submodular Functions with Cardinality Constraint |
603 |
4 |
Learning and Testing Causal Models with Interventions |
604 |
4 |
The Global Anchor Method for Quantifying Linguistic Shifts and Domain Adaptation |
605 |
4 |
Learning Attractor Dynamics for Generative Memory |
606 |
4 |
PAC-Bayes bounds for stable algorithms with instance-dependent priors |
607 |
4 |
Learning Safe Policies with Expert Guidance |
608 |
4 |
Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes |
609 |
4 |
Exploration in Structured Reinforcement Learning |
610 |
4 |
Data Amplification: A Unified and Competitive Approach to Property Estimation |
611 |
4 |
Contextual Stochastic Block Models |
612 |
4 |
Robust Detection of Adversarial Attacks by Modeling the Intrinsic Properties of Deep Neural Networks |
613 |
4 |
Diffusion Maps for Textual Network Embedding |
614 |
4 |
Constrained Cross-Entropy Method for Safe Reinforcement Learning |
615 |
4 |
Bandit Learning with Implicit Feedback |
616 |
4 |
Model-Agnostic Private Learning |
617 |
4 |
Causal Inference via Kernel Deviance Measures |
618 |
4 |
Scaling Gaussian Process Regression with Derivatives |
619 |
4 |
A no-regret generalization of hierarchical softmax to extreme multi-label classification |
620 |
4 |
Deep Structured Prediction with Nonlinear Output Transformations |
621 |
4 |
Transfer of Value Functions via Variational Methods |
622 |
4 |
Variational Learning on Aggregate Outputs with Gaussian Processes |
623 |
4 |
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks |
624 |
4 |
Multi-Task Zipping via Layer-wise Neuron Sharing |
625 |
4 |
Computing Kantorovich-Wasserstein Distances on d-dimensional histograms using (d+1)-partite graphs |
626 |
4 |
Reparameterization Gradient for Non-differentiable Models |
627 |
4 |
Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization |
628 |
4 |
Geometry-Aware Recurrent Neural Networks for Active Visual Recognition |
629 |
4 |
Bandit Learning with Positive Externalities |
630 |
4 |
Interpreting Neural Network Judgments via Minimal, Stable, and Symbolic Corrections |
631 |
4 |
Differentially Private Contextual Linear Bandits |
632 |
4 |
Scalable Coordinated Exploration in Concurrent Reinforcement Learning |
633 |
4 |
Bilevel Distance Metric Learning for Robust Image Recognition |
634 |
4 |
An Information-Theoretic Analysis for Thompson Sampling with Many Actions |
635 |
4 |
GumBolt: Extending Gumbel trick to Boltzmann priors |
636 |
4 |
Variational PDEs for Acceleration on Manifolds and Application to Diffeomorphisms |
637 |
4 |
Direct Estimation of Differences in Causal Graphs |
638 |
4 |
Convergence of Cubic Regularization for Nonconvex Optimization under KL Property |
639 |
4 |
Tight Bounds for Collaborative PAC Learning via Multiplicative Weights |
640 |
4 |
Differentially Private Bayesian Inference for Exponential Families |
641 |
4 |
Representation Learning for Treatment Effect Estimation from Observational Data |
642 |
4 |
Revisiting Decomposable Submodular Function Minimization with Incidence Relations |
643 |
4 |
SEGA: Variance Reduction via Gradient Sketching |
644 |
4 |
Virtual Class Enhanced Discriminative Embedding Learning |
645 |
4 |
Relating Leverage Scores and Density using Regularized Christoffel Functions |
646 |
4 |
DifNet: Semantic Segmentation by Diffusion Networks |
647 |
4 |
Regularization Learning Networks: Deep Learning for Tabular Datasets |
648 |
4 |
Joint Active Feature Acquisition and Classification with Variable-Size Set Encoding |
649 |
4 |
Quadratic Decomposable Submodular Function Minimization |
650 |
4 |
A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents |
651 |
4 |
Uncertainty-Aware Attention for Reliable Interpretation and Prediction |
652 |
4 |
Generalizing Graph Matching beyond Quadratic Assignment Model |
653 |
4 |
Informative Features for Model Comparison |
654 |
4 |
Training DNNs with Hybrid Block Floating Point |
655 |
4 |
Learning Pipelines with Limited Data and Domain Knowledge: A Study in Parsing Physics Problems |
656 |
4 |
An Off-policy Policy Gradient Theorem Using Emphatic Weightings |
657 |
4 |
Generalized Inverse Optimization through Online Learning |
658 |
4 |
Kalman Normalization: Normalizing Internal Representations Across Network Layers |
659 |
3 |
Transfer of Deep Reactive Policies for MDP Planning |
660 |
3 |
Multiple Instance Learning for Efficient Sequential Data Classification on Resource-constrained Devices |
661 |
3 |
Point process latent variable models of larval zebrafish behavior |
662 |
3 |
Differentially Private Change-Point Detection |
663 |
3 |
Learning Beam Search Policies via Imitation Learning |
664 |
3 |
Learning a Warping Distance from Unlabeled Time Series Using Sequence Autoencoders |
665 |
3 |
A Simple Cache Model for Image Recognition |
666 |
3 |
On Markov Chain Gradient Descent |
667 |
3 |
Unsupervised Depth Estimation, 3D Face Rotation and Replacement |
668 |
3 |
Learning convex bounds for linear quadratic control policy synthesis |
669 |
3 |
The Effect of Network Width on the Performance of Large-batch Training |
670 |
3 |
The Importance of Sampling in Meta-Reinforcement Learning |
671 |
3 |
Coordinate Descent with Bandit Sampling |
672 |
3 |
Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages |
673 |
3 |
Learning without the Phase: Regularized PhaseMax Achieves Optimal Sample Complexity |
674 |
3 |
A convex program for bilinear inversion of sparse vectors |
675 |
3 |
The promises and pitfalls of Stochastic Gradient Langevin Dynamics |
676 |
3 |
Efficient Online Portfolio with Logarithmic Regret |
677 |
3 |
Proximal Graphical Event Models |
678 |
3 |
Learning Signed Determinantal Point Processes through the Principal Minor Assignment Problem |
679 |
3 |
GILBO: One Metric to Measure Them All |
680 |
3 |
Bayesian Adversarial Learning |
681 |
3 |
Extracting Relationships by Multi-Domain Matching |
682 |
3 |
Unsupervised Learning of Artistic Styles with Archetypal Style Analysis |
683 |
3 |
The Limits of Post-Selection Generalization |
684 |
3 |
SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient |
685 |
3 |
Deep Neural Networks with Box Convolutions |
686 |
3 |
Graphical Generative Adversarial Networks |
687 |
3 |
Neural Interaction Transparency (NIT): Disentangling Learned Interactions for Improved Interpretability |
688 |
3 |
Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation |
689 |
3 |
Learning a latent manifold of odor representations from neural responses in piriform cortex |
690 |
3 |
GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training |
691 |
3 |
Scalable Robust Matrix Factorization with Nonconvex Loss |
692 |
3 |
Practical exact algorithm for trembling-hand equilibrium refinements in games |
693 |
3 |
Nonparametric Bayesian Lomax delegate racing for survival analysis with competing risks |
694 |
3 |
Adaptive Negative Curvature Descent with Applications in Non-convex Optimization |
695 |
3 |
Fast Rates of ERM and Stochastic Approximation: Adaptive to Error Bound Conditions |
696 |
3 |
Exponentiated Strongly Rayleigh Distributions |
697 |
3 |
A Bridging Framework for Model Optimization and Deep Propagation |
698 |
3 |
Integrated accounts of behavioral and neuroimaging data using flexible recurrent neural network models |
699 |
3 |
Meta-Learning MCMC Proposals |
700 |
3 |
The streaming rollout of deep networks – towards fully model-parallel execution |
701 |
3 |
Solving Non-smooth Constrained Programs with Lower Complexity than \mathcal{O}(1/\varepsilon): A Primal-Dual Homotopy Smoothing Approach |
702 |
3 |
Learning from discriminative feature feedback |
703 |
3 |
Bipartite Stochastic Block Models with Tiny Clusters |
704 |
3 |
Equality of Opportunity in Classification: A Causal Approach |
705 |
3 |
Sequence-to-Segment Networks for Segment Detection |
706 |
3 |
Hybrid-MST: A Hybrid Active Sampling Strategy for Pairwise Preference Aggregation |
707 |
3 |
Step Size Matters in Deep Learning |
708 |
3 |
From Stochastic Planning to Marginal MAP |
709 |
3 |
Constructing Deep Neural Networks by Bayesian Network Structure Learning |
710 |
3 |
Optimization over Continuous and Multi-dimensional Decisions with Observational Data |
711 |
3 |
Metric on Nonlinear Dynamical Systems with Perron-Frobenius Operators |
712 |
3 |
Safe Active Learning for Time-Series Modeling with Gaussian Processes |
713 |
3 |
Processing of missing data by neural networks |
714 |
3 |
A Practical Algorithm for Distributed Clustering and Outlier Detection |
715 |
3 |
Dual Principal Component Pursuit: Improved Analysis and Efficient Algorithms |
716 |
3 |
DeepExposure: Learning to Expose Photos with Asynchronously Reinforced Adversarial Learning |
717 |
3 |
Regularizing by the Variance of the Activations’ Sample-Variances |
718 |
3 |
Automatic Program Synthesis of Long Programs with a Learned Garbage Collector |
719 |
3 |
Nonparametric learning from Bayesian models with randomized objective functions |
720 |
3 |
Learning Optimal Reserve Price against Non-myopic Bidders |
721 |
3 |
Enhancing the Accuracy and Fairness of Human Decision Making |
722 |
3 |
Learning to Exploit Stability for 3D Scene Parsing |
723 |
3 |
Parsimonious Quantile Regression of Financial Asset Tail Dynamics via Sequential Learning |
724 |
3 |
Geometry Based Data Generation |
725 |
3 |
New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity |
726 |
3 |
Alternating optimization of decision trees, with application to learning sparse oblique trees |
727 |
3 |
Synthesized Policies for Transfer and Adaptation across Tasks and Environments |
728 |
3 |
Interactive Structure Learning with Structural Query-by-Committee |
729 |
3 |
Efficient nonmyopic batch active search |
730 |
3 |
\ell_1-regression with Heavy-tailed Distributions |
731 |
3 |
Frequency-Domain Dynamic Pruning for Convolutional Neural Networks |
732 |
3 |
Visual Memory for Robust Path Following |
733 |
3 |
Maximum-Entropy Fine Grained Classification |
734 |
3 |
A Unified Framework for Extensive-Form Game Abstraction with Bounds |
735 |
3 |
HitNet: Hybrid Ternary Recurrent Neural Network |
736 |
3 |
Joint Sub-bands Learning with Clique Structures for Wavelet Domain Super-Resolution |
737 |
3 |
HOGWILD!-Gibbs can be PanAccurate |
738 |
2 |
Topkapi: Parallel and Fast Sketches for Finding Top-K Frequent Elements |
739 |
2 |
Multi-value Rule Sets for Interpretable Classification with Feature-Efficient Representations |
740 |
2 |
Mean Field for the Stochastic Blockmodel: Optimization Landscape and Convergence Issues |
741 |
2 |
Robust Subspace Approximation in a Stream |
742 |
2 |
Bayesian Structure Learning by Recursive Bootstrap |
743 |
2 |
Total stochastic gradient algorithms and applications in reinforcement learning |
744 |
2 |
Synaptic Strength For Convolutional Neural Network |
745 |
2 |
A Spectral View of Adversarially Robust Features |
746 |
2 |
Testing for Families of Distributions via the Fourier Transform |
747 |
2 |
Scalable Laplacian K-modes |
748 |
2 |
Learning to Reason with Third Order Tensor Products |
749 |
2 |
Post: Device Placement with Cross-Entropy Minimization and Proximal Policy Optimization |
750 |
2 |
Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach |
751 |
2 |
Identification and Estimation of Causal Effects from Dependent Data |
752 |
2 |
Representer Point Selection for Explaining Deep Neural Networks |
753 |
2 |
Learning SMaLL Predictors |
754 |
2 |
Iterative Value-Aware Model Learning |
755 |
2 |
Improving Neural Program Synthesis with Inferred Execution Traces |
756 |
2 |
Estimators for Multivariate Information Measures in General Probability Spaces |
757 |
2 |
Stochastic Primal-Dual Method for Empirical Risk Minimization with O(1) Per-Iteration Complexity |
758 |
2 |
Distributionally Robust Graphical Models |
759 |
2 |
Bilevel learning of the Group Lasso structure |
760 |
2 |
Graphical model inference: Sequential Monte Carlo meets deterministic approximations |
761 |
2 |
Learning to Specialize with Knowledge Distillation for Visual Question Answering |
762 |
2 |
Cluster Variational Approximations for Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data |
763 |
2 |
A General Method for Amortizing Variational Filtering |
764 |
2 |
Scalar Posterior Sampling with Applications |
765 |
2 |
Improved Algorithms for Collaborative PAC Learning |
766 |
2 |
Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks |
767 |
2 |
Training Deep Models Faster with Robust, Approximate Importance Sampling |
768 |
2 |
Efficient Loss-Based Decoding on Graphs for Extreme Classification |
769 |
2 |
Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies |
770 |
2 |
Online Structure Learning for Feed-Forward and Recurrent Sum-Product Networks |
771 |
2 |
Provable Gaussian Embedding with One Observation |
772 |
2 |
Model-based targeted dimensionality reduction for neuronal population data |
773 |
2 |
Representation Learning of Compositional Data |
774 |
2 |
Modeling Dynamic Missingness of Implicit Feedback for Recommendation |
775 |
2 |
Query K-means Clustering and the Double Dixie Cup Problem |
776 |
2 |
On the Local Hessian in Back-propagation |
777 |
2 |
On Controllable Sparse Alternatives to Softmax |
778 |
2 |
Multi-domain Causal Structure Learning in Linear Systems |
779 |
2 |
Deep State Space Models for Unconditional Word Generation |
780 |
2 |
Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer |
781 |
2 |
Diverse Ensemble Evolution: Curriculum Data-Model Marriage |
782 |
2 |
Loss Functions for Multiset Prediction |
783 |
2 |
Efficient inference for time-varying behavior during learning |
784 |
2 |
Contextual Pricing for Lipschitz Buyers |
785 |
2 |
Manifold Structured Prediction |
786 |
2 |
Middle-Out Decoding |
787 |
2 |
Differentially Private k-Means with Constant Multiplicative Error |
788 |
2 |
Fully Understanding The Hashing Trick |
789 |
2 |
Contour location via entropy reduction leveraging multiple information sources |
790 |
2 |
Why so gloomy? A Bayesian explanation of human pessimism bias in the multi-armed bandit task |
791 |
2 |
Porcupine Neural Networks: Approximating Neural Network Landscapes |
792 |
2 |
Non-Ergodic Alternating Proximal Augmented Lagrangian Algorithms with Optimal Rates |
793 |
2 |
Context-dependent upper-confidence bounds for directed exploration |
794 |
2 |
Recurrently Controlled Recurrent Networks |
795 |
2 |
Hunting for Discriminatory Proxies in Linear Regression Models |
796 |
2 |
Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima |
797 |
2 |
Explaining Deep Learning Models — A Bayesian Non-parametric Approach |
798 |
2 |
Semi-Supervised Learning with Declaratively Specified Entropy Constraints |
799 |
2 |
Maximum Causal Tsallis Entropy Imitation Learning |
800 |
2 |
Mallows Models for Top-k Lists |
801 |
2 |
Optimization of Smooth Functions with Noisy Observations: Local Minimax Rates |
802 |
2 |
Binary Rating Estimation with Graph Side Information |
803 |
2 |
Inexact trust-region algorithms on Riemannian manifolds |
804 |
2 |
Differentially Private Robust Low-Rank Approximation |
805 |
2 |
Probabilistic Neural Programmed Networks for Scene Generation |
806 |
2 |
Faster Online Learning of Optimal Threshold for Consistent F-measure Optimization |
807 |
2 |
Sublinear Time Low-Rank Approximation of Distance Matrices |
808 |
2 |
Scaling the Poisson GLM to massive neural datasets through polynomial approximations |
809 |
2 |
Infinite-Horizon Gaussian Processes |
810 |
2 |
Learning Gaussian Processes by Minimizing PAC-Bayesian Generalization Bounds |
811 |
2 |
Deep, complex, invertible networks for inversion of transmission effects in multimode optical fibres |
812 |
2 |
Contextual Combinatorial Multi-armed Bandits with Volatile Arms and Submodular Reward |
813 |
2 |
Learning latent variable structured prediction models with Gaussian perturbations |
814 |
2 |
Practical Methods for Graph Two-Sample Testing |
815 |
2 |
Demystifying excessively volatile human learning: A Bayesian persistent prior and a neural approximation |
816 |
2 |
Causal Discovery from Discrete Data using Hidden Compact Representation |
817 |
2 |
Contextual bandits with surrogate losses: Margin bounds and efficient algorithms |
818 |
2 |
Structural Causal Bandits: Where to Intervene? |
819 |
2 |
Active Learning for Non-Parametric Regression Using Purely Random Trees |
820 |
2 |
Breaking the Span Assumption Yields Fast Finite-Sum Minimization |
821 |
2 |
Universal Growth in Production Economies |
822 |
2 |
High Dimensional Linear Regression using Lattice Basis Reduction |
823 |
2 |
Fighting Boredom in Recommender Systems with Linear Reinforcement Learning |
824 |
2 |
Generalizing Tree Probability Estimation via Bayesian Networks |
825 |
2 |
Global Gated Mixture of Second-order Pooling for Improving Deep Convolutional Neural Networks |
826 |
2 |
A Block Coordinate Ascent Algorithm for Mean-Variance Optimization |
827 |
2 |
Boosted Sparse and Low-Rank Tensor Regression |
828 |
2 |
DropMax: Adaptive Variational Softmax |
829 |
2 |
Connectionist Temporal Classification with Maximum Entropy Regularization |
830 |
2 |
A Neural Compositional Paradigm for Image Captioning |
831 |
2 |
An Efficient Pruning Algorithm for Robust Isotonic Regression |
832 |
2 |
Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units |
833 |
1 |
Sparse PCA from Sparse Linear Regression |
834 |
1 |
Computationally and statistically efficient learning of causal Bayes nets using path queries |
835 |
1 |
Removing Hidden Confounding by Experimental Grounding |
836 |
1 |
MixLasso: Generalized Mixed Regression via Convex Atomic-Norm Regularization |
837 |
1 |
Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning |
838 |
1 |
Fast deep reinforcement learning using online adjustments from the past |
839 |
1 |
Streamlining Variational Inference for Constraint Satisfaction Problems |
840 |
1 |
Convex Elicitation of Continuous Properties |
841 |
1 |
Learning and Inference in Hilbert Space with Quantum Graphical Models |
842 |
1 |
Uplift Modeling from Separate Labels |
843 |
1 |
Dynamic Network Model from Partial Observations |
844 |
1 |
Theoretical guarantees for EM under misspecified Gaussian mixture models |
845 |
1 |
Statistical and Computational Trade-Offs in Kernel K-Means |
846 |
1 |
GLoMo: Unsupervised Learning of Transferable Relational Graphs |
847 |
1 |
Adaptive Path-Integral Autoencoders: Representation Learning and Planning for Dynamical Systems |
848 |
1 |
A Statistical Recurrent Model on the Manifold of Symmetric Positive Definite Matrices |
849 |
1 |
Stein Variational Gradient Descent as Moment Matching |
850 |
1 |
A Bayesian Nonparametric View on Count-Min Sketch |
851 |
1 |
Deep Poisson gamma dynamical systems |
852 |
1 |
Information-theoretic Limits for Community Detection in Network Models |
853 |
1 |
Online Reciprocal Recommendation with Theoretical Performance Guarantees |
854 |
1 |
Statistical mechanics of low-rank tensor decomposition |
855 |
1 |
Modelling and unsupervised learning of symmetric deformable object categories |
856 |
1 |
Efficient Anomaly Detection via Matrix Sketching |
857 |
1 |
Improved Expressivity Through Dendritic Neural Networks |
858 |
1 |
Stochastic Expectation Maximization with Variance Reduction |
859 |
1 |
Monte-Carlo Tree Search for Constrained POMDPs |
860 |
1 |
Breaking the Activation Function Bottleneck through Adaptive Parameterization |
861 |
1 |
Rectangular Bounding Process |
862 |
1 |
Adaptive Learning with Unknown Information Flows |
863 |
1 |
A Bayesian Approach to Generative Adversarial Imitation Learning |
864 |
1 |
Constant Regret, Generalized Mixability, and Mirror Descent |
865 |
1 |
How to tell when a clustering is (approximately) correct using convex relaxations |
866 |
1 |
Stimulus domain transfer in recurrent models for large scale cortical population prediction on video |
867 |
1 |
Efficient online algorithms for fast-rate regret bounds under sparsity |
868 |
1 |
Gen-Oja: Simple & Efficient Algorithm for Streaming Generalized Eigenvector Computation |
869 |
1 |
Unorganized Malicious Attacks Detection |
870 |
1 |
Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss |
871 |
1 |
rho-POMDPs have Lipschitz-Continuous epsilon-Optimal Value Functions |
872 |
1 |
Maximizing Induced Cardinality Under a Determinantal Point Process |
873 |
1 |
Efficient Convex Completion of Coupled Tensors using Coupled Nuclear Norms |
874 |
1 |
Stochastic Nonparametric Event-Tensor Decomposition |
875 |
1 |
Diminishing Returns Shape Constraints for Interpretability and Regularization |
876 |
1 |
Policy Regret in Repeated Games |
877 |
1 |
Large-Scale Stochastic Sampling from the Probability Simplex |
878 |
1 |
An Improved Analysis of Alternating Minimization for Structured Multi-Response Regression |
879 |
1 |
Proximal SCOPE for Distributed Sparse Learning |
880 |
1 |
The Everlasting Database: Statistical Validity at a Fair Price |
881 |
1 |
Size-Noise Tradeoffs in Generative Networks |
882 |
1 |
Exponentially Weighted Imitation Learning for Batched Historical Data |
883 |
1 |
The Cluster Description Problem – Complexity Results, Formulations and Approximations |
884 |
1 |
MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models |
885 |
1 |
Approximation algorithms for stochastic clustering |
886 |
1 |
Gamma-Poisson Dynamic Matrix Factorization Embedded with Metadata Influence |
887 |
1 |
Mental Sampling in Multimodal Representations |
888 |
1 |
Critical initialisation for deep signal propagation in noisy rectifier neural networks |
889 |
1 |
Learning convex polytopes with margin |
890 |
1 |
Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames |
891 |
1 |
Sample Efficient Stochastic Gradient Iterative Hard Thresholding Method for Stochastic Sparse Linear Regression with Limited Attribute Observation |
892 |
1 |
Horizon-Independent Minimax Linear Regression |
893 |
1 |
Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models |
894 |
1 |
Learning in Games with Lossy Feedback |
895 |
1 |
Learning Confidence Sets using Support Vector Machines |
896 |
1 |
Fast greedy algorithms for dictionary selection with generalized sparsity constraints |
897 |
1 |
Non-metric Similarity Graphs for Maximum Inner Product Search |
898 |
1 |
A Mathematical Model For Optimal Decisions In A Representative Democracy |
899 |
1 |
Learning Bounds for Greedy Approximation with Explicit Feature Maps from Multiple Kernels |
900 |
1 |
PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits |
901 |
1 |
Learning filter widths of spectral decompositions with wavelets |
902 |
1 |
Lifelong Inverse Reinforcement Learning |
903 |
1 |
Expanding Holographic Embeddings for Knowledge Completion |
904 |
1 |
Submodular Field Grammars: Representation, Inference, and Application to Image Parsing |
905 |
1 |
BML: A High-performance, Low-cost Gradient Synchronization Algorithm for DML Training |
906 |
1 |
Flexible and accurate inference and learning for deep generative models |
907 |
1 |
KONG: Kernels for ordered-neighborhood graphs |
908 |
1 |
Minimax Estimation of Neural Net Distance |
909 |
1 |
Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization |
910 |
1 |
Multiplicative Weights Updates with Constant Step-Size in Graphical Constant-Sum Games |
911 |
1 |
Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization |
912 |
1 |
Stochastic Spectral and Conjugate Descent Methods |
913 |
1 |
Semi-crowdsourced Clustering with Deep Generative Models |
914 |
1 |
Parsimonious Bayesian deep networks |
915 |
1 |
Asymptotic optimality of adaptive importance sampling |
916 |
1 |
When do random forests fail? |
917 |
1 |
Adaptation to Easy Data in Prediction with Limited Advice |
918 |
1 |
Gradient Descent Meets Shift-and-Invert Preconditioning for Eigenvector Computation |
919 |
1 |
Continuous-time Value Function Approximation in Reproducing Kernel Hilbert Spaces |
920 |
1 |
Provable Variational Inference for Constrained Log-Submodular Models |
921 |
1 |
Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling |
922 |
1 |
Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning |
923 |
1 |
Mixture Matrix Completion |
924 |
1 |
Algorithmic Linearly Constrained Gaussian Processes |
925 |
1 |
SplineNets: Continuous Neural Decision Graphs |
926 |
1 |
The Pessimistic Limits and Possibilities of Margin-based Losses in Semi-supervised Learning |
927 |
1 |
Video Prediction via Selective Sampling |
928 |
1 |
Stochastic Composite Mirror Descent: Optimal Bounds with High Probabilities |
929 |
1 |
A loss framework for calibrated anomaly detection |
930 |
1 |
Designing by Training: Acceleration Neural Network for Fast High-Dimensional Convolution |
931 |
1 |
Multitask Boosting for Survival Analysis with Competing Risks |
932 |
1 |
The Lingering of Gradients: How to Reuse Gradients Over Time |
933 |
1 |
(Probably) Concave Graph Matching |
934 |
1 |
Fast Similarity Search via Optimal Sparse Lifting |
935 |
0 |
Contrastive Learning from Pairwise Measurements |
936 |
0 |
Support Recovery for Orthogonal Matching Pursuit: Upper and Lower bounds |
937 |
0 |
Sketching Method for Large Scale Combinatorial Inference |
938 |
0 |
Regret Bounds for Online Portfolio Selection with a Cardinality Constraint |
939 |
0 |
Improved Network Robustness with Adversary Critic |
940 |
0 |
Discretely Relaxing Continuous Variables for tractable Variational Inference |
941 |
0 |
Bounded-Loss Private Prediction Markets |
942 |
0 |
Lifted Weighted Mini-Bucket |
943 |
0 |
Predictive Approximate Bayesian Computation via Saddle Points |
944 |
0 |
Learning Invariances using the Marginal Likelihood |
945 |
0 |
Variance-Reduced Stochastic Gradient Descent on Streaming Data |
946 |
0 |
Trading robust representations for sample complexity through self-supervised visual experience |
947 |
0 |
PAC-Bayes Tree: weighted subtrees with guarantees |
948 |
0 |
The emergence of multiple retinal cell types through efficient coding of natural movies |
949 |
0 |
The Sample Complexity of Semi-Supervised Learning with Nonparametric Mixture Models |
950 |
0 |
Inferring Latent Velocities from Weather Radar Data using Gaussian Processes |
951 |
0 |
Wavelet regression and additive models for irregularly spaced data |
952 |
0 |
Distributed Multitask Reinforcement Learning with Quadratic Convergence |
953 |
0 |
Legendre Decomposition for Tensors |
954 |
0 |
Compact Representation of Uncertainty in Clustering |
955 |
0 |
Clustering Redemption–Beyond the Impossibility of Kleinberg’s Axioms |
956 |
0 |
Dimensionality Reduction has Quantifiable Imperfections: Two Geometric Bounds |
957 |
0 |
Submodular Maximization via Gradient Ascent: The Case of Deep Submodular Functions |
958 |
0 |
Dirichlet belief networks for topic structure learning |
959 |
0 |
A Reduction for Efficient LDA Topic Reconstruction |
960 |
0 |
Preference Based Adaptation for Learning Objectives |
961 |
0 |
On Neuronal Capacity |
962 |
0 |
Revisiting (ε, γ, τ)-similarity learning for domain adaptation |
963 |
0 |
Deep Homogeneous Mixture Models: Representation, Separation, and Approximation |
964 |
0 |
A probabilistic population code based on neural samples |
965 |
0 |
Efficient Gradient Computation for Structured Output Learning with Rational and Tropical Losses |
966 |
0 |
A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice |
967 |
0 |
Algebraic tests of general Gaussian latent tree models |
968 |
0 |
Online Improper Learning with an Approximation Oracle |
969 |
0 |
Community Exploration: From Offline Optimization to Online Learning |
970 |
0 |
Learning Concave Conditional Likelihood Models for Improved Analysis of Tandem Mass Spectra |
971 |
0 |
Experimental Design for Cost-Aware Learning of Causal Graphs |
972 |
0 |
Exploiting Numerical Sparsity for Efficient Learning: Faster Eigenvector Computation and Regression |
973 |
0 |
Multi-armed Bandits with Compensation |
974 |
0 |
Power-law efficient neural codes provide general link between perceptual bias and discriminability |
975 |
0 |
Learning from Group Comparisons: Exploiting Higher Order Interactions |
976 |
0 |
Objective and efficient inference for couplings in neuronal networks |
977 |
0 |
Neural Edit Operations for Biological Sequences |
978 |
0 |
Measures of distortion for machine learning |
979 |
0 |
Information-based Adaptive Stimulus Selection to Optimize Communication Efficiency in Brain-Computer Interfaces |
980 |
0 |
A Smoother Way to Train Structured Prediction Models |
981 |
0 |
Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making |
982 |
0 |
Active Matting |
983 |
0 |
Limited Memory Kelley’s Method Converges for Composite Convex and Submodular Objectives |
984 |
0 |
Completing State Representations using Spectral Learning |
985 |
0 |
Cooperative neural networks (CoNN): Exploiting prior independence structure for improved classification |
986 |
0 |
TETRIS: TilE-matching the TRemendous Irregular Sparsity |
987 |
0 |
Efficient Projection onto the Perfect Phylogeny Model |
988 |
0 |
Beauty-in-averageness and its contextual modulations: A Bayesian statistical account |
989 |
0 |
Early Stopping for Nonparametric Testing |
990 |
0 |
Inferring Networks From Random Walk-Based Node Similarities |
991 |
0 |
Communication Efficient Parallel Algorithms for Optimization on Manifolds |
992 |
0 |
Latent Gaussian Activity Propagation: Using Smoothness and Structure to Separate and Localize Sounds in Large Noisy Environments |
993 |
0 |
On Binary Classification in Extreme Regions |
994 |
0 |
Optimistic optimization of a Brownian |
995 |
0 |
Fast Estimation of Causal Interactions using Wold Processes |
996 |
0 |
Factored Bandits |
997 |
0 |
Analytic solution and stationary phase approximation for the Bayesian lasso and elastic net |
998 |
0 |
Query Complexity of Bayesian Private Learning |
999 |
0 |
Modelling sparsity, heterogeneity, reciprocity and community structure in temporal interaction data |
1000 |
0 |
MULAN: A Blind and Off-Grid Method for Multichannel Echo Retrieval |
1001 |
0 |
Overlapping Clustering Models, and One (class) SVM to Bind Them All |
1002 |
0 |
Bayesian Model Selection Approach to Boundary Detection with Non-Local Priors |
1003 |
0 |
Genetic-Gated Networks for Deep Reinforcement Learning |
1004 |
0 |
Foreground Clustering for Joint Segmentation and Localization in Videos and Images |
1005 |
0 |
Learning semantic similarity in a continuous space |
1006 |
0 |
Quantifying Learning Guarantees for Convex but Inconsistent Surrogates |
1007 |
0 |
Removing the Feature Correlation Effect of Multiplicative Noise |
1008 |
0 |
Optimization for Approximate Submodularity |