What We’re Reading: Trends & Takeaways from the NeurIPS 2021 Conference

The Gretel research team's favorite trends and takeaways from NeurIPS 2021, the 35th Annual Conference on Neural Information Processing Systems.
Copyright © 2022 Gretel Labs. All rights reserved.

Introduction

Many of us on Gretel’s research team attended the annual NeurIPS conference last month. As with every year’s event, there were hundreds of fascinating papers and presentations that highlighted the future of machine learning and information science. Below are just a handful of takeaways we found interesting, particularly from a data privacy perspective. 

If you’ve read any of these papers, tweet us your feedback @gretel_ai, join our Slack community, or connect with us individually on LinkedIn. We’d love to hear what you think!

Andrew Carr:

  • As we think about generative synthetic data across more modalities, diffusion models have emerged as a strong contender for images and audio, with DN21 showing superior performance over GANs. SDME21 and others also show good progress on working with these models and their likelihoods. (A minimal sketch of the diffusion forward process follows this list.)
  • Many papers, such as MKWZMMZ21, worked to extend language model performance to ever longer sequence lengths. This is still an open problem, but much exciting progress on factorization, kernel methods, and more was presented at the conference. (A kernelized-attention sketch also follows this list.)
  • Fine-tuning is making a comeback as larger models are pre-trained on enormous corpora. One such example is DTLYZ21, which proposes a new fine-tuning scheme that encourages adversarial robustness in language models.
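
To ground the diffusion trend, here is a minimal sketch of the standard DDPM forward (noising) process, i.e., the closed-form q(x_t | x_0). The linear beta schedule, step count, and toy data are illustrative assumptions, not the setup from DN21 or SDME21.

```python
import numpy as np

# Linear beta schedule over T steps (an illustrative choice).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # bar{alpha}_t = product of alphas up to t

def q_sample(x0, t, rng=np.random.default_rng(0)):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise

# A toy "image": with more steps, x_t approaches pure Gaussian noise.
x0 = np.ones((8, 8))
print(q_sample(x0, t=10).std(), q_sample(x0, t=999).std())
```

The generative model is then trained to reverse this process, typically by predicting the noise added at each step.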
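
One kernel-method idea behind several of the long-sequence papers is to replace softmax attention with feature maps so attention costs O(n) instead of O(n^2). The sketch below uses the common elu(x)+1 feature map in the spirit of linear-attention work; it illustrates the general trick, not MKWZMMZ21's method specifically.

```python
import numpy as np

def phi(x):
    # Positive feature map elu(x) + 1, a common linear-attention choice.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Approximates softmax(Q K^T) V as phi(Q) (phi(K)^T V), normalized,
    without ever materializing the n x n attention matrix."""
    Qf, Kf = phi(Q), phi(K)            # (n, d)
    kv = Kf.T @ V                      # (d, d_v): keys/values summarized once
    normalizer = Qf @ Kf.sum(axis=0)   # (n,)
    return (Qf @ kv) / normalizer[:, None]

rng = np.random.default_rng(0)
n, d = 512, 64
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (512, 64)
```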

Daniel Nissani:

  • TDF21 showed that differential privacy and fairness may be at odds, at least when differential privacy is achieved via empirical risk minimization. Strikingly, even as researchers observe how differential privacy may exacerbate unfairness, other work claims both properties can be achieved at once: DECAF, proposed by BKBS21, is a causally aware GAN that aims to produce synthetic data that is both differentially private and fair. Causally aware models were actually quite prominent at the conference; ZL21, for example, proposes a causally aware language model that can be used for conditional text generation.
  • The Adult income dataset now has a replacement with an API for easier analysis. DHMS21 created the new dataset due to the biases introduced by the data preprocessing of the original, and built the Python package `folktables` to interact with it (a minimal usage sketch follows this list).
  • One of my favorite papers, which is probably getting little attention, is about which metrics should be used for supervised learning tasks. GZTP21 provides a rigorous analysis of a family of supervised learning metrics, narrowing the field down to Correlation Distance (derived from the Matthews Correlation Coefficient) and Symmetric Balanced Accuracy. (A snippet computing the underlying quantities also follows this list.)
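
Here is a minimal `folktables` sketch based on the package's documented usage at the time of writing; the survey year, state, and income task are illustrative choices.

```python
from folktables import ACSDataSource, ACSIncome

# Pull one year of ACS person-level data for a single state (illustrative choices).
data_source = ACSDataSource(survey_year="2018", horizon="1-Year", survey="person")
acs_data = data_source.get_data(states=["CA"], download=True)

# ACSIncome is the drop-in replacement for the Adult prediction task.
features, labels, group = ACSIncome.df_to_numpy(acs_data)
print(features.shape, labels.shape)
```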
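
For the precise definitions of Correlation Distance and Symmetric Balanced Accuracy, see GZTP21; as a rough illustration, the underlying quantities are easy to compute with scikit-learn. The `1 - mcc` transform below is a simplified stand-in, not the paper's definition.

```python
from sklearn.metrics import matthews_corrcoef, balanced_accuracy_score

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

mcc = matthews_corrcoef(y_true, y_pred)
bal_acc = balanced_accuracy_score(y_true, y_pred)

# Hypothetical distance derived from MCC; GZTP21's Correlation Distance
# may differ from this simple transform.
correlation_distance = 1.0 - mcc
print(mcc, bal_acc, correlation_distance)
```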

Lipika Ramaswamy:

  • There were 48 papers on differential privacy (DP) in 2021, up from 31 last year! The subset of papers applying DP to machine learning (ML) was my focus at the conference.
  • Research continues to address practical concerns with DP-SGD (and DP optimizers in general). When training large models privately, one often finds that the worst-case privacy loss under Rényi DP accounting is simply too large to be meaningful. GLW21 provides a lower bound and a tighter upper bound on privacy loss by approximating the privacy curve of the composition of each iteration of DP-SGD. Sometimes the problem is that "non-average" data points impact privacy loss too much, so FZ21 introduces filters for a fully adaptive Rényi DP composition theorem that bounds the overall privacy loss by the individual privacy losses of each data point. DP-SGD has also desperately needed a speedup, which SVK21 obtains using vectorization, just-in-time compilation, and static graph optimization. The best part is that all of the papers I've highlighted have code implementations, so trying them out is not much of a lift! (A bare-bones DP-SGD step is sketched after this list.)
  • Label differential privacy is gaining traction in the private ML sphere. This notion applies when only the labels of a dataset are treated as sensitive, and hence are the only portion of each record protected by the DP guarantee. The two notable papers are GGKMZ21 and EMPST21. (A classical label-randomization baseline is also sketched below.)
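
For context on what the accounting papers above analyze, here is a bare-bones sketch of a single DP-SGD step: clip each per-example gradient, aggregate, and add Gaussian noise. The shapes, clip norm, and noise multiplier are illustrative assumptions; production implementations handle per-example gradients, batching, and privacy accounting for you.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0, noise_multiplier=1.1,
                lr=0.1, rng=np.random.default_rng(0)):
    """One DP-SGD update; per_example_grads has shape (batch, dim)."""
    # 1. Clip each example's gradient to L2 norm <= clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / (norms + 1e-12))
    # 2. Sum, add Gaussian noise calibrated to the clip norm, then average.
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        scale=noise_multiplier * clip_norm, size=params.shape)
    grad = noisy_sum / per_example_grads.shape[0]
    # 3. Ordinary gradient step; a privacy accountant tracks (epsilon, delta).
    return params - lr * grad

params = np.zeros(4)
grads = np.random.default_rng(1).standard_normal((32, 4))
print(dp_sgd_step(params, grads))
```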
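
As a concrete anchor for label DP, the classical baseline is k-ary randomized response applied to labels only: keep the true label with probability e^eps / (e^eps + k - 1), otherwise flip to a uniformly random other label. This is a textbook mechanism, not the specific algorithms of GGKMZ21 or EMPST21.

```python
import numpy as np

def randomized_response(labels, num_classes, epsilon, rng=np.random.default_rng(0)):
    """k-ary randomized response: each released label satisfies epsilon-label-DP."""
    labels = np.asarray(labels)
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + num_classes - 1)
    keep = rng.random(labels.shape) < p_keep
    # Draw a uniformly random *different* label for the flipped entries.
    shift = rng.integers(1, num_classes, size=labels.shape)
    flipped = (labels + shift) % num_classes
    return np.where(keep, labels, flipped)

print(randomized_response([0, 1, 2, 1, 0], num_classes=3, epsilon=1.0))
```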