Off-Line, Off-Policy RL for Real-World Decision Making at Facebook - #448
1 hr 1 min

Today we’re joined by Jason Gauci, a Software Engineering Manager at Facebook AI.

In our conversation with Jason, we explore their Reinforcement Learning platform, Re-Agent (Horizon). We discuss the role of decision making and game theory in the platform and the types of decisions they’re using Re-Agent to make, from ranking and recommendations to their eCommerce marketplace.

Jason also walks us through the differences between online/offline and on/off policy model training, and where Re-Agent sits in this spectrum. Finally, we discuss the concept of counterfactual causality, and how they ensure safety in the results of their models.

The complete show notes for this episode can be found at

Towards Data Science
The TDS team
72. Margot Gerritsen - Does AI have to be understandable to be ethical?
As AI systems have become more ubiquitous, people have begun to pay more attention to their ethical implications. Those implications are potentially enormous: Google’s search algorithm and Twitter’s recommendation system each have the ability to meaningfully sway public opinion on just about any issue. As a result, Google and Twitter’s choices have an outsized impact — not only on their immediate user base, but on society in general. That kind of power comes with risk of intentional misuse (for example, Twitter might choose to boost tweets that express views aligned with their preferred policies). But while intentional misuse is an important issue, equally challenging is the problem of avoiding unintentionally bad outputs from AI systems. Unintentionally bad AIs can lead to various biases that make algorithms perform better for some people than for others, or more generally to systems that are optimizing for things we actually don’t want in the long run. For example, platforms like Twitter and YouTube have played an important role in the increasing polarization of their US (and worldwide) user bases. They never intended to do this, of course, but their effect on social cohesion is arguably the result of internal cultures based on narrow metric optimization: when you optimize for short-term engagement, you often sacrifice long-term user well-being. The unintended consequences of AI systems are hard to predict, almost by definition. But their potential impact makes them very much worth thinking and talking about — which is why I sat down with Stanford professor, co-director of the Women in Data Science (WiDS) initiative, and host of the WiDS podcast Margot Gerritsen for this episode of the podcast.
1 hr 22 min
Machine Learning Street Talk
Machine Learning Street Talk
#044 - Data-efficient Image Transformers (Hugo Touvron)
Today we are going to talk about the Data-efficient image Transformers (DeiT) paper, of which Hugo is the primary author. One of the recipes of success for vision models since the DL revolution began has been the availability of large training sets. CNNs have been optimized for almost a decade now, including through extensive architecture search, which is prone to overfitting. Motivated by the success of transformer-based models in Natural Language Processing, there has been increasing attention on applying these approaches to vision models. Hugo and his collaborators used a different training strategy and a new distillation token to get a massive increase in sample efficiency with image transformers.
00:00:00 Introduction
00:06:33 Data augmentation is all you need
00:09:53 Now the image patches are the convolutions though?
00:12:16 Where are those inductive biases hiding?
00:15:46 Distillation token
00:21:01 Why different resolutions on training
00:24:14 How data efficient can we get?
00:26:47 Out of domain generalisation
00:28:22 Why are transformers data efficient at all? Learning invariances
00:32:04 Is data augmentation cheating?
00:33:25 Distillation strategies - matching the intermediate teacher representation as well as output
00:35:49 Do ML models learn the same thing for a problem?
00:39:01 What is it like at Facebook AI?
00:41:17 How long is the PhD programme?
00:42:03 Other interests outside of transformers?
00:43:18 Transformers for Vision and Language
00:47:40 Could we improve transformers models? (Hybrid models)
00:49:03 Biggest challenges in AI?
00:50:52 How far can we go with data driven approach?
52 min
Learning Bayesian Statistics
Alexandre ANDORRA
#34 Multilevel Regression, Post-stratification & Missing Data, with Lauren Kennedy
We already mentioned multilevel regression and post-stratification (MRP, or Mister P) on this podcast, but we didn’t dedicate a full episode to explaining how it works, why it’s useful for dealing with non-representative data, and what its limits are. Well, let’s do that now, shall we? To that end, I had the delight to talk with Lauren Kennedy! Lauren is a lecturer in Business Analytics at Monash University in Melbourne, Australia, where she develops new statistical methods to analyze social science data. Working mainly with R and Stan, Lauren studies non-representative data, multilevel modeling, post-stratification, causal inference, and, more generally, how to make inferences from the social sciences. Needless to say, I asked her everything I could about MRP, including how to choose priors, how her recent paper about structured priors can improve MRP, and when MRP is not useful. We also talked about missing data imputation, and how all these methods relate to causal inference in the social sciences.
If you want a bit of background, Lauren completed her undergraduate studies in Psychological Sciences and Maths and Computer Sciences at Adelaide University, with Danielle Navarro and Andrew Perfors, and then did her PhD with the same advisors. She spent 3 years in NYC with Andrew Gelman’s Lab at Columbia University, and then moved back to Melbourne in 2020. Most importantly, Lauren is an avid crocheter: she’s already on her third blanket!
Our theme music is « Good Bayesian », by Baba Brinkman (feat. MC Lars and Mega Ran).
Thank you to my Patrons for making this episode possible! Yusuke Saito, Avi Bryant, Ero Carrera, Brian Huey, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, Adam Bartonicek, William Benton, Alan O'Donnell, Mark Ormsby, Demetri Pananos, James Ahloy, Jon Berezowski, Robin Taylor, Thomas Wiecki, Chad Scherrer, Vincent Arel-Bundock, Nathaniel Neitzke, Zwelithini Tunyiswa, Elea McDonnell Feit, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Joshua Duncan, Ian Moran, Paul Oreto, Colin Caprani, George Ho, Colin Carroll and Nathaniel Burbank.
Links from the show:
Lauren's website
Lauren on Twitter
Lauren on GitHub
Improving multilevel regression and poststratification with structured priors
Using model-based regression and poststratification to generalize findings beyond the observed sample
Lauren's beginners Bayes workshop
MRP in RStanarm
Choosing your rstanarm prior with prior predictive checks
Mister P -- What’s its secret sauce?
Bayesian Multilevel Estimation with Poststratification -- State-Level Estimates from National Polls
MRPyMC3 -...
1 hr 13 min
The Cloudcast
Cloudcast Media
Evolution of Commercial OSS
Joseph “JJ” Jacks (@asynchio, Founder/General Partner, OSS Capital) talks about how Commercial OSS has evolved, coopetition with cloud providers, and what's next for Commercial OSS business models and communities.

SHOW: 492

SHOW SPONSOR LINKS:
* CloudZero - Cloud Cost Intelligence for Engineering Teams
* BMC Wants to Know if your business is on its A-Game
* BMC Autonomous Digital Enterprise
* Datadog Security Monitoring Homepage - Modern Monitoring and Analytics
* Try Datadog yourself by starting a free, 14-day trial today. Listeners of this podcast will also receive a free Datadog T-shirt.

CLOUD NEWS OF THE WEEK
CHECK OUT OUR NEW PODCAST - "CLOUDCAST BASICS"

SHOW NOTES:
* OSS Capital Partners and Advisors
* Commercial Open-Source Software Company Index (COSSI)
* OSS Capital to launch an ETF (with NASDAQ) of OSS Companies in Summer 2021
* Open Consensus - Data Driven Perspectives on Open Source Software
* COSS Community / Open Core Summit
* The Kubernetes State of the Community (Eps.272)
* Exploring the Business Side of Open Source Software (Eps.358)
* Server Side Public License

Topic 1 - Welcome to the show. For those that don’t already know you, tell us a little bit about your background, and some of the things you’re focused on today.
Topic 2 - You’ve been tracking the commercialization of open-source projects for quite a while now. What big trends have you seen evolve over the last two decades (from Red Hat to MongoDB)?
Topic 3 - Even in the face of new OSS-centric offerings from the cloud providers, we still continue to see companies getting funded. What is the sentiment in the VC communities about what the new competitive landscape looks like? Are there new rules in the game?
Topic 4 - We’ve recently seen MongoDB and Elastic changing their licensing model to SSPL. The stock of both companies continues to rise. Is what they are doing a short-term “fix” to a competitive threat, or a critical mistake? Does licensing need to evolve as a company matures?
Topic 5 - Are there fundamental shifts in how OSS companies are created and eventually operationalized happening now?
Topic 6 - Where do you see commercial OSS trending over the next 5 years, and what big changes need to happen to make those realities happen?

FEEDBACK?
* Email: show at thecloudcast dot net
* Twitter: @thecloudcastnet
47 min
Gradient Dissent - A Machine Learning Podcast by W&B
Lukas Biewald
Daphne Koller, CEO of insitro, on digital biology and the next epoch of science
From teaching at Stanford to co-founding Coursera, insitro, and Engageli, Daphne Koller reflects on the importance of education, giving back, and cross-functional research. Daphne Koller is the founder and CEO of insitro, a company using machine learning to rethink drug discovery and development. She is a MacArthur Fellowship recipient, member of the National Academy of Engineering, member of the American Academy of Arts and Sciences, and has been a Professor in the Department of Computer Science at Stanford University. In 2012, Daphne co-founded Coursera, one of the world's largest online education platforms. She is also a co-founder of Engageli, a digital platform designed to optimize student success.
Topics covered:
0:00 Giving back and intro
2:10 insitro's mission statement and Eroom's Law
3:21 The drug discovery process and how ML helps
10:05 Protein folding
15:48 From 2004 to now, what's changed?
22:09 On the availability of biology and vision datasets
26:17 Cross-functional collaboration at insitro
28:18 On teaching and founding Coursera
31:56 The origins of Engageli
36:38 Probabilistic graphical models
39:33 Most underrated topic in ML
43:43 Biggest day-to-day challenges
Get our podcast on these other platforms: Apple Podcasts, Spotify, Google, YouTube, Soundcloud.
Tune in to our bi-weekly virtual salon and listen to industry leaders and researchers in machine learning share their research.
Join our community of ML practitioners where we host AMAs, share interesting projects and meet other people working in Deep Learning.
Our gallery features curated machine learning reports by researchers exploring deep learning techniques, Kagglers showcasing winning models, and industry leaders sharing best practices.
46 min