Fighting Churn with Data Science | Carl Gold, PhD
1 hr 9 min

Carl is a former Wall Street Quant turned data scientist who is leading the battle against churn, using data as his weapon.

He uses a variety of tools and techniques to analyze data from online systems, and his expertise led to the creation of the Subscription Economy Index.

Currently, he’s the Chief Data Scientist at Zuora - a comprehensive subscription management platform and newly public Silicon Valley “unicorn” with more than 1,000 customers worldwide.

FIND CARL ONLINE

Website: https://fightchurnwithdata.com/

LinkedIn: https://www.linkedin.com/in/carlgold/

Twitter: https://twitter.com/carl24k

GitHub: https://github.com/carl24k

WHAT YOU'LL LEARN

[00:16:01] What is churn?

[00:21:48] Metrics for understanding churn

[00:24:01] Feature engineering for churn

[00:27:22] Why ratio metrics are the best bet in your battle against churn

[00:33:09] Dealing with outliers

[00:39:34] More feature engineering tips

QUOTES

[09:06] "When I started out, of course, people thought machine learning was trash...No one was that interested in machine learning back in the early 2000s. It wasn't until after Google essentially had showed how much they could do with machine learning in a production environment with big data."

[12:22] "It should enable better decisions, too. Not just faster decisions by getting the right data to the right people and giving them the right tools. We really should see companies making more optimal decisions."

[13:30] "There should be like a Hippocratic Oath for data scientists, which means that goes beyond just you don't want to make mistakes. It means that you shouldn't be working on those, you know, on those dangerous applications."

[22:04] "The features that you choose in my mind are really the main part of solving any data science problem and not the algorithm. I show actually in my book that if you do a good job on your feature engineering, the algorithm that you choose is not that important for your accuracy. So feature engineering always has number one importance in data science."

SHOW NOTES

[00:01:31] Introduction for our guest

[00:02:54] Carl’s path into data science

[00:04:30] The fascination with churn

[00:08:04] How much more hyped do you think the field has become since you first broke into it?

[00:09:41] Where do you see the field headed in the next two to five years?

[00:11:20] What do you think will be the biggest positive impact that data science has on society in the next two to five years?

[00:12:36] What do you think would be the scariest application of machine learning and data science in the next two to five years?

[00:13:17] As practitioners of machine learning, what do you think would be some of our biggest concerns when we're out there doing our work?

[00:16:01] What is churn? Is it what we do when we make butter?

[00:17:27] So why is churn so hard to fight?

[00:21:48] The importance of metrics in our battle against churn

[00:24:01] How do we go from raw event data to metrics?

[00:24:45] How do cohorts help us analyze, predict, and understand churn?

[00:27:22] What are ratio metrics and why are they so powerful?

[00:33:09] Why are outliers so problematic to deal with?

model and get information from them, but without them ruining your numbers.

[00:34:57] What are some common mistakes that you've seen data scientists make when it comes to dealing with outliers?

[00:39:14] How can we be more thoughtful when it comes to feature engineering?

[00:42:31] Debunking the common misconception that the choice of algorithm is the most important thing that contributes to model performance.

[00:43:56] Your features don’t need to be the most creative

[00:45:28] Your job isn’t over once you deploy the model

[00:49:05] In the context of churn, what do we need to monitor and track to make sure that our model is performing as we designed it?

[00:50:26] How COVID is messing up everyone’s churn models

[00:53:14] Is data science an art or science?

[00:55:24] What are some soft skills that data scientists are missing that would help them take their careers to the next level?

[00:56:51] How could a data scientist develop their business acumen and their product sense?

[00:57:44] What to do with these crazy job descriptions

[00:59:27] What’s the one thing you want people to learn from your story?

[01:00:39] The lightning round

Special Guest: Carl Gold, PhD.
