MLOps.community
Jan 21, 2021
When Machine Learning meets privacy - Episode 9
42 min

**Private data, Data Science friendly**

Data scientists are always eager to get their hands on more data, particularly when that data holds value that can be extracted. In real-world situations, however, data is often less abundant than we assume. In other cases the data exists but cannot be shared with other entities because of privacy concerns, which makes the work of data scientists not only hard but sometimes impossible.

// Abstract:

In the last episode of this series, we decided to bring not one but two guests to tell us how synthetic data can unlock the use of data for Data Science teams whenever privacy concerns are a reality. Jean-François Rajotte, Researcher and Resident Data Scientist at the University of British Columbia, and Sumit Mukherjee, Senior Applied Scientist at Microsoft's AI for Good, share their expertise not only in synthetic data generation but also in its mind-blowing combination with Federated Learning to take the healthcare sector to the next level of AI adoption.


//Other links to check on Jean-François Rajotte:

https://venturebeat.com/2021/01/20/microsofts-felicia-taps-ai-to-enable-health-providers-to-share-data-anonymously/

https://dsi.ubc.ca/

https://leap-project.github.io/


//Other links to check on Sumit Mukherjee:

www.sumitmukherjee.com (Sumit's research)

https://arxiv.org/abs/2101.07235

https://arxiv.org/pdf/2009.05683.pdf

https://github.com/microsoft/privGAN (privGAN)


//Final thoughts

Feel free to drop some questions into our Slack channel (https://go.mlops.community/slack)

Watch some of the other podcast episodes and old meetups on the channel: https://www.youtube.com/channel/UCG6qpjVnBTTT8wLGBygANOQ

----------- Connect With Us ✌️-------------

Join our Slack community:  https://go.mlops.community/slack

Follow us on Twitter:  @mlopscommunity

Sign up for the next meetup: https://go.mlops.community/register

Connect with Fabiana on LinkedIn: https://www.linkedin.com/in/fabiana-clemente/

Connect with Jean-François on LinkedIn: https://www.linkedin.com/in/jfraj/

Connect with Sumit on LinkedIn: https://www.linkedin.com/in/sumitmukherjee2/

Gradient Dissent - A Machine Learning Podcast by W&B
Lukas Biewald
Daphne Koller, CEO of insitro, on digital biology and the next epoch of science
From teaching at Stanford to co-founding Coursera, insitro, and Engageli, Daphne Koller reflects on the importance of education, giving back, and cross-functional research.

Daphne Koller is the founder and CEO of insitro, a company using machine learning to rethink drug discovery and development. She is a MacArthur Fellowship recipient, member of the National Academy of Engineering, member of the American Academy of Arts and Sciences, and has been a Professor in the Department of Computer Science at Stanford University. In 2012, Daphne co-founded Coursera, one of the world's largest online education platforms. She is also a co-founder of Engageli, a digital platform designed to optimize student success.

https://www.insitro.com/
https://www.insitro.com/jobs
https://www.engageli.com/
https://www.coursera.org/

Follow Daphne on Twitter: https://twitter.com/DaphneKoller
https://www.linkedin.com/in/daphne-koller-4053a820/

Topics covered:
0:00 Giving back and intro
2:10 insitro's mission statement and Eroom's Law
3:21 The drug discovery process and how ML helps
10:05 Protein folding
15:48 From 2004 to now, what's changed?
22:09 On the availability of biology and vision datasets
26:17 Cross-functional collaboration at insitro
28:18 On teaching and founding Coursera
31:56 The origins of Engageli
36:38 Probabilistic graphical models
39:33 Most underrated topic in ML
43:43 Biggest day-to-day challenges

Get our podcast on these other platforms:
Apple Podcasts: http://wandb.me/apple-podcasts
Spotify: http://wandb.me/spotify
Google: http://wandb.me/google-podcasts
YouTube: http://wandb.me/youtube
Soundcloud: http://wandb.me/soundcloud

Tune in to our bi-weekly virtual salon and listen to industry leaders and researchers in machine learning share their research: http://wandb.me/salon

Join our community of ML practitioners where we host AMAs, share interesting projects, and meet other people working in Deep Learning: http://wandb.me/slack

Our gallery features curated machine learning reports by researchers exploring deep learning techniques, Kagglers showcasing winning models, and industry leaders sharing best practices: https://wandb.ai/gallery
46 min
Machine Learning Engineered
Charlie You
Bringing DevOps Best Practices into Machine Learning with Benedikt Koller from ZenML
Benedikt Koller is a self-professed "Ops guy", having spent over 12 years working in roles such as DevOps engineer, platform engineer, and infrastructure tech lead at companies like Stylight and Talentry, in addition to his own consultancy, KEMB. He recently dove head first into the world of ML, where he hopes to bring his extensive ops knowledge into the field as the co-founder of Maiot, the company behind ZenML, an open source MLOps framework.

Learn more:
https://zenml.io/
https://maiot.io/

Every Thursday I send out the most useful things I've learned, curated specifically for the busy machine learning engineer. Sign up here: https://www.cyou.ai/newsletter

Follow Charlie on Twitter: https://twitter.com/CharlieYouAI
Subscribe to ML Engineered: https://mlengineered.com/listen
Comments? Questions? Submit them here: http://bit.ly/mle-survey
Take the Giving What We Can Pledge: https://www.givingwhatwecan.org/

Timestamps:
02:15 Introducing Benedikt Koller
05:30 What the "DevOps revolution" was
10:10 Bringing good Ops practices into ML projects
30:50 Pivoting from vehicle predictive analytics to open source ML tooling
34:35 Design decisions made in ZenML
39:20 Most common problems faced by applied ML teams
49:00 The importance of separating configurations from code
55:25 Resources Ben recommends for learning Ops
57:30 What to monitor in ML pipelines
01:00:45 Why you should run experiments in automated pipelines
01:08:20 The essential components of an MLOps stack
01:10:25 Building an open source business and what's next for ZenML
01:20:20 Rapid fire questions

Links:
https://github.com/maiot-io/zenml (ZenML's GitHub)
https://blog.maiot.io/ (Maiot Blog)
https://12factor.net/ (The Twelve Factor App)
https://blog.maiot.io/12-factors-of-ml-in-production/ (12 Factors of reproducible Machine Learning in production)
https://www.seldon.io/ (Seldon)
https://www.pachyderm.com/ (Pachyderm)
https://www.kubeflow.org/ (KubeFlow)
https://www.penguinrandomhouse.com/books/566988/something-deeply-hidden-by-sean-carroll/ (Something Deeply Hidden)
https://www.goodreads.com/series/56399-the-expanse (The Expanse Series)
https://us.macmillan.com/books/9780765382030 (The Three Body Problem)
https://echelonfront.com/extreme-ownership/ (Extreme Ownership)
1 hr 28 min
Machine Learning Street Talk
Machine Learning Street Talk
#045 Microsoft's Platform for Reinforcement Learning (Bonsai)
Microsoft has an interesting strategy with their new “autonomous systems” technology, also known as Project Bonsai. They want to create an interface to abstract away the complexity and esoterica of deep reinforcement learning. They want to fuse together expert knowledge and artificial intelligence all on one platform, so that complex problems can be decomposed into simpler ones. They want to take machine learning Ph.Ds out of the equation and make autonomous systems engineering look more like a traditional software engineering process. It is an ambitious undertaking, but interesting. Reinforcement learning is extremely difficult (as I cover in the video), and if you don’t have a team of RL Ph.Ds with tech industry experience, you shouldn’t even consider doing it yourself. This is our take on it!

There are 3 chapters in this video:
Chapter 1: Tim's intro and take on RL being hard, intro to Bonsai and machine teaching
Chapter 2: Interview with Scott Stanfield [recorded Jan 2020] 00:56:41
Chapter 3: Traditional street talk episode [recorded Dec 2020] 01:38:13

This is *not* an official communication from Microsoft, all personal opinions. There is no MS-confidential information in this video.

With:
Scott Stanfield https://twitter.com/seesharp
Megan Bloemsma https://twitter.com/BloemsmaMegan
Gurdeep Pall (he has not validated anything we have said in this video or been involved in the creation of it) https://www.linkedin.com/in/gurdeep-pall-0aa639bb/

Panel:
Dr. Keith Duggar
Dr. Tim Scarfe
Yannic Kilcher
2 hr 30 min
Women in Data Science
Professor Margot Gerritsen
Kristian Lum | Applying Statistics to Promote Fairness and Transparency
Kristian’s interest in statistics and algorithmic fairness has taken her on a winding career path from academia to business, to public service, and back to academia. As she made different career changes, she didn’t choose between academia, industry, and non-profit; it was more about the problem she was interested in working on at the moment and what else was happening in her life.

After she earned her PhD in Statistical Science from Duke University, she worked as a research professor at Virginia Tech, where she did microsimulation and agent-based modeling in a simulation lab. After that, she tried a data visualization and analytics startup called DataPad that was quickly acquired. When she was thinking about the next step in her career, she wanted to do something with social impact. She was fascinated by the work of the Human Rights Data Analysis Group (HRDAG), which was applying statistical models to casualty data to estimate the number of undocumented conflict casualties. She spent a summer working for HRDAG in Colombia and then decided to join the organization full time.

She spent five years as HRDAG’s lead statistician, leading the group’s project on criminal justice in the United States focused on algorithmic fairness and predictive policing. Predictive policing uses algorithms to help the police decide where to deploy their resources based on crime statistics: the areas where crimes are predicted to be most likely are policed more often. Kristian’s work showed that these algorithms could actually perpetuate historical over-policing and racial bias in minority communities.

Early this year, she moved from HRDAG back to academia. She started her new position at the University of Pennsylvania in the Computer and Information Science Department on March 2, and a week later Penn closed down for COVID. Over this year, she has learned that she needs to adjust her expectations for herself and not be so frustrated when she can't get things done that she could under normal circumstances. It's not just working from home with her daughter nearby; it's the stress of everything that's going on and the additional mental fatigue of having to do all these risk calculations. This year has also made her appreciate the increasingly critical role of data science in driving data-driven decision making.

RELATED LINKS
Connect with Kristian Lum on LinkedIn and Twitter
Learn more about Penn Engineering
Learn more about HRDAG
Connect with Margot Gerritsen on Twitter (@margootjeg) and LinkedIn
Find out more about Margot on her Stanford Profile
31 min