with Sam Ramji
We all know the future of software is “cloud-native.” How did we get here? What’s coming next? Join Sam Ramji and become part of this growing community of friends. Let's go far together.
Nov 15, 2023
Throwback: The AI-Native Stack with Mikiko Bazeley, Zain Hasan, and Tuana Celik
This episode features a panel discussion with Mikiko Bazeley, Head of MLOps at Featureform; Zain Hasan, Senior Developer Advocate at Weaviate; and Tuana Celik, Developer Advocate at deepset. In this episode, Mikiko, Zain, and Tuana discuss what open source data means to them, how their companies fit into the AI-first ecosystem, and how jobs will need to evolve with the AI-native stack. ------------------- “We're almost part of a fancy new AI robot kitchen that you'd find in Tokyo, in some ways. I see a virtual feature store as, yes, you can have a bunch of your ingredients tossed into a closet. Or, what you can do is you can essentially have a nice way to organize them. You can have a way to label them, to capture information.” – Mikiko Bazeley “I really like that analogy as well. I like how Mikiko put it where a vector search engine is really extracting value from what you've already got. [...] So where I see vector search engines, really, is if we think of these embedding providers as the translators to take all of our unstructured data and bring it into vector space into a common machine language, vector search engines are essentially the workhorses that allow us to compute and search over these objects in vectorized format. They're essentially the calculators of the AI stack.” – Zain Hasan “Haystack, I would really position as the kitchen. I need Mikiko to bring the apples. I need Zain to bring the pears. I need Hugging Face or OpenAI to bring the oranges to make a good fruit salad. But, Haystack will provide the spoons and the pans and the knives to make that into something that works together.” – Tuana Celik ------------------- Episode Timestamps: (02:58): What open source data means to the panelists (09:11): What interested the panelists about AI/ML (24:10): Mikiko explains Featureform (27:00): Zain explains Weaviate (30:23): Tuana explains deepset (36:00): The panelists discuss how their companies fit into the AI-first ecosystem (44:58): How jobs need to evolve with the AI-native stack (54:35): Executive producer, Audra Montenegro's backstage takeaways ------------------- Links: LinkedIn - Connect with Mikiko Visit Featureform LinkedIn - Connect with Zain Visit Weaviate LinkedIn - Connect with Tuana Visit deepset Visit Data-centric AI
Nov 1, 2023
How We Should Think About Data Reliability for Our LLMs with Mona Rakibe
This episode features an interview with Mona Rakibe, CEO and Co-founder of Telmai, an AI-based data observability platform built for open architecture. Mona is a veteran in the data infrastructure space and has held engineering and product leadership positions that drove product innovation and growth strategies for startups and enterprises. She has served companies like Reltio, EMC, Oracle, and BEA where AI-driven solutions have played a pivotal role. In this episode, Sam sits down with Mona to discuss the application of LLMs, cleaning up data pipelines, and how we should think about data reliability. ------------------- “When this push of large language model generative AI came in, the discussions shifted a little bit. People are more keen on, ‘How do I control the noise level in my data, in-stream, so that my model training is proper or is not very expensive, we have better precision?’ We had to shift a little bit that, ‘Can we separate this data in-stream for our users?’ Like good data, suspicious data, so they train it on little bit pre-processed data and they can optimize their costs. There's a lot that has changed from even people, their education level, but use cases also just within the last three years. Can we, as a tool, let users have some control and what they define as quality data reliability, and then monitor on those metrics was some of the things that we have done. That's how we think of data reliability. Full pipeline from ingestion to consumption, ability to have some human’s input in the system.” – Mona Rakibe ------------------- Episode Timestamps: (01:04): The journey of Telmai (05:30): How we should think about data reliability, quality, and observability (13:37): What open source data means to Mona (15:34): How Mona guides people on cleaning up their data pipelines (26:08): LLMs in real life (30:37): A question Mona wishes to be asked (33:22): Mona’s advice for the audience (36:02): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Mona Learn more about Telmai
Oct 18, 2023
Throwback: Open Source Innovation, The GPL for Data, and The Data In to Data Out Ratio with Larry Augustin
This episode features an interview with Larry Augustin, angel investor and advisor to early-stage technology companies. Larry previously served as the Vice President for Applications at AWS, where he was responsible for application services like Pinpoint, Chime, and WorkSpaces. Before joining AWS, Larry was the CEO of SugarCRM, an open source CRM vendor. He also was the founder and CEO of VA Linux, where he launched SourceForge. Among the group who coined the term “open source”, Larry has sat on the boards of several open source and Linux organizations. In this episode, Sam and Larry discuss who owns the rights to data, the data in to data out ratio, and why Larry is an open source titan. ------------------- "People are willing to give up so much of their personal information because they get an awful lot back. And privacy experts come along and say, ‘Well, you're taking all this personal information’. But then most people look at that and say, ‘But I get a lot of value back out of that.’ And it's this data ratio value question, which is: for a little in, I get a lot back. That becomes a key element in this. And I think there has to be some kind of similar thought process around open source data in general, which is if I contribute some data into this, I'm going to get a lot of value back. So this data in to data out ratio, I think it's an incredibly important one. And it gets everyone in the mindset of, ‘How do I provide more and more and take less and less?’ It's a principle of application development that I like a lot. And I think there's a similar concept here around open source data. Are there models or structures that we can come up with where people can contribute small amounts of data and as a result of that, they get back a lot of value.” – Larry Augustin ------------------- Episode Timestamps: (02:52): How Larry is spending his time now after AWS (06:25): What drove Larry to open source (18:41): What is the GPL for data? 
(24:28): Areas of progress in open source data (28:57): The data in to data out ratio (36:39): Larry’s advice for folks in open source ------------------- Links: LinkedIn - Connect with Larry Twitter - Follow Larry
Sep 27, 2023
Reframing Machine Learning and AI-Assisted Development with Jorge Torres
This episode features an interview with Jorge Torres, Co-founder and CEO of MindsDB. MindsDB is a virtual AI database that works with existing data to help developers build AI-centered apps. In 2008, Jorge began his work on scaling solutions using machine learning as the first full-time engineer at Couchsurfing, growing the company from a few thousand users to a few million. He has also served a number of data-intensive start-ups and was a visiting scholar at UC Berkeley researching machine learning automation and explainability. In this episode, Sam and Jorge discuss the inspiration and challenges behind MindsDB, classic data science AI versus applied AI, and time series transformers. ------------------- “So much data in the world is time series data, so much data. Even data that people don't know is time series, it's time series. So long as it’s moving over time, it is time series data. Whether you store it or not, that's a different thing. For having a pre-trained model on time series data, it even enabled the fact that you don't have to store all the historical data. You can just take the model and start passing data as it comes through, and then you get out the forecast. So you don't even have to have the historical data. All you need to have is the data at that given instance, and you can pass it to the model and you get an output. It's mind blowing.” – Jorge Torres ------------------- Episode Timestamps: (05:20): The inspiration behind MindsDB (10:20): Classic data science AI approach vs. applied AI (22:09): What open source data means to Jorge (28:51): What excites Jorge about Nixtla and time series transformers (37:07): A question Jorge wishes to be asked (40:20): Jorge’s advice for the audience (41:38): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Jorge Learn more about MindsDB open source code Learn more about MindsDB
Sep 6, 2023
A Sam Ramji Feature: The Evolution of Open Source, Kubernetes, and AI's Forward Journey
On this episode, we’ve partnered with the Future Rodeo podcast for a discussion between Sam and Matt Wallace. Matt is the Chief Technology Officer and EVP at Faction, a pioneer of multi-cloud data services, and host of Future Rodeo. In this episode, Sam and Matt discuss Microsoft’s transformation, the impact of Kubernetes on container orchestration, and the rapid acceleration of AI research and development. ------------------- Episode Timestamps: (01:38): Microsoft’s open source transformation (13:19): The impact of Kubernetes and how it defragmented the industry (22:06): The transformative power of AI and how it’s changing the value of reasoning (54:58): The concept of cognitive economy and its potential impact on AI and software development (01:03:25): Potential implications of advancements in robotics, AI, and clean energy (01:04:17): Sam’s advice for those entering the industry or choosing a career path ------------------- Links: LinkedIn - Connect with Matt Listen to the Future Rodeo podcast
1 hr 10 min
Aug 23, 2023
The Importance of Open Source Data for Generative AI, Now and in the Future with Abby Kearns
This episode features an interview with Abby Kearns, technology executive, board director, and angel investor. Her career has spanned executive leadership, product marketing, product management, and consulting across Fortune 500 companies and startups, including Puppet, Cloud Foundry Foundation, and Verizon. Abby currently serves as a board director for Lightbend, Stackpath, and Invoke. In this episode, Sam sits down with Abby to discuss the betrayal source license, the role open source plays in AI, and empowering trust. ------------------- “There's so much happening so quickly that I think open source has the power to help harness a lot of that innovative conversation. In a way that I think it's going to be really, really hard to match in a proprietary way. I think open source and the ability, given the fact that we're talking about AI and data, the two are very interrelated at this point. AI is not super interesting without data. I think the power of open source right now and what's happening, I think it has to happen in open source and I think it really has to have that level of transparency and visibility. But, always the ability for everyone to step up and understand what's happening at this moment in time and shape it.” – Abby Kearns ------------------- Episode Timestamps: (00:50): Sam and Abby discuss the betrayal source license (14:12): What open source data means to Abby (23:30): Abby dives into the companies she’s investing in (34:30): How nonprofits can empower trust (38:32): A question Abby wishes to be asked (40:21): Abby’s advice for the audience (43:53): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Abby Twitter - Follow Abby Read _Design the Life You Love_
Aug 9, 2023
The Value of Reproducibility and Ease of AI Deployment with Daniel Lenton
This episode features an interview with Daniel Lenton, Founder and CEO of Ivy, where the team is on a mission to unify the fragmented AI stack. Prior to Ivy, Daniel was a Robotics Research Engineer at Dyson and a Deep Learning Research Scientist for Amazon Prime Air. During his PhD, Daniel explored the intersection between learning-based geometric representations, ego-centric perception, spatial memory, and visuomotor control for robotics. In this episode, Sam and Daniel discuss the inspiration behind Ivy, open source reproducibility, and democratizing AI. ------------------- "There's too much amazing stuff going on, from too many different parties. We just want to be the objective source of truth to show you the data and show you where your model will be doing best, and continue to do this as a service or something like this. This is high-level, some of the areas we see and going into, we really want to be a useful tool for anybody that wants to just kind of understand this fragmented complex space quickly and intuitively, and we are trying to be the tool that does that." – Daniel Lenton ------------------- Episode Timestamps: (01:00): What open source data means to Daniel (05:37): The challenges of building Ivy (15:37): The future of Ivy (25:19): Who should know about Ivy (28:46): Daniel’s advice for the audience (32:00): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Daniel Learn more about Ivy
Jul 26, 2023
ML Engineering Teams and Niche Chat Bot Experiences with Demetrios Brinkmann
This episode features an interview with Demetrios Brinkmann, Founder of the MLOps Community, an organization for people to share best practices around MLOps. Demetrios fell into the Machine Learning Operations world and has since interviewed leading names around MLOps, data science, and machine learning. In this episode, Sam sits down with Demetrios to discuss LLM in production use cases, ML engineering teams, and the LLM Survey Report from the MLOps Community. ------------------- "I think the most novel ones that I saw from the survey were when a chat bot would prompt a human as opposed to the human prompting the chat bot. It's almost like you have this LLM coach. And in that way, it's not necessarily like this isn't LLM in production that an end user is getting that's not outside the business or that is outside the business. It's more like internally, you can think about maybe it's an accountant and the accountant is filing my taxes for the year. As they're filing them, the LLM is prompting them on different tax laws that maybe they weren't thinking about or different ways that they could file things." – Demetrios Brinkmann ------------------- Episode Timestamps: (04:30): LLMs as the new standard (19:26): Key LLM in production use cases (31:18): What open source data means to Demetrios (34:36): What Demetrios is seeing in open source AI models (42:44): One question Demetrios wishes to be asked (44:41): Demetrios’s advice for the audience (47:19): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Demetrios Read the LLM Survey Report Listen to The MLOps Podcast
Jul 12, 2023
Building With Trust, Inspiration, and Reputation with Jaya Gupta, Yuliia Tkachova, and Omoju Miller
This bonus episode features conversations from season 5 of the Open||Source||Data podcast. In this episode, you’ll hear from Jaya Gupta, Partner at Foundation Capital; Yuliia Tkachova, Co-founder and CEO of Masthead Data; and Omoju Miller, Founder and CEO of Fimio. Sam sat down with each guest to discuss how they are building foundations for trust, inspiration, and reputation as we all race into the AI-centric future. You can listen to the full episodes from Jaya Gupta, Yuliia Tkachova, and Omoju Miller by clicking the links below. ------------------- Episode Timestamps: (00:49): Jaya Gupta (01:48): Yuliia Tkachova (03:03): Omoju Miller ------------------- Links: Listen to Jaya’s episode Listen to Yuliia’s episode Listen to Omoju’s episode
Jun 28, 2023
FMOps and a Founders Automated Future with Jaya Gupta
This episode features an interview with Jaya Gupta, Partner at Foundation Capital, where she leads early-stage investments across the enterprise software stack. Previously, Jaya was a Senior Business Analyst at McKinsey & Company focusing on software diligence and helping startups expand their go-to-market strategies. In this episode, Sam and Jaya discuss her journey to Foundation Model Ops, how software is becoming more accessible, and the democratization of AI tools. ------------------- "At the end of the day, FMOps isn't just about the new tools. It's actually more about the new builders, the new workflows, and a completely new market of customers. I was on the other day, looking at LangChain's page of integrations, I don't know if you've seen it, but it's like Anyscale, Databricks, all these other huge legendary companies are integrating with LangChain, and I think it's clear that there's a huge community that is building something real and valuable." – Jaya Gupta ------------------- Episode Timestamps: (01:05): What open source data means to Jaya (08:51): Jaya’s journey to Foundation Model Ops (15:58): How software is becoming more accessible (23:04): The democratization of AI tools (27:01): One question Jaya wishes to be asked (29:32): Jaya’s advice for the audience (31:51): Backstage takeaways with executive producer, Audra Montenegro ------------------- Links: LinkedIn - Connect with Jaya Follow Jaya on Twitter Learn more about FMOps
May 31, 2023
Web3 and Putting Reputation on Code with ML with Omoju Miller
This episode features an interview with Omoju Miller, Founder and CEO of Fimio, a web3 reputation company. Originally from Lagos, Nigeria, Omoju holds a doctoral degree in Computer Science Education from UC Berkeley. Her expertise in machine learning and computational intelligence led her to companies such as Google and GitHub. Omoju also served as a volunteer advisor to the Obama administration’s White House Presidential Innovation Fellows. In this episode, Sam sits down with Omoju to discuss how machine learning can make applications more secure, what the future of the internet looks like, and the fascinating story behind Fimio. ------------------- “So my first view is, in this future internet we have people, we also have bots, we have machines, we have code doing things. And bots sounds like such a horrible word now. [...] You need to have a level of trust on what that bot is. Everything from the humans to the machines collaborating in this decentralized world, we need to hav…
1 hr 2 min
May 17, 2023
The Human Right to Privacy and Caring About UX Design with Yuliia Tkachova
This episode features an interview with Yuliia Tkachova, Co-founder and CEO of Masthead Data, an observability platform that catches anomalies in Google BigQuery in real-time. She holds degrees in Management Information Systems, Math, Statistics, and Marketing. Prior to Masthead, Yuliia designed complex BI products and solutions powered by ML and utilized by Fortune 500 companies. In this episode, Sam and Yuliia discuss how ML is shaping the future of data analytics, caring about users, and the fundamental human right to privacy. ------------------- “We map those errors and anomalies on lineage, helping to understand what upstreams and downstreams are affected, what business users are affected. And that actually speeds up all the troubleshooting from hours to minutes. And this is the ultimate goal where we deliver. Because again, my belief that if you don't have this lineage piece was mapped anomalous in errors, it's not observability. It's monitoring. [...] What is also very uniq…
May 3, 2023
Determinism in Complex Environments and Workflow Services with Maxim Fateev
This episode features an interview with Maxim Fateev, Co-founder and CEO of Temporal, an open source, distributed, and scalable workflow orchestration engine capable of running millions of workflows. He has 20 years of experience architecting mission-critical systems at Uber, Google, Amazon, and Microsoft. In this episode, Sam sits down with Maxim to discuss workflow services, the power behind Temporal, and bringing determinism to highly complex environments. ------------------- “[Temporal] has this notion of workflows, which can run for a very long time and handle external events, you can treat them as a durable actor. And they're very good at implementing a lifecycle. For example, you can have an object per model and let this object handle all the events. Like, new data came in, notify this object, this object will go and retrain it. Or, it'll run an activity to periodically check the status. So you can have end-to-end lifecycle implemented fully in Temporal.” – Maxim Fateev…
Mar 15, 2023
The AI-Native Stack in Practice with Charna Parkey and Sam Bean
This episode features a panel discussion with Charna Parkey, a Real-Time AI Product and Strategy leader at DataStax; and Sam Bean, Staff Engineer at You.com. Charna is a co-author and inventor on several patents, including patent-pending work on ML/coordinated feature engine at the edge. Sam helped create the Spark connector to Weaviate, and is passionate about Big Data, Spark, NLP, Hugging Face, and large language models. In this episode, Charna and Sam discuss adapting to user expectations, what’s missing in the AI stack, and how to become an advanced citizen in open source. ------------------- "We've seen these companies start to better understand that these streaming technologies have a place, whether it's Kafka or Flink or Pulsar, but it's still incredibly difficult to use and we need a different level of abstraction. [...] We're starting to see the stack change so that it becomes more interchangeable of the components and try to sort of raise that layer of abstraction so tha…
1 hr 6 min
Mar 1, 2023
The AI-Native Stack with Mikiko Bazeley, Zain Hasan, and Tuana Celik
This episode features a panel discussion with Mikiko Bazeley, Head of MLOps at Featureform; Zain Hasan, Senior Developer Advocate at Weaviate; and Tuana Celik, Developer Advocate at deepset. In this episode, Mikiko, Zain, and Tuana discuss what open source data means to them, how their companies fit into the AI-first ecosystem, and how jobs will need to evolve with the AI-native stack. ------------------- “We're almost part of a fancy new AI robot kitchen that you'd find in Tokyo, in some ways. I see a virtual feature store as, yes, you can have a bunch of your ingredients tossed into a closet. Or, what you can do is you can essentially have a nice way to organize them. You can have a way to label them, to capture information.” – Mikiko Bazeley “I really like that analogy as well. I like how Mikiko put it where a vector search engine is really extracting value from what you've already got. [...] So where I see vector search engines, really, is if we think of these embedding…
Feb 22, 2023
Special Episode: Data on Kubernetes and Cassandra Forward with Patrick McFadin
This special episode of Open||Source||Data features an interview with Patrick McFadin. Patrick has been a distributed systems hacker since he first plugged a modem into his Atari computer. Looking for adventure, he joined the US Navy, working on the Naval Tactical Data System (NTDS), which cemented his love of distributed systems. He is now an Apache Cassandra Committer, and is the Vice President of Developer Relations at DataStax. Sam catches up with Patrick at Data Day Texas to discuss his book _Managing Cloud Native Data on Kubernetes_, Cassandra Forward, and the future of Apache Cassandra. ------------------- “I can now use my Parquet file in Iceberg or DuckDB, and this is data that I created with Cassandra. And we're not getting to the point where we have to reinvent an entire database. We can just connect the Lego parts together and if they're open, then I don't have these encumbrances. I'm not like, ‘Well, I can connect that if I call a salesperson and get a license.’ […
Feb 15, 2023
Making Graph Data Easier with Open Initiatives with Denise Gosnell
This episode features an interview with Denise Gosnell, Principal Product Manager at Amazon Web Services. At AWS, Denise leads product and strategy for Amazon Neptune, a fully managed graph database service. Her career centers on her passion for examining, applying, and advocating for the applications of graph data. Denise has also authored, patented, and spoken on graph theory, algorithms, databases, and applications across all industry verticals. In this episode, Sam sits down with Denise to discuss graph initiatives, the future of developer models, and what Denise learned from hiking the Appalachian Trail. ------------------- “We just open sourced something called graph-explorer, which is something for the community by the community, Apache 2.0 license. graph-explorer is a low-code visualization tool. But, the best part about it is that it works for JanusGraph, it works for Blazegraph, it works for all of these graph models that we've talked about, because we've got this divide…
Feb 1, 2023
Advising Big Data and The Future of AI/ML with Ben Lorica
This episode features an interview with Ben Lorica, Co-founder and Principal of Gradient Flow, a company that provides a wide range of content on data and technology. Ben is an industry expert on data, machine learning, and AI. He is a Technical Advisor for Databricks, a program chair for several data conferences, and he hosts The Data Exchange Podcast. In this episode, Sam and Ben discuss Big Data and the improvements and future opportunities of AI and machine learning. ------------------- “The reason I use the word decentralize is because when you try to explain it to someone, let's say you want to train a different model for each user, or region, or sensor, or device. So you can't use necessarily just personalized because recommenders can be personalized, but they're still centralized models.” – Ben Lorica ------------------- Episode Timestamps: (01:17): What open source data means to Ben (05:54): What intrigued Ben about Big Data (12:07): What brought Ben to working o…
Jan 18, 2023
Functional Programming and an Ideal Data Stack Building Experience with Holden Karau
This episode features an interview with Holden Karau, an Open Source Engineer at Netflix. Holden is best known for her work on Apache Spark, her advocacy in the open source software movement, and her creation of a variety of related projects including spark-testing-base. Previously, Holden worked at Big Tech companies like Apple, IBM, and Google as a software engineer and developer advocate. In this episode, Sam sits down with Holden to discuss the data analysis stack, functional programming, and the future of open source software data tooling. ------------------- “These things are not one off. We may think that they're one off and they don't need testing, but that's not the reality. When you write something, it needs to be maintainable and as software people, the only real way that I think we know to make something vaguely maintainable is to at least have tests. And these tests need to cover common failure cases that we've experienced. And certainly, there's different approaches…
Jan 4, 2023
Workflow Engines and Building a Domain Specific Language for Data Quality with Tom Baeyens
This episode features an interview with Tom Baeyens, Co-founder and CTO of Soda, where he oversees the company's product development, software architecture, and technology strategy. He is passionate about open source and committed to building a community where data engineers can succeed using the Soda Data Monitoring Platform. Tom is the inventor of the widely used open source projects jBPM and Activiti. He also co-founded Effektif, a cloud process automation company. In this episode, Sam and Tom discuss the evolution of open source workflow engines, data contracts, and why data quality needs a language approach. ------------------- “Where we're heading is what I think is exactly the same as with software engineering in the testing. Test-driven development was a radical new thing back then. But then it turns out, you can much more reliably release software. And this is exactly the same here. If you don't inject data testing, data observability throughout your data stack, then how a…
Dec 14, 2022
Enabling Edge Workers, AI & ML, and The Future of Data Science with Matthew Rocklin
This episode features an interview with Matthew Rocklin, CEO of Coiled, the scalable Dask-based cloud platform. Prior to founding Coiled, Matthew worked on Dask at Anaconda and then NVIDIA where his teams focused on accelerating Dask through parallel computing and GPUs. Matthew is an industry speaker, author, and founding member of Pangeo, whose mission is to develop open source analysis tools for ocean, atmosphere, and climate science. In this episode, Sam sits down with Matthew to discuss enabling edge workers, the future of data science, and the revolution of AI and ML. ------------------- “There's all sorts of fun people using these tools and that's the most fun part of this job. You get to learn so much about so many different applications that are all so different and all so fascinating. You were thinking about all these different tools and technologies and I was talking to someone once, it's like, ‘Oh, it's like you're standing on the shoulders of giants.’ That's not qu…
Dec 7, 2022
OSPOs, Measuring Community Success, and Self Knowledge with Nithya Ruff
This episode features an interview with Nithya Ruff, Head of Open Source Program Office at Amazon. At Amazon, she drives open source culture and coordination and engagement with external communities. Prior to Amazon, Nithya spearheaded and grew Open Source Program Offices (OSPOs) for Comcast and Western Digital. She has also served as the Director-At-Large on the Linux Foundation Board since 2016, where she works to advance the mission of building sustainable ecosystems that are built on open collaboration. In this episode, Sam and Nithya discuss OSPOs, how to measure success, and the evolution of the data ecosystem. ------------------- “I think if we look at what matters to customers, which is innovation, trust, and being a force for change with open source, then we can really deliver on the metrics that the company cares about.” – Nithya Ruff ------------------- Episode Timestamps: (04:02): What open source data means to Nithya (06:29): What interested Nithya about open…
Nov 23, 2022
IoT Databases, Digital Twins, and Real Holodecks with Jonathan Beri
This episode features an interview with Jonathan Beri, Founder & CEO of Golioth, a commercial IoT development platform built for scale. Previously, Jonathan was a Product Manager at Particle, Google/Nest, Magneto, and Myspace where he spent his time building IoT solutions. In this episode, Sam sits down with Jonathan to discuss the concept of digital twins, the future of IoT databases, and how to build a real holodeck. ------------------- “I think about IoT when I started at Nest, we had some of the best engineers I've ever worked with. Starting from first principles, defining networking protocols, and introducing new specifications that became parts of the fabric of the internet. And fast forward 10 years later, a lot of that exists now as building blocks. Someone who's not a PhD with a lifetime achievement award from the IETF can go actually design systems that are highly productive, integrated, and enabling. And that's where I get excited. And the through line I think is ena…
Nov 9, 2022
Healthcare Infrastructure, ALS Research and Reliable Data with Indu Navar
This episode features an interview with Indu Navar, CEO and Founder of EverythingALS, a patient-driven non-profit, bringing technological innovations and data science to support efforts from care to cure, for people with ALS. Indu’s impressive career includes being an original member of the WebMD engineering team, where she was instrumental in using emerging technologies to achieve application scalability and performance. In this episode, Sam sits down with Indu to discuss healthcare infrastructure applications, her strategies for providing reliable patient data, and the future of ALS research. ------------------- “We said, ‘Okay, we're going to make this a citizen-driven research.’ That means patients are going to come and enroll because it's their project and it's patient-driven. So, it's a patient-driven, open innovation. So, once you do open patient-driven, open innovation, now we are the custodians of the data. Patients own the data, so all the data is shared with the p…
Nov 2, 2022
Shifting Left on Data with DeVaris Brown, Tomer Shiran, and Erica Brescia
This bonus episode features conversations from season 3 of the Open||Source||Data podcast. In this episode, you’ll hear from DeVaris Brown, CEO & Co-founder of Meroxa; Tomer Shiran, Founder & CPO of Dremio; and Erica Brescia, Managing Director at Redpoint Ventures. Sam sat down with each guest to discuss how they’re making data more programmable by shifting left. You can listen to the full episodes from DeVaris Brown, Tomer Shiran, and Erica Brescia by clicking the links below. ------------------- Episode Timestamps: (00:12): DeVaris Brown (00:42): Tomer Shiran (01:32): Erica Brescia ------------------- Links: Listen to DeVaris’ episode Listen to Tomer’s episode Listen to Erica’s episode
Oct 26, 2022
Serial Entrepreneurship, Metadata Capture Systems, and Osquery with Tony Gauda
This episode features an interview with Tony Gauda, Head of Customer Engineering at Fleet Device Management, an open core company powered by Osquery. Tony is a serial entrepreneur and inventor with a profound history in fraud, security, and SaaS business. He holds several issued patents and his companies have raised over $40 million in venture funding. Tony is also the founder of ThinAir, a Y Combinator-backed SaaS service that tackles the insider threat problem for enterprises and government agencies. In this episode, Sam and Tony discuss calculating data usage at scale, the creativity of attackers, and how to evolve as threats increase. ------------------- “The great thing about Osquery is that since it is a sensor-based system that is queryable, it literally gives you the ability to discover new indicators of compromise and then use those when doing security investigations. And Osquery allows you to create these extremely interesting queries that would find things that you woul…
Oct 12, 2022
Code Intelligence, GraphQL, and Closing the Remediation Gap with Beyang Liu
This episode features an interview with Beyang Liu, CTO and Co-founder of Sourcegraph, a code intelligence platform. Prior to Sourcegraph, Beyang was a software engineer at Palantir Technologies, where he developed new data analysis software on a customer-facing team working with Fortune 500 companies. Beyang studied Computer Science at Stanford, where he published research in probabilistic graphical models and computer vision at the Stanford AI Lab. In this episode, Sam sits down with Beyang to discuss the power of intelligence and visualization, GraphQL versus REST API, and how Sourcegraph is drawing inspiration from Google. ------------------- “When I think about the future of Sourcegraph, it's really the future of this global human knowledge base that we're constructing. Similar to the worldwide web, the internet, where that was an amazing thing that came along. We're starting to see something like that emerge in the world of code. The open source ecosystem is this amazing, de…
Sep 28, 2022
Stream Processing, Observability, and the User Experience with Eric Sammer
This episode features an interview with Eric Sammer, CEO of Decodable. Eric has been in the tech industry for over 20 years, holding various roles as an early Cloudera employee. He also was the co-founder and CTO of Rocana, which was acquired by Splunk in 2017. During his time at Splunk, Eric served as the VP and Senior Distinguished Engineer responsible for cloud platform services. In this episode, Sam and Eric discuss the gap between operating infrastructure and the analytical world, stream processing innovations, and why it’s important to work with people who are smarter than you. ------------------- "The thing about Decodable was just like let's connect systems, let's process the data between them. Apache Flink is the right engine and SQL is the language for programming the engine. It doesn't need to be any more complicated. The trick is getting it right, so that people can think about that part of the data infrastructure, the way they think about the network. They don't quest…
Jul 20, 2022
Season 3 Compressed Edition with Sam and Audra
Join Open||Source||Data executive producer Audra Montenegro as she and Sam discuss his learnings and takeaways from this season and what the future of open source data looks like. ------------------- “There's such an open conversation about, ‘Yeah, open source,’ we usually think about open source software. How can we cross apply more of what we think about in software in general into data, and then what is it that's totally new about this domain? So, the answers cluster into three groups. It's either about the source of the data itself is open, meaning this is government data or data that's been made public and it's openly accessible. Or it could be that open source data is how the data is actually produced. Is it using open source tooling? Is it on an open source architecture? And finally, how do you trust that open source data? If it's just a whole bunch of data but it hasn't been labeled, if it hasn't been managed and produced, turned into a product. How do you understand it…
Jul 6, 2022
Accelerating Computation, Machine Learning, and Data Mesh with Sophie Watson
This episode features an interview with Sophie Watson, Technical Product Marketing Manager at NVIDIA. Previously, Sophie served as a software engineer and principal data scientist at Red Hat, where she used machine learning to solve business problems in the hybrid cloud. Sophie has a PhD in Bayesian statistics and frequently speaks about machine learning workflows on Kubernetes, recommendation engines, and machine learning for search. In this episode, Sam and Sophie discuss Principal Component Analysis, computational acceleration, and MLOps. ------------------- “We all start when we get hold of a data set by visualizing it to try to understand it. So that usually for me involves starting with a simple technique, something like PCA, Principal Component Analysis. It's been around since the eighties, probably longer, maybe the sixties. Don't quote me on that. With Principal Component Analysis, we can map our high dimensional data down to a smaller number of dimensions. Let's map it dow…
Jun 29, 2022
Democratization and Cognition with Margot Gerritsen, Rachel Chalmers, and Patricia Boswell
This bonus episode features conversations from season 1 of the Open||Source||Data podcast. In this episode, you’ll hear from Margot Gerritsen, Stanford Professor and Co-Founder/Director of WiDS; Rachel Chalmers, Partner at Alchemist Accelerator; and Patricia Boswell, Staff Technical Writer at Google. Sam sat down with each guest to discuss cognition and democratization in data. You can listen to the full episodes from Margot Gerritsen, Rachel Chalmers, and Patricia Boswell by clicking the links below. ------------------- Episode Timestamps: (00:18): Margot Gerritsen (02:07): Rachel Chalmers (03:46): Patricia Boswell ------------------- Links: Listen to Margot’s episode Listen to Rachel’s episode Listen to Patricia's episode
Jun 22, 2022
Vector Search, the AI Stack and more with Bob van Luijt
This episode features an interview with Bob van Luijt, CEO and Co-Founder of SeMI Technologies and co-creator of Weaviate, an open source vector search engine. At just 15 years of age, Bob started his own software company in the Netherlands. He went on to study music at ArtEZ University of the Arts and Berklee College of Music, and completed the Harvard Business School Program of Management Excellence. Bob is also a TEDx speaker, discussing the relationship between software and language. In this episode, Sam sits down with Bob to break down vector search, the AI-first ecosystem, and how music and software relate to one another. ------------------- “I dare to argue that from the two big waves in database technology that we've seen, so first, in the seventies and eighties with SQL. And then the whole NoSQL wave that we have seen and the big winners that are in there, I dare to argue that we see a third wave coming up. And the third wave, I simply call it AI-first. And what I mean w…
Jun 8, 2022
Open Source Innovation, The GPL for Data, and The Data In to Data Out Ratio with Larry Augustin
This episode features an interview with Larry Augustin, angel investor and advisor to early-stage technology companies. Larry previously served as the Vice President for Applications at AWS, where he was responsible for application services like Pinpoint, Chime, and WorkSpaces. Before joining AWS, Larry was the CEO of SugarCRM, an open source CRM vendor. He also was the founder and CEO of VA Linux, where he launched SourceForge. Among the group who coined the term “open source”, Larry has sat on the boards of several open source and Linux organizations. In this episode, Sam and Larry discuss who owns the rights to data, the data in to data out ratio, and why Larry is an open source titan. ------------------- "People are willing to give up so much of their personal information because they get an awful lot back. And privacy experts come along and say, ‘Well, you're taking all this personal information’. But then most people look at that and say, ‘But I get a lot of value b…
Jun 1, 2022
Data Observability with Barr Moses, Einat Orr, and Shinji Kim
This bonus episode features conversations from season 2 of the Open||Source||Data podcast. In this episode, you’ll hear from Barr Moses, Co-founder and CEO at Monte Carlo; Einat Orr, Co-founder and CEO at Treeverse; and Shinji Kim, Founder and CEO at Select Star. Sam sat down with each guest to discuss data observability. You can listen to the full episodes from Barr Moses, Einat Orr, and Shinji Kim by clicking the links below. ------------------- Episode Timestamps: (00:35): Barr Moses (01:21): Einat Orr (02:07): Shinji Kim ------------------- Links: Listen to Barr’s episode Listen to Einat’s episode Listen to Shinji’s episode
May 25, 2022
Apache Pinot and Real-Time Analytics with Neha Pawar
This episode features an interview with Neha Pawar, a Founding Engineer at StarTree. StarTree is a software development company that focuses on democratizing data for all users by providing real-time, user-facing analytics. Prior to her time at StarTree, Neha was a Senior Software Engineer on LinkedIn’s Data Analytics team where she spent five years working on Apache Pinot. Neha has provided countless contributions to Pinot over the years, focusing on real-time streaming integrations, ingestion, and storage. In this episode, Sam sits down with Neha to discuss Apache Pinot’s impact on the data community and how LinkedIn popularized real-time analytics. ------------------- "Many people do think that a batch is good enough, real-time infra is expensive anyway. And what difference is it going to make if the data shown in this application is a day ago or an hour ago, and it's not real-time to the nearest second? And while that is true, in some cases, but in many other cases, not hav…
May 11, 2022
Real-Time Data, Enabling Developers, and User Experience with DeVaris Brown
This episode features an interview with DeVaris Brown, CEO and Co-Founder of Meroxa. Meroxa was founded in 2020 and enables teams of any size and any expertise to build real-time data pipelines in minutes. Previously, DeVaris was a product leader at Twitter, Heroku, and Zendesk. Sam and DeVaris even crossed paths at Microsoft in the aughts. In this episode, Sam and DeVaris discuss enabling developers, real-time data, and providing the ultimate user experience. ------------------- "From the beginning we wanted to be system engineer first, software engineer second, and we were happy to stand on the shoulders of giants that built foundational pieces of technology to help us get our job done more efficiently. [...] The one thing I love about my co-founder and he's super humble, Ali, we did billions of events a minute at Heroku on the data platform for tens of thousands of Kafka clusters for thousands of customers. But the team was six and he was a lead on that team. And we had five nin…
May 4, 2022
Data Meshes, Fabrics, and Discovery with Zhamak Dehghani, David Thomas, and Shirshanka Das
This bonus episode features conversations from seasons 1 and 2 of the Open||Source||Data podcast. In this episode, you’ll hear from Zhamak Dehghani, Director of Emerging Technologies at ThoughtWorks North America; David Thomas, Principal at Deloitte; and Shirshanka Das, Founder of LinkedIn DataHub and Acryl Data. Sam sat down with each guest to discuss data meshes, fabrics, and discovery. You can listen to the full episodes from Zhamak Dehghani, David Thomas, and Shirshanka Das by clicking the links below. ------------------- Episode Timestamps: (00:36): Zhamak Dehghani (01:41): David Thomas (02:43): Shirshanka Das ------------------- Links: Listen to Zhamak’s episode Listen to David’s episode Listen to Shirshanka’s episode
Apr 27, 2022
Investing in Communities, Differentiating, and Trusting Your Gut with Erica Brescia
This episode features an interview with Erica Brescia, Managing Director of Redpoint Ventures. At Redpoint, Erica focuses her investing on infrastructure, DevOps, and security. Erica has over 15 years of experience in the open source community and currently serves on the board of directors of the Linux Foundation. Prior to joining Redpoint, Erica was also an angel investor and advisor to companies such as Netlify, Coda, and Xata. In this episode, Sam and Erica discuss the evolution of open source data, what’s changed for practitioners, and why you should always listen to your gut. ------------------- “I think there is just so much good motivation to make the world a better place, especially during my time at GitHub. When you can see what kinds of opportunity open source can bring to people in developing countries, that’s really exciting. You see people whose lives and livelihoods have literally been changed because they were able to participate in a global open source project…
Apr 20, 2022
Data on Kubernetes with Kelsey Hightower, Lachlan Evenson, and Patrick McFadin
This bonus episode features conversations from season 1 of the Open||Source||Data podcast. In this episode, you’ll hear from Kelsey Hightower, Principal Engineer at Google Cloud; Lachlan Evenson, Principal Program Manager at Microsoft Azure; and Patrick McFadin, Head of Developer Relations at DataStax. Sam sat down with each guest to discuss Data on Kubernetes and how they’re making progress on a stateless infrastructure. You can listen to the full episodes from Kelsey Hightower, Lachlan Evenson, and Patrick McFadin by clicking the links below. ------------------- Episode Timestamps: (00:39): Kelsey Hightower (01:33): Lachlan Evenson (02:06): Patrick McFadin ------------------- Links: Listen to Kelsey’s episode Listen to Lachlan’s episode Listen to Patrick’s episode
Apr 13, 2022
Deep Fakes, Responsible Data Science, and Trust with David Danks
This episode features an interview with David Danks, Professor of Data Science and Philosophy and affiliate faculty in Computer Science and Engineering at University of California, San Diego. Prior to UCSD, David was the L.L. Thurstone Professor of Philosophy and Psychology at Carnegie Mellon University. David’s research interests are at the intersection of philosophy, cognitive science, and machine learning. He has also examined the ethics surrounding artificial intelligence in the fields of healthcare, privacy, and security. In this episode, David and Sam dive into responsible data science, deep fakes, and whether data is to blame for the lack of trust among consumers. ------------------- "There's a, almost, glorification of the technology that's happening at the moment. And the technology is obviously crucial, but what I really care about in a lot of ways is what are the human beings who build and use that technology doing with it? Because the exact same ones and zeros, the exact s…
Mar 30, 2022
Cloud Innovation, Analytics, and Data Transformation with Monica Kumar
This episode features an interview with Monica Kumar, Senior Vice President of Marketing and Cloud Go-to-Market at Nutanix. Nutanix is a data platform that is redefining workloads in cloud environments. Prior to Nutanix, Monica spent two decades at Oracle where she launched several market solutions. Monica is passionate about positioning and supporting women in leadership roles. She is a founding limited partner of Neythri Futures Fund, a venture fund dedicated to bringing South Asian women into the investment community. Monica also serves on the Board of Directors at Watermark, an organization dedicated to women in leadership. In this episode, Monica and Sam discuss the evolving world of marketing analytics, tech’s biggest innovation to date, and how the data industry can change for the better. ------------------- “I believe that cloud has now become more of an operating model. It started out in the public cloud, but now organizations have adopted the same philosophy of self-s…
Mar 16, 2022
Data Lakehouses, Interoperability, and Accessibility with Tomer Shiran
This episode features an interview with Tomer Shiran, Founder and Chief Product Officer at Dremio. Dremio is a high-performance SQL lakehouse platform that helps companies get more from their data in the fastest way possible. Prior to Dremio, Tomer served as VP of Product at MapR and also held product management and engineering roles at Microsoft and IBM Research. He also has a master’s degree from Carnegie Mellon University as well as a bachelor’s from Technion - Israel Institute of Technology. In this episode, Tomer and Sam dive into the economics of storing data, how to build an open architecture, and what exactly a data lakehouse is. ------------------- “I think in the world of data lakes and lakehouses, the model has shifted upside down. Now, instead of bringing the data into the engines, you’re actually bringing the engines to the data. So you have this open data tier built on open source technology. The data is represented in open source formats and stored in the comp…
Mar 2, 2022
Interoperability, Governance, and Divergent Teams with Prukalpa Sankar
This episode features an interview with Prukalpa Sankar, Co-Founder of Atlan. Atlan is a venture-backed startup building a modern data workspace. Prukalpa also co-founded SocialCops, a data for good company behind landmark projects such as India’s National Data Platform. Prukalpa is a recognized industry leader, landing on the Forbes 30 Under 30 list and Fortune’s 40 Under 40. In this episode, Prukalpa and Sam discuss how diversity is a data team’s biggest strength, why governance isn’t always a bad thing, and what they hope the modern data stack will look like in 5 years. ------------------- “Diversity is our biggest strength but our biggest weakness, because it's really hard to make that team collaborate. Because most of the teams in the world are very uniform. So when every single person in the room is a subject matter expert on something, nobody else actually can have oversight on each other's work because they've never done it before. Then how do you create true trust…
Feb 16, 2022
Trust, Automation, and Trade-Offs with Joseph Jacks
This episode features an interview with Joseph Jacks, Founder and General Partner of OSS Capital. OSS Capital is the first and only COSS (Commercial Open Source Software) company investor that focuses on supporting early-stage COSS founders. Joseph, also known as JJ, has worked at Mesosphere, TIBCO Software, and Talend in various sales, engineering, and strategy roles. In this episode, JJ and Sam weigh the trade-offs of open and closed core companies and discuss how each can go public. JJ also dives into the misconception that trust equates to privacy within tech. Guest Quote [25:14]: “There’s a societal recognition that if you use technology to automate some part of your life and you use that regularly, you have to be able to trust it. And I think gradually, consumers are becoming more and more aware that one of the most effective ways of checking the trust box is answering the question, ‘Is the technology I'm using open source at the core, yes or no?’ And if the answer is no,…
Feb 2, 2022
Open Source, Adoptability, and Name Changes with Martin Traverso
This episode features an interview with Martin Traverso, CTO at Starburst Data and Co-founder of Trino, a lightning fast distributed SQL query engine. Martin was previously a software engineer at Facebook where he led the Presto (now Trino) development team. Trino has gained worldwide adoption from companies like Netflix, Amazon, and LinkedIn. In this episode, Martin sits down with Sam to discuss the barriers, advantages, and complications of going open-source. Episode Notes -Guest Quote [33:55]: “What makes Trino powerful is the ecosystem around it. You have integrations with all sorts of data sources and that’s part of the power and magic of Trino. You can pull data from all these data sources using a single interface. On the other end is the integrations with all the tools that everyone uses. Once you put all those pieces together, that’s what gives Trino the power.” -Time Stamps [8:38]: How Martin solved Facebook’s analytics problem [13:00]: How the team adapted to…
Oct 29, 2021
Season Two Finale and Recap with Open||Source||Data Producer Audra Montenegro
Join Open||Source||Data producer Audra Montenegro as she and Sam cover highlights and takeaways from the ten episodes of season two. And get a sneak peek of what's in store for season three! See omnystudio.com/listener for privacy information.
Oct 14, 2021
Embeddings, Feature stores, and MLOps with Simba Khadder
Join Featureform CEO Simba Khadder as he talks with Sam about how versioning, immutability, and sharing will accelerate ML workflows. Tune in for state-of-the-art collaboration in data teams and the power of focusing on your north star.
Sep 30, 2021
Abundance, Metadata, and Automation with Mark Grover
How can we make data 10X more accessible for data-driven people within data-driven companies? Tune in to Mark and Sam discussing probabilistic product management, and the emerging metadata ecosystem.
Sep 16, 2021
Metadata, Communities, and Architecture with Shirshanka Das
How can we evolve an expanding ecosystem of data technologies while making sense of the whole? Tune in as Shirshanka Das, founder of LinkedIn DataHub and Acryl Data, and Sam discuss keeping metadata at the center and specialization at the edge to sustainably scale data governance.
Sep 2, 2021
Data Management Pain Points and Future Solutions for Data Discovery
Data discovery is one of the hardest problems to solve in data management and comes up as a major pain point in most data mesh discussions. Tune in to this all-star expert panel, recorded in collaboration with the Data Mesh community and hosted by a previous Open||Source||Data podcast guest, Paco Nathan of Derwen.ai. Paco engages panelists Shinji Kim (Select Star), Sophie Watson (Red Hat), Mark Grover (Stemma), and Shirshanka Das (Acryl Data) in a 60-minute discussion covering not only Data Mesh but also the other data strategies and process needs that will shape the future of data discovery.
Aug 19, 2021
ModelOps, ML Monitoring, and Busy Humans with Elena Samuylova
It’s 2 AM - do you know what your models are doing? Listen to Elena Samuylova as she talks to us about how to bridge the critical gaps between data scientists, engineers, and business managers using tooling and empathy.
Aug 5, 2021
Cloud-Native, Open-Source, and Collaborative with Eric Brewer and Melody Meckfessel
Google Fellow & VP of Infrastructure Eric Brewer, Observable CEO Melody Meckfessel, and DataStax Chief Strategy Officer Sam Ramji explore the state of the art, the near future, and grand challenges for the next decade in cloud-native data.
Jul 22, 2021
MLOps, AIOps, and Data Startups with Jocelyn Goldfein
Dealing with data hyperabundance, solving economic problems for businesses, and changing lives for the better. Tune in as Jocelyn Goldfein, Managing Director at Zetta Venture Partners, and Sam discuss engineering leadership, organizational graph structures, and the productization of AI.
Jul 8, 2021
Git-Like Branch and Merge for Data with Einat Orr
What if you could version object storage just like code? Tune in to Einat Orr as she explains how CI/CD and data lineage are being transformed through versioning data, enabling sandboxes, safe rollbacks, and coherent history.
Jun 24, 2021
Data Discoverability, Products, and User Diversity with Shinji Kim
Learn how an accelerating abundance of data can be harnessed through telemetry. Tune in while Shinji Kim and Sam explore opening data to more users, PageRank for tables, and pragmatic use of data lineage to find value.
Jun 10, 2021
Data Observability, Customer-Led Growth, and Confidence with Barr Moses
Barr Moses talks with Sam about bringing DevOps into Data Engineering, building a data startup, and letting joy guide your way to creating impact. Learn how being data-driven depends on systems of people and trust.
Apr 22, 2021
Open Source Data & Its Role in the Future of Technology: Season 1 Recap
Wrapping up Season 1, Open||Source||Data producer Audra Montenegro Carter joins Sam Ramji in a conversation about the inspiration and behind-the-scenes production of the podcast, touching upon the top takeaways and lessons learned with Season 1 guests from AWS, Microsoft, ThoughtWorks, Deloitte, Observable, and many more.
Mar 25, 2021
Observable Co-Founder and CEO Melody Meckfessel joins Sam in a conversation on how millions of developers are changing how we experience data. Listen in as Melody explains the importance of data literacy and the shift in data collaboration.
Mar 11, 2021
DataOps, MLOps, and Self Service: How Data Teams are Changing
Join Data Institute's Managing Director, Jesse Anderson, to learn how data teams are changing in response to overwhelming demand for data products. Tune in as he and Sam discuss bringing software engineering into the domain of data - and why he wrote Data Teams.
Feb 25, 2021
Fabrics, Meshes, and Graphs with Deloitte Principal Dave Thomas
Join Dave and Sam as they discuss data sets evolving from finite to infinite, and finding the needle in the haystack with math. Listen to Dave talk about cutting-edge data problems and the essential need for curious people.
Feb 11, 2021
Metadata, Graphs, and Responsible AI with Paco Nathan
Data Science player and coach, Author, and Venture Amplifier Paco Nathan talks with Sam Ramji about Hybrid AI, mathematical reversibility, and using AI to solve knowledge problems that the exponential growth of data will create for years to come. Join these two as they discuss how you can bring multiple data disciplines together using empathy and math.
Jan 28, 2021
Data Analytics: Hard Skills vs Soft Skills and the Gift of Thinking Different
Analytics manager and Women in Data podcast producer and host Karen Jean-Francois walks us through the differences between Data Science and Analytics. Join her and Sam as they discuss valuable skills you’ll need when transitioning to a career in Data Analytics. Hear Karen’s perspective on the benefits of thinking differently and having a mentor to guide you through transitions.
Jan 14, 2021
Global Connectivity: Share and Democratize Through Open Data
Co-Founder and Director of WiDS and Stanford Professor Margot Gerritsen joins Sam Ramji in a conversation about how a data community provides global connectivity, and how learning is all about seeking discomfort with uncertainty and ambiguity. Learn how data is the new gold, but rather than sitting on the mine, share the wealth through a career in Data Science.
Dec 23, 2020
From DBA to SRE: 2021 Predictions for Data on Kubernetes
With data comes DBAs and with Kubernetes comes SREs. Listen in as Patrick McFadin and Sam discuss what’s in store in 2021 for Data on Kubernetes, how experienced DBA roles can evolve into very effective SREs, and why today is THE day to learn Kubernetes.
Dec 10, 2020
Open Source’s Impact in Academia with Open@RIT's Stephen Jacobs
From a 10-page white paper to creating one of the first University OSPOs - Stephen Jacobs will take us through the 12 years of work it took him to launch a program like Open@RIT. Join Stephen and Sam as they discuss the impact an OSPO has on students' futures, as well as a University's surrounding communities.
Nov 25, 2020
Data Meshes: Big Data Architecture Becoming Distributed, Declarative and Domain Oriented
Beyond The Data Lake, a 2017 paper by Zhamak Dehghani, Director of Emerging Technologies at ThoughtWorks, was a guiding light for Sam Ramji at another point in his career. Listen to how a Data Mesh allows composition of multi-model data across an organization and beyond.
Nov 12, 2020
Data on Kubernetes: Platform, Resource, and Ecosystem tooling with Microsoft Azure’s Lachlan Evenson
How do we create free and open data sets that are trustworthy? Microsoft Azure’s Principal Program Manager Lachlan Evenson and Sam Ramji discuss standards for accessing data, and the magic that can happen with data on Kubernetes.
Oct 29, 2020
Data, Kubernetes, and Our Best Selves with Google’s Kelsey Hightower
Inspire, collaborate, and solve together. Google Cloud Principal Engineer Kelsey Hightower joins Sam Ramji to discuss the future of Data and Kubernetes, and what it means to participate in a welcoming developer community while igniting positive growth.
Oct 15, 2020
Culture and Cognition in DevOps with Alchemist Accelerator’s Rachel Chalmers
Sam invites Rachel Chalmers, an investor, advisor, and technology industry analyst for over 20 years, for a candid conversation about the DevOps culture of shared purpose and blamelessness. Sam and Rachel explore how process, trust, and care for each other create more innovation and give us the opportunity to change and grow as human beings.
Oct 1, 2020
Open Source Sustainability with AWS Exec + Tech Columnist Matt Asay
Matt Asay shares his journey through open source and behind-the-scenes stories on what gives these communities their strength: their people and their voices.
Sep 15, 2020
Storytelling in Product Development with Google’s Patricia Boswell
Behind every great product is a great story. Sam invites Google Staff Technical Writer Patricia Boswell to discuss the role of technical writing in software and the importance of using narrative as a North Star when designing a product.
Sep 3, 2020
Introducing Open||Source||Data with Sam Ramji
What can we learn from cloud-native development and how can we share that with developers, engineers, product owners, and product managers of the new world? Join DataStax Chief Strategy Officer and 25-year open source veteran, Sam Ramji, as he interviews innovators who are shaping the future of open source data, open source software, data on Kubernetes, data in DevOps, data in AI, and much more.