Synopsis
Kirill Eremenko is a Data Science coach and lifestyle entrepreneur. The goal of the Super Data Science podcast is to bring you the most inspiring Data Scientists and Analysts from around the World to help you build your successful career in Data Science. Data is growing exponentially and so are salaries of those who work in analytics. This podcast can help you learn how to skyrocket your analytics career. Big Data, visualization, predictive modeling, forecasting, analysis, business processes, statistics, R, Python, SQL programming, tableau, machine learning, hadoop, databases, data science MBAs, and all the analytcis tools and skills that will help you better understand how to crush it in Data Science.
Episodes
-
817: The Positron IDE, Tidy NLP and MLOps with Dr. Julia Silge
10/09/2024 Duration: 01h36minDr. Julia Silge, Engineering Manager at Posit, introduces the brand-new Positron IDE, perfect for exploratory data analysis and visualization. She also lays out her top picks for LLMs that boost coding efficiency and discusses when traditional NLP methods might be the smarter choice over LLMs. Plus, Julia highlights some must-know open-source libraries that make managing MLOps easier than ever. Tune in for insights that every data scientist, ML engineer, and developer will find useful. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Overview of Posit and Positron IDE [05:20] • How the needs of a data scientist differ from those of a software developer [10:54] • How to contribute to the open-source Positron [19:50] • MLOps and Vetiver: Tools for deploying and maintaining ML mode
-
816: Explaining AGI to a 94-Year-Old
06/09/2024 Duration: 19minJon Krohn takes on a listener's challenge to explain his work in data science to his 94-year-old grandmother, Annie. This heartwarming conversation covers what data is, the role of a data scientist, and breaks down artificial intelligence (AI) and artificial general intelligence (AGI) in simple terms. The episode provides a fresh take on how to communicate complex topics to a lay audience, offering both clarity and insight. Additional materials: www.superdatascience.com/816 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
-
815: Polars: Faster DataFrame Ops, with Marco Gorelli
03/09/2024 Duration: 01h27minPolars, Python, Narwhals, Rust, and Pandas: Marco Gorelli talks to Jon Krohn about the many ways to use the newest data libraries available, the joys of open-source development, and the best method to win prizes in forecasting competitions. This episode is brought to you by AWS Inferentia and AWS Trainium, by Babbel, the science-backed language-learning platform, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • When to use Polars vs Pandas [08:26] • How Polars optimizes string operations and data processing [20:08] • Where Narwhals outstrips Polars and Pandas [48:37] • The benefits of using Altair [55:21] • Addressing the lack of women in data science [1:09:58] • How to win a forecasting competition [1:16:58] Additional materials: www.superdatascience.com/815
-
814: Summer Reflections
30/08/2024 Duration: 04minAs summer winds down, this episode shifts focus from the usual tech discussions to something more personal: reflecting on the importance of balancing work with life’s simple pleasures. While the world of data science and AI continues to evolve rapidly, it's essential to remember that true success isn't just about professional milestones. It’s also about cherishing the moments that make life meaningful. Tune in for a brief but impactful reflection on how to redefine success to include not just achievements, but also the everyday joys that often go unnoticed. Additional materials: www.superdatascience.com/814 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
-
813: Solving Business Problems Optimally with Data, with Jerry Yurchisin
27/08/2024 Duration: 01h43minJerry Yurchisin from Gurobi joins Jon Krohn to break down mathematical optimization, showing why it often outshines machine learning for real-world challenges. Find out how innovations like NVIDIA’s latest CPUs are speeding up solutions to problems like the Traveling Salesman in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • The Burrito Optimization Game and mathematical optimization use cases [03:36] • Key differences between machine learning and mathematical optimization [05:45] • How mathematical optimization is ideal for real-world constraints [13:50] • Gurobi’s APIs and the ease of integrating them [21:33] • How LLMs like GPT-4 can help with optimization problems [39:39] • Why integer variables are so complex to model [01:02:37] • NP-hard problems [01:11:01] • The history of optimization and its early applications [01:26:23] Additional materials: www.superdatascience.com/813
-
812: The AI Scientist: Towards Fully Automated, Open-Ended Scientific Discovery
23/08/2024 Duration: 11minIn this episode of Five-Minute Friday, Jon Krohn investigates published findings from the startup Sakana AI and its paper’s co-authors from the University of Oxford, the University of British Columbia and the Vector Institute in Toronto. These authors explore the potential of The AI Scientist, a framework that could change the way we conduct scientific research forever. Additional materials: www.superdatascience.com/812 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
-
811: Scaling Data Science Teams Effectively, with Nick Elprin
20/08/2024 Duration: 01h14minNick Elprin talks to Jon Krohn about how and when to scale a data science team and its workflows to secure a company’s commercial viability. You’ll also hear how to launch your own data science startup and why it’s so important to understand that AI tools are not one-size-fits-all. This episode is brought to you by AWS Inferentia and AWS Trainium. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • How Nick served enterprises with his AI startup, Domino Data Lab [05:36] • About the Navy’s own mine detection models [17:43] • The hype surrounding GenAI [30:35] • How AI platforms integrate with business strategies [39:49] • When it’s time to integrate an AI tool into your business [51:12] • Why Nick started Domino Data Lab [1:03:53] Additional materials: www.superdatascience.com/811
-
810: The Five Levels of Self-Driving Cars
16/08/2024 Duration: 08minSelf-driving cars are here, and Jon Krohn is breaking down the five levels of automation that could change driving forever. From full human control at Level 0 to cars that drive themselves in any condition at Level 5, get the real story on what these levels mean. With firsthand insights from a recent autonomous vehicle experience, this episode cuts through the buzz and tells you what’s coming next. Additional materials: www.superdatascience.com/810 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
-
809: Agentic AI, with Shingai Manjengwa
13/08/2024 Duration: 01h10minAgentic AI is revolutionizing the tech landscape, and Shingai Manjengwa from ChainML is here to tell us why. Discover how AI agents are becoming an integral part of our lives, automating tasks like travel bookings and daily inspiration. Shingai explains the power of multi-agent systems, where AI agents collaborate to solve complex challenges, and highlights how blockchain technology is enhancing AI transparency and trust. Plus, get an inside look at ChainML’s innovative Theoriq protocol and the groundbreaking Council Analytics tool. This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • What A.I. agents are [10:51] • How blockchain technology helps humans trust A.I. agents [18:27] • The Theoriq protocol developed by ChainML [34:05] • How Council Analytics lets you “speak” to their dat
-
808: In Case You Missed It in July 2024
09/08/2024 Duration: 38minAdvice for emerging data scientists, the latest in model merging, and how GenAI can supercharge your creativity: Host Jon Krohn gives us his highlights from a month of interviews, packed with tips from some of the leading names in data science and beyond. Guests include Daliana Liu, Charles Duhigg, Charles Goddard, Rosanne Liu and Andrey Kurenkov. Additional materials: www.superdatascience.com/808 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
-
807: Superintelligence and the Six Singularities, with Dr. Daniel Hulme
06/08/2024 Duration: 01h09minThe singularity could soon be upon us. The PESTLE framework, developed by this episode’s guest Daniel Hulme, expresses not one but six types of singularity that could occur: political, environmental, social, technological, legal and economic. Jon Krohn and Daniel Hulme discuss how each of these singularities could bring good to the world, aligning with human interests and pushing forward progress. They also talk about neuromorphic computing, machine consciousness, and applying AI at work. This episode is brought to you by AWS Inferentia and AWS Trainium, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • About the six singularities [03:43] • How the singularity could improve life on earth [09:01] • The credibility of AI experts [32:51] • How the decentralization of technology could benefit earth [43:14] • How AI might enhance creativity [1:04:33] Addit
-
806: Llama 3.1 405B: The First Open-Source Frontier LLM
02/08/2024 Duration: 18minLlama 3.1 is here, and it’s a game-changer. Meta’s latest AI model, especially the massive 405B variant, finally brings an open-source option to compete with giants like OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet. While Meta didn’t fully open-source everything, the availability of "open weights" is a strategic move to shake up the AI landscape. The model boasts an impressive 128,000-token context window and multilingual support in eight languages. Meta is also focusing on responsible AI development with tools like Llama Guard 3 for content moderation. This release is more than just a tech upgrade—it's about democratizing AI and sparking innovation across industries. How will you leverage Llama 3.1 to make a real impact? Tune into this week’s FMF episode and let’s explore the future with this latest AI development together. Additional materials: www.superdatascience.com/806 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
-
805: How to Be a Supercommunicator, with Charles Duhigg
30/07/2024 Duration: 55minBecome a Supercommunicator! New York Times bestselling author Charles Duhigg, known for The Power of Habit and Smarter Faster Better, gets real about mastering communication in this episode. Discover insights from his latest book, Supercommunicator, where he reveals how to align conversation styles for deeper connections, handle conflicts effectively, and why AI can't replicate the emotional depth of human interactions. This episode is brought to you by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • The inspirations behind Supercommunicator [03:41] • The three types of conversations: Practical, emotional, and social conversations [05:22] • The matching principle: Align communication styles for better connection [10:36] • What is neural entrainment: Achieve a mind meld through synchronized brain activity [13:22] • The series of steps/principles to connect w
-
804: AI x Solar Power = Abundant Energy
26/07/2024 Duration: 13minSolar power now provides 6% of the world's electricity, thanks to rapid growth. Host Jon Krohn discusses the factors driving this rise, the challenges ahead, and how AI and data science are optimizing solar technologies. Tune in for insights on the future of solar power, and don't forget to like, share, and subscribe! Additional materials: www.superdatascience.com/804 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
-
803: How to Thrive in Your (Data Science) Career, with Daliana Liu
23/07/2024 Duration: 01h54minDaliana Liu is a big name in data science teaching, and she has always been generous in sharing everything she knows about getting a job in data science. In this episode, she continues to extend her generosity, helping listeners define their approach to achieving a fulfilling career in data science and tech. This episode is brought to you by AWS Inferentia and AWS Trainium, by Babbel, the science-backed language-learning platform, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Common career challenges for data scientists [34:57] • Advice for people who don’t know where to go in their career [48:05] • How to build resilience and protect against Imposter Syndrome [1:06:23] • Skills that data scientists should develop today [1:39:17] • The future of the data science and AI job market [1:46:55] Additional materials: www.superdatascience.com/803
-
802: In Case You Missed It in June 2024
19/07/2024 Duration: 23minHow to grab investor interest with your AI startup idea, revisiting algorithms, and helping practitioners ensure AI safety with regulatory frameworks and beyond: This month, you missed a whole bunch of great interviews. But don’t worry, Jon Krohn is here to recap all the best bits for you! Additional materials: www.superdatascience.com/802 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
-
801: Merged LLMs Are Smaller And More Capable, with Arcee AI's Mark McQuade and Charles Goddard
16/07/2024 Duration: 01h17minMerged LLMs are the future, and we’re exploring how with Mark McQuade and Charles Goddard from Arcee AI on this episode with Jon Krohn. Learn how to combine multiple LLMs without adding bulk, train more efficiently, and dive into different expert approaches. Discover how smaller models can outperform larger ones and leverage open-source projects for big enterprise wins. This episode is packed with must-know insights for data scientists and ML engineers. Don’t miss out! Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Explanation of Charles' job title: Chief of Frontier Research [03:31] • Model Merging Technology combining multiple LLMs without increasing size [04:43] • Using MergeKit for model merging [14:49] • Evolutionary Model Merging using evolutionary algorithms [22:55] • Commercial applications and success stories [28:10] • Comparison of Mixture of Experts (MoE) vs. Mixture of Agents [37:57] •
-
800: A Transformative Century of Technological Progress, with Annie P.
12/07/2024 Duration: 43minThe SuperDataScience Podcast is celebrating its 800th episode! Host Jon Krohn speaks to his grandmother, Annie, about growing up at a time when so many technologies we take for granted today were yet to be developed. Listen in to hear Annie’s experience of the changes in technology across 94 years and how she and her family fared in 1940s Ukraine with no electricity or running water. Additional materials: www.superdatascience.com/800
-
799: AGI Could Be Near: Dystopian and Utopian Implications, with Dr. Andrey Kurenkov
09/07/2024 Duration: 01h45minNo-code games with GenAI, the creative possibilities of LLMs, and our proximity to AGI: In this episode, Jon Krohn talks to Andrey Kurenkov about what turned him from an AGI skeptic to a positivist. You’ll also hear about his wildly popular podcast “Last Week in AI” and how the NVIDIA-backed startup Astrocade is helping videogame enthusiasts to create their own games through generative AI. A must-listen! This episode is brought to you by AWS Inferentia and AWS Trainium. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • All about The Gradient and Last Week in AI [10:42] • All about Astrocade and Andrey’s role at the startup [24:35] • Balancing UX and creative control at Astrocade [42:00] • The creative possibilities of LLMs [1:04:15] • The rapid emergence of AGI [1:10:31] Additional materials: www.superdatascience.com/799
-
798: Claude 3.5 Sonnet: Frontier Capabilities & Slick New "Artifacts" UI
05/07/2024 Duration: 15minClaude 3.5 Sonnet, Anthropic’s newest model, is making waves in the AI community. This mid-size model outshines the larger Claude 3 Opus in tasks like code generation, content creation, and document summarization, and it’s twice as fast. In this episode of The Super Data Science Podcast, Jon Krohn discusses its top-notch performance across benchmarks like MMLU, GPQA, and HumanEval, along with its improved machine vision capabilities. Plus, learn about the new Artifacts UI feature, which makes managing generated content easier by displaying outputs side-by-side with inputs. Tune in to find out why Claude 3.5 Sonnet is setting new standards in AI. Additional materials: www.superdatascience.com/798 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.