Synopsis
Kirill Eremenko is a Data Science coach and lifestyle entrepreneur. The goal of the Super Data Science podcast is to bring you the most inspiring Data Scientists and Analysts from around the world to help you build a successful career in Data Science. Data is growing exponentially and so are the salaries of those who work in analytics. This podcast can help you learn how to skyrocket your analytics career. Topics include Big Data, visualization, predictive modeling, forecasting, analysis, business processes, statistics, R, Python, SQL programming, Tableau, machine learning, Hadoop, databases, data science MBAs, and all the analytics tools and skills that will help you better understand how to crush it in Data Science.
Episodes
-
561: Engineering Data APIs
29/03/2022 Duration: 53min
In this episode, Ribbon Health CTO Nate Fox joins us to discuss the ins and outs of APIs. Tune in to hear him share how he and his team build out APIs from scratch, how they ensure the uptime and reliability of APIs, and how they leverage machine learning to improve the quality of healthcare delivery and maximize their social impact. In this episode you will learn: What are APIs? [13:20] How Ribbon Health’s data API leverages ML models to improve the quality of healthcare delivery [16:08] How to design a data API from scratch [20:00] How to ensure the uptime and reliability of APIs [25:28] How Ribbon uses knowledge graphs, manually labeled data samples, and an XGBoost model with hundreds of inputs to assign a confidence score [27:14] Nate’s favorite tool for easily scaling up the impact of data science [37:40] What is Nate’s day-to-day like? [34:34] The qualities Nate looks for when hiring data scientists [39:50] How scientists and engineers can make a big social impact in health technology [42:50]
-
560: Daily Habit #7: Read Two Pages
25/03/2022 Duration: 04min
In this episode, Jon shares his daily habit of reading two pages and explains how it has transformed his productivity. Additional materials: www.superdatascience.com/560
-
559: GPT-3 for Natural Language Processing
22/03/2022 Duration: 01h28min
Natural language processing expert and PhD student Melanie Subbiah sits down with Jon Krohn to discuss GPT-3, its strengths and weaknesses, and the future of NLP. In this episode you will learn: What is GPT-3? [6:24] The strengths and weaknesses of GPT-3 [14:38] What is autoregression? [18:03] GPT-3's new fine-tuning abilities [20:02] Bias issues with GPT-3 [22:47] The future of natural language processing models [27:54] How Melanie ended up working at OpenAI [38:13] Melanie’s self-study process [42:19] Melanie's work on the OpenAI API [45:45] How to address the climate change and bias issues that cloud discussions of large natural language models [49:40] Why Melanie chose to do a PhD at Columbia University [1:01:17] The machine learning tools Melanie’s most excited about [1:08:09] Additional materials: www.superdatascience.com/559
-
558: Jon's Answers to Questions on Machine Learning
18/03/2022 Duration: 06min
In this episode, Jon shares the key topics he recently discussed with the Open Data Science Conference. From the approach behind his extensive machine learning and deep learning content library to revealing the key tools and software he uses daily, get to know Jon and his process a little better. Additional materials: www.superdatascience.com/558
-
557: Effective Pandas
15/03/2022 Duration: 01h30min
Pandas expert Matt Harrison sits down with Jon Krohn to discuss tips, tricks, and best practices for learning and mastering Pandas. In this episode you will learn: Pros and cons of self-publishing and working with a publisher [5:05] Matt's six tips for using Pandas [17:13] The best way for corporate teams to level up their skills [40:04] How to learn anything effectively [47:14] Matt’s tricks for staying motivated [50:00] Matt’s recommendations for using Git and the Unix command line [1:00:14] Matt’s recommended software libraries for working with tabular data [1:19:45] Additional materials: www.superdatascience.com/557
-
556: Jon's Machine Learning Courses
11/03/2022 Duration: 07min
Discover Jon’s extensive library of machine learning content and learn why Jon's Machine Learning House forms the knowledge structure of an outstanding data scientist or ML engineer. Additional materials: www.superdatascience.com/556
-
555: Sports Analytics and 66 Days of Data with Ken Jee
08/03/2022 Duration: 01h13min
Data scientist and YouTuber Ken Jee joins Jon Krohn for a deep dive into the world of sports analytics and takes us behind the making of his large online data science community. In this episode you will learn: The inspiration behind Ken’s YouTube videos [18:03] Ken’s four steps for getting started in data science [24:18] How sports analytics is transforming sports like golf [33:32] Ken’s favorite tools for software scripting as well as for production code development [41:10] How the #66DaysofData hashtag can supercharge your capacity as a data scientist [42:51] Ken’s data science podcast, Ken’s Nearest Neighbors [54:11] LinkedIn Q&A [1:00:32] Additional materials: www.superdatascience.com/555
-
554: Jon's Deep Learning Courses
04/03/2022 Duration: 05min
In this episode, Jon shares where you can find his extensive deep learning video content and courses. Tune in to learn more about his deep learning curriculum and where you can learn for free. Additional materials: www.superdatascience.com/554
-
553: The Statistics and Machine Learning Quests of Dr. Josh Starmer
01/03/2022 Duration: 01h48min
In this episode, Dr. Josh Starmer, the creative, musical genius behind the wildly popular YouTube channel StatQuest, joins the podcast to discuss statistics, learning and communication secrets, and how he grew his YouTube channel to over 650,000 subscribers. In this episode you will learn: The inspiration behind Josh’s YouTube channel [18:39] Josh's simple approach to learning something new [34:25] Josh's secret tool for creating YouTube videos with over a million views [51:01] The StatQuest Illustrated Guide to Machine Learning [53:34] How and when Josh uses R vs. Python [1:07:53] How to cluster any type of data using the R randomForest package [1:11:24] Why Josh left his academic career [1:14:24] The two stats concepts Josh thinks everyone should know [1:38:50] Additional materials: www.superdatascience.com/553
-
552: The Most Popular SuperDataScience Episodes of 2021
25/02/2022 Duration: 04min
In this episode of Five-Minute Friday, Jon recaps the most popular SuperDataScience podcast episodes from 2021. See what you might have missed and catch up today! Additional materials: www.superdatascience.com/552
-
551: Deep Reinforcement Learning — with Wah Loon Keng
22/02/2022 Duration: 01h21min
In this episode, gifted author and software engineer Wah Loon Keng joins the podcast to dive deep into reinforcement learning. From its history and limitations to modern industrial applications and future developments, there's no better expert to learn from if you want to know more about this complex topic. In this episode you will learn: What is reinforcement learning? [4:50] Deep reinforcement learning vs reinforcement learning [13:17] A timeline of reinforcement learning breakthroughs [16:17] The limitations of deep RL today [39:53] Deep RL applications [53:10] Keng's open-source SLM-Lab framework [57:51] Keng’s responsibilities as an AI engineer [1:02:17] What is the future of RL? [1:08:05] Additional materials: www.superdatascience.com/551
-
550: Daily Habit #6: Write Morning Pages
18/02/2022 Duration: 04min
Jon is back with another Five-Minute Friday habit-tracking episode! Listen in as he explains how writing morning pages has helped his data science work flourish with creativity. Inspired by Julia Cameron's book The Artist's Way, he details his morning pages routine and how it kickstarted a new chapter in his career. Additional materials: www.superdatascience.com/550
-
549: Engineering Natural Language Models — with Lauren Zhu
15/02/2022 Duration: 01h06min
In this episode, Glean software engineer and Stanford graduate Lauren Zhu joins us to discuss her role at a fast-growing startup, working on natural language processing projects, and how she remains inspired by pursuing her side passions. In this episode you will learn: Lauren's experience as a course assistant [5:53] Stanford's Hacking the Coronavirus Course [11:53] How do you empower minority groups in AI [19:45] Lauren on zero-shot multilingual neural machine translation [23:25] Lauren's work at Glean [27:58] The Contrary Talent Network [34:30] The tools Lauren uses at Glean [43:39] The most important skills to possess as a data scientist [47:29] Additional materials: www.superdatascience.com/549
-
548: Daily Habit #5: Meditate
11/02/2022 Duration: 03min
Our Five-Minute Friday series on habit tracking returns with a look at one of Jon's daily mindfulness habits: meditation. Learn how to keep this habit going for the long run and discover which tools help Jon stay on track. Additional materials: www.superdatascience.com/548
-
547: How Genes Influence Behavior — with Prof. Jonathan Flint
08/02/2022 Duration: 01h16min
In this episode, Dr. Jonathan Flint, Professor of Psychiatry and Biobehavioral Sciences at the University of California Los Angeles, joins us to discuss how he uses data science and machine learning to explore the link between genetics and depression. In this episode you will learn: Jonathan's background [2:53] How we know that genetics plays a role in complex human behaviors including psychiatric disorders like anxiety, depression, and schizophrenia [8:00] The role that data science and ML play in modern genetics research [15:08] About Jonathan's book "How Genes Influence Behavior" [19:45] The day-to-day life of a world-class medical sciences researcher [32:24] The open-source software libraries that Jonathan uses for data modeling [40:33] A single question you can ask to prevent a severely depressed person from committing suicide [52:00] LinkedIn Q&A [54:41] The future of psychiatric treatments [1:05:35] Additional materials: www.superdatascience.com/547
-
546: Daily Habit #4: Alternate-Nostril Breathing
04/02/2022 Duration: 04min
Our Five-Minute Friday habit-tracking series continues! Learn more about alternate-nostril breathing, the mindfulness technique that is scientifically proven to lower blood pressure and regulate the stress response. Additional materials: www.superdatascience.com/546
-
545: Scaling Data-Intensive Real-Time Applications — with Matthew Russell
01/02/2022 Duration: 01h16min
Data scientist and entrepreneur Matthew Russell joins Jon Krohn to discuss the intersection of machine learning and fitness and dive deep into the strategies he and his team at Strongest AI use to scale data-intensive real-time applications. In this episode you will learn: About Strongest's event platform and iOS app [6:06] How Strongest scaled to serve million [8:14] Strongest's unique approach to building a fitness app [17:50] How to rapidly test ML models for deployment [29:01] The three critical traits Matthew looks for in anyone he hires [33:11] Mining the Social Web [41:14] The values instilled in Matthew by pursuing a military education [53:30] The key skills Matthew wishes he’d learned earlier in his career [1:03:51] Additional materials: www.superdatascience.com/545
-
544: Daily Habit #3: Make Your Bed
28/01/2022 Duration: 02min
Our habit-tracking series continues with a look at how making your bed can jumpstart your mornings, keep you from falling into negative habits, and help you become happier. Additional materials: www.superdatascience.com/544
-
543: Sparking A.I. Innovation — with Nicole Büttner
25/01/2022 Duration: 55min
Nicole Büttner (Founder and CEO of Merantix Labs) joins the podcast to discuss driving A.I. innovation, automation, and transformation and building the ideal A.I. start-up founding team. In this episode you will learn: The three factors that spark A.I. innovation [12:48] How to make great use of unlabelled, unbalanced data sets [18:54] How to engineer reusable data and software components [25:09] Merantix's A.I. Canvas framework for successful innovation [29:59] How to be a part of Merantix's program as a founder [45:23] Additional materials: www.superdatascience.com/543
-
542: Continuous Calendar for 2022
21/01/2022 Duration: 02min
Revisit the much-underrated continuous calendar and get started with this uncommon planning method thanks to Jon's 2022 template. Additional materials: www.superdatascience.com/542