Doing Data Science: Straight Talk from the Frontline

📖 Overview

Doing Data Science provides a practical guide to the emerging field of data science through real-world examples and expert insights. The book grew out of Columbia's Introduction to Data Science course, taught by Rachel Schutt, and includes contributions from multiple industry practitioners. The text covers core concepts like statistical inference, algorithms, and machine learning while connecting them to concrete applications in business, technology, and research.

Technical content is balanced with discussions of data ethics, team dynamics, and organizational challenges that working data scientists encounter. Each chapter tackles a specific aspect of the data science workflow, from initial data collection through modeling, validation, and deployment. Case studies from companies like Google, OkCupid, and The New York Times demonstrate how data science methods are applied in production environments.

The book presents data science not just as a technical discipline but as a rapidly evolving field that requires both rigorous analytical skills and careful consideration of social impact. Its structure reflects the multifaceted nature of modern data work, where statistical expertise meets business acumen and ethical judgment.

👀 Reviews

Readers describe this as a survey-style overview that captures real-world data science practices and challenges. Multiple reviews note it works best as a companion to the Columbia data science course it's based on.

Liked:
- Concrete examples from industry practitioners
- Coverage of ethical considerations in data science
- Clear explanations of statistical concepts
- Practical code examples in R

Disliked:
- Uneven depth across chapters
- Some content feels outdated (particularly social media examples)
- Too basic for experienced practitioners
- Code examples lack proper context
- Jumps between topics without clear connections

"The real-world case studies were invaluable," notes one Amazon reviewer, while another states, "the chapters feel like independent blog posts rather than a cohesive book."

Ratings:
- Goodreads: 3.8/5 (1,872 ratings)
- Amazon: 4.1/5 (242 ratings)
- O'Reilly: 4/5 (43 ratings)

📚 Similar books

Data Science for Business by Foster Provost and Tom Fawcett
This book connects data science concepts to business applications through case studies and practical examples.

Naked Statistics by Charles Wheelan
The text explains statistical concepts through real-world applications in business, policy, and social science.

The Signal and the Noise by Nate Silver
This book examines prediction methods across fields including economics, climate science, and sports through the lens of probability and statistics.

Weapons of Math Destruction by Cathy O'Neil
The book reveals how mathematical models and algorithms impact society through decisions in employment, education, and criminal justice.

Data Science from Scratch by Joel Grus
The text builds data science concepts from fundamental principles using Python implementations and practical examples.

🤔 Interesting facts

📚 Author Cathy O'Neil previously worked as a quantitative analyst at the hedge fund D.E. Shaw during the 2008 financial crisis, which heavily influenced her views on the ethical use of data.

🎓 The book emerged from a data science course taught at Columbia University by Rachel Schutt, who collaborated with O'Neil to transform the course content into this practical guide.

🔍 Many of the case studies in the book feature real-world examples from prominent companies like Netflix, Facebook, and OkCupid, providing readers with authentic insight into industry applications.

⚖️ Cathy O'Neil went on to write "Weapons of Math Destruction" (2016), which explores how algorithmic decision-making can perpetuate inequality and was longlisted for the National Book Award.

🌐 The book was one of the first mainstream publications to address the intersection of data science and ethics, predating the current widespread discussions about algorithmic bias and AI responsibility.