The premise of this book is to let fascinating data sets lead the student along a path of discovery through the world of data science. "Data science is an emerging field that brings together ideas that have been around for years, or even centuries. Most people define data science as 'an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms'." "This book is designed to be used in conjunction with external tools like Google Sheets and Jupyter Notebooks. You will need to move back and forth between browser tabs as you work with the tools, and follow the instructions in the book. You will be asked to answer the questions in the book as you read. This is to encourage you to type in the code we have provided and experiment with it."
Type of Material:
Open (Access) Textbook
Recommended Uses:
Interactive, multimedia text to support a data science course
Self-paced learning, if supported by additional materials addressing external tools
Technical Requirements:
Web browser
Identify Major Learning Goals:
Articulate the data science processing pipeline
Extract data using SQL
Gather data from the Internet using web APIs and screen scraping
Combine data from different sources
Clean the data
- Handle missing data, find outliers, and fix data
- Normalize and rescale data
Visualize the data
Translate questions to analysis and analysis to interesting stories
Analyze data
- Single variable regression, logistic regression
- Market basket analysis
- Cohort analysis
- Sentiment analysis, exposure to Bayes' theorem
- Time series
- Geographic analysis
- Simulations, Monte Carlo
Understand statistical significance and how to test for it using practical simulation techniques.
Target Student Population:
College Lower Division, College Upper Division, Graduate School, Professional
Prerequisite Knowledge or Skills:
Comfort with multi-tab web browsing and online tools, because the book is designed to be used in conjunction with external tools (like Google Sheets and Jupyter Notebooks).
Basic programming concepts, including fundamental flow control, are needed.
Content Quality
Rating:
Strengths:
The content can help students learn data science topics and gain hands-on experience.
It focuses on how to get things done by illustrating techniques with genuine examples rather than lengthy descriptions of theory.
It also makes good use of cases whose data sets are obtainable by users, so that they can work through meaningful exercises on their own instead of just small toy problems, e.g., Text Analysis with UN General Debates and CIA World Factbook Data.
In summary:
1. Valid data science principles and models
2. Substantial coverage of the associated theory
3. Exploratory mode reflects practical usage
Potential Effectiveness as a Teaching Tool
Rating:
Strengths:
- Goals are well-articulated
- Blending of concepts, theory, and practical application increases potential for learning
- Reinforces conceptual understanding through combination of demonstration and interactive practice
- The learning goals are clearly stated in the preface.
- It does not rely on flashy features to attract the reader. Instead, it clearly states the details of how to accomplish each task with Python programming.
- The main methods and techniques introduced are critical to learning data science using Python.
- Assignments are also available to users after registration and login.
- It also comes with some short quizzes to reinforce the concepts.
Ease of Use for Both Students and Faculty
Rating:
Strengths:
- The contents are mainly text-based, supplemented with some videos, source code, expected outputs, and tables.
- The website is well structured for sequential learning as readers move from topic to topic. It also provides cross-references linking to topics mentioned in other sections.
- It also comes with a Scratch ActiveCode page that allows readers to run code directly on the website.
- The material provides clear, accurate instructions for its use
- Clear and internally consistent
- Obvious and straight-forward navigation
Concerns:
At least one link from the Table of Contents, "Assignments", leads to a page-not-found error.
Some aspects, especially those associated with the use of external tools, are likely to require resources in addition to the materials provided.
A minor observation about the navigation: to return to the starting page, the user must click the "httlads" button, which is not easily recognized as serving that purpose.
Creative Commons: