Skills & Tools
Use Python to mine datasets and predict patterns.
Build statistical models — regression and classification — that generate usable information from raw data.
The Big Picture
Master the basics of machine learning and harness the power of data to forecast what’s next.
Meet Your Support Team
Our educational excellence is a community effort. When you learn at GA, you can always rely on an in-house team of experts to provide guidance and support, whenever you need it.
Learn industry-grade frameworks, tools, vocabulary, and best practices from a teacher whose daily work involves using them expertly.
Taking on new material isn’t always easy. Through office hours and other channels, our TAs are here to provide you with answers, tips, and more.
Our alumni love their Course Producers, who kept them motivated throughout the course. You can reach out to yours for support anytime.
See What You'll Learn
Unit 1: Research Design and Exploratory Data Analysis
- What is Data Science
- Describe the course syllabus and establish the classroom environment
- Answer the questions: "What is Data Science? What roles exist in Data Science?"
- Define the workflow, tools, and approaches data scientists use to analyze data
- Research Design and Pandas
- Define a problem and identify appropriate data sets using the data science workflow
- Walk through the data science workflow using a case study in the Pandas library
- Import, format and clean data using the Pandas Library
- Statistics Fundamentals I
- Use the NumPy and Pandas libraries to analyze datasets using basic summary statistics: mean, median, mode, max, min, quartiles, interquartile range, variance, standard deviation, and correlation
- Create data visualizations (scatter plots, scatter matrices, line graphs, box plots, and histograms) to discern characteristics and trends in a dataset
- Identify a normal distribution within a dataset using summary statistics and visualization
- Statistics Fundamentals II
- Explain the difference between causation and correlation
- Test a hypothesis within a sample case study
- Validate your findings using statistical analysis (p-values, confidence intervals)
- Instructor Choice
- Focus on a topic selected by the instructor/class in order to provide deeper insight into exploratory data analysis
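To give a feel for the summary statistics named in this unit, here is a minimal sketch using Pandas. The dataset, column names, and values are made up for illustration and are not course material; treat it as one possible way to compute these figures, not the method taught in class.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "exam_score": [52, 55, 61, 70, 74, 83],
})

scores = df["exam_score"]

# Basic summary statistics for a single column.
summary = {
    "mean": scores.mean(),
    "median": scores.median(),
    "min": scores.min(),
    "max": scores.max(),
    # Interquartile range: distance between the 75th and 25th percentiles.
    "iqr": scores.quantile(0.75) - scores.quantile(0.25),
    # Pandas computes the sample variance and standard deviation (ddof=1) by default.
    "variance": scores.var(),
    "std": scores.std(),
}

# Pearson correlation between the two columns.
correlation = df["hours_studied"].corr(df["exam_score"])

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
print(f"correlation: {correlation:.3f}")
```

A correlation close to 1 here only says the two made-up columns move together; it does not by itself show that one causes the other.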
Unit 2: Foundations of Data Modeling
- Introduction to Regression
- Define data modeling and linear regression
- Differentiate between categorical and continuous variables
- Using the scikit-learn library, build a linear regression model on a dataset that meets the linearity assumption
- Evaluating Model Fit
- Define regularization, bias, and error metrics
- Evaluate model fit by using loss functions including mean absolute error, mean squared error, and root mean squared error
- Select regression methods based on fit and complexity
- Introduction to Classification
- Define a classification model
- Build a K-Nearest Neighbors model using the scikit-learn library
- Evaluate and tune a model by using metrics such as classification accuracy/error
- Introduction to Logistic Regression
- Build a logistic regression classification model using the scikit-learn library
- Describe the sigmoid function, odds, and odds ratios and how they relate to logistic regression
- Evaluate a model using metrics such as classification accuracy/error, confusion matrix, ROC/AUC curves, and loss functions
- Communicate Results from Logistic Regression
- Explain the tradeoff between the precision and recall of a model and articulate the cost of false positives vs. false negatives.
- Identify the components of a concise, convincing report and how they relate to specific audiences/stakeholders
- Describe the difference between visualization for presentations vs. exploratory data analysis
- Flexible Class Session
- Focus on a topic selected by the instructor/class in order to provide deeper insight into data modeling
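The loss functions listed under Evaluating Model Fit can be sketched in a few lines of plain Python. The helper names (`mae`, `mse`, `rmse`) and the sample values below are assumptions chosen for illustration; in practice a library such as scikit-learn provides equivalent metrics.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average size of the errors, ignoring sign."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error: average squared error, which penalizes large misses more."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of MSE, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

# Hypothetical actual values and model predictions.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

print("MAE: ", mae(y_true, y_pred))
print("MSE: ", mse(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
```

Because RMSE squares each error before averaging, a single large miss raises it more than it raises MAE, which is one reason the two are often reported together.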
Unit 3: Data Science in the Real World
- Decision Trees and Random Forest
- Describe the difference between classification and regression trees and how to interpret these models
- Explain and communicate the tradeoffs of decision trees vs regression models
- Build decision trees and random forests using the scikit-learn library
- Natural Language Processing
- Demonstrate how to tokenize natural language text using NLTK
- Categorize and tag unstructured text data
- Explain how to build a text classification model using NLTK
- Dimensionality Reduction
- Explain how to perform dimensionality reduction using topic models
- Demonstrate how to refine data using Latent Dirichlet Allocation (LDA)
- Extract information from a sample text dataset
- Working with Time Series Data
- Explain how time series data differs from other data and how to account for it
- Create rolling means and plot time series data using the Pandas library
- Perform autocorrelation on time series data
- Creating Models with Time Series Data
- Decompose time series data into trend and residual components
- Validate and cross-validate data from different datasets
- Use the ARIMA model to forecast and detect trends in time series data
- The Value of Databases
- Describe the use cases for different types of databases
- Explain differences between relational databases and document-based databases
- Write simple select queries to pull data from a database and use within Pandas
- Moving Forward with your Data Science Career
- Specify common models used within different industries
- Identify the use cases for common models
- Discuss next steps and additional resources for data science learning
- Flexible Class Session
- Focus on a topic selected by the instructor/class in order to provide deeper insight into data science in the real world
- Final Presentations
- Deliver a final presentation to peers, the instructor, and guest panelists, who will identify strengths and areas for improvement
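As a small illustration of the time series objectives in this unit (rolling means and autocorrelation with Pandas), here is a minimal sketch. The dates and values are invented, and the three-day window is an arbitrary choice made for the example.

```python
import pandas as pd

# Hypothetical daily observations; the values are invented for illustration only.
dates = pd.date_range("2024-01-01", periods=7, freq="D")
sales = pd.Series([10, 12, 11, 15, 14, 18, 17], index=dates)

# 3-day rolling mean: each value averages the current day and the two before it.
# The first two entries are NaN because a full window is not yet available.
rolling_mean = sales.rolling(window=3).mean()

# Lag-1 autocorrelation: how strongly each day's value relates to the previous day's.
lag1_autocorr = sales.autocorr(lag=1)

print(rolling_mean)
print(f"lag-1 autocorrelation: {lag1_autocorr:.3f}")
```

Unlike rows in most tabular datasets, these observations are ordered, so each value is compared with its neighbors in time rather than treated as independent.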
Need payment assistance? Our financing options allow you to focus on your goals instead of the barriers that keep you from reaching them.
Apply for an interest-free loan of up to 18 months or a fixed-fee loan of up to 48 months.⁵
⁵Must be a Hong Kong citizen or permanent resident.
Financing options differ in each market and are only available to students accepted into our programs.
Contact a local admissions officer for more info.
About the School
General Assembly is a pioneer in education and career transformation, specializing in today’s most in-demand skills. The leading source for training, staffing, and career transitions, we foster a flou ...