Our Courses  
 
Data Science Course Content

Python for Data Science

  • Python Programming Introduction
  • Data Types and Data Structures
  • Control Statements
  • Functions
  • User Defined Functions
  • Python Packages
  • NumPy, Pandas, Seaborn, Matplotlib

R Programming

  • R and RStudio Installation
  • Data Types and Data Structures
  • Logical Operators
  • Loops
  • Packages and Functions in R
  • Data Frame operations
  • Getting Data into R from Flat Files and Databases

Business Statistics

  • Probability refresher
  • Measures of Central Tendency
  • Measures of Spread
  • Distributions
  • Hypothesis testing
  • Correlation and Covariance

Exploratory Data Analysis and Visualization

  • Summary Statistics
  • Data Distributions
  • Data Transformations
  • Outlier Detection
  • Charts and Graphs
  • One-Dimensional Charts
  • Two-Dimensional Charts

Data Pre-processing

  • Data types and Conversions
  • Normalization
  • Scaling
  • Imputation

Machine Learning

Supervised Learning

Introduction

  • Steps in Supervised Learning
  • Regression and Classification
  • Training and Testing
  • Measures of Performance
  • R-Squared, RMSE, MAE for Regression
  • Confusion matrix
  • Accuracy, Precision and Recall
  • F1 Score
  • Sensitivity and Specificity
  • ROC and AUC

    Linear Regression

    • Simple Linear Regression
    • Cost Functions
    • Least Squares (Sum of Squared Errors)
    • Variable selection
    • Model Development and Improvement
    • Model Validation and Diagnostics
    • Gradient Descent Approach
    • Real Time Project

    Logistic Regression

    • Variable selection methods
    • Forward, Backward and Stepwise
    • Model Development and Validation
    • Measurements of accuracy
    • Interpretation and Implementation

    Decision Trees

    • Rule Based Learning
    • Construction of Rules
    • Decision Nodes vs Leaf Nodes
    • Choosing Variables for Decision Nodes
    • Measures of Impurity
    • Entropy, Gini Index and Information gain
    • Overfitting and Pruning
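As a rough illustration of the impurity measures listed above, the sketch below computes entropy, the Gini index, and information gain for a candidate split. The function names and the toy labels are assumptions made for this example; they are not taken from the course material.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy data: a 50/50 node split into two pure children.
parent = ["yes", "yes", "no", "no"]
split = [["yes", "yes"], ["no", "no"]]
print(entropy(parent), gini(parent), information_gain(parent, split))
```

A split that produces purer child nodes yields a larger information gain, which is one common criterion for choosing the variable at a decision node.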

    Bagging and Random Forest

    • Resampling Methods
    • Resampling with replacement
    • Resampling without replacement
    • Random Forest

    Boosting

    • AdaBoost
    • Gradient Boosting
    • Extreme Gradient Boosting - XGBoost

    Support Vector Machines

    • Maximum Margin Classifier
    • Support Vector Classifier
    • Kernels: Linear and Non-Linear

    Cross Validation

    • K-Fold Cross Validation
    • Cross Validation Usage
    • Bias and Variance
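To make the K-fold idea concrete, here is a minimal sketch of how the sample indices could be partitioned into training and test folds. The helper name `k_fold_indices` is hypothetical, and the sketch assumes contiguous folds without shuffling; in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Assumption for this sketch: no shuffling, and the first
    n_samples % k folds each receive one extra sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Each sample appears in exactly one test fold.
for train_idx, test_idx in k_fold_indices(10, 3):
    print(len(train_idx), test_idx)
```

A model would be fitted on each `train_idx` and scored on the matching `test_idx`; averaging the k scores gives the cross-validated estimate.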

    Unsupervised Learning: Clustering (Segmentation)

    • K-Means Clustering
    • Cluster Grouping

    Dimensionality Reduction Techniques

    • Principal Component Analysis (PCA)
    • Vector calculations

    Association Rules

    • Market Basket Analysis
    • Apriori Algorithm

    Text Mining

    • Text Analysis
    • Cleaning Text Data
    • Tokenization and Pre-Processing
    • Word Counts and Word Clouds
    • Text Classification
    • Natural Language Processing

    Probabilistic Methods Introduction

    • Naïve Bayes
    • Joint and Conditional Probabilities
    • Classification using Naïve Bayes

    Extra Topics for Freshers

    • SQL
    • Joins
    • WHERE clause, GROUP BY clause
    • Data warehouse concepts

    For Experienced Professionals

    Web Scraping

    • Getting data from a single website
    • Getting data from multiple websites

    Forecasting

    • Time Series
    • Components of Time Series
    • Trend, Seasonality, Randomness
    • Additive and Multiplicative
    • Holt-Winters
    • Moving Averages
    • Exponential smoothing
    • ARIMA
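Two of the smoothing methods listed above can be sketched in a few lines. The function names and the sample series are assumptions for illustration only; the exponential smoother shown is the simple (single) variant, initialised with the first observation.

```python
def moving_average(series, window):
    """Trailing moving average; returns len(series) - window + 1 values."""
    if not 1 <= window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]  # assumption: seed with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical monthly values.
sales = [10, 20, 30, 40, 50]
print(moving_average(sales, 3))
print(exponential_smoothing(sales, 0.5))
```

A larger `alpha` weights recent observations more heavily, while a wider `window` produces a smoother but slower-reacting average.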
     
     
     
        Copyright © 2014 www.ifocusitolutions.com All rights reserved.