MSC – DATA SCIENCE
The American College
(An Autonomous Institution Affiliated to Madurai Kamaraj University)
(Re-accredited [2nd Cycle] by NAAC with Grade 'A' & CGPA of 3.46 on a 4 point scale)
Madurai
Proposed PG Grid for June 2020
Sem.  Course Code  Course Title                          Hr.  Cr.  Mark
I     PDS 1501     Concepts of Data Science                5    5    75
      PDS 1503     Data Analytics (T + L)                  6    5    75
      PDS 1505     Artificial Intelligence                 5    5    75
      PDS 1607     Python Programming                      5    6    80
      PDS 1409     Python Programming Lab                  4    4    60
      PDS 1511     Probability and Statistics              5    5    75
                   Total                                  30   30   420
II    PDS 1502     Data Mining and Warehousing             5    5    60
      PDS 1404     Big Data Analytics                      4    4    60
      PDS 1406     Big Data Analytics Lab                  4    4    80
      PDS 1408     Machine Learning                        4    4    60
      PDS 1410     Computer Vision                         4    4    80
      PDS 1512     Linear Algebra                          5    5    80
      PDS 1414     Elective I                              4    4    60
                   Total                                  30   30   420
III   PDS 2501     Natural Language Processing (T + L)     5    5    60
      PDS 2403     Deep Learning                           4    4    80
      PDS 2405     Reinforcement learning                  4    4    80
      PDS 2507     Operation Research                      5    5    80
      PDS 2409     Effective Communications                4    4    80
      PDS 2411     Elective II                             4    4    60
      PDS 2413     Mini Project Lab                        4    4    40
                   Total                                  30   30   480
IV    PDS 2302     Industry Project                       30   30   480
                   Total                                  30   24   480
                   GRAND TOTAL                           120   90  1800
Concepts of Data Science
Unit I : Introduction
Benefits and uses of data science - Facets of data - The big data ecosystem and data
science - data science process.
Unit II: Machine Learning
What is machine learning and why should you care about it? - The modeling process -
Types of machine learning. Handling large data on a single computer: General techniques
for handling large volumes of data - General programming tips for dealing with large data
sets - Case study 1: Predicting malicious URLs.
Unit III: Big Data
Distributing data storage and processing with frameworks - Case study: Assessing risk
when loaning money - Introduction to NoSQL - Case study: What disease is that? -
Introducing connected data and graph databases - Connected data example: a recipe
recommendation engine.
Unit IV: Text mining and text analytics
Text mining in the real world - Text mining techniques: Bag of words - Stemming and
lemmatization - Decision tree classifier - Case study: Classifying Reddit posts
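As an illustration of the bag-of-words technique listed in this unit, the following is a minimal sketch in Python using only the standard library. The function names and the two sample sentences are assumptions made for this example and are not taken from the text book; stemming and lemmatization are deliberately left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Return a sorted vocabulary and one word-count vector per document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is every distinct token seen in any document.
    vocabulary = sorted(set().union(*counts))
    # Each vector holds the count of each vocabulary word in that document.
    vectors = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, vectors


# Hypothetical sample documents, for illustration only.
docs = ["Data science is fun", "Science needs data, data, data"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
print(vectors)
```

Word order is discarded on purpose: each document is reduced to counts over a shared vocabulary, which is the form a classifier such as a decision tree can consume.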
Unit V: Data visualization to the end user
Data visualization options - Crossfilter, the JavaScript MapReduce library - Creating an
interactive dashboard with dc.js - Dashboard development tools
Text Book:
1. Davy Cielen, Arno D. B. Meysman, Mohamed Ali, Introducing Data Science, Manning
Publications Co, 2016.
Reference Book:
1. John D. Kelleher and Brendan Tierney, “Data Science”, First Edition, The MIT Press,
London, 2018.
2. Lillian Pierson, “Data Science for Dummies”, 2nd Edition, John Wiley & Sons
publications, 2017.
3. EMC Education Services, Data Science & Big Data Analytics: Discovering,
Analyzing, Visualizing and Presenting Data, John Wiley & Sons, Inc, 2015.
4. Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical
Learning: Data Mining, Inference, and Prediction, Second Edition, Springer, 2017.
Data Analytics
Course Objectives:
To develop problem solving abilities using Mathematics
To apply algorithmic strategies while solving problems
To develop time and space efficient algorithms
To study algorithmic examples in distributed, concurrent and parallel
environments
Course Outcomes:
On completion of the course, the student will be able to:
Write case studies in Business Analytics and Intelligence using mathematical
models.
Present a survey of applications for Business Analytics and Intelligence.
Write problem solutions for multi-core or distributed, concurrent/parallel
environments.
Unit I: Data Analytics Overview
Introduction – Importance – Types of Data Analytics – Data analytics lifecycle: overview –
discovery – data preparation – model planning – model building – communicate results –
operationalize. Case study: Global Innovation Network and Analysis (GINA).
Unit II: Statistics for Analytics
Statistical Methods for Evaluation: Hypothesis Testing - Difference of Means- Wilcoxon
Rank-Sum Test - Type I and Type II Errors – Power and Sample Size - ANOVA
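As an illustration of the difference-of-means topic in this unit, the following minimal Python sketch computes Welch's t statistic and its approximate degrees of freedom for two independent samples. Welch's test is one common choice for this topic and is assumed here; the function name and the sample values are hypothetical. Turning the statistic into a p-value requires the t distribution, which is outside this sketch.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    se2_a = statistics.variance(sample_a) / len(sample_a)
    se2_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical measurements, for illustration only.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(t_stat, dof)
```

A large absolute value of t relative to the t distribution with df degrees of freedom is evidence against the null hypothesis of equal means.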
Unit III: Time Series & Text Analysis
Overview of Time Series Analysis: - Box-Jenkins Methodology - ARIMA Model -
Additional Methods. Text Analysis: Text Analysis Steps – A Text Analysis Example -
Collecting Raw Text - Representing Text - Term Frequency-Inverse Document
Frequency (TFIDF) - Categorizing Documents by Topics - Determining Sentiments -
Gaining Insights.
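As an illustration of the TFIDF topic in this unit, the following minimal Python sketch weights each term by its frequency within a document times the log of the inverse document frequency. This is one common variant of the formula (several exist), and the function name and the tiny corpus are assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf(tokenized_docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(tokenized_docs)
    # Count, for each term, how many documents contain it at least once.
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    weights = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        weights.append(
            {
                term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


# Hypothetical pre-tokenized corpus, for illustration only.
corpus = [["big", "data", "analytics"], ["data", "mining"], ["text", "mining", "text"]]
scores = tf_idf(corpus)
print(scores[2])
```

Under this variant a term that appears in every document receives weight zero, since log(1) = 0, which is how TFIDF discounts words that do not help distinguish documents.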
Unit IV: Supervised Learning
Introduction - Variable Types and Terminology - Least Squares and Nearest Neighbors -
Statistical Decision Theory - Structured Regression Models - Classes of Restricted
Estimators. Support Vector Machines and Flexible Discriminants: The Support Vector
Classifier - Support Vector Machines and Kernels. Prototype Methods and Nearest-
Neighbors: Prototype Methods
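As an illustration of the nearest-neighbors topic in this unit, the following minimal Python sketch classifies a query point by majority vote among its k closest training points under Euclidean distance. The function name, the default k = 3, and the toy data are assumptions made for this example; it requires Python 3.8 or later for math.dist.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbors."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequent label among the k nearest neighbors wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-cluster training set, for illustration only.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (0.5, 0.5)))
print(knn_predict(points, labels, (5.5, 5.5)))
```

Smaller values of k follow the training data more closely, while larger values smooth the decision boundary.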
Unit V: Unsupervised Learning
Introduction - Association Rules - Cluster Analysis - Random Forests: Definition of
Random Forests - Details of Random Forests - Analysis of Random Forests - Undirected
Graphical Models - Markov Graphs and Their Properties - Undirected Graphical Models
for Continuous Variables - Undirected Graphical Models for Discrete Variables.
Text Book
1. EMC Education Services, Data Science & Big Data Analytics: Discovering,
Analyzing, Visualizing and Presenting Data, John Wiley & Sons, Inc, 2015.
2. Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical
Learning: Data Mining, Inference, and Prediction, Second Edition, Springer, 2017.
Unit I (Text Book 1): Chapter 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8.
Unit II (Text Book 1): Chapter 3.3.
Unit III (Text Book 1): Chapter 8.1, 8.2, 8.3, 9.1 – 9.8.
Unit IV (Text Book 2): Chapter 2.1, 2.2, 2.3, 2.4, 2.7, 2.8, 12.2, 12.3, 13.2.
Unit V (Text Book 2): 14.1, 14.2, 14.3, 15.2, 15.3, 15.4, 17.2, 17.3, 17.4
Reference Book:
1. Anil Maheshwari, Data Analytics, McGraw Hill Education; First edition, 2017.
2. Annalyn Ng, Data Science for the Layman, Shroff Publishers; First edition, 2018.
3. Ramesh Sharda, Dursun Delen, Efraim Turban, Business Intelligence, Analytics, and
Data Science: A Managerial Perspective, Pearson Education, Fourth edition, 2019.