MSC - DATA SCIENCE
The American College
(An Autonomous Institution Affiliated to Madurai Kamaraj University)
(Re-accredited [2nd Cycle] by NAAC with Grade 'A' & CGPA of 3.46 on a 4 point scale)
Madurai

Proposed PG Grid for June 2020

Sem.  Course Code  Course Title                          Hr.  Cr.  Mark
I     PDS 1501     Concepts of Data Science              5    5    75
      PDS 1503     Data Analytics (T + L)                6    5    75
      PDS 1505     Artificial Intelligence               5    5    75
      PDS 1607     Python Programming                    5    6    80
      PDS 1409     Python Programming Lab                4    4    60
      PDS 1511     Probability and Statistics            5    5    75
                   Total                                 30   30   420
II    PDS 1502     Data Mining and Warehousing           5    5    60
      PDS 1404     Big Data Analytics                    4    4    60
      PDS 1406     Big Data Analytics Lab                4    4    80
      PDS 1408     Machine Learning                      4    4    60
      PDS 1410     Computer Vision                       4    4    80
      PDS 1512     Linear Algebra                        5    5    80
      PDS 1414     Elective I                            4    4    60
                   Total                                 30   30   420
III   PDS 2501     Natural Language Processing (T + L)   5    5    60
      PDS 2403     Deep Learning                         4    4    80
      PDS 2405     Reinforcement learning                4    4    80
      PDS 2507     Operation Research                    5    5    80
      PDS 2409     Effective Communications              4    4    80
      PDS 2411     Elective II                           4    4    60
      PDS 2413     Mini Project Lab                      4    4    40
                   Total                                 30   30   480
IV    PDS 2302     Industry Project                      30   30   480
                   Total                                 30   24   480
                   GRAND TOTAL                           120  90   1800

Concepts of Data Science

Unit I: Introduction
Benefits and uses of data science - Facets of data - The big data ecosystem and data science - Data science process.

Unit II: Machine Learning
What is machine learning and why should you care about it? - The modeling process - Types of machine learning. Handling large data on a single computer: General techniques for handling large volumes of data - General programming tips for dealing with large data sets - Case study 1: Predicting malicious URLs.

Unit III: Big Data
Distributing data storage and processing with frameworks - Case study: Assessing risk when loaning money - Introduction to NoSQL - Case study: What disease is that? - Introducing connected data and graph databases - Connected data example: a recipe recommendation engine.

Unit IV: Text mining and text analytics
Text mining in the real world - Text mining techniques: Bag of words - Stemming and lemmatization - Decision tree classifier - Case study: Classifying Reddit posts.

Unit V: Data visualization to the end user
Data visualization options - Crossfilter, the JavaScript MapReduce library - Creating an interactive dashboard with dc.js - Dashboard development tools.

Text Book:
1. Davy Cielen, Arno D. B. Meysman, Mohamed Ali, Introducing Data Science, Manning Publications Co, 2016.

Reference Books:
1. John D. Kelleher and Brendan Tierney, Data Science, First Edition, The MIT Press, London, 2018.
2. Lillian Pierson, Data Science for Dummies, 2nd Edition, John Wiley & Sons publications, 2017.
3. EMC Education Services, Data Science & Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, John Wiley & Sons, Inc, 2015.
4. Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer, 2017.

Data Analytics

Course Objectives:
To develop problem-solving abilities using mathematics.
To apply algorithmic strategies while solving problems.
To develop time- and space-efficient algorithms.
To study algorithmic examples in distributed, concurrent and parallel environments.

Course Outcomes:
On completion of the course, the student will be able to:
Write case studies in Business Analytics and Intelligence using mathematical models.
Present a survey on applications for Business Analytics and Intelligence.
Write problem solutions for multi-core or distributed, concurrent/parallel environments.

Unit I: Data Analytics Overview
Introduction - Importance - Types of Data Analytics - Data analytics lifecycle: overview - discovery - data preparation - model planning - model building - communicate results - operationalize. Case study: Global Innovation Network and Analysis (GINA).

Unit II: Statistics for Analytics
Statistical Methods for Evaluation: Hypothesis Testing - Difference of Means - Wilcoxon Rank-Sum Test - Type I and Type II Errors - Power and Sample Size - ANOVA.

Unit III: Time Series & Text Analysis
Overview of Time Series Analysis - Box-Jenkins Methodology - ARIMA Model - Additional Methods. Text Analysis: Text Analysis Steps - A Text Analysis Example - Collecting Raw Text - Representing Text - Term Frequency-Inverse Document Frequency (TFIDF) - Categorizing Documents by Topics - Determining Sentiments - Gaining Insights.

Unit IV: Supervised Learning
Introduction - Variable Types and Terminology - Least Squares and Nearest Neighbors - Statistical Decision Theory - Structured Regression Models - Classes of Restricted Estimators. Support Vector Machines and Flexible Discriminants: The Support Vector Classifier - Support Vector Machines and Kernels. Prototype Methods and Nearest-Neighbors: Prototype Methods.

Unit V: Unsupervised Learning
Introduction - Association Rules - Cluster Analysis - Random Forests: Definition of Random Forests - Details of Random Forests - Analysis of Random Forests - Undirected Graphical Models - Markov Graphs and Their Properties - Undirected Graphical Models for Continuous Variables - Undirected Graphical Models for Discrete Variables.

Text Books:
1. EMC Education Services, Data Science & Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, John Wiley & Sons, Inc, 2015.
2. Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer, 2017.

Unit I (Text Book 1): Chapter 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8.
Unit II (Text Book 1): Chapter 3.3.
Unit III (Text Book 1): Chapter 8.1, 8.2, 8.3, 9.1 - 9.8.
Unit IV (Text Book 2): Chapter 2.1, 2.2, 2.3, 2.4, 2.7, 2.8, 12.2, 12.3, 13.2.
Unit V (Text Book 2): Chapter 14.1, 14.2, 14.3, 15.2, 15.3, 15.4, 17.2, 17.3, 17.4.

Reference Books:
1. Anil Maheshwari, Data Analytics, McGraw Hill Education, First edition, 2017.
2. Annalyn Ng, Data Science for the Layman, Shroff Publishers, First edition, 2018.
3. Ramesh Sharda, Dursun Delen, Efraim Turban, Business Intelligence, Analytics, and Data Science: A Managerial Perspective, Pearson Education, Fourth edition, 2019.