Computer Science 557: Introduction to Data Analytics
The transformation of today’s enterprises toward data-driven discovery and prediction is being fueled by the availability of large data sets in commerce and research. There is a need for scalable data management, statistical analysis, parallel algorithms, and individuals proficient in handling data on a large scale. Data analytics is an amalgamation of techniques from multiple disciplines and is difficult to master. This course is designed to provide you with the basic techniques of data science, including prominent algorithms used to mine data (e.g., clustering and association rule mining) and basic statistical modeling (e.g., logistic and non-linear regression). The course is targeted toward individuals who would like to know the practices used to analyze large-scale data and the potential uses of large-scale data analytics. The objective of this course is to ensure that students know the fundamental techniques and tools used to manage and analyze large volumes of data. Topics include:
• Data cleaning, data management, and the MapReduce framework,
• Data preprocessing techniques, multivariate data analysis, and hypothesis testing,
• Predictive analytics and logistic regression analysis,
• Model building, model selection, and error estimation,
• Overview of data analytics tools,
• Data collection and ‘big data’ management life cycle,
• Analysis of streaming data and data visualization,
• Frequent pattern mining, and
• An introduction to data warehousing, topologies, querying, and data analysis.
For additional information, please contact Dr. Pradeep Chowriappa (email@example.com).