Program
This course provides an introduction to the fundamental concepts, methods, and technologies in applied data science. Students will gain exposure to how data are organized, managed, cleaned, processed, and analyzed. Core themes include data management practices, ethics, data quality control, and the communication of insights for effective decision-making.
Through a combination of lectures, individual assignments, and a group project, students will develop hands-on skills in R programming and applied statistics, while exploring advanced topics such as data visualization, text mining, and machine learning. Weekly tasks reinforce lecture concepts, giving students opportunities to practice coding, transform and clean data, and conduct real-world analyses. The semester culminates in a final project where students apply what they have learned to analyze a real-world dataset and deliver actionable insights.
Open-source books, code, and datasets will be used. When using these materials, please cite the original open-source providers. We are grateful for their contributions to the field.