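Overview | Long Description | This course presents an introductory roadmap to the newly emerged and rapidly evolving field of data science, with the objective of introducing a problem-solving mindset in a data-intensive context. The course presents data science as a productive synthesis of its parent disciplines, including mathematics, statistics, computing, data mining, systems science, and data visualization. This synthesis is applicable across many fields, bringing about scientific discovery through data-intensive, analytical methods. The course also aims to help students navigate toward more advanced courses in these parent disciplines so that they can build up their expertise in data science. Topics include: History and impact of data science. Collecting data: sources, types, and categorization of data. Visualising data: summary statistics, data display, data dictionaries, schema, and graphical visualization. Analysing data: pattern recognition, correlations and relationships, hypothesis testing, statistical significance. Investigating data: data mining, machine learning, inference, metadata, modeling, eliciting meaning, and validation. Application contexts: examples of useful applications from case studies. | This course provides a foundational roadmap for learning data science, a new and rapidly developing field. It aims to teach students to use scientific methods and the analysis of large volumes of data to help solve practical problems, and to train them in the associated mindset. Data science draws on disciplines including mathematics, statistics, computing, data mining, systems science, and data visualization. It can be applied across many fields, supporting scientific exploration and discovery through the processing and analysis of data. The course also broadens students' knowledge of the related disciplines, helping them take more advanced courses later and acquire professional expertise in data science. Topics include: History and impact of data science. Data collection: sources, types, and categories. Data visualization: data display, data dictionaries, data schema, and graphical presentation. Data analysis: pattern recognition, correlations, relationships, hypothesis testing, and statistical significance testing. Data investigation: data mining, machine learning, inference, metadata design, data modeling, meaning extraction, and validation. Practical case studies and applications. |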