High Dimensional Data Analysis


Modern high-throughput technologies easily generate data on thousands of variables, for example in health care, genomics, chemometrics, environmental monitoring, web logs and movie ratings.

Conventional statistical methods are no longer suited to analysing such high-dimensional data effectively. Multivariate statistical methods may be used, but the dimensionality of the data set is often much larger than the number of (biological) samples, which classical multivariate methods were not designed for. Modern advances in statistical data analysis allow such data to be analysed appropriately.

Methods for the analysis of high-dimensional data rely heavily on multivariate statistical methods. Therefore, a large part of the course content is devoted to multivariate methods, with a focus on high-dimensional settings and the issues they raise.

Multivariate statistical analysis covers many methods. This course covers a selection of techniques that, in our experience, are frequently used in industry and research institutes.

The course is taught using case studies with applications from different fields (analytical chemistry, ecology, biotechnology, genomics, …).

  1. Dimension reduction: Singular Value Decomposition (SVD), Principal Component Analysis (PCA), Multidimensional Scaling (MDS) and biplots for dimension-reduced data visualisation
  2. Sparse SVD and sparse PCA 
  3. Prediction with high dimensional predictors: principal component regression; ridge, lasso and elastic net penalised regression methods 
  4. Classification (prediction of class membership): (penalised) logistic regression and linear discriminant analysis
  5. Evaluation of prediction models: sensitivity, specificity, ROC curves, mean squared error, cross validation
  6. Clustering
  7. Large-scale hypothesis testing: false discovery rate (FDR), FDR control methods, empirical Bayes (local) FDR control

  • Type of course: This is an on-campus course.
  • Dates & times: February 7, 10, 14, 17, 21 and 24, 2022, from 5.30 pm to 9.30 pm
  • Venue: UGent, Faculty of Sciences, Campus Sterre, Krijgslaan 281, building S9, 9000 Gent
  • Target audience: This course targets professionals and investigators from all areas who work with high-dimensional data.
  • Exam/certificate: Participants who attend all classes receive a certificate of attendance via e-mail at the end of the course. Participants can also, if they wish, take an exam. Those who pass receive a certificate from Ghent University. The exam consists of a take-home project assignment, for which students write a report by a set deadline.
  • Course prerequisites: Working knowledge of basic statistics: data exploration and descriptive statistics, statistical modelling, and inference (linear models, confidence intervals, t-tests, F-tests, ANOVA, chi-squared test), as covered in Module 2 - Drawing Conclusions from Data: an Introduction, Module 5 - Exploiting Sources of Variation in your Data: the ANOVA Approach and Module 12 - Explaining and Predicting Outcomes with Linear Regression of this year's course program.
  • Funding:
    => Our academy is recognised as a service provider for the 'KMO-portefeuille'. Small and medium-sized enterprises located in the Flanders region can therefore save up to 30% on the registration fee for our courses. You can request this subsidy via www.kmo-portefeuille.be up to 14 calendar days after the course has started.
    => UGent PhD students can apply for a full refund from their Doctoral School.
  • Reduction:
    => If two or more employees from the same company enrol simultaneously for this course, a 20% reduction on the module price applies from the second enrolment onwards.
    => Reduced prices apply to staff of governmental institutions, non-profit organisations and higher education, as well as to students and the unemployed.
