The Data Science theme focuses on research on systems and methods for managing, obtaining insights and acting on data.

Societal context

With the ongoing digitization of society, ever more data are produced in both the private and the public sector. Social media companies, retailers, telecom companies and banks, to name but a few, collect large amounts of digital data related to their products, services and clients. The same increasingly applies in the public sphere, for example to government organizations, health care, education, the tax office, the judiciary, the police and science itself.

The main challenge for data scientists is to extract valuable knowledge from these data. To do that, they have to solve problems related to the size of data sets, the quality of data, the management of databases, data sets that change over time, structure or the lack thereof in data, and ways of finding patterns in the data, for example by using statistics and machine learning.

Scientific challenges

In the domain of data analysis, we study at IvI how to develop data-driven methods to understand content, to analyse and predict user behaviour, and to make sense of context and information.

To gain valuable insights from data, analysis alone is not enough. Questions arise about how people should best interact with the outcomes of the data analysis. How can we design and build systems that help people understand and work with data? How can we best help people understand large collections of multimedia data (images, video, text, audio, graphs…)?

As data science finds ever more applications in society, it starts to affect people’s personal lives. This raises all kinds of ethical questions, which we also study at IvI. How can data be shared in a trustworthy and transparent way? How can we make sure that data used by intelligent systems are reliable, fair and unbiased, so that society can benefit most? How can we ensure that advances in data science lead to advances in human values, dignity, wellbeing and flourishing?

Discovering a host of data sources
Automatically clean up data sets to prevent ‘garbage in, garbage out’
Multilingual language technology that goes beyond where ChatGPT ends
Making better use of health data without sacrificing privacy
Making data management responsible

IvI research groups associated with the Data Science theme

With Data Science as their primary focus:

  • INtelligent Data Engineering Lab (INDElab)

With Data Science as their secondary focus:

  • Amsterdam Machine Learning Lab (AMLab)
  • Complex Cyber Infrastructure (CCI)
  • Computer Vision research group (CV)
  • Digital Interactions Lab (DIL)
  • Information Retrieval Lab (IRLab)
  • Language Technology Lab (LTL)
  • Multimedia Analytics (MultiX)
  • MultiScale Networked Systems (MNS)
  • Parallel Computing Systems (PCS)
  • Quantitative Healthcare Analysis group (qurAI)
  • Socially Intelligent Artificial Systems (SIAS)
  • Theory of Computer Science (TCS)
  • Video & Image Sense Lab (VIS)