Research in data-centric programming
Data-centric programming is a research topic at the forefront of data science. Today, most of the effort in building artificial intelligence applications goes into data processing. The challenges stem from the complexity of data, which may be structured to varying degrees, very large in volume, distributed, and heterogeneous. Data-centric programming draws on topics studied in several areas of computer science, including programming languages, data management, analytics, artificial intelligence, knowledge extraction, and distributed programming. In the Tyrex project-team, we conduct research at the intersection of these areas.
With the ANR-funded CLEAR project, we investigate programming techniques that facilitate extracting value from big data. This includes studying how to query very large graphs efficiently, how to generate distributed code (e.g., for Spark) optimized for processing huge datasets, and how to develop highly scalable, interpretable predictive models.