Data Science and the Fourth Paradigm: Data-Intensive Scientific Discovery

Data are being collected at unprecedented scale and speed in all industrial sectors and scientific domains, e.g., at the Large Hadron Collider at CERN. This means that scientific breakthroughs increasingly depend on advanced data analytics capabilities that help scientists organize, explore, and analyze massive volumes of data in order to gain scientific knowledge. The process of doing so, termed data science, is quickly emerging as a key foundational discipline for most domain sciences, as mathematics has been for centuries.

Indeed, data scientists are said to have “the sexiest job of the 21st century”. This event will feature three exciting talks that explain the basics of data science and how it enables a new paradigm of data-intensive scientific discovery, coined “the Fourth Paradigm” by the late Jim Gray, winner of the Turing Award – the “Nobel” prize of computer science. We invite everyone with an interest in data science, especially domain scientists using data intensively, to participate in this event.


14:00-14:10: Welcome

14:10-15:10: Democratizing Data Science by Bill Howe
Why does data science remain so difficult? Despite years of focused research in systems and methods, data science remains a high-touch, attention-intensive exercise. There has been a “Cambrian explosion” of big data systems proposed and evaluated in the last eight years, but relatively little understanding of how these systems, or the ideas they implement, compare with and complement one another.

15:10-15:25: Break

15:25-16:10: Data Science and some (classical) challenges by Søren Højsgaard
Once upon a time, statisticians (data scientists of the old age) often distinguished between data from planned experiments and data from observational studies. They also focused on writing down the questions to be asked (sometimes phrased as hypotheses) in carefully prepared protocols before data was collected. The purpose was, amongst other things, to protect against jumping to unwarranted conclusions based on spurious data. Modern data science together with powerful computers offers new possibilities for the type and scale of problems that can be handled. Perhaps some of the good old practices should be revitalized in this context?

15:25-16:10: Extracting Value from Big Data – The Case of Vehicular Traffic Data by Christian S. Jensen
Almost all areas of everyday life are accompanied, guided, and influenced by computing and communication devices that are embedded in large networks, most notably the Internet. These devices produce increasing amounts of data. There is consensus that society and businesses can benefit substantially from the ability to base their functioning on these large amounts of data, often called big data. Notions such as “data-driven business” and “data-driven society” have been advanced, suggesting that entities that are able to base their decisions and operation on data are capable of being more competitive than those that are not. Big data is slated to have a profound effect on society. We are in the middle of a revolution in the way we live, work, and interact.

16:10-16:30: Plenary discussion

22 June, 14:00 - 16:30 May, date to be confirmed
Department of Computer Science, Aalborg University, Selma Lagerløfs Vej 300, room 0.1.95, 9220 Aalborg Øst
Free of charge, but registration (and notice of cancellation, if applicable) is required

