The 9th IEEE International Conference on Data Science and Advanced Analytics

October 13-16, 2022
Online

Keynote Speakers

Title: Knowledge-Guided Machine Learning: A New Framework for Accelerating Scientific Discovery and Addressing Global Environmental Challenges


Abstract:

Process-based models of dynamical systems are often used to study engineering and environmental systems. Despite their extensive use, these models have several well-known limitations due to incomplete or inaccurate representations of the physical processes being modeled. There is a tremendous opportunity to systematically advance modeling in these domains by using state-of-the-art machine learning (ML) methods that have already revolutionized computer vision and language translation. However, capturing this opportunity is contingent on a paradigm shift in data-intensive scientific discovery, since the "black box" use of ML often leads to serious false discoveries in scientific applications. Because the hypothesis space of scientific applications is often complex and exponentially large, an uninformed data-driven search can easily select a highly complex model that is neither generalizable nor physically interpretable, resulting in the discovery of spurious relationships, predictors, and patterns. This problem becomes worse when there is a scarcity of labeled samples, which is quite common in science and engineering domains. This talk makes the case that in real-world systems that are governed by physical processes, there is an opportunity to take advantage of fundamental physical principles to inform the search for a physically meaningful and accurate ML model. While this talk will illustrate the potential of the knowledge-guided machine learning (KGML) paradigm in the context of environmental problems (e.g., freshwater science, hydrology, agronomy), the paradigm has the potential to greatly advance the pace of discovery in a diverse set of disciplines where mechanistic models are used, e.g., climate science, weather forecasting, and pandemic management.

Bio:

Vipin Kumar is a Regents Professor at the University of Minnesota, where he holds the William Norris Endowed Chair in the Department of Computer Science and Engineering. Kumar received the B.E. degree in Electronics & Communication Engineering from Indian Institute of Technology Roorkee (formerly, University of Roorkee), India, in 1977, the M.E. degree in Electronics Engineering from Philips International Institute, Eindhoven, Netherlands, in 1979, and the Ph.D. degree in Computer Science from University of Maryland, College Park, in 1982. He also served as the Head of the Computer Science and Engineering Department from 2005 to 2015 and the Director of Army High Performance Computing Research Center (AHPCRC) from 1998 to 2005.

Kumar's current research interests span data mining, high-performance computing, and their applications in climate/ecosystems and health care. His research has resulted in the development of the isoefficiency metric for evaluating the scalability of parallel algorithms, as well as highly efficient parallel algorithms and software for sparse matrix factorization (PSPASES) and graph partitioning (METIS, ParMetis, hMetis). He has authored over 300 research articles and has coedited or coauthored 10 books, including two textbooks, "Introduction to Parallel Computing" and "Introduction to Data Mining", that are used worldwide and have been translated into many languages. Kumar's current major research focus is on bringing the power of big data and machine learning to understand the impact of human-induced changes on the Earth and its environment. Kumar served as the Lead PI of a 5-year, $10 million project, "Understanding Climate Change - A Data Driven Approach", funded by the NSF's Expeditions in Computing program, which is aimed at pushing the boundaries of computer science research.

Kumar has served as chair/co-chair for many international conferences in the areas of data mining, big data, and high performance computing, including the 25th SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2019), the 2015 IEEE International Conference on Big Data, the IEEE International Conference on Data Mining (2002), and the International Parallel and Distributed Processing Symposium (2001). Kumar co-founded the SIAM International Conference on Data Mining and served as a founding co-editor-in-chief of the Journal of Statistical Analysis and Data Mining (an official journal of the American Statistical Association). Currently, Kumar serves on the steering committees of the SIAM International Conference on Data Mining and the IEEE International Conference on Data Mining, and is series editor for the Data Mining and Knowledge Discovery Book Series published by CRC Press/Chapman Hall.

Kumar has been elected a Fellow of the American Association for the Advancement of Science (AAAS), the Association for Computing Machinery (ACM), the Institute of Electrical and Electronics Engineers (IEEE), and the Society for Industrial and Applied Mathematics (SIAM). He received the Distinguished Alumnus Award from the Indian Institute of Technology (IIT) Roorkee (2013), the Distinguished Alumnus Award from the Computer Science Department, University of Maryland College Park (2009), and the IEEE Computer Society's Technical Achievement Award (2005). Kumar's foundational research in data mining and high performance computing has been honored by the ACM SIGKDD 2012 Innovation Award, which is the highest award for technical excellence in the field of Knowledge Discovery and Data Mining (KDD), and the 2016 IEEE Computer Society Sidney Fernbach Award, one of the IEEE Computer Society's highest awards in high-performance computing.

Title: Visual Domain Adaptation in the Deep Learning Era


Abstract:

As computer vision systems are being deployed in mission-critical applications whose predictions have real-world impact, but where real-world testing data statistics differ significantly from lab-collected training data, domain adaptation (DA) is gaining increasing societal importance. The aim of this talk is to give an overview of visual domain adaptation methods, starting with a brief introduction and a recap of traditional domain adaptation algorithms proposed before the deep learning era. Then, I will provide an overview of the main trends in deep domain adaptation, and I will discuss how to handle situations that depart from the classic domain adaptation setting, such as multi-domain learning, domain generalization, test-time adaptation, or source-free domain adaptation. During the talk, I will discuss different DA application scenarios such as autonomous driving, visual localization, biomedical imaging, biometry, and surveillance.

Bio:

Gabriela Csurka is a Principal Scientist at NAVER LABS Europe, France. Her main research interests are in computer vision for image understanding, 3D reconstruction, and visual localization, as well as domain adaptation and transfer learning. She has contributed to more than 100 scientific communications, many in major CV conferences and journals. Concerning domain adaptation, in addition to related publications, she has given several invited talks and organized a related tutorial at ECCV’20. In 2017 she edited a Springer book entitled Domain Adaptation for Computer Vision Applications, and she recently co-authored a Morgan & Claypool book entitled Visual Domain Adaptation in the Deep Learning Era, which is in the process of publication.

Title: Some bad practices in data analysis and machine learning


Abstract:

With the democratization of data analysis and machine learning through many easy-to-use platforms, many lay analysts are now involved in analyzing data to hopefully produce actionable insight, as well as developing tools for modelling their data. Unlike professional statisticians, who have the benefit of many years of rigorous training and many years of practising and perfecting the art of data analysis, lay analysts (like me, a computer scientist and logician) have rather ad hoc training. As a result, we have developed some bad data analysis habits, and some of us have even irresponsibly propagated these. In this talk, I will explain and bring attention to a few of these bad habits (including misusing principal component analysis as a dimension reduction tool, misunderstanding correlation as association, and mistreating accuracy as a one-dimensional performance measure), as well as discuss some of the impact of these bad habits (e.g., self-perpetuation of biased datasets).

Bio:

Limsoon Wong is Kwan-Im-Thong-Hood-Cho-Temple Chair Professor in the School of Computing at the National University of Singapore (NUS). He was also a professor (now honorary) of pathology in the Yong Loo Lin School of Medicine at NUS. Before coming to NUS, he was the Deputy Executive Director for Research at A*STAR's Institute for Infocomm Research. He currently works mostly on knowledge discovery technologies and their application to biomedicine. He has also done, in the earlier part of his career, significant research in database query language theory and finite model theory, as well as significant development work in broad-scale data integration systems. Limsoon has written about 300 research papers, some of which are among the best cited of their respective fields. Limsoon is a Fellow of the ACM, named in 2013 for his contributions to database theory and computational biology. Some of his other recent awards include the 2003 FEER Asian Innovation Gold Award for his work on treatment optimization of childhood leukemias, the 2006 Singapore Youth Award Medal of Commendation for his sustained contributions to science and technology, and the ICDT 2014 Test of Time Award for his work on naturally embedded query languages. He was also conferred, in 2014, a Public Administration Medal (Bronze) by the Singapore Government for outstanding efficiency, competence, and industry. He serves/served on the editorial boards of Journal of Bioinformatics and Computational Biology, Bioinformatics, Biology Direct, Drug Discovery Today, IEEE/ACM Transactions on Computational Biology and Bioinformatics, Genomics Proteomics & Bioinformatics, Journal of Biomedical Semantics, Methods, Scientific Reports, Information Systems, and IEEE Transactions on Big Data. He is also an ACM Books Area Editor. Limsoon received his BSc(Eng) in 1988 from Imperial College London and his PhD in 1994 from University of Pennsylvania.

Title: Online Learning of Data Streams with Concept Drift


Abstract:

A growing number of applications operate in such a way that new data arrive with time, i.e., as a data stream. We do not have an offline data set for training. We can learn only when data arrive, either as a single data sample or as a chunk of data samples. The challenge of learning a data stream is to continuously learn from such an incoming stream. To make things worse, the underlying distribution of the data may change with time (i.e., concept drift). This talk first describes the learning-in-the-model-space framework, which can be used effectively to learn from data streams with few assumptions. Online fault diagnosis will be used as an example to illustrate how learning-in-the-model-space can facilitate detecting and classifying unknown faults. Then the talk will present an ensemble approach that can adapt ensemble diversity after a drift is detected in order to learn the new concept quickly and more accurately. Finally, the talk will introduce a new method for detecting both real and virtual drifts more accurately.

Bio:

Xin Yao is a Chair Professor of Computer Science at the Southern University of Science and Technology, Shenzhen, China, and a part-time Professor of Computer Science at the University of Birmingham, UK. His major research interests include evolutionary computation, ensemble learning and search-based software engineering. His work won the 2001 IEEE Donald G. Fink Prize Paper Award; 2010, 2015 and 2017 IEEE Transactions on Evolutionary Computation Outstanding Paper Awards; 2010 BT Gordon Radley Award for Best Author of Innovation (Finalist); 2011 IEEE Transactions on Neural Networks Outstanding Paper Award; and many other best paper awards. He received a prestigious Royal Society Wolfson Research Merit Award in 2012 and the IEEE CIS Evolutionary Computation Pioneer Award in 2013. He was recently selected to receive the 2020 IEEE Frank Rosenblatt Award.

© Copyright | DSAA 2022