Jeju Island

Tutorials

Efficient Compression and Queries of Large Graphs

Speakers: Fan Zhang (Guangzhou University)

Summary

As the volume and ubiquity of graphs increase, a compact graph representation becomes essential for enabling efficient storage, transfer, and processing of graphs. Given a graph, the graph summarization problem seeks a compact representation that comprises a summary graph and corrections, allowing for the exact recreation of the original graph from the representation. Studies in this field aim to explore general, queryable compressed storage structures for graph data, which are considered a potential solution for graph processing tasks in memory-constrained scenarios. In this tutorial, we first highlight the importance of graph summarization in a variety of applications and the unique challenges that need to be addressed. Subsequently, we provide an overview of the existing methods for the graph summarization problem and of research on its variants. Finally, we discuss future research directions in this important and growing research area.
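To make the "summary graph plus corrections" idea concrete, the following is a minimal sketch, not the method of any particular paper covered in the tutorial. It assumes the grouping of nodes into supernodes is already given, and it uses a simple majority rule (an assumption chosen for illustration) to decide when a superedge is worthwhile. The names `summarize`, `reconstruct`, and `candidate_pairs` are hypothetical.

```python
from itertools import combinations


def norm(u, v):
    """Store an undirected edge as a sorted tuple."""
    return (u, v) if u <= v else (v, u)


def candidate_pairs(groups, i, j):
    """All node pairs that a superedge between supernodes i and j stands for."""
    if i == j:
        return {norm(u, v) for u, v in combinations(sorted(groups[i]), 2)}
    return {norm(u, v) for u in groups[i] for v in groups[j]}


def summarize(edges, groups):
    """Return (superedges, plus, minus) for a given partition of the nodes.

    Assumption: a superedge is kept when more than half of the node pairs
    it stands for are real edges; everything else goes into corrections.
    """
    edges = {norm(u, v) for u, v in edges}
    superedges, plus, minus = set(), set(), set()
    ids = sorted(groups)
    for pos, i in enumerate(ids):
        for j in ids[pos:]:
            pairs = candidate_pairs(groups, i, j)
            present = pairs & edges
            if 2 * len(present) > len(pairs):
                superedges.add((i, j))
                minus |= pairs - present   # implied by the superedge but absent
            else:
                plus |= present            # real edges not covered by a superedge
    return superedges, plus, minus


def reconstruct(groups, superedges, plus, minus):
    """Expand every superedge, then apply the corrections."""
    edges = set()
    for i, j in superedges:
        edges |= candidate_pairs(groups, i, j)
    return (edges - minus) | plus


# Toy graph: 5 edges over nodes a..e, grouped into three supernodes.
edges = {("a", "c"), ("a", "d"), ("b", "c"), ("c", "d"), ("d", "e")}
groups = {"A": {"a", "b"}, "B": {"c", "d"}, "C": {"e"}}

superedges, plus, minus = summarize(edges, groups)
restored = reconstruct(groups, superedges, plus, minus)
```

In this toy example the representation holds two superedges and two corrections in place of five edges, and `restored` equals the original edge set, which is the exact-recreation property described above. Finding a good grouping is the hard part of the problem and is not addressed by this sketch.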

Bio

Fan Zhang is a professor at Guangzhou University. He is also the co-director of the Big Data Computing and Intelligence Institute and the executive deputy director of the Intelligent Transportation Joint Lab. His research interests focus on large-scale graph data, including cohesive subgraphs, graph summarization, network stability, and influence study. He has published over 30 papers in top-tier venues such as SIGMOD, KDD, VLDB, ICDE, AAAI, IJCAI, VLDB Journal, and TKDE, mostly as the first author or corresponding author. He received the CCF Technology Achievement Award in Natural Science in 2022 and the ACM SIGMOD China Rising Star Award in 2023. In recent years, he has served as an (S)PC member or reviewer for VLDB, KDD, TheWebConf, ICDE, AAAI, TKDE, etc. His research is supported by the National Natural Science Foundation of China and key enterprises such as Alibaba and South China Road & Bridge. More information can be found on his academic homepage (fanzhangcs.github.io).

Trustworthy Foundation Models with a Data-Centric Approach

Speakers: Wenjie Feng (University of Science and Technology of China), Dan Li (Sun Yat-sen University), Jian Lou (Sun Yat-sen University)

Summary

Foundation models have become a central infrastructure of modern AI systems, enabling stunning ability and strong generalization across tasks, domains, and modalities. However, their increasing scale, opacity, and deployment in high-stakes settings raise critical challenges regarding models' trustworthiness and controllability, including reliability, robustness, explainability, safety, and privacy preservation. Addressing these challenges requires moving beyond purely model-centric solutions toward a systematic data-centric perspective, in which model behaviors are understood, audited, and corrected through their dependence on data across the entire model lifecycle. This tutorial presents a comprehensive overview of trustworthy foundation models from a data-centric viewpoint. We begin with a concise introduction to foundation models and core notions of trustworthiness, clarifying problem definitions, evaluation criteria, and a unifying data attribution perspective that connects model behaviors to data pipelines. We then survey data-centric methodologies across the foundation model lifecycle, covering pre-training data-hygiene-based risk mitigation, trustworthiness-aware optimization during training, and post-training patching via data manipulation, including instruction tuning, machine unlearning, and knowledge editing. Throughout, we emphasize how all key data-centric stages fundamentally influence and shape trustworthy model behaviors. Building on these principles, the tutorial examines domain-specific data-centric practices in graph, time-series, and multimodal foundation models, highlighting structural, temporal, and cross-modal data challenges through representative examples and case studies. We conclude by discussing open challenges and future research directions at the intersection of data-centric learning and trustworthy foundation models. This tutorial is intended for researchers and practitioners in machine learning and AI. A general background in machine learning is sufficient; no prior expertise in trustworthiness or foundation models is required.

Bio

Wenjie Feng is a Professor in the School of Artificial Intelligence and Data Science at the University of Science and Technology of China. Previously, he was a postdoctoral researcher at the Institute of Data Science (IDS), National University of Singapore (NUS); he received his Ph.D. degree in 2020 from the Institute of Computing Technology, Chinese Academy of Sciences. His research interests focus on large-scale data mining for graphs and trustworthy machine learning. He has received the ECML-PKDD Best Student Paper Award. His work has been published in leading international venues, including CCS, ICML, NeurIPS, ACM MM, TheWebConf, and WSDM. He has also served as a program committee member or reviewer for venues such as KDD, ICML, NeurIPS, ICLR, TKDE, TKDD, and TDSC. For more information, please refer to his personal website at https://wenchieh.github.io

Dan Li is an Associate Professor in the School of Software Engineering at Sun Yat-sen University. Previously, she was a postdoctoral researcher at the Institute of Data Science (IDS), National University of Singapore (NUS). She obtained her Ph.D. from Nanyang Technological University (NTU). Prior to that, she received a Bachelor's degree from the University of Electronic Science and Technology of China (UESTC). She has received the PHM-AP Best Paper Award and the AIoTsys Best Paper Award. Her work has been published in leading international venues, including FSE, ICSE, ISSTA, and ICDE. She has also served as PC co-chair and area chair for ICSS and ICSOC, and as a program committee member or reviewer for venues such as AAAI, TKDE, and ECML-PKDD. For more information, please refer to her personal website at https://sites.google.com/view/dr-dan-li

Jian Lou is an Associate Professor in the School of Software Engineering at Sun Yat-sen University. Previously, he was a postdoctoral researcher with Professor Li Xiong at Emory University. He has received the CCS Distinguished Paper Award, the ACISP Best Paper Award, and the WI-IAT Best in Theoretical Paper Award. His work has been published in leading international venues, including CCS, S&P, USENIX Security, ICML, NeurIPS, CVPR, ICCV, and TheWebConf. He has also served as an area chair for ICLR and ICML, a senior program committee member for TheWebConf and AAAI, and a program committee member for CCS, VLDB, etc. For more information, please refer to his personal website at https://sites.google.com/view/jianlou

Reliable Visual Analytics with Dimensionality Reduction: Quality Evaluation and Interpretation of Projections

Speakers: Hyeon Jeon (Seoul National University), Takanori Fujiwara (University of Arizona), Rafael M. Martins (Linnaeus University)

Summary

Dimensionality reduction (DR) is widely used for visual analytics, but the insights obtained from these visualizations may often be unreliable. For example, DR projections distort the intrinsic structure of high-dimensional data in ways that may not be obvious at first glance, potentially leading analysts to inaccurate interpretations. Even reliable visual patterns may be hard to interpret regarding what exactly they convey about the underlying data, due to the often severe compression from hundreds (or thousands) of dimensions down to the visual space. In this tutorial, we discuss how to enhance the reliability of visual analytics with DR by focusing on two perspectives: quality evaluations and interpretations. While the former helps users identify or create projections with fewer distortions, the latter provides a reliable method for deriving insights from those projections. By combining lecture and coding exercises, we expect our tutorial to provide a grounded basis for audiences to use DR in a more reliable manner.
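As a small illustration of what a quality evaluation of a projection can look like, the sketch below computes a k-nearest-neighbor preservation score: the fraction of each point's k nearest neighbors in the original space that remain among its k nearest neighbors in the projection. This is one simple distortion measure chosen for illustration; it is an assumption, not necessarily a measure used in the tutorial, and the function names are hypothetical.

```python
import math


def knn(points, k):
    """For each point, return the index set of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        ranked = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append({j for _, j in ranked[:k]})
    return neighbors


def neighborhood_preservation(high, low, k):
    """Average share of high-dimensional neighbors kept in the projection."""
    hi, lo = knn(high, k), knn(low, k)
    kept = sum(len(a & b) for a, b in zip(hi, lo))
    return kept / (k * len(high))


# Two pairs of points, far apart along the third axis.
high = [(0, 0, 0), (1, 0, 0), (0, 0, 10), (1, 0, 10)]

# Projection that keeps the separating axis (x, z).
faithful = [(x, z) for x, y, z in high]
# Projection that drops it (x, y), so the two pairs collapse onto each other.
distorted = [(x, y) for x, y, z in high]

score_faithful = neighborhood_preservation(high, faithful, k=1)
score_distorted = neighborhood_preservation(high, distorted, k=1)
```

Here `score_faithful` is 1.0 and `score_distorted` is 0.0: both projections are valid 2D views of the same data, yet in the second one every point's apparent nearest neighbor actually lies far away in the original space, which is the kind of non-obvious distortion the abstract warns about.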

Bio

Hyeon Jeon is a Ph.D. student at Seoul National University, Korea. He works at the intersection of visual analytics and machine learning, specializing in making high-dimensional data analysis and DR more human-centered.

Takanori Fujiwara is an assistant professor at the University of Arizona. His expertise spans visual analytics and machine learning. He specializes in developing interactive dimensionality reduction techniques and contributes to advancing high-dimensional data analysis.

Rafael M. Martins is an assistant professor in Computer Science and Media Technology working at Linnaeus University, Sweden. He has contributed to the discussion in quality evaluation and interpretation of DR projections for over a decade, and is currently interested in its ramifications for explainable AI and interpretable machine learning.

Continual Recommender Systems

Speakers: Seunghan Lee (Korea University), Seunghyun Baek (Korea University), Dojun Hwang (Korea University), Hyunsik Yoo (University of Illinois Urbana-Champaign), SeongKu Kang (Korea University)

Summary

Modern recommender systems operate in uniquely dynamic settings: user interests, item pools, and popularity trends shift continuously, and models must adapt in real time without forgetting past preferences. While existing tutorials on continual or lifelong learning cover broad machine learning domains (e.g., vision and graphs), they do not address recommendation-specific demands, such as balancing stability and plasticity per user, handling cold-start items, and optimizing recommendation metrics under streaming feedback. This tutorial aims to make a timely contribution by filling that gap. We begin by reviewing the background and problem settings, followed by a comprehensive overview of existing approaches. We then highlight recent efforts to apply continual learning to practical deployment environments, such as resource-constrained systems and sequential interaction settings. Finally, we discuss open challenges and future research directions. We expect this tutorial to benefit researchers and practitioners in recommender systems, data mining, AI, and information retrieval across academia and industry.

Bio

Seunghan Lee is a first-year Master's student in the Department of Computer Science and Engineering at Korea University. His research focuses on learning from heterogeneous information in recommender systems, encompassing multi-modal content and complex user behaviors, as well as continual recommender systems. His work has been published in major conferences, including CIKM.

Seunghyun Baek is a first-year Master's student in the Department of Computer Science and Engineering at Korea University. He has worked on designing continually updated multi-stage pipelines for recommender systems. His research interests lie in LLM-based recommendation systems and continual learning for recommendation.

Dojun Hwang is currently a final-year B.S. student in the Department of Computer Science and Engineering at Korea University. He has worked on designing recommender systems, especially on using large language models as re-rankers. His research interests are large language models for recommendation and information retrieval.

Hyunsik Yoo is a fourth-year Ph.D. student in the Siebel School of Computing and Data Science at the University of Illinois Urbana-Champaign. His research focuses on developing data mining and machine learning techniques for recommender systems and graph mining models that are adaptive, trustworthy, and user-inclusive. His work has been published in major conferences, including KDD, SIGIR, TheWebConf, WSDM, and ICML. He has also served as a program committee member or reviewer for venues such as KDD, CIKM, TheWebConf Companion, AAAI, NeurIPS, DSAA, and TIST.

SeongKu Kang is an Assistant Professor in the Department of Computer Science and Engineering at Korea University. Prior to that, he was a postdoctoral researcher at the University of Illinois Urbana-Champaign. His research interests lie in data mining, recommender systems, and information retrieval. He has published more than 30 papers in major conferences such as KDD, TheWebConf, CIKM, SIGIR, and EMNLP. He received the Stars of Tomorrow Award from Microsoft Research Asia in 2023, and his paper was selected as a Best Paper at WSDM 2025. He has served as a program committee member or reviewer for venues including KDD, TheWebConf, AAAI, SIGIR, ACL, SDM, TIST, and TKDE, and was recognized as an outstanding reviewer at KDD.

Advances in Real-Time Processing of Longitudinal Data: From Statistical and Deep Learning Methods to Applications

Speakers: Ying-Ren Chien (National Taipei University of Technology), Pavel Loskot (Zhejiang University-UIUC Institute), Yu Gao (Midea Group)

Summary

Signal processing has long been concerned with processing sensor outputs in a wide range of applications. As sensors are being massively deployed in many current systems, it is timely to review the recent progress in signal processing methods and algorithms for processing multi-dimensional and possibly heterogeneous longitudinal data. The specific signal processing tasks to consider include filtering signals to suppress various distortions and to extract useful information, forecasting future samples, identifying anomalies and changes in signal characteristics, generating signals with desired properties, compressing signals, classifying and labeling signal segments, and others. Typical challenges include the strict deadlines of real-time processing, capturing patterns across multiple temporal scales, accounting for non-stationary spatial dependencies, and working with highly non-Gaussian distributions, to name a few. In this tutorial, we will cover recent developments in methods based on traditional statistical signal processing, discuss how deep learning methods that were developed for computer vision and natural language processing have been adapted to become effective for signal processing, and finally consider the issues that arise in practical implementations.

Bio

Ying-Ren Chien (Senior Member, IEEE) is a Full Professor in the Department of Electronic Engineering, National Taipei University of Technology, Taipei, China. His research interests include consumer electronics, multimedia denoising algorithms, adaptive signal processing theory, active noise control, machine learning, the Internet of Things, and interference cancellation. Dr. Chien has received best paper awards at ICCCAS 2007, ROCKLING 2017, and IEEE ISPACS 2021, as well as the IEEE CESoc/CTSoc Service Awards in 2019, the NSC/MOST Special Outstanding Talent Award in 2021, 2023, and 2024, the Excellent Research-Teacher Award in 2018 and 2022, and the Excellent Teaching Award in 2021. From 2023 to 2024, he was Vice Chair of the IEEE Consumer Technology Society Virtual Reality, Augmented Reality, and Metaverse (VAM) Technical Committee (TC). Since 2025, he has been the Secretary of the IEEE CTSoc Audio/Video Systems and Signal Processing TC. He is currently an Associate Editor for IEEE Transactions on Consumer Electronics.

Pavel Loskot (Senior Member, IEEE) joined the ZJU-UIUC Institute in January 2021 as an Associate Professor after 14 years with the College of Engineering, Swansea University, UK. Over the past 30 years, he has been involved in numerous collaborative research and development projects, and has also held a number of paid consultancy contracts with industry, mainly but not only in wireless communications. His research interests focus on mathematical and probabilistic modeling, statistical and digital signal processing, and machine learning for multi-sensor, tabular, and longitudinal data. He has received 8 best paper awards and delivered 18 tutorials at international conferences, including BigComp 2024, APSIPA ASC 2017/2021/2022/2024, and IEEE MILCOM 2018/2019. He is a Fellow of the HEA, UK, a Recognized Research Supervisor of the UKCGE, and an IARIA Fellow. He is an Editor of ICT Express.

Yu Gao obtained BSc and MSc degrees in EE from USTC. He is currently the Head of Human-Computer Interaction Algorithms at Midea Group's AI Innovation Center and the Director of the National New Generation AI Innovation Platform for Home Robots. He holds over 50 domestic and international patents in speech processing and NLP, covering intelligent speech and language algorithms and AI industrialization. He has led the development of 10+ IEEE and national standards. He is also an Executive Member of the CCF Speech & Dialogue Special Committee and a Member of the National Standardization Committee (TC46/TC28).