1st Workshop on New Ideas for Large-Scale Neurosymbolic Learning Systems (LS-NSL)
Held in conjunction with the 51st International Conference on Very Large Data Bases (VLDB) - London, United Kingdom, September 5, 2025.
About
Deep learning has been a striking success across engineering and science. However, criticism of it is growing as scientists and practitioners apply it more broadly. Neurosymbolic learning (NSL) promises to transform deep learning by combining the strong induction capabilities of neural models with rigorous deduction from symbolic knowledge representation and reasoning techniques. Although NSL has shown its potential in a range of application domains, including image and video understanding, natural language processing, and data management, several questions remain open as to whether current techniques are mature enough to be applied to large-scale, real-world problems.
This workshop aims to:
- Identify key large-scale, real-world scenarios from different domains, such as computer vision and data management, that can benefit from NSL techniques.
- Identify key techniques from the database literature that could enhance NSL techniques for training and inference.
- Identify new theoretical and engineering challenges that arise when integrating deep networks with symbolic systems and propose solutions towards overcoming them.
- Discuss scalable techniques for training deep networks using symbolic solvers.
- Investigate benchmarks across different application domains to assess the strengths of NSL techniques in runtime efficiency, task-specific accuracy, and other aspects.
Topics of Interest
The topics of interest include (but are not limited to):
- Large-scale NSL applications, e.g., from computer vision, natural language processing, and data management.
- Scalable integration of deep networks with symbolic systems, such as logic programs or combinatorial solvers.
- Scalable techniques to train deep networks subject to symbolic constraints or logical theories.
- New NSL architectures and semantics.
- Uncertain databases and logic programs.
- Query answering via transformers and graph neural networks.
- Data management over new hardware.
- New forms of databases, e.g., databases to store tensor data.
- Database creation and querying via machine learning.
- NSL benchmarks.
Call For Contributions
We welcome regular papers (up to eight pages, including the bibliography) that present complete, novel research outcomes not previously presented elsewhere, and extended abstracts (up to four pages, including the bibliography) on preliminary results that can trigger discussion. We also welcome papers accepted at VLDB 2025 or at other recent top-tier AI, machine learning, and database venues. At least one author of each accepted paper is expected to register for the workshop and give an oral presentation.
The proceedings of all accepted papers will be hosted by VLDB.
Paper Submission Instructions
Submitted manuscripts must be in PDF format and use the VLDB 2025 template. Submissions will be single-blind, and authors should comply with the conflict of interest policy for ACM publications.
Submission site: https://cmt3.research.microsoft.com/LSNSL2025/
Acknowledgement: The Microsoft CMT service was used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.
Important Dates
- Paper submission: Friday, May 30th, 2025
- Notification of acceptance: Friday, June 27th, 2025
- Camera-ready submission: Friday, July 11th, 2025
- Workshop date: September 5th, 2025
Programme
The workshop will be held on September 5, 2025, at the Queen Elizabeth II Centre (QEII Centre), London, UK, room Albert.
- [1.30 pm - 2.15 pm]. Keynote by Dan Roth. On Retrieving & Reasoning LLMs: Myths, Merits, and How to Move Forward
- [2.15 pm - 2.30 pm]. Hoa Le Thi, Angela Bonifati, Andrea Mauri. Graph Consistency Rule Mining with LLMs: An Exploratory Study
- [2.30 pm - 2.45 pm]. Abelardo Carlos Martinez Lorenzo, Alexander Perfilyev, Volker Markl, Martha Clokie, Thomas Sicheritz-Pontén, Zoi Kaoudi. Modular Neuro-Symbolic Knowledge Graph Completion
- [2.45 pm - 3.00 pm]. Essam Mansour. AI-Enabled Query Engine: LLM-GNN Integration for Evolving and Incomplete Knowledge Graphs
- [3.00 pm - 3.30 pm]. Coffee break
- [3.30 pm - 4.15 pm]. Keynote by Jeff Pan. Decoding the Interaction of Symbolic and Parametric Knowledge
- [4.15 pm - 4.30 pm]. Pranava Madhyastha. ASP Scaffolds for Robust Reasoning and Decoding
- [4.30 pm - 4.45 pm]. Ofer Idan. Few Shots Text-to-Image Retrieval: New Benchmarking Dataset and Optimization Methods
- [4.45 pm - 5.00 pm]. Mykhailo Buleshnyi, Anna Polova, Zsolt Zombori, Michael Benedikt. Constraint-aware Learning of Probabilistic Sequential Models for Multi-Label Classification
Keynotes
Dan Roth, University of Pennsylvania and Oracle

Title: On Retrieving & Reasoning LLMs: Myths, Merits, and How to Move Forward
Abstract: The rapid progress made over the last few years in generating linguistically coherent natural language has blurred, in the minds of many, the difference between natural language generation, understanding, knowledge retrieval and use, and the ability to reason with respect to the world. Nevertheless, reliably and consistently supporting high-level decisions that depend on natural language understanding and heterogeneous information retrieval is still difficult, mostly, but not only, since most of these tasks are computationally more complex than language models can support. I will discuss some of the challenges underlying reasoning and information access and argue that we should exploit what LLMs do well while delegating responsibility to special purpose models and solvers for decision making. I will present some of our work in this space, focusing on supporting reasoning and information access via Neuro-symbolic methods.
Bio: Dan Roth is the Eduardo D. Glandt Distinguished Professor at the University of Pennsylvania and Chief AI Scientist at Oracle. Until June 2024, Dan was a VP/Distinguished Scientist at AWS AI, where he led the scientific effort behind Amazon’s first-generation GenAI products, including Titan Models, Amazon Q, and Amazon Bedrock. Dan is a Fellow of the AAAS, ACM, AAAI, and ACL, and a recipient of the IJCAI John McCarthy Award “for major conceptual and theoretical advances in the modeling of natural language understanding, machine learning, and reasoning.” He has published broadly in natural language processing, machine learning, knowledge representation and reasoning, and learning theory, was the Editor-in-Chief of the Journal of Artificial Intelligence Research (JAIR), and has served as a Program Chair and Conference Chair for the major conferences in his research areas. Roth has been involved in several ML/NLP/GenAI startups in domains that range from legal and compliance to health care. Dan received his B.A. summa cum laude in Mathematics from the Technion, Israel, and his Ph.D. in Computer Science from Harvard University in 1995.
Jeff Pan, University of Edinburgh and Huawei Labs

Title: Decoding the Interaction of Symbolic and Parametric Knowledge
Abstract: Large Language Models (LLMs) have taken Knowledge Representation – and the world – by storm. This inflection point marks a shift from symbolic knowledge representation to a renewed focus on the hybrid representation of both symbolic knowledge and parametric knowledge. This is a big step for the field of Knowledge Representation. In this talk, I will briefly introduce some initial findings in such a big step. If time allows, I will also speculate on opportunities and visions that the renewed focus brings.
Bio: Jeff Pan is Professor of Knowledge Computing in the School of Informatics at the University of Edinburgh. He is a chair of the Knowledge Graphs group at the Alan Turing Institute. He is the Chief Editor and main author of the first book on Knowledge Graphs. Recently, he teamed up with many group leaders around the world on a visionary paper on large language models and knowledge graphs.
Program Committee
- Victor Gutierrez Basulto, Cardiff University
- Vaishak Belle, University of Edinburgh
- Angela Bonifati, Lyon 1 University
- Gianluca Cima, Sapienza University of Rome
- Floris Geerts, University of Antwerp
- Christoph Haase, University of Oxford
- Ziyang Li, University of Pennsylvania
- Ankur Mali, University of South Florida
- Nikos Ntarmos, Huawei Labs
- Hai Pham, Samsung AI
- Ernesto Jimenez Ruiz, City St George’s, University of London
- Luciano Serafini, Fondazione Bruno Kessler
- Gerardo I. Simari, Universidad Nacional del Sur
Organizers
- Efthymia (Efi) Tsamoura, Huawei Labs
- Pablo Barceló, Universidad Católica de Chile
- Jacopo Urbani, Vrije Universiteit Amsterdam
For any questions, please reach out to Efi at efthymia.tsamoura@gmail.com.