Domain Specific Systems for Information Extraction and Retrieval (DoSSIER)
What is DoSSIER?
DoSSIER is an EU Horizon 2020 ITN/ETN (Innovative Training Network / European Training Network) on Domain Specific Systems for Information Extraction and Retrieval. DoSSIER will elucidate, model, and address the different information needs of professional users. It brings together a team of leading Information Retrieval (IR) experts from 5 EU member states who, together with 3 academic partners (universities in the US, Japan, and Australia) and 11 industrial partners (SMEs and large corporations), will produce fundamental insights into how users comprehend, formulate, and access information in professional environments.
DoSSIER research is structured in three areas:
- Models: fundamental models of users and domain specificity,
- Methods: contextual and personalized search, and
- Applications: workflow, task and the interface.
The result will be a new generation of information access systems, which will accelerate innovation cycles in EU academia and industry. To be both concrete and generic, DoSSIER consists of 8 projects that each target a specific domain and 7 projects that act horizontally across domains. Three domains are used: science & technology innovation, law, and health. Questions that are currently unanswerable (e.g. "What is the key innovation difference between these two patents?") will become answerable either directly by a system or through the development of cognition-enhancing instruments for interacting with information.
Overview of the Research Programme
Objectives: DoSSIER will achieve the following:
- Make a ground-breaking impact on professional search processes and systems through research in the areas of:
  - Applications: Develop new hypotheses for information search behaviour based on observation and analysis of professional information search in practice. Develop reproducibility- and repeatability-aware user studies on newly developed tools. Explore the impact of the validated hypotheses on the design of tools to support professional search.
  - Models: Develop models for users, contexts, and tasks, to represent the existing and accrued knowledge and information needs of users carrying out professional search tasks. These representations will be grounded in interdisciplinary research, combining computer science with economics and decision theory. These models will improve the estimates of topical relevance, diversity, and credibility of documents returned by the system to satisfy the information needs of these users for a specific task in a specific context.
  - Methods: Operationalise information models toward ground-breaking approaches for context-based professional search, building on both explicit semantics (e.g. linked data, semantic web technologies) and latent semantics (e.g. word embeddings, deep learning). Additionally, develop validation procedures, based on the thorough experimental evaluation practice in information retrieval, that can account for the relatively small scale (i.e. high subjectivity) of professional search.
- Train a new generation of scientifically principled, creative, entrepreneurial, and innovative researchers with the academic and industrial experience necessary to make a significant impact on professional search in Europe, and hence on the European economy.
- Foster excellence by structuring research and doctoral training so that it leads to a professional certification of the following competences: research knowledge and intellectual abilities; research personal effectiveness; research governance and organization (including ethics and sustainability); and researcher engagement, influence and impact.