Ingeniería de los Datos Como Soporte a los Grafos de Conocimiento

Lead researcher(s): Inmaculada Concepción Hernández Salmerón / David Ruiz Cortés

01/06/2020 – 31/05/2022

Knowledge graphs enable efficient and flexible data storage, and they are now widely used by researchers and leading companies alike (Google, Facebook, Microsoft, Amazon, or Netflix).

Unfortunately, creating and maintaining these graphs is not trivial, whether through automated information extraction, NLP processes, or manual curation.

In this project, we focus on the essential data engineering tasks that produce knowledge graphs with complete, interlinked, trustworthy information suitable for data science analysis, namely creating, integrating, and refining the graphs, and optimizing our techniques (a sine qua non for any engineering approach).



Ingeniería de Datos Aplicada a la Extracción, Semantización, Refinamiento y Explotación de Grafos de Conocimiento a Escala Web

Lead researcher(s): Rafael Corchuelo Gil / David Ruiz Cortés

01/01/2020 – 31/12/2022

In this project, we aim to deal with the Web of Data, easily the largest data repository in existence today. Although there are several approaches to generating linked open data in a structured, machine-readable format, many organisations still offer their data only in tabular form, published as HTML web pages.

To leverage the potential benefits of these data, it is necessary to process them, namely: extracting data from tables, endowing the data with semantics to build a knowledge graph, refining the graph to complete it and prune errors, and exploiting the graph so that users can query it with the help of virtual assistants.
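
As a rough illustration of the first two steps (extracting data from an HTML table and turning it into graph statements), the following minimal sketch uses only the Python standard library. It is not the project's tooling: the `TableExtractor` class, the `table_to_triples` function, the `http://example.org/` namespace, and the sample table are all hypothetical, and the sketch assumes a simple table whose first row is a header and whose first column identifies the entity.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the cell text of every table row found in a page."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_triples(html, base="http://example.org/"):
    """Turns each non-empty cell into a (subject, predicate, object) triple.

    Assumption: the first row is the header and the first column names
    the entity. The base namespace is a placeholder.
    """
    parser = TableExtractor()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *records = parser.rows
    triples = []
    for record in records:
        subject = base + record[0].replace(" ", "_")
        for column, value in zip(header[1:], record[1:]):
            if value:  # empty cells are gaps that a refinement step could fill
                predicate = base + column.replace(" ", "_")
                triples.append((subject, predicate, value))
    return triples


# Hypothetical sample page fragment, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Country</th><th>Region</th></tr>
  <tr><td>Seville</td><td>Spain</td><td>Andalusia</td></tr>
  <tr><td>Porto</td><td>Portugal</td><td></td></tr>
</table>
"""

triples = table_to_triples(SAMPLE_HTML)
for triple in triples:
    print(triple)
```

The empty `Region` cell for Porto produces no triple, which is the kind of incompleteness that the refinement step described above would aim to repair.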



Herramientas para la Ciencia de los Datos de la Web

Lead researcher(s): David Ruiz Cortés / Rafael Corchuelo Gil

30/12/2016 – 29/12/2020

In this project, we explore Web Data Science, which is likely to become one of the most active research areas in the short term. We cover several related topics, such as clustering and extracting information from web documents, endowing information with semantics and detecting duplicated information, performing advanced opinion analysis, and validating these approaches on big datasets.

The challenge is for our proposals to require little or no human intervention, so that they can scale to the dimensions of the Web.



Semantización y Publicación de datos Abiertos para la Integración de Servicios Electrónicos

Lead researcher(s): David Ruiz Cortés / Rafael Corchuelo Gil

01/01/2014 – 31/12/2017

Our goal in this project is to carry out applied research that yields knowledge, techniques, and tools our industry can use to reduce the production costs associated with publishing semantically meaningful linked open data, creating and integrating added-value electronic services, and analysing citizens' reactions on social media.



Métodos y herramientas para la integración de grafos de conocimiento Web

Lead researcher(s): David Ruiz Cortés / Inmaculada Concepción Hernández Salmerón

01/01/2022 – 31/12/2022

Data integration tasks such as the creation and refinement of knowledge graphs increasingly have to deal with matching and fusing data from many sources, e.g., different web sites, existing knowledge bases, and repositories. Integrating new data sources and their entities into a knowledge graph (KG) is challenging because of the typically large number of different kinds of entities and relationships, the high degree of heterogeneity in their representations, and the often low data quality, with frequently incomplete, wrong, or contradictory information.

In this project, we deal with different tasks related to data integration, specifically in the context of knowledge graphs. We use embeddings to represent data in a low-dimensional space, which enables the application of state-of-the-art machine learning techniques to them.
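
To make the idea of embedding-based matching concrete, here is a minimal sketch, not the project's actual method. It assumes that entities from two sources have already been embedded into the same low-dimensional space and links each source entity to its most similar target entity by cosine similarity. The vectors, the entity identifiers, the `match_entities` function, and the 0.9 threshold are all hypothetical values chosen for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_entities(source, target, threshold=0.9):
    """Links each source entity to its most similar target entity.

    A link is kept only if the similarity reaches the threshold
    (an assumed cut-off; a real system would tune it on labelled data).
    """
    matches = {}
    for name, vector in source.items():
        best_name, best_score = None, threshold
        for candidate, candidate_vector in target.items():
            score = cosine_similarity(vector, candidate_vector)
            if score >= best_score:
                best_name, best_score = candidate, score
        if best_name is not None:
            matches[name] = best_name
    return matches


# Hand-made toy embeddings; real ones would come from a trained model.
SOURCE = {
    "src:Seville": [0.9, 0.1, 0.0],
    "src:Porto": [0.1, 0.9, 0.1],
}
TARGET = {
    "tgt:Sevilla": [0.88, 0.12, 0.02],
    "tgt:Oporto": [0.12, 0.85, 0.15],
    "tgt:Lisbon": [0.0, 0.2, 0.95],
}

matches = match_entities(SOURCE, TARGET)
print(matches)
```

Comparing every pair of entities is quadratic in the number of entities, so at web scale a sketch like this would need blocking or an approximate nearest-neighbour index.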
