Università Roma Tre
Big Data & Databases Research Group


The group is devoted to developing new principles, methods, and tools for organizing and managing information in the form of databases. The focus is on the new requirements generated by the availability of big data, i.e., massive amounts of information whose size exceeds the capacity of conventional database systems. The data in the sources can be heterogeneous, unstructured, and inconsistent, and the goal is to develop effective and scalable solutions for their extraction, integration, management, and analysis. The overall approach is to tackle problems of practical significance, providing both general methods (with a theoretical background where relevant) and concrete tools that demonstrate the approach.

Topics

Members

Paolo Atzeni
Full professor

Riccardo Torlone
Full professor

Paolo Merialdo
Full professor

Luca Cabibbo
Associate professor

Valter Crescenzi
Assistant professor

Donatella Firmani
Assistant professor (temporary)

Antonio Maccioni
Post doc

Disheng Qiu
Post doc

Luigi Bellomarini
PhD student

Matteo Cannaviccio
PhD student

Lorenzo Luce
PhD student

Federico Piai
PhD student

Main Publications

Journals

Angela Bonifati, Werner Nutt, Riccardo Torlone, Jan Van den Bussche: Mapping-equivalence and oid-equivalence of single-function object-creating conjunctive queries. VLDB J. (2016)
Roberto De Virgilio, Antonio Maccioni, Riccardo Torlone: Approximate querying of RDF graphs via path alignment. Distributed and Parallel Databases 33(4): 555-581 (2015)
Valter Crescenzi, Paolo Merialdo, Disheng Qiu: Crowdsourcing large scale wrapper inference. Distributed and Parallel Databases 33(1): 95-122 (2015)
Paolo Atzeni, Francesca Bugiotti, Luca Rossi: Uniform access to NoSQL systems. Inf. Syst. 43: 117-133 (2014)
Mirko Bronzi, Valter Crescenzi, Paolo Merialdo, Paolo Papotti: Extraction and Integration of Partially Overlapping Web Sources. PVLDB 6(10): 805-816 (2013)
Davide Martinenghi, Riccardo Torlone: Taxonomy-based relaxation of query answering in relational databases. VLDB J. 23(5): 747-769 (2014)
Paolo Atzeni, Christian S. Jensen, Giorgio Orsi, Sudha Ram, Letizia Tanca, Riccardo Torlone: The relational model is dead, SQL is dead, and I don't feel so good myself. SIGMOD Record 42(2): 64-68 (2013)
Daniele Toti, Paolo Atzeni, Fabio Polticelli: Automatic Protein Abbreviations Discovery and Resolution from Full-Text Scientific Papers: The PRAISED Framework. Bio-Algorithms and Med-Systems 8(1): 13-52 (2012)
Paolo Atzeni, Luigi Bellomarini, Francesca Bugiotti, Fabrizio Celli, Giorgio Gianforme: A runtime approach to model-generic translation of schema and data. Inf. Syst. 37(3): 269-287 (2012)
Paolo Atzeni, Pierluigi Del Nostro, Stefano Paolozzi: Temporal Content Management and Website Modeling: Putting Them Together. T. Large-Scale Data- and Knowledge-Centered Systems 5: 158-182 (2012)
Paolo Atzeni, Giorgio Gianforme, Paolo Cappellari: Data model descriptions and translation signatures in a multi-model framework. Ann. Math. Artif. Intell. 63(3-4): 287-315 (2011)
Paolo Atzeni, Luigi Bellomarini, Francesca Bugiotti, Giorgio Gianforme: MISM: A Platform for Model-Independent Solutions to Model Management Problems. J. Data Semantics 14: 133-161 (2009)
Paolo Atzeni, Giorgio Gianforme, Paolo Cappellari: A Universal Metamodel and Its Dictionary. T. Large-Scale Data- and Knowledge-Centered Systems 1: 38-62 (2009)
Paolo Papotti, Riccardo Torlone: Schema exchange: Generic mappings for transforming data and metadata. Data Knowl. Eng. 68(7): 665-682 (2009)
Paolo Atzeni, Paolo Cappellari, Riccardo Torlone, Philip A. Bernstein, Giorgio Gianforme: Model-independent schema translation. VLDB J. 17(6): 1347-1370 (2008)
Riccardo Torlone: Two approaches to the integration of heterogeneous data warehouses. Distributed and Parallel Databases 23(1): 69-97 (2008)
Valter Crescenzi, Paolo Merialdo: Wrapper Inference for Ambiguous Web Pages. Applied Artificial Intelligence 22(1&2): 21-52 (2008)

Conferences

Antonio Maccioni, Edoardo Basili, Riccardo Torlone: QUEPA: QUerying and Exploring a Polystore by Augmentation. SIGMOD Conference 2016
Alessio Conte, Roberto De Virgilio, Antonio Maccioni, Maurizio Patrignani, Riccardo Torlone: Finding All Maximal Cliques in Very Large Social Networks. EDBT 2016: 185-196
Andrea Calì, Davide Martinenghi, Riccardo Torlone: Keyword Search in the Deep Web. AMW 2015
Roberto De Virgilio, Antonio Maccioni, Riccardo Torlone: A Unified Framework for Flexible Query Answering over Heterogeneous Data Sources. FQAS 2015: 283-294
Roberto De Virgilio, Antonio Maccioni, Riccardo Torlone: R2G: a Tool for Migrating Relations to Graphs. EDBT 2014: 640-643
Scott Britell, Lois M. L. Delcambre, Paolo Atzeni: Generic Data Manipulation in a Mixed Global/Local Conceptual Model. ER 2014: 246-259
Roberto De Virgilio, Antonio Maccioni, Riccardo Torlone: Model-Driven Design of Graph Databases. ER 2014: 172-185
Francesca Bugiotti, Luca Cabibbo, Paolo Atzeni, Riccardo Torlone: Database Design for NoSQL Systems. ER 2014: 223-231
Riccardo Torlone: Towards a new Foundation for Keyword Search in Relational Databases. AMW 2014
Paolo Atzeni, Luigi Bellomarini, Francesca Bugiotti: EXLEngine: executable schema mappings for statistical data processing. EDBT 2013: 672-682
Valter Crescenzi, Paolo Merialdo, Disheng Qiu: A framework for learning web wrappers from the crowd. WWW 2013: 261-272
Roberto De Virgilio, Giorgio Orsi, Letizia Tanca, Riccardo Torlone: NYAYA: A System Supporting the Uniform Management of Large Sets of Semantic Data. ICDE 2012: 1309-1312
Paolo Atzeni, Francesca Bugiotti, Luca Rossi: Uniform Access to Non-relational Database Systems: The SOS Platform. CAiSE 2012: 160-174
Roberto De Virgilio, Giorgio Orsi, Letizia Tanca, Riccardo Torlone: Semantic data markets: a flexible environment for knowledge management. CIKM 2011: 1559-1564
Paolo Ciaccia, Riccardo Torlone: Modeling the Propagation of User Preferences. ER 2011: 304-317
Davide Martinenghi, Riccardo Torlone: Querying Databases with Taxonomies. ER 2010: 377-390
Lorenzo Blanco, Valter Crescenzi, Paolo Merialdo, Paolo Papotti: Probabilistic Models to Reconcile Complex Data from Inaccurate Data Sources. CAiSE 2010: 83-97
Paolo Atzeni, Luigi Bellomarini, Francesca Bugiotti, Giorgio Gianforme: A runtime approach to model-independent schema and data translation. EDBT 2009: 275-286
Luca Cabibbo: On keys, foreign keys and nullable attributes in relational mapping systems. EDBT 2009: 263-274

Seminars

2016

27/04/2016 12:00, DIA meeting room, Francesco Bonchi (ISI Foundation): On information propagation, social influence, and communities
01/03/2016 14:00, DIA meeting room, Andrea Calì (Birkbeck College, University of London): Querying the Deep Web: old and new perspectives

2015

22/10/2015 11:00, DIA meeting room, Divesh Srivastava (AT&T Labs-Research): Schema Extraction
20/10/2015 14:00, DIA meeting room, Divesh Srivastava (AT&T Labs-Research): Big Data Integration
19/06/2015 15:00, conference room, C. Mohan (IBM Research): Big Data: Hype and Reality
19/06/2015 12:00, conference room, C. Mohan (IBM Research): Modern Database Systems: Modernized Classic Systems, NewSQL and NoSQL
12/06/2015 14:30, DIA meeting room, Paolo Papotti (Qatar Computing Research Institute): Prescriptive Data Cleaning
18/05/2015 11:00, conference room, Ronen Feldman (Hebrew University of Jerusalem): Unsupervised relation extraction - techniques and applications
04/05/2015 11:00, room N04, Denilson Barbosa (University of Alberta): Inferencing in Information Extraction: Techniques and Applications (part I)
27/04/2015 9:30, room N11, Mario A. Nascimento (University of Alberta): Optimizing Query Processing in Cache-Aware Wireless Sensor Networks
10/04/2015 14:00, room N2, Stefano Ortona (University of Oxford): TWADaR: Joint Wrapper and Data Repair in Web Data Extraction

2014

02/07/2014 16:30, meeting room, Davide Martinenghi (Politecnico di Milano): The joys and sorrows of teaching and developing apps for mobile devices, or: how to become a workaholic, earn no money, and live happily ever after

2013

24/10/2013 17:00, room N4, Anastasia Ailamaki (EPFL - Ecole Polytechnique Federale de Lausanne): Querying and Exploring Big Brain Data
18-26/09/2013, Pierre Senellart (Telecom ParisTech): Web Data Management
12/09/2013 10:30, room N3, Yannis Papakonstantinou (University of California at San Diego): Declarative, optimizable data-driven specifications of web and mobile applications
11/09/2013 10:00, meeting room, Paolo Papotti (Qatar Computing Research Institute): NADEEF: A Generalized Data Cleaning System
09/07/2013 12:00, meeting room, Davide Martinenghi (Politecnico di Milano): Humans Fighting Uncertainty in Top-K Scenarios
31/05/2013 12:00, meeting room, Peter Wood (Birkbeck, University of London): Top-k query answering with aggregation constraints and cached views
06/05/2013 14:00, room N10, Wei Wang (University of New South Wales, Australia): Similarity Query Processing Algorithms: Use of Enumeration and Divide and Conquer Techniques
25/01/2013 11:30, meeting room, Michael Grossniklaus (Vienna University of Technology, Austria): Teach Your Data Streams to Do More

2012

05/12/2012 11:30, meeting room, Andrea Calì (Birkbeck College, University of London & Oxford University, UK): Functional Constraints in Ontology Reasoning
31/05/2012 15:45, room N1, Denilson Barbosa (University of Alberta, Edmonton, Canada): Information extraction for social media analysis
29/05/2012 11:30, meeting room, Peter Wood (Birkbeck, University of London): Flexible Querying of Graph-Structured Data

2011

28/09/2011 15:00, meeting room, Letizia Tanca (Politecnico di Milano): Problems and Opportunities in Context Based Personalization
21/09/2011 15:00, meeting room, Floris Geerts (University of Edinburgh, UK): Data completeness and the currency of data
19/09/2011 15:00, meeting room, Floris Geerts (University of Edinburgh, UK): Repairing with quality improving dependencies: existing repair methods, related problems, and hint towards solution
15/09/2011 15:00, meeting room, Floris Geerts (University of Edinburgh, UK): A principled approach to data quality: a general survey
23/06/2011 11:30, room N7, Andrea Calì (Birkbeck College, University of London & Oxford University, UK): Algoritmi trattabili di interrogazione di dati in presenza di ontologie: ER colpisce ancora
10/06/2011 11:00, meeting room, Tim Weninger (University of Illinois at Urbana-Champaign, USA): WinaCS (Web-based Information Network Analysis for Computer Science)

2010

05/07/2010 15:00, room N7, Denilson Barbosa (University of Alberta, Edmonton, Canada): Top-k Approximate Subtree Matching
01/07/2010 15:00, room N7, Denilson Barbosa (University of Alberta, Edmonton, Canada): A Framework for Automatic Schema Mapping Verification Through Reasoning
30/06/2010 15:00, room N3, Denilson Barbosa (University of Alberta, Edmonton, Canada): An Environment for Building, Exploring and Querying Social Networks
10/06/2010, Andrea Calì, Davide Martinenghi (Brunel University, UK, Politecnico di Milano): Query Optimization in the Deep Web
23/03/2010, Beniamino Di Martino (Seconda Università di Napoli): Semantic based Discovery and Management of Content, Services and Software

2009

20/11/2009, Daniel Schwabe (PUC-Rio de Janeiro, Brazil): Looking into the crystal ball: Challenges for Web application design beyond Web 2.0
09/11/2009, Renée J. Miller (University of Toronto): Leveraging Data and Structure in Ontology Integration
14/07/2009, Vagelis Hristidis (Florida International University): Information Discovery on Vertical Domains
06/05/2009, Bruno Marnette (Computing Laboratory, University of Oxford): Schema-Mappings: From Termination To Tractability
22/04/2009, Andrea Calì (Computing Laboratory, University of Oxford): Expressive Ontologies for data modelling: the query answering problem

Previous years (partial list)

July 1, 2008, Nicoleta Preda (INRIA Saclay, France): XML processing in DHT networks
May 21, 2008, Leo Bertossi (Carleton University, Ottawa, Canada): Information Sharing Agents in a Peer Data Exchange System
April 14, 2008, Ernest Teniente (Universitat Politècnica de Catalunya, Barcelona): SVTe: Database Schema Validation with Explanations
March 20, 2008, Andrea Calì (Computing Laboratory, University of Oxford): The chase: A good old tool for query containment and data exchange
June 4, 2007, Raul F. Chong (IBM Toronto Lab): DB2 overview course: Generic DB2 DBA & DB2 Developer skills
June 4, 2007, Raul F. Chong (IBM Toronto Lab): DB2 & Web 2.0 Demos: DB2 & Ruby on Rails, PHP, AJAX and Java
October 16, 2006, Raul F. Chong (IBM Toronto Lab): The role of DB2 Express-C in the Information on demand world
June 13, 2006, Lucian Popa (IBM Almaden Research Center): Schema Mappings: From Flat to Nested
June 13, 2006, Wang-Chiew Tan (University of California Santa Cruz): Debugging Schema Mappings
December 15, 2004, David Toman (University of Waterloo, Canada): Logical Data Expiration
March 10, 2004, Reind P. van de Riet (Vrije Universiteit, Amsterdam): SP@CE: Security & Privacy in Cyberspace
January 14, 2004, Themis Palpanas (University of California, Riverside): Incremental Maintenance for Non-Distributive Aggregate Functions
December 12, 2003, Divesh Srivastava (AT&T Labs-Research, USA): Approximate String Joins
December 11, 2003, Peter Buneman (University of Edinburgh, Scotland, UK): Curated Databases
December 9, 2003, Jorge Cardoso (University of Madeira, Portugal): Semantic Web Process Lifecycle
November 11, 2002, Philip A. Bernstein (Microsoft Research): Generic Model Management -- A Database Infrastructure for Schema Manipulation
July 17, 2002, Nicholas Kushmerick (University College Dublin): Machine learning for Web information extraction
January 22, 2002, Ernest Teniente (Universitat Politècnica de Catalunya): Relationship Type Refinement in Conceptual Models with Multiple Classification

Funding

The group's research, teaching, and professional activities have been funded by various sources: in some cases through research grants, in others through contracts with diverse goals.

(list to be completed)

  • MIUR, PRIN: Data-Centric Genomic Computing (GenData 2020) 2013 - 2016
  • MIUR, PRIN: Entity-Aware Search Engines (EASE) 2010 - 2012
  • SOGEI (Società Generale d'Informatica), 2012-2015
  • MIUR, PRIN: Next Generation Search (NGS) 2007 - 2009
  • EC Research Project: IST-2001-37244 MOSES (MOdular and Scalable Environment for the Semantic web), 2002-2005
  • INTAS OPEN 97 11109: Modeling and management of semi-structured data for dynamic World-Wide-Web applications, 1999-2001
  • MIUR, FIRB: MAIS (Multichannel Adaptive Information Systems), 2002-2006
  • MIUR, PRIN: Data-X: Gestione, Trasformazione e Scambio di Dati in Ambiente Web, 1999-2001
  • MIUR, PRIN: "Wisdom" (Web Intelligent Search based on Domain Ontologies), 2004-2006
  • MIUR, Fondo Speciale per lo sviluppo della ricerca di interesse strategico: Progetto CNR "ECD" (Technologies and Services for Enhanced Content Delivery), 2002-2005
  • ISTAT (Istituto Centrale di Statistica), 2001-2003
  • AIPA (Autorità per l’Informatica nella Pubblica Amministrazione), 2000-2001
  • ENIT (Ente Nazionale Italiano per il Turismo), 2000-2003
  • EniTecnologie, 2000-2002
  • GPA (Gruppo Prodotti Avanzati), 2002-2003
  • Global Value Services SpA, 2002-2003
  • Sysdata Sud, 2001-2002

Teaching

Paolo Atzeni


Basi di dati
Basi di Dati II

Luca Cabibbo


Analisi e progettazione del software
Architetture Software

Valter Crescenzi


Programmazione orientata agli oggetti
Programmazione concorrente

Paolo Merialdo


Sistemi informativi su Web
Analisi e gestione dell'Informazione su Web

Riccardo Torlone


Calcolatori Elettronici
Big Data

How to Reach Us

Università Roma Tre
Dipartimento di Ingegneria
Via Vito Volterra, 48
00149 Roma, Italy
http://www.ing.uniroma3.it