Workplan and Deliverables
Deliverables take the form of reports, in some cases with associated demonstrators or technical descriptions.
D11 GENDATA 2020 DATA MODEL, VERSION 1 - This deliverable consolidates the results of the first 6 months of the project, and specifically builds upon strong initial interaction among partners and with biologists and clinicians; instrumental to the deliverable preparation is a project workshop, scheduled for Month 6 of the project.
D51 ANALYSIS MODULES IDENTIFICATION - This deliverable provides a first classification of the modules which are common to the various aspects of data analysis and therefore should be considered as the data analysis building blocks.
D71 USE CASE SPECIFICATIONS - This deliverable contains the specification of the three use cases, which will already use the data model delivered in D11.
D12 SPECIFICATION OF THE GENOMIC DATA MODEL (GDM) AND GENOMETRIC QUERY LANGUAGE (GMQL) - The model and language have been defined as a result of the first year of PRIN activity at Politecnico di Milano; they are used as the unifying data model and language for a number of ongoing collaborations between the various operating units of the PRIN project.
D21 and D22 - GENDATA 2020 ARCHITECTURE AND USER INTERFACES - This joint deliverable is the specification and prototype of the architecture of GenData 2020, a system supporting GDM and GMQL which is capable of querying thousands of datasets and samples; it also incorporates a description of the user interfaces which support biologists in the use of GenData 2020. GenData 2020 is the core of the PRIN project; we expect it to be extended as a result of several ongoing collaborations between the various operating units of the PRIN project.
D31 NGS DATA DELIVERING PIPELINE - This deliverable describes SMITH (Sequencing Machine Information Tracking and Handling), a system for producing and managing workflows of NGS data at the Genomic Unit of the Italian Institute of Technology (IIT).
D41 ONTOLOGY-BASED SEARCH OF GENOMIC METADATA - This deliverable consists of methods for building ontologies which will describe specific experimental contexts, by interacting with existing ontological resources and relating them to content described according to the model.
D52 PATTERN-BASED QUERIES ON GENOMIC DATASETS - This deliverable introduces a specific operator of the Genometric Query Language (GMQL) specified in D12. This operator performs pattern-based queries on genomic datasets, i.e., once biologists identify an interesting genomic pattern, it lets them look for similar occurrences of it in the data.
D61 CRYPTOGRAPHIC TECHNIQUES AND PROTOCOLS - This deliverable will report on the techniques for managing and processing encrypted information that will be developed in the project, including efficient protocols that allow parties to perform collaborative computations involving their datasets while keeping the source datasets private.
D53 ANALYSIS MODULES SELECTION AND IMPLEMENTATION - This deliverable consists of a large body of data analysis methods, in line with the initial classification described by D51, which have been prototyped and applied to use cases.
D32 CLINICAL DATA MODELS AND APPLICATIONS - This deliverable describes safe methods for transferring health information among specific systems or communities, reflecting integrity constraints on both the sending and receiving sites.
D33 GENE REGIONS AND G-QUADRUPLEX STRUCTURES - Using the GenData 2020 data model with a G4 predictor, this deliverable presents preliminary results and statistics obtained with state-of-the-art software tools that predict G-quadruplex DNA conformations from the primary sequence.
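As an illustration of what predicting G-quadruplex conformations from the primary sequence can look like, the following is a minimal sketch based on the widely used sequence motif of four runs of at least three guanines separated by loops of 1 to 7 bases. It is not the G4 predictor used in D33, which the text above does not name; the function name find_g4_candidates and the returned tuple layout are hypothetical and chosen only for this example.

```python
import re

# Assumed motif (not taken from D33): four G-runs of length >= 3,
# separated by three loops of 1 to 7 nucleotides each.
G4_MOTIF = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def find_g4_candidates(sequence):
    """Return (start, end, matched_text) for each non-overlapping motif hit.

    Coordinates are 0-based and half-open, relative to the given sequence.
    Only the given strand is scanned; the reverse strand would need the
    complementary C-rich pattern.
    """
    seq = sequence.upper()  # tolerate soft-masked (lowercase) input
    return [(m.start(), m.end(), m.group()) for m in G4_MOTIF.finditer(seq)]


if __name__ == "__main__":
    for start, end, text in find_g4_candidates("atGGGTGGGTGGGTGGGat"):
        print(start, end, text)
```

The resulting intervals are plain genomic regions, so under these assumptions they could be stored alongside other region data and compared with gene regions.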