=======================================================================
Input to the Draft report of the ASTRONET Infrastructure Roadmap
5 May 2008
=======================================================================
Edwin A. Valentijn, Gijs Verdoes Kleijn
21 May 2008
OmegaCEN, Kapteyn Institute, University of Groningen
=========================================================================

Section 6 lacks a description of the distributed data production
networks that are set up in Europe to host and produce the results of
the large surveys planned in the near future for e.g. VST, VISTA and
LOFAR. Also missing are the connectivity between the National
datacenters involved in the production, and the connectivity to the VO
and GRID communities. LOFAR and AstroWISE are connecting very well to
EGEE-GRID, and the persons who phrased the expressions of frustration
in the current draft should take notice of these constructive
activities. In several respects the current text in Section 6 is not
well balanced in my opinion.

The text below is proposed for inclusion in Section 6. Following this,
cross-references at various places in Section 6 are proposed,
particularly regarding GRID and VO. Finally, suggestions for
modifications to other sections and the appendix are given.

========================================================================

Distributed data analysis and survey production networks
--------------------------------------------------------

ABSTRACT for Liverpool/ new Paragraph for roadmap Section 6

In the coming decades a new generation of survey telescopes, such as
VST, VISTA and LOFAR, will produce Petabytes of raw observational data,
which will have to be calibrated, processed and archived.
Given the complexity and dedication required to calibrate and
pipeline-process this data avalanche, several agencies operating
observatories (ESO, ASTRON) have decided to place this activity in the
astronomical community, in order to actively involve the research
astronomer in the process. ESO's public surveys will be processed in
the European community; the analysis and post-processing of the Key
Science Projects of LOFAR will be done at various institutes scattered
over Europe. This requires a modern network and e-science
infrastructure with distributed resources that allow teams spread over
Europe to collaborate on the data production.

In parallel with the Euro-VO effort, the EC decided to fund the design,
construction and qualification of such a network, AstroWISE, which had
the same starting date as the Euro-VO and delivered a Europe-wide
distributed system in the fall of 2006. AstroWISE is fully operational
and involves National datacenters in the Netherlands, Germany, France
and Italy, with plans to roll out the network further to other European
countries, such as Spain and Denmark, and beyond (e.g. Chile).

The two activities (Euro-VO/AstroWISE) are complementary in the sense
that AstroWISE handles the massive data production and analysis, using
its own compute GRID developed in-house and a direct connection to the
EGEE-GRID, while the Euro-VO provides the tools to publish and
disseminate data and to connect data resident in different archives.
AstroWISE populates these archives.

As Section 6.2 emphasizes, state-of-the-art exploitation of
science-ready data is critically dependent on thorough data quality
assessment by users for their specific science goals. AstroWISE
facilitates full quality assessment by users by tracking the workflow
of the data from the raw observations to the final product.
The majority of the data taken by VST and LOFAR, and some of the VISTA
data, will be pipelined through the AstroWISE system, and will be
analysed and quality controlled by teams distributed over Europe. This
involves distributed source-code libraries, distributed calibration
results, distributed raw and result data, and distributed storage and
processing GRIDs. AstroWISE has integrated all of this into a single
peer-to-peer network. The network is positioned between the
observatories and the Euro-VO and requires maximum connectivity to the
various infrastructures. AstroWISE publishes directly into the Euro-VO,
and in the context of LOFAR the network is directly connected to the
EGEE-GRID.

In the future, given the high demand for connectivity to processing
grids, storage grids and publication grids, AstroWISE will play an
important role as a working switchboard between these infrastructures.
This mode of operations will have to be expanded further to other large
data acquisitions; the expansion of AstroWISE from the optical and
infrared to the radio marks a start.

end new paragraph/ ABSTRACT
=======================================================================

Suggested cross-references and modifications in Sections 6, 8 and
Appendix related to a "Distributed data analysis and survey production
networks" paragraph.

CROSS-REFERENCES in Section 6:
------------------------------

*6.2 par.5 (p97): "As for quality assessment, the VO is an open system,
 where users must take into account the suitability and quality of the
 available resources for a given purpose (cross reference here)."

*6.2.1 par.2 (p97) "..the onus of operating the physical systems that
 store the data, building and maintaining the archives and services, is
 on the data centres and research institutes (cross-reference here)."

*6.4.1 par.1 (p102): "Just as observational astronomers form large
 collaborations to get big surveys done (cross-reference here), so....."

Section 8
---------

*After Section 8.6.3: add a section

 Survey production networks

 In the coming decades new generations of survey telescopes will
 produce Petabytes of raw observational data, which will have to be
 calibrated and processed. For many large public surveys, agencies
 operating observatories have decided to place this activity in the
 astronomical research community because of the complexity and sheer
 size of the task and the dedication and expertise it requires. This
 requires a modern network and e-science infrastructure with
 distributed resources that allow teams spread over Europe to
 collaborate on the data production.

Appendix VI
-----------

*Appendix VI A, next to last par. (p190): "VO activities in the
 Netherlands are coordinated by University of Groningen/OmegaCEN. They
 identify several planned data acquisition facilities...."

end cross-references etc
=========================================================================