[asac]Re: Draft of ASAC Report: section on RSC

John Richer jsr at mrao.cam.ac.uk
Wed Apr 3 08:51:51 EST 2002


Dear Ken,

This is an interesting point, but I don't think it can be quite right.  It
argues that any two data archives which are to be compared in some
sense (I am not sure exactly what is meant by cross correlation here) must
be on an ultra-fast local network, i.e. be co-located.  Obviously,
co-location of the data will lead to faster data mining.  However, if
one accepts that the ALMA and Subaru archives must be co-located
for archival research, one is forced to accept that there must be a
physical copy of *every* relevant petabyte-scale archive at the same
location - Sloan, VISTA, VLT, Gemini, ...

One paradigm of the VO and the Grid is data-location independence.
Geographically distributed archives are 'federated' through special
software, allowing access from anywhere on the planet.  This works
because for most applications, one does not need to complete
petabyte-scale 'cross correlations': one is interested for example in
selecting all sources in the northern sky with Sloan magnitudes below
20 which have ALMA 350 GHz fluxes greater than 5 mJy.  The actual data
volume transmitted over the global networks needed to do this is much
smaller than the raw-data archive sizes.
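
As a rough illustration of the selection described above, here is a toy
sketch in Python.  It is not how any VO software actually works: the
Archive class, the catalogue contents and the shared source_id (which
assumes the two catalogues have already been cross-identified) are all
invented for the example.  Each archive applies its own cut locally and
only the matching identifiers cross the network.

```python
# Toy sketch of a federated query. All names and data are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    source_id: int   # shared identifier, assumed already cross-identified
    dec_deg: float   # declination in degrees
    value: float     # optical magnitude, or 350 GHz flux in mJy


class Archive:
    """Stands in for one remote archive holding its own catalogue."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows

    def select(self, predicate):
        # The filter runs at the archive; only matching IDs leave the site.
        return {row.source_id for row in self.rows if predicate(row)}


def is_northern(row):
    return row.dec_deg > 0.0


def federated_query(optical, submm, mag_limit=20.0, flux_limit_mjy=5.0):
    """Return the matching IDs and the number of IDs sent over the network."""
    optical_ids = optical.select(
        lambda r: is_northern(r) and r.value < mag_limit)
    submm_ids = submm.select(
        lambda r: is_northern(r) and r.value > flux_limit_mjy)
    return optical_ids & submm_ids, len(optical_ids) + len(submm_ids)


# Two small synthetic catalogues standing in for the real archives.
optical = Archive("optical", [
    Source(i, (i % 180) - 89.5, 15.0 + (i % 10)) for i in range(1000)])
submm = Archive("submm", [
    Source(i, (i % 180) - 89.5, float(i % 7)) for i in range(1000)])

matches, ids_transferred = federated_query(optical, submm)
total_rows = len(optical.rows) + len(submm.rows)
print(f"{len(matches)} matches; {ids_transferred} IDs sent "
      f"out of {total_rows} catalogue rows held remotely")
```

The point of the sketch is only that the coordinating site receives two
short lists of identifiers rather than either catalogue.  A real
positional cross-match between independent catalogues is of course
harder than intersecting a shared ID.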

Of course, I agree you can find applications where a true petabyte
times petabyte correlation is required, but I think in practice these
are rare, and the VO is not designed to handle them in general.  If
these applications are important, then the world needs a 'physical' VO
- a single machine room with all the astronomy archives sent to it.

John

--
John Richer      
Astrophysics Group,   Cavendish Lab,   Madingley Road,  Cambridge CB3 0HE
http://www.mrao.cam.ac.uk/~jsr  Tel: +44-1223-337246 Fax: +44-1223-354599  


