[asac] Re: Draft of ASAC Report: section on RSC
John Richer
jsr at mrao.cam.ac.uk
Wed Apr 3 08:51:51 EST 2002
Dear Ken,
This is an interesting point, but can't be quite right I think. It
argues that any two data archives which are to be compared in some
sense (I am not sure exactly what is meant by cross correlation here) must
be on an ultra-fast local network, i.e. be co-located. Obviously,
co-location of the data will lead to faster data mining. However, if
one accepts that the ALMA and the Subaru archives must be co-located
for archival research, one is forced to accept that there must be a
physical copy of *every* relevant petabyte-scale archive at the same
location: Sloan, VISTA, VLT, Gemini, ...
One paradigm of the VO and the Grid is data-location independence.
Geographically distributed archives are 'federated' through special
software, allowing access from anywhere on the planet. This works
because, for most applications, one does not need to complete
petabyte-scale 'cross correlations': one is interested, for example, in
selecting all sources in the northern sky with Sloan magnitudes below
20 which have ALMA 350 GHz fluxes greater than 5 mJy. The actual data
volume transmitted over the global networks needed to do this is much
smaller than the raw-data archive sizes.
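As a rough illustration of the federated selection described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two in-memory lists stand in for remote archives, the function names are hypothetical, "northern sky" is taken to mean declination greater than zero, and sources are matched on a shared identifier rather than by a real positional cross-match. The point is only that each archive filters locally and ships a small catalogue, not its raw data.

```python
# Toy stand-ins for two remote archives. All names and values are hypothetical.
SLOAN = [
    {"id": "A", "dec": 12.0, "mag": 18.5},
    {"id": "B", "dec": -30.0, "mag": 17.0},
    {"id": "C", "dec": 45.0, "mag": 21.2},
    {"id": "D", "dec": 5.0, "mag": 19.9},
]
ALMA = [
    {"id": "A", "flux_350ghz_mjy": 7.5},
    {"id": "C", "flux_350ghz_mjy": 9.0},
    {"id": "D", "flux_350ghz_mjy": 2.0},
]


def query_sloan(max_mag, min_dec):
    """Runs at the optical archive: only rows passing the cuts leave the site."""
    return [r for r in SLOAN if r["mag"] < max_mag and r["dec"] > min_dec]


def query_alma(min_flux_mjy):
    """Runs at the ALMA archive: only rows above the flux cut leave the site."""
    return [r for r in ALMA if r["flux_350ghz_mjy"] > min_flux_mjy]


def federated_select(max_mag=20.0, min_flux_mjy=5.0, min_dec=0.0):
    """Combine the two small result sets at the user's location."""
    optical = query_sloan(max_mag, min_dec)
    submm = {r["id"]: r for r in query_alma(min_flux_mjy)}
    # Assumption: sources are matched on a shared identifier.
    matches = [{**o, **submm[o["id"]]} for o in optical if o["id"] in submm]
    rows_transferred = len(optical) + len(submm)
    return matches, rows_transferred


if __name__ == "__main__":
    matches, rows_transferred = federated_select()
    print([m["id"] for m in matches], rows_transferred)
```

In this toy case only the rows that pass each archive's local cut cross the network, which is the sense in which the transmitted volume is much smaller than the archives themselves.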
Of course, I agree you can find applications where a true petabyte
times petabyte correlation is required: but I think in practice these
are rare, and the VO is not designed to handle these in general. If
these applications are important, then the world needs a 'physical' VO
- a single machine room with all the astronomy archives sent to it.
John
--
John Richer
Astrophysics Group, Cavendish Lab, Madingley Road, Cambridge CB3 0HE
http://www.mrao.cam.ac.uk/~jsr Tel: +44-1223-337246 Fax: +44-1223-354599