[Pafgbt] PAF beamformer size and cost

Mon Feb 8 10:26:42 EST 2010

Good points.  Data rate and storage capacity are just a matter of money. 
Data management is where complexity sets in, and this is the toughest 
thing to deal with.  For the moment we might use data rate as a blunt
metric of the bottleneck, but let's keep the real problem at the fore.
The system design MUST include data management.

Rick

On Mon, 8 Feb 2010, John Ford wrote:

>> Roughly 50 MB/s is probably a good stake in the ground for the limits of
>> output data bandwidth and storage limitations.  We can probably plan on
>> these capabilities doubling every 2 years or so, but it's not a smooth
>> function of time, of course.
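As a rough illustration of the doubling estimate above, the sketch below projects a sustained data rate forward in time. It assumes a smooth exponential with a 2-year doubling period, which the text itself notes is only an approximation; the function names are hypothetical.

```python
import math


def projected_rate_mb_s(base_mb_s, years, doubling_years=2.0):
    """Rate after `years`, assuming smooth doubling every `doubling_years`."""
    return base_mb_s * 2.0 ** (years / doubling_years)


def years_to_reach(target_mb_s, base_mb_s, doubling_years=2.0):
    """Years until the projected rate reaches `target_mb_s`."""
    return doubling_years * math.log2(target_mb_s / base_mb_s)


if __name__ == "__main__":
    # Start from the ~50 MB/s stake in the ground quoted above.
    for years in (0, 2, 4, 10):
        print(f"{years:2d} yr: {projected_rate_mb_s(50.0, years):7.1f} MB/s")
    # 2000 MB/s is 40 times the starting rate (an illustrative target).
    print(f"years to reach 2000 MB/s: {years_to_reach(2000.0, 50.0):.1f}")
```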
>
> It's actually the data management that's a problem, not the collection
> speeds.  It's very easy to do 50-100 MB/sec data rates to disk over 10
> gigabit Ethernet.  40 gigabit infiniband is available and almost as cheap
> as 10 gbe right now.  The hard part is what to do with all those
> terabytes once you have them on disk.
>
> I think that any of these designs should include the design of a computer
> system (software and hardware) to complete the entire scientific data
> flow, and not just the data acquisition.  We always think about it a
> little, and then punt it to the future, and we're always sorry...  (I
> think Scott will agree!)
>
> John
>
>>
>> Rick
>>
>> On Thu, 4 Feb 2010, Scott Ransom wrote:
>>
>>> Stage one (which has been what we have been using up until now) is:
>>>  256 channels over 100MHz @ 64us sampling times 7 beams
>>> Stage two, which is just starting now is:
>>>  1024 channels over 300MHz @ 64us sampling times 7 beams
>>>
>>> In both cases we are storing only total intensity and for long term
>>> storage, 4-bits per sample.
>>>
>>> Scott
>>>
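The survey parameters above translate directly into an output data rate. A minimal sketch of that arithmetic, assuming total intensity only, the 4-bit long-term storage format, and 1 MB = 1e6 bytes (the function name is hypothetical). Note that the bandwidth does not enter: only the channel count, sampling time, beam count and bit depth do.

```python
def survey_rate_mb_s(n_chan, t_samp_s, n_beams, bits_per_sample):
    """Total-intensity filterbank output rate in MB/s (1 MB = 1e6 bytes)."""
    samples_per_s = n_chan / t_samp_s * n_beams
    return samples_per_s * bits_per_sample / 8 / 1e6


if __name__ == "__main__":
    # Stage one: 256 channels, 64 us sampling, 7 beams, 4 bits per sample.
    stage_one = survey_rate_mb_s(256, 64e-6, 7, 4)
    # Stage two: 1024 channels, same sampling time, beams and bit depth.
    stage_two = survey_rate_mb_s(1024, 64e-6, 7, 4)
    print(f"stage one: {stage_one:.1f} MB/s")
    print(f"stage two: {stage_two:.1f} MB/s")
```

With these inputs stage one comes to 14 MB/s and stage two to 56 MB/s for all 7 beams combined.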
>>>
>>> On Thursday 04 February 2010 02:44:38 pm Rick Fisher wrote:
>>>> What are the Arecibo 7-beam pulsar survey parameters?
>>>>
>>>> Rick
>>>>
>>>> On Thu, 4 Feb 2010, Scott Ransom wrote:
>>>>> For L-band with 500MHz of BW, you'd probably want 2K channels (maybe
>>>>> 1K) dumping at between 50-100us.
>>>>>
>>>>> Scott
>>>>>
>>>>> On Thursday 04 February 2010 02:35:57 pm Rick Fisher wrote:
>>>>>> John,
>>>>>>
>>>>>> How many frequency channels are there and what's the time resolution
>>>>>> of guppi in search mode?
>>>>>>
>>>>>> Rick
>>>>>>
>>>>>> On Thu, 4 Feb 2010, John Ford wrote:
>>>>>>>> Brian,
>>>>>>>>
>>>>>>>> Thanks very much for the beamformer outline.  How does Jason
>>>>>>>> handle the frequency dependence of the beamformer weights?  Is
>>>>>>>> there an FIR in each signal path or maybe an FFT - iFFT operation?
>>>>>>>>
>>>>>>>> What to do with the high bandwidth beam outputs at high time
>>>>>>>> resolution is a standard pulsar problem.  Since the beam outputs
>>>>>>>> are essentially the same as you'd have with a conventional horn
>>>>>>>> feed, all of the current tricks of the trade will apply.  As far
>>>>>>>> as I know, all pulsar surveys are done by generating spectra with
>>>>>>>> a frequency resolution consistent with the highest expected pulse
>>>>>>>> dispersion, squaring the signals, integrating for the selected
>>>>>>>> resolution time, and streaming this to disk.  Pulse searches are
>>>>>>>> then done off-line in computer clusters or supercomputers.  (Our
>>>>>>>> pulsar experts can chime in here.)  We'll need to match our output
>>>>>>>> data rates with available storage and post-processing capabilities
>>>>>>>> - a time-dependent target.  Maybe someone could give us some
>>>>>>>> currently feasible numbers and time derivatives.
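The search-mode processing described above (form spectra, square, integrate, stream out) can be sketched in a few lines. This is an illustration only: a naive DFT stands in for whatever FFT or filterbank the real hardware uses, the input is assumed to be complex baseband voltages, and all names are hypothetical.

```python
import cmath


def dft(block):
    """Naive discrete Fourier transform of one block of complex samples."""
    n = len(block)
    return [
        sum(block[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def search_mode_spectra(voltages, n_chan, n_integrate):
    """Channelize, square and integrate a complex voltage stream.

    Each output spectrum has n_chan power values and covers
    n_chan * n_integrate input samples; leftover samples are dropped.
    """
    step = n_chan * n_integrate
    spectra = []
    for start in range(0, len(voltages) - step + 1, step):
        acc = [0.0] * n_chan
        for i in range(n_integrate):
            lo = start + i * n_chan
            for k, x in enumerate(dft(voltages[lo:lo + n_chan])):
                acc[k] += abs(x) ** 2  # square the signal, then accumulate
        spectra.append(acc)
    return spectra


if __name__ == "__main__":
    n_chan, n_int = 8, 4
    # Test tone centred on channel 3.
    tone = [cmath.exp(2j * cmath.pi * 3 * t / n_chan)
            for t in range(2 * n_chan * n_int)]
    spectra = search_mode_spectra(tone, n_chan, n_int)
    peak = max(range(n_chan), key=lambda k: spectra[0][k])
    print(len(spectra), "spectra, peak channel", peak)
```

The data-rate reduction comes from the integration step: n_chan * n_integrate complex samples go in and n_chan real power values come out.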
>>>>>>>
>>>>>>> I'm not a pulsar expert, but I know a bit about what the 1 beam GBT
>>>>>>> currently can do, and what is available in the computer world.
>>>>>>>
>>>>>>> I don't think it's feasible to keep 40 beams of pulsar search data.
>>>>>>> We're having a time of it keeping up with managing the data from
>>>>>>> our 1 beam guppi.  We capture ~50-100 MB/sec with guppi in search
>>>>>>> mode, so 40 times that is completely insane. (20-40 Gigabytes per
>>>>>>> second!)
>>>>>>>
>>>>>>> It's not so much the capture rate (which, while difficult, is
>>>>>>> possible today with large disk arrays) as the fact that pulsar
>>>>>>> searches use long integrations, so the volume of data to be stored
>>>>>>> until it is processed is huge.  This example would be >500
>>>>>>> Terabytes in an 8 hour observation. Yikes!  That's possibly more
>>>>>>> pulsar search data than has been collected, ever.
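A quick check of the scaling above, taking the ~50-100 MB/sec single-beam guppi rate at face value and using decimal units (1 GB = 1e3 MB, 1 TB = 1e3 GB). The function names are hypothetical. With these inputs, 40 beams come to 2-4 GB/s (16-32 Gbit/s) and roughly 58-115 TB per 8 hour observation, so the larger figures quoted above would correspond to a per-beam rate about ten times higher.

```python
def total_rate_gb_s(per_beam_mb_s, n_beams):
    """Aggregate capture rate in GB/s for n_beams identical beams."""
    return per_beam_mb_s * n_beams / 1e3


def volume_tb(rate_gb_s, hours):
    """Data volume in TB accumulated at rate_gb_s over `hours`."""
    return rate_gb_s * hours * 3600 / 1e3


if __name__ == "__main__":
    for per_beam in (50.0, 100.0):
        rate = total_rate_gb_s(per_beam, 40)
        print(f"{per_beam:5.0f} MB/s per beam -> {rate:.1f} GB/s "
              f"({rate * 8:.0f} Gbit/s), {volume_tb(rate, 8):.1f} TB in 8 h")
```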
>>>>>>>
>>>>>>> I have no idea how long it takes to process the data, but the only
>>>>>>> really feasible thing to do for this kind of data rate is to
>>>>>>> process it in (near) real time and throw it away when you are done.
>>>>>>>  That means a *really* large supercomputer or special-purpose
>>>>>>> pulsar search engine.
>>>>>>>
>>>>>>> Pulsar timing is another kettle of fish altogether.  There, the
>>>>>>> bottleneck is processing power (assuming coherent dedispersion).
>>>>>>> The disk I/O is almost negligible.  The I/O requirements for
>>>>>>> getting the samples off the beamformer and onto the processing
>>>>>>> machine are also formidable.
>>>>>>>
>>>>>>> As usual, the pulsar requirements far outstrip all other observing
>>>>>>> requirements as far as data rates.  We can build spectrometers that
>>>>>>> integrate for milliseconds or seconds to cut data rates to
>>>>>>> manageable numbers.  So it seems reasonable to build a machine that
>>>>>>> could handle the 40 beams for "normal" observing, and reduce the
>>>>>>> number of beams used for pulsar search output.
>>>>>>>
>>>>>>> For transients and pulsars you need the fast-sampled time-domain
>>>>>>> data to work with, and can't really integrate it down much before
>>>>>>> using it.
>>>>>>>
>>>>>>> I think the biggest problem is really storage of the data, and not
>>>>>>> necessarily the processing in real time.
>>>>>>>
>>>>>>> John
>>>>>>>
>>>>>>>> Before completely buying into the voltage sum real-time beamformer
>>>>>>>> we should keep in mind that a lot of single dish applications
>>>>>>>> don't need voltage outputs as long as the time and frequency
>>>>>>>> resolution parameters are satisfied.  If there are big
>>>>>>>> computational savings in a post-correlation beamformer, and we
>>>>>>>> satisfy ourselves that there's not a hidden gotcha in this
>>>>>>>> approach, we should keep it on the table.  My guess is that any
>>>>>>>> computational advantages evaporate or even reverse when the
>>>>>>>> required time resolution approaches the inverse required frequency
>>>>>>>> resolution.
>>>>>>>>
>>>>>>>> Rick
>>>>>>>>
>>>>>>>> On Thu, 4 Feb 2010, Brian Jeffs wrote:
>>>>>>>>> Rick,
>>>>>>>>>
>>>>>>>>> See below:
>>>>>>>>>> Is your assumed beamformer architecture voltage sums or
>>>>>>>>>> post-correlation?  In other words, are the beams formed by
>>>>>>>>>> summing complex weighted voltages from the array elements or by
>>>>>>>>>> combining cross products of all of the elements?  John's
>>>>>>>>>> reference at http://arxiv.org/abs/0912.0380v1 shows a
>>>>>>>>>> voltage-sum beamformer.  The post-correlation beamformer may use
>>>>>>>>>> fewer processing resources, but it precludes further coherent
>>>>>>>>>> signal processing of each beam.
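The two architectures in the question above give the same integrated beam power, which the following sketch demonstrates numerically. It is a toy model, not anyone's actual gateware: element voltages are random complex numbers, the beam is taken as y = w^H x, and the post-correlation path evaluates w^H R w from the averaged cross products R. All names are hypothetical.

```python
import random


def voltage_sum_power(weights, samples):
    """Mean beam power: sum weighted element voltages, then square."""
    total = 0.0
    for x in samples:  # x is one time sample: a list of element voltages
        beam = sum(w.conjugate() * xi for w, xi in zip(weights, x))
        total += abs(beam) ** 2
    return total / len(samples)


def correlation_matrix(samples):
    """Time-averaged cross products R[i][j] = <x_i * conj(x_j)>."""
    n = len(samples[0])
    r = [[0j] * n for _ in range(n)]
    for x in samples:
        for i in range(n):
            for j in range(n):
                r[i][j] += x[i] * x[j].conjugate()
    return [[v / len(samples) for v in row] for row in r]


def post_correlation_power(weights, r):
    """Beam power formed after correlation: w^H R w."""
    n = len(weights)
    return sum(weights[i].conjugate() * r[i][j] * weights[j]
               for i in range(n) for j in range(n)).real


if __name__ == "__main__":
    random.seed(1)
    n_elem, n_samp = 4, 256
    samples = [[complex(random.gauss(0, 1), random.gauss(0, 1))
                for _ in range(n_elem)] for _ in range(n_samp)]
    weights = [complex(random.gauss(0, 1), random.gauss(0, 1))
               for _ in range(n_elem)]
    print(voltage_sum_power(weights, samples))
    print(post_correlation_power(weights, correlation_matrix(samples)))
```

The difference is in what survives: the voltage-sum path has a complex beam voltage for every time sample, while the post-correlation path only has power per integration, so no further coherent processing of the beam is possible.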
>>>>>>>>>
>>>>>>>>> Our plans are based on a correlator/beamformer developed by Jason
>>>>>>>>> Manley for the ATA and some other users (the pocket packetized
>>>>>>>>> correlator).  He recently added simultaneous beamforming to the
>>>>>>>>> existing correlator gateware, so they run concurrently.  In our
>>>>>>>>> application the only time this is required is during interference
>>>>>>>>> mitigation.  Normally we correlate during calibration and beamform
>>>>>>>>> otherwise.
>>>>>>>>>
>>>>>>>>> His design is a voltage sum real-time beamformer.  At this point
>>>>>>>>> he does not compute as many simultaneous beams as we need to, so I
>>>>>>>>> think we will have to exploit the computational trade-off to do
>>>>>>>>> either beamforming or correlation but not both, or it will not fit
>>>>>>>>> in the FPGA.  Post-correlation beamforming is really quite
>>>>>>>>> trivial, and has a low computational burden, so that could be
>>>>>>>>> added to the correlator and run simultaneously.  I believe that
>>>>>>>>> when we need simultaneous voltage sum beamforming and correlations
>>>>>>>>> (as when doing interference mitigation) we will have to reduce the
>>>>>>>>> effective bandwidth.  We really cannot take Jason's existing code
>>>>>>>>> and plug it right in for our application, but it will serve as a
>>>>>>>>> very good template.  That is why we have Jonathan out at UC
>>>>>>>>> Berkeley for 6 months, so he can learn the ropes and then work on
>>>>>>>>> our correlator/beamformer.
>>>>>>>>>
>>>>>>>>>> Very roughly, the science requirements for a beamformer fall
>>>>>>>>>> into two camps, which may be operational definitions of first
>>>>>>>>>> science and Cadillac/dream machine: 1. spectral line surveys
>>>>>>>>>> with bandwidths in the 3-100 MHz range and very modest time
>>>>>>>>>> resolution and 2. pulsar and fast transient source surveys with
>>>>>>>>>> bandwidths on the order of 500+ MHz and <=50 microsecond time
>>>>>>>>>> resolution.  The 2001 science case says pulsar work requires
>>>>>>>>>> bandwidths of 200+ MHz, but the bar has gone higher in the
>>>>>>>>>> meantime.  One can always think of something to do with a wide
>>>>>>>>>> bandwidth, low time resolution beamformer, but it would be a
>>>>>>>>>> stretch.  The GBT sensitivity isn't high enough to see HI
>>>>>>>>>> redshifted below, say, 1350 MHz in a very wide-area survey.
>>>>>>>>>> Hence, building a beamformer with wide bandwidth but low time
>>>>>>>>>> resolution may not be the optimum use of resources.  Also, the
>>>>>>>>>> 2001 science case assumes 7 formed beams, but the minimum now
>>>>>>>>>> would be, maybe, 19 and growing as the competition heats up.
>>>>>>>>>
>>>>>>>>> We are operating under the assumption of at least 19, and
>>>>>>>>> probably more than 40 formed beams.  If we only use the correlator
>>>>>>>>> for calibration, then we should be able to achieve both relatively
>>>>>>>>> wide bandwidth (250 MHz) and high time resolution (we will get a
>>>>>>>>> beamformer output per time sample, not just per STI interval).
>>>>>>>>> Dan and Jason feel that based on their experience with existing
>>>>>>>>> codes this is achievable on the 40 ROACH system we sketched out,
>>>>>>>>> but we will have to wait and see.  If we run into bottlenecks we
>>>>>>>>> will have to reduce either bandwidth or the number of formed
>>>>>>>>> beams.
>>>>>>>>>
>>>>>>>>> One issue I am not clear on yet is what we do with the data
>>>>>>>>> streams for 40+ voltage sum beams over 500+ frequency channels.
>>>>>>>>> How do we get it off the CASPER array, and what will be done with
>>>>>>>>> it?  For 8 bit complex samples at the beamformer outputs you would
>>>>>>>>> need the equivalent of forty 10 Gbit ethernet links to some other
>>>>>>>>> big processor, such as a transient detector.  If this is
>>>>>>>>> unreasonable then either the number of bits per sample, bandwidth,
>>>>>>>>> or number of beams will need to be reduced.  Alternatively, it is
>>>>>>>>> not hard to add a spectrometer to the beamformer outputs inside
>>>>>>>>> the correlator/beamformer, and this provides a huge data rate
>>>>>>>>> reduction.  But how do we handle data for transient observations
>>>>>>>>> where fine time resolution is critical?
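The link-count estimate above can be explored with a short calculation. Everything here is an assumption rather than something stated in the message: samples are taken as 8 bits real plus 8 bits imaginary, the channels are critically sampled so the aggregate complex sample rate equals the bandwidth, and protocol overhead is ignored. Under those assumptions a 500 MHz beam needs 8 Gbit/s, which nearly fills one 10 Gbit link and is consistent with roughly one link per beam.

```python
import math


def beam_rate_gbit_s(bandwidth_mhz, bits_per_component=8):
    """Output rate of one voltage beam in Gbit/s (complex samples)."""
    return bandwidth_mhz * 1e6 * 2 * bits_per_component / 1e9


def aggregate_links(n_beams, bandwidth_mhz, link_gbit_s=10.0):
    """Links needed if beams could be split freely across links."""
    total = n_beams * beam_rate_gbit_s(bandwidth_mhz)
    return math.ceil(total / link_gbit_s)


def links_whole_beams(n_beams, bandwidth_mhz, link_gbit_s=10.0):
    """Links needed if each beam must travel whole on a single link."""
    per_link = int(link_gbit_s // beam_rate_gbit_s(bandwidth_mhz))
    if per_link == 0:
        raise ValueError("one beam does not fit on a single link")
    return math.ceil(n_beams / per_link)


if __name__ == "__main__":
    for bw in (250, 500):
        print(f"{bw} MHz: {beam_rate_gbit_s(bw):.0f} Gbit/s per beam, "
              f"{aggregate_links(40, bw)} links aggregate, "
              f"{links_whole_beams(40, bw)} links with whole beams")
```

Halving the bandwidth, the bits per sample, or the number of beams halves the total, which is the trade-off the paragraph above describes.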
>>>>>>>>>
>>>>>>>>> Brian
>>>>>>>>>
>>>>>>>>>> Counter-thoughts?
>>>>>>>>>>
>>>>>>>>>> Rick
>>>>>>>>>>
>>>>>>>>>> On Wed, 3 Feb 2010, Brian Jeffs wrote:
>>>>>>>>>>> Rick,
>>>>>>>>>>>
>>>>>>>>>>> We have a rough architecture and cost estimate for a 40 channel
>>>>>>>>>>> correlator/beamformer (19 dual pol antennas plus reference or
>>>>>>>>>>> RFI auxiliary) over 250 MHz BW.  We worked this out with CASPER
>>>>>>>>>>> head Dan Werthimer and his crack correlator/beamformer developer
>>>>>>>>>>> Jason Manley.  It will require 20 ROACH boards, 20 iADC boards,
>>>>>>>>>>> one 20-port 10 Gbit ethernet switch, and some lesser associated
>>>>>>>>>>> parts.
>>>>>>>>>>>
>>>>>>>>>>> Our recent ROACH order was $2750 each, iADC: $1300 each,
>>>>>>>>>>> enclosures: $750 each, Xilinx chip: free or $3000, ethernet
>>>>>>>>>>> switch: $12000.
>>>>>>>>>>>
>>>>>>>>>>> You can use your existing data acquisition array of PCs as the
>>>>>>>>>>> stream-to-disk farm, but will need to buy 10 Gbit cards and
>>>>>>>>>>> hardware RAID controllers.
>>>>>>>>>>>
>>>>>>>>>>> The total (which will be a bit low) assuming no free Xilinx
>>>>>>>>>>> parts and not including  is:  $168,000.
>>>>>>>>>>> Of course this does not include development manpower costs.
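The quoted total can be reproduced from the unit prices above. The sketch assumes one enclosure and one Xilinx chip per ROACH board, which the message does not state explicitly but which matches the $168,000 figure; the function name is hypothetical.

```python
def beamformer_hardware_cost(n_boards=20, free_fpga=False):
    """Hardware cost in dollars for the sketched correlator/beamformer."""
    per_board = {
        "ROACH board": 2750,
        "iADC board": 1300,
        "enclosure": 750,  # assumed: one enclosure per ROACH board
        "Xilinx chip": 0 if free_fpga else 3000,  # assumed: one per board
    }
    ethernet_switch = 12000
    return n_boards * sum(per_board.values()) + ethernet_switch


if __name__ == "__main__":
    print("paid Xilinx parts:", beamformer_hardware_cost())
    print("free Xilinx parts:", beamformer_hardware_cost(free_fpga=True))
```

As in the message, this leaves out the 10 Gbit cards, RAID controllers and development manpower.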
>>>>>>>>>>>
>>>>>>>>>>> Brian
>>>>>>>>>>>
>>>>>>>>>>> On Feb 3, 2010, at 3:05 PM, Rick Fisher wrote:
>>>>>>>>>>>> This is an incomplete question, but maybe we can beat it into
>>>>>>>>>>>> something answerable:  Do we know enough about existing
>>>>>>>>>>>> applications on CASPER hardware to make a conservative estimate
>>>>>>>>>>>> of what it would cost to build a PAF beamformer with a given set
>>>>>>>>>>>> of specs?  I'm looking for at least two estimates.  What is a
>>>>>>>>>>>> realistic set of specs for the first science PAF beamformer, and
>>>>>>>>>>>> what would the dream machine that would make a big scientific
>>>>>>>>>>>> impact cost?  You're welcome to define the specs that go with
>>>>>>>>>>>> either of these two questions or I'll start defining them by
>>>>>>>>>>>> thinking "out loud".  The first science beamformer will guide
>>>>>>>>>>>> the initial system design, and the dream machine will help get a
>>>>>>>>>>>> handle on longer range expectations.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Rick
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Pafgbt mailing list
>>>>>>>>>>>> Pafgbt at listmgr.cv.nrao.edu
>>>>>>>>>>>> http://listmgr.cv.nrao.edu/mailman/listinfo/pafgbt