[Pafgbt] PAF beamformer size and cost

Thu Feb 4 14:23:54 EST 2010

> Brian,
>
> Thanks very much for the beamformer outline.  How does Jason handle the
> frequency dependence of the beamformer weights?  Is there an FIR in each
> signal path or maybe an FFT - iFFT operation?
>
> What to do with the high bandwidth beam outputs at high time resolution is
> a standard pulsar problem.  Since the beam outputs are essentially the
> same as you'd have with a conventional horn feed, all of the current
> tricks of the trade will apply.  As far as I know, all pulsar surveys are
> done by generating spectra with a frequency resolution consistent with the
> highest expected pulse dispersion, squaring the signals, integrating for
> the selected resolution time, and streaming this to disk.  Pulse searches
> are then done off-line in computer clusters or supercomputers.  (Our
> pulsar experts can chime in here.)  We'll need to match our output data
> rates with available storage and post-processing capabilities - a
> time-dependent target.  Maybe someone could give us some currently
> feasible numbers and time derivatives.

I'm not a pulsar expert, but I know a bit about what the 1 beam GBT
currently can do, and what is available in the computer world.

I don't think it's feasible to keep 40 beams of pulsar search data.  We're
having a time of it keeping up with managing the data from our 1 beam
GUPPI.  We capture ~50-100 MB/sec with GUPPI in search mode, so 40 times
that is completely insane. (2-4 Gigabytes per second!)

It's not so much the capture rate (which, while difficult, is possible
today with large disk arrays) as the fact that pulsar searches use long
integrations, so the volume of data to be stored until it is processed is
huge.  This example would be >50 Terabytes in an 8 hour observation.
Yikes!  That's possibly more pulsar search data than has been collected,
ever.
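
For anyone who wants to redo this arithmetic with other numbers, here is a
minimal sketch.  The aggregate rate is just the per-beam rate times the
number of beams, and the stored volume is that rate times the length of
the observation.  The function names and the use of decimal units
(1 GB = 1000 MB, 1 TB = 1000 GB) are my own assumptions.

```python
# Back-of-the-envelope data volume for multi-beam search-mode recording.
# Decimal units are assumed: 1 GB = 1000 MB, 1 TB = 1000 GB.

def aggregate_rate_gb_per_s(per_beam_mb_per_s, n_beams):
    """Total capture rate in GB/s for n_beams identical beams."""
    return per_beam_mb_per_s * n_beams / 1000.0

def stored_volume_tb(rate_gb_per_s, hours):
    """Data accumulated in TB if nothing is discarded during the observation."""
    return rate_gb_per_s * hours * 3600.0 / 1000.0

if __name__ == "__main__":
    # Per-beam rates from the single-beam search-mode range quoted above.
    for per_beam in (50.0, 100.0):
        rate = aggregate_rate_gb_per_s(per_beam, n_beams=40)
        volume = stored_volume_tb(rate, hours=8.0)
        print(f"{per_beam:.0f} MB/s per beam: {rate:.1f} GB/s, "
              f"{volume:.1f} TB in 8 hours")
```

Changing n_beams or hours shows quickly how much the beam count has to
drop before the stored volume becomes manageable.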

I have no idea how long it takes to process the data, but the only really
feasible thing to do for this kind of data rate is to process it in (near)
real time and throw it away when you are done.  That means a *really*
large supercomputer or special-purpose pulsar search engine.

Pulsar timing is another kettle of fish altogether.  There, the bottleneck
is processing power (assuming coherent dedispersion).  The disk I/O is
almost negligible.  The I/O requirements for getting the samples off the
beamformer and onto the processing machine are also formidable.

As usual, the pulsar requirements far outstrip all other observing
requirements as far as data rates.  We can build spectrometers that
integrate for milliseconds or seconds to cut data rates to manageable
numbers.  So it seems reasonable to build a machine that could handle the
40 beams for "normal" observing, and reduce the number of beams used for
pulsar search output.
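
To put a rough number on how much integration helps, here is a small
sketch.  The 40 beams and 500 channels are taken from elsewhere in this
thread.  The 4 bytes per accumulated power value and the single
polarization product are my own assumptions, so treat the results as
order-of-magnitude only.

```python
def spectrometer_rate_bytes_per_s(n_beams, n_channels, bytes_per_value,
                                  integration_s):
    """Output rate when each beam dumps one accumulated power spectrum
    per integration period."""
    return n_beams * n_channels * bytes_per_value / integration_s

if __name__ == "__main__":
    # 4 bytes per value is an assumed accumulator width, not a spec.
    for t_int in (1e-3, 1.0):
        rate = spectrometer_rate_bytes_per_s(40, 500, 4, t_int)
        print(f"integration {t_int:g} s: {rate / 1e6:g} MB/s")
```

The rate scales inversely with the integration time, which is why
"normal" spectral line observing is easy and fast-sampled pulsar output
is not.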

For transients and pulsars you need the fast-sampled time-domain data to
work with, and can't really integrate it down much before using it.

I think the biggest problem is really storage of the data, and not
necessarily the processing in real time.

John

>
> Before completely buying into the voltage sum real-time beamformer we
> should keep in mind that a lot of single dish applications don't need
> voltage outputs as long as the time and frequency resolution parameters
> are satisfied.  If there are big computational savings in a
> post-correlation beamformer, and we satisfy ourselves that there's not a
> hidden gotcha in this approach, we should keep it on the table.  My guess
> is that any computational advantages evaporate or even reverse when the
> required time resolution approaches the inverse required frequency
> resolution.
>
> Rick
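
A small numerical check of the point above, that a post-correlation
beamformer gives the same time-averaged beam power as a voltage-sum
beamformer followed by squaring.  This is only a sketch: the function
names, the 4-element array, and the random test data are my own
assumptions, and it says nothing about the relative computational cost.
It also assumes the weights are fixed over the averaging interval.

```python
import random

def voltage_sum_power(samples, weights):
    """Mean power of the beam formed by summing weighted voltages,
    then squaring each beam sample."""
    total = 0.0
    for x in samples:
        beam = sum(w.conjugate() * xi for w, xi in zip(weights, x))
        total += abs(beam) ** 2
    return total / len(samples)

def correlation_matrix(samples):
    """Time-averaged cross products R[i][j] = <x_i * conj(x_j)>."""
    n = len(samples[0])
    r = [[0j] * n for _ in range(n)]
    for x in samples:
        for i in range(n):
            for j in range(n):
                r[i][j] += x[i] * x[j].conjugate()
    return [[v / len(samples) for v in row] for row in r]

def post_correlation_power(r, weights):
    """Beam power w^H R w computed from the correlation matrix alone."""
    n = len(weights)
    p = sum(weights[i].conjugate() * r[i][j] * weights[j]
            for i in range(n) for j in range(n))
    return p.real

def random_complex(rng):
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))

if __name__ == "__main__":
    rng = random.Random(0)
    weights = [random_complex(rng) for _ in range(4)]
    samples = [[random_complex(rng) for _ in range(4)] for _ in range(1000)]
    p_voltage = voltage_sum_power(samples, weights)
    p_postcorr = post_correlation_power(correlation_matrix(samples), weights)
    print(p_voltage, p_postcorr)
```

The two numbers agree to rounding error, but the post-correlation route
only yields one power value per averaging interval.  The sample-by-sample
beam voltages are gone, which is the loss of coherent processing
discussed in this thread.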
>
> On Thu, 4 Feb 2010, Brian Jeffs wrote:
>
>> Rick,
>>
>> See below:
>>
>>>
>>> Is your assumed beamformer architecture voltage sums or
>>> post-correlation?  In other words, are the beams formed by summing
>>> complex weighted voltages from the array elements or by combining
>>> cross products of all of the elements?  John's reference at
>>> http://arxiv.org/abs/0912.0380v1 shows a voltage-sum beamformer.  The
>>> post-correlation beamformer may use fewer processing resources, but it
>>> precludes further coherent signal processing of each beam.
>>
>> Our plans are based on a correlator/beamformer developed by Jason Manley
>> for the ATA and some other users (the pocket packetized correlator).  He
>> recently added simultaneous beamforming to the existing correlator
>> gateware, so they run concurrently.  In our application the only time
>> this is required is during interference mitigation.  Normally we
>> correlate during calibration and beamform otherwise.
>>
>> His design is a voltage sum real-time beamformer.  At this point he does
>> not compute as many simultaneous beams as we need to, so I think we will
>> have to exploit the computational trade-off to do either beamforming or
>> correlation but not both, or it will not fit in the FPGA.
>> Post-correlation beamforming is really quite trivial, and has a low
>> computational burden, so that could be added to the correlator and run
>> simultaneously.  I believe that when we need simultaneous voltage sum
>> beamforming and correlations (as when doing interference mitigation) we
>> will have to reduce the effective bandwidth.  We really cannot take
>> Jason's existing code and plug it right in for our application, but it
>> will serve as a very good template.  That is why we have Jonathan out at
>> UC Berkeley for 6 months, so he can learn the ropes and then work on our
>> correlator/beamformer.
>>
>>
>>> Very roughly, the science requirements for a beamformer fall into two
>>> camps, which may be operational definitions of first science and
>>> Cadillac/dream machine: 1. spectral line surveys with bandwidths in the
>>> 3-100 MHz range and very modest time resolution and 2. pulsar and fast
>>> transient source surveys with bandwidths on the order of 500+ MHz and
>>> <=50 microsecond time resolution.  The 2001 science case says pulsar
>>> work requires bandwidths of 200+ MHz, but the bar has gone higher in
>>> the meantime.  One can always think of something to do with a wide
>>> bandwidth, low time resolution beamformer, but it would be a stretch.
>>> The GBT sensitivity isn't high enough to see HI redshifted to
>>> frequencies below, say, 1350 MHz in a very wide-area survey.  Hence,
>>> building a beamformer with wide bandwidth but low time resolution may
>>> not be the optimum use of resources.  Also, the 2001 science case
>>> assumes 7 formed beams, but the minimum now would be, maybe, 19 and
>>> growing as the competition heats up.
>>>
>>
>> We are operating under the assumption of at least 19, and probably more
>> than 40 formed beams.  If we only use the correlator for calibration,
>> then we should be able to achieve both relatively wide bandwidth (250
>> MHz) and high time resolution (we will get a beamformer output per time
>> sample, not just per STI interval).  Dan and Jason feel that based on
>> their experience with existing codes this is achievable on the 40 ROACH
>> system we sketched out, but we will have to wait and see.  If we run
>> into bottlenecks we will have to reduce either bandwidth or the number
>> of formed beams.
>>
>> One issue I am not clear on yet is what we do with the data streams for
>> 40+ voltage sum beams over 500+ frequency channels.  How do we get it
>> off the CASPER array, and what will be done with it?  For 8 bit complex
>> samples at the beamformer outputs you would need the equivalent of forty
>> 10 Gbit ethernet links to some other big processor, such as a transient
>> detector.  If this is unreasonable then either the number of bits per
>> sample, bandwidth, or number of beams will need to be reduced.
>> Alternatively, it is not hard to add a spectrometer to the beamformer
>> outputs inside the correlator/beamformer, and this provides a huge data
>> rate reduction.  But how do we handle data for transient observations
>> where fine time resolution is critical?
>>
>> Brian
>>
>>
>>
>>
>>> Counter-thoughts?
>>>
>>> Rick
>>>
>>> On Wed, 3 Feb 2010, Brian Jeffs wrote:
>>>
>>> > Rick,
>>> >
>>> > We have a rough architecture and cost estimate for a
>>> > correlator/beamformer capable of 40 channels (19 dual pol antennas
>>> > plus reference or RFI auxiliary) over 250 MHz BW.  We worked this out
>>> > with CASPER head Dan Werthimer and his crack correlator/beamformer
>>> > developer Jason Manley.  It will require 20 ROACH boards, 20 iADC
>>> > boards, 1 20-port 10 Gbit ethernet switch, and some lesser associated
>>> > parts.
>>> >
>>> > Our recent ROACH order was $2750 each, iADC: $1300 each, enclosures:
>>> > $750 each, Xilinx chip: free or $3000, ethernet switch: $12000.
>>> >
>>> > You can use your existing data acquisition array of PCs as the
>>> > stream-to-disk farm, but will need to buy 10 Gbit cards and hardware
>>> > RAID controllers.
>>> >
>>> > The total (which will be a bit low) assuming no free Xilinx parts and
>>> > not including  is:  $168,000.
>>> >
>>> > Of course this does not include development manpower costs.
>>> >
>>> > Brian
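
The quoted total can be reproduced from the unit prices above with a few
lines of arithmetic.  This is only a sketch.  The quantities of 20
enclosures and 20 Xilinx chips (one per ROACH board) are my assumption,
since the message gives unit prices for them but not counts.

```python
# (quantity, unit price in USD), prices as quoted in the message above.
parts = {
    "ROACH board": (20, 2750),
    "iADC board": (20, 1300),
    "enclosure": (20, 750),        # assumed: one per ROACH board
    "Xilinx chip": (20, 3000),     # assumed: one per ROACH board, none free
    "10 Gbit ethernet switch": (1, 12000),
}

total = sum(qty * price for qty, price in parts.values())

if __name__ == "__main__":
    for name, (qty, price) in parts.items():
        print(f"{qty:3d} x {name:<24s} ${qty * price:>7,}")
    print(f"total: ${total:,}")
```

With those assumed quantities the sum matches the $168,000 figure, and
setting the Xilinx price to zero shows the saving if the chips are
donated.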
>>> >
>>> >
>>> > On Feb 3, 2010, at 3:05 PM, Rick Fisher wrote:
>>> >
>>> > > This is an incomplete question, but maybe we can beat it into
>>> > > something answerable:  Do we know enough about existing
>>> > > applications on CASPER hardware to make a conservative estimate of
>>> > > what it would cost to build a PAF beamformer with a given set of
>>> > > specs?  I'm looking for at least two estimates.  What is a
>>> > > realistic set of specs for the first science PAF beamformer, and
>>> > > what would the dream machine that would make a big scientific
>>> > > impact cost?  You're welcome to define the specs that go with
>>> > > either of these two questions or I'll start defining them by
>>> > > thinking "out loud".  The first science beamformer will guide the
>>> > > initial system design, and the dream machine will help get a handle
>>> > > on longer range expectations.
>>> > >
>>> > > Cheers,
>>> > > Rick
>>> > >
>>> > > _______________________________________________
>>> > > Pafgbt mailing list
>>> > > Pafgbt at listmgr.cv.nrao.edu
>>> > > http://listmgr.cv.nrao.edu/mailman/listinfo/pafgbt