[Gb-ccb] epp data rates

Martin Shepherd mcs at astro.caltech.edu
Fri Mar 12 21:32:02 EST 2004


On Wed, 10 Mar 2004, John Ford wrote:
> Data rates for the EPP are quoted at "500K Bytes to 2 MB per second".
> It seems this is a viable approach, as we need only 260K Bytes per
> second streaming data bandwidth.

After 3 days of trying to check out this possibility, I finally have
some results. I breadboarded a minimal EPP handshaking circuit, which
consists of a single NOT gate between the data strobe and wait lines
of the EPP parallel port, and put together a custom parallel port
cable to hook this up. I then wrote a short program that uses the
Linux parport driver via the ppdev driver to write a 4MB buffer to
the parallel port, and monitored the resulting outputs on an
oscilloscope. Unfortunately, my laptop turned out not to support EPP
mode, and it took me two days of fiddling to figure this out. The
confusing thing was that EPP mode writes seemed to work, albeit at
only 182KBytes/s, and I could see the data coming out correctly on the
scope. It turned out that the Linux driver, on finding that my laptop
didn't support EPP mode, was emulating EPP mode in software via the
standard SPP parallel port mode.
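
For readers who want to repeat the test, the following is a minimal
sketch of writing a buffer in EPP mode through ppdev. It is not the
program used above. The function name epp_write_buffer and the device
path passed to it are assumptions; the ioctls (PPCLAIM, PPSETMODE,
PPRELEASE) and IEEE1284_MODE_EPP come from the Linux ppdev and parport
headers. As described above, a successful write does not prove that
the hardware supports EPP, since the driver may fall back to software
emulation.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/parport.h>   /* IEEE1284_MODE_EPP */
#include <linux/ppdev.h>     /* PPCLAIM, PPSETMODE, PPRELEASE */

/* Write len bytes from buf to the given ppdev device in EPP mode.
 * Returns the number of bytes written, or -1 on any failure.
 * (Hypothetical helper, for illustration only.) */
ssize_t epp_write_buffer(const char *device, const unsigned char *buf,
                         size_t len)
{
    int fd = open(device, O_RDWR);
    if (fd < 0)
        return -1;

    /* The port must be claimed before any other operation. */
    if (ioctl(fd, PPCLAIM) < 0) {
        close(fd);
        return -1;
    }

    /* Select the protocol that write() will use on this descriptor. */
    int mode = IEEE1284_MODE_EPP;
    ssize_t total = -1;
    if (ioctl(fd, PPSETMODE, &mode) == 0) {
        total = 0;
        while ((size_t)total < len) {
            ssize_t n = write(fd, buf + total, len - (size_t)total);
            if (n <= 0) {        /* I/O error, e.g. an EPP timeout */
                total = -1;
                break;
            }
            total += n;
        }
    }

    ioctl(fd, PPRELEASE);
    close(fd);
    return total;
}
```

A caller would typically pass something like "/dev/parport0" (the
actual device name depends on the system) and time the call to obtain
a transfer rate.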

Today I ran the same setup on Tim's 1.4GHz Pentium 4 desktop PC, which
does have EPP support, but the results were depressingly bad.

As a baseline, I first forced the driver to emulate EPP mode in
software, to see how fast the non-EPP parallel port mode could
go. This resulted in a transfer rate of 146KBytes/s, which is
strangely slower than on my 750MHz PII laptop, and much too slow for
our needs. Again, this took 99% CPU time, as one might expect for
software emulation.

I then tried the basic 1-byte at a time EPP mode. In this mode, I
issued a single write() to the Linux parallel port driver, which then
looped using the Intel outb() instruction, to send all of the bytes
one at a time. Each of the outb() instructions was followed by an
inb() instruction to check the EPP timeout bit. If a timeout had been
detected, the driver would have signalled an I/O error to my program
(this didn't happen).  This again took 99% CPU time, but the transfer
rate was almost doubled, at 286KBytes/s. This is right at the
borderline of what we need, but the 99% CPU utilization would stop us
from running anything else on the box without slowing the transfer
rate down.  I'm guessing that if I wrote my own driver, which only
checked the timeout bit at the end of the whole transfer, instead of
after sending each byte, the speed could be doubled, and we would see
roughly the 500KBytes/s that is the low end of the quoted range of
maximum rates. Unfortunately, since my laptop doesn't support EPP
mode, and I don't have the root permission needed to recompile the
kernel and install a custom driver on the machines in the department,
I can't try this. Even if this did get us twice the speed, however,
one integration's worth of data would still take the CPU to 99%
utilization for half of every integration period, so the continuous
load would average 50%, which seems uncomfortably high, and it might
be even higher on the slower CPU of the SBC.
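
The reasoning behind the "could be doubled" guess can be sketched by
counting port accesses. The code below is a model only: struct
sim_port, sim_outb() and sim_inb() are hypothetical stand-ins that
count accesses instead of touching hardware, and the model assumes
that port accesses dominate the transfer time. The timeout flag is
taken to be bit 0 of the status register.

```c
#include <stddef.h>
#include <stdint.h>

#define EPP_TIMEOUT_BIT 0x01   /* bit 0 of the status register */

/* Hypothetical stand-in for the port hardware: it only counts accesses. */
struct sim_port {
    unsigned long writes;   /* data-register writes (outb equivalents) */
    unsigned long reads;    /* status-register reads (inb equivalents) */
    uint8_t status;         /* simulated status register contents */
};

static void sim_outb(struct sim_port *p, uint8_t value)
{
    (void)value;            /* the model discards the data */
    p->writes++;
}

static uint8_t sim_inb(struct sim_port *p)
{
    p->reads++;
    return p->status;
}

/* Check the timeout bit after every byte. Returns the bytes sent. */
size_t send_check_each(struct sim_port *p, const uint8_t *buf, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        sim_outb(p, buf[sent]);
        if (sim_inb(p) & EPP_TIMEOUT_BIT)
            break;
        sent++;
    }
    return sent;
}

/* Check the timeout bit once, after the whole buffer.
 * Returns len on success, or 0 if the timeout bit is set. */
size_t send_check_once(struct sim_port *p, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        sim_outb(p, buf[i]);
    return (sim_inb(p) & EPP_TIMEOUT_BIT) ? 0 : len;
}
```

For n bytes, the first variant makes 2n port accesses and the second
makes n+1, which is where a factor of roughly two would come from if
each access costs about the same.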

Next I tried an EPP driver mode which uses the Intel outsb() or
outsl() instructions to output either multiple bytes or multiple
4-byte double-words in a single assembly-code instruction. This takes
advantage of two optimizations.

  1. Some EPP mode interfaces provide an extra 3 registers which allow
     one to do a 32-bit write and leave the EPP hardware to stream
     these over the parallel port in 8-bit chunks. This is the mode
     that the outsl() instruction takes advantage of.

  2. The loop over double-words or bytes, depending on which of
     outsl() or outsb() is used, is performed in the CPU, rather than
     in a C for loop in the driver, and is thus faster.
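
The first optimization can be illustrated by comparing how many I/O
writes the CPU issues with how many 8-bit cycles appear on the cable.
This is a sketch under stated assumptions: struct epp_plan and
plan_transfer() are hypothetical names, and the rule that 32-bit
writes are only used when the buffer address and length are both
multiples of 4 is an assumption of the sketch, not something stated
above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of how a transfer would be carried out. */
struct epp_plan {
    size_t cpu_io_writes;   /* I/O writes issued by the CPU */
    size_t epp_cycles;      /* 8-bit handshake cycles on the cable */
    int uses_dwords;        /* 1 if 32-bit writes are used, else 0 */
};

/* have_32bit_regs says whether the interface provides the extra
 * registers that accept a 32-bit write. */
struct epp_plan plan_transfer(const void *buf, size_t len, int have_32bit_regs)
{
    struct epp_plan plan;

    /* Assumed rule: address and length must both be multiples of 4. */
    int aligned = (((uintptr_t)buf | len) & 0x3) == 0;

    plan.uses_dwords = (have_32bit_regs && aligned && len > 0) ? 1 : 0;
    plan.cpu_io_writes = plan.uses_dwords ? len / 4 : len;

    /* The cable is 8 bits wide, so every byte still needs one cycle. */
    plan.epp_cycles = len;
    return plan;
}
```

The number of handshake cycles on the cable is the same in both
cases; what changes is that the CPU issues a quarter as many writes
and the EPP hardware does the splitting.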

In addition, the driver tests the EPP timeout bit only once, after all
bytes have been sent, instead of after each byte, which should also
speed things up. The measured result was 1MByte/s, again with 99% CPU
load. For the CCB we need a transfer rate of 256KBytes/s. Thus, based
on the measured timing, this would take 25% of the CPU time on a
1.4GHz processor, and maybe 50% on an 800MHz processor. This assumes
that the SBC's EPP mode interface provides the 4-bytes at a time
optimization mentioned above.
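
The load estimate above is a simple ratio, sketched below. The
function names are hypothetical. The sketch treats the measured 99%
load as 100%, and the clock-scaling helper assumes that the achievable
rate is proportional to CPU clock speed, which is a loud assumption:
the SPP emulation numbers above (faster on the 750MHz laptop than on
the 1.4GHz desktop) suggest it may not hold.

```c
/* Fraction of CPU time needed to sustain required_kbytes_s, given that
 * the port delivers measured_kbytes_s while using the whole CPU.
 * Returns -1.0 if the measured rate is not positive. */
double cpu_load_fraction(double required_kbytes_s, double measured_kbytes_s)
{
    if (measured_kbytes_s <= 0.0)
        return -1.0;
    return required_kbytes_s / measured_kbytes_s;
}

/* ASSUMPTION: transfer rate scales linearly with CPU clock speed. */
double scale_rate_by_clock(double measured_kbytes_s, double measured_mhz,
                           double target_mhz)
{
    return measured_kbytes_s * target_mhz / measured_mhz;
}
```

With 256KBytes/s required and 1MByte/s (1024KBytes/s) measured, the
ratio is 0.25. Scaling the measured rate from 1400MHz to 800MHz under
the linear assumption gives a load of about 0.44, in line with the
"maybe 50%" estimate.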

I'm puzzled about why I couldn't get the 2MByte/s maximum rate that is
claimed for EPP mode. Any ideas to try would be appreciated, although
driver level changes are currently untestable, since I don't have a
test PC with EPP mode to experiment with. Could my handshaking circuit
be at fault? It is just a NOT gate whose input is the EPP data-strobe
line, and whose output is the EPP WAIT line. This appears to comply
with the EPP handshaking cycle, and I can't see how I could make this
any faster.
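
As a sanity check on the circuit logic, the handshake can be stepped
through in software. This is a logic-level sketch only, based on the
usual description of an EPP data write cycle (host asserts the strobe
low while WAIT is low, peripheral raises WAIT, host releases the
strobe, peripheral lowers WAIT again). It ignores gate propagation
delay and all other timing, so it says nothing about the achievable
rate. The function names are hypothetical.

```c
/* Signal levels: 1 = high, 0 = low. The strobe is active low. */
static int not_gate(int input)
{
    return !input;
}

/* Step through n EPP data-write cycles with WAIT = NOT(data strobe).
 * Returns the number of cycles that completed the full handshake. */
unsigned run_write_cycles(unsigned n)
{
    int strobe = 1;                 /* idle: strobe released (high) */
    int wait = not_gate(strobe);    /* so WAIT is low: peripheral ready */
    unsigned completed = 0;

    for (unsigned i = 0; i < n; i++) {
        if (wait != 0)
            break;                  /* host may only start when WAIT is low */

        strobe = 0;                 /* host asserts the data strobe */
        wait = not_gate(strobe);
        if (wait != 1)
            break;                  /* peripheral must acknowledge (WAIT high) */

        strobe = 1;                 /* host releases the data strobe */
        wait = not_gate(strobe);
        if (wait != 0)
            break;                  /* peripheral must drop WAIT again */

        completed++;
    }
    return completed;
}
```

Under these assumptions every cycle completes, which agrees with the
statement above that the NOT gate complies with the handshaking cycle.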

The reason that I did this test was that I was worried that EPP mode,
at the rate that we need, might take too much CPU time for streaming
integrated data back to the CPU. It looks as though my worry was
justified. My hope in doing these tests was that, even without the
non-universal 4-byte at a time mode enabled, I would be able to
achieve data rates that exceeded our requirements, and that the CPU
load would be insignificant. Since this hasn't turned out to be the
case, I am reluctant to drop the USB1.1 interface just yet.

I now also have two USB 1.1 test modules, and I plan to test them by
cross-connecting their read and write lines, such that data are
streamed from one USB port to the other. Thus, by writing to one USB
port, looping the data back to the other USB port and reading it
there, I can check both that the data are transferred correctly and
also measure the achieved transfer rate.
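
The two checks in such a loopback test can be sketched as follows. The
USB I/O itself is deliberately left out, since the programming
interface of the test modules is not described here; the helper names
are hypothetical, and the sent and received buffers are assumed to
have been filled in by whatever code drives the two ports.

```c
#include <stddef.h>

/* Returns the index of the first byte that differs between the sent
 * and received buffers, or -1 if all len bytes match. */
long first_mismatch(const unsigned char *sent,
                    const unsigned char *received, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (sent[i] != received[i])
            return (long)i;
    }
    return -1;
}

/* Achieved rate in KBytes/s, taking 1 KByte = 1024 bytes.
 * Returns -1.0 if the elapsed time is not positive. */
double rate_kbytes_per_s(size_t bytes, double elapsed_s)
{
    if (elapsed_s <= 0.0)
        return -1.0;
    return (double)bytes / 1024.0 / elapsed_s;
}
```

Reporting the index of the first mismatch, rather than just pass or
fail, makes it easier to tell a dropped byte from a corrupted one.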

Martin


