[fitsbits] Processor cache friendly processing of RGB FITS image.

Seaman, Robert Lewis - (rseaman) rseaman at email.arizona.edu
Thu Feb 13 08:51:00 EST 2020


Howdy,

You don't specify the host OS or architecture and not all caches are created equal. There's nothing particularly special in an image processing algorithm needing to traverse multiple arrays at the same time even if not operating in RGB space. Generally even demanding algorithms won't require fretting overly much about the cache. By "trashes the processor cache", I guess thrashing is the issue, not some host-OS dependent exception being thrown or some such?

As a general principle, a large data structure should be traversed in blocks rather than one item at a time. For images this usually means reading a line at a time into an array, or in your case three arrays. Your example corresponds to an image a few thousand pixels on a side, so each line is just a few kilobytes per color, three pieces that fit comfortably in the cache as C arrays. I don't recall ever discussing RGB decomposition in conjunction with FITS tile compression, but a generalized rectangular tile would be another natural blocking. 2-D arrays in C are usually just multiple indices on 1-D arrays, so this is mostly just arguing for chunking the algorithm into larger blocks / arrays.
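A minimal sketch of the line-at-a-time idea, assuming 8-bit data already loaded into one buffer with the red, green and blue planes stored back to back. The names RgbPixel and planar_to_interleaved are hypothetical and merely stand in for whatever your internal representation looks like. Any datatype switch would go outside the pixel loop, once per image, so the inner loop is just three short sequential reads:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interleaved pixel; a stand-in for the internal representation. */
typedef struct {
    double r, g, b;
} RgbPixel;

/*
 * Convert a planar 8-bit RGB buffer (all red, then all green, then all blue)
 * into interleaved pixels, one image row at a time.
 * Assumption: `planar` holds 3 * width * height bytes and `out` has room
 * for width * height pixels.
 */
void planar_to_interleaved(const uint8_t *planar, size_t width,
                           size_t height, RgbPixel *out)
{
    assert(planar != NULL && out != NULL);

    const size_t plane_size = width * height;
    const uint8_t *red = planar;                    /* plane 0 */
    const uint8_t *green = planar + plane_size;     /* plane 1 */
    const uint8_t *blue = planar + 2 * plane_size;  /* plane 2 */

    for (size_t row = 0; row < height; ++row) {
        /* One row from each plane: three runs of `width` bytes each. */
        const uint8_t *r = red + row * width;
        const uint8_t *g = green + row * width;
        const uint8_t *b = blue + row * width;
        RgbPixel *dst = out + row * width;

        for (size_t col = 0; col < width; ++col) {
            dst[col].r = r[col];
            dst[col].g = g[col];
            dst[col].b = b[col];
        }
    }
}
```

Each of the three reads advances sequentially through its own plane, so a cache line fetched for one pixel is reused by its neighbors in the same row. Whether this measurably beats the original loop depends on the host, so it is worth timing both.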

The main reason I'm replying is that last time I checked FITS had no standard RGB representation. Scanning just now through the current standard and registered conventions still doesn't show anything relevant:

	https://fits.gsfc.nasa.gov/fits_standard.html
	https://fits.gsfc.nasa.gov/fits_registry.html

If you are coding to some local convention, perhaps in a commercial package, others here would likely also welcome pointers. Your description appears to correspond to placing the colors in separate planes of a data cube, one logical choice for organizing RGB. I might prefer placing the decomposed RGB colors in separate FITS extensions after the grayscale image. Others might pack 8-bit colors (and maybe an alpha channel) into individual pixels as long integers or some such. Once you populate your internal representation, won't you have similar issues? Which is to say, it isn't clear this is a FITS issue.
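For the packed alternative, a small sketch of putting four 8-bit channels into one 32-bit integer. The 0xAARRGGBB layout and the function names are assumptions chosen for illustration, not a FITS convention:

```c
#include <stdint.h>

/* Pack four 8-bit channels into one 32-bit word laid out as 0xAARRGGBB.
 * The channel order is an arbitrary choice for this sketch. */
uint32_t pack_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8) | (uint32_t)b;
}

/* Extract the individual channels again. */
uint8_t unpack_alpha(uint32_t p) { return (uint8_t)(p >> 24); }
uint8_t unpack_red(uint32_t p)   { return (uint8_t)(p >> 16); }
uint8_t unpack_green(uint32_t p) { return (uint8_t)(p >> 8); }
uint8_t unpack_blue(uint32_t p)  { return (uint8_t)p; }
```

With this layout all the channels of a pixel sit in the same word, so a single read fetches the whole tuple, at the cost of a format that other FITS readers will see only as an integer image.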

Rob Seaman
University of Arizona
--

On 2/13/20, 5:15 AM, "fitsbits on behalf of David C. Partridge via fitsbits" <fitsbits-bounces at listmgr.nrao.edu on behalf of fitsbits at listmgr.nrao.edu> wrote:

    Assume that I load the entire image into storage - I have plenty.
    
    If I want to access the RGB tuple for location 1,1, there's a problem:
    assuming (e.g.) a 30MB image, the Red pixel will be at location 0 in the
    buffer, the Green at +10MB, and the Blue at +20MB.
    
    So if my C code (massively simplified) looks like:
    
    
    
    		double fRed = 0.0, fGreen = 0.0, fBlue = 0.0;
    		unsigned long greenOffset = m_lWidth * m_lHeight;	// index into buffer of the green image
    		unsigned long blueOffset = 2 * greenOffset;	// index into buffer of the blue image
    
    		for (long row = 0; row < m_lHeight; ++row)
    		{
    			for (long col = 0; col < m_lWidth; ++col)
    			{
    				long index = col + (row * m_lWidth);	// index into the image for this plane
    
    				:
    				: omitted
    				:
    
    				switch (datatype)
    				{
    				case TBYTE:
    					fRed = byteBuff[index];
    					fGreen = byteBuff[greenOffset + index];
    					fBlue = byteBuff[blueOffset + index];
    					break;
    
    which works fine but totally trashes the processor cache on each of the
    three assignments.
    
    I do need all three pixel values at the same time to populate my own
    internal image representation.
    
    Can anyone suggest a better way?
    
    Many thanks
    David
    
    			
    
    
    _______________________________________________
    fitsbits mailing list
    fitsbits at listmgr.nrao.edu
    https://listmgr.nrao.edu/mailman/listinfo/fitsbits
    



