-------------------------------------------------------------------------------
                Matrox Imaging Library (7.0) mmx.txt Readme File
                                  June 29, 2001
    Copyright © 2000 by Matrox Electronic Systems Ltd. All rights reserved.
-------------------------------------------------------------------------------


The following file contains a list of all the functions that have been 
optimized with MMX code. A supplementary section also suggests the data
alignment to obtain the best performance with MMX when a buffer is created
with the MbufCreate2d()/MbufCreateColor() function. Another section indicates
how MIL can enable/disable the use of MMX optimization. 


Contents

1. Image processing commands.
2. Buffer management commands.
3. Measurements commands.
4. Pattern matching commands.
5. Blob analysis commands.
6. Graphics command.
7. Data alignment. 


-------------------------------------------------------------------------------
Symbols used in the file
-------------------------------------------------------------------------------

Buffers:    Dst   : Destination
            Src   : Source
            Cnd   : Condition

Data type   UChar : unsigned char
            Char  : signed char
            UShort: unsigned short
            Short : signed short
            ULong : unsigned long
            Long  : signed long
            Float : float
            Bin   : binary

References to buffer bit depth and sign apply to all buffers except
floating-point buffers.


*******************************************************************************
1. Image processing commands.
*******************************************************************************

1.1   MimArith().

      1.1.1 Optimized versions:
      
         Dst     Src1     Src2          
         ------  ------   ------       
         UChar   UChar    UChar
         Char    Char     Char
         UChar   UShort   UChar
         Char    Short    Char
         UChar   UChar    UShort 
         Char    Char     Short
         UShort  UChar    UChar
         Short   Char     Char
         UShort  UShort   UChar
         Short   Short    Char
         UShort  UChar    UShort
         Short   Char     Short
         UShort  UShort   UShort
         Short   Short    Short
         ULong   ULong    ULong
         Long    Long     Long
       
      1.1.2 Notes.
      
         - Saturation is not optimized for 32-bit src buffers.

         - M_MULT / M_MULT_CONST is not optimized for 32-bit src buffers.

         - Saturation is not optimized for unsigned mixes with an 8-bit destination.
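
         To illustrate what saturation changes in the operations listed in
         this section, the following scalar C sketch compares a wrapping
         8-bit addition with a saturating one. It is not MIL code and not
         the MMX implementation; the function names add_wrap_u8() and
         add_sat_u8() are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Wrapping 8-bit add: the result is taken modulo 256. */
uint8_t add_wrap_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Saturating 8-bit add: the result is clamped to 255 instead of wrapping. */
uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned int sum = (unsigned int)a + (unsigned int)b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}
```

         For example, adding 200 and 100 gives 44 with the wrapping version
         and 255 with the saturating version.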
         
      1.1.3 Options optimized.
      
         1.1.3.1 Operations where saturation is optimized.
         
                 M_ADD_CONST     M_ADD
                 M_SUB_CONST     M_SUB
                 M_CONST_SUB
                  
         1.1.3.2 Operations without saturation (optimized).
         
                 M_AND           M_AND_CONST       
                 M_OR            M_OR_CONST       
                 M_XOR           M_XOR_CONST       
                 M_NAND          M_NAND_CONST       
                 M_NOR           M_NOR_CONST       
                 M_XNOR          M_XNOR_CONST       
                 M_MAX           M_MAX_CONST       
                 M_MIN           M_MIN_CONST       
                 M_MULT          M_MULT_CONST
                 M_ABS           M_SUB_ABS
                 M_NOT           M_NEG
                 
          1.1.3.3 Operations not optimized.
          
                 M_PASS          M_CONST_PASS
                 M_DIV           M_DIV+M_FIXED_POINT
                 M_DIV_CONST     M_DIV_CONST+M_FIXED_POINT
                 M_CONST_DIV     M_CONST_DIV+M_FIXED_POINT
  
          1.1.3.4 Particular cases optimized.
          
           M_MULT+M_SATURATION               8,8->8       SIGNED 
           M_MULT_CONST+M_SATURATION  8,Constant->8       SIGNED
           
          1.1.3.5 Mixed sign cases optimized.
          
          - M_ABS: All mixes are optimized (size & sign).
    
          - Logical (M_AND, M_NOR, etc.) operations on mixed types are
            optimized only when the size of all buffers is the same.
 
          - M_NEG operation is optimized on mixed types only when the size 
            of all buffers is the same.


1.2   MimArithMultiple().

      1.2.1 M_OFFSET_GAIN.
      
            Optimized versions:
            
            Buffers UChar
            Buffers UChar + M_SATURATION 
            Buffers Char
            Buffers Char + M_SATURATION
            Buffers Short
            Buffers Short + M_SATURATION
            
            Buffers UShort + M_SATURATION
            Buffers UShort
            
            Dst     Src1    Src2     Src3        
            ------  ------  ------   ------  
            UChar   UChar   Short    Short + M_SATURATION
            UChar   UChar   Char     UChar + M_SATURATION
            UChar   UChar   Char     Char + M_SATURATION
            
            NOTE: For in-place operations, SizeX * data size in bytes
                  must be a multiple of 8.
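
            The in-place condition in the note above can be checked before
            calling the function. The sketch below is a hypothetical helper
            (inplace_row_ok() is not a MIL function); it assumes the data
            size is given in bytes per pixel (1 for 8-bit, 2 for 16-bit).

```c
#include <stddef.h>

/* Returns 1 when SizeX * (bytes per pixel) is a multiple of 8, else 0.
   Hypothetical helper for illustration; not part of MIL. */
int inplace_row_ok(size_t size_x, size_t bytes_per_pixel)
{
    return (size_x * bytes_per_pixel) % 8u == 0u;
}
```

            With this helper, a 640-pixel-wide 8-bit buffer qualifies, while
            a 642-pixel-wide 16-bit buffer (1284 bytes per row) does not.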
            
      
      1.2.2 M_WEIGHTED_AVERAGE.
      
            Optimized versions:
            
            Buffers UChar
            Buffers UChar + M_SATURATION 
            Buffers Char
            Buffers Char + M_SATURATION
            Buffers Short
            Buffers Short + M_SATURATION
            Buffers UShort               
            Buffers UShort + M_SATURATION 
            
      1.2.3 M_MULTIPLY_ACCUMULATE_1.
      
            Optimized versions:
            
            Buffers UChar
            Buffers Char
            Buffers Char + M_SATURATION
            Buffers Short
            Buffers Short + M_SATURATION
            
            Buffers UChar + M_SATURATION (*)
            Buffers UShort + M_SATURATION (*)
            Buffers UShort (*)
            
            (*) For these versions to be optimized by MMX, the values in 
                the source buffer cannot exceed the maximum value 
                of a corresponding signed buffer (ex: 127 for an 8-bit buffer).

            NOTE: For in-place operations, SizeX * data size in bytes
                  must be a multiple of 8.
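
            The (*) condition above restricts the values stored in unsigned
            source buffers. A minimal sketch of how an application could
            verify its own data follows; the helpers fits_signed_u8() and
            fits_signed_u16() are hypothetical names, and whether MIL
            performs an equivalent check internally is not stated here.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every value in an unsigned 8-bit buffer is <= 127. */
int fits_signed_u8(const uint8_t *data, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (data[i] > INT8_MAX)
            return 0;
    }
    return 1;
}

/* Returns 1 if every value in an unsigned 16-bit buffer is <= 32767. */
int fits_signed_u16(const uint16_t *data, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (data[i] > INT16_MAX)
            return 0;
    }
    return 1;
}
```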
                  
      1.2.4 M_MULTIPLY_ACCUMULATE_2.

            Optimized versions:
            
            Buffers UChar
            Buffers Char
            Buffers Char + M_SATURATION
            Buffers Short
            Buffers Short + M_SATURATION
            
            Buffers UChar + M_SATURATION (*)
            Buffers UShort + M_SATURATION (*)
            Buffers UShort (*)
            
            (*) For these versions to be optimized by MMX, the values in 
                the source buffer cannot exceed the maximum value 
                of a corresponding signed buffer (ex: 127 for an 8-bit buffer).

            NOTE: For in-place operations, SizeX * data size in bytes
                  must be a multiple of 8.
                
                  
1.3   MimBinarize().

      Only the following versions of the function are optimized with MMX:
      
      Dst     Src
      ------  ------
      UChar   UChar
      UChar   Char
      Char    UChar
      Char    Char

      UShort  UShort
      UShort  Short
      Short   UShort
      Short   Short

      ULong   ULong
      ULong   Long
      Long    ULong
      Long    Long

      UShort  UChar
      UShort  Char
      Short   UChar
      Short   Char

      UChar   UShort
      UChar   Short
      Char    UShort
      Char    Short

      Binary   UChar
      Binary   Char
      Binary   UShort
      Binary   Short

      
1.4   MimClip().

      All integer versions for which source and destination are of the same size are
      optimized with MMX.

      Dst     Src
      ------  ------
      UChar   UChar 
      Char    UChar
      UChar   Char 
      Char    Char

      UShort  UShort 
      Short   UShort
      UShort  Short 
      Short   Short
      
      ULong   ULong 
      Long    ULong
      ULong   Long 
      Long    Long

      
1.5   MimClose().
      
      M_GRAYSCALE operation:
         All the buffer depths and sign combinations are optimized with MMX.
      
      M_BINARY operation:
         Not optimized with MMX.
      

1.6   MimConnectMap().      
         Not optimized with MMX.


1.7   MimConvert().

      Only the following versions of the function are optimized with MMX:
      
      Dst     Src
      ------  ------
      UChar   UChar 
      UChar   Char 
      Char    UChar 
      Char    Char 

      Src                  Dst                   Restriction
      ------               ------                -----------
      M_RGB24+M_PLANAR     HLS (8-bit)           DstSizeX > 7
      M_RGB24+M_PLANAR     H (8-bit)             DstSizeX > 7
      M_RGB24+M_PLANAR     L (8-bit)             DstSizeX > 7
      HLS                  M_RGB24+M_PLANAR      DstSizeX > 7 

      For other conversions, see MbufCopy() and MbufBayer().
      

1.8   MimConvolve().

      1.8.1 Custom kernels.

         Only the following versions of the function are optimized with MMX:
         
         Dst     Src     Kernel          
         ------  ------  ------
         Char    Char    Char          (*) (128, -128, 128, -128)
         Char    Char    UChar         (*) (256,     , 256,     )
         Char    UChar   Char          (*) (128, -128, 128, -128)
         Char    UChar   UChar         (*) (256,     , 128,     )
         UChar   Char    Char          (*) (128, -128, 128, -128)
         UChar   Char    UChar         (*) (256,     , 256,     )
         UChar   UChar   Char          (*) (128, -128, 128, -128)
         UChar   UChar   UChar         (*) (256,     , 128,     )

         Short   Char    Char          (**) (128, -128, 128, -128)
         Short   Char    UChar         (**) (256,     , 256,     )
         Short   UChar   Char          (**) (128, -128, 128, -128)
         Short   UChar   UChar         (**) (256,     , 128,     )
         UShort  Char    Char          (**) (128, -128, 128, -128)
         UShort  Char    UChar         (**) (256,     , 256,     )
         UShort  UChar   Char          (**) (128, -128, 128, -128)
         UShort  UChar   UChar         (**) (256,     , 128,     )

         Short   Char    Short         (**) (128, -128, 128, -128)
         Short   Char    UShort        (**) (256,     , 256,     )
         Short   UChar   Short         (**) (128, -128, 128, -128)
         Short   UChar   UShort        (**) (256,     , 128,     )
         UShort  Char    Short         (**) (128, -128, 128, -128)
         UShort  Char    UShort        (**) (256,     , 256,     )
         UShort  UChar   Short         (**) (128, -128, 128, -128)
         UShort  UChar   UShort        (**) (256,     , 128,     )
         
         Short   Short   Char          (***) (32768, -32768, 32768, -32768)
         Short   Short   UChar         (***) (65536,       , 65536,       )
         Short   UShort  Char          (***) (32768, -32768, 32768, -32768)
         Short   UShort  UChar         (***) (65536,       , 32768,       )
         UShort  Short   Char          (***) (32768, -32768, 32768, -32768)
         UShort  Short   UChar         (***) (65536,       , 65536,       )
         UShort  UShort  Char          (***) (32768, -32768, 32768, -32768)
         UShort  UShort  UChar         (***) (65536,       , 32768,       )

         Short   Short   Short         (***) (32768, -32768, 32768, -32768)
         Short   Short   UShort        (***) (65536,       , 65536,       )
         Short   UShort  Short         (***) (32768, -32768, 32768, -32768)
         Short   UShort  UShort        (***) (65536,       , 32768,       )
         UShort  Short   Short         (***) (32768, -32768, 32768, -32768)
         UShort  Short   UShort        (***) (65536,       , 65536,       )
         UShort  UShort  Short         (***) (32768, -32768, 32768, -32768)
         UShort  UShort  UShort        (***) (65536,       , 32768,       )




         (*)  For these versions, the sums of the kernel values are verified to be
              less than or equal to (greater than or equal to, for negative values)
              the values specified in parentheses. The first value is the sum of the
              positive values in the kernel, the second is the sum of the negative
              values in the kernel, the third is the sum of the positive values
              divided by the normalization factor, and the fourth is the sum of the
              negative values divided by the normalization factor. If these
              conditions are met, the MMX version with a 16-bit accumulator is
              called. If these conditions are not met and the number of elements
              in the kernel is smaller than 32025, the MMX function with a 32-bit
              accumulator is called. If the number of elements in the kernel is
              greater than or equal to 32025, the non-MMX version is called.

              The internal accumulator contains the sum of the products of kernel
              elements by image values before normalization.
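
              The selection described in note (*) can be sketched in C as
              follows. This is an illustration of the documented rule, not
              MIL source code: the type and function names are hypothetical,
              the normalization factor is assumed to be positive, and the
              division is assumed to be a real (non-integer) division. For
              rows whose negative limits are left blank in the table, a
              limit of 0 can be passed.

```c
#include <stddef.h>

/* Limits taken from the parenthesized values of one table row. */
typedef struct {
    double max_pos;       /* limit on the sum of positive kernel values     */
    double min_neg;       /* limit on the sum of negative kernel values     */
    double max_pos_norm;  /* limit on (positive sum / normalization factor) */
    double min_neg_norm;  /* limit on (negative sum / normalization factor) */
} KernelLimits;

typedef enum { PATH_MMX_ACC16, PATH_MMX_ACC32, PATH_NON_MMX } ConvPath;

/* Hypothetical helper: picks the code path described in note (*). */
ConvPath select_conv_path(const int *kernel, size_t count,
                          double norm, KernelLimits lim)
{
    long pos = 0;
    long neg = 0;

    /* Accumulate positive and negative kernel values separately. */
    for (size_t i = 0; i < count; ++i) {
        if (kernel[i] > 0)
            pos += kernel[i];
        else
            neg += kernel[i];
    }

    if (pos <= lim.max_pos && neg >= lim.min_neg &&
        pos / norm <= lim.max_pos_norm && neg / norm >= lim.min_neg_norm)
        return PATH_MMX_ACC16;

    /* Limits exceeded: fall back according to the kernel element count. */
    return count < 32025u ? PATH_MMX_ACC32 : PATH_NON_MMX;
}
```

              With the limits (128, -128, 128, -128), a 3x3 kernel of ones
              with a normalization factor of 9 selects the 16-bit
              accumulator path, while a 3x3 kernel filled with 100 selects
              the 32-bit accumulator path.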



         (**) Note that for these versions, the internal accumulator is 16-bit 
              unsigned when the source AND the kernel are unsigned. It is
              16-bit signed in all other cases. 

              To ensure that an overflow does not occur, the sums of the kernel
              values are verified to be less than or equal to (greater than or
              equal to, for negative values) the values specified in parentheses.
              The first value is the sum of the positive values in the kernel, the
              second is the sum of the negative values in the kernel, the third is
              the sum of the positive values divided by the normalization factor,
              and the fourth is the sum of the negative values divided by the
              normalization factor. If these conditions are not met, the non-MMX
              function is used.

              The internal accumulator contains the sum of the products of kernel
              elements by image values before normalization.
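
              One plausible reading of the limits listed for the (**) rows
              is a worst-case bound on the 16-bit accumulator: the kernel
              sum limit multiplied by the largest source value must still
              fit. This derivation is not stated in this file and is offered
              only as an assumption; fits_acc16() is a hypothetical helper.

```c
#include <stdint.h>

/* Returns 1 if kernel_sum_limit * max_pixel fits in a 16-bit accumulator.
   acc_unsigned selects an unsigned (0..65535) or signed (..32767) range.
   Hypothetical helper illustrating a worst-case bound; not part of MIL. */
int fits_acc16(long kernel_sum_limit, long max_pixel, int acc_unsigned)
{
    long worst = kernel_sum_limit * max_pixel;
    long limit = acc_unsigned ? (long)UINT16_MAX : (long)INT16_MAX;
    return worst <= limit;
}
```

              Under this reading, an unsigned source with an unsigned kernel
              gives 256 * 255 = 65280, which fits in an unsigned 16-bit
              accumulator, and an unsigned source with a signed kernel gives
              128 * 255 = 32640, which fits in a signed one.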

         (***) For these versions to be optimized by MMX, the values in
              UNSIGNED source and UNSIGNED kernel buffers should not exceed
              the maximum value of a corresponding signed buffer
              (e.g., 127 for an 8-bit buffer).

              For these versions, the accumulator is 32-bit signed. The kernel
              sums are verified in a similar way as for the previous versions.

              Note that for a 5x5 convolution, the MMX version of the operation is
              called only on a Pentium processor or when saturation is needed. On a
              Pentium Pro and Pentium II, the C++ version is faster when saturation
              is not needed.

              If these conditions are not met, no MMX optimization will be used.


      1.8.2 Predefined MIL kernels.

            Optimized with MMX in the following cases:

            All platforms:
               8-bit source buffers with 8- or 16-bit destinations.

            Pentium Pro (Pentium-II) and subsequent versions only:
               8-bit and 16-bit source buffers with 8- or 16-bit destinations.


      
1.9   MimCountDifference().

      General guideline: source and destination buffers must be 
                         of the same sign.

      Src1    Src2     
      ------  ------  

      UChar   UChar
      Char    Char

      UChar   UShort   
      Char    Short    
      
      UShort  UChar    
      Short   Char     

      UShort  UShort
      Short   Short
      
      ULong   ULong

      
1.10  MimDilate().
      
      M_GRAYSCALE operation:
         All the buffer depths and sign combinations are optimized with MMX.

      M_BINARY operation:
         Not optimized with MMX.
      

1.11  MimDistance().
         
      CHAMFER_3_4 8-bit and 16-bit signed and unsigned are optimized with MMX.
      CHESSBOARD 8-bit signed and unsigned is optimized with MMX.
      No version of CITY_BLOCK is optimized with MMX.


1.12  MimEdgeDetect().

      Only the following versions of the function are optimized with MMX:

      Angle   Gradient   Source
      ------  --------   ------

      UChar   UChar      UChar
      UChar   UChar      Char
      Char    UChar      UChar
      Char    UChar      Char

      UShort  UChar      UChar
      UShort  UChar      Char
      Short   UChar      UChar
      Short   UChar      Char

      UChar   Char       UChar
      UChar   Char       Char
      UShort  Char       UChar
      UShort  Char       Char

      Char    Char       UChar
      Char    Char       Char
      Short   Char       UChar
      Short   Char       Char
     
      UChar   UShort     UChar
      UChar   UShort     Char
      UShort  UShort     UChar
      UShort  UShort     Char

      Char    UShort     UChar
      Char    UShort     Char
      Short   UShort     UChar
      Short   UShort     Char

      UChar   Short      UChar
      UChar   Short      Char
      UShort  Short      UChar
      UShort  Short      Char

      Char    Short      UChar
      Char    Short      Char
      Short   Short      UChar
      Short   Short      Char
   
      
1.13  MimErode().
      
      M_GRAYSCALE operation:
         All the buffer depths and sign combinations are optimized with MMX.
     
      M_BINARY operation:
         Not optimized with MMX.
      

1.14  MimFindExtreme().

        All 8-bit and 16-bit integer buffer combinations are optimized with MMX.
      

1.15  MimFlip().

      M_FLIP_VERTICAL      : Optimized (8 and 16-bit versions)
      M_FLIP_HORIZONTAL    : Optimized (8 and 16-bit versions)
      

1.16  MimHistogram().
         Not optimized with MMX.
     
      
1.17  MimHistogramEqualize().
         Not optimized with MMX.
     
      
1.18  MimLabel().
         Not optimized with MMX.


1.19  MimLocateEvent().

      Only the following versions of the function are optimized with MMX.

      Src
      ------

      UChar
      Char
      UShort 
      Short

      
1.20  MimLutMap().
         Not optimized with MMX.
      

1.21  MimMorphic().

         M_ERODE, M_DILATE, M_THIN, M_THICK, M_HIT_OR_MISS, M_MATCH:
         
         M_GRAYSCALE operation:
            All the integer buffer depths and sign combinations are optimized with MMX.
            EXCEPTION: M_ERODE with a 32-bit + M_UNSIGNED source buffer is not optimized.
        
         M_BINARY operation:
            Not optimized with MMX.
      

1.22  MimOpen().
      
      M_GRAYSCALE operation:
         All the integer buffer depths and sign combinations are optimized with MMX.

      M_BINARY operation:
         Not optimized with MMX.

      
1.23  MimPolarTransform().

      M_RECTANGULAR_TO_POLAR:
         Not optimized with MMX.
      
      M_POLAR_TO_RECTANGULAR:
         Not optimized with MMX.


1.24  MimProject().
      
      Only the following versions of the function are optimized with MMX
      for M_0_DEGREE and M_90_DEGREE:
      
      Src
      ------
      UChar
      Char
      UShort
      Short


1.25  MimRank().
      
      All the integer buffer depths and sign combinations are optimized with MMX.

      The 3x3 median structuring element has been further optimized.


1.26  MimResize().

      Only the following versions of the function are optimized with MMX
      (with M_NEAREST_NEIGHBOR interpolation only):
      
      Dst     Src             
      ------  ------       
      Char    Char  
      UChar   UChar  
      Char    UChar  
      UChar   Char

      UShort  UShort
      UShort  Short
      Short   UShort   
      Short   Short 
      
      Special Notes:
      
      To zoom in X with 8-bit buffers:
            1. The width (SizeX) of the source buffer must be a multiple of 4.
            2. The zooming factor must be 2 or 4.
            
      To zoom in X with 16-bit buffers:
            1. The width (SizeX) of the source buffer must be a multiple of 2.
            2. The width (SizeX) of the destination buffer must be a multiple of 8.
            3. The zooming factor must be 2 or 4.
              
      To zoom out in X with 8-bit buffers:
            1. The width (SizeX) of the source buffer must be a multiple of 8.
            2. The zooming factor must be 1/2 or 1/4.
            
      Zooming out in X with 16-bit buffers is not optimized.

      To zoom in Y, the factor must be an integer value.

      To zoom out in Y, the factor must be (1/integer value).
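
      The X-direction rules in the special notes above can be summarized
      as a predicate. The sketch below is a hypothetical helper, not a MIL
      function; it covers only the M_NEAREST_NEIGHBOR X-zoom rules listed
      here and assumes the factor is passed as an integer (2 or 4 for
      zooming in, 2 or 4 meaning 1/2 or 1/4 for zooming out).

```c
typedef enum { ZOOM_IN_X, ZOOM_OUT_X } ZoomDir;

/* Returns 1 when the X-zoom rules listed above are satisfied, else 0.
   bits is the buffer depth (8 or 16). Hypothetical helper, not MIL code. */
int resize_x_mmx_eligible(int bits, ZoomDir dir,
                          int src_size_x, int dst_size_x, int factor)
{
    if (factor != 2 && factor != 4)
        return 0;

    if (dir == ZOOM_IN_X) {
        if (bits == 8)
            return src_size_x % 4 == 0;
        if (bits == 16)
            return src_size_x % 2 == 0 && dst_size_x % 8 == 0;
        return 0;
    }

    /* Zooming out in X: only the 8-bit case is listed as optimized. */
    return bits == 8 && src_size_x % 8 == 0;
}
```

      For example, zooming a 640-pixel-wide 8-bit buffer by 2 qualifies,
      while zooming out a 16-bit buffer in X never does.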
      
     
      New version:
      
      Resize M_AVERAGE, 1/2 x 1/2 with unsigned char buffers only.
      No restrictions on the size of the buffers.

     
1.27  MimRotate().
      
      Not optimized with MMX, although greatly accelerated by the
      addition of fixed point in calculations.


1.28  MimShift().

      Only the following versions of the function are optimized with MMX:
      
      Dst     Src             
      ------  ------       
      Char    Char  
      UChar   UChar  

      UChar   UShort  
      Char    Short  

      UShort  UChar  
      Short   Char  

      UShort  UShort
      Short   Short   

      
1.29  MimThick().

      M_GRAYSCALE operation:
         All the buffer depth and sign combinations are optimized with MMX.

      M_BINARY operation:
         Not optimized with MMX.
         

1.30  MimThin().
      
      M_GRAYSCALE operation:
         All the buffer depth and sign combinations are optimized with MMX.
      
      M_BINARY operation:
         Not optimized with MMX.
         

1.31  MimTransform().

      M_FFT:
      
         This function supports only fixed-point versions of the transform. For
         the forward transform, the sources must be 8-bit signed or unsigned
         and the destination must be long. The M_NORMALIZE flag must be set to
         avoid internal overflows in the fixed-point computations.
         
         For the reverse transform, the restrictions are the same but the source
         and destination are swapped.
      
      M_DCT8x8: 

         Not optimized with MMX.
      

1.32  MimTranslate().

      Translations by fractional factors are optimized with MMX in the same
      way as MimConvolve(), where 'Src' and 'Dst' are both the destination
      type of the MimTranslate() operation and 'Kernel' is always Char.
      

1.33  MimWarp().
         Not optimized with MMX.
      

1.34  MimZoneOfInfluence().
         Not optimized with MMX.


*******************************************************************************
2. Buffer management commands.
*******************************************************************************

2.1   MbufCopy().

      Only the following versions of the function are optimized with MMX:

      Src                  Dst                                                   Restriction
      ------               ------                                                -----------
      M_MONO8              M_YUV16_YUYV                                          DstSizeX multiple of 2
      M_MONO8              M_YUV16_UYVY                                          DstSizeX multiple of 2
      M_MONO8              M_YUV16+M_PACKED                                      DstSizeX multiple of 2
      M_RGB24+M_PLANAR     M_YUV24+M_PLANAR                                      DstSizeX > 4
      M_RGB24+M_PLANAR     M_YUV16_YUYV                                          DstSizeX multiple of 8
      M_RGB24+M_PLANAR     M_YUV16_UYVY                                          DstSizeX multiple of 8
      M_RGB24+M_PLANAR     M_YUV16                                               DstSizeX multiple of 8
      M_RGB24+M_PLANAR     M_YUV12+M_PLANAR                                      DstSizeX multiple of 8 and DstSizeY multiple of 2
      M_RGB24+M_PLANAR     M_YUV9+M_PLANAR                                       DstSizeX multiple of 16 and DstSizeY multiple of 4
      M_BGR24+M_PACKED     M_YUV16_YUYV                                          DstSizeX multiple of 8
      M_BGR24+M_PACKED     M_YUV16+M_PACKED                                      DstSizeX multiple of 8
      M_YUV16_YUYV         M_MONO8                                               DstSizeX multiple of 2
      M_YUV16_UYVY         M_MONO8                                               DstSizeX multiple of 2
      M_YUV16+M_PACKED     M_MONO8                                               DstSizeX multiple of 2
      M_YUV24+M_PLANAR     M_RGB24+M_PLANAR                                      SrcSizeX > 4
      M_YUV16+M_PLANAR     M_RGB24+M_PLANAR                                      SrcSizeX multiple of 8
      M_YUV16_YUYV         M_BGR24+M_PACKED                                      SrcSizeX multiple of 8
      M_YUV16_UYVY         M_BGR24+M_PACKED                                      SrcSizeX multiple of 8
      M_YUV16+M_PACKED     M_BGR24+M_PACKED                                      SrcSizeX multiple of 8
      M_MONO8              M_COMPRESS+M_MONO8+M_JPEG_LOSSLESS                    DstSizeX multiple of 4
      M_RGB24+M_PLANAR     M_COMPRESS+M_RGB24+M_JPEG_LOSSLESS+M_PLANAR           DstSizeX multiple of 4
      M_MONO8              M_COMPRESS+M_MONO8+M_JPEG_LOSSLESS_INTERLACED         DstSizeX multiple of 4
      M_MONO8              M_COMPRESS+M_JPEG_LOSSY+M_MONO8                       DstSizeY multiple of 8
      M_RGB24+M_PLANAR     M_COMPRESS+M_JPEG_LOSSY+M_RGB24+M_PLANAR              DstSizeY multiple of 8
      M_RGB24+M_PLANAR     M_COMPRESS+M_JPEG_LOSSY+M_YUV24+M_PLANAR              DstSizeY multiple of 8 
      M_RGB24+M_PLANAR     M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PLANAR              DstSizeY multiple of 8                                         
      M_RGB24+M_PLANAR     M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED              DstSizeX multiple of 16 and DstSizeY multiple of 8
      M_RGB24+M_PLANAR     M_COMPRESS+M_JPEG_LOSSY+M_YUV12+M_PLANAR              DstSizeY multiple of 16
      M_RGB24+M_PLANAR     M_COMPRESS+M_JPEG_LOSSY+M_YUV9+M_PLANAR               DstSizeY multiple of 32
      M_BGR24+M_PACKED     M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED              DstSizeX multiple of 16 and DstSizeY multiple of 8
      M_BGR32+M_PACKED     M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED              DstSizeX multiple of 16 and DstSizeY multiple of 8
      M_YUV24+M_PLANAR     M_COMPRESS+M_JPEG_LOSSY+M_YUV24+M_PLANAR              DstSizeY multiple of 8
      M_YUV16+M_PLANAR     M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PLANAR              DstSizeY multiple of 8
      M_YUV16              M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED              DstSizeX multiple of 16 and DstSizeY multiple of 8
      M_YUV12+M_PLANAR     M_COMPRESS+M_JPEG_LOSSY+M_YUV12+M_PLANAR              DstSizeY multiple of 16
      M_YUV9+M_PLANAR      M_COMPRESS+M_JPEG_LOSSY+M_YUV9+M_PLANAR               DstSizeY multiple of 32
      M_YUV16_YUYV         M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED              DstSizeX multiple of 16 and DstSizeY multiple of 8
      M_YUV16_UYVY         M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED              DstSizeX multiple of 16 and DstSizeY multiple of 8
      M_MONO8              M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_MONO8            DstSizeY multiple of 16
      M_RGB24+M_PLANAR     M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED   DstSizeX multiple of 16 and DstSizeY multiple of 16
      M_BGR24+M_PACKED     M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED   DstSizeX multiple of 16 and DstSizeY multiple of 16
      M_BGR32+M_PACKED     M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED   DstSizeX multiple of 16 and DstSizeY multiple of 16 
      M_YUV16              M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED   DstSizeX multiple of 16 and DstSizeY multiple of 16
      M_YUV16_YUYV         M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED   DstSizeX multiple of 16 and DstSizeY multiple of 16 
      M_YUV16_UYVY         M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED   DstSizeX multiple of 16 and DstSizeY multiple of 16
                                           
      Src                                                     Dst                Restriction
      ------                                                  ------             -----------
      M_COMPRESS+M_JPEG_LOSSY+M_MONO8                         M_MONO8            SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY+M_RGB24+M_PLANAR                M_RGB24+M_PLANAR   SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY+M_YUV24+M_PLANAR                M_RGB24+M_PLANAR   SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY+M_YUV16                         M_RGB24+M_PLANAR   SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY+M_YUV12+M_PLANAR                M_RGB24+M_PLANAR   SrcSizeY multiple of 16
      M_COMPRESS+M_JPEG_LOSSY+M_YUV9+M_PLANAR                 M_RGB24+M_PLANAR   SrcSizeY multiple of 32
      M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED                M_BGR24+M_PACKED   SrcSizeX multiple of 16 and SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED                M_BGR32+M_PACKED   SrcSizeX multiple of 16 and SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY+M_YUV24+M_PLANAR                M_YUV24+M_PLANAR   SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY+M_YUV16                         M_YUV16+M_PLANAR   SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED                M_YUV16+M_PACKED   SrcSizeX multiple of 16 and SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY+M_YUV12+M_PLANAR                M_YUV12+M_PLANAR   SrcSizeY multiple of 16
      M_COMPRESS+M_JPEG_LOSSY+M_YUV9+M_PLANAR                 M_YUV9+M_PLANAR    SrcSizeY multiple of 32
      M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED                M_YUV16_YUYV       SrcSizeX multiple of 16 and SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY+M_YUV16+M_PACKED                M_YUV16_UYVY       SrcSizeX multiple of 16 and SrcSizeY multiple of 8
      M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_MONO8              M_MONO8            SrcSizeY multiple of 16
      M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED     M_RGB24            SrcSizeX multiple of 16 and SrcSizeY multiple of 16
      M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED     M_BGR24            SrcSizeX multiple of 16 and SrcSizeY multiple of 16
      M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED     M_BGR32            SrcSizeX multiple of 16 and SrcSizeY multiple of 16
      M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16              M_YUV16+M_PLANAR   SrcSizeX multiple of 16 and SrcSizeY multiple of 16
      M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED     M_YUV16_YUYV       SrcSizeX multiple of 16 and SrcSizeY multiple of 16
      M_COMPRESS+M_JPEG_LOSSY_INTERLACED+M_YUV16+M_PACKED     M_YUV16_UYVY       SrcSizeX multiple of 16 and SrcSizeY multiple of 16
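
      The size restrictions in the tables above are all of the form "size
      is a multiple of N". A minimal sketch of how an application could
      test its buffer dimensions, or find the next qualifying size, is
      shown below. Both helpers are hypothetical and are not MIL functions;
      a multiple of 1 can be passed for a dimension with no restriction,
      and the multiples must be nonzero.

```c
#include <stddef.h>

/* Rounds value up to the next multiple of `multiple` (which must be > 0). */
size_t round_up_to_multiple(size_t value, size_t multiple)
{
    return ((value + multiple - 1) / multiple) * multiple;
}

/* Returns 1 when both dimensions satisfy their "multiple of" restriction. */
int dims_meet_restriction(size_t size_x, size_t size_y,
                          size_t mult_x, size_t mult_y)
{
    return size_x % mult_x == 0 && size_y % mult_y == 0;
}
```

      For a restriction of "SizeX multiple of 16 and SizeY multiple of 8",
      a 640x480 buffer qualifies and a 650x480 buffer does not; the next
      qualifying width above 650 is 656.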


2.2   MbufCopyCond().

      Only the following versions of the function are optimized with MMX:
      
      Dst    Src    Cnd          
      ------ ------ ------       
      UChar  UChar  UChar        
      UChar  UChar  Char           
      UChar  Char   UChar        
      UChar  Char   Char
      Char   UChar  UChar
      Char   UChar  Char
      Char   Char   UChar
      Char   Char   Char

      UShort UShort UShort
      UShort UShort Short
      UShort Short  UShort
      UShort Short  Short
      Short  UShort UShort
      Short  UShort Short
      Short  Short  UShort
      Short  Short  Short
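
      The table above covers every sign combination in which the three buffers
      share the same depth, 8 or 16 bits. As a minimal sketch, that reading can
      be written as a predicate in plain C. The function name is hypothetical
      (not part of MIL), and the check assumes integer (non-float) buffers.

```c
/* Hypothetical predicate for illustration only (not part of MIL).
   Returns 1 if the Dst/Src/Cnd depths (in bits) match a row of the
   MbufCopyCond() table, 0 otherwise. Sign is ignored because the table
   lists every signed/unsigned combination for each depth. */
static int copycond_in_mmx_table(int dst_bits, int src_bits, int cnd_bits)
{
    /* All three buffers must have the same depth. */
    if (dst_bits != src_bits || src_bits != cnd_bits)
        return 0;

    /* Only 8-bit and 16-bit versions are listed. */
    return dst_bits == 8 || dst_bits == 16;
}
```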

2.3   MbufCopyMask().

      Only the following versions of the function are optimized with MMX:

      Dst    Src
      ------ ------
      UChar  UChar
      UChar  Char
      Char   UChar
      Char   Char
      
      UShort UShort
      UShort Short
      Short  UShort
      Short  Short
      
      ULong  ULong
      ULong  Long
      Long   ULong
      Long   Long
      
      UChar  UShort
      UChar  Short
      Char   UShort
      Char   Short
      
      UShort UChar
      UShort Char
      Short  UChar
      Short  Char
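
      Read by depth, the table above lists same-depth copies for 8, 16 and 32
      bits, plus mixed 8-bit/16-bit copies in both directions, for every sign
      combination. A minimal sketch of that reading in plain C follows. The
      function name is hypothetical (not part of MIL), and the check assumes
      integer (non-float) buffers.

```c
/* Hypothetical predicate for illustration only (not part of MIL).
   Returns 1 if the Dst/Src depths (in bits) match a row of the
   MbufCopyMask() table, 0 otherwise. Sign is ignored because the table
   lists every signed/unsigned combination for each depth pair. */
static int copymask_in_mmx_table(int dst_bits, int src_bits)
{
    /* Same-depth versions: 8, 16 or 32 bits. */
    if (dst_bits == src_bits)
        return dst_bits == 8 || dst_bits == 16 || dst_bits == 32;

    /* Mixed versions: 8 <- 16 and 16 <- 8 only. */
    return (dst_bits == 8 && src_bits == 16) ||
           (dst_bits == 16 && src_bits == 8);
}
```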


2.4   MbufBayer().

      Only the following versions of the function are optimized with MMX:
      
      Dst     Src
      ------  ------
      UChar   UChar 
      UShort  UShort
      
      Src                 Dst                   Restriction
      ------              ------                -----------
      M_MONO8 (Bayer)     M_MONO8               DstSizeX > 10, DstSizeY > 3, SrcSizeX > 10, SrcSizeY > 3
      M_MONO8 (Bayer)     M_RGB24+M_PLANAR      DstSizeX > 10, DstSizeY > 3, SrcSizeX > 10, SrcSizeY > 3
      M_MONO8 (Bayer)     M_BGR32+M_PACKED      DstSizeX > 10, DstSizeY > 3, SrcSizeX > 10, SrcSizeY > 3
      M_MONO16 (Bayer)    M_RGB48+M_PLANAR      DstSizeX > 6, DstSizeY > 3, SrcSizeX > 6, SrcSizeY > 3
      M_MONO8 (Bayer)     M_YUV16_YUYV          See table below.
      
      If the Src is M_MONO8 (Bayer) and the Dst is M_YUV16_YUYV, the restrictions
      depend on the parity of SizeX and AncestorOffsetX as follows:

      AncestorOffsetX     SizeX                 Restriction
      ---------------     -----                 -----------
      Odd                 Even                  DstSizeX > 10, DstSizeY > 3, SrcSizeX > 10, SrcSizeY > 3
      Odd                 Odd                   DstSizeX > 11, DstSizeY > 3, SrcSizeX > 11, SrcSizeY > 3
      Even                Even                  DstSizeX > 12, DstSizeY > 3, SrcSizeX > 12, SrcSizeY > 3
      Even                Odd                   DstSizeX > 11, DstSizeY > 3, SrcSizeX > 11, SrcSizeY > 3
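
      The parity table above can be sketched as a small lookup in plain C. The
      function names are hypothetical (not part of MIL). The table does not say
      whether "SizeX" refers to the source or the destination buffer, so the
      sketch takes the value used for the parity test as a separate parameter.

```c
/* Hypothetical helpers for illustration only (not part of MIL). */

/* Returns the value that DstSizeX and SrcSizeX must exceed, according to
   the parity of AncestorOffsetX and SizeX in the table above. */
static long yuyv_min_size_x_exclusive(long ancestor_offset_x, long size_x)
{
    int offset_odd = (ancestor_offset_x % 2) != 0;
    int size_odd = (size_x % 2) != 0;

    if (size_odd)
        return 11;               /* Odd SizeX: > 11 for either offset parity */
    return offset_odd ? 10 : 12; /* Even SizeX: > 10 (odd offset), > 12 (even) */
}

/* Returns 1 if all four size conditions of the selected table row hold.
   parity_size_x is the SizeX used for the parity test (assumption: the
   caller decides which buffer it comes from). */
static int bayer_to_yuyv_meets_restriction(long ancestor_offset_x,
                                           long parity_size_x,
                                           long dst_size_x, long dst_size_y,
                                           long src_size_x, long src_size_y)
{
    long min_x = yuyv_min_size_x_exclusive(ancestor_offset_x, parity_size_x);

    return dst_size_x > min_x && src_size_x > min_x &&
           dst_size_y > 3 && src_size_y > 3;
}
```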

      Finally, the white balance coefficients must stay within the following
      limits for the MMX-optimized version of the function to be used:

      Dst                 Coefficient #0     Coefficient #1      Coefficient #2
      ------              --------------     --------------      --------------
      M_MONO8                  < 64            Don't care          Don't care
      M_RGB24+M_PLANAR         < 64               < 64                < 64
      M_BGR32+M_PACKED         < 64               < 64                < 64
      M_RGB48+M_PLANAR         < 64               < 64                < 64
      M_YUV16_YUYV             < 64            Don't care          Don't care
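
      As a minimal sketch, the coefficient limits above can be checked in plain
      C as follows. The function name is hypothetical (not part of MIL), and
      the use of double for the coefficients is an assumption made for the
      example. Set only_first_matters to 1 for the rows where coefficients #1
      and #2 are "Don't care".

```c
/* Hypothetical check for illustration only (not part of MIL).
   coef[0..2] are white balance coefficients #0, #1 and #2 (the double
   type is an assumption). Returns 1 if the limits of the table hold. */
static int wb_coefficients_ok(const double coef[3], int only_first_matters)
{
    /* Coefficient #0 is limited for every destination format. */
    if (!(coef[0] < 64.0))
        return 0;

    /* For the "Don't care" rows, coefficients #1 and #2 are not checked. */
    if (only_first_matters)
        return 1;

    return coef[1] < 64.0 && coef[2] < 64.0;
}
```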

      For other conversions, see MbufCopy() and MimConvert().


*******************************************************************************
3. Measurements commands.
*******************************************************************************

3.1   MmeasAllocContext().
         Not optimized with MMX.


3.2   MmeasAllocMarker().
         Not optimized with MMX.


3.3   MmeasAllocResult().
         Not optimized with MMX.


3.4   MmeasCalculate().
         Not optimized with MMX.

      
3.5   MmeasControl().
         Not optimized with MMX.

      
3.6   MmeasFindMarker().  
         Optimized with MMX.
      

3.7   MmeasFree().
         Not optimized with MMX.
      

3.8   MmeasGetResult().  
         Not optimized with MMX.
      

3.9   MmeasInquire().     
         Not optimized with MMX.
      

3.10  MmeasRestoreMarker().
         Not optimized with MMX.
      

3.11  MmeasSaveMarker().
         Not optimized with MMX.
      

3.12  MmeasSetMarker().    
         Not optimized with MMX.


*******************************************************************************
4. Pattern matching commands.
*******************************************************************************

4.1   MpatAllocAutoModel().
         Optimized with MMX.


4.2   MpatAllocModel().      
         Not optimized with MMX.


4.3   MpatAllocResult().
         Not optimized with MMX.


4.4   MpatAllocRotatedModel().
         Not optimized with MMX.
      

4.5   MpatCopy().      
         Not optimized with MMX.


4.6   MpatFindModel().
         Optimized with MMX.


4.7   MpatFindMultipleModel().      
         Optimized with MMX.


4.8   MpatFindOrientation().         
         Optimized with MMX.


4.9   MpatFree().
         Not optimized with MMX.


4.10  MpatGetNumber().
         Not optimized with MMX.
      

4.11  MpatGetResult().      
         Not optimized with MMX.
      

4.12  MpatInquire().
         Not optimized with MMX.


4.13  MpatPreprocModel().
         Not optimized with MMX.


4.14  MpatRead().      
         - Genesis model support has been added.
         - Not optimized with MMX.


4.15  MpatRestore().      
         - Genesis model support has been added.
         - Not optimized with MMX.
      

4.16  MpatSave().      
         Not optimized with MMX.


4.17  MpatSetAcceptance().      
         Not optimized with MMX.


4.18  MpatSetAccuracy().      
         Not optimized with MMX.


4.19  MpatSetAngle().      
         Not optimized with MMX.


4.20  MpatSetCenter().      
         Not optimized with MMX.


4.21  MpatSetCertainty().      
         Not optimized with MMX.


4.22  MpatSetDontCare().      
         Not optimized with MMX.


4.23  MpatSetNumber().
         Not optimized with MMX.


4.24  MpatSetPosition().      
         Not optimized with MMX.


4.25  MpatSetSearchParameter().
         Not optimized with MMX.
      

4.26  MpatSetSpeed().      
         Not optimized with MMX.


4.27  MpatWrite().
         Not optimized with MMX.
      

*******************************************************************************
5. Blob analysis commands.
*******************************************************************************

All the blob analysis commands that perform calculations have been optimized with MMX.


5.1   MblobAllocFeatureList().
         Not optimized with MMX.


5.2   MblobAllocResult().
         Not optimized with MMX.

5.3   MblobCalculate().
         Only the following versions of the function are optimized with MMX:

         Src     Foreground          
         ------  ------------       
         UChar   M_ZERO
         Char    M_ZERO
         UShort  M_ZERO
         Short   M_ZERO
 

5.4   MblobControl().
         Not optimized with MMX.


5.5   MblobFill().
         Not optimized with MMX.


5.6   MblobFree().
         Not optimized with MMX.
      

5.7   MblobGetLabel().
         Not optimized with MMX.
      

5.8   MblobGetNumber().
         Not optimized with MMX.
      

5.9   MblobGetResult().
         Not optimized with MMX.
      

5.10  MblobGetResultSingle().
         Not optimized with MMX.
      

5.11  MblobGetRuns().
         Not optimized with MMX.
      

5.12  MblobInquire().
         Not optimized with MMX.
      

5.13  MblobLabel().      
         Not optimized with MMX.


5.14  MblobReconstruct().     
         Optimized with MMX.


5.15  MblobSelect().
         Not optimized with MMX.


5.16  MblobSelectFeature().
         Not optimized with MMX.
      

5.17  MblobSelectFeret().      
         Not optimized with MMX.


5.18  MblobSelectMoment().
         Not optimized with MMX.


*******************************************************************************
6. Graphics command.
*******************************************************************************


6.1   MgraAlloc().
         Not optimized with MMX.
      

6.2   MgraArc().
         Not optimized with MMX.
      

6.3   MgraArcFill().
         Not optimized with MMX.
      

6.4   MgraBackColor().      
         Not optimized with MMX.


6.5   MgraClear().      
         Not optimized with MMX.


6.6   MgraColor().      
         Not optimized with MMX.


6.7   MgraControl().      
         Not optimized with MMX.


6.8   MgraDot().      
         Not optimized with MMX.


6.9   MgraFill().      
         Not optimized with MMX.


6.10  MgraFont().      
         Not optimized with MMX.


6.11  MgraFontScale().      
         Not optimized with MMX.


6.12  MgraFree().      
         Not optimized with MMX.


6.13  MgraInquire().
         Not optimized with MMX.


6.14  MgraLine().      
         Not optimized with MMX.


6.15  MgraRect().      
         Not optimized with MMX.


6.16  MgraRectFill().
         Not optimized with MMX.


6.17  MgraText().
         Not optimized with MMX.

         
*******************************************************************************
7. Data alignment.
*******************************************************************************

When a MIL buffer is created using MbufCreate2d()/MbufCreateColor(), its image 
row data (scanlines) should be aligned on 32-byte boundaries to give the best 
performance with the MMX-enabled functions. When it is not possible to align
on 32-byte boundaries, the buffer should at least be aligned on quadword 
(64-bit) or doubleword (32-bit) boundaries. Note that when the MbufAlloc2d()/
MbufAllocColor() functions are used, the user does not have to worry about data
alignment, since MIL then automatically allocates the buffer with the proper
alignment.
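
As a minimal sketch of the alignment rule above, the following plain C helpers
compute a row pitch rounded up to a 32-byte multiple and test whether an address
is aligned on a given boundary. The names are hypothetical and are not part of
MIL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only (not part of MIL). */

#define ROW_ALIGNMENT 32u  /* preferred scanline alignment, in bytes */

/* Returns the number of bytes per row, rounded up to a multiple of 32,
   so that every scanline starts on a 32-byte boundary when the first
   one does. */
static size_t aligned_pitch_bytes(size_t size_x, size_t bytes_per_pixel)
{
    size_t row = size_x * bytes_per_pixel;
    return (row + ROW_ALIGNMENT - 1) / ROW_ALIGNMENT * ROW_ALIGNMENT;
}

/* Returns 1 if ptr is aligned on `alignment` bytes (32 for the preferred
   case, 8 for quadword, 4 for doubleword), 0 otherwise. */
static int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}
```

For example, a 641-pixel-wide 8-bit row needs a pitch of 672 bytes to keep
every scanline on a 32-byte boundary.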

Moreover, 32 extra bytes should be readable before the beginning and after the
end of the buffer so that the MMX-enabled algorithms can perform pre-fetching.
Performance could decrease dramatically if those extra bytes are not available.
When they are available, the M_MMX_ENABLED flag must be added to the attribute
parameter at buffer creation time (MbufCreate2d()/MbufCreateColor()) so that the
MMX-enabled algorithms know that pre-fetching can be performed on the buffer.
It is also possible to set this flag after buffer creation with the
MbufControl(M_FORMAT) command, using the following syntax:

MbufControl(MilImage,
            M_FORMAT,
            M_MMX_ENABLED|MbufInquire(MilImage, M_FORMAT, NULL));

(Note that this control is usually reserved for internal use and thus does not
appear in the official documentation.)
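
The following is a minimal sketch, in plain C, of one way to obtain user memory
that meets the requirements of this section: the first image byte is aligned on
32 bytes, the pitch is a multiple of 32, and 32 readable bytes exist before and
after the image data. The PaddedImage type and both functions are hypothetical
and are not part of MIL. The resulting data pointer and pitch are what an
application would then hand to MbufCreate2d()/MbufCreateColor().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration only (not part of MIL). */

#define MMX_ALIGNMENT   32u  /* scanline alignment, in bytes */
#define MMX_GUARD_BYTES 32u  /* readable bytes before and after the image */

typedef struct {
    void          *base;   /* pointer returned by calloc(), for free() */
    unsigned char *data;   /* first image byte, 32-byte aligned */
    size_t         pitch;  /* bytes per row, multiple of 32 */
} PaddedImage;

/* Allocates zeroed memory for a size_x by size_y image with guard bytes.
   Returns 1 on success, 0 if the allocation fails. */
static int padded_image_alloc(PaddedImage *img, size_t size_x, size_t size_y,
                              size_t bytes_per_pixel)
{
    size_t row = size_x * bytes_per_pixel;
    size_t pitch = (row + MMX_ALIGNMENT - 1) / MMX_ALIGNMENT * MMX_ALIGNMENT;
    /* Leading guard + image + trailing guard + slack used for alignment. */
    size_t total = MMX_GUARD_BYTES + pitch * size_y + MMX_GUARD_BYTES +
                   (MMX_ALIGNMENT - 1);
    unsigned char *base = calloc(total, 1);
    uintptr_t start;

    if (base == NULL)
        return 0;

    /* Skip the leading guard, then round up to the next 32-byte boundary. */
    start = ((uintptr_t)base + MMX_GUARD_BYTES + MMX_ALIGNMENT - 1) &
            ~(uintptr_t)(MMX_ALIGNMENT - 1);

    img->base = base;
    img->data = (unsigned char *)start;
    img->pitch = pitch;
    return 1;
}

/* Releases the memory obtained with padded_image_alloc(). */
static void padded_image_free(PaddedImage *img)
{
    free(img->base);
    img->base = NULL;
    img->data = NULL;
    img->pitch = 0;
}
```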