Multivariate Statistics in Image Analysis

Multivariate statistics deals with development, computer implementation and application of methods related to the analysis of data where more than one attribute or variable is available for each observation taking into account the relations between these variables. Typically, the data are related to physical, chemical, biological, other natural, economical or societal phenomena.

Traditional multivariate statistical methods include test theory, regression analysis, discriminant analysis and classification, principal component analysis and other orthogonal transformations. We also work with iterative extensions of some of these methods and more computer intensive statistical learning based extensions such as kernel methods, binary decision tree based methods, ensemble methods, artificial neural networks, machine learning and deep learning.

In the context of image analysis and image processing more specifically, the analysis of colour and multispectral images almost inevitably leads to the use of multivariate methods. Multivariate statistics (MS) or statistics is the concept of describing what we observe taking into account the randomness of observations and all their inter-dependencies. In the Section for Visual Computing, multivariate statistical methods are developed and used in designing tools for enhancing relevant properties in such images. Examples are multivariate alteration detection using (iterated) canonical correlation or mutual information methods, mapping of changes in polSAR images using complex Wishart distributions, and monitoring food fermentation using color scales based on canonical discriminant functions and kernel versions of other orthogonal transformations, here principal components and maximum autocorrelation factors.

Contact

Anders Nymark Christensen
Associate Professor
DTU Compute
+45 20 88 57 62

Professor emeritus
DTU Compute
+45 45 25 34 16
Allan Aasbjerg Nielsen
Emeritus, Associate Professor
DTU Compute
+45 45 25 34 25

A substantial fraction of the applied research in the Section for Visual Computing addresses problems on designing diagnostic tools where the proof of concept will involve quantification of uncertainties and deviations from state of the art that necessarily will involve statistical modelling.

A list of application areas is nearly endlessly varied and include

  • food science,
  • materials science,
  • medical applications,
  • industrial applications, and
  • remote sensing/earth observation (related to mapping, mineral exploration, geology, agriculture, forestry, environmental monitoring, marine biology, oceanography, geodesy, and security).

Highlights which illustrate both the method development and application aspects mentioned above with links to original papers and other material illustrated below include examples from

  • food fermentation monitoring,
  • environmental monitoring - change detection in remote sensing optical and SAR images,
  • sea ice mapping in Greenlandic waters based on fusion of Sentinel-1 and AMSR-2 data by means of deep learning,
  • statistical modelling - in-vivo dosimetry,
  • statistical modelling - fibre directionality estimation, and
  • statistical shape analysis and geometric morphometry.
  • Food Fermentation Monitoring

    Opening example on application of multivariate statistics in food science: The fermentation process of salamis was monitored with the Videometer multispectral imaging system using 19 different spectral bands. The images in the first column are from days 2 and 42 after production shown as false colour composites based on three spectral bands (660 nm, 470 nm, 435 nm).The next column shows the meat and the fat phases at day 14 found by means of canonical discriminant analysis. Column 3 show images from day 2 and day 42 using a statistical meat colour scale designed to enhance fermentation stages. The darker blue is fresh meat, whereas yellow and orange represent darker red, fermented meat. The last column shows the coefficients for computing respectively the discriminant function and the colour scale values.

    For more information see

    Camilla H. Trinderup, Flemming Møller, Anders Bjorholm Dahl, and Knut Conradsen (2018): Investigation of pausing fermentation of salamis with multispectral imaging for optimal sensory evaluations. Meat Science, vol. 146, pp. 9-17. DOI:10.1016/j.meatsci.2018.07.029

    Environmental Monitoring

    Forest Fire Monitoring, Multispectral Optical Data

    Here we use Sentinel-2 data to detect change in optical multispectral image data between two time points by means of the so-called IR-MAD method which is based on an iterative extension of canonical correlation analysis. The figure below shows the Tubbs Fire north and northeast of Santa Rosa (immediately above, i.e., to the south of the 101 sign), California, October 2017 seen from north looking towards the San Francisco Bay (with a part of a larger fire around Kenwood in the Sonoma Valley - route 12). Change variates are shown as RGB, burned areas are dark green (built-up areas), lighter green (mostly wooded) and bright yellow (mostly non-wooded), other non-fire related change mostly in blue (for example near Calistoga in the Napa Valley - route 29), and pale yellow (south of Santa Rosa).

    For more information see

    Aircraft Monitoring: Synthetic Aperture Radar (SAR) Data

    Here we use Sentinel-1 data (C-band radar VV and VH polarizations) to detect change between two time points (based on a new test statistic in the complex Wishart distribution and the Loewner order). The figure below shows the Frankfurt Airport, Germany, as depicted in Google Earth with aircraft, cars and ships coming and going (red means going, green means coming, yellow means we can't say but change is statistically significant).

    For more information see

    Eigenvalues of Hermitian 3x3 Matrices

    The figure below shows a small area around Tjele Gods near Foulum, Denmark. The left column shows the three eigenvalues of the complex (Hermitian) covariance matrix of airborne multilook EMISAR data at two time points as RGB, the middle column shows change images (dark is no change), and the right column shows a measure of direction of change (red means going, green means coming, yellow means we can't say but change is statistically significant).

    For more information on very fast array-based determination of eigenvalues of 3 by 3 Hermitian matrices see

    Allan A. Nielsen (2020). Fast matrix based computation of eigenvalues and the Loewner order in PolSAR data. IEEE Geoscience and Remote Sensing Letters 17(10), 1727-1731DOI:10.1109/LGRS.2019.2952202

    Flood Monitoring: Truly Multi-Temporal Synthetic Aperture Radar (SAR) Data

    Again, we use Sentinel-1 data (C-band radar VV and VH polarizations) to detect change now between several time points (based on a(nother than immediately above) new test statistic in the complex Wishart distribution and the Loewner order). Cyclone Idai, recorded as the worst weather-based event to ever occur in the southern hemisphere, made landfall near the city of Beira, Mozambique on 15 March 2019 causing widespread inundation in Mozambique and Tanzania and a death toll of more than 1300. The figure below shows a sequence of six change maps generated with a Google Earth Engine time series of Sentinel-1a and Sentinel-1b images, each of the six images covers an area of approximately 3000 km2. The reduction of reflectance from the water surfaces in both the VV and VH bands corresponds to a negative definite covariance matrix difference (green pixels, center left), the rapid receding of flood waters to a positive definite change (red pixels, center right).

    For more information see

    Morton J. Canty, Allan A. Nielsen, Henning Skriver and Knut Conradsen (2020). Statistical Analysis of Changes in Sentinel-1 Time Series on the Google Earth Engine. Remote Sensing 12(1), 46Open Access DOI:10.3390/rs12010046.

    Automated Sea Ice Products (ASIP)

      

    Sea ice mapping often used to be done manually based on radar images acquired from space. Therefore only a few maps covering quite limited geographical regions could be produced, say, weekly. In the ASIP project we automate the mapping process by fusing data from Sentinel-1 (C-band radar HH and HV polarizations) and AMSR-2 (weak microwave emission from the surface and the atmosphere) to detect sea ice concentrations in Greenlandic waters. This is done by means of convolutional neural networks trained on manually produced sea ice maps representing a large variety of situations geographically, seasonally and in different weather conditions (wind etc.). Unlike the case with manually produces maps, this approach allows us to produce sea ice maps for all situations (geographically, temporally etc.) where input data exist. The figure below shows an example of a probability map for sea ice coverage in a small area near the Greenland coast (blue is low probability going over green into yellow for high probability).

    ASIP is being developed and implemented further by one of the project partners, namely the Danish Meteorological Institute, DMI. Presently, ASIP runs in a semi-operational setup with a daily sea ice concentration product delivered to the European Union's Copernicus Marine Service, CMEMS. Also, ASIP's sea ice concentration product is being updated near-real time on DMI's geoserver. Work at DMI on extending the model to all of the Arctic Ocean and to the Antarctic Ocean (aka the Southern Ocean) is going on.

    For more information see

    In-vivo Dosimetry

    The behaviour of a novel soft tissue marker, that functions both as fiducial marker and as an in-vivo dosimeter, was validated through a pratical experiment and statistical modelling. In the image we see the x-ray contrast in the CT images and the activation in the PET images, that makes dosimetry possible.

    For more information see

    A.N. Christensen, J.S. Rydhög, R.V. Søndergaard, T.L. Andresen, S. Holm, P. Munck af Rosenschöld, K. Conradsen, and R.I. Jølck (2016). Injectable silver nanosensors: in vivo dosimetry for external beam radiotherapy using positron emission tomography, Nanoscale 21DOI:10.1039/C6NR00201C.

    Fibre Directionality Estimation

    The focus of the work described in the sequel is on developing statistical image analysis pipelines to characterise fibre geometry in unidirectional (UD) fibre reinforced composites at the micro-scale level and to provide experimental data enabling modelling of composite behaviour at different loads.

    In the image below we compare X-ray CT (XCT) at different resolutions to Synchrotron Radiation CT (SRCT), optical microscopy (OM) and scanning electron microscopy (SEM) (left). The marginal distributions and correlations are depicted through histograms and scatter plots (middle). The spatial dependence of the size is modelled by regression analysis (right).

    As part of investigating the sudden compressive failure of UD fibre reinforced composites at loads well below their tensile strengths we combine fast in-situ X-ray CT with advanced image analysis to capture the changes in fibre orientation in 3D during uninterrupted progressive loading in compression of a glass fibre reinforced polymer. Below we see – prior to load - all fibre trajectories in a region of interest, to the left a 3D side view and in the middle a plan view. Fibres completely aligned with the z-axis will in (b) appear as a point, misaligned fibres as a projected curve. To the right we see histograms for the fibre inclanations for different loads (0, 200, 600, 895 Newton).

    For further information see

    Statistical Shape Analysis and Geometric Morphometry

    In biology, analysis of shape based on landmarks has become a versatile tool in investigating evolution and relations between populations of different organisms. Traditionally, 3D-landmarks have been obtained using a tactile digitizer arm. En route to automating this, we have made 3D digital models of the surfaces of grey seal sculls and subsequently annotated those using a mouse and a 2D display. However, landmark annotation of a 3D model on a 2D display is tedious and may thus introduce errors. Furthermore, it can be more time-consuming than direct annotation on the physical object using a digitizer arm. Therefore, we as an alternative to the flat screen annotation introduce a virtual reality (VR) prototype for annotating landmarks on 3D models. The studies involved thorough statistical analyses among others using different ANOVA models to compare operators, methods, determining intra- and inter species variation etc.

    A specific biological result has been that in a further study of the data and models generated, it was concluded that the separate subspecies status of the Baltic and Atlantic grey seals is questionable from a morphological point of view.

    Checkpoints for digital landmark placement (using the Stratovan Checkpoint software), and to the right the 31 landmarks used.

    For more information see

    Courses

    Due to their nature images are inherently multivariate. The following course gives a foundation for statitical analysis in a multivariate setting.