This content applies to Device version: 1.1.0.0

Endpoints

Explanation

The device works in a straightforward way: you send images, and you get back a DiagnosticReport, as defined in HL7's FHIR® specifications. The following chart explains the basics of it:

Diagnostic support

Although diagnosis is a quotidian way of speaking about the output of the device, keep in mind that what the device outputs is an interpretative distribution representation of possible International Classification of Diseases (ICD) classes that might be represented in the pixel content of the image.

Indeed, healthcare practitioners and organisations may use the data output by the device to inform a diagnosis, but what the device itself outputs is not a diagnosis. This is appropriately signalled in the output of the device, which follows the FHIR standard: the output is a DiagnosticReport with a status of preliminary.

Severity measure

Although severity measure is a quotidian way of speaking about the output of the device, keep in mind that what the device outputs is quantifiable data on the intensity, count and extent of clinical signs such as erythema, desquamation, and induration, among others.

Indeed, healthcare practitioners and organisations may use the data output by the device to determine the degree of affectation of a patient, but what the device itself outputs is not the severity. This is appropriately signalled in the output of the device, which follows the FHIR standard: the output is a DiagnosticReport with a status of preliminary.

Clinical indicators

Clinical indicators are a set of values derived from the diagnostic support output of the device. The diagnostic support output of the device is an interpretative distribution representation of possible International Classification of Diseases (ICD) classes that might be represented in the pixel content of the image. In other words, it is a probability distribution in which every ICD-11 category is given a probability value between 0 and 1. As it is a probability distribution, the values of the entire distribution sum to 1.

Condition confirmation

The diagnostic support output of the device covers a wide range of classes. All of them are ICD-11 categories except for one: Non-specific lesion. This class is activated when the device does not find any condition in the image.

With this in mind, it is possible to determine from the predicted probability distribution how likely it is that the picture contains any kind of condition (hasCondition). This could be done by summing the probabilities of all classes, excluding Non-specific lesion:

$$p_{condition} = \sum_{i \neq nsl} p_i$$

However, there is a faster and more efficient way to calculate it:

$$p_{condition} = 1 - p_{nsl}$$

Where $p_{nsl}$ is the probability given to the Non-specific lesion category in the probability distribution. As the entire probability distribution sums to 1, the probability of the image depicting a condition is simply the total probability (1) minus the Non-specific lesion probability.
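As a minimal sketch of the shortcut above, the following Python snippet computes the condition probability from a distribution held in a dictionary. The function name, the dictionary layout, and the category names and values are hypothetical and chosen only for illustration; they are not part of the device's API.

```python
import math


def condition_probability(distribution: dict[str, float],
                          nsl_key: str = "Non-specific lesion") -> float:
    """Return 1 - p_nsl, the probability that the image depicts any condition."""
    total = sum(distribution.values())
    # The shortcut is only valid when the distribution sums to 1.
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError(f"distribution sums to {total}, expected 1")
    return 1.0 - distribution[nsl_key]


# Hypothetical distribution over two ICD-11 categories plus Non-specific lesion.
example = {
    "ICD-11 category A": 0.6,
    "ICD-11 category B": 0.25,
    "Non-specific lesion": 0.15,
}

p_condition = condition_probability(example)

# The slower alternative: sum every class except Non-specific lesion.
p_summed = sum(p for name, p in example.items() if name != "Non-specific lesion")
```

Both routes give the same value up to floating-point rounding; the shortcut reads a single entry instead of iterating over every category.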

Weighted sum findings

Several clinical indicators (pigmentedLesion, urgentReferral, highPriorityReferral, and malignancy) are obtained via a weighted sum of the device's output, using category weights defined specifically for each finding. These weights are binary values (0 or 1) and indicate which ICD-11 categories from the output contribute to the finding's value.

  • For the pigmentedLesion finding, all ICD-11 categories that correspond to a pigmented lesion are given a weight of 1 ($w_i = 1$), and all other categories a weight of 0 ($w_i = 0$).
  • For the urgentReferral finding, all ICD-11 categories related to conditions that require urgent referral (i.e. should be referred within 0-48 hours) are given a weight of 1 ($w_i = 1$), and all other categories a weight of 0 ($w_i = 0$).
  • For the highPriorityReferral finding, all ICD-11 categories related to conditions that, despite not requiring an urgent referral, have a higher priority for referral than others (i.e. referred in 7-15 days) are given a weight of 1 ($w_i = 1$), and all other categories a weight of 0 ($w_i = 0$).
  • For the malignancy finding, all ICD-11 categories related to malignancy (skin cancer) are given a weight of 1 ($w_i = 1$), and all other categories a weight of 0 ($w_i = 0$).

The value of each finding ($f$) is computed using the weighted sum of the device's output (i.e. the probability distribution):

$$f = \sum_{i=1}^{N} w_i \cdot p_i$$

Where $N$ is the total number of predicted ICD-11 categories, and $w_i$ and $p_i$ are the weight and probability of the $i$-th category of the distribution.
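The weighted sum can be sketched in a few lines of Python. The function name, the probabilities, and the weight vector below are hypothetical; the actual category weights for each finding are defined by the device and are not reproduced here.

```python
def finding_value(probabilities: list[float], weights: list[int]) -> float:
    """Weighted sum of the probability distribution using binary weights."""
    if len(probabilities) != len(weights):
        raise ValueError("probabilities and weights must have the same length")
    if any(w not in (0, 1) for w in weights):
        raise ValueError("weights must be binary (0 or 1)")
    return sum(w * p for w, p in zip(weights, probabilities))


# Hypothetical distribution over four categories (sums to 1).
probabilities = [0.5, 0.25, 0.125, 0.125]

# Hypothetical weights: assume categories 0 and 2 count towards the finding.
malignancy_weights = [1, 0, 1, 0]

malignancy = finding_value(probabilities, malignancy_weights)  # 0.5 + 0.125
```

Because the weights are binary, the result is simply the total probability mass assigned to the categories that belong to the finding, so it always lies between 0 and 1.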

Performance indicators

Performance indicators are a set of values that provide a deeper understanding of the device's skin disease recognition performance on a given input as well as on the internal test data used during development. Like the clinical indicators, they are derived from the diagnostic support output of the device, which is an interpretative distribution representation of possible International Classification of Diseases (ICD) classes that might be represented in the pixel content of the image. In other words, it is a probability distribution in which every ICD-11 category is given a probability value between 0 and 1. As it is a probability distribution, the values of the entire distribution sum to 1.

Top-K sensitivity and specificity

In order to measure the skin disease recognition performance of the device for each specific ICD-11 category, we compute category-wise top-K sensitivity and specificity metrics on our hold-out test set. These are common metrics for binary classification scenarios:

  • Sensitivity (also known as true positive rate) is the probability of a positive output, conditioned on the test case truly being positive.
  • Specificity (or true negative rate) is the probability of a negative output, conditioned on the test case truly being negative.
$$\text{Sensitivity} = \frac{TP}{TP + FN} \qquad \text{Specificity} = \frac{TN}{TN + FP}$$

In order to apply these metrics to our multiclass scenario, we used the following strategy for each ICD-11 category $C$. We also modified the metrics to account for the diagnostic support use case (i.e. looking at the top-K suggestions instead of just the top-1 prediction):

  1. If the ground truth label of an image corresponds to class $C$, it is considered a positive (1) case, and negative (0) if it is any other ICD-11 category.
  2. If category $C$ is within the top-K predicted classes, the prediction is considered a positive (1) output. If category $C$ is not in the top-K list, the output is negative (0).

As the predictions and ground truth labels have been converted to a binary case (0/1), we can compute sensitivity and specificity using true positives, false negatives, false positives, and true negatives:

|                | Positive output       | Negative output       |
|----------------|-----------------------|-----------------------|
| Positive label | ✔️ True positive (TP) | False negative (FN)   |
| Negative label | False positive (FP)   | ✔️ True negative (TN) |

We compute sensitivity and specificity for several values of $K$ (1, 3, and 5), resulting in the top-1, top-3, and top-5 sensitivity and specificity performance indicators.
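The binarisation strategy described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used to produce the reported indicators: the function names, the integer encoding of categories, and the toy data are all hypothetical.

```python
def top_k_confusion(y_true: list[int], y_prob: list[list[float]],
                    category: int, k: int) -> tuple[int, int, int, int]:
    """Count TP, FN, FP, TN for one category using top-K binarisation."""
    tp = fn = fp = tn = 0
    for label, probs in zip(y_true, y_prob):
        # Indices of the k categories with the highest predicted probability.
        top_k = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        positive_label = label == category   # step 1: binarise the ground truth
        positive_output = category in top_k  # step 2: binarise the prediction
        if positive_label and positive_output:
            tp += 1
        elif positive_label:
            fn += 1
        elif positive_output:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def top_k_sensitivity_specificity(y_true, y_prob, category, k):
    """Return (sensitivity, specificity); NaN when a denominator is zero."""
    tp, fn, fp, tn = top_k_confusion(y_true, y_prob, category, k)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data: four images, three hypothetical categories encoded as 0, 1, 2.
y_true = [0, 0, 1, 2]
y_prob = [
    [0.7, 0.2, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
    [0.4, 0.1, 0.5],
]

top1 = top_k_sensitivity_specificity(y_true, y_prob, category=0, k=1)
top2 = top_k_sensitivity_specificity(y_true, y_prob, category=0, k=2)
```

In this toy data, moving from K=1 to K=2 leaves the sensitivity for category 0 unchanged but lowers its specificity, because the fourth image now lists category 0 among its top-2 suggestions. In general, a larger K can only turn negative outputs into positive ones, so sensitivity never decreases and specificity never increases as K grows.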

Entropy

This finding is included to provide the user with an estimation of the uncertainty associated with the diagnostic support output of the device. Normalised entropy ($H$) is defined as:

$$H = -\frac{\sum_{i=1}^{N} p_i \cdot \ln(p_i)}{\ln(N)}$$

Where $\ln$ is the natural logarithm, $N$ the total number of ICD-11 categories, and $p_i$ the probability of the $i$-th category in the probability distribution.

Low entropy values indicate that the mass of the distribution is concentrated on a small number of categories, which can be interpreted as the device being confident about its prediction. Conversely, high entropy values indicate that the mass of the probability distribution is spread more evenly across the categories (a value of 1 corresponds to a perfectly uniform distribution), suggesting that the device is not confident about its prediction.
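A minimal Python sketch of the normalised entropy follows. The function name and example distributions are hypothetical. The sketch also assumes the usual convention that a category with zero probability contributes nothing to the sum; the text above does not state how the device handles that case.

```python
import math


def normalised_entropy(probabilities: list[float]) -> float:
    """Entropy of the distribution divided by ln(N), so the result lies in [0, 1]."""
    n = len(probabilities)
    if n < 2:
        raise ValueError("at least two categories are needed, since ln(1) = 0")
    # Assumption: terms with p = 0 are skipped, since p * ln(p) tends to 0 as p -> 0.
    entropy = -sum(p * math.log(p) for p in probabilities if p > 0)
    return entropy / math.log(n)


# All mass on one category: the lowest possible uncertainty.
h_peaked = normalised_entropy([1.0, 0.0, 0.0, 0.0])

# Mass spread equally across all categories: the highest possible uncertainty.
h_uniform = normalised_entropy([0.25, 0.25, 0.25, 0.25])

# An intermediate case falls between the two extremes.
h_mixed = normalised_entropy([0.7, 0.1, 0.1, 0.1])
```

Dividing by the logarithm of the number of categories makes the value independent of how many categories the distribution contains, so the two extremes are always 0 and 1.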