Simplify the categorization of medical images using Amazon SageMaker Canvas | Amazon Web Services

Medical image analysis is essential to both the diagnosis and the treatment of disease. Automating this analysis with machine learning (ML) helps healthcare practitioners detect certain cancers, heart conditions, and ophthalmologic conditions more quickly. However, building ML models for image classification is a long and complex process, and it is one of the major obstacles facing researchers and clinicians in this field. Traditional approaches require a deep understanding of ML algorithms and coding experience, which is a barrier for many healthcare practitioners.

To close this gap, we used Amazon SageMaker Canvas, a visual tool that lets medical professionals build and deploy ML models without coding or other specialist skills. This easy-to-use approach removes the steep learning curve associated with ML, allowing clinicians to devote more time to their patients.

Amazon SageMaker Canvas provides a drag-and-drop interface for building ML models. Clinicians choose the data they want to use and the kind of output they want, and the model is then built and trained automatically. Once trained, the model can generate predictions on new images.

This approach is well suited to medical professionals who want to use ML to improve their diagnostic and treatment decisions. With Amazon SageMaker Canvas, they can use ML to help their patients without having to be ML experts.

Medical image classification directly affects patient outcomes and healthcare efficiency. Timely and accurate classification of medical images enables early detection of disease, which supports effective treatment planning and monitoring. Moreover, because accessible interfaces such as Amazon SageMaker Canvas democratize ML, a broader range of healthcare professionals, including those without extensive technical backgrounds, can now contribute to medical image analysis. This inclusive approach fosters collaboration and knowledge sharing, which ultimately advances healthcare research and improves patient care.

In this post, we explore the medical image classification capabilities of Amazon SageMaker Canvas, discuss its benefits, and walk through a real-world use case that demonstrates its impact on medical diagnostics.

Use case
Skin cancer is a serious and potentially fatal disease, and the earlier it is detected, the better the chance of successful treatment. Statistically, skin cancer (for example, basal and squamous cell carcinomas) is one of the most common types of cancer and causes hundreds of thousands of deaths worldwide each year. It manifests as the abnormal growth of skin cells.

However, early diagnosis greatly increases the chances of recovery. It may also reduce or eliminate the need for surgical, radiological, or chemotherapeutic treatment, which helps lower healthcare costs.

Diagnosis of skin cancer starts with dermoscopy[1], which examines the general size, shape, and color characteristics of skin lesions. Suspected lesions then undergo further sampling and histological tests to confirm the cancer cell type. Doctors use several methods to detect skin cancer, starting with visual detection. For initial screening, physicians use the ABCD guide (asymmetry, border, color, diameter), developed by the American Center for the Study of Dermatology, as a reference for the possible shape of melanoma. When a suspicious skin lesion is found, the doctor takes a biopsy of the visible lesion and examines it under a microscope to determine whether it is benign or malignant and to identify the type of skin cancer. Computer vision models can help identify suspicious moles or lesions, enabling earlier and more accurate diagnosis.

The steps involved in developing a cancer detection model are listed below:

  • Gather a large dataset of images showing healthy skin and skin with various types of cancerous or precancerous lesions. This dataset must be carefully curated to ensure accuracy and consistency.
  • Preprocess the images using computer vision techniques and extract the features relevant to distinguishing healthy skin from cancerous skin.
  • Train an ML model on the preprocessed images, using a supervised learning approach, so that it learns to distinguish between the different skin types.
  • Evaluate the model’s performance using a range of metrics, such as precision and recall, to make sure it correctly identifies cancerous skin and keeps false positives to a minimum.
  • Integrate the model into a user-friendly tool that dermatologists and other healthcare professionals can use to help detect and diagnose skin cancer.

Overall, developing a skin cancer detection model from scratch typically requires significant time and expertise. Amazon SageMaker Canvas can reduce the time and effort needed for steps 2 through 5.

Summary of the solution
We use a dermatoscopy skin cancer image dataset published on Harvard Dataverse to show how to build a computer vision model for skin cancer without writing any code. The dataset, HAM10000, contains 10,015 dermatoscopic images, and we use it to build a classification model that predicts skin cancer classes. A few key points about the dataset:

  • The dataset serves as a training set for academic machine learning purposes.
  • It includes a representative collection of all important diagnostic categories of pigmented lesions.
  • The dataset includes the following categories: actinic keratoses and intraepithelial carcinoma / Bowen’s disease (akiec), basal cell carcinoma (bcc), benign keratosis-like lesions (solar lentigines / seborrheic keratoses and lichen-planus like keratoses, bkl), dermatofibroma (df), melanoma (mel), melanocytic nevi (nv), and vascular lesions (angiomas, angiokeratomas, pyogenic granulomas and hemorrhage, vasc).
  • More than 50% of the lesions in the dataset are confirmed through histopathology (histo).
  • For the remaining cases, the ground truth is determined by follow-up examination (follow_up), expert consensus (consensus), or in vivo confocal microscopy (confocal).
  • Some lesions have multiple images in the dataset; these can be tracked through the lesion_id column of the HAM10000_metadata file.
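
As a rough illustration of the last point, the sketch below counts images per diagnostic class and finds lesions that have more than one image. It assumes the metadata file is a CSV with lesion_id, image_id, dx, and dx_type columns; the inline rows are made-up sample data in that assumed layout, not real records.

```python
import csv
import io
from collections import Counter, defaultdict

# Made-up sample rows in the ASSUMED layout of HAM10000_metadata (not real records).
SAMPLE_CSV = """lesion_id,image_id,dx,dx_type
HAM_0000001,ISIC_0000001,bkl,histo
HAM_0000001,ISIC_0000002,bkl,histo
HAM_0000002,ISIC_0000003,nv,follow_up
HAM_0000003,ISIC_0000004,mel,histo
"""


def summarize(rows):
    """Return (image count per class, lesions that have more than one image)."""
    per_class = Counter()
    images_by_lesion = defaultdict(list)
    for row in rows:
        per_class[row["dx"]] += 1
        images_by_lesion[row["lesion_id"]].append(row["image_id"])
    multi = {k: v for k, v in images_by_lesion.items() if len(v) > 1}
    return per_class, multi


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
per_class, multi_image_lesions = summarize(rows)
print(dict(per_class))
print(multi_image_lesions)
```

To run this against the real file, replace the inline sample with the opened metadata file and check that the header matches the assumed column names.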

We show how Amazon SageMaker Canvas simplifies image classification for multiple skin cancer categories without writing a single line of code. Given an image of a skin lesion, SageMaker Canvas image classification automatically classifies it as benign or possibly cancerous.


Prerequisites
  • Access to an AWS account with permissions to create the resources described in the steps section.
  • An AWS Identity and Access Management (AWS IAM) user with full permissions to use Amazon SageMaker.

Set up a SageMaker domain
Create an Amazon SageMaker domain.
Download the HAM10000 dataset from Harvard Dataverse.
Set up datasets
Create an Amazon Simple Storage Service (Amazon S3) bucket named image-classification-ACCOUNT_ID, where ACCOUNT_ID is your AWS account number.
Figure 1: Creating a bucket

Create two folders in this bucket: training-data and test-data.
Figure 2: Creating folders

Under training-data, create seven folders, one for each lesion category in the dataset: akiec, bcc, bkl, df, mel, nv, and vasc.
Figure 3: Folder view
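
The console steps above can also be scripted. The following is a minimal sketch, not part of the original walkthrough: it assumes the bucket name is the image-classification- prefix followed by your account ID, and it uses the boto3 S3 client. The helper names (bucket_name, folder_keys, create_layout) are hypothetical.

```python
CLASSES = ["akiec", "bcc", "bkl", "df", "mel", "nv", "vasc"]


def bucket_name(account_id):
    """Bucket name, ASSUMING the prefix is followed directly by the account ID."""
    return f"image-classification-{account_id}"


def folder_keys():
    """Keys for the folder layout described above.

    S3 has no real directories; a zero-byte object whose key ends in '/'
    appears as a folder in the console.
    """
    keys = ["training-data/", "test-data/"]
    keys += [f"training-data/{c}/" for c in CLASSES]
    return keys


def create_layout(account_id, region):
    """Create the bucket and the folder placeholders (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without the AWS SDK

    s3 = boto3.client("s3", region_name=region)
    name = bucket_name(account_id)
    if region == "us-east-1":
        # us-east-1 does not accept an explicit LocationConstraint
        s3.create_bucket(Bucket=name)
    else:
        s3.create_bucket(
            Bucket=name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    for key in folder_keys():
        s3.put_object(Bucket=name, Key=key)
    return name


print(folder_keys())
```

Calling create_layout("123456789012", "us-west-2") with a placeholder account ID and Region of your own would create the bucket and the nine folder placeholders; only the pure helpers run when the script is executed as shown.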

The dataset contains multiple images of some lesions, which can be tracked through the lesion_id column of the HAM10000_metadata file. Using the lesion_id column to keep track of images of the same lesion, copy the corresponding images into the matching category folder (you can start with 100 images per category).
Figure 4: Listing of objects to import (sample images)

Use Amazon SageMaker Canvas
Open the AWS console, navigate to the Amazon SageMaker service, and choose Canvas from the list. On the Canvas page, choose Open Canvas.
Figure 5: Navigating to SageMaker Canvas

On the Canvas page, choose My models, and then choose New model on the right side of the screen.
Figure 6: Creating a model

In the pop-up window that appears, enter image_classify as the model name and choose Image analysis under Problem type.
Import the dataset
On the next page, choose Create dataset, name the dataset image_classify in the pop-up box, and choose Create.
Figure 7: Creating a dataset

On the next page, choose Amazon S3 as the Data Source. You can also upload the images directly (Local upload).
Figure 8: Importing the dataset from an S3 bucket

When you choose Amazon S3, you see a list of the buckets in your account. Select the parent bucket that holds the dataset in subfolders (for example, image-classify-2023), and then choose Import data. This lets SageMaker Canvas label the images quickly based on the folder names.
Once the dataset is imported successfully, the value in the Status column changes from Processing to Ready.
To choose your dataset, choose Select dataset at the bottom of the page.
Build your model
On the Build page, you should see your data imported and labeled according to the folder names in Amazon S3.
Figure 9: Labeling of Amazon S3 data

When you choose the Quick build button (highlighted in red in the accompanying image), you see two options for building the model: Quick build and Standard build. As the name suggests, Quick build prioritizes speed over accuracy and takes about 15 to 30 minutes. Standard build prioritizes accuracy over speed and takes from 45 minutes to 4 hours. It uses Amazon SageMaker Autopilot functionality to run experiments with different hyperparameter combinations, generating many models in the backend before selecting the best one.
Choose Standard build to start building the model. It takes around two to five hours to complete.
Figure 10: Running a Standard build

Once the model build is complete, you can see an estimated accuracy, as shown in Figure 11.
Figure 11: Model prediction

On the Scoring tab, you can see details of the model’s accuracy. Choosing Advanced metrics on the Scoring tab also shows the precision, recall, and F1 score (a balanced measure of accuracy that takes class balance into account).
The advanced metrics that Amazon SageMaker Canvas displays depend on whether your model performs numeric, categorical, image, text, or time series forecasting predictions on your data. Categorical prediction, such as 2-category or 3+ category prediction, refers to the mathematical concept of classification. In this case, we believe recall is more important than precision, because missing a cancer is far more dangerous than raising a false alarm. Recall is the fraction of true positives (TP) out of all actual positives (TP + false negatives); it measures the proportion of positive cases that the model correctly predicted as positive. For a thorough explanation of the advanced metrics, see the post A deep dive into Amazon SageMaker Canvas advanced metrics.
Figure 12: Advanced metrics
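
To make the relationship between these metrics concrete, here is a small sketch that computes precision, recall, and F1 from true positive, false positive, and false negative counts using the standard definitions. The counts are hypothetical and are not results from the model built in this post.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions; returns 0.0 where a denominator would be zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical counts for one class (for example, mel):
# 80 lesions caught, 20 missed, 40 false alarms.
p, r, f1 = precision_recall_f1(tp=80, fp=40, fn=20)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# prints: precision=0.67 recall=0.80 f1=0.73
```

In this made-up example the model misses 20% of the positive cases (recall 0.80) even though a third of its positive calls are false alarms, which is the kind of trade-off to weigh when recall matters more than precision.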

This concludes the Amazon SageMaker Canvas model building stage.

Test your model
Choose Predict. On the Predict page, you can submit your own images through Single prediction or Batch prediction. Select the option you prefer, and then choose Import to upload your image and test the model.
Figure 13: Testing your own images

Let’s start with a single image prediction. Make sure Single prediction is selected, and choose Import image. This opens a dialog box where you can upload an image from Amazon S3 or from your local machine. In this case, we choose Amazon S3, browse to the directory that holds the test images, and select any image. Then choose Import data.
Figure 14: Single image prediction

After you make your selection, you should see a screen that reads “Generating prediction results”. Your results should be available within a few minutes, as shown in Figure 15.
Figure 15: Single image prediction results

Now let’s try batch prediction. Under Run predictions, choose Batch prediction. Then choose Import new dataset, name the dataset BatchPrediction, and choose Create.

In the next dialog, make sure Amazon S3 upload is selected, browse to the directory that holds the test set, and choose Import data.
Figure 16: Batch image prediction

Once the images are in the Ready state, select the radio button for the dataset you created and choose Generate predictions. The batch prediction status should now read Generating predictions. Let’s wait a few minutes for the results.
Once the status is Ready, choose the dataset name to open a page showing the detailed prediction for every image.
Figure 17: Batch image prediction results

Another important feature of Batch prediction is the ability to verify the results and download the predictions as a zip or csv file for further use or sharing.
Figure 18: Downloading the predictions
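
As one example of further use, the sketch below lists the images whose prediction confidence falls below a threshold so that a clinician can review them first. The column names (filename, predicted_label, confidence) and the sample rows are assumptions made for illustration; check the header of the file you actually download and adjust accordingly.

```python
import csv
import io

# Made-up rows. The column names are ASSUMPTIONS, not the documented
# layout of the SageMaker Canvas export.
SAMPLE = """filename,predicted_label,confidence
img1.jpg,nv,0.97
img2.jpg,mel,0.55
img3.jpg,bcc,0.91
"""


def needs_review(rows, threshold=0.8):
    """Return filenames whose prediction confidence is below the threshold."""
    return [r["filename"] for r in rows if float(r["confidence"]) < threshold]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(needs_review(rows))  # prints: ['img2.jpg']
```

The 0.8 threshold is arbitrary; a suitable value depends on the class and on how the predictions are used.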

With this, you have successfully used Amazon SageMaker Canvas to build a model, train it, and test its predictions.

Clean up
Choose Log out in the left navigation pane of the Amazon SageMaker Canvas application to stop consuming SageMaker Canvas workspace instance hours and release all resources.

[1] Fraiwan M, Faouri E. On the Automatic Detection and Classification of Skin Cancer Using Deep Transfer Learning. Sensors (Basel). 2022 Jun 30;22(13):4963. doi: 10.3390/s22134963. PMID: 35808463; PMCID: PMC9269808.

Conclusion
In this post, we showed how ML-based medical image analysis can accelerate the diagnosis of skin cancer, and how the same approach applies to diagnosing other diseases. However, building ML models for image classification is often complex and time-consuming, and it requires expertise in ML and coding. Amazon SageMaker Canvas addresses this challenge with a visual interface that requires no coding or specialized ML skills. This lets healthcare professionals use ML without a steep learning curve, so they can focus on patient care.

The traditional process of developing a cancer detection model is laborious and time-consuming. It involves gathering a curated dataset, preprocessing images, training an ML model, evaluating its performance, and integrating it into a user-friendly tool for healthcare professionals. Amazon SageMaker Canvas simplified the steps from preprocessing to integration, which reduced the time and effort required to build a skin cancer detection model.

We explored the capabilities of Amazon SageMaker Canvas for medical image classification, discussing its benefits and walking through a practical example of its impact on medical diagnostics. The use case we examined, early detection of skin cancer, matters because early detection often leads to much better treatment outcomes and lower healthcare costs.

It is important to acknowledge that the accuracy of the model can vary with factors such as the size of the training dataset and the specific type of model employed. These factors affect the performance and reliability of the classification results.

Amazon SageMaker Canvas can be a valuable tool that helps healthcare professionals diagnose diseases with greater speed and accuracy. However, it is not intended to replace the expertise and judgment of healthcare professionals. Rather, it augments their capabilities and supports their decisions. The human element remains essential to decision-making, and delivering the best possible patient care requires healthcare professionals and artificial intelligence (AI) tools such as Amazon SageMaker Canvas to work together.
