Fine-Tuning and Deploying Custom AI Models on Amazon Bedrock

Minh Huynh

2025.05.26

A Practical Guide to Fine-Tuning and Deploying Custom AI Models on Amazon Bedrock

Lab Introduction

  • AWS experience: Intermediate
  • Time to complete: 60 minutes
  • AWS Region: US East (N. Virginia) us-east-1
  • Cost to complete: $1-2
  • Services used: Amazon Bedrock, Amazon S3, AWS IAM, and others

Use Case: Summarizing Doctor-Patient Conversations

In this lab, we use a dataset of doctor-patient conversations taken from the ACI-Bench dataset. Our task is to fine-tune a model so that it summarizes these conversations into structured clinical notes. The foundation model chosen for fine-tuning is Cohere's command-light-text-v14, which stands out for producing concise, coherent summaries.

The steps are:

  1. Collect real doctor-patient conversations from the ACI-Bench dataset.
  2. Process and normalize the data, and classify the information into sections such as reason for visit, symptoms, diagnosis, treatment, and recommendations.
  3. Select Cohere's command-light-text-v14 as the foundation model because of its ability to produce concise, clear summaries.
  4. Retrain (fine-tune) the model on the labeled data.
  5. Evaluate the model by comparing its output with clinical notes written by experts.
  6. Real-world application: help doctors generate notes automatically after each conversation, saving time and reducing data-entry errors.

1. Set up the necessary AWS resources

Create a SageMaker notebook instance to run the fine-tuning. In the AWS console, open Amazon SageMaker AI -> Notebook tab -> Create notebook instance.

Notebook configuration information:

  • Notebook instance name: fine-tuning-model-notebook
  • Notebook instance type: ml.t3.medium
  • Platform identifier: Amazon Linux 2, Jupyter Lab 4

Git repositories configuration:

  • Repository: Clone a public Git repository
  • URL: https://github.com/gautrucdethuong/bedrock-fine-tuning-model.git
  • Click Create notebook instance and wait 2-3 minutes.
  • After the notebook instance is created successfully, click Open JupyterLab.
  • Open the file bedrock-custom-model-finetuning.ipynb, choose the kernel conda_python3, and click Select.
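The same notebook instance can also be created from code instead of the console. The following is a minimal sketch, not part of the lab repository: the helper names are hypothetical, the execution role ARN is something you must supply, and the platform identifier string "notebook-al2-v3" is my assumption for "Amazon Linux 2, Jupyter Lab 4" and should be checked against the console.

```python
def notebook_instance_params(role_arn: str) -> dict:
    """Build keyword arguments for SageMaker's create_notebook_instance call."""
    return {
        "NotebookInstanceName": "fine-tuning-model-notebook",
        "InstanceType": "ml.t3.medium",
        # Assumption: API identifier for Amazon Linux 2 with JupyterLab 4.
        "PlatformIdentifier": "notebook-al2-v3",
        # Public repository cloned into the instance at creation time.
        "DefaultCodeRepository": (
            "https://github.com/gautrucdethuong/bedrock-fine-tuning-model.git"
        ),
        # IAM role the notebook instance runs as (you provide this).
        "RoleArn": role_arn,
    }


def create_notebook(sagemaker_client, role_arn: str) -> str:
    """Create the notebook instance and return its ARN."""
    response = sagemaker_client.create_notebook_instance(
        **notebook_instance_params(role_arn)
    )
    return response["NotebookInstanceArn"]
```

To use it, pass a client created with boto3.client("sagemaker", region_name="us-east-1") and the ARN of an existing SageMaker execution role.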

2. Follow the step-by-step instructions in the notebook

Step 1. Install Required Libraries

Install the required libraries. Press Shift + Enter to run each notebook cell.

Step 2. Prepare and upload fine-tuning dataset to S3

Load the prepared dataset from the file dataset/data.csv.

In this step, we convert the dataset into the JSON Lines (JSONL) format required for fine-tuning on Amazon Bedrock. Each line in the JSONL file must be a JSON object with a prompt field and a completion field.

Each record has the following shape (pretty-printed here for readability; in the file, each record occupies a single line):

{
    "completion": "<Summarized clinical note>",
    "prompt": "Summarize the following conversation:\n\n<Doctor-patient dialogue>"
}
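As an illustration of the conversion, here is a minimal sketch that turns CSV rows into JSONL records of the shape shown above. The column names "dialogue" and "note" are assumptions about dataset/data.csv, and the function names are hypothetical; adjust them to match the actual file and the notebook's own code.

```python
import csv
import io
import json

PROMPT_PREFIX = "Summarize the following conversation:\n\n"


def rows_to_jsonl(rows, dialogue_col="dialogue", note_col="note") -> str:
    """Convert an iterable of dict rows into JSONL text, one record per line."""
    lines = []
    for row in rows:
        record = {
            "prompt": PROMPT_PREFIX + row[dialogue_col].strip(),
            "completion": row[note_col].strip(),
        }
        # json.dumps escapes embedded newlines, so each record stays on one line.
        lines.append(json.dumps(record))
    return "".join(line + "\n" for line in lines)


def csv_text_to_jsonl(csv_text: str, **columns) -> str:
    """Parse CSV text (header row required) and return the JSONL equivalent."""
    return rows_to_jsonl(csv.DictReader(io.StringIO(csv_text)), **columns)
```

For a file on disk, read it with open("dataset/data.csv", newline="", encoding="utf-8") and pass csv.DictReader(f) to rows_to_jsonl, then write the returned string to a .jsonl file.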

To make the dataset accessible for fine-tuning, upload it to an Amazon S3 bucket.

Once the bucket is verified, the fine-tuning dataset, saved in JSON Lines format, is uploaded to the specified bucket. This step is essential, as Amazon Bedrock accesses the dataset from S3 during the fine-tuning process.

Verify that the dataset was uploaded to S3.
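The bucket check, upload, and verification can be sketched with three boto3 S3 calls. This is a hedged example rather than the notebook's exact cell: the function name is hypothetical, and bucket_name and s3_key stand for whatever values the notebook defines.

```python
def upload_and_verify(s3_client, local_path: str, bucket_name: str, s3_key: str):
    """Upload a local file to S3 and confirm that the object exists.

    Returns the S3 URI and the stored object size in bytes.
    """
    # Raises botocore.exceptions.ClientError if the bucket is missing
    # or the caller has no access to it.
    s3_client.head_bucket(Bucket=bucket_name)

    # Upload the JSONL training file.
    s3_client.upload_file(local_path, bucket_name, s3_key)

    # head_object fails if the key does not exist, so a successful call
    # verifies the upload.
    head = s3_client.head_object(Bucket=bucket_name, Key=s3_key)
    return f"s3://{bucket_name}/{s3_key}", head["ContentLength"]
```

Call it with a client from boto3.client("s3", region_name="us-east-1"). The returned URI is the value that the fine-tuning job later expects in trainingDataConfig.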

Step 3. Fine-tune the model on Amazon Bedrock

List the foundation models that are available for fine-tuning in the us-east-1 Region.
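A minimal sketch of that listing, using the Bedrock control-plane API's list_foundation_models call with its byCustomizationType filter. The helper name is hypothetical and the notebook's own cell may print more detail.

```python
def list_fine_tunable_models(bedrock_client) -> list:
    """Return the sorted IDs of foundation models that support fine-tuning."""
    response = bedrock_client.list_foundation_models(
        byCustomizationType="FINE_TUNING"
    )
    return sorted(summary["modelId"] for summary in response["modelSummaries"])
```

With bedrock = boto3.client("bedrock", region_name="us-east-1"), calling list_fine_tunable_models(bedrock) should include the Cohere model used in this lab if it is offered in your account and Region.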

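Step 4. Create an IAM role for fine-tuning

Amazon Bedrock needs an IAM role that it can assume to read the training data from S3 and write the job output back. Run the cell below: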
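The role needs two policy documents: a trust policy that lets the Bedrock service assume the role, and a permissions policy for the S3 bucket. The sketch below shows one plausible shape; the role and policy names are hypothetical, and the exact statements in the notebook may differ.

```python
import json


def trust_policy(account_id: str) -> dict:
    """Trust policy allowing Amazon Bedrock to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "bedrock.amazonaws.com"},
            "Action": "sts:AssumeRole",
            # Restrict use of the role to jobs from this account.
            "Condition": {"StringEquals": {"aws:SourceAccount": account_id}},
        }],
    }


def s3_access_policy(bucket_name: str) -> dict:
    """Permissions to read training data and write job output in one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
        }],
    }


def create_fine_tuning_role(iam_client, role_name, account_id, bucket_name) -> str:
    """Create the role, attach the inline S3 policy, and return the role ARN."""
    role = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy(account_id)),
    )
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName=f"{role_name}-s3-access",
        PolicyDocument=json.dumps(s3_access_policy(bucket_name)),
    )
    return role["Role"]["Arn"]
```

The returned ARN is the value passed as roleArn when the fine-tuning job is submitted.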

Step 5. Create and submit a fine-tuning job

With the dataset uploaded to Amazon S3 and the necessary resources in place, the next step is to create and submit the fine-tuning job. This involves specifying the pre-trained foundation model, the job details, and the fine-tuning parameters.

In this example, we fine-tune the Cohere command-light-text-v14 model to summarize medical conversations. Below is the configuration used to submit the job:

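import boto3

# Control-plane client for Amazon Bedrock (model customization APIs)
bedrock = boto3.client("bedrock", region_name="us-east-1")

# role_arn, bucket_name, and s3_key are defined in the earlier notebook steps

# Define the job parameters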
base_model_id = "cohere.command-light-text-v14:7:4k"
job_name = "cohere-Summarizer-medical-finetuning-job-v1"
model_name = "cohere-Summarizer-medical-Tuned-v1"

# Submit the fine-tuning job
bedrock.create_model_customization_job(
    customizationType="FINE_TUNING",
    jobName=job_name,
    customModelName=model_name,
    roleArn=role_arn,
    baseModelIdentifier=base_model_id,
    hyperParameters={
        "epochCount": "3",  # Number of passes over the dataset
        "batchSize": "16",  # Number of samples per training step
        "learningRate": "0.00005",  # Learning rate for weight updates
    },
    trainingDataConfig={"s3Uri": f"s3://{bucket_name}/{s3_key}"},
    outputDataConfig={"s3Uri": f"s3://{bucket_name}/finetuned/"}
)

Key Parameters:

  • Base Model: The pre-trained model (cohere.command-light-text-v14) serves as the foundation for customization.
  • Job Name and Model Name: These identifiers help track the fine-tuning job and the resulting fine-tuned model for future deployments.

Hyperparameters:

  • epochCount: Specifies the number of full passes over the training dataset. For demonstration, three epochs are used, but more epochs may yield better results for larger datasets.
  • batchSize: Determines how many samples are processed in each training step. A value of 16 balances memory usage and training efficiency.
  • learningRate: Sets the pace at which the model learns. Lower values ensure stable training but may require more time to converge.

Training and Output Configuration:

  • The trainingDataConfig points to the S3 location of the dataset.
  • The outputDataConfig specifies where the fine-tuned model will be stored.
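To get a feel for how epochCount and batchSize interact, the number of weight-update steps can be estimated with simple arithmetic. This is a back-of-the-envelope sketch under the assumption that a partial final batch still counts as one step; it does not describe how Bedrock batches data internally, and the dataset size used in the usage note is hypothetical.

```python
import math


def training_steps(num_examples: int, batch_size: int = 16, epoch_count: int = 3) -> int:
    """Estimate the total number of training steps for a fine-tuning run."""
    # Each epoch processes every example once, in batches of batch_size.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epoch_count
```

For example, a hypothetical dataset of 100 examples gives ceil(100 / 16) = 7 steps per epoch, so training_steps(100) returns 21 with the settings above.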

Run this cell in the notebook:

You can also check the status of the fine-tuning job in the AWS console:

Amazon Bedrock -> Custom models -> Jobs tab -> click the job to view its details
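The job status can also be polled from code with get_model_customization_job. The following is a minimal sketch; the helper name and polling interval are my own choices, not taken from the notebook.

```python
import time

# Statuses after which the job will not change state again.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_job(bedrock_client, job_name: str, poll_seconds: int = 60,
                 sleep=time.sleep) -> str:
    """Poll a model customization job until it reaches a terminal status."""
    while True:
        job = bedrock_client.get_model_customization_job(jobIdentifier=job_name)
        status = job["status"]
        print(f"Job {job_name}: {status}")
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
```

Pass the same job name that was used when submitting the job. The sleep parameter exists only so the loop can be exercised without real waiting.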

Step 6. Deploy the custom model using Provisioned Throughput

To use the fine-tuned model for inference, you need to purchase Provisioned Throughput.

In the Amazon Bedrock sidebar of the AWS console, go to Custom models, choose the Models tab, select the model you trained, and click Purchase Provisioned Throughput.

Give the provisioned throughput a name, select a commitment term (you can choose "No Commitment" for testing), and then click Purchase Provisioned Throughput.
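The console purchase has an API equivalent, create_provisioned_model_throughput. The sketch below is illustrative only: the helper names are hypothetical, and it omits commitmentDuration, which is how the API expresses a no-commitment purchase.

```python
def purchase_no_commitment_throughput(bedrock_client, custom_model_name: str,
                                      provisioned_name: str,
                                      model_units: int = 1) -> str:
    """Purchase Provisioned Throughput without a commitment term.

    Returns the ARN of the provisioned model, which is used for inference.
    """
    response = bedrock_client.create_provisioned_model_throughput(
        modelUnits=model_units,
        provisionedModelName=provisioned_name,
        modelId=custom_model_name,  # name or ARN of the fine-tuned model
        # commitmentDuration is intentionally omitted (no commitment).
    )
    return response["provisionedModelArn"]


def provisioned_status(bedrock_client, provisioned_model_arn: str) -> str:
    """Return the current status, for example 'Creating' or 'InService'."""
    details = bedrock_client.get_provisioned_model_throughput(
        provisionedModelId=provisioned_model_arn
    )
    return details["status"]
```

Billing starts once the throughput is provisioned, so wait until the status is InService before testing and remove it as soon as you are done.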

Step 7. Test the fine-tuned model

Case 1: Test using code in the notebook

Copy the ARN of the Provisioned Throughput for your custom model (example: arn:aws:bedrock:us-east-1:833005555478:provisioned-model/frd8hjc0gwtp).

In the next step, we send an inference request to the model. Be sure to replace YOUR_MODEL_ARN with the ARN you copied earlier.

I tested it with the following conversation to evaluate its ability to generate concise and meaningful summaries for medical dialogues. The input conversation is designed to reflect a real-world doctor-patient interaction, emphasizing symptoms, medication adherence, and a follow-up plan:

[doctor] Good morning, Mr. Smith. How have you been feeling since your last visit?

[patient] Good morning, doctor. I've been okay overall, but I’ve been struggling with persistent fatigue and some dizziness.

[doctor] I see. Is the dizziness occurring frequently or only under specific circumstances?

[patient] It’s mostly when I stand up quickly or after I've been walking for a while.

[doctor] Have you noticed any changes in your heart rate or shortness of breath during these episodes?

[patient] No shortness of breath, but I do feel my heart racing sometimes.

[doctor] How about your medications? Are you taking them as prescribed?

[patient] Yes, but I missed a few doses of my beta-blocker last week due to travel.

[doctor] That could explain some of the symptoms. I’ll need to check your blood pressure and do an EKG to assess your heart rhythm.

[patient] Okay, doctor.

[doctor] How has your diet been? Are you still following the low-sodium plan we discussed?

[patient] I’ve been trying, but I’ve slipped up a bit during holidays with family meals.

[doctor] I understand. We’ll reinforce that, as it’s critical for managing your hypertension.

[patient] Yes, I’ll make sure to get back on track.

[doctor] Let’s discuss the results from your last bloodwork. Your cholesterol levels were slightly elevated, and your hemoglobin A1c suggests borderline diabetes.

[patient] I see. What does that mean for me?

[doctor] It means we need to focus on dietary changes and consider starting a low-dose statin. I’ll also refer you to a nutritionist for better meal planning.

[patient] That makes sense. Thank you, doctor.

[doctor] Lastly, you mentioned experiencing more frequent leg swelling recently. Is that still a concern?

[patient] Yes, especially after long days at work.

[doctor] That could be a sign of fluid retention. I’ll adjust your diuretic dose and monitor your progress over the next two weeks.

[patient] Thank you, doctor.

[doctor] All right, let’s get those tests done and review everything at our next appointment. Do you have any other concerns?

[patient] No, I think that’s all for now.

[doctor] Great. See you in two weeks.

Run this cell in the notebook:
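For reference, an inference request to a Cohere Command model on Bedrock can be sketched as follows. This is not the notebook's exact cell: the function names and the max_tokens and temperature values are assumptions, and the prompt prefix mirrors the one used in the training data.

```python
import json


def build_request_body(conversation: str, max_tokens: int = 300,
                       temperature: float = 0.3) -> str:
    """Build the JSON request body for a Cohere Command text model."""
    return json.dumps({
        "prompt": "Summarize the following conversation:\n\n" + conversation,
        "max_tokens": max_tokens,    # assumed limit for the summary length
        "temperature": temperature,  # assumed; lower values are more deterministic
    })


def summarize(runtime_client, model_arn: str, conversation: str) -> str:
    """Invoke the provisioned model and return the generated summary text."""
    response = runtime_client.invoke_model(
        modelId=model_arn,  # the Provisioned Throughput ARN
        body=build_request_body(conversation),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream containing a JSON document.
    payload = json.loads(response["body"].read())
    return payload["generations"][0]["text"].strip()
```

Create the client with boto3.client("bedrock-runtime", region_name="us-east-1") and pass the ARN you copied as model_arn.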

Case 2: Test inference directly from the Playground in the Amazon Bedrock console

Under the Playground section, open Chat/Text, select your fine-tuned model and the inference option (your Provisioned Throughput), then choose Apply.

Copy the same conversation used in Case 1, paste it into the chat box, and click Run.

Model's Response:

Step 8. Clean up

To stop incurring charges, delete the Provisioned Throughput from the Provisioned Throughput section in the sidebar of the Amazon Bedrock console.

Delete the notebook instance.
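The same cleanup can be scripted. The sketch below is a hedged example with a hypothetical helper name; run it from outside the notebook instance (for example from your local machine), since it stops and deletes that instance.

```python
def cleanup(bedrock_client, sagemaker_client, provisioned_model_arn: str,
            notebook_name: str = "fine-tuning-model-notebook") -> None:
    """Delete the Provisioned Throughput, then stop and delete the notebook."""
    # Stop paying for the provisioned model first.
    bedrock_client.delete_provisioned_model_throughput(
        provisionedModelId=provisioned_model_arn
    )

    # A notebook instance must be stopped before it can be deleted.
    sagemaker_client.stop_notebook_instance(NotebookInstanceName=notebook_name)
    waiter = sagemaker_client.get_waiter("notebook_instance_stopped")
    waiter.wait(NotebookInstanceName=notebook_name)
    sagemaker_client.delete_notebook_instance(NotebookInstanceName=notebook_name)
```

The S3 bucket contents, the IAM role, and the custom model itself are not removed by this sketch; delete them separately if you no longer need them.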