Sentiment Analysis of Text Files with Amazon Comprehend

Minh Huynh

2025.07.26

In this lab, you will automate sentiment analysis on text files using Amazon Comprehend. When a text file is uploaded to the input folder of an S3 bucket, a Lambda function is triggered to analyze the sentiment and store the results in a separate output folder of the same bucket.

Lab Introduction

  • AWS experience: Intermediate
  • Time to complete: 20 minutes
  • AWS Region: US East (N. Virginia) us-east-1
  • Cost to complete: Free Tier eligible
  • Services used: Amazon Comprehend, AWS Lambda, Amazon S3

1. Explore Amazon Comprehend with AWS Console

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to uncover valuable insights from text. It can perform sentiment analysis, entity recognition, key phrase extraction, and more, allowing you to gain deeper insights into unstructured text data.

  1. Navigate to Amazon Comprehend.
  2. Click on Launch Amazon Comprehend.
  3. You will be redirected to the Real-time analysis dashboard.

a. Scroll down and review the Input data section. The Input text box should already contain sample text.

b. Move to the Insights section and review the results on each tab, from Entities through Syntax.

In Amazon Comprehend, a confidence score is a numerical value that indicates the level of certainty or probability the service has about its analysis or classification results. A higher confidence score generally means the results are more reliable, while a lower score suggests less certainty.
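
To make the idea of a confidence score concrete, here is a minimal sketch that picks the highest-scoring label out of a sentiment result. The `sample_response` values are invented for illustration; only the `Sentiment` and `SentimentScore` field names follow the shape of a DetectSentiment response. The helper name `top_sentiment` is hypothetical.

```python
def top_sentiment(response):
    """Return (label, confidence) for the highest-scoring sentiment class."""
    scores = response["SentimentScore"]
    # Keys are "Positive", "Negative", "Neutral", "Mixed"; values are floats in [0, 1]
    label = max(scores, key=scores.get)
    return label.upper(), scores[label]


# Hypothetical response: these numbers are made up, not real service output
sample_response = {
    "Sentiment": "POSITIVE",
    "SentimentScore": {
        "Positive": 0.97,
        "Negative": 0.01,
        "Neutral": 0.015,
        "Mixed": 0.005,
    },
}

label, confidence = top_sentiment(sample_response)
print(f"{label}: {confidence:.0%}")
```

A score close to 1.0 for one class means the service is fairly sure of that class. When the top score is low, it may be worth treating the result as inconclusive.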

  • Entities:
    • Description: Identifies and classifies entities within the text, such as people, organizations, locations, dates, and other relevant items.
    • Use Case: Useful for extracting specific details from unstructured text, such as identifying names in documents or organizations in news articles.

  • Key Phrases:
    • Description: Extracts and highlights the most relevant phrases from the text that convey the main points or concepts.
    • Use Case: Helps summarize content and understand the key topics discussed in the text.

  • Language:
    • Description: Detects the language in which the text is written.
    • Use Case: Ensures proper text processing in multilingual environments by identifying the language, which can be useful for translation or further language-specific analysis.

  • PII (Personally Identifiable Information):
    • Description: Identifies and redacts or flags sensitive information related to individuals, such as names, addresses, phone numbers, and social security numbers.
    • Use Case: Important for data privacy and compliance, ensuring that sensitive information is handled appropriately in various applications.

  • Sentiment:
    • Description: Analyzes the overall sentiment of the text, categorizing it as positive, negative, neutral, or mixed.
    • Use Case: Useful for understanding customer feedback, social media posts, or reviews to gauge general sentiment and customer satisfaction.

  • Targeted Sentiment:
    • Description: Provides sentiment analysis specific to particular aspects or targets within the text, such as a product feature or service.
    • Use Case: Allows for more granular sentiment analysis related to specific topics or entities, helping businesses understand nuanced opinions about particular aspects.

  • Syntax:
    • Description: Analyzes the grammatical structure of the text, including parts of speech (nouns, verbs, adjectives) and sentence structure.
    • Use Case: Useful for deeper linguistic analysis, such as building chatbots or improving text readability and understanding.
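
Each Insights tab corresponds to an operation you can also call from code. The sketch below assumes a boto3 Comprehend client is passed in and collects one result per tab. The function name `analyze_all` and the keys of the returned dictionary are illustrative choices, not part of the lab.

```python
def analyze_all(client, text, language_code="en"):
    """Call one Comprehend operation per Insights tab and collect the results.

    `client` is expected to be a boto3 Comprehend client.
    """
    common = {"Text": text, "LanguageCode": language_code}
    return {
        "Entities": client.detect_entities(**common)["Entities"],
        "KeyPhrases": client.detect_key_phrases(**common)["KeyPhrases"],
        # Language detection takes only the text, since the language is the unknown
        "Language": client.detect_dominant_language(Text=text)["Languages"],
        "PII": client.detect_pii_entities(**common)["Entities"],
        "Sentiment": client.detect_sentiment(**common)["Sentiment"],
        "TargetedSentiment": client.detect_targeted_sentiment(**common)["Entities"],
        "Syntax": client.detect_syntax(**common)["SyntaxTokens"],
    }


# Usage (requires the boto3 package and AWS credentials):
#   import boto3
#   client = boto3.client("comprehend", region_name="us-east-1")
#   results = analyze_all(client, "My experience with AWS has been fantastic!")
#   print(results["Sentiment"])
```

Note that every call is billed separately, so in practice you would probably call only the operations you need.
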
  1. Scroll up to the Input data section and enter the following sample text in the Input text box:
My experience with AWS has been fantastic! The services are intuitive, and the support team is very responsive. Cloud mentor pro has been a great learning resource, and I highly recommend it to anyone. If you want to contact me, my email is cloudmentorpro@example.com, and my phone number is (555) 123-4567. Looking forward to learning more and collaborating with others in this space!

a. Click on Analyze

  1. Similar to the previous steps, take your time to review the analysis results in the Insights section.
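
The sample text contains an email address and a phone number, which should show up on the PII tab. As a rough sketch of what you could do with such a result, the helper below masks each detected span using the `Type`, `Score`, `BeginOffset`, and `EndOffset` fields of a PII entity. The function names, the shortened text, and the scores are hypothetical; the offsets are computed locally rather than returned by the service.

```python
def redact_pii(text, pii_entities, min_score=0.5):
    """Replace each detected PII span with a [TYPE] placeholder."""
    redacted = text
    # Work from the end of the text so earlier offsets stay valid after each replacement
    for entity in sorted(pii_entities, key=lambda e: e["BeginOffset"], reverse=True):
        if entity["Score"] >= min_score:
            start, end = entity["BeginOffset"], entity["EndOffset"]
            redacted = redacted[:start] + f"[{entity['Type']}]" + redacted[end:]
    return redacted


def fake_entity(text, value, pii_type, score):
    """Build a hypothetical PII entity by locating `value` in `text`."""
    start = text.index(value)
    return {"Type": pii_type, "Score": score,
            "BeginOffset": start, "EndOffset": start + len(value)}


text = "My email is cloudmentorpro@example.com, and my phone number is (555) 123-4567."
entities = [
    fake_entity(text, "cloudmentorpro@example.com", "EMAIL", 0.99),  # made-up score
    fake_entity(text, "(555) 123-4567", "PHONE", 0.99),              # made-up score
]
print(redact_pii(text, entities))
```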

2. Deploy an architecture that integrates Amazon Comprehend with Amazon S3 and AWS Lambda

1. Set Up S3 Bucket

  1. Navigate to the S3 service.
  2. Create a new S3 bucket:
  • Use a unique name (e.g., my-comprehend-bucket-3000).
  • Use default settings and click Create bucket.
  1. Create two folders within the bucket:
  • comprehend-text-input
  • comprehend-analysis-output
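
If you prefer scripting to the console, the same setup can be sketched with boto3. This assumes the client targets us-east-1 (where `create_bucket` needs no location configuration) and that the bucket name is still available. `create_lab_bucket` is a hypothetical helper name.

```python
INPUT_PREFIX = "comprehend-text-input/"
OUTPUT_PREFIX = "comprehend-analysis-output/"


def create_lab_bucket(s3_client, bucket_name):
    """Create the lab bucket and its two folders.

    `s3_client` is expected to be a boto3 S3 client for us-east-1.
    """
    s3_client.create_bucket(Bucket=bucket_name)
    for prefix in (INPUT_PREFIX, OUTPUT_PREFIX):
        # S3 has no real folders; the console shows a zero-byte object
        # whose key ends in "/" as a folder
        s3_client.put_object(Bucket=bucket_name, Key=prefix)


# Usage (requires the boto3 package and AWS credentials):
#   import boto3
#   create_lab_bucket(boto3.client("s3", region_name="us-east-1"),
#                     "my-comprehend-bucket-3000")
```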

2. Create a Lambda Function

  1. Navigate to the Lambda service
  2. Create a new function with the following configuration
  • Function name: myLambdaFunction
  • Runtime: Python 3.8 or higher (pick a version that Lambda currently offers)
  • Execution role: Create a new role with basic Lambda permissions
  • Click Create function.
  1. Replace the default code with the following Python script:
import json
from urllib.parse import unquote_plus

import boto3
 
# Initialize S3 and Comprehend clients
s3_client = boto3.client('s3')
comprehend_client = boto3.client('comprehend')
 
def lambda_handler(event, context):
    # Extract bucket and key (file name) from the S3 event
    bucket = event['Records'][0]['s3']['bucket']['name']
    # Object keys arrive URL-encoded in S3 events (a space becomes '+'), so decode first
    key = unquote_plus(event['Records'][0]['s3']['object']['key'])
     
    # Get the text content from the uploaded file in S3
    response = s3_client.get_object(Bucket=bucket, Key=key)
    text = response['Body'].read().decode('utf-8')
     
    # Perform sentiment analysis using Comprehend
    sentiment_response = comprehend_client.detect_sentiment(
        Text=text,
        LanguageCode='en'  # Set the language of the text
    )
     
    # Save the sentiment analysis result to the output bucket
    output_key = 'comprehend-analysis-output/' + key.split('/')[-1].replace('.txt', '') + '-sentiment.json'
     
    s3_client.put_object(
        Bucket=bucket,
        Key=output_key,
        Body=json.dumps(sentiment_response, indent=4)
    )
     
    return {
        'statusCode': 200,
        'body': json.dumps('Sentiment analysis completed successfully.')
    }
  1. Deploy the function.
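  2. Set the Timeout to 1 minute under Configuration tab > General configuration > Timeout.
  3. Add permissions to the Lambda function's execution role so that it can read and write objects in the S3 bucket (s3:GetObject, s3:PutObject) and call Amazon Comprehend (comprehend:DetectSentiment).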
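
As one possible shape for those permissions, the sketch below builds a narrowly scoped IAM policy document for the function's role. The helper name `build_lab_policy` is hypothetical, and the bucket name is the example from earlier in this lab; substitute your own. Attaching broader AWS managed policies would also work but grants more than the function needs.

```python
import json


def build_lab_policy(bucket_name):
    """Return a policy document covering only what the lab's function does."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Read the uploaded text files
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"{bucket_arn}/comprehend-text-input/*",
            },
            {   # Write the JSON results
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/comprehend-analysis-output/*",
            },
            {   # Sentiment detection is not tied to a specific resource in this sketch
                "Effect": "Allow",
                "Action": "comprehend:DetectSentiment",
                "Resource": "*",
            },
        ],
    }


# Print JSON that can be pasted into the IAM console's policy editor
print(json.dumps(build_lab_policy("my-comprehend-bucket-3000"), indent=2))
```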

3. Add S3 Trigger to Lambda

  1. Go back to the S3 bucket created in the previous step.
  2. In the Properties tab, create an Event Notification with the following settings:
  • Event name: text-upload-event
  • Prefix: comprehend-text-input/
  • Event type: Put
  • Destination: Lambda Function: Choose myLambdaFunction or paste its ARN.
  • Save changes.
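
The same event notification can be expressed as a configuration dictionary for boto3's `put_bucket_notification_configuration`. This is a sketch under assumptions: the ARN in the usage comment is a placeholder with a made-up account ID, and `build_notification_config` is a hypothetical helper. When you configure the trigger through the API rather than the console, you also have to grant S3 permission to invoke the function yourself (for example with the Lambda `add_permission` call).

```python
def build_notification_config(lambda_arn, prefix="comprehend-text-input/"):
    """Build the notification settings that mirror the console steps above."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "text-upload-event",
                "LambdaFunctionArn": lambda_arn,
                # "Put" in the console corresponds to this event type
                "Events": ["s3:ObjectCreated:Put"],
                # Only objects under the input folder trigger the function,
                # so results written to the output folder do not re-trigger it
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


# Usage (requires the boto3 package and AWS credentials; the ARN is a placeholder):
#   import boto3
#   config = build_notification_config(
#       "arn:aws:lambda:us-east-1:123456789012:function:myLambdaFunction")
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="my-comprehend-bucket-3000", NotificationConfiguration=config)
```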

4. Test the Lambda Function

  1. Upload a text file to the comprehend-text-input folder of the S3 bucket.
  2. Here is a text file you can upload: https://drive.google.com/file/d/1eD6MVg8AmYgUA6CNLlbgYjjaPkxf1lS1/view?usp=sharing

  1. Navigate to the comprehend-analysis-output folder in the S3 bucket. Verify that a new JSON file containing the sentiment analysis result has been created.
  2. Download and review the sentiment analysis result, which includes the detected sentiment and confidence scores.
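
You can also exercise the function from the Test tab of the Lambda console instead of uploading a file each time. The sketch below builds a minimal S3 event containing only the fields the handler reads, and shows which output key the handler's naming rule would produce. Both helper names and the file name `my review.txt` are hypothetical; a real S3 event carries many more fields, and the object named in the event must actually exist in the bucket for the function to succeed.

```python
import json
from urllib.parse import quote_plus, unquote_plus


def make_s3_put_event(bucket, key):
    """Build a minimal S3 put event with only the fields the handler reads."""
    return {
        "Records": [
            {
                "eventSource": "aws:s3",
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": bucket},
                    # S3 URL-encodes keys in events: a space becomes "+"
                    "object": {"key": quote_plus(key, safe="/")},
                },
            }
        ]
    }


def expected_output_key(event):
    """Apply the handler's naming rule to predict where the result is written."""
    key = unquote_plus(event["Records"][0]["s3"]["object"]["key"])
    return "comprehend-analysis-output/" + key.split("/")[-1].replace(".txt", "") + "-sentiment.json"


event = make_s3_put_event("my-comprehend-bucket-3000", "comprehend-text-input/my review.txt")
print(json.dumps(event, indent=2))   # paste this JSON into the Lambda console test event
print(expected_output_key(event))
```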

Congratulations! You have successfully set up an automated sentiment analysis process using Amazon Comprehend, an S3 bucket, and a Lambda function.

Clean up resources

  • Delete the Lambda function
  • Empty and delete the S3 bucket
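
The cleanup can be scripted as well. This is a minimal sketch that assumes the bucket was created with default settings (versioning disabled) and that boto3 S3 and Lambda clients are passed in. `clean_up` is a hypothetical helper name. It does not remove the IAM role or the CloudWatch log group that were created alongside the function.

```python
def clean_up(s3_client, lambda_client, bucket_name, function_name):
    """Empty and delete the lab bucket, then delete the Lambda function."""
    # A bucket must be empty before it can be deleted
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket_name, Delete={"Objects": keys})
    s3_client.delete_bucket(Bucket=bucket_name)
    lambda_client.delete_function(FunctionName=function_name)


# Usage (requires the boto3 package and AWS credentials):
#   import boto3
#   clean_up(boto3.client("s3"), boto3.client("lambda"),
#            "my-comprehend-bucket-3000", "myLambdaFunction")
```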