Text Analysis

Convert your text data into structured, tabular datasets ready for modeling. Heimdall Read analyzes each text input and returns length, vocabulary, part-of-speech, TF-IDF, and sentiment metrics.

Get Your API Key

You'll need an API key from the Unstructured API Key tab to use text analysis features.

What You Can Analyze

Customer reviews - Understand sentiment and key topics
Product descriptions - Extract features and categories
Social media posts - Analyze engagement and sentiment
Support tickets - Categorize and prioritize issues
Survey responses - Extract insights from open-ended questions

API Specifications

Endpoint

POST https://read.heimdallapp.org/read/v1/api/process

Request Headers

x-api-key - API key that is issued when the endpoint is configured
x-username - Username associated with your account

Request Body

text - The text input that you want to analyze

Warning: The text input should not include line breaks or double quotes.
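One way to satisfy this restriction is to clean the text before building the request body. The sketch below is a minimal example, not part of the Heimdall API: `sanitize_text` is a hypothetical helper, and replacing double quotes with single quotes (rather than dropping them) is an assumption about what the service accepts.

```python
import json


def sanitize_text(text: str) -> str:
    """Replace double quotes and collapse line breaks and repeated whitespace."""
    # Assumption: single quotes are acceptable substitutes for double quotes.
    cleaned = text.replace('"', "'")
    # str.split() with no arguments splits on any whitespace run, including \n and \r.
    return " ".join(cleaned.split())


raw = 'She said "great product".\nWould buy again.'
payload = {"text": sanitize_text(raw)}
print(json.dumps(payload))
```

The resulting `payload` can be passed as the `json` argument of the request shown in the Sample Request section.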

Response Metrics

Heimdall Read returns the following fields, grouped by category:

Basic Metrics

length - The number of characters in the text
word_count - The number of words in the text
sentence_count - The number of sentences in the text

Language Analysis

avg_word_length - The average number of characters in words
avg_sentence_length - The average number of words in sentences
oov_ratio - Proportion of words not found in standard vocabulary
oov_ratio_2 - Alternative vocabulary coverage metric

Part-of-Speech Analysis

noun_count - Number of nouns in the text
verb_count - Number of verbs in the text
adjective_count - Number of adjectives in the text
adverb_count - Number of adverbs in the text
pronoun_count - Number of pronouns in the text
stopword_count - Number of common words (the, and, etc.)

Content Analysis

tfidf_top1 - Most important term (TF-IDF analysis)
tfidf_top2 - Second most important term
tfidf_top3 - Third most important term

Sentiment Analysis

sentiment - Overall sentiment label: positive, negative, or neutral (lowercase, as in the example response)
compound_sentiment_score - Sentiment intensity (-1 to +1)

Example Response

{
    "length": 1755,
    "word_count": 367,
    "oov_ratio": 0.18256130790190736,
    "oov_ratio_2": 0.33787465940054495,
    "sentence_count": 18,
    "avg_word_length": 3.904632152588556,
    "avg_sentence_length": 20.38888888888889,
    "noun_count": 54,
    "verb_count": 76,
    "adjective_count": 17,
    "adverb_count": 40,
    "pronoun_count": 27,
    "stopword_count": 169,
    "tfidf_top1": "writing",
    "tfidf_top2": "good",
    "tfidf_top3": "just",
    "sentiment": "positive",
    "compound_sentiment_score": 0.7737
}
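The fields in this sample are related to one another, which can help when sanity-checking a response. The sketch below uses a subset of the values above. It assumes that `avg_sentence_length` is words per sentence and that the OOV ratios are computed over `word_count`; the sample is consistent with both, but neither is a documented guarantee.

```python
import math

# Subset of the example response shown above.
example = {
    "word_count": 367,
    "sentence_count": 18,
    "avg_sentence_length": 20.38888888888889,
    "oov_ratio": 0.18256130790190736,
    "oov_ratio_2": 0.33787465940054495,
}

# In this sample, average sentence length equals words divided by sentences.
words_per_sentence = example["word_count"] / example["sentence_count"]
ratio_matches = math.isclose(words_per_sentence, example["avg_sentence_length"])

# Assumption: multiplying a ratio by word_count recovers the word count behind it.
oov_words = round(example["oov_ratio"] * example["word_count"])
oov_words_2 = round(example["oov_ratio_2"] * example["word_count"])

print(ratio_matches, oov_words, oov_words_2)
```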

Sample Request

import requests

url = 'https://read.heimdallapp.org/read/v1/api/process'

# Replace the placeholders with your own credentials.
headers = {
    'x-api-key': 'YOUR-API-KEY',
    'x-username': 'YOUR-USERNAME'
}

# The text must not contain line breaks or double quotes.
data = {
    "text": "Heimdall is an amazing machine learning platform that makes data science simple and accessible for everyone."
}

# Send the text as a JSON body; the timeout keeps the call from hanging indefinitely.
response = requests.post(url, headers=headers, json=data, timeout=30)

if response.status_code == 200:
    result = response.json()
    print(f"Sentiment: {result['sentiment']}")
    print(f"Word count: {result['word_count']}")
    print(f"Key terms: {result['tfidf_top1']}, {result['tfidf_top2']}")
else:
    print(f"Error: {response.status_code}")

Use Cases

Customer Review Analysis

# Analyze customer reviews for sentiment and key topics.
# Reuses `requests`, `url`, and `headers` from the Sample Request above.
reviews = [
    "Great product, fast shipping, highly recommend!",
    "Poor quality, arrived damaged, very disappointed.",
    "Average product, nothing special but works fine."
]

for review in reviews:
    response = requests.post(url, headers=headers, json={"text": review}, timeout=30)
    # Stop on an HTTP error instead of trying to parse an error body.
    response.raise_for_status()
    data = response.json()
    print(f"Review: {review[:50]}...")
    print(f"Sentiment: {data['sentiment']} (Score: {data['compound_sentiment_score']})")
    print(f"Key terms: {data['tfidf_top1']}, {data['tfidf_top2']}")
    print("---")
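To turn per-review responses into the tabular dataset described at the top of this page, the metrics can be collected into rows and written out as CSV. This is a minimal sketch using only the standard library: `results_to_csv` is a hypothetical helper, and the sample metrics are placeholder values for illustration, not real API output.

```python
import csv
import io


def results_to_csv(texts, results):
    """Pair each input text with its metrics and return CSV text, one row per input."""
    rows = [{"text": text, **metrics} for text, metrics in zip(texts, results)]
    # Use the union of keys in first-seen order so every column is kept.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder values for illustration only, not real API output.
texts = ["Great product, fast shipping, highly recommend!"]
results = [{"sentiment": "positive", "word_count": 6}]
print(results_to_csv(texts, results))
```

In practice, `results` would be the list of `response.json()` dictionaries gathered in the loop above, in the same order as `texts`.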

Content Categorization

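# Use text metrics to categorize content
def categorize_content(text_metrics):
    """Return a coarse label based on the sentiment label and compound score.

    Anything that is not strongly positive or strongly negative,
    including mildly positive or negative text, falls back to "Neutral".
    """
    if text_metrics['sentiment'] == 'positive' and text_metrics['compound_sentiment_score'] > 0.5:
        return "Highly Positive"
    elif text_metrics['sentiment'] == 'negative' and text_metrics['compound_sentiment_score'] < -0.5:
        return "Highly Negative"
    else:
        return "Neutral"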

Error Handling

422 Unprocessable Entity

The API returns a 422 error when it cannot process the request, typically because the headers or the request body are missing or malformed.

Common issues:

Missing required headers
Invalid text format (line breaks, quotes)
Malformed JSON request
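A client can surface these causes directly instead of printing only the status code. The sketch below is one possible approach, not an official client: `parse_response` and `RequestRejected` are hypothetical names, and the function works on a status code and raw body so it can be used with any HTTP library.

```python
import json


class RequestRejected(Exception):
    """Raised when the API reports a 422 for the submitted request."""


def parse_response(status_code: int, body: str) -> dict:
    """Return the parsed metrics on 200; raise a descriptive error otherwise."""
    if status_code == 200:
        return json.loads(body)
    if status_code == 422:
        raise RequestRejected(
            "422 Unprocessable Entity: check the x-api-key and x-username headers, "
            "remove line breaks and double quotes from the text, and send valid JSON."
        )
    raise RuntimeError(f"Unexpected status code: {status_code}")
```

With the `requests` library, this could be called as `parse_response(response.status_code, response.text)`.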

Next Steps

Now that you can analyze text:

Try Image Analysis - Process images with Heimdall Vision
Build ML Models - Use text features in machine learning
