Text Analysis
Convert your text data into structured, tabular datasets ready for modeling. Heimdall Read analyzes a text input and returns length, vocabulary, part-of-speech, TF-IDF, and sentiment metrics as a flat JSON object.
Get Your API Key
You'll need an API key from the Unstructured API Key tab to use text analysis features.
What You Can Analyze
- Customer reviews - Understand sentiment and key topics
- Product descriptions - Extract features and categories
- Social media posts - Analyze engagement and sentiment
- Support tickets - Categorize and prioritize issues
- Survey responses - Extract insights from open-ended questions
API Specifications
Endpoint
POST https://read.heimdallapp.org/read/v1/api/process
Request Headers
- x-api-key - API key that is issued when the endpoint is configured
- x-username - Username associated with your account
Request Body
- text - The text input that you want to analyze
Warning: The text input should not include line breaks or double quotes.
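One way to meet this constraint is to normalize the text before sending it. The sketch below is an assumption about acceptable preprocessing, not something the API prescribes: it swaps double quotes for single quotes and collapses line breaks (and other whitespace runs) into single spaces. The helper name `sanitize_text` is hypothetical.

```python
def sanitize_text(text: str) -> str:
    """Return text without line breaks or double quotes (hypothetical helper)."""
    # Replace double quotes with single quotes so quoted phrases stay readable.
    text = text.replace('"', "'")
    # str.split() with no arguments splits on any whitespace, including \r and \n,
    # so joining with a space collapses line breaks into single spaces.
    return " ".join(text.split())


print(sanitize_text('She said "great"\nand left.'))
```

Whether to replace or simply drop the offending characters depends on your data; replacing them keeps word boundaries intact, which matters for word-level metrics.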
Response Metrics
Heimdall Read returns the following fields for each text input:
Basic Metrics
- length - The number of characters in the text
- word_count - The number of words in the text
- sentence_count - The number of sentences in the text
Language Analysis
- avg_word_length - The average number of characters in words
- avg_sentence_length - The average number of words in sentences
- oov_ratio - Proportion of words not found in standard vocabulary
- oov_ratio_2 - Alternative vocabulary coverage metric
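The ratio fields appear to be derived from the count fields. As a rough check, using values copied from the example response on this page, `avg_sentence_length` agrees with `word_count / sentence_count`, and each OOV ratio multiplied by `word_count` comes out very close to a whole number of words. This is an observation about that one example, not a documented definition; the helper name `implied_count` is hypothetical.

```python
import math

# Values copied from the example response on this page.
example = {
    "word_count": 367,
    "sentence_count": 18,
    "avg_sentence_length": 20.38888888888889,
    "oov_ratio": 0.18256130790190736,
    "oov_ratio_2": 0.33787465940054495,
}


def implied_count(ratio: float, total: int) -> float:
    """Number of items that a ratio implies for a given total."""
    return ratio * total


# Words per sentence, compared against the reported average sentence length.
words_per_sentence = example["word_count"] / example["sentence_count"]
sentence_length_matches = math.isclose(
    words_per_sentence, example["avg_sentence_length"]
)

# Out-of-vocabulary word counts implied by each ratio.
oov_words = implied_count(example["oov_ratio"], example["word_count"])
oov_words_2 = implied_count(example["oov_ratio_2"], example["word_count"])

print(sentence_length_matches, round(oov_words), round(oov_words_2))
```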
Part-of-Speech Analysis
- noun_count - Number of nouns in the text
- verb_count - Number of verbs in the text
- adjective_count - Number of adjectives in the text
- adverb_count - Number of adverbs in the text
- pronoun_count - Number of pronouns in the text
- stopword_count - Number of common words (the, and, etc.)
Content Analysis
- tfidf_top1 - Most important term (TF-IDF analysis)
- tfidf_top2 - Second most important term
- tfidf_top3 - Third most important term
Sentiment Analysis
- sentiment - Overall sentiment: positive, negative, or neutral
- compound_sentiment_score - Sentiment score from -1 (most negative) to +1 (most positive)
Example Response
{
"length": 1755,
"word_count": 367,
"oov_ratio": 0.18256130790190736,
"oov_ratio_2": 0.33787465940054495,
"sentence_count": 18,
"avg_word_length": 3.904632152588556,
"avg_sentence_length": 20.38888888888889,
"noun_count": 54,
"verb_count": 76,
"adjective_count": 17,
"adverb_count": 40,
"pronoun_count": 27,
"stopword_count": 169,
"tfidf_top1": "writing",
"tfidf_top2": "good",
"tfidf_top3": "just",
"sentiment": "positive",
"compound_sentiment_score": 0.7737
}
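Because each response is a flat object with the same keys, responses can be stacked into a table with one row per text. The sketch below writes such a table as CSV using only the standard library. The function name `responses_to_csv` and the extra `text` column are illustrative choices, and the sample rows are trimmed placeholders rather than real API output.

```python
import csv
import io


def responses_to_csv(texts, responses) -> str:
    """Combine texts and their response dicts into CSV text, one row per input."""
    rows = [{"text": t, **r} for t, r in zip(texts, responses)]
    # Use the union of keys, in first-seen order, as the header row.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    # restval fills in an empty cell when a response lacks a field.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder responses carrying only two of the documented fields.
texts = ["Great product", "Arrived damaged"]
responses = [
    {"word_count": 2, "sentiment": "positive"},
    {"word_count": 2, "sentiment": "negative"},
]
table = responses_to_csv(texts, responses)
print(table)
```

The resulting CSV can be loaded by most modeling tools; the numeric columns are usable as features directly, while `sentiment` and the `tfidf_top*` terms are categorical and would need encoding first.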
Sample Request
import requests

url = 'https://read.heimdallapp.org/read/v1/api/process'

# HTTP header names are case-insensitive; lowercase matches the specification above.
headers = {
    'x-api-key': 'YOUR-API-KEY',
    'x-username': 'YOUR-USERNAME'
}

# Keep the text on a single line and free of double quotes.
data = {
    "text": "Heimdall is an amazing machine learning platform that makes data science simple and accessible for everyone."
}

# json=data serializes the body and sets the Content-Type header.
response = requests.post(url, headers=headers, json=data)

if response.status_code == 200:
    result = response.json()
    print(f"Sentiment: {result['sentiment']}")
    print(f"Word count: {result['word_count']}")
    print(f"Key terms: {result['tfidf_top1']}, {result['tfidf_top2']}")
else:
    print(f"Error: {response.status_code}")
Use Cases
Customer Review Analysis
# Analyze customer reviews for sentiment and key topics.
# Reuses requests, url, and headers from the sample request above.
reviews = [
    "Great product, fast shipping, highly recommend!",
    "Poor quality, arrived damaged, very disappointed.",
    "Average product, nothing special but works fine."
]

for review in reviews:
    response = requests.post(url, headers=headers, json={"text": review})
    if response.status_code != 200:
        # Skip reviews the API could not process instead of failing on .json().
        print(f"Error {response.status_code} for review: {review[:50]}...")
        continue
    metrics = response.json()
    print(f"Review: {review[:50]}...")
    print(f"Sentiment: {metrics['sentiment']} (Score: {metrics['compound_sentiment_score']})")
    print(f"Key terms: {metrics['tfidf_top1']}, {metrics['tfidf_top2']}")
    print("---")
Content Categorization
# Use text metrics to categorize content
def categorize_content(text_metrics):
    """Map a response dict to a coarse sentiment category."""
    if text_metrics['sentiment'] == 'positive' and text_metrics['compound_sentiment_score'] > 0.5:
        return "Highly Positive"
    elif text_metrics['sentiment'] == 'negative' and text_metrics['compound_sentiment_score'] < -0.5:
        return "Highly Negative"
    else:
        # Everything else, including mildly positive or mildly negative text.
        return "Neutral"
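A categorizer like this can be applied across many responses to summarize a batch. The sketch below repeats the same thresholds in a compact form and tallies the categories with `collections.Counter`. The sample dictionaries are made-up placeholders containing only the two fields the function reads, not real API output.

```python
from collections import Counter


def categorize_content(text_metrics: dict) -> str:
    """Same thresholds as above: strong scores are 'Highly ...', the rest 'Neutral'."""
    score = text_metrics["compound_sentiment_score"]
    if text_metrics["sentiment"] == "positive" and score > 0.5:
        return "Highly Positive"
    if text_metrics["sentiment"] == "negative" and score < -0.5:
        return "Highly Negative"
    return "Neutral"


# Placeholder metrics for three texts.
batch = [
    {"sentiment": "positive", "compound_sentiment_score": 0.77},
    {"sentiment": "negative", "compound_sentiment_score": -0.62},
    {"sentiment": "positive", "compound_sentiment_score": 0.20},
]

# Count how many texts fall into each category.
summary = Counter(categorize_content(metrics) for metrics in batch)
print(summary)
```

The 0.5 cutoff comes from the example above; it is a tunable choice, and a stricter or looser threshold may suit your data better.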
Error Handling
422 Unprocessable Entity
You will receive a 422 error if the request is malformed, for example when the request body structure is incorrect.
Common issues:
- Missing required headers
- Invalid text format (line breaks or double quotes in the text)
- Malformed JSON request
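It can help to wrap the call so that a 422 surfaces as a clear exception rather than a failed key lookup later on. The sketch below is one possible wrapper, not an official client: `analyze_text` and `HeimdallRequestError` are hypothetical names, the `post` parameter exists only so the function can be exercised without network access, and nothing is assumed about the shape of the API's error body.

```python
URL = "https://read.heimdallapp.org/read/v1/api/process"


class HeimdallRequestError(Exception):
    """Raised when the API rejects a request (hypothetical helper class)."""


def analyze_text(text, api_key, username, post=None, timeout=10):
    """POST text to the endpoint and return the parsed metrics dict."""
    if post is None:
        # Imported lazily so a stub can be passed in where requests is unavailable.
        import requests
        post = requests.post

    headers = {"x-api-key": api_key, "x-username": username}
    response = post(URL, headers=headers, json={"text": text}, timeout=timeout)

    if response.status_code == 422:
        raise HeimdallRequestError(
            "422 Unprocessable Entity: check the headers, the text format, "
            "and the JSON body"
        )
    if response.status_code != 200:
        raise HeimdallRequestError(f"Unexpected status {response.status_code}")
    return response.json()
```

A caller would then wrap `analyze_text(...)` in `try`/`except HeimdallRequestError` and decide whether to fix the input, skip the record, or stop the batch.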
Next Steps
Now that you can analyze text:
- Try Image Analysis - Process images with Heimdall Vision
- Build ML Models - Use text features in machine learning