Text analysis (Read)
Analyze single text inputs via the REST API: sentiment, word counts, TF-IDF terms, and linguistic metrics.
Before you start
- An Unstructured API key (from the sidebar footer link "Looking for unstructured?" or from Account → API keys)
- For bulk text corpora for ML training, use Lake zip ingest instead
What You Can Analyze
- Customer reviews - Understand sentiment and key topics
- Product descriptions - Extract features and categories
- Social media posts - Analyze engagement and sentiment
- Support tickets - Categorize and prioritize issues
- Survey responses - Extract insights from open-ended questions
API Specifications
Endpoint
POST https://read.heimdallapp.org/read/v1/api/process
Request Headers
- x-api-key - API key that is issued when the endpoint is configured
- x-username - Username associated with your account
Request Body
- text - The text input that you want to analyze
Warning: The text input should not include line breaks or double quotes.
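One way to satisfy this restriction is to clean the text before sending it. The helper below is a minimal sketch, not part of the API: the name `sanitize_text` is hypothetical, and replacing double quotes with single quotes is an assumption about what is acceptable for your data (you may prefer to drop them).

```python
import re


def sanitize_text(text: str) -> str:
    """Prepare text for the Read endpoint (hypothetical helper).

    Swaps double quotes for single quotes, then collapses every run of
    whitespace (including line breaks and tabs) into a single space.
    """
    text = text.replace('"', "'")  # assumption: single quotes are acceptable
    return re.sub(r"\s+", " ", text).strip()


raw = 'He said "hi"\nand left.\r\n'
print(sanitize_text(raw))  # He said 'hi' and left.
```

The cleaned string can then be passed as the `text` field of the request body.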
Response Metrics
Heimdall Read returns the following metrics for each text input:
Basic Metrics
- length - The number of characters in the text
- word_count - The number of words in the text
- sentence_count - The number of sentences in the text
Language Analysis
- avg_word_length - The average number of characters in words
- avg_sentence_length - The average number of words in sentences
- oov_ratio - Proportion of words not found in standard vocabulary
- oov_ratio_2 - Alternative vocabulary coverage metric
Part-of-Speech Analysis
- noun_count - Number of nouns in the text
- verb_count - Number of verbs in the text
- adjective_count - Number of adjectives in the text
- adverb_count - Number of adverbs in the text
- pronoun_count - Number of pronouns in the text
- stopword_count - Number of common words (the, and, etc.)
Content Analysis
- tfidf_top1 - Most important term (TF-IDF analysis)
- tfidf_top2 - Second most important term
- tfidf_top3 - Third most important term
Sentiment Analysis
- sentiment - Overall sentiment label: positive, negative, or neutral (lowercase in the example response)
- compound_sentiment_score - Sentiment intensity (-1 to +1)
Example Response
{
  "length": 1755,
  "word_count": 367,
  "oov_ratio": 0.18256130790190736,
  "oov_ratio_2": 0.33787465940054495,
  "sentence_count": 18,
  "avg_word_length": 3.904632152588556,
  "avg_sentence_length": 20.38888888888889,
  "noun_count": 54,
  "verb_count": 76,
  "adjective_count": 17,
  "adverb_count": 40,
  "pronoun_count": 27,
  "stopword_count": 169,
  "tfidf_top1": "writing",
  "tfidf_top2": "good",
  "tfidf_top3": "just",
  "sentiment": "positive",
  "compound_sentiment_score": 0.7737
}
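The ratio fields in this example line up with the count fields, which helps when reading them. The sketch below only checks the arithmetic on the values shown above. It is an observation about this one example, not a documented guarantee of how the service computes each metric.

```python
import math

# Values copied from the example response above.
example = {
    "word_count": 367,
    "sentence_count": 18,
    "avg_sentence_length": 20.38888888888889,
    "oov_ratio": 0.18256130790190736,
}

# Average sentence length matches words divided by sentences.
implied_avg = example["word_count"] / example["sentence_count"]
print(math.isclose(implied_avg, example["avg_sentence_length"]))  # True

# The OOV ratio corresponds to a whole number of out-of-vocabulary words.
implied_oov_words = round(example["oov_ratio"] * example["word_count"])
print(implied_oov_words)  # 67
```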
Sample Request
import requests

url = 'https://read.heimdallapp.org/read/v1/api/process'
headers = {
    'X-api-key': 'YOUR-API-KEY',
    'X-username': 'YOUR-USERNAME'
}
data = {
    "text": "Heimdall is an amazing machine learning platform that makes data science simple and accessible for everyone."
}

response = requests.post(url, headers=headers, json=data)
if response.status_code == 200:
    result = response.json()
    print(f"Sentiment: {result['sentiment']}")
    print(f"Word count: {result['word_count']}")
    print(f"Key terms: {result['tfidf_top1']}, {result['tfidf_top2']}")
else:
    print(f"Error: {response.status_code}")
Use Cases
Customer Review Analysis
# Analyze customer reviews for sentiment and key topics
# (reuses requests, url, and headers from the sample request above)
reviews = [
    "Great product, fast shipping, highly recommend!",
    "Poor quality, arrived damaged, very disappointed.",
    "Average product, nothing special but works fine."
]

for review in reviews:
    response = requests.post(url, headers=headers, json={"text": review})
    response.raise_for_status()  # stop on 4xx/5xx instead of parsing an error body
    data = response.json()
    print(f"Review: {review[:50]}...")
    print(f"Sentiment: {data['sentiment']} (Score: {data['compound_sentiment_score']})")
    print(f"Key terms: {data['tfidf_top1']}, {data['tfidf_top2']}")
    print("---")
Content Categorization
# Use text metrics to categorize content
def categorize_content(text_metrics):
    if text_metrics['sentiment'] == 'positive' and text_metrics['compound_sentiment_score'] > 0.5:
        return "Highly Positive"
    elif text_metrics['sentiment'] == 'negative' and text_metrics['compound_sentiment_score'] < -0.5:
        return "Highly Negative"
    else:
        # Everything else, including mildly positive or mildly negative text
        return "Neutral"
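To see how the categorization behaves, the sketch below applies it to a few metric dictionaries and tallies the labels. The function is repeated so the snippet runs on its own. The first sample uses the sentiment values from the example response; the other two are made-up values for illustration only.

```python
from collections import Counter


def categorize_content(text_metrics):
    # Same logic as the function above, repeated so this snippet is self-contained.
    if text_metrics['sentiment'] == 'positive' and text_metrics['compound_sentiment_score'] > 0.5:
        return "Highly Positive"
    elif text_metrics['sentiment'] == 'negative' and text_metrics['compound_sentiment_score'] < -0.5:
        return "Highly Negative"
    else:
        return "Neutral"


samples = [
    {"sentiment": "positive", "compound_sentiment_score": 0.7737},  # from the example response
    {"sentiment": "negative", "compound_sentiment_score": -0.62},   # made-up value
    {"sentiment": "positive", "compound_sentiment_score": 0.2},     # made-up value
]

labels = [categorize_content(m) for m in samples]
print(labels)  # ['Highly Positive', 'Highly Negative', 'Neutral']
print(Counter(labels))
```

Note that the third sample is labeled positive by the API but falls into "Neutral" here, because its score does not exceed the 0.5 threshold.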
Error Handling
422 Unprocessable Entity
You will receive a 422 error if your request body structure is incorrect.
Common issues:
- Missing required headers
- Invalid text format (line breaks, quotes)
- Malformed JSON request
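The checklist above can be folded into a small response handler. This is a minimal sketch under assumptions: `ReadRequestError` and `parse_read_response` are hypothetical names, and the only status codes treated specially are 200 and 422, since those are the ones this page describes.

```python
import json


class ReadRequestError(Exception):
    """Raised when the Read endpoint does not return a usable result (hypothetical)."""


def parse_read_response(status_code: int, body: str) -> dict:
    """Return the metrics dict for a 200 response, otherwise raise ReadRequestError."""
    if status_code == 200:
        return json.loads(body)
    if status_code == 422:
        # The hints mirror the common issues listed above.
        raise ReadRequestError(
            "422 Unprocessable Entity: check the x-api-key and x-username headers, "
            "remove line breaks and double quotes from the text, "
            "and confirm the body is valid JSON with a 'text' field."
        )
    raise ReadRequestError(f"Unexpected status code: {status_code}")
```

With the `requests` library, it could be called as `parse_read_response(response.status_code, response.text)`.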
Next Steps
Now that you can analyze text:
- Try Image Analysis - Process images with Heimdall Vision
- Build an ML model - Train on gold features from Lake