Chapter 7: Threat Intelligence & Open Source Intelligence (OSINT)
Collect, process, and analyze threat intelligence feeds and OSINT data using Python for defensive cyber security operations.
7.1 Chapter Overview
Threat intelligence is information about cyber threats, attacker behaviour, indicators of compromise (IOCs), and tactics, techniques and procedures (TTPs). OSINT is intelligence collected from publicly available sources such as public advisories, security blogs, published feeds, DNS records and official vulnerability notices.
Python helps security analysts process intelligence quickly by reading feeds, removing duplicates, classifying indicators, comparing them with internal logs, enriching records and generating summary reports.
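As a first taste of the "removing duplicates" step, here is a minimal sketch of normalizing indicators before deduplication. The helper names `normalize_indicator` and `deduplicate` are hypothetical, and the normalization rules (trim whitespace, lowercase, drop a trailing dot, leave URLs unchanged) are assumptions chosen for illustration rather than a fixed standard.

```python
def normalize_indicator(raw):
    """Return a cleaned-up copy of one indicator string (hypothetical helper)."""
    value = raw.strip()
    # URLs are only trimmed, because their paths can be case-sensitive.
    if value.lower().startswith(("http://", "https://")):
        return value
    # Domains, IPs and hex hashes compare case-insensitively;
    # a trailing dot on a domain name is redundant.
    return value.lower().rstrip(".")


def deduplicate(raw_indicators):
    """Normalize indicators and drop blanks and repeats, keeping first-seen order."""
    seen = set()
    result = []
    for raw in raw_indicators:
        value = normalize_indicator(raw)
        if value and value not in seen:
            seen.add(value)
            result.append(value)
    return result


sample = ["  Malicious-Example.com ", "malicious-example.com.", "203.0.113.5", "", "203.0.113.5"]
print(deduplicate(sample))  # ['malicious-example.com', '203.0.113.5']
```

Normalizing first matters because a plain `set()` treats `Malicious-Example.com` and `malicious-example.com` as two different indicators.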
7.2 Learning Objectives
- Understand threat intelligence and OSINT concepts.
- Identify common IOC types used in cyber defense.
- Read threat data from text, CSV and JSON feeds.
- Normalize and remove duplicate indicators using Python.
- Compare threat indicators against internal logs.
- Apply simple enrichment, confidence and risk scoring.
- Generate a threat intelligence summary report.
- Respect legal and ethical boundaries of OSINT collection.
7.3 What is Threat Intelligence?
Threat intelligence turns raw security data into useful decisions. It helps answer: What is the threat? How confident are we? Is our organization affected? What action should we take?
| Level | Description | Example |
|---|---|---|
| Strategic | High-level intelligence for leadership. | Ransomware trends affecting education sector. |
| Tactical | Attacker techniques and methods. | Credential theft and phishing techniques. |
| Operational | Information about active campaigns. | Campaign targeting a specific region or industry. |
| Technical | Machine-readable indicators. | IP addresses, domains, URLs and hashes. |
7.4 Common IOC Types
| IOC Type | Example | Use | Caution |
|---|---|---|---|
| IP Address | 203.0.113.5 | Detect suspicious communication. | IP ownership can change. |
| Domain | malicious-example.com | Detect phishing or malware callback. | Domain may later become inactive. |
| URL | https://example.com/login | Detect phishing or malware hosting. | Do not visit suspicious URLs directly. |
| File Hash | SHA256 value | Identify known files. | Only matches the exact file. |
| Email Address | sender@example.com | Phishing investigation. | Can be spoofed. |
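The caution that a file hash "only matches the exact file" can be seen directly with Python's `hashlib` module. The byte strings below are made-up sample data, not real malware content.

```python
import hashlib

original = b"sample file contents for training"
modified = original + b"!"  # the same content with one extra byte

original_hash = hashlib.sha256(original).hexdigest()
modified_hash = hashlib.sha256(modified).hexdigest()

print(original_hash)
print(modified_hash)
print("Same hash:", original_hash == modified_hash)
```

Because any change to the file produces a different digest, a hash IOC is precise but easy for an attacker to evade by altering the file slightly.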
7.5 What is OSINT?
OSINT is the collection and analysis of publicly available information. In cyber security, OSINT can support phishing investigation, vulnerability awareness, domain research, exposed asset discovery and incident context.
- Public Reports: Vendor research, advisories and incident write-ups.
- Public Feeds: Lists of suspicious IPs, domains, URLs or hashes.
- DNS Information: Domain and record information for investigation.
- Security Advisories: Official notices from trusted vendors and agencies.
7.6 Threat Intelligence Feed Formats
| Format | Description | Python Tool |
|---|---|---|
| Text | One indicator per line. | open(), splitlines() |
| CSV | Rows and columns such as indicator, type and confidence. | csv module |
| JSON | Structured key-value intelligence records. | json module |
| API | Data fetched from approved web service. | requests library |
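The three file-based formats in the table can be handled behind one small interface. The sketch below parses feed content held in strings so that it runs without any files. The function names, the `indicator` column and the `indicators` key are assumptions that mirror the sample feeds used in this chapter; a real feed may use different field names.

```python
import csv
import io
import json


def parse_text_feed(text):
    """One indicator per line; blank lines are skipped."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_csv_feed(text):
    """Assumes a header row with an 'indicator' column."""
    reader = csv.DictReader(io.StringIO(text))
    return [row["indicator"] for row in reader]


def parse_json_feed(text):
    """Assumes a top-level 'indicators' list of objects with an 'indicator' key."""
    data = json.loads(text)
    return [item["indicator"] for item in data["indicators"]]


PARSERS = {"text": parse_text_feed, "csv": parse_csv_feed, "json": parse_json_feed}


def parse_feed(text, feed_format):
    """Pick the parser for the given format name."""
    try:
        parser = PARSERS[feed_format]
    except KeyError:
        raise ValueError(f"Unsupported feed format: {feed_format}") from None
    return parser(text)


text_feed = "203.0.113.5\n\nmalicious-example.com\n"
csv_feed = "indicator,type\n203.0.113.5,ip\nmalicious-example.com,domain\n"
json_feed = '{"indicators": [{"indicator": "203.0.113.5"}, {"indicator": "malicious-example.com"}]}'

for name, feed in [("text", text_feed), ("csv", csv_feed), ("json", json_feed)]:
    print(name, parse_feed(feed, name))
```

All three calls return the same list of indicators, so later steps such as deduplication or log matching do not need to know which format the feed arrived in.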
7.7 Reading Text IOC Feeds with Python
```python
# Sample feed for practice; note the duplicate 203.0.113.5 entry.
ioc_data = """203.0.113.5
198.51.100.77
malicious-example.com
phishing-test.net
203.0.113.5
"""

# Write the sample feed to disk.
with open("ioc_feed.txt", "w", encoding="utf-8") as file:
    file.write(ioc_data)

# Read the feed back, skipping blank lines.
with open("ioc_feed.txt", "r", encoding="utf-8") as file:
    indicators = [line.strip() for line in file if line.strip()]

# A set drops duplicates; sorting gives a stable output order.
unique_indicators = sorted(set(indicators))

print("Total indicators:", len(indicators))
print("Unique indicators:", len(unique_indicators))
for indicator in unique_indicators:
    print(indicator)
```
7.8 Reading CSV Threat Intelligence Data
```python
import csv

# Sample rows; the first row is the header.
rows = [
    ["indicator", "type", "source", "confidence"],
    ["203.0.113.5", "ip", "training-feed", "high"],
    ["malicious-example.com", "domain", "training-feed", "medium"],
    ["badfilehash123", "hash", "training-feed", "high"],
]

# newline="" lets the csv module control line endings.
with open("threat_feed.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerows(rows)

# DictReader maps each row to the header names.
with open("threat_feed.csv", "r", newline="", encoding="utf-8") as file:
    reader = csv.DictReader(file)
    for row in reader:
        # Print only the high-confidence indicators.
        if row["confidence"] == "high":
            print(row["indicator"], row["type"], row["confidence"])
```
7.9 Reading JSON Threat Intelligence Data
```python
import json

# Sample feed: one source name and a list of indicator records.
feed = {
    "source": "training-osint-feed",
    "indicators": [
        {"indicator": "203.0.113.5", "type": "ip", "confidence": "high"},
        {"indicator": "malicious-example.com", "type": "domain", "confidence": "medium"},
        {"indicator": "badfilehash123", "type": "hash", "confidence": "high"},
    ],
}

# Write the feed to disk as formatted JSON.
with open("threat_feed.json", "w", encoding="utf-8") as file:
    json.dump(feed, file, indent=4)

# Load the feed back into Python dictionaries and lists.
with open("threat_feed.json", "r", encoding="utf-8") as file:
    data = json.load(file)

for item in data["indicators"]:
    print(item["indicator"], item["type"], item["confidence"])
```
7.10 Classifying Indicator Types
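```python
import re


def classify_indicator(indicator):
    """Classify an indicator as ip, url, sha256_hash, domain or unknown."""
    # Four dot-separated groups of digits. This checks the shape only,
    # so an out-of-range value such as 999.999.999.999 is also accepted.
    ip_pattern = r"^(?:\d{1,3}\.){3}\d{1,3}$"
    url_pattern = r"^https?://"
    sha256_pattern = r"^[a-fA-F0-9]{64}$"

    # Test the most specific patterns first; "domain" is the fallback
    # for anything else that contains a dot.
    if re.match(ip_pattern, indicator):
        return "ip"
    elif re.match(url_pattern, indicator):
        return "url"
    elif re.match(sha256_pattern, indicator):
        return "sha256_hash"
    elif "." in indicator:
        return "domain"
    else:
        return "unknown"


indicators = ["203.0.113.5", "https://example.com/login", "malicious-example.com"]
for indicator in indicators:
    print(indicator, "=>", classify_indicator(indicator))
```
7.11 IOC Enrichment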
Enrichment adds context to raw indicators, such as source, confidence, first seen date, category and recommended action.
```python
from datetime import date


def enrich_indicator(indicator, indicator_type, confidence):
    """Attach context fields and a recommendation to a raw indicator."""
    enriched = {
        "indicator": indicator,
        "type": indicator_type,
        "confidence": confidence,
        "first_seen": str(date.today()),  # date this script processed the indicator
        "source": "PDTC training feed",
    }

    # High-confidence indicators are investigated first.
    if confidence == "high":
        enriched["recommendation"] = "Prioritize investigation"
    else:
        enriched["recommendation"] = "Monitor and validate"
    return enriched


print(enrich_indicator("203.0.113.5", "ip", "high"))
```
| Field | Purpose |
|---|---|
| Source | Where the intelligence came from. |
| Confidence | How reliable or relevant the indicator is. |
| Category | Phishing, malware, scanning, botnet or ransomware. |
| Recommendation | What the analyst should do next. |
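The table lists a Category field that `enrich_indicator` does not set. A minimal sketch of a record that carries every field in the table might look like this; the `EnrichedIndicator` class, the `enrich` function and the `"uncategorized"` fallback are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import date

# Categories taken from the table above.
ALLOWED_CATEGORIES = {"phishing", "malware", "scanning", "botnet", "ransomware"}


@dataclass
class EnrichedIndicator:
    indicator: str
    type: str
    source: str
    confidence: str
    category: str
    first_seen: str
    recommendation: str


def enrich(indicator, indicator_type, source, confidence, category, first_seen=None):
    """Build an enriched record; unknown categories fall back to 'uncategorized'."""
    if category not in ALLOWED_CATEGORIES:
        category = "uncategorized"
    if confidence == "high":
        recommendation = "Prioritize investigation"
    else:
        recommendation = "Monitor and validate"
    return EnrichedIndicator(
        indicator=indicator,
        type=indicator_type,
        source=source,
        confidence=confidence,
        category=category,
        first_seen=first_seen or date.today().isoformat(),
        recommendation=recommendation,
    )


record = enrich(
    "malicious-example.com", "domain", "training-feed", "medium", "phishing",
    first_seen="2026-06-02",
)
print(asdict(record))
```

A dataclass keeps the field names in one place, and `asdict` converts the record back to a dictionary when it needs to be written to CSV or JSON.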
7.12 Matching Threat Intelligence with Logs
```python
threat_indicators = ["203.0.113.5", "malicious-example.com"]

logs = [
    "2026-06-02 outbound connection to malicious-example.com",
    "2026-06-02 login success from 192.168.1.10",
    "2026-06-02 failed login from 203.0.113.5",
]

# Check every log line against every indicator.
# Note: "in" is a plain substring test, so 203.0.113.5 would also
# match a log line that contains 203.0.113.50.
for log in logs:
    for indicator in threat_indicators:
        if indicator in log:
            print("MATCH FOUND")
            print("Indicator:", indicator)
            print("Log:", log)
            print("-" * 30)
```
7.13 Simple Threat Score Calculation
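```python
def calculate_threat_score(confidence, internal_match, indicator_type):
    """Return a (score, risk_level) pair for one indicator."""
    score = 0

    # Feed confidence sets the base score.
    if confidence == "high":
        score += 50
    elif confidence == "medium":
        score += 30
    else:
        score += 10

    # A match in internal logs adds 40 points.
    if internal_match:
        score += 40

    # Hash and URL indicators add a small bonus.
    if indicator_type in ["hash", "url"]:
        score += 10

    # Map the numeric score to a risk level.
    if score >= 80:
        return score, "Critical"
    elif score >= 60:
        return score, "High"
    elif score >= 40:
        return score, "Medium"
    else:
        return score, "Low"


# High confidence (50) plus an internal match (40) gives 90, which is Critical.
score, level = calculate_threat_score("high", True, "ip")
print("Threat Score:", score)
print("Risk Level:", level)
```
7.14 Generate Threat Intelligence Report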
```python
import csv
from datetime import datetime

# Header row followed by one row per indicator.
report_rows = [
    ["indicator", "type", "confidence", "internal_match", "risk_level", "recommendation"],
    ["203.0.113.5", "ip", "high", "yes", "Critical", "Investigate immediately"],
    ["malicious-example.com", "domain", "medium", "yes", "High", "Review DNS logs"],
    ["badfilehash123", "hash", "high", "no", "Medium", "Add to watchlist"],
]

with open("threat_intelligence_report.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    # Metadata row, a blank separator row, then the findings table.
    writer.writerow(["Report Created", datetime.now().isoformat(timespec="seconds")])
    writer.writerow([])
    writer.writerows(report_rows)

print("Threat intelligence report created.")
```
7.15 Safe API Collection Concept
```python
import os
import requests

# Read the key from the environment so it never appears in the script.
api_key = os.getenv("THREAT_INTEL_API_KEY")

if api_key is None:
    print("API key not configured. Set it as an environment variable.")
else:
    headers = {"Authorization": "Bearer " + api_key}
    url = "https://example.com/api/threat-feed"  # replace with approved API
    # The timeout stops the script from hanging on an unresponsive server.
    response = requests.get(url, headers=headers, timeout=10)
    print("Status Code:", response.status_code)
```
7.16 OSINT Analysis Workflow
7.17 Interactive IOC Classifier Demo
The interactive demo classifies pasted indicators as IP, URL, hash, domain or unknown.
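The same five-way classification can be sketched in Python for readers working outside the interactive page. This version is an assumption about reasonable rules, not the demo's actual logic: it validates IP addresses with the standard `ipaddress` module, treats hex strings of 32, 40 or 64 characters (MD5, SHA-1 and SHA-256 lengths) as hashes, and uses a simplified domain pattern.

```python
import ipaddress
import re

# Hex digest lengths for MD5, SHA-1 and SHA-256 (an assumed set of hash types).
HASH_LENGTHS = {32, 40, 64}

# Simplified hostname shape: dot-separated labels ending in an alphabetic TLD.
DOMAIN_PATTERN = re.compile(
    r"(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}"
)


def classify(indicator):
    """Return 'ip', 'url', 'hash', 'domain' or 'unknown' for one indicator."""
    value = indicator.strip()
    if not value:
        return "unknown"
    try:
        # Accepts IPv4 and IPv6 and rejects out-of-range values such as 999.1.1.1.
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    if re.match(r"https?://", value, re.IGNORECASE):
        return "url"
    if len(value) in HASH_LENGTHS and re.fullmatch(r"[0-9a-fA-F]+", value):
        return "hash"
    if DOMAIN_PATTERN.fullmatch(value.lower()):
        return "domain"
    return "unknown"


pasted = """203.0.113.5
999.1.1.1
https://example.com/login
malicious-example.com
not an indicator
"""

for line in pasted.splitlines():
    print(line, "=>", classify(line))
```

Using `ipaddress` rather than a digit pattern means a malformed value such as `999.1.1.1` falls through to "unknown" instead of being reported as an IP address.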
7.18 Practical Activities
Activity 1: Text Feed Processing
Create a text file containing IOCs. Read it using Python and remove duplicate indicators.
Activity 2: CSV Threat Feed
Create a CSV feed with indicator, type, source and confidence. Print only high-confidence indicators.
Activity 3: IOC Classification
Write a function that classifies indicators as IP, domain, URL or hash.
Activity 4: Log Correlation
Compare a list of threat indicators against internal sample logs and print matches.
Mini Project
Build a Python threat intelligence processor that reads a feed, removes duplicates, classifies indicators, matches them against logs and generates a CSV report.
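One possible starting point for the mini project is sketched below. It is deliberately small and makes several assumptions: the helper names are hypothetical, all indicators are lowercased (a simplification, since URL paths can be case-sensitive), the classifier is a reduced version, and the report is written to an in-memory buffer so the example runs without touching the disk.

```python
import csv
import io
import ipaddress
import re


def classify(indicator):
    """Rough type guess; a simplified stand-in for a fuller classifier."""
    try:
        ipaddress.ip_address(indicator)
        return "ip"
    except ValueError:
        pass
    if indicator.startswith(("http://", "https://")):
        return "url"
    if re.fullmatch(r"[0-9a-f]{64}", indicator):
        return "hash"
    return "domain" if "." in indicator else "unknown"


def found_in_logs(indicator, logs):
    """Match the indicator as a whole token, so 203.0.113.5 does not hit 203.0.113.50."""
    pattern = re.compile(
        r"(?<![\w.-])" + re.escape(indicator) + r"(?![\w-]|\.\w)",
        re.IGNORECASE,
    )
    return any(pattern.search(line) for line in logs)


def build_report(feed_text, logs):
    """Read, deduplicate, classify and correlate; return report rows (header first)."""
    seen = set()
    rows = [["indicator", "type", "internal_match"]]
    for line in feed_text.splitlines():
        indicator = line.strip().lower()
        if not indicator or indicator in seen:
            continue
        seen.add(indicator)
        matched = "yes" if found_in_logs(indicator, logs) else "no"
        rows.append([indicator, classify(indicator), matched])
    return rows


feed_text = """203.0.113.5
malicious-example.com
203.0.113.5
phishing-test.net
"""

logs = [
    "2026-06-02 outbound connection to malicious-example.com",
    "2026-06-02 failed login from 203.0.113.50",
]

rows = build_report(feed_text, logs)

# Write the report as CSV text; swap the buffer for
# open("report.csv", "w", newline="", encoding="utf-8") to save a file.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

In this sample, `203.0.113.5` is reported as not matched because the log line contains the different address `203.0.113.50`. A plain substring test would have reported a false match.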
7.19 Interactive Final Assessment Quiz
Each correct answer earns 1 mark. Each wrong answer deducts 0.5 marks.
1. Threat intelligence helps organizations prevent, detect and respond to cyber threats.
2. OSINT stands for:
3. IP addresses, domains, URLs and file hashes can be IOCs.
4. Which Python module is used to read JSON data?
5. Removing duplicate indicators improves feed quality.
6. Threat intelligence is more useful when correlated with internal logs.
7. API keys should be hardcoded directly inside scripts.
8. Confidence level helps prioritize indicators.
9. OSINT collection should follow legal, ethical and organizational rules.
10. A CSV report can summarize threat intelligence findings.
7.20 Chapter Summary
In this chapter, learners studied threat intelligence and OSINT using Python. They learned IOC types, feed formats, text/CSV/JSON processing, indicator classification, enrichment, threat scoring, log correlation, safe API concepts and reporting.