Chapter 7: Threat Intelligence & Open Source Intelligence (OSINT)

Collect, process, and analyze threat intelligence feeds and OSINT data using Python for defensive cyber security operations.

Topics: Threat Intelligence, OSINT, IOC Feeds, Python Analysis, Enrichment
Pipeline: Collect Feeds → Normalize IOCs → Analyze Risk → Generate Reports

7.1 Chapter Overview

Threat intelligence is information about cyber threats and attacker behaviour, including indicators of compromise (IOCs) and tactics, techniques and procedures (TTPs). OSINT is intelligence collected from publicly available sources such as public advisories, security blogs, published feeds, DNS records and official vulnerability notices.

Python helps security analysts process intelligence quickly by reading feeds, removing duplicates, classifying indicators, comparing them with internal logs, enriching records and generating summary reports.

Learning Outcome: By the end of this chapter, learners should be able to collect, process and analyze threat intelligence feeds and OSINT data using Python in a safe, ethical and defensive manner.

  1. Collect Sources
  2. Normalize IOCs
  3. Enrich Context
  4. Correlate Logs
  5. Report Findings

7.2 Learning Objectives

  • Understand threat intelligence and OSINT concepts.
  • Identify common IOC types used in cyber defense.
  • Read threat data from text, CSV and JSON feeds.
  • Normalize and remove duplicate indicators using Python.
  • Compare threat indicators against internal logs.
  • Apply simple enrichment, confidence and risk scoring.
  • Generate a threat intelligence summary report.
  • Respect legal and ethical boundaries of OSINT collection.

7.3 What is Threat Intelligence?

Threat intelligence turns raw security data into useful decisions. It helps answer: What is the threat? How confident are we? Is our organization affected? What action should we take?

| Level | Description | Example |
|---|---|---|
| Strategic | High-level intelligence for leadership. | Ransomware trends affecting the education sector. |
| Tactical | Attacker techniques and methods. | Credential theft and phishing techniques. |
| Operational | Information about active campaigns. | Campaign targeting a specific region or industry. |
| Technical | Machine-readable indicators. | IP addresses, domains, URLs and hashes. |

Threat Intelligence = Data + Context + Confidence + Action

7.4 Common IOC Types

| IOC Type | Example | Use | Caution |
|---|---|---|---|
| IP Address | 203.0.113.5 | Detect suspicious communication. | IP ownership can change. |
| Domain | malicious-example.com | Detect phishing or malware callback. | Domain may later become inactive. |
| URL | https://example.com/login | Detect phishing or malware hosting. | Do not visit suspicious URLs directly. |
| File Hash | SHA256 value | Identify known files. | Only matches the exact file. |
| Email Address | sender@example.com | Phishing investigation. | Can be spoofed. |
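
The caution for file hashes can be illustrated with a short sketch. It hashes two made-up byte strings that differ by a single character; the resulting SHA-256 digests are unrelated, which is why a hash IOC will not match a modified copy of the same file. The payloads and variable names are illustrative only.

```python
import hashlib

# Two made-up payloads that differ by a single byte.
original = b"sample file contents v1"
modified = b"sample file contents v2"

hash_original = hashlib.sha256(original).hexdigest()
hash_modified = hashlib.sha256(modified).hexdigest()

print("Original:", hash_original)
print("Modified:", hash_modified)
print("Same hash:", hash_original == hash_modified)
```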

7.5 What is OSINT?

OSINT is the collection and analysis of publicly available information. In cyber security, OSINT can support phishing investigation, vulnerability awareness, domain research, exposed asset discovery and incident context.

  • Public Reports: Vendor research, advisories and incident write-ups.
  • Public Feeds: Lists of suspicious IPs, domains, URLs or hashes.
  • DNS Information: Domain and record information for investigation.
  • Security Advisories: Official notices from trusted vendors and agencies.

Ethical Rule: Collect only lawful public information and follow site terms, privacy rules and organization policy. Never use OSINT to harass, dox, target or profile individuals.

7.6 Threat Intelligence Feed Formats

| Format | Description | Python Tool |
|---|---|---|
| Text | One indicator per line. | open(), splitlines() |
| CSV | Rows and columns such as indicator, type and confidence. | csv module |
| JSON | Structured key-value intelligence records. | json module |
| API | Data fetched from an approved web service. | requests library |

7.7 Reading Text IOC Feeds with Python

ioc_data = """203.0.113.5
198.51.100.77
malicious-example.com
phishing-test.net
203.0.113.5
"""

with open("ioc_feed.txt", "w", encoding="utf-8") as file:
    file.write(ioc_data)

with open("ioc_feed.txt", "r", encoding="utf-8") as file:
    indicators = [line.strip() for line in file if line.strip()]

unique_indicators = sorted(set(indicators))

print("Total indicators:", len(indicators))
print("Unique indicators:", len(unique_indicators))

for indicator in unique_indicators:
    print(indicator)

Learning Point: set() removes duplicate indicators so analysts do not review the same IOC repeatedly.
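
Exact-match deduplication only works when equivalent indicators are written identically. The following is a minimal sketch of a normalization step applied before deduplication, assuming the feed holds only domains and IP addresses (lowercasing would not be safe for URLs with case-sensitive paths). The function name normalize_indicator and the sample values are hypothetical.

```python
def normalize_indicator(raw):
    """Return a canonical form of a domain or IP indicator string."""
    value = raw.strip().lower()
    # A trailing dot marks a fully qualified DNS name; drop it for comparison.
    return value.rstrip(".")

raw_indicators = [
    "Malicious-Example.com",
    "malicious-example.com.",
    "  203.0.113.5 ",
    "203.0.113.5",
    "",
]

# Skip blank entries, then normalize what remains.
normalized = [normalize_indicator(item) for item in raw_indicators if item.strip()]

# dict.fromkeys() removes duplicates while keeping first-seen order.
unique_indicators = list(dict.fromkeys(normalized))

print(unique_indicators)
```

Unlike sorted(set(...)), dict.fromkeys() keeps the order in which indicators first appeared, which can be useful when the feed is already ordered by recency or priority.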

7.8 Reading CSV Threat Intelligence Data

import csv

rows = [
    ["indicator", "type", "source", "confidence"],
    ["203.0.113.5", "ip", "training-feed", "high"],
    ["malicious-example.com", "domain", "training-feed", "medium"],
    ["badfilehash123", "hash", "training-feed", "high"]
]

with open("threat_feed.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerows(rows)

with open("threat_feed.csv", "r", newline="", encoding="utf-8") as file:
    reader = csv.DictReader(file)
    for row in reader:
        if row["confidence"] == "high":
            print(row["indicator"], row["type"], row["confidence"])
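
Real feeds are rarely as tidy as the training file. As a hedged sketch, the example below reads an in-memory CSV (a made-up stand-in for threat_feed.csv with inconsistent casing and spacing in the confidence column), normalizes that column, and then filters and counts the rows. The sample data and variable names are assumptions for illustration.

```python
import csv
import io
from collections import Counter

# In-memory stand-in for threat_feed.csv; note the inconsistent confidence values.
sample_csv = """indicator,type,source,confidence
203.0.113.5,ip,training-feed,High
malicious-example.com,domain,training-feed,medium
badfilehash123,hash,training-feed, high
"""

reader = csv.DictReader(io.StringIO(sample_csv))
rows = list(reader)

# Normalize the confidence column before filtering or counting.
for row in rows:
    row["confidence"] = row["confidence"].strip().lower()

high_confidence = [row["indicator"] for row in rows if row["confidence"] == "high"]
by_type = Counter(row["type"] for row in rows)

print("High confidence:", high_confidence)
print("Count by type:", dict(by_type))
```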

7.9 Reading JSON Threat Intelligence Data

import json

feed = {
    "source": "training-osint-feed",
    "indicators": [
        {"indicator": "203.0.113.5", "type": "ip", "confidence": "high"},
        {"indicator": "malicious-example.com", "type": "domain", "confidence": "medium"},
        {"indicator": "badfilehash123", "type": "hash", "confidence": "high"}
    ]
}

with open("threat_feed.json", "w", encoding="utf-8") as file:
    json.dump(feed, file, indent=4)

with open("threat_feed.json", "r", encoding="utf-8") as file:
    data = json.load(file)

for item in data["indicators"]:
    print(item["indicator"], item["type"], item["confidence"])
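
The loop above assumes every record has all three keys. A minimal sketch of more defensive parsing follows, assuming the same feed layout; records that are not objects or that lack a required field are counted and skipped instead of raising KeyError. The function name load_valid_records and the malformed sample records are hypothetical.

```python
import json

# Made-up feed text: the second record lacks "type", the third is not an object.
raw_feed = """
{
    "source": "training-osint-feed",
    "indicators": [
        {"indicator": "203.0.113.5", "type": "ip", "confidence": "high"},
        {"indicator": "malicious-example.com", "confidence": "medium"},
        "not-a-record"
    ]
}
"""

REQUIRED_FIELDS = ("indicator", "type", "confidence")

def load_valid_records(text):
    """Parse feed text and return (valid_records, skipped_count)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return [], 0
    if not isinstance(data, dict):
        return [], 0

    valid, skipped = [], 0
    for item in data.get("indicators", []):
        if isinstance(item, dict) and all(field in item for field in REQUIRED_FIELDS):
            valid.append(item)
        else:
            skipped += 1
    return valid, skipped

records, skipped = load_valid_records(raw_feed)
print("Valid records:", len(records))
print("Skipped records:", skipped)
```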

7.10 Classifying Indicator Types

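import ipaddress
import re

# Compiled once so the patterns are not rebuilt on every call.
URL_PATTERN = re.compile(r"^https?://", re.IGNORECASE)
SHA256_PATTERN = re.compile(r"^[a-fA-F0-9]{64}$")
# Simplified check: dot-separated labels ending in an alphabetic top-level domain.
DOMAIN_PATTERN = re.compile(r"^(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}$")

def classify_indicator(indicator):
    indicator = indicator.strip()

    # ipaddress rejects out-of-range values such as 999.1.1.1, which a
    # digit-counting regex would accept. It also recognizes IPv6 addresses.
    try:
        ipaddress.ip_address(indicator)
        return "ip"
    except ValueError:
        pass

    if URL_PATTERN.match(indicator):
        return "url"
    elif SHA256_PATTERN.match(indicator):
        return "sha256_hash"
    elif DOMAIN_PATTERN.match(indicator):
        return "domain"
    else:
        return "unknown"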

indicators = ["203.0.113.5", "https://example.com/login", "malicious-example.com"]

for indicator in indicators:
    print(indicator, "=>", classify_indicator(indicator))
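
Section 7.4 also lists email addresses as an IOC type, and hash indicators are not always SHA-256. The sketch below is one possible extension, not a definitive classifier: it adds an email check and recognizes MD5, SHA-1 and SHA-256 by hex digest length. The function name classify_extended and the deliberately simple email and domain patterns are assumptions.

```python
import ipaddress
import re

# Hex digest lengths: MD5 = 32, SHA-1 = 40, SHA-256 = 64 characters.
HASH_LENGTHS = {32: "md5_hash", 40: "sha1_hash", 64: "sha256_hash"}
HEX_PATTERN = re.compile(r"^[a-fA-F0-9]+$")
# Deliberately simple shape checks; neither is a full standards validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
DOMAIN_PATTERN = re.compile(r"^(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}$")

def classify_extended(indicator):
    """Classify an indicator, including email addresses and common hash lengths."""
    value = indicator.strip()
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    if re.match(r"^https?://", value, re.IGNORECASE):
        return "url"
    if EMAIL_PATTERN.match(value):
        return "email"
    if HEX_PATTERN.match(value) and len(value) in HASH_LENGTHS:
        return HASH_LENGTHS[len(value)]
    if DOMAIN_PATTERN.match(value):
        return "domain"
    return "unknown"

samples = [
    "203.0.113.5",
    "sender@example.com",
    "d41d8cd98f00b204e9800998ecf8427e",  # a 32-character hex string
    "malicious-example.com",
]
for sample in samples:
    print(sample, "=>", classify_extended(sample))
```

Length alone cannot prove which algorithm produced a digest, so the hash labels are best read as "looks like", not as a guarantee.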

7.11 IOC Enrichment

Enrichment adds context to raw indicators, such as source, confidence, first seen date, category and recommended action.

from datetime import date

def enrich_indicator(indicator, indicator_type, confidence):
    enriched = {
        "indicator": indicator,
        "type": indicator_type,
        "confidence": confidence,
        "first_seen": str(date.today()),
        "source": "PDTC training feed"
    }

    if confidence == "high":
        enriched["recommendation"] = "Prioritize investigation"
    else:
        enriched["recommendation"] = "Monitor and validate"

    return enriched

print(enrich_indicator("203.0.113.5", "ip", "high"))

| Field | Purpose |
|---|---|
| Source | Where the intelligence came from. |
| Confidence | How reliable or relevant the indicator is. |
| Category | Phishing, malware, scanning, botnet or ransomware. |
| Recommendation | What the analyst should do next. |
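
The table lists a Category field that the function above does not set. The following sketch shows one way it might be added, assuming a hypothetical lookup table CATEGORY_BY_INDICATOR; in practice the category would more likely come from the feed itself or from an analyst. It also keeps a first_seen value supplied by the feed instead of overwriting it with today's date.

```python
from datetime import date

# Hypothetical lookup table used only for this illustration.
CATEGORY_BY_INDICATOR = {
    "203.0.113.5": "scanning",
    "malicious-example.com": "phishing",
}

def enrich_with_category(record, source):
    """Return a copy of a feed record with source, category and first_seen added."""
    enriched = dict(record)
    enriched["source"] = source
    enriched["category"] = CATEGORY_BY_INDICATOR.get(record["indicator"], "uncategorized")
    # Keep the feed's own first_seen if present; otherwise fall back to today,
    # which is really the date the record was first processed here.
    enriched.setdefault("first_seen", date.today().isoformat())
    return enriched

feed_records = [
    {"indicator": "203.0.113.5", "type": "ip", "confidence": "high", "first_seen": "2026-05-30"},
    {"indicator": "badfilehash123", "type": "hash", "confidence": "high"},
]

for record in feed_records:
    print(enrich_with_category(record, "training-feed"))
```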

7.12 Matching Threat Intelligence with Logs

threat_indicators = ["203.0.113.5", "malicious-example.com"]

logs = [
    "2026-06-02 outbound connection to malicious-example.com",
    "2026-06-02 login success from 192.168.1.10",
    "2026-06-02 failed login from 203.0.113.5"
]

for log in logs:
    for indicator in threat_indicators:
        if indicator in log:
            print("MATCH FOUND")
            print("Indicator:", indicator)
            print("Log:", log)
            print("-" * 30)
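
One caveat with plain substring matching (indicator in log) is that 203.0.113.5 is also a substring of 203.0.113.50, so unrelated hosts can be reported as matches. The sketch below is one way to reduce such false positives, using regular-expression lookarounds so an indicator only matches when it is not embedded in a longer token. The helper name build_pattern and the extra log lines are made up for illustration.

```python
import re

threat_indicators = ["203.0.113.5", "malicious-example.com"]

logs = [
    "2026-06-02 failed login from 203.0.113.5",
    "2026-06-02 failed login from 203.0.113.50",
    "2026-06-02 outbound connection to malicious-example.com",
    "2026-06-02 outbound connection to not-malicious-example.com",
]

def build_pattern(indicator):
    """Match the indicator only when it is not embedded in a longer token."""
    # Lookarounds reject neighbours that could extend an IP or hostname.
    return re.compile(r"(?<![\w.-])" + re.escape(indicator) + r"(?![\w-]|\.\w)")

patterns = {indicator: build_pattern(indicator) for indicator in threat_indicators}

matches = []
for log in logs:
    for indicator, pattern in patterns.items():
        if pattern.search(log):
            matches.append((indicator, log))

for indicator, log in matches:
    print("MATCH:", indicator, "|", log)
```

This version treats subdomains such as mail.malicious-example.com as a different host and does not match them. Whether that is desirable depends on the investigation, so the pattern should be adjusted to the matching policy in use.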

7.13 Simple Threat Score Calculation

def calculate_threat_score(confidence, internal_match, indicator_type):
    score = 0

    if confidence == "high":
        score += 50
    elif confidence == "medium":
        score += 30
    else:
        score += 10

    if internal_match:
        score += 40

    if indicator_type in ["hash", "url"]:
        score += 10

    if score >= 80:
        return score, "Critical"
    elif score >= 60:
        return score, "High"
    elif score >= 40:
        return score, "Medium"
    else:
        return score, "Low"

score, level = calculate_threat_score("high", True, "ip")
print("Threat Score:", score)
print("Risk Level:", level)
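
To make the arithmetic concrete, the sketch below re-expresses the same weights and thresholds as lookup tables and works through four input combinations. The function name score_indicator and the table-driven layout are illustrative; the point values are copied from calculate_threat_score() above.

```python
# Weights and thresholds copied from calculate_threat_score() in this section.
CONFIDENCE_POINTS = {"high": 50, "medium": 30}
LEVELS = [(80, "Critical"), (60, "High"), (40, "Medium")]

def score_indicator(confidence, internal_match, indicator_type):
    score = CONFIDENCE_POINTS.get(confidence, 10)
    score += 40 if internal_match else 0
    score += 10 if indicator_type in ("hash", "url") else 0
    for threshold, label in LEVELS:
        if score >= threshold:
            return score, label
    return score, "Low"

cases = [
    ("high", True, "ip"),        # 50 + 40 + 0  = 90
    ("medium", True, "domain"),  # 30 + 40 + 0  = 70
    ("high", False, "hash"),     # 50 + 0  + 10 = 60
    ("low", False, "ip"),        # 10 + 0  + 0  = 10
]

for confidence, matched, indicator_type in cases:
    score, level = score_indicator(confidence, matched, indicator_type)
    print(f"{confidence:<7}{str(matched):<6}{indicator_type:<7}{score:>3}  {level}")
```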

7.14 Generate Threat Intelligence Report

import csv
from datetime import datetime

report_rows = [
    ["indicator", "type", "confidence", "internal_match", "risk_level", "recommendation"],
    ["203.0.113.5", "ip", "high", "yes", "Critical", "Investigate immediately"],
    ["malicious-example.com", "domain", "medium", "yes", "High", "Review DNS logs"],
    ["badfilehash123", "hash", "high", "no", "High", "Add to watchlist"]
]

with open("threat_intelligence_report.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerow(["Report Created", datetime.now().isoformat(timespec="seconds")])
    writer.writerow([])
    writer.writerows(report_rows)

print("Threat intelligence report created.")
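
The report above places a "Report Created" row before the column headers, which is readable for people but means csv.DictReader would treat that first row as the header. As an alternative sketch, the example below keeps the header as the first row, sorts findings by risk level, and reads the result back to confirm it parses. The findings list and RISK_ORDER mapping are hypothetical, and an in-memory buffer stands in for a file.

```python
import csv
import io

# Hypothetical analysis results, e.g. produced by earlier processing steps.
findings = [
    {"indicator": "malicious-example.com", "type": "domain", "risk_level": "High"},
    {"indicator": "203.0.113.5", "type": "ip", "risk_level": "Critical"},
    {"indicator": "badfilehash123", "type": "hash", "risk_level": "High"},
]

RISK_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}
fieldnames = ["indicator", "type", "risk_level"]

# Write to an in-memory buffer; use open(..., "w", newline="") for a real file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
for row in sorted(findings, key=lambda item: RISK_ORDER[item["risk_level"]]):
    writer.writerow(row)

report_text = buffer.getvalue()
print(report_text)

# Because the header is the first row, the report reads back cleanly.
rows_read = list(csv.DictReader(io.StringIO(report_text)))
print("Rows read back:", len(rows_read))
```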

7.15 Safe API Collection Concept

import os
import requests

api_key = os.getenv("THREAT_INTEL_API_KEY")

if api_key is None:
    print("API key not configured. Set it as an environment variable.")
else:
    headers = {"Authorization": "Bearer " + api_key}
    url = "https://example.com/api/threat-feed"  # replace with approved API
    response = requests.get(url, headers=headers, timeout=10)
    print("Status Code:", response.status_code)

Important: Do not hardcode API keys. Follow API terms, rate limits and organizational policy.
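
Printing the status code shows that a request was made, but a collection script usually also needs to stop on error responses and extract the records. The following is a sketch under stated assumptions: the Bearer header follows the snippet above, while the "indicators" key in the response, the function name fetch_indicators and the http_get parameter are hypothetical. FakeResponse and fake_get exist only so the example runs offline.

```python
def fetch_indicators(url, api_key, http_get=None, timeout=10):
    """Fetch a JSON feed and return its list of indicator records.

    http_get is any callable with the same interface as requests.get;
    passing one in makes the function testable without network access.
    """
    if http_get is None:
        import requests  # third-party library, imported only when needed
        http_get = requests.get

    headers = {"Authorization": "Bearer " + api_key}
    response = http_get(url, headers=headers, timeout=timeout)
    # With requests, this raises HTTPError for 4xx and 5xx responses.
    response.raise_for_status()

    payload = response.json()
    # The "indicators" key is an assumption about the feed's layout.
    return payload.get("indicators", [])

class FakeResponse:
    """Offline stand-in for a requests.Response, for demonstration only."""
    def raise_for_status(self):
        return None

    def json(self):
        return {"indicators": [{"indicator": "203.0.113.5", "type": "ip"}]}

def fake_get(url, headers=None, timeout=None):
    return FakeResponse()

records = fetch_indicators("https://example.com/api/threat-feed", "dummy-key", http_get=fake_get)
print(records)
```

For real use, the key would still be read with os.getenv("THREAT_INTEL_API_KEY") as shown earlier, and the http_get argument would be omitted so that requests.get is used against an approved API.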

7.16 OSINT Analysis Workflow

  1. Define Objective
  2. Collect Public Data
  3. Validate Source
  4. Normalize Indicators
  5. Correlate with Logs
  6. Report Findings

7.18 Practical Activities

Activity 1: Text Feed Processing

Create a text file containing IOCs. Read it using Python and remove duplicate indicators.

Activity 2: CSV Threat Feed

Create a CSV feed with indicator, type, source and confidence. Print only high-confidence indicators.

Activity 3: IOC Classification

Write a function that classifies indicators as IP, domain, URL or hash.

Activity 4: Log Correlation

Compare a list of threat indicators against internal sample logs and print matches.

Mini Project

Build a Python threat intelligence processor that reads a feed, removes duplicates, classifies indicators, matches them against logs and generates a CSV report.

7.19 Interactive Final Assessment Quiz

Each correct answer gives +1 mark. Each wrong answer gives -0.5 mark.

1. Threat intelligence helps organizations prevent, detect and respond to cyber threats.

2. OSINT stands for:

3. IP addresses, domains, URLs and file hashes can be IOCs.

4. Which Python module is used to read JSON data?

5. Removing duplicate indicators improves feed quality.

6. Threat intelligence is more useful when correlated with internal logs.

7. API keys should be hardcoded directly inside scripts.

8. Confidence level helps prioritize indicators.

9. OSINT collection should follow legal, ethical and organizational rules.

10. A CSV report can summarize threat intelligence findings.

7.20 Chapter Summary

In this chapter, learners studied threat intelligence and OSINT using Python. They learned IOC types, feed formats, text/CSV/JSON processing, indicator classification, enrichment, threat scoring, log correlation, safe API concepts and reporting.

Remember: Threat intelligence becomes valuable when it is reliable, contextual, correlated with internal evidence and converted into clear defensive action.