Chapter 7 - Threat Intelligence & OSINT

7.1 Chapter Overview

Threat intelligence is information about cyber threats and attacker behaviour, including indicators of compromise (IOCs) and tactics, techniques and procedures (TTPs). Open-source intelligence (OSINT) is intelligence collected from publicly available sources such as public advisories, security blogs, published feeds, DNS records and official vulnerability notices.

Python helps security analysts process intelligence quickly by reading feeds, removing duplicates, classifying indicators, comparing them with internal logs, enriching records and generating summary reports.

Learning Outcome: By the end of this chapter, learners should be able to collect, process and analyze threat intelligence feeds and OSINT data using Python in a safe, ethical and defensive manner.

1. Collect Sources
2. Normalize IOCs
3. Enrich Context
4. Correlate Logs
5. Report Findings

7.2 Learning Objectives

Understand threat intelligence and OSINT concepts.
Identify common IOC types used in cyber defense.
Read threat data from text, CSV and JSON feeds.
Normalize and remove duplicate indicators using Python.
Compare threat indicators against internal logs.
Apply simple enrichment, confidence and risk scoring.
Generate a threat intelligence summary report.
Respect legal and ethical boundaries of OSINT collection.

7.3 What is Threat Intelligence?

Threat intelligence turns raw security data into useful decisions. It helps answer: What is the threat? How confident are we? Is our organization affected? What action should we take?

Level	Description	Example
Strategic	High-level intelligence for leadership.	Ransomware trends affecting education sector.
Tactical	Attacker techniques and methods.	Credential theft and phishing techniques.
Operational	Information about active campaigns.	Campaign targeting a specific region or industry.
Technical	Machine-readable indicators.	IP addresses, domains, URLs and hashes.

Threat Intelligence = Data + Context + Confidence + Action

7.4 Common IOC Types

IOC Type	Example	Use	Caution
IP Address	203.0.113.5	Detect suspicious communication.	IP ownership can change.
Domain	malicious-example.com	Detect phishing or malware callback.	Domain may later become inactive.
URL	https://example.com/login	Detect phishing or malware hosting.	Do not visit suspicious URLs directly.
File Hash	SHA256 value	Identify known files.	Only matches the exact file.
Email Address	sender@example.com	Phishing investigation.	Can be spoofed.
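
The caution that a file hash only matches the exact file can be illustrated with Python's standard hashlib module. This is a minimal sketch: the helper name sha256_hex and the two byte strings are made up for illustration and do not represent real malware samples.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical file contents, used only for illustration.
original = b"sample file contents"
modified = b"sample file contents!"  # one extra byte at the end

hash_original = sha256_hex(original)
hash_modified = sha256_hex(modified)

print(hash_original)
print(hash_modified)
print("Same hash:", hash_original == hash_modified)
```

Because a single changed byte produces a completely different digest, a hash IOC will not detect a modified copy of the same file.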

7.5 What is OSINT?

OSINT is the collection and analysis of publicly available information. In cyber security, OSINT can support phishing investigation, vulnerability awareness, domain research, exposed asset discovery and incident context.

Public Reports

Vendor research, advisories and incident write-ups.

Public Feeds

Lists of suspicious IPs, domains, URLs or hashes.

DNS Information

Domain and record information for investigation.

Security Advisories

Official notices from trusted vendors and agencies.

Ethical Rule: Collect only lawful public information and follow site terms, privacy rules and organization policy. Never use OSINT to harass, dox, target or profile individuals.

7.6 Threat Intelligence Feed Formats

Format	Description	Python Tool
Text	One indicator per line.	open(), splitlines()
CSV	Rows and columns such as indicator, type and confidence.	csv module
JSON	Structured key-value intelligence records.	json module
API	Data fetched from approved web service.	requests library

7.7 Reading Text IOC Feeds with Python

ioc_data = """203.0.113.5
198.51.100.77
malicious-example.com
phishing-test.net
203.0.113.5
"""

with open("ioc_feed.txt", "w", encoding="utf-8") as file:
    file.write(ioc_data)

with open("ioc_feed.txt", "r", encoding="utf-8") as file:
    indicators = [line.strip() for line in file if line.strip()]

unique_indicators = sorted(set(indicators))

print("Total indicators:", len(indicators))
print("Unique indicators:", len(unique_indicators))

for indicator in unique_indicators:
    print(indicator)

Learning Point: set() removes duplicate indicators so analysts do not review the same IOC repeatedly.
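
set() only removes exact duplicates, so "Malicious-Example.com" and "malicious-example.com" would still count as two indicators. The sketch below shows one possible normalization step before de-duplication. The function name normalize_indicator and the handled "defanged" forms ([.] and hxxp) are assumptions for illustration; real feeds may use other conventions.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_indicator(raw):
    """Return a cleaned indicator suitable for comparison."""
    value = raw.strip()
    # Undo two common "defanging" conventions (assumed, not exhaustive).
    value = value.replace("[.]", ".")
    if value[:4].lower() == "hxxp":
        value = "http" + value[4:]
    if "://" in value:
        # Only the scheme and host of a URL are case-insensitive,
        # so the path, query and fragment are left untouched.
        parts = urlsplit(value)
        return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                           parts.path, parts.query, parts.fragment))
    # IPs, domains and hashes: lower-case and drop a trailing dot.
    return value.lower().rstrip(".")

raw_feed = [
    "Malicious-Example.com",
    "malicious-example[.]com",
    " 203.0.113.5 ",
    "203.0.113.5",
    "hxxps://Example[.]com/Login",
    "",
]

normalized = sorted({normalize_indicator(item) for item in raw_feed if item.strip()})

for indicator in normalized:
    print(indicator)
```

Six raw lines reduce to three unique indicators once case, whitespace and defanging are handled.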

7.8 Reading CSV Threat Intelligence Data

import csv

rows = [
    ["indicator", "type", "source", "confidence"],
    ["203.0.113.5", "ip", "training-feed", "high"],
    ["malicious-example.com", "domain", "training-feed", "medium"],
    ["badfilehash123", "hash", "training-feed", "high"]
]

with open("threat_feed.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerows(rows)

with open("threat_feed.csv", "r", newline="", encoding="utf-8") as file:
    reader = csv.DictReader(file)
    for row in reader:
        if row["confidence"] == "high":
            print(row["indicator"], row["type"], row["confidence"])
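
Real feeds are often less tidy than the sample above: confidence values may differ in case, carry stray spaces or be missing. The following sketch, which assumes the same four column names, reads inline sample data with io.StringIO so that it runs without creating a file, and also counts indicators per type.

```python
import csv
import io
from collections import Counter

# Inline sample data (made up) with inconsistent confidence values.
sample_csv = """indicator,type,source,confidence
203.0.113.5,ip,training-feed,High
malicious-example.com,domain,training-feed,medium
badfilehash123,hash,training-feed, high
198.51.100.77,ip,training-feed,
"""

reader = csv.DictReader(io.StringIO(sample_csv))
high_confidence = []
counts_by_type = Counter()

for row in reader:
    # Treat a missing value as empty, then ignore case and spaces.
    confidence = (row.get("confidence") or "").strip().lower()
    counts_by_type[row["type"]] += 1
    if confidence == "high":
        high_confidence.append(row["indicator"])

print("High confidence:", high_confidence)
print("Count by type:", dict(counts_by_type))
```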

7.9 Reading JSON Threat Intelligence Data

import json

feed = {
    "source": "training-osint-feed",
    "indicators": [
        {"indicator": "203.0.113.5", "type": "ip", "confidence": "high"},
        {"indicator": "malicious-example.com", "type": "domain", "confidence": "medium"},
        {"indicator": "badfilehash123", "type": "hash", "confidence": "high"}
    ]
}

with open("threat_feed.json", "w", encoding="utf-8") as file:
    json.dump(feed, file, indent=4)

with open("threat_feed.json", "r", encoding="utf-8") as file:
    data = json.load(file)

for item in data["indicators"]:
    print(item["indicator"], item["type"], item["confidence"])
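
The loop above raises a KeyError if a record lacks one of the three keys. As a hedged sketch, the helper below (load_indicators is a hypothetical name) skips records without an indicator and fills the other fields with "unknown". The sample JSON is made up and deliberately incomplete.

```python
import json

raw_json = """
{
  "source": "training-osint-feed",
  "indicators": [
    {"indicator": "203.0.113.5", "type": "ip", "confidence": "high"},
    {"indicator": "malicious-example.com", "type": "domain"},
    {"type": "hash", "confidence": "high"}
  ]
}
"""

def load_indicators(text):
    """Return (valid_records, skipped_count) from a JSON feed string."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return [], 0
    if not isinstance(data, dict):
        return [], 0

    valid, skipped = [], 0
    for item in data.get("indicators", []):
        # A record without an indicator value cannot be used.
        if not isinstance(item, dict) or not item.get("indicator"):
            skipped += 1
            continue
        valid.append({
            "indicator": item["indicator"],
            "type": item.get("type", "unknown"),
            "confidence": item.get("confidence", "unknown"),
        })
    return valid, skipped

records, skipped = load_indicators(raw_json)
print("Valid records:", len(records))
print("Skipped records:", skipped)
```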

7.10 Classifying Indicator Types

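import ipaddress
import re

URL_PATTERN = re.compile(r"^https?://", re.IGNORECASE)
SHA256_PATTERN = re.compile(r"^[a-fA-F0-9]{64}$")

def is_ip_address(value):
    # The ipaddress module rejects out-of-range values such as 999.1.1.1,
    # which a simple \d{1,3} pattern would accept. It also accepts IPv6.
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False

def classify_indicator(indicator):
    indicator = indicator.strip()

    if is_ip_address(indicator):
        return "ip"
    elif URL_PATTERN.match(indicator):
        return "url"
    elif SHA256_PATTERN.match(indicator):
        return "sha256_hash"
    elif "." in indicator:
        # Rough check only: any remaining value containing a dot
        # is treated as a domain.
        return "domain"
    else:
        return "unknown"

indicators = ["203.0.113.5", "https://example.com/login", "malicious-example.com"]

for indicator in indicators:
    print(indicator, "=>", classify_indicator(indicator))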

7.11 IOC Enrichment

Enrichment adds context to raw indicators, such as source, confidence, first seen date, category and recommended action.

from datetime import date

def enrich_indicator(indicator, indicator_type, confidence):
    enriched = {
        "indicator": indicator,
        "type": indicator_type,
        "confidence": confidence,
        "first_seen": str(date.today()),
        "source": "PDTC training feed"
    }

    if confidence == "high":
        enriched["recommendation"] = "Prioritize investigation"
    else:
        enriched["recommendation"] = "Monitor and validate"

    return enriched

print(enrich_indicator("203.0.113.5", "ip", "high"))

Field	Purpose
Source	Where the intelligence came from.
Confidence	How reliable or relevant the indicator is.
Category	Phishing, malware, scanning, botnet or ransomware.
Recommendation	What the analyst should do next.
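
The table lists a Category field that enrich_indicator() does not set. One way to add it is a simple lookup, sketched below. The lookup table, its category values and the function name add_category are all hypothetical; in practice the category would normally come from the feed itself or from analyst review.

```python
# Hypothetical category lookup for the training indicators.
CATEGORY_BY_INDICATOR = {
    "203.0.113.5": "scanning",
    "malicious-example.com": "phishing",
}

def add_category(record, lookup=CATEGORY_BY_INDICATOR):
    """Return a copy of record with a 'category' field added."""
    enriched = dict(record)  # copy, so the input record is not modified
    enriched["category"] = lookup.get(record["indicator"], "uncategorized")
    return enriched

sample_record = {"indicator": "203.0.113.5", "type": "ip", "confidence": "high"}
unknown_record = {"indicator": "198.51.100.77", "type": "ip", "confidence": "low"}

print(add_category(sample_record))
print(add_category(unknown_record))
```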

7.12 Matching Threat Intelligence with Logs

threat_indicators = ["203.0.113.5", "malicious-example.com"]

logs = [
    "2026-06-02 outbound connection to malicious-example.com",
    "2026-06-02 login success from 192.168.1.10",
    "2026-06-02 failed login from 203.0.113.5"
]

for log in logs:
    for indicator in threat_indicators:
        if indicator in log:
            print("MATCH FOUND")
            print("Indicator:", indicator)
            print("Log:", log)
            print("-" * 30)
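
The substring test used above is easy to read, but it also reports 203.0.113.5 inside 203.0.113.55 and malicious-example.com inside not-malicious-example.com. A stricter variant, sketched here with the standard re module, matches an indicator only as a whole token. The function names and the extra sample log lines are assumptions for illustration.

```python
import re

def compile_indicator(indicator):
    """Build a pattern that matches the indicator only as a whole token."""
    return re.compile(
        r"(?<![\w.-])"          # not preceded by a letter, digit, dot or hyphen
        + re.escape(indicator)  # the indicator itself, taken literally
        + r"(?!\.?[\w-])"       # not followed by further digits or labels
    )

def find_matches(logs, indicators):
    """Return a list of (indicator, log_line) pairs for whole-token hits."""
    patterns = [(ioc, compile_indicator(ioc)) for ioc in indicators]
    matches = []
    for line in logs:
        for ioc, pattern in patterns:
            if pattern.search(line):
                matches.append((ioc, line))
    return matches

sample_indicators = ["203.0.113.5", "malicious-example.com"]
sample_logs = [
    "2026-06-02 failed login from 203.0.113.55",
    "2026-06-02 failed login from 203.0.113.5",
    "2026-06-02 outbound connection to not-malicious-example.com",
    "2026-06-02 outbound connection to malicious-example.com.",
]

for ioc, line in find_matches(sample_logs, sample_indicators):
    print("MATCH:", ioc, "|", line)
```

Only the second and fourth log lines match. Note the trade-off: this sketch also ignores subdomains such as mail.malicious-example.com, which an analyst may prefer to include.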

7.13 Simple Threat Score Calculation

def calculate_threat_score(confidence, internal_match, indicator_type):
    score = 0

    if confidence == "high":
        score += 50
    elif confidence == "medium":
        score += 30
    else:
        score += 10

    if internal_match:
        score += 40

    if indicator_type in ["hash", "url"]:
        score += 10

    if score >= 80:
        return score, "Critical"
    elif score >= 60:
        return score, "High"
    elif score >= 40:
        return score, "Medium"
    else:
        return score, "Low"

score, level = calculate_threat_score("high", True, "ip")
print("Threat Score:", score)
print("Risk Level:", level)
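
To check the arithmetic by hand, the sketch below restates the scoring rules above in table form so that it runs on its own, then scores four sample cases. The names CONFIDENCE_POINTS, score_indicator and risk_level are introduced here for illustration only.

```python
# Points per confidence level; anything else scores 10, as in the rules above.
CONFIDENCE_POINTS = {"high": 50, "medium": 30}

def score_indicator(confidence, internal_match, indicator_type):
    score = CONFIDENCE_POINTS.get(confidence, 10)
    score += 40 if internal_match else 0                       # seen in internal logs
    score += 10 if indicator_type in ("hash", "url") else 0    # more specific IOC types
    return score

def risk_level(score):
    if score >= 80:
        return "Critical"
    elif score >= 60:
        return "High"
    elif score >= 40:
        return "Medium"
    return "Low"

examples = [
    ("high", True, "ip"),        # 50 + 40 + 0  = 90
    ("medium", True, "domain"),  # 30 + 40 + 0  = 70
    ("high", False, "hash"),     # 50 + 0  + 10 = 60
    ("low", False, "domain"),    # 10 + 0  + 0  = 10
]

results = [(score_indicator(*case), risk_level(score_indicator(*case))) for case in examples]

for case, (score, level) in zip(examples, results):
    print(case, "=>", score, level)
```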

7.14 Generate Threat Intelligence Report

import csv
from datetime import datetime

report_rows = [
    ["indicator", "type", "confidence", "internal_match", "risk_level", "recommendation"],
    ["203.0.113.5", "ip", "high", "yes", "Critical", "Investigate immediately"],
    ["malicious-example.com", "domain", "medium", "yes", "High", "Review DNS logs"],
    ["badfilehash123", "hash", "high", "no", "High", "Add to watchlist"]
]

with open("threat_intelligence_report.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)
    writer.writerow(["Report Created", datetime.now().isoformat(timespec="seconds")])
    writer.writerow([])
    writer.writerows(report_rows)

print("Threat intelligence report created.")

7.15 Safe API Collection Concept

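import os
import requests

api_key = os.getenv("THREAT_INTEL_API_KEY")

if api_key is None:
    print("API key not configured. Set it as an environment variable.")
else:
    # The header format depends on the provider; a Bearer token is assumed here.
    headers = {"Authorization": "Bearer " + api_key}
    url = "https://example.com/api/threat-feed"  # replace with approved API
    try:
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()  # raise an error for 4xx/5xx responses
        print("Status Code:", response.status_code)
    except requests.RequestException as error:
        print("Request failed:", error)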

Important: Do not hardcode API keys. Follow API terms, rate limits and organizational policy.
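
Respecting a rate limit can be as simple as waiting a minimum time between requests. The class below is a minimal sketch of that idea, not part of any API client: the name MinIntervalLimiter and the chosen intervals are assumptions, and the actual limit must come from the provider's terms.

```python
import time

class MinIntervalLimiter:
    """Ensure at least min_interval seconds pass between calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now

# Short interval so the demonstration finishes quickly.
limiter = MinIntervalLimiter(0.05)

for page in range(3):
    limiter.wait()
    # An approved API request would be sent here.
    print("Request slot", page + 1)
```

Calling limiter.wait() immediately before each request keeps the script under the agreed request rate, whatever else the loop does.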

7.16 OSINT Analysis Workflow

1. Define Objective
2. Collect Public Data
3. Validate Source
4. Normalize Indicators
5. Correlate with Logs
6. Report Findings
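
The workflow steps can be sketched as one small pipeline. Everything here is illustrative: the allow-list APPROVED_SOURCES, the function names and the sample records are assumptions, collection is replaced by an in-memory list, and correlation uses a plain substring test.

```python
import csv
import io

APPROVED_SOURCES = {"training-feed"}  # hypothetical allow-list of validated sources

def run_pipeline(feed_records, logs):
    """Validate, normalize and correlate feed records; return findings."""
    # Validate Source: keep only records from approved sources.
    trusted = [r for r in feed_records if r.get("source") in APPROVED_SOURCES]

    # Normalize Indicators: trim, lower-case and de-duplicate (first record wins).
    unique = {}
    for record in trusted:
        key = record["indicator"].strip().lower()
        unique.setdefault(key, record)

    # Correlate with Logs: count log lines that contain each indicator.
    findings = []
    for indicator, record in sorted(unique.items()):
        hits = [line for line in logs if indicator in line.lower()]
        findings.append({
            "indicator": indicator,
            "confidence": record["confidence"],
            "log_hits": len(hits),
        })
    return findings

def findings_to_csv(findings):
    """Report Findings: render findings as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["indicator", "confidence", "log_hits"])
    writer.writeheader()
    writer.writerows(findings)
    return buffer.getvalue()

# Collect Public Data: stands in for reading an approved feed.
feed = [
    {"indicator": "203.0.113.5", "source": "training-feed", "confidence": "high"},
    {"indicator": "Malicious-Example.com ", "source": "training-feed", "confidence": "medium"},
    {"indicator": "203.0.113.5", "source": "training-feed", "confidence": "high"},
    {"indicator": "198.51.100.77", "source": "unknown-paste-site", "confidence": "low"},
]
logs = [
    "2026-06-02 outbound connection to malicious-example.com",
    "2026-06-02 login success from 192.168.1.10",
    "2026-06-02 failed login from 203.0.113.5",
]

findings = run_pipeline(feed, logs)
print(findings_to_csv(findings))
```

The record from the unapproved source is dropped before any matching takes place, which keeps unvalidated data out of the report.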

7.17 Interactive IOC Classifier Demo

Paste indicators below. The demo classifies them as IP, URL, hash, domain or unknown.


7.18 Practical Activities

Activity 1: Text Feed Processing

Create a text file containing IOCs. Read it using Python and remove duplicate indicators.

Activity 2: CSV Threat Feed

Create a CSV feed with indicator, type, source and confidence. Print only high-confidence indicators.

Activity 3: IOC Classification

Write a function that classifies indicators as IP, domain, URL or hash.

Activity 4: Log Correlation

Compare a list of threat indicators against internal sample logs and print matches.

Mini Project

Build a Python threat intelligence processor that reads a feed, removes duplicates, classifies indicators, matches them against logs and generates a CSV report.

7.19 Interactive Final Assessment Quiz

Each correct answer gives +1 mark. Each wrong answer gives -0.5 mark.

1. Threat intelligence helps organizations prevent, detect and respond to cyber threats.

True / False

2. OSINT stands for:

a) Open Source Intelligence
b) Online System Internal Network
c) Official Security Internal Tool
d) Open Server Internet Node

3. IP addresses, domains, URLs and file hashes can be IOCs.

True / False

4. Which Python module is used to read JSON data?

a) json
b) paint
c) audio
d) camera

5. Removing duplicate indicators improves feed quality.

True / False

6. Threat intelligence is more useful when correlated with internal logs.

True / False

7. API keys should be hardcoded directly inside scripts.

True / False

8. Confidence level helps prioritize indicators.

True / False

9. OSINT collection should follow legal, ethical and organizational rules.

True / False

10. A CSV report can summarize threat intelligence findings.

True / False


7.20 Chapter Summary

In this chapter, learners studied threat intelligence and OSINT using Python. They learned IOC types, feed formats, text/CSV/JSON processing, indicator classification, enrichment, threat scoring, log correlation, safe API concepts and reporting.

Remember: Threat intelligence becomes valuable when it is reliable, contextual, correlated with internal evidence and converted into clear defensive action.