What is OSINT?

Open source intelligence (OSINT) is the process of collecting and analyzing publicly available information about an organization's digital footprint and converting it into technical insights that guide security decisions. Unlike a casual web search, OSINT follows a structured methodology that turns raw data into a map of the external attack surface.

Originally developed for military and government intelligence, OSINT is now a foundational part of cloud security, fraud detection, and digital forensics. OSINT differs from hacking because it doesn’t involve breaching systems, bypassing authentication, or exploiting vulnerabilities. Security engineers work entirely with publicly and legally accessible data to identify the same exposures an external party might find.

OSINT analysts focus on what an organization unintentionally exposes to the internet: infrastructure, metadata, credentials, APIs, employee information, and misconfigurations that can be discovered without breaching any perimeter. This intelligence is gathered from top OSINT sources like:

The open web: Content routinely indexed by standard search engines, including corporate websites, news reports, social media profiles, and public code repositories.
The deep web: Information that standard search engines do not index but that is accessible through specific APIs or query interfaces, such as certificate transparency logs or historical DNS records.
The dark web: Anonymized networks requiring specialized software to access, heavily monitored by security teams for leaked proprietary data, credential dumps, and threat actor communications.


How does OSINT work: The OSINT cycle

The OSINT cycle distinguishes professional intelligence from a casual web search. This structured process ensures teams collect, interpret, and act on data to identify real risk rather than noise.

Security teams follow these six steps to build a structured OSINT workflow:

Step 1: Planning and direction: Analysts define specific intelligence requirements, such as verifying S3 bucket naming conventions or identifying exposed employee data on social media.

Step 2: Collection: Guided by the intelligence requirements, teams collate raw data from open sources using OSINT tools like Shodan and Censys, and techniques like Google dorking, port enumeration, and DNS mapping (detailed in the section below).

Step 3: Processing: Security teams clean and normalize the raw results by removing overlapping records, validating sources, and discarding noise.

Step 4: Analysis: This step links normalized data to your internal environment to determine if an exposure is reachable or connects to sensitive data.

Step 5: Dissemination: Analysts route findings to the correct stakeholders, such as providing remediation steps to infrastructure engineers.

Step 6: Continuous monitoring: Security programs automate the cycle so that new findings flow continuously into the platforms that track and enrich exposure data.
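As a rough illustration of the processing step, the sketch below normalizes hostnames and drops overlapping records reported by more than one source. The `Finding` schema and the `host`, `port`, and `source` field names are assumptions made for this example, not the output format of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One normalized observation (hypothetical schema for this sketch)."""
    host: str
    port: int
    source: str


def normalize_host(raw: str) -> str:
    """Lowercase a hostname and strip whitespace and any trailing dot."""
    return raw.strip().lower().rstrip(".")


def process(raw_records: list[dict]) -> list[Finding]:
    """Normalize raw records and keep one finding per (host, port) pair."""
    seen: set[tuple[str, int]] = set()
    findings: list[Finding] = []
    for record in raw_records:
        host = normalize_host(str(record.get("host") or ""))
        port = record.get("port")
        if not host or not isinstance(port, int):
            continue  # discard noise: no usable host or port
        key = (host, port)
        if key in seen:
            continue  # overlapping record already reported by another source
        seen.add(key)
        findings.append(Finding(host, port, str(record.get("source", "unknown"))))
    return findings


raw = [
    {"host": "API.example.com.", "port": 443, "source": "shodan"},
    {"host": "api.example.com", "port": 443, "source": "censys"},
    {"host": "", "port": 80, "source": "censys"},
]
findings = process(raw)
```

Here the first two records collapse into a single finding and the third is discarded because it has no host. A real pipeline would also validate sources, which this sketch leaves out.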

What are the common OSINT techniques and methodologies?

OSINT collection techniques range from passive discovery to active probing based on the level of interaction required with the target environment.

1. Passive reconnaissance techniques

Passive reconnaissance identifies exposure through third-party sources like registries, archives, and search engines. This method avoids direct interaction with target systems to ensure the investigation remains undetectable.

Passive reconnaissance techniques

Technique	Description
Google dorking (Google hacking)	Security engineers use advanced search operators to find indexed resources. This reveals administrative login pages or configuration files that should remain private.
Certificate transparency mining	Publicly trusted TLS certificates are recorded in public logs. Querying Certificate Transparency (CT) logs identifies newly issued certificates, which helps map subdomains, backend APIs, and edge infrastructure.
DNS & infrastructure mapping	Security engineers use passive DNS history, WHOIS records, reverse IP lookups, and ASN analysis to map current and historical infrastructure relationships. This mapping surfaces abandoned legacy infrastructure so teams can decommission it or bring it up to modern security standards.
Storage enumeration & public database searches	Security teams identify misconfigured cloud storage buckets, unprotected databases, and leaked credentials published within third-party breach compilations. This allows teams to focus on exploitable findings rather than purely informational exposure.
Code & repository intelligence	Scanning public repositories like GitHub, GitLab, and Bitbucket identifies exposed credentials. Analysts correlate organizational identifiers and credential patterns to find exposed API keys, internal URLs, and cloud provider access tokens so teams can revoke them before attackers use them.
Metadata extraction	Publicly posted images and files contain metadata that serves as a technical resource for analysts. Metadata like EXIF data, usernames, file paths, software versions, and GPS coordinates validates and enriches other security findings.
Social media & organizational intelligence	Security teams analyze LinkedIn, job postings, and public appearances to identify employee roles and technology stacks. This data helps teams secure the human attack surface by uncovering operational patterns visible to the internet.
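To make the first technique in the table concrete, the sketch below assembles a search query from commonly used operators (`site:`, `inurl:`, `filetype:`, `intitle:`). The function names are hypothetical, the domain is a placeholder, and the exact behavior of each operator is up to the search engine.

```python
from typing import Optional
from urllib.parse import quote_plus


def build_dork(domain: str,
               inurl: Optional[str] = None,
               filetype: Optional[str] = None,
               intitle: Optional[str] = None) -> str:
    """Combine search operators into a single query scoped to one domain."""
    parts = [f"site:{domain}"]
    if inurl:
        parts.append(f"inurl:{inurl}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    return " ".join(parts)


def search_url(query: str) -> str:
    """URL-encode the query so it can be opened in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


admin_pages = build_dork("example.com", inurl="admin")
config_files = build_dork("example.com", filetype="env")
```

Because the query runs against the search engine's index rather than the target, this stays within passive reconnaissance.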

2. Active reconnaissance techniques

Active reconnaissance techniques interact directly with target systems, which generates network traffic and leaves records in server logs. Security engineers apply active techniques to validate exposures identified during the passive discovery phase. More interaction provides higher technical certainty, but it requires coordination with security operations so that testing can be distinguished from unauthorized activity.

Active reconnaissance techniques

Technique	Description
Port scanning & service enumeration	Security engineers scan specific IP addresses to find open network ports and determine which services, protocol versions, and configurations run on the infrastructure perimeter. This data allows teams to focus remediation efforts on the most critical exposures.
Banner grabbing	Security engineers initiate direct connections to discovered services like web servers or databases to extract service banners. These banners reveal specific version information and metadata that teams use to identify vulnerabilities and prioritize patching.
Web application fingerprinting	Security teams identify the underlying technology stack of a web application by analyzing HTTP response headers, CMS default files, and platform responses. This process identifies misconfigured components that require remediation.
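Banner grabbing can be sketched with the standard library alone. The example below assumes a service that sends its banner immediately on connect, as SSH servers do with their identification string, and it should only be pointed at hosts you are authorized to test. The function names are hypothetical.

```python
import re
import socket
from typing import Optional


def grab_banner(host: str, port: int, timeout: float = 3.0) -> bytes:
    """Connect to host:port and return the first bytes the service sends."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        return conn.recv(1024)


# SSH identification strings look like "SSH-protoversion-softwareversion comments".
SSH_BANNER = re.compile(rb"^SSH-(?P<proto>[\d.]+)-(?P<software>\S+)")


def parse_ssh_banner(banner: bytes) -> Optional[dict]:
    """Extract the protocol and software version from an SSH banner."""
    match = SSH_BANNER.match(banner.strip())
    if match is None:
        return None  # not an SSH identification string
    return {
        "protocol": match["proto"].decode(),
        "software": match["software"].decode(),
    }


info = parse_ssh_banner(b"SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.1\r\n")
```

Services that wait for the client to speak first, such as HTTP servers, need a request before they return anything, so `grab_banner` as written would time out against them.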

3. Passive vs. active OSINT: How are they different?

Passive reconnaissance methods use third-party data sources and avoid direct interaction with target systems.

Active reconnaissance directly probes target systems, which generates logs and can trigger SOC alerts.

Teams use active techniques sparingly to avoid triggering security alerts or network disruption. Mature organizations enforce strict controls around active techniques, often requiring formal authorization (even for internal assets) to avoid operational disruption, false alarms, and policy violations.
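One simple way to enforce this kind of control is to check every target against an approved scope before any active technique runs. The sketch below uses documentation-reserved address ranges as a stand-in for an authorized scope. The `AUTHORIZED_SCOPE` list and function name are assumptions for illustration, and a real program would load the scope from a signed-off engagement record.

```python
import ipaddress

# Hypothetical approved ranges (RFC 5737 documentation addresses).
AUTHORIZED_SCOPE = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]


def is_in_scope(target: str, scope=AUTHORIZED_SCOPE) -> bool:
    """Return True only if the target IP falls inside an approved network."""
    address = ipaddress.ip_address(target)
    return any(address in network for network in scope)


targets = ["203.0.113.10", "198.51.100.20", "192.0.2.1"]
approved = [t for t in targets if is_in_scope(t)]
```

Targets that fail the check are dropped before any packet is sent, which keeps an out-of-scope address from being probed by accident.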


Legal, ethical, and OPSEC considerations

High-maturity security programs integrate legal, ethical, and operational security (OPSEC) considerations directly into their OSINT methodology. While OSINT utilizes publicly accessible data, organizations must distinguish between data availability and authorized use. Establishing clear data-use policies ensures that intelligence collection remains compliant with global privacy regulations and internal governance standards.

Before initiating an investigation or automating collection, security teams define strict operational guidelines. These policies codify permissible techniques, data handling rules, and escalation thresholds. This structured approach ensures that intelligence provides the technical context required to secure the environment without creating new organizational liabilities. Core considerations include:

1. Legal and ethical boundaries

The foundational rule of OSINT is that "publicly available" does not mean "no restrictions on use." When analysts find data such as leaked employee credentials, social media profiles, or internal directories, including data inside third-party breach compilations, they are handling personally identifiable information (PII). Privacy regulations such as the GDPR (EU) and the CCPA (California) still govern how teams handle that PII, so data use must remain limited and accountable.

While OSINT data exists openly on the internet, active scanning requires explicit authorization. Security teams align active scanning with acceptable use policies (AUP) to ensure all infrastructure interactions remain authorized. This includes coordinating vulnerability scans or service enumeration with asset owners to avoid operational disruption.

Active scans of third-party infrastructure must comply with laws that prohibit unauthorized access, such as the U.S. Computer Fraud and Abuse Act (CFAA), and with cross-border data handling laws. Obtaining authorization in advance keeps active discovery within these legal frameworks and maintains the integrity of the security investigation.

2. Operational security (OPSEC) for analysts

Operational security (OPSEC) protects the investigator by minimizing the digital footprint left during research. Every interaction on the internet generates network metadata that identifies the source of an investigation if the environment is not properly isolated.

Directly querying external infrastructure from a corporate network associates the organization’s IP address with the investigation. To maintain anonymity and prevent external parties from identifying the source of an inquiry, security engineers must use dedicated infrastructure for all OSINT activities.

Analysts perform investigations using isolated environments, sandboxed systems, and dedicated VPNs. This separation ensures that research activities remain distinct from the corporate network, preserving the integrity of the investigation and the security of the organization.

What tools are used for OSINT?

OSINT tools vary depending on the intelligence target. They are typically grouped by function and data source, from broad internet-mapping platforms to specialized extraction tools:

1. Aggregators & search engines

Rather than launching manual scans that trigger target defenses, analysts query commercial platforms that continuously index the global internet, including:

Shodan: Indexes internet-connected devices and open ports. It helps security teams find unprotected IoT devices or misconfigured cloud databases.
Censys: Runs internet-wide scans and offers granular certificate and host searches, which helps teams spot exposed services, digital certificates, and network vulnerabilities.
crt.sh: Searches Certificate Transparency (CT) logs so practitioners can discover newly issued TLS certificates and the subdomains they cover, including ones an organization may not realize are public.
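As a sketch of how CT results turn into a subdomain list, the function below processes records shaped like crt.sh's JSON output, where each record carries a `name_value` field that may hold several newline-separated names. That field name reflects the service's output as commonly observed and may change, so treat it as an assumption. The sample data is invented, and fetching the records is left out so the example runs offline.

```python
def subdomains_from_ct(records: list[dict], apex: str) -> set[str]:
    """Collect unique hostnames under the apex domain from CT log records."""
    apex = apex.lower().lstrip(".")
    names: set[str] = set()
    for record in records:
        # One certificate can cover several names, separated by newlines.
        for name in str(record.get("name_value", "")).splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # a wildcard entry still reveals its parent name
            if name == apex or name.endswith("." + apex):
                names.add(name)
    return names


sample_records = [
    {"name_value": "api.example.com\n*.dev.example.com"},
    {"name_value": "example.com"},
    {"name_value": "mail.example.org"},  # different domain, filtered out
]
subdomains = subdomains_from_ct(sample_records, "example.com")
```

The suffix check uses `"." + apex` so that a lookalike such as `notexample.com` is not mistaken for a subdomain.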

2. Repository scanners

Repository scanners identify exposed secrets in public code, allowing security teams to remediate vulnerabilities before they impact the cloud environment.

TruffleHog: Scans Git repository history for secrets, like API keys and infrastructure credentials, that developers may have accidentally committed to publicly available code.
Gitleaks: Detects hardcoded secrets, connection strings, and configuration file oversights across code repositories.
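The core idea behind these scanners can be shown with a few regular expressions. The sketch below is a deliberate simplification and does not reproduce how TruffleHog or Gitleaks work internally. It checks text for two well-known patterns: the prefix format of AWS access key IDs and PEM private key headers. The key in the sample is the placeholder used in AWS documentation.

```python
import re

# Two illustrative rules; production scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, rule name) for every line that matches a rule."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, label))
    return hits


sample = (
    'region = "us-east-1"\n'
    'aws_key = "AKIAIOSFODNN7EXAMPLE"\n'
    "-----BEGIN RSA PRIVATE KEY-----\n"
)
hits = scan_text(sample)
```

This sketch only inspects the text it is given. Scanning repository history, as the tools above do, means running the same checks over every past commit rather than just the current files.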

3. Link analysis & visualization

Security teams correlate raw public data with internal asset logs to identify specific risks to the cloud environment.

Maltego: Visualizes relationships between entities (people, emails, domains, and IPs) so analysts can connect complex corporate footprints attackers could exploit.
SpiderFoot: Scans and correlates hundreds of public data sources (DNS, threat feeds, social media) to map a target's entire digital footprint.
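Link analysis rests on a simple graph model: entities are nodes and observed relationships are edges. The sketch below is a minimal stand-in for that idea, not a reproduction of either tool. It uses invented sample links and finds everything reachable from one starting entity with a breadth-first search.

```python
from collections import defaultdict, deque


def build_graph(links: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Build an undirected graph from pairs of related entities."""
    graph: dict[str, set[str]] = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_entities(graph: dict[str, set[str]], start: str) -> set[str]:
    """Breadth-first search for every entity reachable from the start node."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(start)
    return seen


links = [
    ("jane@example.com", "example.com"),
    ("example.com", "203.0.113.10"),
    ("203.0.113.10", "legacy.example.net"),
    ("other.example.org", "198.51.100.7"),  # unrelated cluster
]
graph = build_graph(links)
related = connected_entities(graph, "jane@example.com")
```

Starting from a single email address, the traversal surfaces a legacy host that shares an IP address with the main domain, which is the kind of indirect relationship link analysis is meant to expose.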

4. Document & metadata scanners

Publicly available files contain embedded metadata that provides technical context for infrastructure mapping.

FOCA: Extracts hidden information and metadata from standard public documents, allowing analysts to retrieve internal network schemas, software versions, and embedded author tracking data for deeper infrastructure mapping.

5. Browser-based investigators

Ad hoc investigations prioritize speed and manual analysis over automated workflows for targeted research. This approach allows security engineers to pivot quickly between data sources during high-priority incidents, focusing on specific technical indicators without the constraints of pre-defined automation logic.

Mitaka: A lightweight browser extension that performs rapid queries for domains, IP addresses, and file hashes across multiple threat intelligence databases directly from the browser.


Frameworks & reference resources

Because the OSINT ecosystem includes a wide range of specialized tools, security teams prioritize structured methodologies over specific software selections. Standardized frameworks ensure consistent data collection and analysis, allowing organizations to maintain a repeatable security posture as individual tools evolve.

OSINT Framework: A web-based directory that organizes specialized research tools by data category. Security teams use the framework to pivot from a specific input (such as an email address, domain, or IP) to the relevant discovery platform for deeper analysis.
IntelTechniques: Provides specialized search applications and operational security (OPSEC) parameters for complex digital investigations. Security teams use these resources to maintain anonymity while performing deep-web research.
NATO OSINT Handbook/Open Source Intelligence Handbook: These foundational references define the technical requirements for the standardized intelligence lifecycle. They provide the methodology for converting raw open-source data into the technical context required for enterprise security operations.

OSINT for cybersecurity

OSINT supports several high-value cybersecurity use cases. It is especially useful for continuous external attack surface discovery, leaked credential and secret detection, and incident investigation.

For attack surface discovery, OSINT can help identify public-facing endpoints, subdomains, storage buckets, and APIs that appear as cloud environments change. For credential exposure, public repositories and other open sources can reveal accidentally committed keys, tokens, or connection strings before attackers find them. For incident response, OSINT adds outside-the-network context through sources such as WHOIS, passive DNS, and threat intelligence reporting, which can help scope incidents and understand attacker infrastructure.
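For the attack surface discovery use case, the basic operation is a comparison between what the organization believes is public and what OSINT actually observes. A minimal sketch, using invented hostnames and a hypothetical function name:

```python
def diff_attack_surface(known: list[str], discovered: list[str]) -> dict[str, list[str]]:
    """Compare the asset inventory with hosts observed through OSINT."""
    known_set = {host.lower() for host in known}
    discovered_set = {host.lower() for host in discovered}
    return {
        # Observed externally but missing from the inventory.
        "unknown": sorted(discovered_set - known_set),
        # In the inventory but not observed in this collection run.
        "not_observed": sorted(known_set - discovered_set),
    }


inventory = ["www.example.com", "api.example.com"]
observed = ["WWW.example.com", "api.example.com", "staging.example.com"]
report = diff_attack_surface(inventory, observed)
```

Running this comparison on a schedule is one way to catch endpoints that appear as cloud environments change.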

How Wiz bridges OSINT to threat intelligence

The value of OSINT depends on the correlation of external discoveries with internal environment context. While OSINT identifies exposed infrastructure, it does not inherently indicate whether that infrastructure contains sensitive data, holds privileged access, or represents a high-risk attack surface.
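As a generic illustration of that correlation, and not a description of how any product implements it, the sketch below joins externally discovered hosts with an internal context table and ranks them. The `sensitive_data` and `privileged_access` fields and the additive score are assumptions chosen for the example.

```python
def correlate(exposed_hosts: list[str],
              internal_context: dict[str, dict]) -> tuple[list[tuple[str, int]], list[str]]:
    """Rank exposed hosts by internal risk context; list hosts with none."""
    matched: list[tuple[str, int]] = []
    unmatched: list[str] = []
    for host in exposed_hosts:
        context = internal_context.get(host)
        if context is None:
            unmatched.append(host)  # exposed, but unknown to the inventory
            continue
        score = (int(bool(context.get("sensitive_data")))
                 + int(bool(context.get("privileged_access"))))
        matched.append((host, score))
    matched.sort(key=lambda item: item[1], reverse=True)
    return matched, unmatched


context = {
    "api.example.com": {"sensitive_data": True, "privileged_access": True},
    "static.example.com": {"sensitive_data": False, "privileged_access": False},
}
ranked, unknown = correlate(
    ["static.example.com", "api.example.com", "old.example.com"], context
)
```

The same external exposure ranks very differently once internal context is attached, and hosts with no context at all are set aside for triage rather than silently scored.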

Wiz addresses this requirement by mapping external exposure against the cloud threat landscape and integrating these findings into the Wiz threat and vulnerability detection engine. This correlation enables security teams to move beyond raw data collection toward automated remediation.

Here’s how:

Cloud threat landscape: Wiz tracks cloud-specific TTPs and attack campaigns. When a scan identifies an exposed API, the platform determines if the exposure matches known high-risk attack patterns, allowing teams to prioritize remediation.
Risk correlation via the Wiz Security Graph: Wiz correlates OSINT findings with internal context (including IAM privileges, data sensitivity, and network paths) to identify exploitable risks. This process applies the intelligence cycle at machine speed to distinguish between benign instances and critical vulnerabilities.
Attack path analysis: Wiz maps the relationship between OSINT-discovered exposures and critical assets. This converts raw data into a prioritized remediation sequence based on the potential impact on the cloud environment.
Integrated threat detection (Wiz Defend): OSINT and threat intelligence are integrated into detection rules across the Wiz platform. When new technical indicators emerge, detection rules update automatically, shortening the path from intelligence to environment protection.
Wiz Threat Center: Wiz alerts when newly disclosed vulnerabilities affect specific environment assets, providing technical context on exploitability and exposure.

OSINT provides the raw reconnaissance, while threat intelligence provides the context required to prioritize findings. Wiz connects external reconnaissance with internal cloud context to identify exactly which exposures lead to critical assets. This ensures your team focuses on exploitable risk rather than theoretical exposure.

Ready to see how the Wiz Security Graph prioritizes your external footprint? Schedule a demo today.
