Sensitive data discovery
Hunt credential stores, secret material, and high-risk file types tenant-wide across SharePoint, OneDrive, and (via keyword search) Exchange Online. Our tool of choice is Microsoft Purview eDiscovery: it indexes mailboxes and sites org-wide and lets us query them with KeyQL and built-in sensitive-information types.
Purview eDiscovery
Prereqs: an account that is an eDiscovery Manager or eDiscovery Administrator (or holds a custom role with search/export rights). Credential-related sensitive information types (SITs), e.g. All credentials, General password, Client secret / API key, require advanced classification and typically E5 / premium eDiscovery licensing. Confirm licensing before relying on SensitiveType for secrets.
Scope note: SensitiveType:"..." matches classified content on SharePoint / OneDrive (indexed documents). It does not match SITs inside mailbox or Teams chat bodies in the same way, so combine it with keyword queries for mailboxes. Stay within the rules of engagement (ROE) for search, export, and retention.
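One way to keep that split explicit is to assemble a separate query string per workload. The sketch below is illustrative only: quote_term, keyword_query, and queries_by_workload are hypothetical helper names, the workload labels are arbitrary, and the code only builds strings to paste into the query box. It does not call Purview.

```python
def quote_term(term: str) -> str:
    """Wrap multi-word phrases in double quotes so they are searched as one phrase."""
    return f'"{term}"' if " " in term else term


def keyword_query(terms: list[str]) -> str:
    """OR-join keywords for matching on item content and metadata."""
    return " OR ".join(quote_term(t) for t in terms)


def queries_by_workload(sit_name: str, terms: list[str]) -> dict[str, str]:
    """Sites get the SIT-based query; mailboxes fall back to plain keywords."""
    return {
        "sharepoint_onedrive": f'SensitiveType:"{sit_name}"',
        "exchange": keyword_query(terms),
    }


if __name__ == "__main__":
    # Example inputs only; adjust the SIT name and keywords to the engagement.
    queries = queries_by_workload(
        "All credentials", ["password", "client secret", "api_key"]
    )
    for workload, query in queries.items():
        print(f"{workload}: {query}")
```

Run each query as its own search with only the matching locations selected, so hit counts per workload stay easy to interpret.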
Run a content search
- In the Purview portal, go to eDiscovery and open the system Content Search.
- Create a new search and add locations (all users, specific users, sites, or org-wide SharePoint/OneDrive/Exchange as needed).
- In the query box, use KeyQL (keyword query language) and/or the condition builder.
- Run the search, then review Sample (hit counts, locations, size).
Downloading arbitrary files
Direct download works for common file types (e.g. .docx, .csv, email): select the item and use Download.
When Download is missing (e.g. .kdbx, other non-preview types), pull the file via the preview API:
- Open the item in preview anyway, then open DevTools and switch to the Network tab.
- Find the GetPreviewInfo request and copy DocumentId from the JSON response.
- Download any file that does offer the Download option, then use Copy as cURL on that request (the one to /api/DocumentPreview/DownloadDocument).
- In that cURL command, replace the documentId query value with the DocumentId copied from the GetPreviewInfo response, then run it with -o <filename>.
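The ID swap in the last step can also be scripted instead of edited by hand. This is a minimal sketch using only the Python standard library; swap_document_id is a hypothetical helper, the URLs and ID in the example are placeholders, and matching the parameter name case-insensitively is an assumption. It rewrites the URL only: the headers and cookies from the copied cURL command still have to be sent with the request.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def swap_document_id(download_url: str, document_id: str) -> str:
    """Return download_url with its documentId query value replaced."""
    parts = urlsplit(download_url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if not any(key.lower() == "documentid" for key, _ in pairs):
        raise ValueError("URL has no documentId query parameter")
    swapped = [
        (key, document_id if key.lower() == "documentid" else value)
        for key, value in pairs
    ]
    # urlencode percent-encodes the new value, so special characters are safe.
    return urlunsplit(parts._replace(query=urlencode(swapped)))


if __name__ == "__main__":
    copied = (
        "https://purview.microsoft.com/api/DocumentPreview/"
        "DownloadDocument?documentId=ID_OF_DOWNLOADABLE_FILE"
    )
    print(swap_document_id(copied, "ID_FROM_GETPREVIEWINFO"))
```

Paste the printed URL back into the copied cURL command in place of the original one.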
# The download request should look something like this
# (keep the headers and cookies from the copied request; they are omitted here)
curl 'https://purview.microsoft.com/api/DocumentPreview/DownloadDocument?documentId=<DocumentId>' \
  -X POST \
  -o file.kdbx

Query patterns
Credential / secret files (extension-based)
Target vaults, keys, and common leak formats on sites. In keyword queries, filetype: supports prefix/wildcard matching; in the GUI File type condition, list each extension explicitly, because doc* does not match docx there.
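Long OR chains are easy to mistype, so it can help to generate them from lists. The sketch below is one possible approach, not part of Purview: any_of and all_of are hypothetical helpers, and the name patterns assume prefix-only wildcards (a trailing *). The extension lists mirror the queries in this section.

```python
def any_of(prop: str, values: list[str]) -> str:
    """OR-join property:value pairs, e.g. filetype:kdbx OR filetype:pem."""
    return " OR ".join(f"{prop}:{value}" for value in values)


def all_of(*groups: str) -> str:
    """AND-join groups, parenthesising each so OR stays inside its group."""
    return " AND ".join(f"({group})" for group in groups)


SECRET_EXTENSIONS = ["kdbx", "pem", "pfx", "p12", "key", "ppk", "ovpn", "rdp"]
NAME_PREFIXES = ["password*", "secret*", "credential*"]
GENERIC_EXTENSIONS = ["txt", "csv", "xlsx", "json", "xml", "env"]

# Dedicated secret formats: any hit is interesting on its own.
vault_query = any_of("filetype", SECRET_EXTENSIONS)

# Generic formats: only interesting when the file name looks suspicious.
named_query = all_of(
    any_of("filename", NAME_PREFIXES),
    any_of("filetype", GENERIC_EXTENSIONS),
)

if __name__ == "__main__":
    print(vault_query)
    print(named_query)
```

Keeping the lists in one place makes it straightforward to add an extension once and regenerate both queries.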
filetype:kdbx OR filetype:pem OR filetype:pfx OR filetype:p12 OR filetype:key OR filetype:ppk OR filetype:ovpn OR filetype:rdp

(filename:password* OR filename:secret* OR filename:credential*) AND (filetype:txt OR filetype:csv OR filetype:xlsx OR filetype:json OR filetype:xml OR filetype:env)

Keyword hunts (content in body/metadata)
password OR passwd OR pwd OR secret OR "api key" OR apikey OR api_key OR connectionstring OR "client secret" OR privatekey OR "BEGIN RSA PRIVATE KEY"

Externally shared sensitive data
ViewableByExternalUsers:true AND SensitiveType:"All Credentials"

ViewableByExternalUsers:true AND filetype:kdbx

References
- Find sensitive data on sites: SensitiveType syntax and examples.
- Keyword queries and search conditions: filetype, conditions, limitations.
- Sensitive information type definitions: full SIT list (API keys, tokens, passwords).
- eDiscovery workflow: cases, holds, review sets.