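About truffleHog
truffleHog is a secret-scanning tool that searches Git repositories for high-entropy strings and other sensitive data, so that researchers can use the findings to improve the security of their own codebases. It works by analyzing the commit history and branches of the target repository to uncover secrets that may have been committed.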
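How it works
The tool walks the entire commit history of every branch in the target Git repository, examines every diff of every commit, and checks it for possible secrets. Detection relies on both regular expressions and entropy. For the entropy check, truffleHog evaluates the Shannon entropy of every text block longer than 20 characters in each diff, against both the base64 character set and the hexadecimal character set. Whenever a high-entropy string longer than 20 characters is detected, the relevant data is printed to the screen.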
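The entropy check described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's actual source: the function names are hypothetical, and the thresholds (4.5 for base64, 3.0 for hexadecimal) are assumptions chosen for the example, since the text above only states the 20-character minimum.

```python
import math

BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
HEX_CHARS = "1234567890abcdefABCDEF"


def shannon_entropy(data, charset):
    """Shannon entropy (bits per character) of data, counting only charset symbols."""
    if not data:
        return 0.0
    entropy = 0.0
    for ch in charset:
        p = data.count(ch) / len(data)
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def candidate_strings(word, charset, min_length=20):
    """Yield runs of charset characters that are longer than min_length."""
    run = ""
    for ch in word:
        if ch in charset:
            run += ch
        else:
            if len(run) > min_length:
                yield run
            run = ""
    if len(run) > min_length:
        yield run


def find_high_entropy(text, b64_threshold=4.5, hex_threshold=3.0):
    """Return long strings in text whose entropy exceeds the (assumed) thresholds."""
    findings = []
    for word in text.split():
        for s in candidate_strings(word, BASE64_CHARS):
            if shannon_entropy(s, BASE64_CHARS) > b64_threshold:
                findings.append(s)
        for s in candidate_strings(word, HEX_CHARS):
            if shannon_entropy(s, HEX_CHARS) > hex_threshold:
                findings.append(s)
    return findings
```

A long run of one repeated character has zero entropy and is ignored, while a long string of mostly distinct characters scores high and is reported.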
Installation
The tool is written in Python, so it can be installed with pip:

    pip install truffleHog
Custom configuration
Custom regular expressions can be supplied with the "--rules /path/to/rules" option. The rules are given as a JSON file in the following format:

    {
        "RSA private key": "-----BEGIN EC PRIVATE KEY-----"
    }

A JSON file in the same style can be passed with "--allow" to explicitly allow particular matches, for example:

    {
        "local self signed test key": "-----BEGIN EC PRIVATE KEY-----\nfoobar123\n-----END EC PRIVATE KEY-----",
        "git cherry pick SHAs": "regex:Cherry picked from .*"
    }

Note that earlier versions of truffleHog ran entropy checks on Git diffs. The current version still does, but it adds high-signal regular-expression checks as well as the ability to suppress the entropy check:

    trufflehog --regex --entropy=False https://github.com/dxa4481/truffleHog.git
    trufflehog file:///user/dxa4481/codeprojects/truffleHog/

With the "--include_paths" and "--exclude_paths" options, scanning can also be limited to a subset of the objects in the Git history. Each option takes a file of regular expressions (one per line) that are matched against object paths. Sample pattern files are shown below.

include-patterns.txt:

    src/
    # lines beginning with "#" are treated as comments and are ignored
    gradle/
    # regexes must match the entire path, but can use python's regex syntax for
    # case-insensitive matching and other advanced options
    (?i).*\.(properties|conf|ini|txt|y(a)?ml)$
    (.*/)?id_[rd]sa$

exclude-patterns.txt:

    (.*/)?\.classpath$
    .*\.jmx$
    (.*/)?test/(.*/)?resources/

These filter files can then be applied with the following command:

    trufflehog --include_paths include-patterns.txt --exclude_paths exclude-patterns.txt file://path/to/my/repo.git

With these filters in place, the tool reports issues found in files under root-level directories of the repository such as src/, unless the path also matches one of the exclude patterns. More usage information is available through the "-h" or "--help" option.

Help output

    usage: trufflehog [-h] [--json] [--regex] [--rules RULES] [--allow ALLOW]
                      [--entropy DO_ENTROPY] [--since_commit SINCE_COMMIT]
                      [--max_depth MAX_DEPTH]
                      git_url

    Find secrets hidden in the depths of git.

    positional arguments:
      git_url               URL for secret searching

    optional arguments:
      -h, --help            show this help message and exit
      --json                Output in JSON
      --regex               Enable high signal regex checks
      --rules RULES         Ignore default regexes and source from json list file
      --allow ALLOW         Explicitly allow regexes from json list file
      --entropy DO_ENTROPY  Enable entropy checks
      --since_commit SINCE_COMMIT
                            Only scan from a given commit hash
      --branch BRANCH       Scans only the selected branch
      --max_depth MAX_DEPTH
                            The max commit depth to go back when searching for
                            secrets
      -i INCLUDE_PATHS_FILE, --include_paths INCLUDE_PATHS_FILE
                            File with regular expressions (one per line), at
                            least one of which must match a Git object path in
                            order for it to be scanned; lines starting with "#"
                            are treated as comments and are ignored. If empty or
                            not provided (default), all Git object paths are
                            included unless otherwise excluded via the
                            --exclude_paths option.
      -x EXCLUDE_PATHS_FILE, --exclude_paths EXCLUDE_PATHS_FILE
                            File with regular expressions (one per line), none
                            of which may match a Git object path in order for it
                            to be scanned; lines starting with "#" are treated
                            as comments and are ignored. If empty or not
                            provided (default), no Git object paths are excluded
                            unless effectively excluded via the --include_paths
                            option.
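The include/exclude semantics spelled out in the help text (at least one include pattern must match, and no exclude pattern may match) can be sketched as follows. This is an illustrative approximation, not the tool's own code: the function names are hypothetical, and matching from the start of the path with `re.match` is an assumption made for this example.

```python
import re


def load_patterns(lines):
    """Compile one regex per non-empty line, skipping lines that start with '#'."""
    patterns = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(re.compile(line))
    return patterns


def path_included(path, include_patterns=None, exclude_patterns=None):
    """Return True if path passes the include filter and is not excluded."""
    # With an include list, at least one pattern must match the path.
    if include_patterns and not any(p.match(path) for p in include_patterns):
        return False
    # With an exclude list, no pattern may match the path.
    if exclude_patterns and any(p.match(path) for p in exclude_patterns):
        return False
    return True


# The sample pattern files, inlined as lists of lines.
INCLUDE = load_patterns([
    "src/",
    '# lines beginning with "#" are treated as comments and are ignored',
    "gradle/",
    r"(?i).*\.(properties|conf|ini|txt|y(a)?ml)$",
    r"(.*/)?id_[rd]sa$",
])
EXCLUDE = load_patterns([
    r"(.*/)?\.classpath$",
    r".*\.jmx$",
    r"(.*/)?test/(.*/)?resources/",
])

if __name__ == "__main__":
    for candidate in ["src/main/App.java", "src/.classpath", "docs/readme.md"]:
        print(candidate, path_included(candidate, INCLUDE, EXCLUDE))
```

When neither list is supplied, every path is scanned, which mirrors the defaults described in the help output.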
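To run truffleHog through Docker, first change into the directory that contains the target Git repository.
Then launch truffleHog from its Docker image with the following command: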

    docker run --rm -v "$(pwd):/proj" dxa4481/trufflehog file:///proj

The "-v" option mounts the current working directory (pwd) at /proj inside the Docker container, and "file:///proj" refers to that "/proj" directory inside the container.
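For scripted use, the same Docker invocation can be assembled programmatically. The sketch below only builds the argument list for the command shown above; the helper name `build_docker_command` is hypothetical, and actually running the command assumes Docker is installed and the image is available.

```python
import os
import shlex
from typing import List


def build_docker_command(repo_dir: str, image: str = "dxa4481/trufflehog") -> List[str]:
    """Build the argv that scans repo_dir with the truffleHog Docker image."""
    host_path = os.path.abspath(repo_dir)
    return [
        "docker", "run", "--rm",
        "-v", f"{host_path}:/proj",  # mount the repository at /proj in the container
        image,
        "file:///proj",              # URL of the mounted repository inside the container
    ]


if __name__ == "__main__":
    cmd = build_docker_command(".")
    # Print the shell-quoted command; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Passing an argument list instead of a shell string avoids quoting problems when the repository path contains spaces.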
Usage example

Project URL
https://github.com/trufflesecurity/truffleHog