I've been using grep to run a few PII scans, and while it's finding results, it's also finding too many false positives.
Is there a way that I can tell grep not to trigger a match for a file unless it contains other data?
For instance, can I tell it not to trigger an alert on a regex for an SSN unless the file includes text like "ssn" or "social security number" somewhere else in the file?
Sure, just pipe the first output to another grep. Grep for foo and then grep for bar, so your output will only be lines that have both foo AND bar.
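A minimal sketch of the chained approach described above, assuming a POSIX shell; the sample file and its contents are made up for illustration, and foo and bar are just placeholder patterns.

```shell
# Hypothetical sample data, created only for illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 'foo only' 'foo and bar' 'bar only' > "$tmpdir/sample.txt"

# The first grep keeps lines containing foo;
# the second keeps only those that also contain bar.
both=$(grep 'foo' "$tmpdir/sample.txt" | grep 'bar')
printf '%s\n' "$both"

rm -rf "$tmpdir"
```

Note that this filters line by line, so both patterns have to appear on the same line for it to show up in the output.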
This only matches uppercase SSN, then passes any lines that contain either ssn OR social security number in lowercase. Add -i to match any case. If you really want both SSN and ssn, try this:
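If the keyword can appear anywhere in the file rather than on the same line as the number, one possible approach is to check each file for the keyword first and only then run the SSN regex. This is a rough sketch, not the command from the post above: it assumes a POSIX shell, the sample files are hypothetical, and the SSN pattern is a deliberately simplified ddd-dd-dddd guess.

```shell
# Hypothetical sample files, created only for illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 'Employee SSN on file' 'id 123-45-6789' > "$tmpdir/a.txt"
printf '%s\n' 'part number 987-65-4321' > "$tmpdir/b.txt"
printf '%s\n' 'Social Security Number:' '111-22-3333' > "$tmpdir/c.txt"

# Assumed, simplified SSN pattern; a real scan likely needs a stricter one.
ssn_re='[0-9]{3}-[0-9]{2}-[0-9]{4}'

hits=$(
  cd "$tmpdir" || exit 1
  for f in *.txt; do
    # -q: no output, exit status only; -i: ignore case;
    # -E: extended regex, so | means "either keyword".
    if grep -q -i -E 'ssn|social security number' "$f"; then
      # Report SSN-shaped lines only from files that passed the keyword check.
      grep -H -E "$ssn_re" "$f"
    fi
  done
)
printf '%s\n' "$hits"

rm -rf "$tmpdir"
```

Here b.txt contains an SSN-shaped number but no keyword, so it is skipped, while a.txt and c.txt are reported even though the keyword and the number sit on different lines.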
Another one I like is to show only the lines that don't have bar. Grep for foo, then pipe to grep -v bar. You get only the lines with foo that do not also have bar. I'm using GNU grep 2.5.4 on Windows.
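The exclusion variant can be sketched the same way, again assuming a POSIX shell and a made-up sample file with placeholder patterns.

```shell
# Hypothetical sample data, created only for illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 'foo only' 'foo and bar' 'bar only' > "$tmpdir/sample.txt"

# -v inverts the second match: keep foo lines that do NOT contain bar.
without_bar=$(grep 'foo' "$tmpdir/sample.txt" | grep -v 'bar')
printf '%s\n' "$without_bar"

rm -rf "$tmpdir"
```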