For this given input:
How to get This line that this word repeated 3 times in THIS line?
But not this line which is THIS word repeated 2 times.
And I will get This line with this here and This one
A test line with four this and This another THIS and last this
I want this output:
How to get This line that this word repeated 3 times in THIS line?
And I will get This line with this here and This one
That is, I want every whole line that contains the word "this" exactly three times (case-insensitive match).
In perl, replace "this" with itself case-insensitively and count the number of replacements:
Using a count of matches instead:
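Both counting approaches can be sketched in Python with the re module. This is only an illustration of the idea, not the perl one-liners themselves, and the function names are made up for the example.

```python
import re


def count_by_substitution(line, word="this"):
    """Replace each case-insensitive match of `word` with itself
    and return how many replacements were made."""
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    # subn returns (new_string, number_of_substitutions); only the count matters here.
    _, replacements = pattern.subn(lambda match: match.group(0), line)
    return replacements


def count_by_matching(line, word="this"):
    """Count case-insensitive matches of `word` without modifying the line."""
    return len(re.findall(re.escape(word), line, flags=re.IGNORECASE))
```

A line qualifies when either function returns 3 for it.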
If you have GNU awk, a very simple way:
The number of fields will be one more than the number of separators.
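The same field-counting idea can be sketched in Python, assuming re.split stands in for awk's field splitting. The function name and its parameters are illustrative only.

```python
import re


def has_n_occurrences(line, word="this", n=3):
    """Split the line on `word` (ignoring case) and compare the field count.

    n separators always produce n + 1 fields, so exactly three
    occurrences of the word means exactly four fields.
    """
    fields = re.split(re.escape(word), line, flags=re.IGNORECASE)
    return len(fields) == n + 1
```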
In python, this would do the job:
outputs:
Or to read in from a file, with the file as argument:
Paste the script into an empty file, save it as find_3.py, and run it with the command:
Of course, the word "this" can be replaced by any other word (or other string or line section), and the number of occurrences per line can be set to any other value:
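A minimal sketch of what such a parameterised script might look like, assuming plain substring matching. The names matching_lines, matching_lines_in_file, word and count are illustrative and not taken from the original script.

```python
def matching_lines(text, word="this", count=3):
    """Return the lines of `text` that contain `word` exactly `count` times,
    ignoring case."""
    needle = word.lower()
    return [
        line
        for line in text.splitlines()
        if line.lower().count(needle) == count
    ]


def matching_lines_in_file(path, word="this", count=3):
    """Same filter, applied to a file that is read into memory in one go."""
    with open(path) as source:
        return matching_lines(source.read(), word, count)
```

Calling matching_lines_in_file with a path taken from sys.argv would give the "file as argument" behaviour; changing word or count selects a different string or a different number of occurrences.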
Edit
If the file is large (hundreds of thousands or millions of lines), the code below is faster: it reads the file line by line instead of loading it all at once:
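One way to sketch the line-by-line variant is a generator that accepts any iterable of lines, such as an open file object. The name iter_matching_lines is hypothetical, and no timing claims are made for this sketch.

```python
def iter_matching_lines(lines, word="this", count=3):
    """Lazily yield the lines containing `word` exactly `count` times.

    `lines` can be an open file: iterating over it reads one line at a
    time, so the whole file never has to be held in memory.
    """
    needle = word.lower()
    for line in lines:
        if line.lower().count(needle) == count:
            yield line.rstrip("\n")
```

Typical use would be to open the file in a with block and loop over iter_matching_lines(source), printing each result as it arrives.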
Assuming your source file is tmp.txt,
The left grep outputs all lines that do not have 4 or more case-insensitive occurrences of "this" in tmp.txt.
The result is piped to the right grep, which outputs all lines with 3 or more occurrences in the left grep result.
Update: Thanks to @Muru, here is a better version of this solution:
For exactly n occurrences, replace 4 with n+1 and 3 with n.
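The two-stage filter can be sketched in Python, with re standing in for grep. The helper names are hypothetical and the pattern is an assumption about the shape of the grep expressions, which are not reproduced here.

```python
import re


def at_least(n, word):
    """Compile a pattern matching lines with n or more occurrences of `word`."""
    return re.compile("(?:.*" + re.escape(word) + "){" + str(n) + "}", re.IGNORECASE)


def exactly_n(lines, word="this", n=3):
    """Keep lines with exactly n occurrences by chaining two filters."""
    too_many = at_least(n + 1, word)
    enough = at_least(n, word)
    # Stage one (the "left grep"): drop lines with n + 1 or more occurrences.
    survivors = [line for line in lines if not too_many.search(line)]
    # Stage two (the "right grep"): keep lines with at least n occurrences.
    return [line for line in survivors if enough.search(line)]
```

A line that survives stage one has at most n occurrences, and stage two requires at least n, so only lines with exactly n remain.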
You can play a bit with awk for this:
This returns:
Explanation
What we do is set the field separator to "this" itself. This way, the line will have one more field than the number of times the word "this" appears.
To make it case-insensitive, we use IGNORECASE = 1. See reference: Case Sensitivity in Matching.
Then, it is just a matter of saying NF==4 to get all lines that contain "this" exactly three times. No more code is needed, since {print $0} (that is, print the current line) is the default behaviour of awk when an expression evaluates to true.
Assuming the lines are stored in a file named FILE:
If you're in Vim:
This will just print matched lines.
Ruby one-liner solution:
Works in a quite simple fashion: we redirect the file into ruby's stdin, ruby reads a line from stdin, cleans it up with chomp and downcase, and scan().count gives us the number of occurrences of a substring.