Consider this command:
echo "string.with.dots" | sed 's/\(.*\)\.\(.*\)/\1\n\2/'
(This captures everything up to the last . in the first group and everything after it in the second group.)
This outputs:
string.with
dots
Reasonably (I think), I thought that using anchors in the right combination would reverse this behavior (i.e. the first capturing group would match string and the second capturing group would match with.dots), but:
echo "string.with.dots" | sed 's/^\(.*\)\.\(.*\)/\1\n\2/'
echo "string.with.dots" | sed 's/^\(.*\)\.\(.*\)$/\1\n\2/'
echo "string.with.dots" | sed 's/\(.*\)\.\(.*\)$/\1\n\2/'
All output:
string.with
dots
I don't know how the pattern matching is implemented, but it seems that it always favors the patterns closer to the start of the string over those closer to the end of the string (regardless of a present ^ or a missing $).
How can this behavior be changed (i.e. not how to write a hard-coded solution for this example, but how to reverse the pattern-matching priority order in sed or in regexes in general), if possible?
To get what you want try this:
Test:
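As a sketch of the idea (assuming GNU sed, since \n in the replacement is a GNU extension), replacing the first .* with [^.]* makes the first group stop at the first dot. The function name split_first_dot is made up for illustration:

```shell
# Split a line at its FIRST dot.
# [^.]* cannot cross a dot, so group 1 ends at the first one;
# group 2 then takes everything after that dot.
# Assumes GNU sed, which expands \n in the replacement to a newline.
split_first_dot() {
  sed 's/\([^.]*\)\.\(.*\)/\1\n\2/'
}

echo "string.with.dots" | split_first_dot
# prints:
# string
# with.dots
```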
sed matches greedily, so when you use sed 's/\(.*\)\.\(.*\)/\1\n\2/' the first .* greedily matches up to the last . as the first captured group, and then the rest after that . as the second.
In my sed expression, to stop sed from being greedy, I had to look for an alternative (sed has no non-greedy quantifier). I matched from the start up to the first . as the first group ([^.]*) and then whatever comes after that . as the second.
Now, if you want all portions around . on separate lines:
Add two rev and swap \1 and \2:
Output:
I wonder if you can get away with using bash parameter expansion.
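A minimal sketch of that approach, assuming the string is already in a shell variable (the variable names here are illustrative). The single operators (% and #) remove the shortest match and the doubled ones (%% and ##) remove the longest, which gives explicit control over which dot is used:

```shell
s="string.with.dots"

# Split at the FIRST dot.
first=${s%%.*}   # drop the longest suffix matching ".*"  -> string
rest=${s#*.}     # drop the shortest prefix matching "*." -> with.dots

# Split at the LAST dot.
head=${s%.*}     # drop the shortest suffix matching ".*" -> string.with
tail=${s##*.}    # drop the longest prefix matching "*."  -> dots

echo "$first"
echo "$rest"
echo "$head"
echo "$tail"
```

These patterns are shell globs, not regexes, so the dot is literal and * matches any run of characters.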