I have a string like the following:
test.de. 1547 IN SOA ns1.test.de. dnsmaster.test.de. 2012090701 900 1000 6000 600
Now I want to replace all the tabs/spaces in between the records with just a single space so I can easily use it with cut -d " "
I tried the following:
sed "s/[\t[:space:]]+/[:space:]/g"
and various variations but couldn't get it working. Any ideas?
Use
sed -e "s/[[:space:]]\+/ /g"
Here's an explanation:
For your replacement, you only want to insert a space. [:space:] won't work there since that's an abbreviation for a character class and the regex engine wouldn't know what character to put there.
The + must be escaped in the regex because with sed's regex engine + is a normal character whereas \+ is a metacharacter for 'one or more'. On page 86 of Mastering Regular Expressions, Jeffrey Friedl mentions in a footnote that ed and grep used escaped parentheses because "Ken Thompson felt regular expressions would be used to work primarily with C code, where needing to match raw parentheses would be more common than backreferencing." I assume that he felt the same way about the plus sign, hence the need to escape it to use it as a metacharacter. It's easy to get tripped up by this.

In sed you'll need to escape +, ?, |, (, and ), or use -r to use extended regex. In that case the + must be left unescaped, since \+ would then match a literal plus sign:
sed -r -e "s/[[:space:]]+/ /g"
or
sed -re "s/[[:space:]]+/ /g"
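To make the difference between the two syntaxes concrete, here is a small sketch run against a made-up line (the sample data and variable names are illustrative only). It assumes GNU sed: \+ in basic regex is a GNU extension, and -E is used here as the widely supported spelling of -r.

```shell
# Sample record with a mix of tabs and runs of spaces between fields.
line=$(printf 'test.de.\t\t1547  IN\tSOA   ns1.test.de.')

# Basic regex (sed's default): the plus is escaped to mean "one or more".
bre=$(printf '%s\n' "$line" | sed -e 's/[[:space:]]\+/ /g')

# Extended regex: the plus is a metacharacter on its own, so no backslash.
ere=$(printf '%s\n' "$line" | sed -E -e 's/[[:space:]]+/ /g')

# With single spaces between fields, cut can pick a column reliably.
field=$(printf '%s\n' "$ere" | cut -d ' ' -f 2)

printf '%s\n%s\n%s\n' "$bre" "$ere" "$field"
```

Both sed calls should produce the same single-spaced line, and cut then returns the second field.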
You can use the -s ("squeeze") option of tr. The [:blank:] character class comprises both spaces and tabs.

Here are some interesting methods I found via experiments (using xxd to see tabs).
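A minimal sketch of the tr approach, with od -c standing in for xxd to make any remaining tabs visible. The sample line and variable names are made up for illustration.

```shell
# Sample input containing tabs and repeated spaces.
line=$(printf 'test.de.\t\t1547  IN\tSOA')

# Translate every blank (space or tab) to a space, then squeeze
# repeated spaces down to one (-s applies to the translated output).
squeezed=$(printf '%s\n' "$line" | tr -s '[:blank:]' ' ')

# Dump the result character by character; a tab would show up as \t.
printf '%s\n' "$squeezed" | od -c
```

If the squeeze worked, the dump should show only single spaces between the fields and no \t entries.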
I like using the following alias for bash. Building on what others wrote, use sed to search and replace multiple spaces with a single space. This helps get consistent results from cut. At the end, I run it through sed one more time to change spaces to tabs so that it's easier to read.
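As a rough sketch of the pipeline described above, assuming a shell function rather than an alias so that the field list can be passed as an argument. The name squeeze_cols and the sample input are hypothetical.

```shell
# Hypothetical helper: squeeze_cols FIELDS
# 1) collapse whitespace runs to a single space,
# 2) select the requested fields with cut,
# 3) turn the remaining spaces into tabs for readability.
squeeze_cols() {
  sed -E 's/[[:space:]]+/ /g' \
    | cut -d ' ' -f "$1" \
    | sed "s/ /$(printf '\t')/g"
}

# Example: keep fields 2 and 3 of a messy line.
out=$(printf 'a  \t b\t\tc   d\n' | squeeze_cols 2-3)
printf '%s\n' "$out"
```

The literal tab is produced with printf so the last sed call does not depend on sed understanding \t in the replacement.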