Consider an .htaccess file which must convert all underscores to dashes, and replace a potential .html suffix with a slash in the filename.
Example URL from client: http://foo.com/a_b/c_d.html
Example URL to redirect: http://foo.com/a-b/c-d/
I have been using this rule to replace the .html suffix:
RewriteRule ^(.*)\.html$ $1/ [R,L]
I have found this terrific serverfault.SE post for the underscore rewriting:
RewriteRule ^(.*)_(.*)_(.*)_(.*)$ /$1-$2-$3-$4 [R,L]
RewriteRule ^(.*)_(.*)_(.*)$ /$1-$2-$3 [R,L]
RewriteRule ^(.*)_(.*)$ /$1-$2 [R,L]
However, only one of the replacement types happens, whichever is first in the .htaccess file. I cannot seem to configure .htaccess to perform both replacements.
That means that the following code will replace the .html suffix only:
RewriteRule ^(.*)\.html$ $1/ [R,L]
RewriteRule ^(.*)_(.*)_(.*)_(.*)$ /$1-$2-$3-$4 [R,L]
RewriteRule ^(.*)_(.*)_(.*)$ /$1-$2-$3 [R,L]
RewriteRule ^(.*)_(.*)$ /$1-$2 [R,L]
And the following code will replace underscores only:
RewriteRule ^(.*)_(.*)_(.*)_(.*)$ /$1-$2-$3-$4 [R,L]
RewriteRule ^(.*)_(.*)_(.*)$ /$1-$2-$3 [R,L]
RewriteRule ^(.*)_(.*)$ /$1-$2 [R,L]
RewriteRule ^(.*)\.html$ $1/ [R,L]
How must .htaccess be configured to replace both the .html suffix and the underscores?
As mentioned in comments, the directives you posted should already achieve what you require, albeit in two separate redirects. There is nothing you need to do to enable this behaviour in .htaccess; this is just how it works.
It's possible there is some kind of conflict with other directives, but I'm struggling to imagine what kind of conflict could result in the behaviour you are seeing.
However, you could combine these two redirects, so there is only ever one redirect. For example:
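A sketch of such a combined rule set, built from the underscore rules in the question and the pattern pieces described in the notes that follow. It assumes the rules live in the document-root .htaccess, that mod_rewrite is enabled, and that a single URL contains a handful of underscores at most (URLs with more than three are fixed over several redirects). The root-relative substitution in the last rule is also an assumption:

```apache
RewriteEngine On

# Underscores plus an optional trailing slash or .html suffix, in one redirect.
# The last captured group is non-greedy, so the optional suffix is left for
# the non-capturing group and is dropped from the target URL.
RewriteRule ^(.*)_(.*)_(.*)_(.*?)(?:/|\.html)?$ /$1-$2-$3-$4/ [R,L]
RewriteRule ^(.*)_(.*)_(.*?)(?:/|\.html)?$ /$1-$2-$3/ [R,L]
RewriteRule ^(.*)_(.*?)(?:/|\.html)?$ /$1-$2/ [R,L]

# No underscores left: only the .html suffix needs replacing with a slash.
RewriteRule ^(.*)\.html$ /$1/ [R,L]
```

With these rules, a request for /a_b/c_d.html would be caught by the second rule ($1 = a, $2 = b/c, $3 = d) and redirected once to /a-b/c-d/, which matches none of the rules and so is not redirected again.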
This handles both URLs that end with a slash and those that don't (when no .html extension is provided), as mentioned in comments. A trailing slash is always included in the substitution.
Additional notes:
(.*?) - The ? in the last captured group makes the regex non-greedy. This is necessary so as to not capture the trailing slash or .html extension (if any). This is left for the non-capturing group that follows...
(?:/|\.html)? - This is a non-capturing (?: ) group that is optional (trailing ?). Using alternation, it either matches a trailing slash, a trailing .html extension, or nothing at all.