Site accessible with multiple URLs


I'm using the micro framework Silex on my website hosted on a VPS.

The site files are in the /site_name/public_html/ folder but, with Silex, requests must be served from the /site_name/public_html/web/ folder.

In the public_html directory, I have the following .htaccess file:

Options -Indexes -MultiViews

<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /

    # Redirect to https & www
    RewriteCond %{HTTPS} off [OR]
    RewriteCond %{HTTP_HOST} !^www\. [NC]
    RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L,NE]

    # Redirect incoming URLs to web folder
    RewriteCond %{REQUEST_URI} !web/
    RewriteRule (.*) /web/$1 [L]
</IfModule>

And, in the /public_html/web/ folder, the following .htaccess file:

<IfModule mod_rewrite.c>
    # Redirect incoming URLs to index.php
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^ index.php [QSA,L]
</IfModule>

Everything works fine, but my pages are accessible through three different URL patterns:

  1. example.com/page/ (the one I want to keep)
  2. example.com/web/page/
  3. example.com/web/index.php/page/

I have added a canonical link tag to avoid duplicate content, but I still want the last two URLs to stop being reachable.

I guess I have something to change in both .htaccess files but I can't find what it is.


There is 1 solution below

Answered by MrWhite

I would actually remove the .htaccess file in the /web subdirectory altogether and rewrite directly to /web/index.php in the root .htaccess file. By having two .htaccess files you are seemingly creating extra work. The mod_rewrite directives in the subdirectory will completely override the parent directives (by default), so your canonical HTTPS and www redirects are also being overridden.

(Presumably you had a RewriteEngine On directive in the /web/.htaccess file?)

Having removed the /web/.htaccess file, try something like the following in your root .htaccess file:

Options -Indexes -MultiViews

RewriteEngine On
RewriteBase /web

# Redirect to https & www
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=302,L,NE]

# If /web or /index.php is present at the start of the requested URL then remove it (via redirect)
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(?:web|index\.php)/(.*) /$1 [R=302,L]

# Front-controller...
# Internally rewrite all requests to /web/index.php (uses RewriteBase set above)
RewriteRule index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ index.php [L]

The check against the REDIRECT_STATUS environment variable ensures we only test initial requests and not requests that have been later rewritten.

The <IfModule> wrapper is not required, unless your site is intended to work without mod_rewrite.

Note that a request like /web/index.php/page/ would result in two redirects: first to /index.php/page/, then to /page/. Since this is an edge case I would consider a double redirect to be acceptable.
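To make the redirect chain concrete, here is a rough Python simulation of the prefix-stripping redirect rule only. It is a sketch, not Apache itself: the function name redirect_chain is made up for illustration, it assumes every hop arrives as a fresh initial request (so the REDIRECT_STATUS condition passes), and it ignores the HTTPS/www redirect and the front-controller rewrite. The pattern is copied from the rule above and behaves the same in Python's re module for these inputs.

```python
import re

# Pattern copied from the redirect rule. In a per-directory (.htaccess)
# context, mod_rewrite matches against the URL-path without its leading slash.
STRIP_PREFIX = re.compile(r"^(?:web|index\.php)/(.*)")


def redirect_chain(path):
    """Return the paths a browser is redirected through, in order.

    Hypothetical helper: `path` is assumed to start with "/". Each loop
    iteration models one external redirect followed by a new request.
    """
    hops = []
    while True:
        match = STRIP_PREFIX.match(path[1:])  # drop the leading slash
        if match is None:
            return hops  # no redirect; the front controller would take over
        path = "/" + match.group(1)  # substitution "/$1"
        hops.append(path)


if __name__ == "__main__":
    for url in ("/page/", "/web/page/", "/web/index.php/page/"):
        print(url, "->", redirect_chain(url))
```

Running it shows that /page/ is not redirected, /web/page/ needs one redirect, and /web/index.php/page/ needs two, which matches the behaviour described above.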

UPDATE: I've removed the "directory" check in the above as this would have prevented the document root (example.com/) from being rewritten to the /web subdirectory. This would have consequently resulted in a 403 if you didn't have a directory index document (e.g. index.php) in the document root of your site. (However, requests for example.com/page/ should still have worked OK.)

Test with 302 (temporary) redirects and only change to 301 (permanent) once you are sure everything is working OK, since browsers cache 301 redirects and a wrong one can be hard to undo. Be sure to clear the browser cache before testing.