Remove Multiple Query Strings From Multiple URLs with .htaccess


I would like to remove all query strings including parameters and values from URLs with htaccess rules.

Here are some example URLs. The query string needs to be removed from the end of each one.

https://example.com/other-category-slug/page/15/?orderby=price-desc&add_to_wishlist=342 
https://example.com/page/62/?option=com_content&view=article&id=91&Itemid=2 
https://example.com/page/30/?start=72 
https://example.com/other-category-slug/page/12/?add_to_wishlist=9486  
https://example.com/other-category-slug/page/15/?add_to_wishlist=9486 
https://example.com/other-category-slug/page/4/?orderby=price-desc&add_to_wishlist=332 
https://example.com/other-category-slug/page/15/?orderby=price-desc&add_to_wishlist=5736 
https://example.com/other-category-slug/page/7/?orderby=popularity
https://example.com/other-category-slug/page/15/?add_to_wishlist=350 
https://example.com/category-slug/page/19/?orderby=price-desc 
https://example.com/category-slug/page/3/?orderby=date 
https://example.com/page/2/?post_type=map 
https://example.com/category-slug/page/2/?PageSpeed=noscript 
https://example.com/category/page/6/?orderby=menu_order 
https://example.com/page/50/?Itemid=wzshaxrogq 
https://example.com/category-slug/page/1/?orderby=price&add_to_wishlist=12953 
https://example.com/category-slug/this-is-product-slug/?PageSpeed=noscript 
https://example.com/category-slug/?add_to_wishlist=15153 
https://example.com/page/24/?op 
https://example.com/page/68/?iact=hc&vpx=262&vpy=212&dur=2871&hovh=259&hovw=194&tx=104&ty=131&ei=KJ05TtKZOoi8rAfM2ZmPBQ&page=1&tbnh=129&tbnw=97&start=0&ndsp=35&ved=1t%3A429%2Cr%3A9%2Cs%3A0&doing_wp_cron=1466467271.7778379917144775390625

I need clean URLs like these, without query strings and parameters. category-slug and product-slug are just examples. I believe I need 5 rules.

https://example.com/category-slug/product-slug/
https://example.com/category-slug/page/15/
https://example.com/category-slug/
https://example.com/page/62/
https://example.com/

Here are a few query strings which I want to keep.

https://example.com/?attachment_id=123
https://example.com/?p=123
https://example.com/page/12/?fbclid=PAAaaK8eCN
https://example.com/your-shopping-cart/?remove_item=22c1acb3539e1aeba2
https://example.com/category-slug/this-is-product-slug/?add-to-cart=29030
https://example.com/?s=%7Bsearch_term_string%7D

Here is my code, which is not working. In fact, I don't understand the Regex in it.

RewriteEngine On
RewriteRule ^(page/[0-9]+)/.+$ /$1? [L,NC,R=301]
RewriteCond %{QUERY_STRING} ^option=.+$ [NC,OR]
RewriteCond %{QUERY_STRING} ^[^=]+$
RewriteRule ^$ /? [L,NC,R=301]

Thanks in advance

Answer by MrWhite

From the comments: "Yes, query strings are exact match"

Whilst you've given examples of the URL-path, it looks like you just need to base the match on the query string part of the URL, not the URL-path? Unless the same query string could appear on another URL-path that you would want to keep?

You would only need to focus on the query strings you want to remove, not the ones you want to keep.

You wrote: "I believe I need 5 rules."

It looks like you would need just one rule, but with a lot of conditions (RewriteCond directives). One condition for every query string (since you say they are "exact matches").

RewriteCond %{QUERY_STRING} ^option=.+$ [NC,OR]
RewriteCond %{QUERY_STRING} ^[^=]+$

Rather confusingly, the conditions in your existing rule (quoted above) do not attempt an "exact match" at all, but use generic patterns instead. (Although you have stated that you "don't understand the Regex".)

If you are wanting "exact matches" then you don't need to use regex at all. You can use the = prefix operator on the CondPattern (2nd argument to the RewriteCond directive) to make it an exact (lexicographical) match.

For example, try something like the following instead:

RewriteEngine On

RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=342 [OR]
RewriteCond %{QUERY_STRING} =option=com_content&view=article&id=91&Itemid=2 [OR]
RewriteCond %{QUERY_STRING} =start=72 [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=9486 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=332 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=5736 [OR]
RewriteCond %{QUERY_STRING} =orderby=popularity [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=350 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc [OR]
RewriteCond %{QUERY_STRING} =orderby=date [OR]
RewriteCond %{QUERY_STRING} =post_type=map [OR]
RewriteCond %{QUERY_STRING} =PageSpeed=noscript [OR]
RewriteCond %{QUERY_STRING} =orderby=menu_order [OR]
RewriteCond %{QUERY_STRING} =Itemid=wzshaxrogq [OR]
RewriteCond %{QUERY_STRING} =orderby=price&add_to_wishlist=12953 [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=15153 [OR]
RewriteCond %{QUERY_STRING} =op [OR]
RewriteCond %{QUERY_STRING} =iact=hc&vpx=262&vpy=212&dur=2871&hovh=259&hovw=194&tx=104&ty=131&ei=KJ05TtKZOoi8rAfM2ZmPBQ&page=1&tbnh=129&tbnw=97&start=0&ndsp=35&ved=1t%3A429%2Cr%3A9%2Cs%3A0&doing_wp_cron=1466467271.7778379917144775390625
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]

The above redirects to the same URL-path, but strips the original query string if it matches any of those stated in the preceding conditions.

The QSD flag (Query String Discard) strips the original query string from the request. This is the preferred method on Apache 2.4. However, if you are still on Apache 2.2 then you would need to append an empty query string instead (as you are doing in your existing rule). For example:

RewriteRule ^ %{REQUEST_URI}? [R,L]

Note there is no OR flag on the last RewriteCond directive.

NB: You had included the query strings add_to_wishlist=9486 and PageSpeed=noscript twice in the list of URLs/query strings to remove. Each only needs one condition.

Test first with a 302 (temporary) redirect and only change to a 301 (permanent), if that is the intention, once you have confirmed that it works as intended. 301s are cached persistently by the browser so can make testing problematic.

Make sure the browser cache is cleared before testing.
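
Once the redirect behaves as expected, switching to a permanent redirect should only require changing the status code on the RewriteRule; the preceding conditions stay as they are. A minimal sketch, assuming Apache 2.4 (for the QSD flag):

```apache
# ...the same RewriteCond directives as in the rule above...
RewriteRule ^ %{REQUEST_URI} [QSD,R=301,L]
```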


Combining conditions using regex

Using regex, you could combine several of the conditions. For example, the following 4 conditions could be combined into one:

RewriteCond %{QUERY_STRING} =orderby=popularity [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc [OR]
RewriteCond %{QUERY_STRING} =orderby=date [OR]
RewriteCond %{QUERY_STRING} =orderby=menu_order [OR]

This is the same as the following (using regex alternation):

RewriteCond %{QUERY_STRING} ^orderby=(popularity|price-desc|date|menu_order)$ [OR]
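
The same approach could be applied to the other conditions that differ only in the parameter value. A sketch, assuming the IDs listed in the question are the only ones that need to match:

```apache
# Replaces the three separate "add_to_wishlist=<id>" conditions
RewriteCond %{QUERY_STRING} ^add_to_wishlist=(9486|350|15153)$ [OR]
# Replaces the three "orderby=price-desc&add_to_wishlist=<id>" conditions
RewriteCond %{QUERY_STRING} ^orderby=price-desc&add_to_wishlist=(342|332|5736)$ [OR]
```

As before, the OR flag must be omitted from whichever condition ends up last before the RewriteRule.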

UPDATE:

From the comments: "Is it possible to remove everything (query string and parameters etc.) from all URLs with something like * instead of hardcoding each query string?"

To remove every query string from every URL (seriously?), you can do the following (no, you don't use *):

RewriteCond %{QUERY_STRING} .
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]

This removes any query string from any URL. The single dot (.) in the CondPattern matches a single character to check that there is a query string.

But this obviously removes the query strings you want to "keep" as well.

The regex character * is a quantifier that repeats the preceding token 0 or more times. (It is not a "wildcard-pattern".) It is not required here. You need to check that the query string is something, not nothing.
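
To illustrate the difference, here is a sketch of what would happen if .* were used in place of the single dot. The second rule is deliberately wrong and is commented out:

```apache
# Correct: matches only when the query string contains at least one character
RewriteCond %{QUERY_STRING} .
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]

# Wrong: .* also matches an empty query string, so the condition always
# succeeds and the redirect repeats on the already-clean URL (a redirect loop)
#RewriteCond %{QUERY_STRING} .*
#RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
```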

There are other options:

  • Reverse the logic: make exceptions for the query strings you want to "keep" and remove the rest. Whether this is worthwhile depends on which list is the larger.
  • Don't match the query strings "exactly". Instead, match the URL parameter names, with any value.
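
Both options could be sketched roughly as follows. These are untested outlines that assume the parameter names seen in the question are the complete lists; use one or the other, not both.

For the first option, the query string is stripped unless it contains one of the parameters to keep:

```apache
# Only act when there is a query string at all
RewriteCond %{QUERY_STRING} .
# ...and it does not contain any of the parameter names to keep
RewriteCond %{QUERY_STRING} !(^|&)(attachment_id|p|fbclid|remove_item|add-to-cart|s)=
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
```

Any other query string the site relies on would also be stripped by this rule, so the list of exceptions would need to be complete.

For the second option, the query string is stripped if it contains any of the unwanted parameter names, whatever the value:

```apache
# (^|&) anchors the name to the start of a parameter, so "p=" does not match "op="
RewriteCond %{QUERY_STRING} (^|&)(orderby|add_to_wishlist|option|start|post_type|PageSpeed|Itemid|iact)= [OR]
# "op" appeared without a value in the question, so it is matched exactly
RewriteCond %{QUERY_STRING} =op
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
```

Note that in both cases the entire query string is discarded when the rule matches. With the second option, a URL that carries a wanted parameter alongside an unwanted one would lose both.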