Infotx

Welcome to Infotx - Webmaster Guides and Resources.

How do I perform a URL redirect/rewrite using the .htaccess file?

 

.htaccess Redirect/Rewrite Tutorial

Part 1 - How do I redirect all links for www.domain.com to domain.com ?

Description of the problem:
Your website can be reached at both www.domain.com and domain.com. Search engines such as Google may treat the two addresses as duplicate content, so you should pick one of them as the canonical address.
But since other sites already link to both addresses and the search engines have indexed your website under both, you can't simply switch one off.
Solution:
Do a 301 redirect for all HTTP requests that go to the wrong URL.
Example 1 - Redirect domain.com to www.domain.com
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [L,R=301]
Example 2 - Redirect www.domain.com to domain.com
RewriteEngine On
RewriteCond %{HTTP_HOST} !^domain\.com$ [NC]
RewriteRule ^(.*)$ http://domain.com/$1 [L,R=301]
Explanation of this .htaccess 301 redirect
Let's have a look at example 1 - Redirect domain.com to www.domain.com. The first line, RewriteEngine On, switches on the rewrite engine of Apache's mod_rewrite module for the current directory.
The next line, RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC], specifies that the following rule only fires when the HTTP host (the domain of the requested URL) is not www.domain.com. The ^ and $ anchor the pattern to the start and end of the host name, so only the exact host www.domain.com matches. The "!" inverts the match, so every host that is not www.domain.com triggers the rewrite rule. The [NC] flag makes the comparison case insensitive. The backslash (\) escapes the "." because the dot is a special character (normally, the dot (.) matches any single character).
The next - and final - line describes the action that should be executed: RewriteRule ^(.*)$ http://www.domain.com/$1 [L,R=301]. The ^(.*)$ is a little magic trick. Remember the meaning of the dot: any single character. The * means "repeated zero or more times", so .* matches any number of characters. This is what we need, because in a .htaccess file ^(.*)$ captures the requested path without the domain and without the leading slash. The next part, http://www.domain.com/$1, is the target of the rewrite rule - our "final" domain name, where $1 contains whatever (.*) captured. The last part does the 301 redirect for us: [L,R=301]. L means this is the last rule in this run, so no further rewrite rules are processed for this request. R=301 means that the webserver returns a "301 Moved Permanently" response to the requesting browser or search engine.
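If other subdomains (for example a hypothetical shop.domain.com) must keep working, the negated condition in example 1 would redirect them too. A minimal sketch of a narrower variant, assuming the .htaccess file sits in the document root, matches only the bare host:

```apache
RewriteEngine On
# Fire only for the bare host; other subdomains are left alone.
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [L,R=301]
```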

Part 2 - How do I redirect domain.com/ to domain.com/index.php ?

Description of the problem:
You have a website at domain.com and you want to redirect all incoming requests for domain.com/ to domain.com/index.php.
Solution
RewriteEngine On
RewriteCond %{HTTP_HOST} ^domain\.com$
RewriteRule ^$ http://domain.com/index.php [L,R=301]
Explanation of this .htaccess 301 redirect
What does the code above do? The first line, RewriteEngine On, switches on the rewrite engine of Apache's mod_rewrite module for the current directory. The next line, RewriteCond %{HTTP_HOST} ^domain\.com$, specifies that the following rule only fires when the HTTP host (the domain of the requested URL) is exactly domain.com: ^ and $ anchor the pattern to the start and end of the host name, and the backslash (\) escapes the ".", which would otherwise match any single character. Unlike the rules in Part 1 there is no "!" here, so the condition is not inverted. The final line describes the action: RewriteRule ^$ http://domain.com/index.php [L,R=301]. In a .htaccess file the pattern is matched against the requested path without the leading slash, so ^$ (an empty string) matches only a request for domain.com/ itself. The target is http://domain.com/index.php. [L,R=301] works as before: L means this is the last rule in this run, and R=301 means that the webserver returns a "301 Moved Permanently" response to the requesting browser or search engine.
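If the goal is only that a request for domain.com/ is answered by index.php, a redirect is not strictly needed. As a sketch, assuming your host allows the DirectoryIndex directive in .htaccess, the following serves index.php for directory requests without changing the URL in the browser:

```apache
# Serve index.php when a directory (such as /) is requested.
DirectoryIndex index.php
```

Use the redirect shown in the solution when you specifically want visitors and search engines to see /index.php in the address.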

Part 3 - How can I migrate domain content with .htaccess?

Description of the problem:
You have an old website that is accessible under olddomain.com and you have a new website that is accessible under newdomain.com. Copying the content of the old website to the new website is the first step - but what comes after that? You should do a 301 moved permanently redirect from the old domain to the new domain - which is easy and has some advantages:

  • Users will automatically be redirected to the new domain - you don't have to inform them.
  • Search engines will also be redirected to the new domain - and all related information will be moved to the new domain (but this might take some time).
  • Google's PageRank(TM) will be transferred to the new domain, along with other internal information that is used to set the position of pages in the search engine result pages (SERPs) - like TrustRank.

Solution:
Do a 301 redirect for all http requests that are going to the old domain.
Example 1 - Redirect from olddomain.com to www.newdomain.com
RewriteEngine On
RewriteCond %{HTTP_HOST} !newdomain\.com$ [NC]
RewriteRule ^(.*)$ http://www.newdomain.com/$1 [L,R=301]
This is useful when you use www.newdomain.com as your new domain name (see also Part 1 about redirecting www and non-www domains). If not - use the code of example 2.
Example 2 - Redirect from olddomain.com to newdomain.com
RewriteEngine On
RewriteCond %{HTTP_HOST} !newdomain\.com$ [NC]
RewriteRule ^(.*)$ http://newdomain.com/$1 [L,R=301]
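Both examples redirect every host that does not end in newdomain.com. If the old and the new domain share one document root and further domains point at it as well, it may be safer to name the old domain explicitly. A sketch under that assumption:

```apache
RewriteEngine On
# Match olddomain.com with or without the www prefix, nothing else.
RewriteCond %{HTTP_HOST} ^(www\.)?olddomain\.com$ [NC]
RewriteRule ^(.*)$ http://www.newdomain.com/$1 [L,R=301]
```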

Part 4 - Add a trailing slash to requested URLs

Description of the problem:
Some search engines remove the trailing slash from URLs that look like directories - e.g. Yahoo does it. This can lead to duplicate content problems when the same page content is accessible under different URLs. Apache gives some more information in the Apache Server FAQ.

Let's have a look at an example: enarion.net/google/ is indexed in Yahoo as enarion.net/google - which would result in two urls with the same content.
Solution
The solution is to create a .htaccess rewrite rule that adds the trailing slash to these URLs. Example - redirect all URLs that don't have a trailing slash to URLs with a trailing slash
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !example\.php
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://domain.com/$1/ [L,R=301]
Explanation of this add trailing slash .htaccess rewrite rule
The first line tells Apache that this is code for the rewrite engine of the mod_rewrite module of Apache. The interesting part follows: RewriteCond %{REQUEST_FILENAME} !-f makes sure that existing files will not get a slash added. You shouldn't do the same with directories, since this would exclude existing directories from the rewrite behaviour. The line RewriteCond %{REQUEST_URI} !example\.php excludes a sample URL that shouldn't be rewritten. This is just an example - if you don't have any file or URL that shouldn't be rewritten, remove this line. The condition RewriteCond %{REQUEST_URI} !(.*)/$ finally fires when a URL doesn't end with a trailing slash - this is all we want. Now we need to redirect these URLs to the version with the trailing slash: RewriteRule ^(.*)$ http://domain.com/$1/ [L,R=301] does the 301 redirect to the URL with the trailing slash appended. You should replace domain.com with your own domain. Make sure that you stick with the right domain name; if unsure, have a look at Part 1.
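The check for a missing trailing slash can also be folded into the rule pattern itself. The following is a sketch of that variant, not the original author's rule; it assumes the same domain.com placeholder and a .htaccess file in the document root:

```apache
RewriteEngine On
# Skip requests that map to an existing file.
RewriteCond %{REQUEST_FILENAME} !-f
# (.*[^/]) only matches a non-empty path whose last character is not a slash.
RewriteRule ^(.*[^/])$ http://domain.com/$1/ [L,R=301]
```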
This article was referenced with gratitude from enarion.net at the following URL: http://enarion.net/web/apache/htaccess/

 

 

Do hosting companies support mod_rewrite for apache?

 

Most hosting companies support mod_rewrite. Here, mod_rewrite is enabled on all servers.

 

 

.htaccess Tutorial

 

Part 1 - Introduction

Introduction

In this tutorial you will find out about the .htaccess file and the power it has to improve your website. Although .htaccess is only a file, it can change settings on the servers and allow you to do many different things, the most popular being able to have your own custom 404 error pages. .htaccess isn't difficult to use and is really just made up of a few simple instructions in a text file.

Will My Host Support It?

This is probably the hardest question to give a simple answer to. Many hosts support .htaccess but don't actually publicise it and many other hosts have the capability but do not allow their users to have a .htaccess file. As a general rule, if your site is served by the Apache web server (as on most Unix and Linux hosts), it will support .htaccess, although your host may not allow you to use it.

A good sign of whether your host allows .htaccess files is if they support password protection of folders. To do this they will need to offer .htaccess (although in a few cases they will offer password protection but not let you use .htaccess). The best thing to do if you are unsure is to either upload your own .htaccess file and see if it works or e-mail your web host and ask them.

What Can I Do?

You may be wondering what .htaccess can do, or you may have read about some of its uses but don't realise how many things you can actually do with it.

There is a huge range of things .htaccess can do including: password protecting folders, redirecting users automatically, custom error pages, changing your file extensions, banning users with certain IP addresses, only allowing users with certain IP addresses, stopping directory listings and using a different file as the index file.

Creating A .htaccess File

Creating a .htaccess file may cause you a few problems. Writing the file is easy: you just need to enter the appropriate code into a text editor (like Notepad). You may run into problems with saving the file. Because .htaccess is a strange file name (the file actually has no name but an 8-letter file extension) it may not be accepted on certain systems (e.g. Windows 3.1). With most operating systems, though, all you need to do is to save the file by entering the name as:

".htaccess"

(including the quotes). If this doesn't work, you will need to name it something else (e.g. htaccess.txt) and then upload it to the server. Once you have uploaded the file you can then rename it using an FTP program.

Warning

Before you begin using .htaccess, I should give you one warning. Using .htaccess on your server is unlikely to cause you lasting problems (if a directive is wrong, the server will normally answer with a 500 Internal Server Error until you correct or remove the file), but you should be wary if you are using the Microsoft FrontPage Extensions. The FrontPage extensions use the .htaccess file so you should not really edit it to add your own information. If you do want to (this is not recommended, but possible) you should download the .htaccess file from your server first (if it exists) and then add your code to the beginning.

Custom Error Pages

The first use of the .htaccess file which I will cover is custom error pages. These will allow you to have your own, personal error pages (for example when a file is not found) instead of using your host's error pages or having no page. This will make your site seem much more professional in the unlikely event of an error. It will also allow you to create scripts to notify you if there is an error (for example I use a PHP script on Free Webmaster Help to automatically e-mail me when a page is not found).

You can use custom error pages for any error as long as you know its number (like 404 for page not found) by adding the following to your .htaccess file:

ErrorDocument errornumber /file.html

For example if I had the file notfound.html in the root directory of my site and I wanted to use it for a 404 error I would use:

ErrorDocument 404 /notfound.html

If the file is not in the root directory of your site, you just need to put the path to it:

ErrorDocument 500 /errorpages/500.html

These are some of the most common errors:

400 - Bad Request
401 - Authorization Required
403 - Forbidden
404 - Not Found
500 - Internal Server Error

Then, all you need to do is to create a file to display when the error happens and upload it and the .htaccess file.
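Putting this together, a .htaccess file that maps each of the common errors to its own page could look like the sketch below. The /errorpages/ directory and the file names are placeholders; the pages must exist on your server.

```apache
# One custom page per error number; the paths are relative to the site root.
ErrorDocument 400 /errorpages/400.html
ErrorDocument 401 /errorpages/401.html
ErrorDocument 403 /errorpages/403.html
ErrorDocument 404 /errorpages/404.html
ErrorDocument 500 /errorpages/500.html
```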
Part 2 - .htaccess Commands
Introduction

In the last part I introduced you to .htaccess and some of its useful features. In this part I will show you how to use the .htaccess file to implement some of these.

Stop A Directory Index From Being Shown

Sometimes, for one reason or another, you will have no index file in your directory. This will, of course, mean that if someone types the directory name into their browser, a full listing of all the files in that directory will be shown. This could be a security risk for your site.

To prevent this (without creating lots of new 'index' files), you can enter a command into your .htaccess file to stop the directory list from being shown:

Options -Indexes

Deny/Allow Certain IP Addresses

In some situations, you may want to only allow people with specific IP addresses to access your site (for example, only allowing people using a particular ISP to get into a certain directory) or you may want to ban certain IP addresses (for example, keeping disruptive members out of your message boards). Of course, this will only work if you know the IP addresses you want to ban and, as most people on the internet now have a dynamic IP address, this is not always the best way to limit usage.

You can block an IP address by using:

deny from 000.000.000.000

where 000.000.000.000 is the IP address. If you only specify 1 or 2 of the groups of numbers, you will block a whole range.

You can allow an IP address by using:

allow from 000.000.000.000

where 000.000.000.000 is the IP address. If you only specify 1 or 2 of the groups of numbers, you will allow a whole range.

If you want to deny everyone from accessing a directory, you can use:

deny from all

but this will still allow scripts running on the server to use the files in the directory.
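The deny and allow lines are usually combined with an Order directive, and Apache 2.4 replaces all three with Require. As a sketch, assuming you want to admit only one address (192.0.2.10 is a documentation placeholder, not a real visitor), the following covers both generations of the server:

```apache
# Apache 2.4 and later: mod_authz_core is present, so use Require.
<IfModule mod_authz_core.c>
    Require ip 192.0.2.10
</IfModule>

# Apache 2.2 and earlier: deny everyone, then allow the one address.
<IfModule !mod_authz_core.c>
    Order deny,allow
    Deny from all
    Allow from 192.0.2.10
</IfModule>
```

With Order deny,allow the Deny lines are evaluated first and a matching Allow line then overrides them, which is why the single address still gets in.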

Alternative Index Files

You may not always want to use index.htm or index.html as your index file for a directory, for example if you are using PHP files in your site, you may want index.php to be the index file for a directory. You are not limited to 'index' files though. Using .htaccess you can set foofoo.blah to be your index file if you want to!

Alternate index files are entered in a list. The server will work from left to right, checking to see if each file exists. If none of them exist it will display a directory listing (unless, of course, you have turned this off).

DirectoryIndex index.php index.php3 messagebrd.pl index.html index.htm

Redirection

One of the most useful functions of the .htaccess file is to redirect requests to different files, either on the same server, or on a completely different web site. It can be extremely useful if you change the name of one of your files but still want users to be able to find it. Another use (which I find very useful) is to redirect to a longer URL, for example in my newsletters I can use a very short URL for my affiliate links. The following can be done to redirect a specific file:

Redirect /location/from/root/file.ext http://www.othersite.com/new/file/location.xyz

In this above example, a file in the root directory called oldfile.html would be entered as:

/oldfile.html

and a file in the old subdirectory would be entered as:

/old/oldfile.html

You can also redirect whole directories of your site using the .htaccess file, for example if you had a directory called olddirectory on your site and you had set up the same files on a new site at: http://www.newsite.com/newdirectory/ you could redirect all the files in that directory without having to specify each one:

Redirect /olddirectory http://www.newsite.com/newdirectory

Then, any request to your site below /olddirectory will be redirected to the new site, with the extra information in the URL added on, for example if someone typed in:

http://www.youroldsite.com/olddirectory/oldfiles/images/image.gif

They would be redirected to:

http://www.newsite.com/newdirectory/oldfiles/images/image.gif

This can prove to be extremely powerful if used correctly.
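By default the Redirect directive sends a temporary (302) redirect. If the move is permanent, the status can be given explicitly. A short sketch with placeholder paths; RedirectMatch is the regular-expression form of the same directive, and the /archive/ pattern is purely hypothetical:

```apache
# Permanent redirect for a single file.
Redirect 301 /old/oldfile.html http://www.newsite.com/new/newfile.html
# Permanent redirect for a whole directory ("permanent" is the same as 301).
Redirect permanent /olddirectory http://www.newsite.com/newdirectory
# Send every .htm file under /archive to a .html file on the new site.
RedirectMatch 301 ^/archive/(.*)\.htm$ http://www.newsite.com/archive/$1.html
```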

Part 3 - Password Protection

Introduction

Although there are many uses of the .htaccess file, by far the most popular, and probably most useful, is being able to reliably password protect directories on websites. Although JavaScript etc. can also be used to do this, a JavaScript check runs in the visitor's browser and can be bypassed, whereas .htaccess protection is enforced by the server: someone must know the password to get into the directory, and there are no 'back doors'.

The .htaccess File

Adding password protection to a directory using .htaccess takes two stages. The first part is to add the appropriate lines to your .htaccess file in the directory you would like to protect. Everything below this directory will be password protected:

AuthName "Section Name"
AuthType Basic
AuthUserFile /full/path/to/.htpasswd
Require valid-user

There are a few parts of this which you will need to change for your site. You should replace "Section Name" with the name of the part of the site you are protecting e.g. "Members Area".

The /full/path/to/.htpasswd should be changed to reflect the full server path to the .htpasswd file (more on this later). If you do not know what the full path to your webspace is, contact your system administrator for details.
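Require valid-user admits every user listed in the password file. To admit only some of them, or to protect a single file rather than the whole directory, something like the following sketch can be used. The user names alice and bob and the file name admin.php are made up for illustration:

```apache
<Files "admin.php">
    AuthName "Members Area"
    AuthType Basic
    AuthUserFile /full/path/to/.htpasswd
    # Only these two users from the password file may open admin.php.
    Require user alice bob
</Files>
```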

The .htpasswd File

Password protecting a directory takes a little more work than any of the other .htaccess functions because you must also create a file to contain the usernames and passwords which are allowed to access the site. These should be placed in a file which (by default) should be called .htpasswd. Like the .htaccess file, this is a file with no name and an 8-letter extension. This can be placed anywhere within your website (as the passwords are stored in hashed form) but it is advisable to store it outside the web root so that it is impossible to access it from the web.

Entering Usernames And Passwords

Once you have created your .htpasswd file (you can do this in a standard text editor) you must enter the usernames and passwords to access the site. They should be entered as follows:

username:password

where the password is the hashed ("encrypted") form of the password. To produce it you will need to use the htpasswd utility that ships with Apache, use one of the premade scripts available on the web, or write your own. There is a good username/password service at the KxS site (http://www.kxs.net/support/htaccess_pw.html) which will allow you to enter the user name and password and will output it in the correct format.

For multiple users, just add extra lines to your .htpasswd file in the same format as the first. There are even scripts available for free which will manage the .htpasswd file and will allow automatic adding/removing of users etc.

Accessing The Site

When you try to access a site which has been protected by .htaccess your browser will pop up a standard username/password dialog box. If you don't like this, there are certain scripts available which allow you to embed a username/password box in a website to do the authentication. You can also send the username and password (unencrypted) in the URL as follows:

http://username:password@www.website.com/directory/

Summary

.htaccess is one of the most useful files a webmaster can use. There are a wide variety of different uses for it which can save time and increase security on your website.

 

 

How can I prevent bandwidth theft using the mod_rewrite engine and .htaccess

 

Due to either ignorance or an 'I'll do what I want because I want to' attitude, there are plenty of people that will place image tags on their pages that pull images from your server. This linking can place a great load on your server as well as cause you to incur excess bandwidth charges.

HOW DO I STOP THIS THEFT?
The Apache Server's Mod Rewrite Engine (which must be compiled into your server to allow you to do this) can examine the name of the document requesting a file of a particular type. You can then define logic that basically does the following:

If the URL of the page requesting the image file is from an allowed domain, display the image - otherwise return a broken image.
The logic, or rules, are then placed in the directory(s) that contain your image files.
IS THIS A PERFECT SOLUTION?
No. In order for it to work, the browser that requested the page must send the URL of the page, or what is called the HTTP_REFERER. There is also a performance penalty on the server due to the extra overhead of testing the file requests.

This method should be used when offsite linking has become an issue of concern to you. A little bit of tolerance or maybe a gentle e-mail to the other site's webmaster may also be an acceptable solution. I have actually made a few friends this way!

HOW EXACTLY CAN I DO THIS?

STEP 1: Ensure mod_rewrite is enabled on your server. (Yes, it is.)

STEP 2: Get organized! Try to get all of your images into directories that do not contain your HTML files. Each directory containing the images should have an empty index.html file to prevent people from looking at your directory listing.

STEP 3: Create or edit a .htaccess in one of the directories containing your images. I suggest doing one directory first so you can test your rules, and quickly comment out the lines or rename the file if it causes server configuration errors. The .htaccess file should contain the following lines.

RewriteEngine on
# Refuse .gif requests whose referer is set but is not one of our own pages
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://domain\.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www\.domain\.com/.*$ [NC]
RewriteRule .*\.gif$ - [F]

# Same check for .jpg files
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://domain\.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www\.domain\.com/.*$ [NC]
RewriteRule .*\.jpg$ - [F]

NOTE: When cutting and pasting, be sure that each RewriteCond is on one line. Line wrapping in the page display could introduce broken lines.
Change domain.com to whatever your domain name is. Be sure to use both the plain domain name as well as the www so that people coming to your site either way are not deprived of your images!

STEP 4: Test! Create a page on another server and insert an image tag pointing to an image in the protected directory. If you get a broken image icon - you did it! The requests will still appear in your logs, but your bandwidth will be protected.
On files such as .MIDI (music files), it will result in a Forbidden error.
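
The two blocks in step 3 can be merged into one, which also makes it easy to cover further file types and pages served over https. This is a sketch rather than a drop-in replacement; adjust the domain and the extension list to your site:

```apache
RewriteEngine On
# Let requests without a referer through (direct hits, some proxies).
RewriteCond %{HTTP_REFERER} !^$
# Allow our own pages, with or without www, over http or https.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domain\.com/ [NC]
# Everything else gets a 403 Forbidden for these image types.
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```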