Since I see that quite a few people have already read this thread, Leo…I have done your homework FOR you. This is not my work, only my research. So I can’t take credit for this, just pasting it.
So what is on-page (or on-site) optimisation? This basically encompasses all those factors and elements on the page itself that can help you rank better. These are things that you have direct control over, unlike off-page (or off-site) optimisation where, although you can influence them, you’re not in direct control as you’re dependent on another site/webmaster.
Head Elements <head></head>
Title Element <title></title>
This is perhaps the single most important element on the page for determining your position in the SERPs. Google, Yahoo! and MSN place a great deal of weight on any terms that appear within the title, so it’s worth spending some time thinking about it.
Keep it concise. The more terms you have in the title, then theoretically the less weight gets passed to each. Too short and you could miss out on some additional terms, too long and you start to lose the value. Quite often you’ll see sites that try to include all their targeted keywords in the homepage title – don’t do it. Instead, have a dedicated page for each term where you can focus the title and page contents on just that term. Most sites will have more traffic landing on inner pages than the homepage, so there’s just no need to try and get all your keywords on the one page.
Don’t repeat keywords unnecessarily. You get no additional benefit from repeating keywords and if done excessively it can end up looking ridiculous to potential visitors (see “Let it read well” below). Take for example a title such as “Widgets - Red Widgets, Blue Widgets, White Widgets | Widget Store”. This could easily be rewritten as “Red, White and Blue Widgets from the Widget Store”, as well as several other possible variations.
Let it read well. The title of your site in the search results will almost always be taken from the content of the title element on the page. It’s important therefore that it reads well for any potential visitors. There’s no point ranking #1 if you’re going to lose visitors because your title puts them off. Such problems could be caused by excessive repetition (making it look spammy) for example.
Including your site name. This is very much down to personal preference. Some like to include the site name in every title throughout the site, others only on the homepage. If your site has particularly good branding and is already well known then it may be worth including on every page – those searching may be encouraged to select your listing in the search results by being reassured by the presence of your brand in the title. For most sites though it is of questionable value. Your homepage should be the main page that ranks for your site’s name, thus repeating it across all pages will have no real benefit in terms of SEO, and all the while you’re potentially reducing the weight of the other keywords (see “Keep it concise” above).
Keep it unique. Often you see sites where the same title is repeated throughout (these tend to be dynamic sites that use a common header without the ability to adjust the title). This is a criminal waste of potential. Given the value of the title element, each page should have a title that isn’t repeated elsewhere within the site. The title should be targeted to the contents of that page alone.
WOW!! - Look, FREE Advice. As already mentioned, titles are there for potential visitors and not just search engines. When creating titles, also consider the benefit of terms that catch the eye when they appear in the SERPs, eg WOW, FREE, LOOK, NOW. Although they may not have any ranking benefit (sometimes, but not always), it could make your title stand out enough against those of your competitors to get the click-through even if you rank lower. EGOL wrote a post about this some time ago: “Three Secrets to making a LOT more money!”
Synonyms. If you’re struggling to compete for a particular search query, don’t discount the value of targeting a slightly less competitive synonym in your title for a while. It could result in more traffic than ranking lower for a more popular term, and your page will still have relevance to the original keyword. With more traffic comes the potential for more links, strengthening the page so that later you may be in a better position to compete for the original keyword.
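As a rough way to check two of the points above (needless repetition and unique titles), something like the following Python sketch could be run over your own titles. The function names, the crude plural handling and the sample URLs are my own assumptions for illustration, not an established tool.

```python
import re
from collections import Counter

def repeated_words(title, min_length=3):
    """Return words used more than once in a title (case-insensitive).

    Plurals are handled naively by stripping one trailing "s",
    which is an assumption that will misfire on some words.
    """
    words = re.findall(r"[a-z0-9']+", title.lower())
    stems = [w[:-1] if w.endswith("s") and len(w) > min_length else w
             for w in words]
    counts = Counter(s for s in stems if len(s) >= min_length)
    return {word: n for word, n in counts.items() if n > 1}

def duplicate_titles(titles_by_url):
    """Group URLs that share exactly the same title text."""
    groups = {}
    for url, title in titles_by_url.items():
        groups.setdefault(title.strip().lower(), []).append(url)
    return {title: urls for title, urls in groups.items() if len(urls) > 1}

stuffed = "Widgets - Red Widgets, Blue Widgets, White Widgets | Widget Store"
print(repeated_words(stuffed))  # {'widget': 5}

# Hypothetical site: two pages sharing one title.
print(duplicate_titles({
    "/": "Widget Store",
    "/red.html": "Widget Store",
    "/blue.html": "Blue Widgets",
}))
```

A check like this only flags candidates; whether a title reads well is still a judgement you have to make yourself.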
Meta Keywords <meta name="keywords" content="" />
Here we go with the first of the misconceptions. Meta keywords have no real-world effect on your SERPs. Let’s repeat that so it really sinks in: meta keywords have no real-world effect on your SERPs.
When search engines first appeared way back in the very early days of the internet, they were very basic at best and a far cry from the search engines we know today. It wasn’t long before SEOs started manipulating their meta keyword tags, stuffing them full of unrelated terms and generally abusing their original purpose. Search engines therefore counteracted this by ignoring meta keywords in their ranking algorithms. When Google came onto the scene, meta keywords were already out of use for determining relevancy and ranking positions.
Unfortunately, the perceived value of meta keywords has still persisted against all evidence to the contrary. Often an argument is put forward that if they don’t hurt, why not use them. They do hurt your time though – spending 5 minutes per page working on a meta keywords list adds up. On a ten page site that’s nearly an hour, on a hundred page site that’s a full working day! If you’ve got that much time to waste needlessly you need to find yourself a hobby!
Another frequently asked question is “how do I format the keywords”. Should they be comma separated, commas with spaces, one word or multi word terms, etc. Since meta keywords don’t affect your ranks and play no part in SEO, it really doesn’t make any difference. Just leave them out, then you don’t need to worry. Instead worry about how much content your competitors developed while you were dwelling on your meta keywords.
Meta Description <meta name="description" content="" />
Now to one of the most contentious issues we currently have on SEOChat: the meta description tag. First of all, like the meta keywords tag, search engines do not use the content to determine relevancy and search result positions. Adding or editing your meta description will have no effect on where you appear for any given search term.
Where the meta description is still used is as a potential source for the page description that appears in the search results. Great, so a well crafted meta description can give us a nice boost in click-through when our site appears for a search; where’s the contention there? I’m glad you asked – read on.
It was Fathom who first raised the issue on SEOChat, that relying on meta descriptions for your site’s SERPs description is a problem for several reasons. As already mentioned, the meta description is just one potential source for the search engines. The main search engines all have the ability to create snippets – taking one or two highly relevant strings from the page content and formatting it into a description, emphasising the searched terms in the process. The page content isn’t restricted to a small number of characters as is the meta description, with the resultant benefit that you can include far more possible search matches than you ever could with a meta description. The page content also has the added benefit that it does contribute towards relevancy and ranking, unlike the meta tag. If we allow the search engines to create our description for us out of our content, there is a much greater chance that the description will be more accurately matched to the search term. Don’t forget that your pages will naturally rank for many more search terms than you can anticipate, and there is no way you can craft a meta description that will be relevant for all of them.
So why then don’t we create a meta description as a catch-all for those times that the search engines can’t create a good snippet from our content? Well you can, but ultimately you’re hiding the problem – the problem that your content isn’t optimised for that search term. Far better to take a look at the content and adjust it where necessary (either editing the existing text, or creating new content), with the added benefit that improving it will also potentially help your rankings.
Ok, so I’ve removed the meta description tag from my page, but now when it appears for a site: search in Google, all I get as the description is the main navigation links. How is that supposed to be helpful, and what should I do about it? Well, you needn’t do anything really. Those navigation links are appearing because they’re the first thing on the page, and without a search term in the query there’s nothing to base a snippet on, ergo Google just uses the first piece of content. How many people though come to your site via a simple site: search rather than using a search term? I would suggest that almost none do, and that the only person that does a site: search is yourself. Potentially a competitor might, but then they’re not going to be a beneficial visitor that buys something, or clicks on your ads, etc. If you really are concerned about your navigation appearing thus, then move it. Adjust the layout and make sure that it’s your content that appears first in the page structure and the navigation later – again, resolving the underlying issue rather than relying on a meta description to cover it up.
Many are scared that removing their meta descriptions will impact them negatively, to which I would say that several other members and I have all done so and have seen no decrease in SERPs positions, nor any reduction in click-through rates. The whole issue of whether or not to use meta descriptions has been discussed at great length on these forums, so I would suggest reading those posts as well.
Meta Robots <meta name="robots" content="" />
This is one meta tag that you do want to pay attention to, even though most pages won’t require it. By default search engines will try to index and crawl your site, so inserting index,follow is just a waste of typing time.
If you don’t want the page to appear in the search results, then use noindex. Likewise, if you don’t want search engines to follow any links on the page, then use nofollow. Noarchive instructs the search engines that you don’t want a cached version to be available. It won’t stop them indexing the page though.
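If you want to check which robots directives a page actually carries, a short Python sketch along these lines would do it using only the standard library. The class and function names are made up for this example, and it only reads the meta tag; it says nothing about how any particular search engine treats it.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )

def robots_directives(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives

page = '<head><meta name="robots" content="noindex, nofollow" /></head>'
print(sorted(robots_directives(page)))  # ['nofollow', 'noindex']
```

An empty result simply means the page relies on the default behaviour, which is what most pages should do.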
Other Meta Tags
If you look at some sites there’s a whole list of additional meta tags appearing in their head section. None of these will have an impact on your search results. Revisit-after is often used in the mistaken belief that it tells search engines how often you want them to revisit your site, but in actual fact it does nothing of the sort.
Body elements <body></body>
Moving on to the page body there are one or two elements worth looking at.
Site Navigation and Internal Links
Simple html text links are guaranteed to be seen by search engines. If you’re not sure whether or not your links are search engine friendly, either look at the source code of the page, or view it in a text browser such as Lynx (available as a Firefox plugin). Text links are preferable to image links since the anchor text (the text you click on) imparts relevancy to the linked page better than an image’s alt attribute does. With CSS there’s no reason why a plain html text link can’t look just as good as any other type.
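To get a feel for what a crawler can read from your navigation, here is a minimal Python sketch that pulls each link’s href and anchor text out of the raw html, falling back to the image’s alt attribute for image links. The class name and the sample markup are my own inventions, and a real crawler is of course far more sophisticated than this.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects (href, anchor text) pairs from plain <a href> links."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a" and attr.get("href"):
            self._href = attr["href"]
            self._text = []
        elif tag == "img" and self._href is not None:
            # Image link: the alt attribute stands in for anchor text.
            self._text.append(attr.get("alt") or "")

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None

# Hypothetical navigation block.
nav = (
    '<ul>'
    '<li><a href="/red-widgets.html">Red widgets</a></li>'
    '<li><a href="/blue-widgets.html"><img src="blue.png" alt="Blue widgets"></a></li>'
    '</ul>'
)
extractor = LinkExtractor()
extractor.feed(nav)
for href, text in extractor.links:
    print(href, "->", text)
```

If a link you expect doesn’t show up in output like this (for example because it is generated by script), that’s a hint it may not be as search engine friendly as you thought.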
Be wary of including too many links in your navigation. Yes, you want to encourage search engines to follow links to your inner pages, but each link takes an equal share of “link juice”. There’s only a finite amount of link juice (Google Pagerank) available to share amongst the links, so if you spread it too thinly then all the links will suffer. Concentrate on passing sufficient amounts to your most important pages first, and as your site grows and attracts more incoming links you can add more internal links later. How many is too many? There’s no golden figure, but I’d start to get concerned above 50.
There’s a recent post about link architecture on the Google Webmaster Central Blog. A useful read if you’re new to the subject.
At this point it’s worth also looking at contextual links; these are the links that appear within a body of text – unlike navigation links which tend to exist in their own block as a simple list of links. Although perhaps not strictly “on-page”, since you’re using them to influence another page on the site, they are still “on-site” so it seems appropriate to include them here. Having said that, there is an argument that benefit can be had for the page that the link exists on, with its relevancy increased just that little bit more, in the same way that linking to external authority sites on a topic can increase your relevancy for that topic as well.
Contextual links work exceptionally well at imparting relevancy to the linked page since relevancy comes not only from the anchor text, but also from the surrounding text. One of the reasons Wikipedia has such a strong presence in the search results is because every page is linked via these in-content links. If you haven’t already done so, take a moment to consider using similar methods in your site if you can. There’d be little point doing so for a six page business site (you’d not gain anything over and above the standard navigation and it could look a little silly), but for larger sites with substantial information sections then it should be possible. For an ecommerce store, you could have a blog or reviews section, with articles that include contextual links to individual products.
Hidden Links. Don’t even think about it; you will be found out eventually, and you will suffer as a result.
H1 through H6 are heading tags. These are used to structure the page, with H1 being the main heading. Think of these as the headings in a newspaper. The main heading for an article would be the H1 tag. Sub-headings that break up the article into sections would then be the H2 tag. Should any of those sections contain their own sub-sections, these would then use the H3 tag. As you can imagine, it’s rare that you ever use all six levels of heading on a page.
For quite some time, search engines have put additional emphasis on keywords that appear within heading tags. However, as is so often the case, once this became widely known it was quite common to see excessive keyword stuffing and general abuse of the tags, with entire paragraphs appearing as “headings”. As a result they no longer have quite the impact they once did, although the H1 and H2 headings still carry a little more weight than standard text.
Since the headings do help structure the page, and some additional benefit can be had from using them, I would recommend including them on the page. As with the title element, headings should be unique and be relevant to the context. There’s no point having a sitewide H1 tag with the same text repeated throughout the site. This is not what it was designed to do, and you’d be wasting its potential if you did so.
The appearance of text within a heading tag can easily be changed using CSS to fit in with your site’s design, so there’s really no excuse not to use them.
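If you want to see how your headings actually structure a page, a small Python sketch like this one prints them as an outline, which makes a missing H1 or a heading stuffed with a whole paragraph easy to spot. The class name and sample page are assumptions for the sake of the example.

```python
from html.parser import HTMLParser

class HeadingOutline(HTMLParser):
    """Collects (level, text) for every h1-h6 element, in page order."""

    LEVELS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.LEVELS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            text = " ".join("".join(self._text).split())
            self.outline.append((self._level, text))
            self._level = None

# Hypothetical page body.
page = (
    "<h1>Widget Buying Guide</h1><p>Intro</p>"
    "<h2>Red Widgets</h2><h3>Small Red Widgets</h3>"
    "<h2>Blue Widgets</h2>"
)
parser = HeadingOutline()
parser.feed(page)
for level, text in parser.outline:
    print("  " * (level - 1) + f"H{level}: {text}")
```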
Bold/Strong and Italic/Emphasis tags
It’s quite common to see people recommending their use for your keywords. I am not convinced that they have any worthwhile impact any longer. By all means use them for the benefit of the site visitor, but don’t expect to see any real-world benefit for your rankings. Ok, they may help you improve from #900 to #500, but once you get into the first few pages for even a relatively uncompetitive search term you’re not going to see much, if any change. You can easily spot those sites that still put weight on these tags helping their SERPs: every tenth word is in bold, which makes the text much harder to read and, frankly, looks ridiculous.
Image Alt Attributes
These have been subject to much abuse over the years as some SEOs would pack them full of keywords. Search engines are wise to this now, and algorithms have moved on, so you’d not see any benefit from keyword stuffing now. Also, with the advent of image searches, far more benefit can be had by focusing the image alt attribute on gaining ranks there rather than trying to manipulate them for the benefit of the page. Make the image alt attribute short, succinct, and relevant to the image content. Make sure they read well, because they can sometimes appear as part of the snippet in the standard search results.
Ranking for Image Search
Whilst we’re on the subject of images and image searches, there are one or two things you can do to help your chances of ranking well there. As already mentioned, think about the image’s alt attribute. If you’re going to rename your image, treat it as you would the page filename, making it short and relevant to the image content. An image’s relevancy is also determined by the context of the page in which it sits, so think about where you’re putting the image.
Not to be confused with the title element, the title attribute is usually to be found as part of a link, eg <a href="#" title="The title attribute"></a>, although it can also be found in images and other tags as well. These do not influence your rankings, so use them for the benefit of your visitors by all means, but don’t focus on them for SEO purposes.
You’ve got to give your file/page a URL anyway, so you might as well include a keyword or two in there if you’re using static html files or URL rewriting. Don’t obsess about it though. Having your keywords in the URL is only a very small factor in the algorithm, and for competitive search terms you’re not really going to see any real-world difference. The main benefit comes from external links that use the URL as the anchor text, thus those keywords then become part of the anchor text.
If you’ve got an existing site that uses dynamic URLs, don’t rush to implement URL rewriting in the hope of getting a boost in the SERPs. You’re just as likely to cause more harm than anything if you don’t implement it properly, coupled with the inevitable disruption to your SERPs whilst the changes are indexed (even if you do use 301 redirects). There’s much to consider with dynamic v static URLs which goes beyond the scope of this post; suffice to say, a quick search on SEOChat will pull up many existing posts on the matter. It’s also worth reading a recent post on the Google Webmaster Central Blog about dynamic v static URLs.
There’s another myth that too many subfolders in the URL will reduce the importance of the page, eg /cars/saloons/make/red.html is less preferable to /red-make-saloon.html. It really makes no difference. What is a factor is how many clicks away from the homepage it is. A page linked directly from the homepage (which usually will have the greatest link juice to give) will tend to have a better chance of ranking than one buried deep in the site that can only be reached via three other pages first. I tend to steer clear of including too many subfolders in the URL as it makes it more likely that as the site grows and I have to move pages around, then I’ll need to change the URL at some point. With fewer subfolders in the URL, there’s less likelihood that I’ll need to do so. With a good navigation structure and breadcrumb trail, it should be quite evident to the site visitor where they are in the site without relying on the URL as an indicator.
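To make the difference between URL depth and click depth concrete, here is a minimal Python sketch that works out how many clicks each page is from the homepage using a breadth-first search. The link map is a made-up site, and the function name is just for illustration.

```python
from collections import deque

def click_depths(links, home="/"):
    """Return {page: minimum number of clicks from the homepage}.

    `links` maps each page to the pages it links to.
    Pages that can't be reached from `home` are left out.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site: the deeply nested URL is linked from the homepage,
# while the "flat" URL is only reachable via another page.
links = {
    "/": ["/cars/saloons/make/red.html", "/about.html"],
    "/about.html": ["/blue-make-saloon.html"],
}
for page, depth in click_depths(links).items():
    print(depth, page)
```

Here /cars/saloons/make/red.html sits one click from the homepage despite its three subfolders, while /blue-make-saloon.html is two clicks away.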
Let’s put another misconception to rest: keyword density. Like the meta keywords tag, keyword density continues to be touted as being of benefit to your ranking position. Numbers such as 2-5% density are bandied about as being the key to ranking well. Let’s be clear that these numbers are just fool’s gold. Like the argument for meta keywords, search engines have moved on and what worked once doesn’t necessarily mean it still works today. Search engine algorithms have far more factors at their disposal with which to judge relevance, and have long outgrown such a crude method as counting the number of times a term appears on the page.
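For anyone wondering what a figure like 2-5% actually measures, this Python sketch shows the usual calculation for a single-word keyword: occurrences divided by total words. I include it only to show how crude the metric is, not as something to aim for, and the tokenising rule is my own simplification.

```python
import re

def keyword_density(text, keyword):
    """Percentage of words in `text` that are exactly `keyword`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return 100.0 * hits / len(words)

sample = "Red widgets and blue widgets for every widgets fan"
print(round(keyword_density(sample, "widgets"), 1))  # 33.3
```

Note that the number knows nothing about where the term appears, what surrounds it, or whether the sentence reads well.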
Content should be written with the reader in mind, and trying to force a set percentage with which the keyword appears is inevitably going to lead to poor quality writing, in turn reducing the likelihood that others will link to the page.
When taken to extremes, excessive use of keywords in the content can even harm, triggering spam filters and resulting in a decline in SERPs. Often termed “over optimisation”, this is something of a misnomer as there’s only really “well optimised” and “badly optimised”.
It is perfectly plausible for a site to rank #1 for a term without ever including it on the page, purely as a result of incoming links that pass relevancy for that term. I would however still recommend that you at least include it in the page title as a minimum.
Absolute v Relative URLs
It seems to be something of a myth that search engines prefer absolute rather than relative URLs. As far as I can tell, there’s no difference between the two. Given the choice, I tend to use absolute URLs, including the full domain name, for internal links. I do this not because the search engines prefer it, but in case my content gets scraped. Not all scrapers remove the internal links, so it offers some degree of protection in those circumstances as I still have links back to my site. On the other hand, it’s suggested that relative URLs don’t require an additional DNS lookup by the browser, so can be faster for the site visitor (I couldn’t say for certain whether or not that is the case, and it’s not something I’m inclined to devote time to finding out).
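To show why the two forms amount to the same thing, here is a small Python sketch that resolves a relative link against the page it appears on, much as a browser or crawler would. The page URL and hrefs are invented for the example.

```python
from urllib.parse import urljoin

def to_absolute(page_url, href):
    """Resolve an href found on `page_url` to a full absolute URL."""
    return urljoin(page_url, href)

page = "http://www.yoursite.com/cars/saloons/index.html"
print(to_absolute(page, "../red.html"))    # http://www.yoursite.com/cars/red.html
print(to_absolute(page, "/contact.html"))  # http://www.yoursite.com/contact.html
print(to_absolute(page, "http://www.yoursite.com/red.html"))
```

Once resolved, a relative link and its absolute equivalent point at exactly the same URL; the only practical difference is what survives when the markup is copied to another domain.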
Having a site that fully complies with the W3C guidelines is not, in itself, going to help you rank better. Yes, it may be an indication of quality web development, and it’s certainly worthwhile aiming for (accessibility for all, etc), but it’s not an indication of relevance, which is what the search engines are looking for. It would make no sense to discount the relevancy of Stephen Hawking’s scientific theories to the origins of the universe on the basis that he hasn’t self closed a line break tag, or not escaped an ampersand.
Search engines have become very good at crawling even the most badly formed pages. If your browser can make sense of the html, then it’s safe to assume that so can the search engines.
I would also suggest that investing time in understanding W3C validation rules contributes towards your own personal development. The more you understand about the construction and development of a website, the more you gain personally, and the more skills you have to offer a client. The same goes for understanding server-side languages such as PHP or Python.
HTML v XHTML + CSS
Again, it makes no difference to the relevance of the page. Whether you choose to build it using HTML Transitional or XHTML Strict, it’s the content that the search engines are interested in. CSS helps reduce code bloat, makes maintenance of the site much simpler, etc, but is not going to help you rank higher. Tables may affect the order in which your content appears, but rest assured that the search engines will still read the content.
Site Layout and Design
Search engines aren’t really interested in how pretty your site looks, nor whether the content or the navigation comes first. The only time it could become an issue is if the page is so big, with the main content at the end of it, that the crawlers abandon the page (timeout) before they reach the content. If that’s the case, then it’s likely that so will your site visitors.
Changing the design of your site, provided that you keep titles, content, and URLs the same, is unlikely to have much impact on your SERPs. There could be some degree of fluctuation experienced temporarily, or, if the new design substantially increases or decreases the number of links off each page, the change could be more long term as a result of your internal linking pattern.
Moving CSS and JavaScript into external files is worth doing for the benefit of your site visitors (page loading times and browser caching) and for site maintenance benefits, but search engine crawlers will easily read past them if they’re embedded in the page.
The robots.txt file can be used to block crawlers from certain pages and folders on your site. Make sure you don’t accidentally block them from pages you do want indexed. I personally don’t really count it as part of on-page SEO, since it’s not directly contributing to where you rank (you don’t actually need a robots.txt file at all).
Remember, if you block pages with robots.txt you’re not preventing link juice flowing to them and it is theoretically possible for such pages to still appear in the search results based on the information given by the links pointing to them. To shut off the flow of link juice, add rel="nofollow" to such links.
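One way to make sure you haven’t accidentally blocked pages you want indexed is to test your rules before uploading them. This Python sketch uses the standard library’s urllib.robotparser; the rules and URLs are hypothetical, and individual search engines may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
"""

def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if `user_agent` may crawl `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

for url in (
    "http://www.yoursite.com/red-widgets.html",
    "http://www.yoursite.com/admin/login.html",
):
    print(url, "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

Running your most important URLs through a check like this takes seconds and catches the classic mistake of a Disallow rule that is broader than intended.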
An XML sitemap is useful for larger sites. It’ll potentially help crawlers locate new pages, and recently updated pages, deep in the site faster than relying on a standard crawl pattern alone. There is, however, no guarantee that they’ll index those pages – you still need to give them adequate reason for doing so (quality, unique content and enough incoming link juice to persuade them that it’s a valuable page).
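For those who haven’t seen one, an XML sitemap is just a list of URLs with optional details such as the last modified date. Here is a minimal Python sketch that builds one with the standard library; the page list and dates are invented, and the function name is only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs.

    `lastmod` is a 'YYYY-MM-DD' string, or None to leave it out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages and dates.
print(build_sitemap([
    ("http://www.yoursite.com/", "2024-01-15"),
    ("http://www.yoursite.com/red-widgets.html", None),
]))
```

On a dynamic site you would normally generate the page list from your database rather than typing it by hand.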
Html sitemaps are there more for the benefit of visitors than crawlers. On a larger site, there’ll be too many links to provide adequate link juice to any particular page, and for smaller sites then it’s simple enough to link every page from every other page. Of course, if your site’s navigation isn’t search engine friendly, then an html sitemap can help the crawlers find those pages, but a better solution would be to change your navigation to one that is search engine friendly in the first place.
URL Canonicalisation and Avoiding Inadvertent Duplicate Content
First of all, duplicate content on your site is not going to get you penalised unless it’s blatant and deliberately designed to manipulate rankings. It’s very easy, particularly with dynamic sites, to have the same page accessible through different URLs. A classic case is www.yoursite.com and www.yoursite.com/index.html – in both cases it’s the same page but two different URLs. Likewise, with dynamic sites that have multiple parameters in the query string, changing the order of the parameters makes for a different URL.
In such circumstances, Google is very good at determining which is the most apt version to use. However, it’s always best to avoid the situation arising in the first place. Make sure all your internal links point to the one version you want to use, and set up 301 redirects for the others if possible.
The principal problem is not that you’ll get penalised, but rather that you’re not directing all available link juice to one version. By splitting the link juice across multiple versions, you’re wasting the page’s potential ranking ability.
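To illustrate the two cases mentioned above (the index.html variant and reordered query parameters), here is a Python sketch that reduces such URLs to a single form so you can spot duplicates in your own internal links. The list of default document names and the products.php URL are assumptions of mine; adjust them to match your own server, and treat this as a checking aid rather than a replacement for 301 redirects.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed default document names; your server may use others.
DEFAULT_DOCUMENTS = ("index.html", "index.htm", "index.php")

def canonical_url(url):
    """Reduce equivalent URL variants to one canonical form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    for name in DEFAULT_DOCUMENTS:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the trailing slash
            break
    # Sort query parameters so their order no longer matters.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, query, ""))

print(canonical_url("http://www.yoursite.com/index.html"))
# http://www.yoursite.com/
print(canonical_url("http://www.yoursite.com/products.php?colour=red&cat=widgets"))
# http://www.yoursite.com/products.php?cat=widgets&colour=red
```

Any two internal links that come out the same here but differ in your html are splitting link juice between versions of the same page.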