
Sep 22 2006

Some thoughts on footprints

Published by admin under General, Keywords





If you’re making sites with tools like RSSGM, NicheCreator, etc., then “footprint” is probably a term you know. It means a piece of HTML that stays constant across the thousands of pages on all your sites. If the search engines catch on to that one constant piece of HTML, they can simply delete every site containing it from the index, and bang… all your sites de-indexed in one go.
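To make the idea concrete, here is a minimal Python sketch of one way to look for a footprint in a batch of generated pages: it collects the non-blank lines of HTML that appear in every page. The function name and the sample pages are made up for illustration, and this is only a guess at the simplest possible check, not a description of how any search engine actually works.

```python
def common_lines(pages):
    """Return the stripped, non-blank lines that occur in every page."""
    line_sets = [
        {line.strip() for line in html.splitlines() if line.strip()}
        for html in pages
    ]
    if not line_sets:
        return set()
    return set.intersection(*line_sets)


# Hypothetical sample pages, for illustration only.
page_a = "<html>\n<div id='rssgm-wrap'>\n<h1>Mortgage insurance</h1>\n</html>"
page_b = "<html>\n<div id='rssgm-wrap'>\n<h1>Mortgage advisors</h1>\n</html>"

for line in sorted(common_lines([page_a, page_b])):
    print(line)
```

Lines like `<html>` turn up on every website going, so the ones to worry about are those shared by all your pages but rare everywhere else, such as the made-up `rssgm-wrap` div here.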

So, the question is: what exactly is a footprint? Certainly, visible text is one thing. Some people say that the search engines can even recognise the spacing of your HTML tags in the page. I vary entire templates, including CSS tags, CSS filenames, Javascript code, and Javascript filenames just to make sure. I install RSSGM into a folder that isn’t called RSSGM, and I vary that folder name every so often (I use generic names like “includes”, “include”, “classes”, “images”, “templates”, etc - things that are found on virtually every website going).

The other thing I’m beginning to think about is file structure as a footprint. Every standard RSSGM site consists of hundreds or thousands of pages all at the root level of the site.

All those pages have a constant ending (.php or .html, I forget which, but they’re all the same).

Possibly, they all have the keyword in them (eg, “mortgage-insurance.php”, “mortgage-advisors.php”, “uk-mortgage-providers.php”, etc).

Even worse, if you’re not doing keyword list building very well, they might all even BEGIN with the keyword.

And then they all appear at the same time with the same timestamp. All 2000 pages or whatever.

And finally, all links to page “x.php” have “x” as the link text.
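As a rough way of measuring the structural footprints listed above, here is a small Python sketch that takes a list of page paths and a list of links and reports how uniform they are. The function name, the report keys, and the sample data are all hypothetical, and it does not look at timestamps.

```python
import posixpath
from collections import Counter


def structure_report(paths, links):
    """Summarise structural footprints for one site.

    paths: site-relative page paths, e.g. "mortgage-insurance.php"
    links: (link_text, target_path) pairs found on the site
    """
    if not paths:
        raise ValueError("need at least one path")
    stems = [posixpath.splitext(posixpath.basename(p))[0] for p in paths]
    extensions = Counter(posixpath.splitext(p)[1] for p in paths)
    first_words = Counter(stem.split("-")[0] for stem in stems)
    at_root = sum(1 for p in paths if "/" not in p.strip("/"))
    same_text = sum(
        1 for text, target in links
        if text == posixpath.splitext(posixpath.basename(target))[0]
    )
    total = len(paths)
    return {
        "share_at_root": at_root / total,
        "share_top_extension": extensions.most_common(1)[0][1] / total,
        "share_top_first_word": first_words.most_common(1)[0][1] / total,
        "share_links_named_after_file": same_text / len(links) if links else 0.0,
    }


# Hypothetical sample data, for illustration only.
paths = ["mortgage-insurance.php", "mortgage-advisors.php", "uk-mortgage-providers.php"]
links = [
    ("mortgage-insurance", "mortgage-insurance.php"),
    ("find an advisor", "mortgage-advisors.php"),
]
print(structure_report(paths, links))
```

The closer each share is to 1.0, the more uniform the site looks on that measure.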

So there are a few things I’m working on for my next batch of sites:

  1. Varying the path of the files.
    Put some files in the root, and then put others into folders that consist of the keywords as well (eg, “http://www.domain.com/mortgage-advisors/alabama-mortgage-insurance.php”). DSG does this, but I’ve not had great results with DSG, I think due to other issues. Also, I’ll be putting a variable number of pages into each folder, and a variable number of folders per site. And I’ll also use generic folder names like “articles”, “content”, etc.
  2. Varying the timestamps.
    I’ll be writing a script for RSSGM to regenerate a random number of pages every so often, to help randomise the timestamps.
  3. Varying the keywords.
    LSI (“latent semantic indexing”) is a fancy term for the coming developments in search engine technology. As I understand it, it basically means that they’ll be looking for related words on the page and in the links pointing to it. So if you’re writing about mortgages, also reference keywords like “moving home”, “life insurance”, “endowments”, and so on. And have the links to those pages vary their text by using those related keywords.
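The three ideas above could be sketched in Python roughly as follows. This is not the RSSGM script itself (which does not exist yet); every function name, the `root_share` parameter, and the sample keywords are assumptions made up for illustration.

```python
import random


def plan_paths(slugs, folders, root_share=0.3, rng=None):
    """Assign each page to the site root or to a randomly chosen folder.

    Folder sizes come out uneven because each page picks its folder at random.
    """
    rng = rng or random.Random()
    plan = {}
    for slug in slugs:
        if rng.random() < root_share:
            plan[slug] = f"{slug}.php"
        else:
            plan[slug] = f"{rng.choice(folders)}/{slug}.php"
    return plan


def pages_to_regenerate(slugs, rng=None):
    """Pick a random-sized, random subset of pages to rebuild on this run."""
    rng = rng or random.Random()
    return rng.sample(slugs, rng.randint(1, len(slugs)))


def pick_anchor(slug, related, rng=None):
    """Choose link text from the page's own phrase or a related phrase."""
    rng = rng or random.Random()
    return rng.choice([slug.replace("-", " ")] + list(related))


# Hypothetical sample data, for illustration only.
slugs = ["alabama-mortgage-insurance", "uk-mortgage-providers", "mortgage-advisors"]
folders = ["mortgage-advisors", "articles", "content"]
related = ["moving home", "life insurance", "endowments"]

rng = random.Random()
print(plan_paths(slugs, folders, rng=rng))
print(pages_to_regenerate(slugs, rng=rng))
print(pick_anchor("mortgage-advisors", related, rng=rng))
```

Passing in a seeded `random.Random` makes a run repeatable for testing, while leaving it unseeded gives a different layout for every site.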

I’ll be writing more about this as I see things happen with that batch of sites. All this work means a lot of development and testing, so it won’t be soon, but I’ll come back to it every so often.


Tags: General Keywords

