Page Analyzer FAQ



How do I use the Page Analyzer, and what is the best keyword density ?

The Page Analyzer is primarily intended to show whether a search engine can easily 'understand' which search terms are most relevant for a page. If your main phrase/keyword does not occur among the top phrases (as detected by the tool), there is a good chance that a search engine will not grasp what the main phrases of that page are either.

We recommend targeting multiple variations of your main search term. This could include the plural and singular versions of your keywords, as well as synonyms and very similar terms. For example, if your target keyword is hotel amsterdam, also target hotels amsterdam, amsterdam hotel, amsterdam hotels, hotel in amsterdam, and hotels in amsterdam.

When optimising pages, we recommend making the top keywords occur in as many elements as possible (title, heading, anchor text etc). Make sure your targeted phrases and keywords are listed by the tool among the top phrases on the page. That means they will probably be among the phrases with the highest keyword density on that page.

The actual keyword density and other numbers are not important in our opinion, except relative to each other within a page. The targeted phrases and keywords should be among the phrases with the highest density on a page. The Page Analyzer should probably not be used to compare the actual keyword density numbers of one page to another, unless the two pages are very similar and have the same number of words.

Whether other pages and sites use the targeted phrase/keyword to link to the page is also important, of course, but that is outside the scope of the Page Analyzer.


What are stopwords ?

Stopwords are words that occur very often and which do not convey any particular meaning.
Examples are 'the', 'a' and 'you'.

Based on the language we detect when analyzing a page, we automatically apply a default stopword filter for that language, provided we have one and you have not disabled it. Currently supported are Arabic, Czech, Danish, Dutch, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Spanish, Slovak, Swedish and Russian.
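
As a rough illustration of what a stopword filter does, the sketch below removes stopwords from a list of words. The tiny English list and the function name remove_stopwords are assumptions made for this example; they are not the lists or the code the Page Analyzer uses.

```python
# Hypothetical, deliberately tiny stopword list; real lists are much longer.
EXAMPLE_STOPWORDS = {"the", "a", "you"}


def remove_stopwords(words, stopwords=EXAMPLE_STOPWORDS):
    """Return the words that are not in the stopword list (case-insensitive)."""
    return [word for word in words if word.lower() not in stopwords]


words = "You can book the hotel in Amsterdam".split()
print(remove_stopwords(words))
```

Only words that exactly match an entry in the list are dropped, so the quality of the filter depends entirely on the list for the detected language.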


What is keyword density ?

Keyword Density is the ratio of one word or phrase to the total number of words in a text.

The density of a word that occurs once in a ten word text is 10%.

The density of a word that occurs twice in a ten word text is 20%.


Can Page Analyzer reports be shared publicly ?

Yes they can.

Some pages with external data will only show if the original creator of the report loaded this data.
For example: if the creator did not load the 'Links In' section of the report, it cannot be loaded later by users who are not logged in.

Reports expire automatically after 30 days.


What are poisonwords ?

Poisonwords are words that may have a negative influence on your site. They may help identify your page as an unwanted page.

For example, a search engine may decide to give a lower ranking to sites that use (too many) words relating to hatred, or Google AdSense Public Service Ads may start showing up when you use terms related to death or natural disasters on your page.


How do you calculate Readability on HTML pages ?

First we convert the HTML to formatted flat text. Note that this conversion is somewhat open to interpretation. Table structures in particular may look coherent on screen but not after conversion to formatted text, so a perfectly readable page might get a lower score than it deserves.

The scores by themselves may not mean much. Due to the conversion from HTML to text, a low score does not necessarily mean the page is difficult to read. The scores can, however, be used to tweak a text in a particular direction, i.e. towards a text that is relatively easier to read, for example by using less complex words.

Please note: The link between SEO and readability is not proven. Perhaps some search engines prefer text that they can judge as easy to read, perhaps not. The formulas used here are the standard formulas for calculating readability, so if search engines do measure readability, they probably use the same formulas.


What is Readability ?

Readability describes the ease with which a document can be read. Readability tests were designed to assess the suitability of books for students at particular grade levels or ages. Their primary advantage is they can serve as an early warning system to let the writer know that the writing is too dense.

Obviously, readability formulas cannot measure features like interest and enjoyment. And they cannot measure whether a text is suitable for a particular reader's needs.

Our tool calculates three Readability scores: Fog, Flesch and Kincaid.
They all measure complexity as a function of syllables per word and words per sentence. They assume the text is well formed and logical. You could analyse a passage of nonsensical English and find the readability is quite good, provided the words are not too complex and the sentences not too long.

The Fog index, developed by Robert Gunning, is a well known and simple formula for measuring readability. The index indicates the number of years of formal education a reader of average intelligence would need to read the text once and understand that piece of writing with its word sentence workload.

18 unreadable
14 difficult
12 ideal
10 acceptable
8 childish

The Flesch index rates text on a 100 point scale. The higher the score, the easier it is to understand the text. A score of 60 to 70 is considered to be optimal. 

The Kincaid index rates text on U.S. grade school level. So a score of 8.0 means that the document can be understood by an eighth grader. A score of 7.0 to 8.0 is considered to be optimal.
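
For reference, the commonly published forms of these three formulas can be sketched as below. This is an illustration only, not the Page Analyzer's own code: the function names are made up for this example, and the functions take word, sentence and syllable counts as input because counting syllables and sentences in real text requires heuristics that are not shown here. For the Fog index, "complex words" is assumed to mean words of three or more syllables.

```python
def gunning_fog(words, sentences, complex_words):
    """Gunning Fog index: approximate years of formal education needed."""
    return 0.4 * ((words / sentences) + 100.0 * (complex_words / words))


def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a 100 word text with 5 sentences.
print(round(gunning_fog(100, 5, 10), 2))
print(round(flesch_reading_ease(100, 5, 150), 2))
print(round(flesch_kincaid_grade(100, 5, 150), 2))
```

All three scores depend only on average sentence length and word complexity, which is why shorter sentences and simpler words move every score in the "easier" direction.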


What is Sentence Word Overlap ?

Sentence Word Overlap measures how often words overlap between two adjacent sentences. Sentence word overlap is said to facilitate meaning construction and to improve text comprehension and reading speed.
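
The exact formula is not given here, so the sketch below shows just one plausible interpretation: the share of adjacent sentence pairs that have at least one word in common. The function name, the naive sentence splitting on '.', '!' and '?', and the choice to count any shared word (including stopwords) are all assumptions for illustration.

```python
import re


def sentence_word_overlap(text):
    """Fraction of adjacent sentence pairs that share at least one word."""
    # Naive sentence split; real text needs more careful handling.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    word_sets = [set(re.findall(r"[a-z0-9']+", s.lower())) for s in sentences]
    pairs = list(zip(word_sets, word_sets[1:]))
    if not pairs:
        return 0.0
    overlapping = sum(1 for first, second in pairs if first & second)
    return overlapping / len(pairs)


text = "The hotel is in Amsterdam. Amsterdam has many canals. Book early."
print(sentence_word_overlap(text))  # 0.5: only the first pair shares a word
```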


What is the influence of HTML that does not validate on the report ?

In most cases this will have no influence on the final density and prominence calculations, as these use all text on the page as a basis. (Depending on your settings, stopwords may be stripped.) Even pages that use very bad HTML are usually stripped properly.
Page elements are harder to extract from invalid HTML. The most likely effect of invalid HTML in other cases is that some page elements are missing or incomplete on the report page.


What are Phrasal Ends ?

"Phrasal Ends" are indications in text or HTML that a group of words does not belong together to form a phrase.

Phrasal ends can include punctuation such as dots or commas, or (closing) HTML tags such as </div>, </p> and </blockquote>.
Phrasal ends indicate that the words before and after probably do not form an intended phrase when put together.

Below are some examples of the effect of considering Phrasal Ends.

Before version 3.06 of the Page Analyzer, in this HTML

<img src="image1.jpg" alt="Click here">
<img src="image2.jpg" alt="Click here">
<img src="image3.jpg" alt="Click here">

"Click here Click" would be counted as a phrase occurring 2 times.
It is no longer counted as a phrase.

In this HTML

<a href='shop.html'>Shop</a>
<a href='manuals.html'>Manuals</a>
<a href='contact.html'>Contact</a>

"Shop Manuals Contact" would be counted as a phrase occurring 1 time.
It is no longer counted as a phrase.
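
The effect in both examples can be reproduced with a small sketch. It assumes the text has already been split into segments at each phrasal end, and it only counts phrases inside a segment. The function name count_phrases and the segment lists are made up for this illustration and do not reflect how the Page Analyzer is implemented.

```python
from collections import Counter


def count_phrases(segments, phrase_length):
    """Count phrases of phrase_length words without crossing segment boundaries."""
    counts = Counter()
    for segment in segments:
        words = segment.split()
        for start in range(len(words) - phrase_length + 1):
            counts[" ".join(words[start:start + phrase_length])] += 1
    return counts


alt_texts = ["Click here", "Click here", "Click here"]

# Ignoring phrasal ends: all alt texts are treated as one run of words.
without_ends = count_phrases([" ".join(alt_texts)], 3)
# Respecting phrasal ends: each alt text is its own segment.
with_ends = count_phrases(alt_texts, 3)

print(without_ends["Click here Click"])  # 2
print(with_ends["Click here Click"])     # 0
```

The same applies to the link example: count_phrases(["Shop", "Manuals", "Contact"], 3) finds no three word phrase, because each anchor text is a separate segment.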


What are adultwords ?

Adultwords are words that may cause your site to be qualified as an adult site. Examples include 'p*rn', and the FCC's seven naughty words.


What is keyword prominence ?

Keyword Prominence is the ratio of the position of one word or phrase to the positions of the other words in a text.

Not having a phrase in title and/or meta tags is likely to yield a relatively low Prominence score.


I know (choose) is on the page, but I do not see it in the report ?

This may be due to HTML that does not validate, your settings, or a bug. Please note that in the default setting you will not see any words of fewer than 3 characters, nor words that occur only once on the page.

If you think it's a bug, please let us know and we'll look into it. Don't forget to mention how we can recreate the problem.


What is Google Section Targeting ?

Section targeting allows you to suggest sections of your text and HTML content that you'd like Google to emphasize or downplay when matching AdSense ads to your site's content.

The Keyword Density Analyzer can show you the text of areas to emphasize and of areas to ignore.

You can also choose to have the Keyword Density Analyzer ignore content that appears within Google Ad weight=ignore sections.
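
As an illustration of what ignoring such sections involves, the sketch below removes everything between a google_ad_section_start(weight=ignore) comment and the next google_ad_section_end comment. It assumes the HTML comment markers used for AdSense section targeting; the function name and the regular expression are hypothetical and are not taken from the Keyword Density Analyzer.

```python
import re

# Matches one weight=ignore section, including its start and end markers.
IGNORE_SECTION = re.compile(
    r"<!--\s*google_ad_section_start\(weight=ignore\)\s*-->"
    r".*?"
    r"<!--\s*google_ad_section_end\s*-->",
    re.DOTALL,
)


def strip_ignored_sections(html):
    """Remove all weight=ignore sections from an HTML string."""
    return IGNORE_SECTION.sub("", html)


html = (
    "<p>Keep</p>"
    "<!-- google_ad_section_start(weight=ignore) -->"
    "<p>Menu</p>"
    "<!-- google_ad_section_end -->"
    "<p>Also keep</p>"
)
print(strip_ignored_sections(html))
```

Sections opened with a plain google_ad_section_start marker (the ones to emphasize) are left untouched by this sketch.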



How do you calculate keyword density ?

Depending on your settings, density is either

Occurrences of the word / Number of unique words
Occurrences of the word / Total number of words

Density is the ratio of one word or phrase to the total number of words in a text.

The density of one word in a ten word sentence having unique words only is 10%.

The density of one word, occurring twice in a ten word sentence having 9 unique words, is 20% when based on the total number of words.
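
Both variants of the calculation can be sketched as follows. The function name keyword_density and the use_unique_words flag are assumptions for this example; they stand in for whatever setting the tool actually exposes.

```python
def keyword_density(words, keyword, use_unique_words=False):
    """Return the density of keyword in words as a percentage."""
    occurrences = words.count(keyword)
    # The denominator depends on the setting: unique words or all words.
    denominator = len(set(words)) if use_unique_words else len(words)
    if denominator == 0:
        return 0.0
    return 100.0 * occurrences / denominator


unique_text = "one two three four five six seven eight nine ten".split()
repeated_text = "one two one four five six seven eight nine ten".split()

print(keyword_density(unique_text, "one"))    # 10.0
print(keyword_density(repeated_text, "one"))  # 20.0
print(round(keyword_density(repeated_text, "one", use_unique_words=True), 1))  # 22.2
```

The last line shows why the setting matters: the same word in the same sentence scores 2 / 9 rather than 2 / 10 when density is based on unique words.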


How does the Ranks Wizard work ?

The Ranks Wizard tries to guess what a page is optimized for, and how strongly it is optimized for the keywords it is targeting.

By looking at Keyword Placement we try to estimate how strongly a page is optimized for its top keywords.

The generated numbers are a representation of the strength of a particular keyword or phrase.

Keep in mind the Wizard scores are only guesses. Google and other search engines may find other phrases the most relevant on a page, and they will certainly look at many more factors when calculating their scores, such as the quality and anchor text of the incoming links.

We do think that if you manage to increase your own Wizard score for a particular keyword or phrase it is likely to improve your rankings.


How do you calculate prominence ?

Depending on your settings, prominence is either based on the unique words in a file, or on all the words.
Prominence is the ratio of the position of one word or phrase to the positions of the other words in a text.

The formula we use is
prominence = ($totalwords - (($positionsum - 1) / $positionsnum)) * (100 / $totalwords)

$totalwords = total number of words in string

$positionsum = the sum of each position of the word we are analyzing
(example: if a word occurs on position 2 and 5 $positionsum is 7)

$positionsnum = the number of positions at which the word occurs

The prominence of one word in the first position, in a ten word sentence having unique words only, is
(10 - ((1 - 1) / 1)) * (100 / 10) = 100%.

If that same word were the last word in the sentence, its prominence would be
(10 - ((10 - 1) / 1)) * (100 / 10) = 10%.

If that same word occurred twice, at positions 1 and 10, its prominence would be
(10 - ((11 - 1) / 2)) * (100 / 10) = 50%.
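
The formula and the three worked examples can be checked with a short sketch. The Python function below is only a transcription of the formula given above; the function name and the use of a list of 1-based positions are assumptions made for this illustration.

```python
def prominence(total_words, positions):
    """Prominence (in percent) of a word, given its 1-based positions."""
    if total_words <= 0 or not positions:
        raise ValueError("need a non-empty text and at least one position")
    position_sum = sum(positions)   # corresponds to $positionsum
    positions_num = len(positions)  # corresponds to $positionsnum
    return (total_words - (position_sum - 1) / positions_num) * (100 / total_words)


print(prominence(10, [1]))      # 100.0
print(prominence(10, [10]))     # 10.0
print(prominence(10, [1, 10]))  # 50.0
```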