1 / What process, what consequences?
Signing in with a login and password opens a secure session, which means browsing the domain in https mode until the user signs out.
This applies to all types of services, including email accounts and social networks, which is of course absolutely necessary.
Using social networks like Twitter or Facebook is made easier by the fact that the user signs in once, early in the day for example, and stays connected without any constraint. In this respect, it is logical and legitimate that Google+ does the same.
When you open a secure session with a Google account (a single account covers Gmail and Google+), you are signed in on the domain “google.com”. Any search you then type on www.google.com is made in protected mode, i.e. on https://www.google.com, so the data is encrypted.
Excerpts from http://googleblog.blogspot.com/2011/10/making-search-more-secure.html on the official Google blog:
“…you will find yourselves redirected to https://www.google.com (note the extra “s”) when you’re signed in to your Google Account. This change encrypts your search queries and Google’s results page…. / …you can also directly navigate to https://www.google.com if you’re signed out or if you do not have a Google Account.”
What are the consequences?
The most obvious consequence, and the most problematic one, is of course that the keywords typed in these queries (searches on google.com while signed in to a Google account) can no longer be recovered. This means losing data that was more or less usable for an SEO strategy.
The official consequence is of course data protection, which is not objectionable in itself. Even so, many voices in the community of SEO specialists and web analysts have questioned the consistency of this explanation, especially since paid search is not affected by the “not provided” keywords rule.
Who is affected?
I would answer: everyone, but to a greater or lesser extent, depending on which Google domain extension (.com or national) visitors have used for their search queries.
In short, the more a site's traffic comes from searches made on google.com, the larger the share of encrypted searches, and the greater the loss of information.
This means that a U.S. site is very exposed (recent statements indicate a loss of over 30% at the moment). A UK site is a little less exposed, because it is often reached via google.co.uk (the figures reported so far are slightly above 10%).
An English-language site, being most easily reached via google.com regardless of where it is based, will be more affected than a site in another language.
A site exclusively in French or Spanish, for example, will be little affected (currently less than 3%), whereas the English version of that site will be affected much more (for the same reason: more access via google.com).
To check this, simply sign in to your Google account (e.g. Gmail), then open one tab on a national Google domain such as .fr, .de or .co.uk (the search will be done on that domain) and another tab on google.com (the search will be made on https://www.google.com).
2 / SEO: Is it possible to limit the consequences?
Data that is provided to the SEO specialist by the web analytics tool:
- Before the switch to HTTPS:
o Complete list of key expressions used in search engines to reach the site
o Complete list of landing pages for these queries
o All behavioral and qualitative data on these two types of elements
- After the switch to HTTPS:
o A single line (not provided) for the encrypted queries
o Complete list of key expressions for unencrypted queries
o Complete list of landing pages for these queries
o Behavioral and qualitative data only for these two lists
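To illustrate how the single (not provided) line arises, here is a minimal Python sketch of the kind of classification a web analytics tool might perform on the referrer of each visit. It assumes that an encrypted search still sends a Google referrer, but with the q parameter empty or missing; the function name search_keyword and the sample URLs are hypothetical, and the host test is deliberately simplistic.

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs

def search_keyword(referrer: str) -> Optional[str]:
    """Return the search keyword carried by a Google referrer.

    Returns None when the referrer is not a Google domain, and
    "(not provided)" when the q parameter is empty or absent
    (assumed here to be the signature of an encrypted search).
    """
    parts = urlparse(referrer)
    host = parts.hostname or ""
    # Simplistic check: any host with a "google" label (google.com, google.fr, ...)
    if "google" not in host.split("."):
        return None
    # keep_blank_values=True so that an empty "q=" is still visible
    params = parse_qs(parts.query, keep_blank_values=True)
    keyword = params.get("q", [""])[0].strip()
    return keyword if keyword else "(not provided)"

# Hypothetical referrers, for illustration only
print(search_keyword("http://www.google.fr/search?q=web+analytics"))
print(search_keyword("https://www.google.com/url?sa=t&q=&esrc=s"))
```

Every visit falling in the second case ends up aggregated under the same label, which is why the report shows one line instead of a keyword list.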
What is the impact on SEO analysis?
The loss of visibility on a significant portion of keywords certainly poses serious problems for the analyst, notably for content optimization and long-tail analysis.
If we accept that this change is here to stay (even though we all hope Google will reverse it), we must strive to provide answers, even partial ones. I have read the article by Avinash Kaushik on the subject and I totally agree with Benoît Arson and Stephane Hamel’s comments.
But I think it is possible, not to guess the hidden keywords, of course, but to bring out strong trends that are usable in SEO. The study of landing pages seems to offer very interesting insight in this direction.
Let’s consider the following:
- 30% (not provided)
- 70% (keywords)
- Main landing pages: page A (21.9%), page B (21.6%), page C (15%)
First check: are these scores consistent with recent history (for the same scope and the same content)? If so, there is no problem. If not, proceed as follows:
Create two segments, [not provided] and [keywords], and apply them to the “landing pages” analysis.
- If each segment gives a result comparable to the global results above, there is no problem and our SEO strategy can be pursued
- If we find significant differences, for example:
o On [not provided]: page A (31%), page B (16%), page C (15%)
o On [keywords]: page B (24%), page A (18%), page C (15%) (you can check this: weighting these scores by 30% and 70% gives back the distribution quoted at the beginning of the example)
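The check suggested in the parenthesis can be sketched in a few lines of Python. This is only an illustration using the figures of the example; the names WEIGHTS, SHARES and blended_shares are ours and do not come from any analytics tool.

```python
# Weight of each segment in total search traffic (from the example)
WEIGHTS = {"not provided": 0.30, "keywords": 0.70}

# Share of entries per landing page within each segment, in percent
SHARES = {
    "not provided": {"page A": 31.0, "page B": 16.0, "page C": 15.0},
    "keywords": {"page A": 18.0, "page B": 24.0, "page C": 15.0},
}

def blended_shares(weights, shares):
    """Weighted average of each page's share across the segments."""
    pages = next(iter(shares.values())).keys()
    return {
        page: round(sum(weights[seg] * shares[seg][page] for seg in weights), 1)
        for page in pages
    }

overall = blended_shares(WEIGHTS, SHARES)
for page, share in overall.items():
    print(f"{page}: {share}%")
```

Page A, for instance, comes out at 0.30 x 31 + 0.70 x 18 = 21.9%, the global figure given at the start of the example.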
In this case, we have found a behavioral difference between the two audiences. To learn more about this distortion, and thus try to correct it, we need to see which pages are underrepresented and which are overrepresented in each segment.
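Share of entries per landing page ([not provided] = 30% of visits, [keywords] = 70%, all visits = 100%):
- Page A: 31% in [not provided], 18% in [keywords], 21.9% overall
- Page B: 16% in [not provided], 24% in [keywords], 21.6% overall
- Page C: 15% in [not provided], 15% in [keywords], 15% overall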
Page A is overrepresented in [not provided]
Page A is underrepresented in [keywords]
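One possible way to make “overrepresented” and “underrepresented” concrete is to divide each page's share within a segment by its overall share, and to flag ratios that move away from 1. The Python sketch below does this with the figures of the example; the function name representation and the 10% tolerance are assumptions of ours, to be adjusted to your own data.

```python
# Overall share of entries per landing page, in percent (from the example)
OVERALL = {"page A": 21.9, "page B": 21.6, "page C": 15.0}

# Share of entries per landing page within each segment, in percent
SEGMENTS = {
    "not provided": {"page A": 31.0, "page B": 16.0, "page C": 15.0},
    "keywords": {"page A": 18.0, "page B": 24.0, "page C": 15.0},
}

def representation(segment_shares, overall, tolerance=0.10):
    """Label each page by comparing its segment share to its overall share.

    tolerance is an arbitrary threshold (10% here): ratios within
    1 +/- tolerance are considered in line with the global result.
    """
    labels = {}
    for page, share in segment_shares.items():
        ratio = share / overall[page]
        if ratio > 1 + tolerance:
            labels[page] = "overrepresented"
        elif ratio < 1 - tolerance:
            labels[page] = "underrepresented"
        else:
            labels[page] = "in line"
    return labels

for segment, shares in SEGMENTS.items():
    print(segment, representation(shares, OVERALL))
```

With these figures, page A has a ratio of about 1.4 in [not provided] and about 0.8 in [keywords], which matches the two observations above.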
Finally, we can assume that landing pages are optimized for well-known, precise and dedicated keywords. A page that is overrepresented or underrepresented in a segment therefore points to the keywords it targets, which allows us to work on SEO and on targeted marketing actions.
This approach has the advantage of being based on observations rather than assumptions, but it is still not a complete solution. We can only hope that Google will respond to the massive demand from web experts and help solve this problem. I will therefore sum up the views of many with this tweet from Jacques Warren: “Quite muddy. Would be easier to just put back the keywords”.