The Past, Present and Future of What We’ve Come to Know as SEO

By Mike Grehan

Fifteen years ago, I wrote a book about search engine optimization that was widely regarded as the most comprehensive of its kind. I had written an earlier edition in 1999, but between that edition and the updated version I became absorbed in the science of information retrieval and was able to add much more detail about the science and technology on which Google based its search results. I continue to study and observe advances in the field, and that in itself has frequently helped me stay ahead of the game when it comes to what we know as search engine optimization (SEO).

Back in the Nineties, when I first started manipulating search engine results (yes, that’s exactly what I was doing) it wasn’t called SEO. In fact, there wasn’t really a term for it. There weren’t even that many people doing it. A kind of small cottage industry was beginning to emerge. We stuffed pages with keywords and did some hypertext-markup-tinkering to create what we called “doorway pages” and the end game was simple: Get indexed and rank somewhere in the first two pages of search engine results.

In the past, text was the strongest signal for ranking purposes. If the text in a query matched the text on a page, then the page was a candidate for ranking. However, “vocabulary mismatch” is a common phenomenon in natural language use. It occurs when different people name the same thing or concept differently. Early research showed that, on average, 80% of the time different people (even experts in the same field) will name the same thing differently. There are usually many possible names that can be attributed to the same thing. This research motivated the work on latent semantic indexing.

No, I won’t explain latent semantic indexing here. But to put it in context, a 2012 quantitative study of the vocabulary mismatch problem in an information retrieval setting determined that an average query term fails to appear in 30-40% of the documents that are relevant to the user’s query. Yes, up to 40% of relevant documents for any given query don’t contain the query’s words anywhere in their text. This is why Google has become so adept at using “query expansion” techniques.
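To make the idea concrete, here is a toy sketch of synonym-based query expansion. The synonym table and matching logic are invented for illustration; this is the general technique, not Google's actual method, which operates at vastly greater scale and sophistication.

```python
# Toy synonym table -- entirely made up for this example.
SYNONYMS = {
    "car": {"automobile", "vehicle"},
    "cheap": {"inexpensive", "affordable"},
}

def expand_query(terms):
    """Return the original query terms plus any known synonyms."""
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded

def matches(query_terms, document_text):
    """A document matches if it contains any expanded query term."""
    doc_terms = set(document_text.lower().split())
    return bool(expand_query(query_terms) & doc_terms)

# "cheap car" now matches a page that only says "affordable automobile" --
# a document the raw query terms would have missed entirely.
print(matches({"cheap", "car"}, "an affordable automobile for students"))
```

Without expansion, that query and that document share not a single word: exactly the vocabulary mismatch problem the study above quantified.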

Looking back, even though a search engine crawler seemed like such an advanced piece of technology, it was actually almost primitive at the time. And crawlers were so easily foiled by just about any change in web development technology. That’s what really spawned the industry: an army of what could effectively be seen as hypertext remodeling workers. Masters of the angle-bracket, tag-technicians, keyword-explorers and tweakers-of-text. And that was before Google arrived with its all-new, fancy, hyperlink-induced algorithms. When that happened, the emerging industry, which had conquered the “on page” optimization process, focused squarely on links, links and more links.

I still get asked about links so often. People still wonder why one web page linking to another is so hugely important. In the simplest of terms, Google based its algorithm fundamentally on citation analysis. In the academic world, if a number of recognized experts in a given field all cite your paper, then basically you’re recognized as an authority on that particular subject. That’s where the term “authority site” came from. So when one web page links to another, two basic assumptions can be made. First, one page is giving a vote, as such, to the other. And second, they’re perhaps both focused on the same subject. You can throw a little network theory in here too with the observation all those years ago of “cyber communities” forming (or birds of a feather and all that).
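The citation-analysis idea can be sketched in a few lines as a minimal PageRank-style power iteration over a toy link graph. This illustrates only the core "links as votes" mechanism; real-world ranking blends hundreds of additional signals, and the damping value here is simply the commonly cited 0.85.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start with equal rank
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                     # dangling page: share evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:                                # split this page's "vote"
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
# C collects votes from both B and D, so it ends up ranked highest.
print(max(ranks, key=ranks.get))
```

Notice that a vote from a highly ranked page is worth more than a vote from an obscure one, which is precisely the academic-citation intuition: endorsement from a recognized authority counts for more.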

So, the art of SEO as it had become was a developing mixture of keyword analysis, tinkering with web pages, keeping your web server in check and, for ranking purposes, links and rich link anchor text. However, although ranking was supposed to be mainly influenced by the “democratic nature of the web,” as Google put it (meaning she with the best-quality links wins), there was actually something totally “undemocratic” going on. Where did the millions and millions of end users who had no website, and therefore no web pages to link from or to, fit into all of this?

You know, I always thought that was a little like saying that the people who make televisions get to decide what you should see on them. Quality is a subjective thing. But who’s better to decide on that? The web page authors creating content and linking to and from it, or the end users consuming it?

It was inevitable that end-user data had to be folded into the mix at some point, in exactly the same way that TV networks look at audience data to judge and rank the popularity of specific programming. For Google, “relevance feedback” has always been a signal to determine what content satisfied the information need of the end user. This is implicit data gathered on a massive scale.
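As a rough sketch of how implicit feedback could be folded into a ranking, here is a toy re-ranker that blends a text-relevance score with observed click-through rate. The blend weight and the signals are invented for illustration; this shows the idea of relevance feedback, not any engine's actual formula.

```python
def rerank(results, clicks, impressions, weight=0.3):
    """results: list of (doc_id, text_relevance) pairs, relevance in [0, 1].
    Blends the text score with click-through rate (clicks / impressions)."""
    def score(item):
        doc_id, relevance = item
        ctr = clicks.get(doc_id, 0) / max(impressions.get(doc_id, 1), 1)
        return (1 - weight) * relevance + weight * ctr
    return sorted(results, key=score, reverse=True)

results = [("page_a", 0.9), ("page_b", 0.8)]
clicks = {"page_a": 5, "page_b": 90}
impressions = {"page_a": 100, "page_b": 100}
# page_b's far stronger click-through rate overtakes page_a's text score.
print(rerank(results, clicks, impressions)[0][0])
```

The point is that the audience, not just the link graph, now gets a vote: a page that looks relevant on paper but leaves searchers cold loses ground to one users demonstrably prefer.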

Over the years, many strategic approaches to information retrieval have been developed within the science. Language models are used, as well as probabilistic retrieval, Boolean indexing, latent semantic indexing, inference networks, neural networks, fuzzy set retrieval and genetic algorithms. These approaches are based on a multitude of different mathematical constructs. For a human to develop the perfect blend of all of these things to provide the most relevant results would be a huge task. But perhaps not so in the realm of what’s known as machine learning.

Google took the classic information retrieval models, and under the guidance of search-master Amit Singhal scaled these approaches to match the modern web era. However, underlying this is Google’s major investment in machine learning and steps towards artificial intelligence. A cultural split has occurred at Google between the “retrievers” (those with an information retrieval background) and the “learners” (those with a machine learning background). The “retrievers” hard coded the search ranking technology (based on hundreds of signals) as far as it could go. But in 2014, the “learners” moved into the ranking team. And the first thing they did was focus on end user behavior using an artificial neural network to create a new ranking score. In April 2015, a whole new machine learning component called “RankBrain” was added to the ranking mechanism.

It’s hard to discuss the future of search without understanding how we live in a world of algorithms now. Algorithms run your cell phone, they’re in your computer, in your house, in your appliances, in your car, and your banking data and medical records are a huge tangle of algorithms.

I say this because it’s hard to be in the industry we’re in if you don’t fully understand the power of the algorithm. No, you don’t need to be a scientist or a programmer. But, rather like driving a car, it’s useful to know a little about how the engine works, not just how to drive.

Over time, computer scientists build on each other’s work. And this has certainly happened at Google. Algorithms combine with other algorithms to use the results of other algorithms, which in turn produce results for more algorithms.  Each algorithm has an input and an output. You put something in, the computer does what it does and out comes a result. But with machine learning, something entirely different happens.

With machine learning you enter the data and the desired result. And then out comes an algorithm that turns one into the other. These are learning algorithms – or learners for short – and they’re algorithms that make other algorithms. With machine learning, computers write algorithms so humans don’t have to.
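That "data plus desired result in, algorithm out" loop can be illustrated with the simplest possible learner: one that fits a one-feature decision stump from labeled examples. This is a deliberately tiny stand-in for machine learning in general, not a depiction of RankBrain; the dwell-time framing is an invented example.

```python
def learn_threshold(examples):
    """examples: list of (value, label) pairs with boolean labels.
    Returns a classifier function -- the 'algorithm' the learner wrote."""
    best_threshold, best_correct = None, -1
    for candidate in sorted(value for value, _ in examples):
        # Count how many examples this cut-off classifies correctly.
        correct = sum((value >= candidate) == label
                      for value, label in examples)
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return lambda value: value >= best_threshold

# Desired result: long dwell times on a page count as "satisfied" visits.
training = [(2, False), (5, False), (40, True), (90, True)]
is_satisfied = learn_threshold(training)   # the learned algorithm
print(is_satisfied(60), is_satisfied(3))
```

No human wrote the rule "a visit of 40 seconds or more means satisfied"; the learner derived it from the data and the desired result, which is the whole point of the paragraph above.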

Ranking signals do not remain static. They’re fluid and change with time and context and geography and behavioral feedback. And as the learner builds around all of this, it begins to look more actively beyond basic relevance to the query, maximizing the usefulness of the search results for the user who input the query.

Not all queries are intended to end in a transactional result in the sense of a financial transaction.  As digital marketers, we occupy our minds way too much with this and focus way too much on trying solely to connect with an end user at the checkout.

And yet, what if we simply want to stimulate ourselves, change our mood, maybe find a funny video clip to laugh at, maybe we simply want to look at some nice pictures or, who knows, maybe we want a tutorial on how to build a house. In order for Google to help us in so many ways, understanding intent is the most important factor.

We really have entered a brand new era of search. And that means a new era of SEO. If, in fact, that’s what it should still be called. The job has changed so much from those early web-page-tinkering days. If it’s time for Google to move forward with new, faster learning technology, identifying so much more about the end user’s information need than simply words on a page and how many links that page has pointing to it, then SEO must change with it.

Each year, Google publishes a founders’ letter. For 2016, the task fell to new Google CEO Sundar Pichai. In it, he said: “When Larry and Sergey founded Google in 1998, there were about 300 million people online. By and large, they were sitting in a chair, logging on to a desktop machine, typing searches on a big keyboard connected to a big, bulky monitor. Today, that number is around 3 billion people, many of them searching for information on tiny devices they carry with them wherever they go.”

It’s all about context and content. Who you are, where you are, what time of day it is and, of course, previous search behavior. Our job is less about helping Google index the web 1999 style. It’s less about worrying about the penalty of buying links to beat the competition in the SERPs. It’s so much more about creating useful content experiences on the user journey. It’s about being there in the moment.

Maybe we should be thinking about ourselves as content experience analysts (CEA) concerned more with human interaction and engagement. Perhaps now really is the time to focus on optimizing for humans and not for machines.

Mike Grehan is CMO and Managing Director of Acronym and Chairman of SEMPO, the worldwide search marketing organization.

Comments 38

  1. Eric Ward

    It’s a Catch-22. The earned link graph tends to steer users to certain documents (whether via search or serendipity), which then leads the end user to interaction with those documents, furthering their perceived value and likely influencing AI. A vicious circle. The massive link graph is too valuable to disregard, yet favors the favored, much like you elegantly wrote about in “Filthy Linking Rich”. As a linking strategist, my goal is to help the less favored but excellent content earn the signals (the most powerful of which is still links), so that it has a fighting chance. And perhaps most importantly of all, a site or page or app must develop traffic paths that don’t originate via a Google search. If a site’s business model and traffic is too heavily dependent on what Google will give it organically, they are toast. Yet thousands of sites do just fine without Google sending them significant traffic. Perhaps the single most important skill a digital strategist can offer today is helping sites thrive regardless of what Google sends them.

  2. Ammon Johns

    Quote: “I always thought that was a little like saying that the people who make televisions get to decide what you should see on them”

Well, using the TV analogy, you can often measure the importance or popularity of one programme or series by noting how many others mention it, cite it, or parody it. Whether it’s news channels reporting on actors notable for it, interview shows, comedy, whatever.

    Does this actuality of how TV works sometimes get abused? Of course. We all know that certain TV shows tend to favour guests from other shows created by the same channel, or the same producers. But that’s an endemic kind of bias that happens everywhere in life.

    I’d actually argue that one of the biggest improvements in search has been adding in bias deliberately – personalisation. All those efforts to localise, to contextualise, are all forms of bias that make search results seem better by (largely) matching them to our own biases.

    Isn’t that exactly why AI is so important? To make machines less machine-like, less impartial, and more able to emulate bias (personal opinion and taste). Pure machine logic could never be perfect, because we humans actually like our biases.

  3. Mike Grehan

    Eric,

    I agree that links are still very important. A genuinely strong signal in the mix. But my point still remains that the end user “vote”, whether that’s clicks from the SERPs or direct navigation monitored via browser or toolbar data, adds verification and an indication of popularity. But the fact also exists, as you say yourself, you can’t rely on Google solely to power your business. As I mentioned recently, there are 1.67 billion (perhaps more now) active users at Facebook. How could you ignore that?

  4. Roy Alawner

    Before using the TV analogy, you said in this memorable article:

“Quality is a subjective thing. But who’s better to decide on that? The web page authors creating content and linking to and from it, or the end users consuming it?”

    In my opinion, the users.

But how might Google obtain information about users’ intent? Maybe with data from Facebook, but that data would be very expensive for Google.

  26. aaron

    Great article Mike 🙂

    “how Google may obtain the information about the users intent?. Maybe with data from Facebook”

    They don’t need Facebook data to capture user intent. Search queries are an expression of user intent. Rather it is tracking the subsequent user behavior after the search which determines how well a particular document or website satisfies a user’s intent for that particular query.

    Further, Google has moved beyond the search box with Chrome & Android, allowing them to passively collect browsing & usage data in apps and across the web. They know not just what you search for & what ads you click on, but also where you are located, what YouTube videos you watch, which games you play on your cell phone, if they are losing their footprint in a particular category due to a growth in installs of the Yelp app or similar, etc.

    Google views Facebook as their #1 enemy in online advertising. Both are rushing to create proprietary closed versions of the web via things like Facebook Instant Articles and Google AMP. In the past the strategy was driven by sending users away as quickly as possible to the best possible solution, knowing that by doing so the user will keep coming back. Now it is about sucking in as much of the value chain as possible (knowledge graph, instant answers, Google Now, other interactive features like in-SERP solitaire, etc.) so the user never has to leave. It is quite easy to track user behavior if you host the content they consume / engage with.

  36. Sonni Quick

    Fascinating. I started a blog 2+ years ago on a subject that affects millions of lives. It has heavy content. mynameisjamie.net. It’s on the prison industry. I searched out other blogs that wrote about this to see what they were doing and how it affected them. I also looked at many other blogs. Initially I knew absolutely nothing about what to do. Some write a blog like a diary. They only reach out to other bloggers using the same platform. They like each others blogs and comments had 1-4 words like “Great blog” “Thanks” and “You’re welcome.” Comments you can’t reply to.

    I wanted a broader appeal. That is when I realized “category” and “tag” were not the same, but most bloggers use them the same and even put the same 25 tags on every post regardless of what the post is about. I started studying and reading everything I could find until it started making sense. I have a ways to go. I also now have a newsletter and I’m learning how to make that work with the blog as I create a mailing list and using fb to promote the blog and newsletter as it slowly comes together. I am also done the first draft of a book on the life of an inmate and what is happening to him inside. The timing is perfect. Prison reform is hot in the media. I don’t want it to slowly build. I want to kick it up a notch. I don’t have much to put into it money wise because to hire an editor I will have to sell my jewelry. I put a few dollars into fb ads of my writing and likes and shares are growing.

    I wrote all of this just to tell you I learned a lot from your article. Part of every day goes to learning. Thank you.
