Reconsideration Requests

Google+ Hangouts - Office Hours - 05 December 2014


Transcript Of The Office Hours Hangout

JOHN MUELLER: OK. Welcome everyone to today's Google Webmaster Central office hours hangout. My name is John Mueller. I'm a Webmaster Trends Analyst here at Google in Switzerland. And I'd like to help answer some of the webmaster web search related questions that have been coming up over the past week or so. Lots of questions were submitted already, so I'll try to go through some of those where I can, but if one of you wants to get started and ask the first question, feel free to jump on in.

AUDIENCE: Hey, John. [INAUDIBLE]. The first question. This is a rel=canonical issue I've been having on the website I told you about in the medical niche. We've been using rel=canonical to restrict Google from getting the filtered categories, so we just want to show categories, not the categories with filters applied, because we already have landing pages for the most important filters. We just don't feel the need to show every combination of filters. So, basically, this is the URL that's showing the issue. We've applied rel=canonical for three months, I think it is now. And all these pages are still showing in the results. I also used Webmaster Tools to actually block [INAUDIBLE] from-- I'm not sure what it exactly blocks. I guess it blocks indexing or [? crawling. ?] When you go to the filters part in Webmaster Tools, it has URL parameters.

JOHN MUELLER: The URL parameters, yeah.

AUDIENCE: Yeah. I blocked [INAUDIBLE] from the [INAUDIBLE], filters, and nothing changed. I see there's no [? cached ?] version. I don't know if that means anything, but, basically, I don't think there's a need for those pages to exist in the index. Some of them actually just become one product [? listing. ?] So any idea why this wouldn't work? Could it be some on-page related issue?

JOHN MUELLER: So, I guess there are two aspects there. On the one hand, we don't crawl all pages equally quickly or equally frequently. So some of them we'll crawl very often, every day or so. And others might be crawled every half year. So this is something that, I think, from the first glance looking at this-- I think these are essentially URLs that we crawl extremely rarely. So we know about them. We know that they exist, but we don't treat them as something extremely important. So we don't crawl them that frequently, and I think, from that point of view, that's not something you'd have to worry about. The other aspect is that, with the rel=canonical, we still have to index that URL first before we can follow the rel=canonical. So what will happen is we'll index that URL. We'll kind of go through the content. And then we'll follow the rel=canonical. So there's always this period of time where we might have this other version indexed normally before we actually forward all the signals to the canonical that you specified. And, in both of those cases, that's not something I'd really worry about. Like, the query you gave me is a site query with an in-URL part. And that's a pretty artificial query. That's not something users will do. So what I imagine is, if you search for some of those titles, you'll find the kind of normal versions that we actually do index for your site.
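As a sketch of the setup being discussed, a rel=canonical hint on a filtered category page points search engines at the unfiltered category page; the URLs here are hypothetical examples, not from the questioner's site:

```html
<!-- On a filtered page such as /category/shoes?color=red (hypothetical URL),
     placed in the <head>, pointing at the unfiltered category page: -->
<link rel="canonical" href="https://www.example.com/category/shoes">
```

Note that, as described above, the filtered URL may still be crawled and indexed before the hint is processed, and it can still surface for explicit site: or inurl: queries.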

AUDIENCE: OK. Are there any situations where Google might actually ignore rel=canonical-- [INAUDIBLE] two rel=canonicals on the same page or something like that?

JOHN MUELLER: We try to follow it where we can, but we will ignore it for things like when we see that it's completely incorrectly set up. So when we see that the whole website has a rel=canonical to the home page, where clearly the webmaster made a mistake with copy and paste, that's something where we'll say, well, we see this markup, but we're going to treat it as a signal. We're not going to treat it as a directive. And we're probably going to ignore it in a case like that. But, for the most part, if you use it within your website normally, then we'll definitely try to follow it. And we'll try to kind of keep those URLs from having, let's say, collected signals. We'll try to forward those signals to the rel=canonical that you specify. But that doesn't mean that we'll drop that URL completely from the search results. So, specifically, if you do a site query, if you do a complicated in-URL query, you're very likely still to see those URLs, because we know about them. We can see that you're trying to look for them. So we'll say, well, if he really wants to look for these URLs, we'll show them to him. And that's probably, to some extent, what you're seeing.

AUDIENCE: OK. Yeah, that makes sense. Thanks.

JOHN MUELLER: Sure. All right. Let's jump through some of the questions here. And if you guys have any comments or questions along the way, feel free to jump on in.

MENASHE AVRAMOV: Well, I have a question.

JOHN MUELLER: OK. Go for it.

MENASHE AVRAMOV: Google changed my title for mobile and shows a different title on desktop search and on mobile search-- a shorter one on mobile. What's the best recommendation for a [INAUDIBLE] mobile-friendly title for Google?

JOHN MUELLER: We kind of have to shorten them on mobile because we have less room. So that's probably what you're seeing there. And, I guess, the question is if you want to create short titles on your pages, in general, or if it's OK the way that they kind of got shortened.

MENASHE AVRAMOV: Well, it's a generic keyword name, like home, for mobile. It's not something that [INAUDIBLE] relevant for mobile users.

JOHN MUELLER: So, like, for the homepage, or--

MENASHE AVRAMOV: Yeah, for the homepage.

JOHN MUELLER: Yeah, that sounds like a bad example. So, if you want, feel free to post it in the chat, and I'll pass it onto the team afterwards.


JOHN MUELLER: OK, here's a question about "dub dub dub, non-dub dub dub. The data shown for two sites is different in Webmaster Tools, I assume, but, in essence, it's the same site. Why is the data not consolidated into one view? Does this mean only part of the data is sent to analytics when connected?" So, in general, what happens with websites when you have different versions of your website with dub dub dub, non-dub dub dub, or HTTP, non-HTTP, or HTTPS, I guess, is we'll try to pick one of these versions as the canonical version. And we'll try to collect all of our signals, all of our data there. And usually, that works out fairly well, even if the webmaster doesn't specify anything specific. And you'll find all of the data in Webmaster Tools in that specific version. If you see that we're not picking the right version, or that we're kind of picking different versions depending on which URL we're crawling, you can also use the rel=canonical. You can use any of the other canonicalization methods to really let us know that this is the URL that you actually do want to have indexed. And when you do that, over time, all of that data will be essentially shown in Webmaster Tools under that version that you specify. So, essentially, over time, this should settle down into one of these versions, so you can primarily check that version. Until then, I'd just check both of these versions, depending on how you have your website set up. This also affects sites that have the mobile version under a different subdomain, for example. Like, some sites have an m-dot subdomain for the mobile version, and dub dub dub for the desktop version. And for those sites, as well, you need to kind of verify those versions in Webmaster Tools so that you have a complete set of the data. At the moment, we're kind of treating this as a technical separation. If you have different subdomains, those are essentially different sites.
If you use different protocols, those are essentially technically different sites. And theoretically, you could have different content on them. That's why we kind of collect this data separately. And we've seen in the past, for example, some e-commerce sites explicitly do that-- they put some of their content on the HTTP version and some other parts of the content on the HTTPS version. And if that content is clearly separated, then we need to track that data separately. I imagine over time, in the long run, we'll find a way to kind of consolidate all of this information into one website view. But I don't see that coming anytime soon. So, in the meantime, you're going to have to kind of get used to looking at these different versions. Make sure you're checking them appropriately. Also make sure you're doing the right settings in the version that's actually canonical. So if you, for example, set your geotargeting in the dub dub dub version, but we're actually indexing the non-dub dub dub version, then we might not be using your geotargeting settings. So you really need to make sure that the settings you give us are in the version that we're actually indexing.

"What do I do when a site won't even rank for its own domain name, even after it's been optimized over and over again and never received a penalty?" This is, I guess, always a tough situation. Usually this happens if there are really, really strong signals on our side that are saying that we can't trust this website really at all. That's something where maybe, from the quality point of view, there's something really problematic here. Maybe there's something from a web spam point of view that's really problematic with this website. Sometimes we also see this question when you have a very generic domain name. For example, if you call your website, and you expect to rank for the query cheap loans, then that's not automatically going to happen.
So just because it's your domain name doesn't mean we're going to rank your site for that. So, on the one hand, really make sure that you have everything covered from top to bottom, that things are actually working as they could be working. And then make sure that you're looking at it realistically, that your domain name isn't something that's so generic that we just have so many other options to show in the search results that we might not even get to yours if your site is fairly new, for example, or if it hasn't built up a really strong reputation over time.

"Changing URL structure from category to flat-- does that matter for SEO? For example, /categorya/categoryb/pageurl.html to /pageurl.html directly. I've noticed some sites using exactly the same URL structure." I think there are multiple aspects to this. On the one hand, it's important to make sure that you're giving us something that has a unique URL structure in there, so when we crawl those pages, we can look at those URLs and we can say this is a really unique identifier for those URLs. This is something we sometimes see as problematic for sites that rewrite their URLs. They internally have a URL structure that works with IDs or with names. And internally, it works with URL parameters, for example. But that's kind of rewritten in a way that makes it look like there's actually a physical directory structure there. And a lot of times, we'll see that people do this rewriting in a way that's sub-optimal, in that there's a large or almost infinite number of combinations that lead to the same content, where maybe you can switch the different parts of the path around, or maybe the whole path is irrelevant. And, actually, there's an ID way at the end that actually specifies what the page is. Upper or lowercase might be totally irrelevant. And in cases like that, we might crawl a lot of variations and all come up with the same content. So that's something to keep in mind when you're looking at your URL structure.
When you're comparing things like a flat URL structure to a folder-based URL structure, in general, that's not something I'd worry about. That's essentially equivalent from our point of view. Sometimes it's easier to work with a flat structure. Sometimes it's easier to work with a structure that has separate folders. So that's essentially something we leave up to you.
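The rewriting pitfall described above, where many path variants resolve to the same underlying ID, can be sketched with a small normalization routine. The /product/&lt;id&gt; URL scheme and the trailing-ID convention here are hypothetical assumptions for illustration, not anything Google prescribes:

```python
from urllib.parse import urlparse

def canonical_product_url(url: str) -> str:
    """Collapse path variants like /CategoryA/CategoryB/widget-123
    down to one canonical form keyed on the trailing product ID.
    The /product/<id> scheme is a hypothetical example."""
    path = urlparse(url).path.lower().rstrip("/")
    slug = path.rsplit("/", 1)[-1]          # last segment, e.g. "widget-123"
    product_id = slug.rsplit("-", 1)[-1]    # trailing ID, e.g. "123"
    return f"/product/{product_id}"

# Without normalization, each of these variants is a separately crawlable URL
# for the same content:
assert canonical_product_url("/a/b/widget-123") == "/product/123"
assert canonical_product_url("/B/a/Widget-123/") == "/product/123"
```

A site that canonicalizes this way server-side (via a redirect or a rel=canonical generated from the normalized form) avoids exposing an almost infinite set of crawlable variants.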

AUDIENCE: Quick follow-up with that, John. So does rel=canonical usually help with these situations where, let's say, a product page [INAUDIBLE] URLs based on what category it's accessed from? Does that usually help? And I've noticed on CMSs, for example, as I said, you can access the product from multiple categories. It generates multiple URLs. [INAUDIBLE] rel=canonical with the categories removed, so it's just [INAUDIBLE] product, that HTML. But that URL isn't actually accessible from anywhere through the website, outside of that rel=canonical. So you can't really access it directly. You can just [INAUDIBLE] get to it through rel=canonical. Would that be an issue or a Google [INAUDIBLE]?

JOHN MUELLER: So, on the one hand, with the rel=canonical, one thing to keep in mind is we still have to re-crawl all of these pages to find the rel=canonical. So it's not like a redirect that leads us directly to the final page. So if you're limited in your server capacity and you don't want Googlebot to crawl unnecessarily, then maybe a redirect is better there. But, on the other hand, if you have a limited number of combinations that lead to the same page, then that might not be that problematic. So if you have, let's say, four or five different category combinations that all lead to the same product, and it has a rel=canonical to the product URL, then those five URLs are probably not going to cause problems on your server. On the other hand, if you have an infinite number of combinations that all lead to the same product URL, then that could potentially cause problems. So, for example, we've seen things like session IDs in the past. So you have the product name, and then a session ID, and then the actual product ID, and those are the kind of things where we almost have to crawl an infinite number of URLs to even notice that there's a problem here. So that's something to really watch out for. But if you have a limited number of categories, and they all lead to the same product, that's not something I'd worry about.
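As a hedged sketch of the redirect alternative mentioned here, assuming an Apache server with mod_rewrite and a hypothetical sessionid query parameter; note this rule drops the entire query string, which only makes sense if the session ID is the only parameter in play:

```apache
RewriteEngine On
# If the query string contains a session ID (hypothetical parameter name)...
RewriteCond %{QUERY_STRING} (^|&)sessionid= [NC]
# ...301-redirect to the same path with the query string stripped, so crawlers
# land on the clean URL directly instead of via a rel=canonical round trip.
RewriteRule ^ %{REQUEST_URI}? [R=301,L]
```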

AUDIENCE: And is it an issue that the rel=canonical target isn't actually accessible through the website, the structure itself? It's just [INAUDIBLE] rel=canonical target?

JOHN MUELLER: In general, that's not a problem. Sometimes what will happen is we have almost like a mixed set of signals that we see all of the links point to one URL. And that URL has a rel=canonical pointing at a different URL that has no links. And then we can have this mixed bag of signals where we say, well, everything is pointing at this URL, but this is actually the one you want indexed. But that one is saying you actually want a different one indexed. So that's a kind of situation where our algorithms will have to make some kind of a judgment call. And it's not 100% certain that we'll always follow the rel=canonical in a case like that. We'll try. But if it's really the case that none of the links point to the actual page, then that's something where I wouldn't guarantee that we'd always pick that page.

AUDIENCE: Oh, OK. Because Magento actually uses this type of structure. So I'm always curious [INAUDIBLE] analyzing the Magento website-- well, obviously, it's not the optimal way. But, yeah, thanks. [INAUDIBLE]

JOHN MUELLER: I think it's also important to keep in mind that if we have different pages that show the same content, then it's essentially irrelevant for the webmaster which one we actually show as the canonical. So we might show this one as the canonical because it's a nicer URL, maybe, or it's the one that you have the rel=canonical pointing to, or we might pick a different one as the canonical because all the links point to that version. But the page is going to rank exactly the same, like the content there. So it's not the case that the website would have any disadvantage if we chose this one or that one.

AUDIENCE: Well, I know there's no duplicate content penalty of any sort. But isn't it kind of more optimal to have a limited set of URLs in the Google index that all have the best content there on the website?

JOHN MUELLER: I mean, that's always a good idea, I think, but it doesn't really matter which one we actually pick for that. So if you have one URL in a category, or you have another URL directly with the product name, then if we pick one or the other, it's still one URL. And it's one that we show in the search results. So it's not something where I'd say this is a critical problem that the webmaster has to fix. It's kind of that the webmaster is giving us their preference, but we're ignoring their preference. So maybe the webmaster says, oh, but I really, really want this URL indexed because I like it a lot better-- then they'd need to give us more signals to kind of support that decision.

AUDIENCE: OK. So Google basically makes a decision based on all the signals that it gets and tries to retrieve the best option for [INAUDIBLE]. One quick question. You said that Google doesn't actually guide itself based on the URL structure to understand, for example, how the product fits into what category and such. Is there a difference when using structured data for breadcrumbs? There is a difference when showing the snippet. Is there a difference in how Google understands where the product fits, in what categories, relevancy, and such?

JOHN MUELLER: I'd say we don't necessarily do that based on the URL alone, but we do see that when we crawl the website. When we see this is a category page, and it links to this set of products, that really helps us understand the context of those pages. And that's something that is usually also reflected in the rich snippets. But, primarily, we'd see that through crawling the website and seeing how it's kind of connected.

AUDIENCE: So the breadcrumb structured data is some sort of a signal that helps Google understand better, or it doesn't matter?

JOHN MUELLER: I think we mostly just use that for the snippet in the searches, for just displaying the URL a little bit nicer. With breadcrumb markup, specifically, you just need to make sure that you're using the markup that we have in the Help Center, not the schema.org markup, because the schema.org markup doesn't work yet.
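At the time of this hangout (late 2014), the breadcrumb markup in Google's Help Center was the data-vocabulary.org microdata sketched below; the page names and URLs are hypothetical, and schema.org BreadcrumbList support was only added by Google later:

```html
<div itemscope itemtype="http://data-vocabulary.org/Breadcrumb">
  <a href="https://www.example.com/category" itemprop="url">
    <span itemprop="title">Category</span>
  </a> ›
  <div itemscope itemtype="http://data-vocabulary.org/Breadcrumb" itemprop="child">
    <a href="https://www.example.com/category/product" itemprop="url">
      <span itemprop="title">Product</span>
    </a>
  </div>
</div>
```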

AUDIENCE: OK, thanks.

JOHN MUELLER: All right. Let's grab some more questions here.

AUDIENCE: Hi, John. I was wondering if I could jump in with a question if that's OK.


AUDIENCE: Oh, thank you so much. OK, so I posted this to the Q&A. I'll just read it real quick. My question is, is there an algorithmic component to Penguin? And how easy is it for a bad player, such as a spam bot, to negatively affect my domain with a bunch of malicious dofollow links? The reason I ask is because I found that I have over 25,000 dofollow links to one page on my site, which have spiked up in the last 60 days or so. And it's appearing on thousands of hacked forums inside a spam post. And I tried to disavow about 500 domains as a result of this. But I'm kind of unfamiliar with what can be done, or any thoughts you have on all this.

JOHN MUELLER: So Penguin is essentially completely algorithmic. It's not something where we'd manually go through and sort those out. And we do look at web spam signals when it comes to Penguin. So things like really spammy links would be included there, in general. Whoops. Something's ringing. OK. Hold on.

AUDIENCE: No worries.

JOHN MUELLER: I don't know if that's even on my side. OK. We'll just ignore it for the moment. So, in general, we take into account those kinds of things in our algorithms. In practice, however, we have a pretty strong protection against this generic type of hacked content, this generic kind of auto-generated spam. And that's not something where I'd say we'd see that as a big problem. If you do see this happening, then I think putting it in your disavow file is a great idea, because it kind of takes the problem out of the world, and it's something you don't have to worry about anymore when you do that. So I'd say, for the most part, we catch these things automatically, and you don't really have to worry about them. If you do see them, taking care of them yourself kind of eases the load on your site and makes sure that we actually do ignore them.
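For reference, the disavow file discussed here is a plain-text list uploaded through Webmaster Tools; the domains and URL below are hypothetical placeholders:

```text
# Lines starting with "#" are comments and are ignored.
# Disavow every link from an entire hacked-forum domain:
domain:spammy-forum.example
# Disavow a single specific URL:
http://another-forum.example/thread?id=12345
```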

AUDIENCE: OK. I appreciate the feedback on that. Thank you. Because we're definitely not what I would call SEO experts. We've been running our site for almost 10 years, and this year is the first time we have experienced a decline in traffic, around the same time that people are reporting these algorithmic updates. So maybe it is Panda, too. I'm not sure yet. We're still in the process of trying to figure all that out. So thank you.

JOHN MUELLER: Yeah. I'd definitely take a look at the dates to see when you're seeing these changes. It might also be that you're just seeing kind of a subtle decline, which is just our algorithms, in general, maybe not being as happy as they used to be with your website, in general. And those are the kind of things that are almost harder to resolve as a webmaster because there's no line in your HTML that you have to fix there. There's no disavow that you have to add there. You really have to kind of take a step back, look at your website overall, and think about what you could be doing, in general, to significantly take it a step further when it comes to quality, when it comes to the website itself.

AUDIENCE: That makes sense. We have been spending a lot of time sort of re-looking at everything, so that's great. Thank you so much. Thank you.

JOHN MUELLER: Sure. And I think one thing just to keep in mind that these changes don't take effect immediately. So if you make bigger changes on your website when it comes to quality, if you do these disavows, then that's not something where you'd see an immediate change. Sometimes it really takes several months for that to kind of bubble down in our algorithms. And specifically with regards to something like the Penguin algorithm, that's not run that frequently, so that might take a while for it to actually be reflected.

AUDIENCE: OK, sounds good.

JOHN MUELLER: All right. "Will there be advice in Webmaster Tools if a site is hit by an algorithm? It would be a nice and helpful feature." I have heard this a few times, yes. We do talk about this with the Webmaster Tools team and with the search quality teams that work on algorithms. I think it would be nice to have some more information like this in Webmaster Tools, but I don't see that happening in the near term. So maybe in the long term at some point, but definitely not really soon. It's really tricky with a lot of the algorithm information that we have, because the algorithms are essentially built to provide relevant search results. They're not built in a way that you could take this information one-to-one, give it to the webmaster, and the webmaster would have something to work on. Essentially, we're trying to bring the best search results. And sometimes that maps to something that the webmaster can do directly. But a lot of times, it's just a change in the web, a change in how we think relevance should be handled. And that's not something that we can really tell the webmaster and say, hey, by the way, something changed on the web, or something changed in our overall systems, and maybe you could do something on your website. But we don't really know what to tell you. So these changes happen all the time, and it's not always the case that there's really a one-to-one relationship between changes in our algorithm and something that the webmaster could specifically do to kind of bump a site back up to number one where it used to be.

AUDIENCE: [INAUDIBLE] actually be helpful is telling the webmaster that it's kind of closing in on some of those spam issues, especially-- so usually when there's a negative sign that something the website is doing isn't in agreement with the Google guidelines, even that would be enough.


AUDIENCE: It's something that the webmaster can actually take control of, especially--

JOHN MUELLER: Exactly. I think some of our algorithms might fall into that category a little bit more, where we'd say, maybe-- I don't know-- the overall view of the quality of your website has gone down. That might be something the webmaster could take action on. Or if we had an algorithm that looks at keyword stuffing and said, oh, we're seeing a lot more keyword stuffing on your pages, you should kind of hold yourself back a little bit. That might be something that the webmaster could work on. But a lot of our bigger algorithms really look at kind of a whole combination of signals. And it's not something where we could say this algorithm says your site went down a little bit; therefore, you need to do this specifically. It's a really hard problem. But I think, to some extent, some of this is kind of a logical progression of what we've done with the manual actions, with the web spam issues, which we have brought up in Webmaster Tools now. And I wouldn't say it will never happen, but I don't see this as something that's trivial, where we'll just say, oh, yes, we'll just add this feature to Webmaster Tools, and it'll be really useful. There's a lot of work that would need to be done before we could get to that point.


JOHN MUELLER: OK. "Which ranking factors or metrics are most important for SEO in Google? Which of those should be focused on more?" This is a tricky question. From my point of view, I kind of split this into two things. On the one hand, I think, from a technical point of view, there's a lot of work that kind of needs to be done as a foundation. And that comes down to understanding how crawling and indexing works, understanding how your server responds, what kind of capabilities your server has, how fast we can crawl it, how many URLs we can find on your server, how we can crawl your website in a way that it doesn't end up in an infinite space, an infinite number of URLs, and how we can actually index your content. Sometimes we'll see websites that use one URL for the whole website. They have a fancy JavaScript app or a Flash app, and the whole website is one single URL. And that's not something we can index. We don't have different URLs we can focus on. So I think, from a technical point of view, there is this foundation that you have to build on. And I think that's a large part of SEO that is really critical to a website. For a lot of websites that use existing CMSs, some of this is probably already covered. So it's not that you explicitly need to work on this; sometimes it's already provided by default with a generic installation of WordPress or whatever you're working on. Other factors past that, when people talk about their keywords in titles, or keywords in headings, and all of those things, I see more as something almost indirect, in the sense that if you have a really great website, on top of being crawlable and indexable, then these are things that our algorithms should be picking up on automatically. It's not something where you'd artificially need to be adding your keywords into specific places, or building links with specific keywords, to kind of get that thing indexed or kind of picked up in rankings properly.
In the long run, if you work on creating a really fantastic website, then that's something our algorithms will try to pick up on from various angles. So, from my point of view, I like to just say you definitely need to have this technical foundation in place. And that's a large part of SEO. That's a large part of the things that many websites do wrong. Even really big websites get that wrong. And, on top of that, you really need to make sure that your website is absolutely fantastic. So instead of focusing on keywords, really make sure that you're covering everything for the users instead. But I know people want a list of individual factors, like where you need to put your keywords in the URLs, or in titles, or in headings, or which meta tags you need there. But I think those are all really short-sighted metrics, where if you focus on putting your keywords into the titles, and you have a really mediocre website, then in the long run, that's not going to work out. That's something our algorithms are either going to pick up on right away or pick up on in the long run.

JOSHUA BERG: John, just give me the exact keyword density number.

JOHN MUELLER: Yeah. We tried that once. We made a joke in one of these Hangouts. I said you need to get this keyword density, and someone didn't realize that we were making jokes, so I'm going to refrain from making jokes like that.

JOSHUA BERG: OK. I have a question about diversity in content. So for a while, differentiation or diversity in content has been a significant factor, because no one wants to look at the search results and see all the same kind of content there. But in relation to platforms that are used by many different people-- e-commerce, for example-- I'm curious about how deep that would go. I mean, even if the search algorithm is only looking at the visible content rather than layouts and coding formats, et cetera, all of those things cumulatively on certain platforms do come together to provide a lot of sameness, especially in the area of something like e-commerce. So what I've seen in a project that I'm working on now-- like, one particular e-commerce platform that's a software-as-a-service type of platform-- so it's all online. It means you have to either use their templates or build with what they have. So even if you're customizing, for the most part, people use the scripts, fonts, and CSS that are required on that particular platform's servers. So I've had concerns about some of these in the e-commerce arena not providing enough differences. And so for this particular project that I was working on, I did some research on the rankings of, like, the 100 top sites in that area and then looked at all of the platforms that they were using. And, sure enough, I did find that this particular one that was a software-as-a-service-- which always had links back to the servers and had a lot of similarity-- none of those appeared in the top rankings, even in the top 100 of this area. And it's a popular service, like in the top 1% of e-commerce. But then another one, which is even more popular, which is like the top 10%-- Magento, for example-- you can download it, or they have enterprise editions, et cetera. So you get a lot of versatility.
When you download it, you can program and set it all up exactly the way you want, so there's a lot more room for customization. That particular platform, I see taking up a third of the results. So does that make sense here, or do you think there's not enough correlation there, that I'm looking at the wrong things? Because it's a particular platform that uses a lot of the same type of content like that.

JOHN MUELLER: I wouldn't necessarily worry about that. I think you might just be seeing more like these kind of secondary effects there. That's something where we wouldn't be looking at it from that point of view and say this looks different, therefore, it's better than another site. We're essentially looking at these sites overall. And if this is a reasonable platform, if it works well for users, and users are happy with that layout, then I think that's completely fine. That's not something where I'd artificially change like the CSS or the UI to be unique just for the sake of uniqueness.

JOSHUA BERG: And so even if it referenced all of the scripts, and fonts, and everything within the same platform, you don't--


JOSHUA BERG: That might be a particular issue?

JOHN MUELLER: I wouldn't worry about that. That shouldn't be a problem.

JOSHUA BERG: All right. Well, I guess then the other things that would provide more uniqueness there then may play a more important role. The people that are running a unique platform like that would also tend to be the people who would know more about how to customize and create really unique content.

JOHN MUELLER: I imagine that's partially the case. But it's also the case that the people who have the knowledge to keep a platform like that running on their own have a lot of experience with e-commerce. And maybe they have a lot of experience working with users and know which types of interactions make sense, which types really work well. And I wouldn't necessarily assume that if you're using a default installation with even a default theme that the theme itself is something that's going to cause SEO problems. That's more something that users would see, and that they might react to differently. If they'd say, oh, this looks like a generic Canadian pharmacy site because all the spammers use the same theme as this legitimate e-commerce site, then that could be a problem. But it's not the case that our algorithms would look at that and say, well, this is a generic theme like all other e-commerce sites, so we treat this as kind of lower quality or bad. I mean, you can see the same with blogs, for example, or with default CMS setups where there are a lot of really good blogs on a default WordPress installation, and they rank fine. It's not the case that you need to have a unique UI there if you're focusing on the textual information that you're giving to users.

JOSHUA BERG: OK. Yeah, so the secondary effects then could be--

JOHN MUELLER: Yeah. I mean, if users feel confused about your website, that's something where they might not recommend it as much, or they might not pass that on to other people. They might not go there. They might not search for it directly over time. All of these kinds of secondary things add up where, if you're not making your users happy, then they'll go somewhere else.

JOSHUA BERG: Right. All right. Thanks.

JOHN MUELLER: All right. "We use Disqus for user comments. Currently, Google can't see the comments due to JavaScript. Should we reconfigure so that Google can see the comments? Are user comments considered good content? Should we be worried that comments can dilute the quality of our good content?" So there are lots of good questions in this one question. On the one hand, when it comes to being able to see that content, we're getting better and better at understanding JavaScript. So I'd check the [INAUDIBLE] rendered view in Webmaster Tools to see how much of it we're actually picking up on. Maybe we can actually read the comments directly in the meantime. And that might solve that problem, or at least answer that part of the question. Another thing, I believe, which is also with Disqus comments is that you can add a plugin to your website that will add those comments directly into your HTML so that all search engines can see it, even those that don't access the JavaScript version. So that might be another option to kind of get those comments included into your website. With regards to whether or not you'd actually want to do that, that's, I guess, a totally different question. And that's something where we essentially try to treat these comments as part of your content. So if these comments bring useful information in addition to the content that you've provided also on these pages, then that could be a really good addition to your website. It could really increase the value of your website overall. If the comments show that there's a really engaged community behind there that encourages new users when they go to these pages to also comment, to go back directly to these pages, to recommend these pages to their friends, that could also be a really good thing. 
On the other hand, if you have comments on your site, and you just let them run wild, you don't moderate them, they're filled with spammers or with people who are kind of just abusing each other for no good reason, then that's something that might kind of pull down the overall quality of your website where users when they go to those pages might say, well, there's some good content on top here, but this whole bottom part of the page, this is really trash. I don't want to be involved with the website that actively encourages this kind of behavior or that actively promotes this kind of content. And that's something where we might see that on a site level, as well. When our quality algorithms go to your website, and they see that there's some good content here on this page, but there's some really bad or kind of low quality content on the bottom part of the page, then we kind of have to make a judgment call on these pages themselves and say, well, some good, some bad. Is this overwhelmingly bad? Is this overwhelmingly good? Where do we draw the line? And we do that across the whole website to kind of figure out where we see the quality of this website. And that's something that could definitely be affecting your website overall in the search results. So if you really work to make sure that these comments are really high quality content, that they bring value engagement into your pages, then that's fantastic. That's something that I think you should definitely make it so that search engines can pick that up on. If, on the other hand, these comments are kind of low quality, un-moderated, spam, or abusive comments just going back and forth, then that might be something you'd want to block. Maybe it's even worth adding moderation to those comment widgets and kind of making sure that those kind of comments don't even get associated with your website in the first place. "Removing backlinks from [? developer ?] tools is a very long and frustrating process. 
What can we do about it?" I think the answer there is not to build these kinds of spammy backlinks in the first place. But I know a lot of you are stuck with a website that someone else has kind of promoted in the past, and you work on cleaning these things up. And I know this is sometimes a bit of a problem to kind of get all those links compiled and go through them, manually figure out which ones are really spammy, figure out which ones are actually really good ones that you definitely want to keep. My recommendation there is to work with something like a spreadsheet and kind of go through them individually. If you use something like Google Docs, you can share that work with other people, as well. You can get advice from the community and kind of say, hey, this is what I think I should submit in my disavow file. Do you agree? Do you not agree? These are the links I want to keep. These are the links I think are really spammy. Can you help me find a solution there? And, usually, when you work like that, it's definitely a time-consuming process, but it's something that's doable. I know there are also third-party tools that help you with that, that kind of try to pre-filter those links that you have. And that might be a solution, as well, depending on how big your website is, how much you actually have to work to clean this up. "If you're a new startup website, and a new domain in a segment where there are already lots of high-quality, high-ranking sites, what's the best strategy? Would you focus on niche articles for long-tail keywords first?" I think I would treat this like any business situation where you're going into an established market, and think about what you can do to be unique. And instead of competing one-to-one with all of these established websites, maybe you can find a niche that is something that the other people aren't focusing on yet and kind of go through the back door.
So I don't think there's any SEO trick to ranking a new website in a very well-established market, just like there isn't any business trick that will kind of drive business to your business if there's already a very well-established competition in the same area.
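As a side note on the disavow workflow discussed a moment earlier: once the spreadsheet review is done, the result is a plain-text file submitted through the Disavow Links tool in Webmaster Tools. A minimal sketch of that file format (all domains and URLs below are made-up placeholders):

```text
# Lines starting with "#" are comments for your own records.
# Spammy directory links; owner contacted, no response
domain:spammy-directory.example.com

# Individual paid link we could not get removed
http://link-network.example.net/page-with-link.html
```

A `domain:` line disavows every link from that domain, while a bare URL disavows only links from that specific page.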

AUDIENCE: John, can I just go back to the question about the spammy links?


JOHN MUELLER: Sure.

AUDIENCE: So I work for an email service provider, and what our app basically does is we have a plug-in where our clients can basically subscribe to it. And we obviously have a link going back to our site for that. Would you treat that as spammy links, as well? [INAUDIBLE] just basically [INAUDIBLE] in the group chat, as well, there?

JOHN MUELLER: OK. And what's-- I mean, how do these links get created, or how does that happen?

AUDIENCE: So this is basically a subscribe form from a customer, but it's linked to our app. And obviously the app is hosted by us, so automatically these URLs get created, these things get created through our app. And it comes back as backlinks. So would you say that's spammy? Because I remember in a previous Hangout someone asked about a WordPress plug-in, and they had backlinks in there. And you said that's not a good idea. So would you say this is a good idea or not? Should we just add nofollows on these?

JOHN MUELLER: I guess I'd nofollow just to be on the safe side here. I don't know exactly how you do this at the moment because it looks like there's JavaScript behind it.

AUDIENCE: There is.

JOHN MUELLER: So I don't know if there's even like a direct link there, or is this an iframe that other people would embed?

AUDIENCE: It's basically a JavaScript that's pushing through, yes.

JOHN MUELLER: I mean, if this is just JavaScript that's doing something, then I wouldn't necessarily see that as something problematic because it's not something that we'd pick up on as a real link that would be passing PageRank. But if you had a link on the bottom-- I guess the extreme case is one where the link is totally unrelated to the service, where you have a cheap casino link on the bottom of your URL form-- then that would definitely be a problem. If you have a link back to your service, that's something that, depending on how you present this, could be completely fine. It could be--

AUDIENCE: So, if it's powered by our company, that would be fine?

JOHN MUELLER: Yeah. I wouldn't necessarily worry about that. Yeah.
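A hedged sketch of the safe-side option discussed here: if the embedded form does render a plain HTML credit link, a nofollow keeps it from passing PageRank (the URL, company name, and anchor text below are hypothetical):

```html
<!-- Hypothetical "powered by" credit inside an embedded subscribe form.
     rel="nofollow" tells search engines not to pass PageRank through it. -->
<a href="https://www.example-esp.com/" rel="nofollow">Powered by Example ESP</a>
```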

AUDIENCE: OK. And then one last question, if you don't mind, with regards to hreflang. So, sorry, I did post it in the Q&A, but I'll ask it anyway. I updated my hreflang tags on the 28th of November. And the CMS that we were using was basically pushing out incorrect ones-- like, for example, cn-cn, which was supposed to be zh-cn. So, yeah, that was updated on the 28th. I looked in my Webmaster Tools two or three days ago, and I saw that the data was refreshed on the second, but it was still showing the old tags, if you want to call it that. It hasn't updated to the new tags yet.

JOHN MUELLER: So we have to re-crawl those pages before we can actually reprocess those tags. So I imagine this is something where, over the next couple of weeks, you'll see kind of a gradual change in those graphs. And that's essentially what you'd be looking for. You wouldn't see a jump from one day to the next.

AUDIENCE: Of course, yes. Thank you very much.
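For reference, a valid version of the annotation discussed in this exchange would look something like this in the page head (the URLs are made-up placeholders):

```html
<!-- hreflang takes a language code (ISO 639-1), optionally followed by a
     region code (ISO 3166-1). "cn-cn" is invalid because "cn" is a country
     code, not a language code -- Chinese is "zh", so China becomes "zh-cn". -->
<link rel="alternate" hreflang="zh-cn" href="http://www.example.com/zh-cn/" />
<link rel="alternate" hreflang="en" href="http://www.example.com/en/" />
```

As John notes, Google only reprocesses these annotations as it re-crawls the pages, so fixes show up gradually in the Webmaster Tools graphs.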

JOHN MUELLER: Sure. All right. "Why is Penguin still rolling out?" I don't know exactly why. I know that the engineers are working on this, so this is something that's essentially still happening. I had assumed that this was complete a little earlier, but I guess it's still happening. Let's skip that one. There's a startup question again. So "if you're a startup, should you focus on a smaller number of pages or a higher number of pages?" In general, I'd probably try to focus on a smaller number of pages and make sure that they're absolutely great, instead of a high quantity of pages that are kind of mediocre, or even lower quality, or that might be auto-generated. I'd really make sure that you have something that you can present that really kind of shines, that shows the unique value of your new business. "How will a website be affected because it's using tabs, and what's the best solution for the user and the webmaster? I'm using tabs because of mobile usability." In general, this isn't something that's new. It's, essentially, the way we've handled hidden content for a long time. So this question kind of refers to a comment I made in one of the previous hangouts where I said that hidden content, even if it's hidden behind tabs or behind click-to-expand sections, is something that we kind of discount. And that's been the case for a number of years now. So that's not necessarily new, and it's not that websites that are using tabs will suddenly see a drop in rankings because of that. If you're using this for usability, and this is content that's secondary to those pages, that's absolutely fine. If this is the primary content of your page, and you're placing it behind tabs, then that's something where I might create separate pages or bring that content into the visible part of the page so that we can actually treat it with the full value. So those are essentially the recommendations there.
If you know that this is secondary content that's not critical for people who are viewing the page, maybe additional information about the product that not everyone is searching for directly, maybe your address, which is otherwise mentioned on your homepage already-- those are the type of things that could be perfectly fine behind tabs or behind click-to-expand sections.

AUDIENCE: Hi, John. I just wanted to also ask real quick, if there's an approximate number you can provide of how much time it takes to recover from an algorithmic penalty-- or, in other words, how frequently does Google refresh its index to factor in improvements which a particular site has made?

JOHN MUELLER: That's a tricky question. So, essentially, we have some algorithms that work every time we crawl a page. And, if you have a large website, then we'll crawl some of your pages daily or even several times a day. And other pages on the website, we might crawl every couple of months or maybe even every half year or so. So if you make a significant change on your website, and our algorithms picked that up immediately, then you'll see kind of a gradual change in the search results based on that. You won't see a change from one day to the next. Other algorithms are run less frequently. They might be run every week, or every month, or something like that. And for those, you might see kind of a jump when they actually take place. So that's, essentially, I guess, the main differences there. Some algorithms are pretty complicated to run. They use a lot of complicated data. And they might be run even less frequently than monthly. So it kind of depends on which algorithms you're looking at and what kind of changes you're making on your website. In general, what I'd recommend in a case where you suspect that your site is affected by an algorithm that takes a long time to run is just make sure that you don't get stuck in this kind of iterative battle against this algorithm. So don't try to just fix 10%, maybe that's good enough. But really try to make sure that you're going all the way to clean this problem up completely so that when the algorithm is run again, you don't have to go back to the drawing board and say, OK, I'll do 10% more, and maybe that'll be enough. So really try to take the whole problem out of the world completely as much as possible.

AUDIENCE: OK, that makes sense. Thank you. Do you guys also factor in social shares across various channels? Like, if I'm getting more Facebook shares, or Twitter tweets, or Google+ shares, does Google factor that into its quality score?

JOHN MUELLER: We don't use social signals directly for the search results. Partly, that's because we don't have access to all of these, and partly because it's just such a big mass of signals that is really hard to kind of bundle into a signal that this is a good website or this is a bad website, because sometimes people talk a lot on social media about something that they don't like. And that's kind of hard for us to differentiate in cases like that. So we don't use social signals. I think I was at a conference last month with someone from Bing, and they also said they don't use social signals. They have access to, I think, the Facebook data, but they don't use those social signals for normal ranking either.

AUDIENCE: Thank you so much.

JOHN MUELLER: All right. "Is it good to promote our clients' products or services to as many social media sites as we can? Is it better to have different content for every social media website?" Essentially, it's kind of similar to the previous question. We don't use social signals for search. But we do index the content on the social media sites when it's public, just like any other content that we find on the web. So if you have the same content on a social media platform as you have on your website, you're kind of competing against yourself with the same kind of content there. So I tend to do something at least slightly different on these social media sites. And most of the time, that makes sense anyway. You don't want to take a multi-page blog post and copy and paste it into Twitter, into 20 different tweets that are all numbered so that people can kind of click through them. You'd do something different on these individual social media sites to engage users, to bring awareness of your product or your service to those users there, to kind of interact with those users there, as well. And it's not the case that we'd use the social signals directly as something in search, but a lot of these things happen indirectly in that if users engage with you on Twitter, or on Facebook, or wherever, then maybe they'll go to your website. Maybe they'll recommend your services. Maybe they'll recommend your website directly to other people. And that's the kind of thing where, indirectly, it does kind of flow back into search. But, as a direct effect, there's no kind of use of the social signals directly in search. OK. Oh, wow, a question about Authorship. "Authorship markup and reputation-- I know it's no longer used. Does Google assign value to content depending on the author's identity? Depending on the answer there, is there an alternative markup to connect an author to content?" At the moment, we don't use Authorship markup at all. We don't track that information at all.
I could imagine that maybe at some point, that'll change. But, at least at the moment, there is nothing specific that you could put on your pages to say, well, this is the exact author. There are the usual types of things that you can put on pages, like your byline, maybe a link to your profile. I think that always makes sense for the users. It might not be something that we'd use directly in search, though. So we don't use Authorship anymore. I wouldn't rely on the Authorship markup to do anything specific, other than being a link to your profile.



JOSHUA BERG: Do you think the author tag has potential to get used considerably more in the future, in this regard, if adoption is more widespread?

JOHN MUELLER: I don't know. I really don't know. I know we had a lot of this Authorship information already. And if it turns out that the existing information that we had wasn't really useful for us in search, then I think something significant would need to change on the side where we process all of this information for us to kind of switch to markup and use that for Authorship. So I imagine that's not going to happen any time soon, but I wouldn't say it'll never happen. And, as with other markup, I think it always makes sense to use that markup if it's trivial for you to add because it gives a little bit more structure to your pages. It doesn't mean that they'll rank better, but it gives us a little bit more information there. And, from that point of view, I don't see that this is a problem if you use that markup for Authorship, but I also wouldn't expect it to have any kind of direct value, at least in the short-term.

JOSHUA BERG: All right. Thanks.

JOHN MUELLER: "Is there an option in Webmaster Tools to refer an m.domain mobile site to a normal domain? There are always errors in the mobile usability report, but we have a special m.domain for mobile." I'd just make sure that you're using the proper markup to tell us about the connection between your mobile pages and your desktop pages, that you have the rel=alternate set up appropriately, that you redirect appropriately, use maybe the HTTP [INAUDIBLE] header if needed, and that you have the rel=canonical from the mobile to the desktop pages so that we can really connect your mobile pages to your desktop pages. And then, in general, we'll pick up on that, but you'll still find this information separately in Webmaster Tools. So you'd need to verify your m.domain in Webmaster Tools. You'll probably see things like the search query data in Webmaster Tools separately, definitely the crawl data separately in Webmaster Tools. So if you have your dub dub dub and m. site, I'd definitely check both of those versions to make sure you're looking at the full picture before making any big decisions around that. And this is something where if you use responsive design, use the same URLs, you're kind of side-stepping this big problem about the different sites by just having everything in one version of your site. But a lot of people already have an m.domain, and I wouldn't just arbitrarily switch over just for that. All right. We're pretty much out of time. I saw one question, I think, further down somewhere here that I wanted to get to. "We're a hosting company with subdomains showing client sites. For example, 1.2.3/. Changing this is not an option. [INAUDIBLE] associated with our main domain, and get us penalized. How can I tell Google to ignore those subdomains?" This is kind of a tricky situation because, to some extent, this looks like it's all a part of the same website.
So, if at all possible, I think differentiating between the reverse IP lookup host-names that you have there and your main content website would be a great way to kind of separate those issues. Another idea is, if you're a hosting company, make sure that all of those IP addresses redirect to the actual website. So, instead of just allowing the reverse lookup to actually show the content, make sure that the reverse lookup actually points to the hosted domain name or the actual domain name that's used by this website. So those are, essentially, the recommendations I'd have there. It's kind of tricky on our side. We try to differentiate between something like a main domain that's used there and subdomains that are used for hosting different content. We try to be as granular as possible with our algorithms there, but sometimes we just see the domain name, and we see all of this content there. And it's really hard for us to separate the good parts from the bad parts. So if you can make it easier for us to differentiate between your main, separate website and all of this user-generated content that you're also hosting, then that makes it a lot easier for us to treat them separately.
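Going back to the m.domain question at the start of this answer: the bidirectional annotation described there is the standard separate-URLs setup, which would look roughly like this (example.com is a placeholder):

```html
<!-- On the desktop page, http://www.example.com/page/ : -->
<link rel="alternate"
      media="only screen and (max-width: 640px)"
      href="http://m.example.com/page/" />

<!-- On the corresponding mobile page, http://m.example.com/page/ : -->
<link rel="canonical" href="http://www.example.com/page/" />
```

The rel=alternate on the desktop page points to the mobile version, and the rel=canonical on the mobile page points back, so the two URLs get treated as one page.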

AUDIENCE: John, can I ask you a very quick question?


AUDIENCE: I just wanted to know if links inside the HTML header should ever be nofollowed, or whether it makes any difference at all?

JOHN MUELLER: So you're probably talking about the link element, the rel=alternate, those kinds of things? No, I don't think there's even a provision for nofollow there, so I wouldn't worry about that. They definitely don't pass PageRank.

AUDIENCE: Sure. OK, thanks, John.

JOHN MUELLER: Sure. All right. With that, we're a bit over time. Is there one burning question left?

AUDIENCE: Uh, can I?


AUDIENCE: A really quick one. I have a Canadian client and doing a little query like this with a [INAUDIBLE] domain for the brand name of the client. [INAUDIBLE] client, I get the Knowledge Graph with the map and the address of the client. So he's a Canadian client, and if I switch to, for Canada, I don't get the map anymore and the address, unless I actually add the address [INAUDIBLE] in the query or location. And I used the structure data to let Google know that's the local business, connected to a Google Places page. Is there anything else I can do so for, it also shows the map and address for the brand name?

JOHN MUELLER: I don't know. You can send me the URL. I can double-check. What might be happening is that we're not picking up the geotargeting information correctly. So I'd double-check Webmaster Tools, how you have that set up. Maybe, also, double-check in Webmaster Tools if you have, for example, the HTTPS version, that you have the same geotargeting for the non-HTTPS version, or dub dub dub, non-dub dub dub. I've seen some cases where you have one country set at the dub dub dub version and a different country in the non-dub dub dub version, for example. And that can be really confusing. But if you can post the link, or if you can send it to me directly, I can double-check to see if there's anything like that happening.

AUDIENCE: OK. Sure, thank you.

AUDIENCE: John? We still have some time?

JOHN MUELLER: Well, kind of, I guess.

AUDIENCE: I just don't want you to miss the train, so-- it's a short follow-up on the earlier question related to post comments. Since there are a lot of websites which use comments, is there any quantity of comments to stay on the safe side? I mean, to post only the first five comments and so on and so forth, to be able to keep the spammy part down, if I may say so?

JOHN MUELLER: It's not that we look at a specific number there. I think we mostly look at the overall quality of the page, and if the overall quality of the page kind of gets pulled down because of these spammy comments, then that's something that could kind of worry our algorithms. If you have a big piece of content, and there are two spammy comments on the bottom, then that's something that's not going to have a big effect. But if you have a shorter piece of content, and you have 100 spammy comments on the bottom, then that's obviously going to skew everything a little bit in that direction.

AUDIENCE: I understand, so it's about quantity. Yeah, OK.


AUDIENCE: Thank you.

JOHN MUELLER: Yeah. OK. So with that, let's take a break here. I'll set up the new Hangouts, I think, in two weeks, just before the holidays, I imagine. So maybe I'll see you guys there. Otherwise, I wish you guys a great week, and see you next time.

AUDIENCE: Thank you, John. Have a great weekend. Bye bye.

JOHN MUELLER: Bye, everyone.

AUDIENCE: Thanks, John.


AUDIENCE: Bye. Thank you.

AUDIENCE: Thank you.

AUDIENCE: Thank you.