Reconsideration Requests

Google+ Hangouts - Office Hours - 22 September 2014


Transcript Of The Office Hours Hangout

JOHN MUELLER: Welcome, everyone, to today's Google Webmaster Central Office Hours Hangout. My name is John Mueller. I am a Webmaster Trends Analyst here at Google Switzerland, and my goal with this hangout is to try to help you guys with any webmaster, web search type questions that Google might be involved in. I put something slightly different together for today just to start off with. Sometimes we see issues that start coming up at some point that we think might be interesting to share with others as well, where, from our point of view, the engineering team finds something really weird happening. We might go out and contact some webmasters who are doing these types of things, but we don't have any system built into Webmaster Tools, for example, that helps to alert you of these issues. So I thought maybe it would be interesting to compile some of the things we've seen recently to make it possible for you guys to check your sites as well, to make sure these things aren't causing any problems. So these are three of the issues that we ran across in the last couple of weeks as being a little bit more visible, more problematic, from our point of view. On the one hand, serving smartphone Googlebot the desktop pages. That's something we occasionally see. Similarly, also on the mobile side, disallowing crawling of the smartphone pages or the redirects. And we saw this really weird type of broken content type cloaking issue. The first issue that we saw there is specifically about websites that use dynamic serving or that use separate URLs, and they try to recognize the user agent of the user that's accessing those pages and serve the matching version. Can you guys see my presentation?


JOHN MUELLER: Perfect. All right. Like I said, these pages try to serve the right content to the right user, depending on the user agent. And sometimes we notice that they're explicitly trying to recognize Googlebot and always serve it the desktop page. And then what sometimes happens is desktop users will see the desktop page, normal Googlebot sees the desktop page, that's great. Smartphone users see the smartphone page, but the smartphone Googlebot, which looks like an iPhone user agent and just has Googlebot in it as well, ends up seeing the desktop page too. And that's a problem from our point of view, because then we can't really see a smartphone page, and we can't treat it appropriately in search. That's something that's really tricky to recognize on your own, because you look at it on your phone and it looks fine, and it's rare that you go out and actually look at your pages using the smartphone Googlebot. That's something you can check in Webmaster Tools using the Render view. It's really easy to recognize whether you're showing the right version of the page to smartphone Googlebot as well. The other issue with regards to mobile sites that we saw recently is when these sites use separate URLs, or use specific content on those URLs, that's disallowed from crawling. Sometimes it happens that you have your redirect in JavaScript that takes a look at the user agent and sends the user to the right version of your page, and if that JavaScript file is blocked by robots.txt, then we'll essentially not be able to recognize this redirect to the mobile page. Other times, we've seen the mobile site specifically blocked by robots.txt, potentially because the webmaster thought there might be duplicate content issues or those kinds of things, which definitely isn't a problem from our point of view. Maybe they'll go out and robot the whole m-dot domain, or they have specific URL patterns for the mobile pages and they block those from being crawled.
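As a sketch, the kinds of robots.txt rules John is describing, and the fix, might look like this (the hostnames and paths here are hypothetical examples, not recommendations for any specific site):

```
# On m.example.com/robots.txt -- this hides the whole mobile site from Googlebot:
#   User-agent: *
#   Disallow: /

# On www.example.com/robots.txt -- blocking the redirect script or the
# stylesheets hides the mobile redirect and the mobile-friendly rendering:
#   Disallow: /js/mobile-redirect.js
#   Disallow: /css/

# Fix: let crawlers fetch the assets needed to render and redirect pages.
User-agent: *
Allow: /js/
Allow: /css/
```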
If you do that, then essentially we can't recognize your mobile site, and we can't treat it appropriately in search. Slightly similar to that, if the JavaScript or the CSS files that you use on your mobile friendly pages are what make it look mobile friendly, then that's something that we'd love to crawl as well, so that we can also confirm that it's really mobile friendly. So making sure that those files aren't blocked by robots.txt is also a good practice. And like before, you can check this with Fetch as Google in Webmaster Tools to see what Googlebot would see or where it would get stuck. Take the mobile URL that you would see in your browser, copy that into Webmaster Tools, and see if you can actually see the Render view there in the normal way. And finally, this really weird issue that we ran across on a bunch of sites was where the normal desktop user would be seeing a normal HTML page, but whenever Googlebot crawled, because it doesn't have an Accept content type header in the request, it would get JSON content. So that would be some JavaScript or some data files, essentially, that we can't really index. The content is kind of in there, but it's not in a way that we can actually index, and that would end up with Googlebot being stuck with this weird JavaScript file instead of the actual HTML of the page. Once again, this is something you can check with the rendered Fetch as Google tool in Webmaster Tools. This is almost turning into a promotion for that tool. These are really weird issues that we ran across on a bunch of different sites, so it might be useful to check here. Just to sum everything up, I'd really double check your important pages with the rendered Fetch as Google. Also, if you're using mobile URLs, make sure that those work really well with Fetch as Google. Double check also that we can pick up the content and that we can actually crawl the CSS and JavaScript files that are involved there.
If you want to go a step further, you might want to consider creating a script that uses the Googlebot user agent and the request headers that Googlebot usually uses, and just double check your pages occasionally so that you can make sure you don't accidentally run into this situation. From our point of view, none of these are really technical issues where we would say we want to alert the webmaster about them. When this content is blocked or when you're serving us slightly different content, it's not that we'd say this is a clear cut error. But when we look at those pages manually, we can tell you do actually have a really good smartphone site, you're just hiding it away from Google. That's something you have to recognize on your own, because it isn't something we could automatically recognize as an error. So with that, that was the introduction to the weird issues that we ran across recently.
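The first issue John walked through, explicitly recognizing Googlebot and always serving it the desktop page, usually comes down to checking for the crawler before checking the device. A minimal Python sketch of both the broken and the fixed ordering (the user agent string below is an illustrative stand-in for the smartphone Googlebot, not an exact official value):

```python
import re

# Illustrative smartphone Googlebot user agent: an iPhone-style string
# with "Googlebot" appended (real strings differ in detail).
SMARTPHONE_GOOGLEBOT = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 "
    "(KHTML, like Gecko) Mobile/10A5376e (compatible; Googlebot/2.1; "
    "+http://www.google.com/bot.html)"
)

def choose_version_buggy(user_agent):
    # The mistake: matching "Googlebot" first sends every Googlebot,
    # including the smartphone one, to the desktop page.
    if "Googlebot" in user_agent:
        return "desktop"
    if re.search(r"iPhone|Android.*Mobile", user_agent):
        return "mobile"
    return "desktop"

def choose_version_fixed(user_agent):
    # Decide on the device tokens alone: smartphone Googlebot carries
    # "iPhone" in its user agent, so it lands in the mobile branch
    # exactly like a real phone would.
    if re.search(r"iPhone|Android.*Mobile", user_agent):
        return "mobile"
    return "desktop"
```

With the buggy ordering, smartphone Googlebot only ever sees the desktop page; with the fixed ordering it sees the same mobile page a real iPhone user would.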

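The occasional self-check John suggests, fetching your important pages with a smartphone-Googlebot-style user agent and flagging non-HTML responses, could be sketched like this in Python (the user agent string and URL list are illustrative assumptions):

```python
# Periodic self-check sketch: request pages the way smartphone Googlebot
# might, and flag the content-type cloaking issue (JSON served instead
# of HTML). Run it against your own important URLs.
from urllib.request import Request, urlopen

# Illustrative user agent string; not an exact official value.
SMARTPHONE_UA = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) "
    "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)

def looks_indexable(content_type, body_start):
    # Reject non-HTML content types outright, then do a crude sniff
    # for JSON bodies mislabeled as HTML.
    if "text/html" not in content_type:
        return False
    return body_start.lstrip()[:1] not in ("{", "[")

def check_url(url):
    req = Request(url, headers={"User-Agent": SMARTPHONE_UA})
    with urlopen(req, timeout=10) as resp:
        ctype = resp.headers.get("Content-Type", "")
        start = resp.read(256).decode("utf-8", errors="replace")
    return looks_indexable(ctype, start)

if __name__ == "__main__":
    for url in ["https://www.example.com/"]:  # your important pages here
        status = "OK" if check_url(url) else "CHECK: non-HTML served"
        print(url, status)
```

This is only a sanity check, not a replacement for Fetch as Google, since it doesn't render JavaScript or follow the exact request headers Googlebot sends.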
AUDIENCE: Hey, John, can I interject for a second?


AUDIENCE: I just had an idea for a great feature in Webmaster Tools, that if you could show us some of the important pages, what Google thinks is important and how they render, instead of us going in and checking, that would be awesome. That would be great. You could even put big red outlines around problems on the page, and don't say anything about it. Just put a big red outline and make people wonder what's going on there.

JOHN MUELLER: I think the problem, especially with the roboted JavaScript files and roboted CSS, is that it's not something we can call out as a problem, because we just don't know what's actually missing. It might be that this is a JavaScript file that you use for your analytics, and Googlebot never has to look at that. That's absolutely fine to keep roboted. But if it's something that does something significant, then we wouldn't even recognize that automatically. We'd just say, well, it's a JavaScript file that's being blocked. We don't really know what it's doing. We can't really tell the webmaster that this is a critical problem they need to urgently fix, unless someone happens to manually look at that page on a smartphone and say, well, on a smartphone, this really does look like a legitimate smartphone site, but Googlebot itself doesn't see that.

AUDIENCE: Cool. Yeah, I get that part, but there's obviously pages that Google detects on the site that they think are important based on links and whatnot. If you could automatically render two or three of those, one, we would know what you think is important, and we could change that if it's not correct. Two, we could see if it renders properly and maybe fix some stuff if it's not rendering properly. Even if it has some quality issues, we might be able to see that too. Anyway, just a feature suggestion.

JOHN MUELLER: You can get some of that information in the Top Search Queries section, where you can filter by top pages. That would be based on the impressions and clicks that you're getting for those pages, which is kind of a proxy for what Googlebot would think is interesting. Having that combined with a view of the rendered version, that sounds really interesting, like a one off report where you would go in there and say, well, this looks good, or this looks really different from what I would see when I look at those pages.


AUDIENCE: John, can I ask a question about some edge cases maybe?


AUDIENCE: Can you talk a little bit more about the Buffer HTTPS migration? Initially, we thought it was a penalty based on what they wrote. They said that wasn't the case. You said that wasn't the case. And initially, they said they did the HTTP to HTTPS migration correctly, but now it has turned out that they didn't. But it seems like Google is also admitting that there was an issue, specifically around something on your side that maybe should have been handled with these migrations. Is this something we should be concerned about, and should we look at certain things? Should we avoid making the same mistake, if there was a mistake made?

JOHN MUELLER: I don't think that's something you really need to worry about. I'm not sure exactly which edge case it hit there, but it was something special on our side where we weren't forwarding the signals appropriately, and that kind of got stuck with that site. That's something that the engineering team was able to fix really quickly. The site itself chose to go back to HTTP because they felt that was safer, which is fine from our point of view as well. At the same time, they also had a really old penalty on their site, and that's something that they weren't aware of. Usually when we see a site with technical issues that also has a manual action, we'll go in and say, well, fix the manual action first so you don't have to worry about that causing any weird effects that you might mistake for technical issues. So if you run into this weird situation where you're seeing something that looks like a technical issue and you notice you have a manual action, I'd definitely split that work up and make sure that you're taking care of both sides, and not that one side is leaking into the other and making it look different than it actually is. This was essentially just a normal technical issue on our side, nothing really crazy. I don't think there were any other sites involved that were seeing something similar or where we were running into this edge case as well.

AUDIENCE: So it was nothing that Buffer did on their end?

JOHN MUELLER: As far as I remember, it was nothing specific that they did there. The manual action is obviously something that at some point they did, but with the change from HTTP to HTTPS, I don't think there was anything specific that they did wrong there.

BARUCH LABUNSKI: John, can I ask you a question about the search box?


BARUCH LABUNSKI: Can you just highlight a couple things in regards to the search box? How would a small business owner, if they were qualified to get a search box, what do they have to do?

JOHN MUELLER: Essentially, just copy and paste the markup onto their pages and specify the internal site search URL that they have. Most smaller sites, most CMS systems, have kind of a site search built in already, so it's not something that they'd have to build up. It's essentially just copying and pasting the JavaScript code there and editing the URL to match their own site. I think Yoast has it included in his WordPress plug-in. So basically, you just have to enable the plug-in, and it'll do it all for you. I imagine that's something that will happen for other CMSes at some point, too.

BARUCH LABUNSKI: So if a site has 4,000 HTML pages--

JOHN MUELLER: You'd just put it on the home page. You wouldn't need to put it on the individual pages.

BARUCH LABUNSKI: Just markup and that's it?

JOHN MUELLER: Yeah, exactly.
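The markup John is describing is the schema.org sitelinks search box annotation, a JSON-LD snippet placed on the home page. A sketch using a hypothetical domain and search URL (replace both with your own site's values):

```html
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "WebSite",
  "url": "https://www.example.com/",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://www.example.com/search?q={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}
</script>
```

The `target` URL should match whatever your CMS already uses for internal site search, with the query parameter replaced by the `{search_term_string}` placeholder.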


JOHN MUELLER: Sure. All right. Let's go through some of these questions here. Can small websites with a couple of unique visitors per day from search results be penalized by algorithms like Panda or the page layout algorithm, or what if they rank only in regional results? Essentially, our algorithms try to be generic in the sense that they apply to as many kinds of pages as possible. That could be across different languages, across different locales, and that's something where they try to take those factors into account appropriately. If you're a small website targeting a small niche area, and our algorithms notice that you're doing something slightly sneaky, then it might be that our algorithms respond in a limited way. It's not that they'll just turn on or turn off. They'll usually try to be more granular, in the sense that they try to react in a way that's appropriate for the issue that they find there. So theoretically, these algorithms can apply to smaller websites as well as to big websites. It's something where we just try to be appropriate in our response.

BARUCH LABUNSKI: What do you mean by "sneaky?"

JOHN MUELLER: Well, it depends on what these algorithms are looking at. If they're specifically looking at the quality of the content, for example, and you have a bit of low quality content on your site, something old maybe that you left behind, and the rest of your site is completely fine, then that's something where we'll probably pick up on the lower quality content, but we're not going to hide your site in the search results just because of one small issue there. We try to respond in an appropriate way.


AUDIENCE: So with regards to the sites being hit by a penalty, obviously it affects the actual normal site rankings, but will it actually affect the local rankings as well, the regional results? Obviously, I know a lot of that's based off your map listing and your Google+ listing, but I know you released an update that takes into account more normal ranking factors. So would a penalty affect your local rankings as well?

JOHN MUELLER: So you specifically mean the local business rankings, the map results, those kinds of things? That's not something that's directly tied into the web search results, and I couldn't really say anything for them. I don't know how they would handle this kind of situation, if they have similar algorithms. I'm not really sure about that. In general, when it comes to the web search results, even for localized queries where we're showing normal web search results, that's something where this could come into play. With regards to the maps, local business entries, that's something that's generally handled completely separately.

AUDIENCE: OK. That's fine.

JOHN MUELLER: How does Google treat followed links with the brand or URL address in the anchor text, when they're added by the owner of the linked site and don't include money keywords, on websites about the same topic? Is this still an unnatural link? We want to improve our visibility within our niche. So if you're adding these links to other people's websites and these are essentially not editorial links, then we would treat those as unnatural links. It's not so much a question of whether you include your keywords in there or not; essentially, if these are PageRank-passing links that you place somewhere else, then it's kind of an unnatural link, and that's something I'd generally try to avoid doing. If you're doing this to get traffic to your website, I would treat it like any other kind of advertising that you have on your site or on other people's sites, and just make sure that there's a nofollow there. Then users can click through those links to your website, but the links aren't passing any PageRank.
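The nofollow John recommends is just a `rel` attribute on the placed link. For a hypothetical logo link on a partner site, it would look like:

```html
<!-- Placed link that sends traffic but doesn't pass PageRank -->
<a href="https://www.example.com/" rel="nofollow">
  <img src="https://www.example.com/logo.png" alt="Example Widgets">
</a>
```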

BARUCH LABUNSKI: Because I'm seeing the same thing happen here, a company just pasting their little logo on contractors' sites.

JOHN MUELLER: I mean, to some extent we try to recognize that automatically, but if you're putting this on other people's websites and it's not something that they're editorially adding to their websites, then we would see that as an unnatural link, and that's something I'd avoid doing.

BARUCH LABUNSKI: It's multiplied by 1.7 million.

JOHN MUELLER: 1.7 million. Where are you coming up with that number?


JOHN MUELLER: Widgets. If this is something where you think we should be taking a look at that from a web spam point of view, you're welcome to send us that information, and I can forward that to the web spam team. But usually, it's something where we have our clear recommendation, so that's not really something I'd say has really changed or is any way special for that. But you're welcome to send it our way. We can take a look at that and see what we need to do there, if anything.

BARUCH LABUNSKI: I'll ping you.

JOHN MUELLER: Sure. Great. Thanks. As mentioned in the quality guidelines, can we use the tactic of redirecting links through an intermediate page blocked by robots.txt, on links such as paid advertisements on third party sites, as an alternative to the nofollow tag, and how would that look? Theoretically, you can do that. I'd recommend doing that maybe on a separate domain so that those links really don't pass any PageRank, or anything else, to your main domain. If you can't disavow those links, and those external links don't have a nofollow on them, then redirecting them through a roboted domain is something you could do there as well. That will block the passing of PageRank to your final pages. Now, I think that's something that some of the affiliate systems have been using for years now. It's essentially just a way of blocking those links from pointing directly at the final page. That's fine. That's a possibility there. Google's main focus is to return the most relevant results to the user. A lot of the people that perform well in the search results are big, well known companies with large brands. What's the best way for a smaller company to perform well in Google and to compete? In general, this is something you almost have to see as a normal business or marketing problem, in the sense that there are some people that have really big websites and offer maybe a lot of different products. As a small company, if you're trying to compete directly with them, that's probably not going to work so well. If you have something very specific, something like a niche market that you can focus on, you're probably going to have a little bit more success there.
So instead of trying to compete with the really big brands, try to find a focus area of your own that you can really dominate, where you have the best content, where you have the best products, where you have the best service on your website, the best information on your website, that the bigger companies can't really keep up with. That's essentially a really great way to get even a small company started fairly well in search. Thanks to the internet being as global as it is, you don't have to worry about just your local customers that might find your business, but you really have a much larger audience as well. It's definitely something where the internet isn't locked down by the big companies and nobody with a smaller, new website has any chance of getting in. There are lots of opportunities and there are lots of new ways to get into the search results, have something fantastic, and have something that users want to find from your website.

AUDIENCE: Hey, John?


AUDIENCE: I have a bug report I should probably get out of the way before we get too far along. Is it OK if I share my screen?


AUDIENCE: I promise it's clean.


AUDIENCE: Do you see this search result here?


AUDIENCE: OK. So everything here in yellow is not actually on the site. This is a fan from my YouTube channel, and he tells me that this Colorado standby and Norwell Power Systems, et cetera, this is not actually in the PDF. These are actually their competitors' names, and somehow Google is rewriting the titles or confusing the competitor with this PDF. Although the competitor has similar PDFs, they are in no way identical. And so they don't understand why their PDFs on their site have their competitors' names on some of their technical manuals. Does that make sense?

JOHN MUELLER: Doesn't really make sense that we would do that. I'd need the specific URLs to see what exactly is happening there, so if you could paste them in the chat or somewhere where I can get to them, I'd be happy to take a look at that with the engineering team.

AUDIENCE: Sure. The easiest way is just do the site search for, just as it says up there.

JOHN MUELLER: OK. The thing to keep in mind with site queries is that they're sometimes a bit artificial in the sense that we might be showing the generic title or not the title that users would normally see when they search in the search results, but I'm happy to take a look at that to see where we're getting that from or what we can do to avoid that happening.

AUDIENCE: Sure. That's essentially what I told him.

JOHN MUELLER: Perfect. Did Google use disavow data in the last Penguin update, and if Google runs Penguin again, will the disavow data be used? Yes, we did use the disavow data for the last update, and we'll definitely be using it going forward as well. We've been using it in the meantime, too, but of course the Penguin algorithm itself hasn't been updated again yet. That's something I'm hoping will come at some point in the near or mid-term future. I don't have any specific dates for you guys.



AUDIENCE: Sorry. There's been quite a controversy actually going on. When you say you're using the disavow file periodically, what exactly do you mean? Some people think that they can use the disavow file to get rid of negative SEO and to disavow any links that they or whoever else made. And then there are other people who think that it doesn't work that way on a regular basis, only if you have a manual action or a Penguin penalty, and again, only if Google remembers to use it in those circumstances, and you have to go and delete a bunch of links anyway. So the disavow file is almost secondary in a way, if that makes sense. Can you confirm whether you're using it on a regular basis, whether it automatically nofollows links or something like that?

JOHN MUELLER: Yeah. So we do use it automatically. It's processed when you submit it, and we use it every time we crawl the web. When we find links to your site that are in your disavow file, we'll essentially treat them similar to a nofollow. That's something that happens automatically all the time. It doesn't matter what kind of issues your site has. It might not have any issues at all, and you just say, I don't want to be associated with this weird stuff I found linking to my website, and you put it in your disavow file. We'll essentially break that connection and treat it as a nofollow link. With regards to manual actions, that's one place where the web spam team expects you to do a little bit more, and actually expects you to go out and manually remove some of these problematic links as well. Algorithmically, we definitely process this all the time. For manual actions and reconsideration requests, our web spam team really wants to be sure that you're not just playing a game and updating disavow files back and forth, but that you're actually taking this problem seriously and tackling it at the root.

AUDIENCE: So just again, to make sure it's perfectly clear, if you put a link in your disavow file, you will lose that link juice. Either good or bad, you will lose that link juice.


AUDIENCE: All right. Well, thank you very much. I have to go back and apologize to a bunch of people, but thank you for clarifying that.

JOHN MUELLER: Yeah. Well, PageRank, not link juice.



AUDIENCE: Related to that, if you're losing that link juice and those links are being nofollowed, how does Penguin then relate to that? If that's going on on an ongoing basis, if your links are being nofollowed, how does that relate to things once Penguin is updated? Does that just remove the penalty involved?

JOHN MUELLER: Essentially, Penguin is a web spam algorithm, and it picks up on these unnatural links as well. That's something where, if those links aren't passing any PageRank, we wouldn't see them as unnatural links. We'd essentially treat them as advertising that people can click on to go to your site, but they don't pass any PageRank, so it's not a problem for us. By disavowing those kinds of links, when these algorithms get run again, they'll see that these links shouldn't be counted anymore. The webmaster doesn't want any PageRank passing here, so we shouldn't take those into account for any kind of web spam purposes.

AUDIENCE: OK, thanks.
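For reference, the disavow file being discussed here is a plain text file uploaded through the Disavow tool: one entry per line, `#` lines as comments, and either individual URLs or whole domains via the `domain:` prefix. These entries are hypothetical examples:

```
# Links from a link network we could not get removed
domain:spammy-directory.example
http://another-site.example/page-with-paid-link.html
```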

JOHN MUELLER: All right. We have another question in the Q&A here. We sponsor gamers. In return, they put links in their YouTube videos. We get a lot of traffic from these links. We want to keep them in place. However, people copy the description onto their websites, and now we have a manual penalty. So kind of like I mentioned before with the manual action, the web spam team wants to see some level of change happening there, in the sense that they'd like to see these links being removed or nofollowed. You can also use the Disavow tool for that. In general, if people are just copying and pasting descriptions from YouTube and putting them on their sites, that's probably not something that the manual actions team would worry about that much. My guess is, without knowing your site, without knowing the specifics around it, that maybe there are other issues around unnatural links that you need to take care of as well. As far as I know from YouTube, these links definitely have a nofollow. If people are copying them directly, then I imagine that's a smaller group of people and probably not something that the web spam team would say is a critical problem you need to fix by yourself. I'd take a step back and look at your site overall and see if maybe there are other things that you're doing that could be seen as unnatural links. Maybe you've been going out and putting your site into tons of directories, or doing link exchanges, or doing anything else like that. Or maybe your SEO was buying links without you actually knowing that. These are the kinds of things that the web spam team would be picking up on. That's something where I wouldn't focus too much on these video links specifically. It might be that this is really only one part of the problem, and probably there are bigger issues involved that you need to take care of. We're unable to recover our site since October 2013, suspect Penguin.
We already removed all the unnatural links, but the penalty still hasn't been revoked. So the Penguin algorithm runs periodically. It hasn't run in quite some time now, so we're hoping it'll be run again in the near future. If this is really a problem specific to that, and you've cleaned up all of the web spam issues involved there, then you'll probably see changes when it runs again. At the same time, I wouldn't focus completely on this; also look at what you need to do for your site overall. Oftentimes, we'll see that one algorithm picks up one problem, but actually the bigger problem is maybe the content on the site, or maybe the type of site itself, where, if you're just aggregating content from other sources, that's probably something that the other algorithms will pick up on as well. I'd take a step back and make sure you have everything covered, make sure you've really cleaned up these link issues if you find them, and to some extent wait until the Penguin algorithm runs again to see how that's resolved. In the meantime, of course, you can do other things to keep traffic coming to your site and to be visible in search, or visible with your clients, your users. We observe the top 10 search results for many keywords, and sometimes there isn't any change for many months. I mean, domains in positions 11 to 100 can't move into the top 10. Is there something like frozen search results? No, not really. It's not the case that we manually curate these search results. We really need to make sure that these are all done algorithmically. We don't really have time to manually maintain all the search results. And there isn't any kind of algorithm in place that freezes the sites in those search results.
Sometimes we're just seeing that this is kind of the usual activity in that some sites are really, really good and they show up in the top 10, and there's a big gap between those sites that are really, really good or that are doing really well in search and the sites that are kind of not that bad, but they're not really that great either. Sometimes that gap is something that you might see as something like this, where you'd feel that the top search results aren't being replaced by the lower ones, but that's not necessarily a sign that they're stuck there. It's just that the other sites really need to step up their game and take it to the next level.

BARUCH LABUNSKI: Quick question about a server. Let's say you have a server problem and something's wrong with 100 pages. They're slow, and then they come back to normal. And let's say there's just a normal Google update on the weekend. Will that lower the site's rankings because of the site speed just for that particular time?

JOHN MUELLER: It shouldn't. We primarily use site speed as a way to recognize really, really slow sites, and if you just have a temporary issue, then that's not going to be a problem. Likewise, if you improve your site speed or the rendering time of individual pages by 10%, you go down from 10 seconds to nine seconds, or from 10 seconds to one second even, that's not something where you'll see a big change in the search results. We mostly try to differentiate between sites that are really, really slow and sites that are kind of normal, and we don't look at the individual milliseconds and say, well, we'll rank this higher because it's a tenth of a millisecond faster today. Likewise, you wouldn't see any big drop if your site was just a bit slow over the weekend.

AUDIENCE: John, could you tell us if you've been testing the new Penguin refresh that is coming sometime before the end of the year in the live search results where webmasters may or may not have noticed this test?

JOHN MUELLER: I don't know.

AUDIENCE: You have no idea if you've been testing it?

JOHN MUELLER: I don't know. I don't know. It's possible that we test this, but I think the thing to keep in mind with all these tests is we run on the order of hundreds of tests all the time. Every time you search, you're going to be in 10 to maybe 50 or 100 different experiments at any time, so these things run all the time. There are lots of different experiments that we do, some of them with regards to the layout of the search results pages, some of them with regards to the ranking. It's really hard to say whether one specific change that some webmasters notice is related to a very specific experiment that we might also be running at the same time. That's something where even if we take that back to the engineering team and say, hey, on this date, five webmasters noticed this change in the search results. What was that from? They'd have to take a couple of hours of their time and say, well, we'll have to diagnose what exactly happened on this day for these exact search sessions to figure out which experiment might have triggered something there. That's not something we generally even get involved in. If we see these fluctuations happening, for the most part we'll say, well, we're always experimenting. We're always working on improving the search results and trying to find a way to make them work better. It's not the case that we explicitly point out a specific experiment and say, this specific ranking change experiment that this team in Tokyo has been working on is what you specifically saw here, because that's really hard to dig out.

AUDIENCE: I understand that. So you're not confirming or denying that Penguin has been tested in the past two, three, four weeks. You don't know.

JOHN MUELLER: I don't know. I honestly don't know if we've been testing that specifically in the live search results.

AUDIENCE: At least when the update goes live, it won't be like the other updates, right, in terms of the way the transition happens from the results now to the results whenever they update?

JOHN MUELLER: I don't know. I can't promise you anything with regards to what the search results will look like then, but because so many people have been waiting for it, we'll definitely let you guys know that it's happening. Maybe we can let you guys know a day ahead of time so that you can see the change. I can't really promise that. We'll definitely not just try to put it out there and maybe a month later say, oh by the way, it's been live for a month now. I know you guys have been waiting for this update, and it's something that we'd at least like to let you guys know that you can also compare what's actually happened. How can we know if a backlink to our website is good or bad for our ranking and our search position? In general, if you're not aware of having placed any kind of weird backlinks or doing link schemes, those kinds of things, I wouldn't worry too much about this. It's something that the normal webmaster who doesn't go out and do explicit link building, those kind of activities, you don't really need to worry about that. There are always going to be weird and shady people that link to your website, spammers that copy the URL from your website and link to it from their spammy pages because they want to have something that looks legitimate in there as well, and it's not really the case that you need to review every individual link there. On the other hand, if you have been doing link building or you know your SEO has been doing link building, then that's something you can take a look at and think about, how did this link happen to show up on this page? Is it something that my SEO was involved in? 
If you can tell that this is something that your SEO was involved in and this link wouldn't be there otherwise, then that's something you might want to take a look and see, is this really a good link that I want to keep to my site, an editorial recommendation of my site, or is this something that essentially, directly or indirectly, was placed there because of something that I wanted to do? Those are essentially the two points you need to be looking at. Again, if you have a normal website and you've never been doing any weird link building activities, then reviewing the individual links is generally not something you need to do.

AUDIENCE: Hey John, can I ask a duplicate content question?


AUDIENCE: Well, it's more specifically repurposing content. How would Google feel about taking, for instance, a how to article that's been on a site for a while and creating a slide deck presentation with photos and an outline of the article, along with a voice over basically of the article text, and creating a video tutorial of that just as a different form of basically the same kind of content, and then placing it on YouTube with the same title or headline and description, and having a link back to the site? And also maybe syndicating, for lack of a better word, distributing the same kind of content along other sites, so YouTube and Vimeo, and maybe putting the audio on a podcast site, and then the slide deck on SlideShare, things like that. How do they feel about that?

JOHN MUELLER: From our point of view, that's not anything problematic. We wouldn't be trying to recognize the words in the video and say, this matches this article that I found a while back. Essentially, we treat these as individual pieces of content and we try to rank them appropriately. Sometimes it helps to have something on a video. Sometimes it helps to have the text version of the slides directly. Sometimes people are looking specifically for one of these forms of content. If you think your users want to see your content in different media, then--

AUDIENCE: Yeah, because I've gotten requests for videos on the site, and we really haven't done that before. I'm just trying to think of efficient ways to put a lot of the content into video format, and that seems like one, but I also wanted to make the best use of distributing it to the widest range of people.

JOHN MUELLER: That's essentially up to you. What happens in general with duplicate content is if we recognize that it's the same-- for example, if it's an HTML page, if it's a doc or something like that where we have the text-- then we can compare the text and we can say, well, this is a duplicate of this other article, and this other article is essentially the one we want to show in search, so we'll just show that one. So it's not that the site is penalized or demoted in any way. It's just we know there are two copies of the same thing and this is the version we picked to show on search. We don't show the duplicates in search. If you have it translated, if you have it in video format, converted into slides, for example, all of that is essentially not the same content anymore. It's something unique that kind of covers the same topical area, but that's not something where our algorithms will get involved and say, well, this is someone reading the text from that web page on a video. Therefore, we shouldn't show it together with the text page in the search results. Essentially, these are unique pieces of content. We maybe want to show them together in the search results if we find that they're relevant to the user.

AUDIENCE: How about putting the text version in the captioning in the YouTube video where it gives the option for putting the captioning in?

JOHN MUELLER: You can do that. I don't see any problem with that. And again, this is something where we wouldn't penalize a site for doing this, for having duplicate content. We just pick one or the other URL and we show it in search. If we can recognize that it's the same text, then we'll just pick one of these and show it in search. It might be the doc or the HTML page. It might be the video with the whole caption in there. If the video doesn't have the caption in there, then essentially we say, this is a video that's on this page. This is a text that's on this page. We can treat them separately. That's completely fine. From a practical point of view, I'd probably just try it out with a couple pages and see how they go, how people respond to the video format, how people respond to the slides. Maybe you'll find people love the video format and they just want to consume everything on video, or maybe you'll find that your content is maybe a bit more complicated and doesn't work that well on video. It's kind of up to you.

AUDIENCE: Great. Thank you very much.


AUDIENCE: John, can I ask a question, based on I see there's a country specific question coming up in the Q&A?

JOHN MUELLER: OK. Go for it.

AUDIENCE: It's basically how to handle hreflang but now with secure versus non-secure. So essentially, we have four domains-- the secure, the non-secure, and then the country specific, hreflang-ed over to the secure and the non-secure. What's the best way to make sure we get that right? Is that right that we have the original domain, but then 301-ed to the secure version, and then we have the hreflang country specific non-secure, but that one is also 301-ed across to its secure, but the hreflang goes from secure to secure, so you still only end up with one ranking domain?

JOHN MUELLER: Essentially what you do there is pick one of these URLs as canonical, essentially. So you'd have one canonical for maybe the US version, one canonical for the UK version. Do the hreflang between these canonical versions, so between the US and the UK version of this page, and just do the rel canonical within the language version. If you have, for example, the HTTP and HTTPS version and you pick the HTTPS version as canonical, then the HTTP version will have the canonical tag set to the HTTPS version, and that one will have the hreflang set across to maybe the US and the UK versions. So you have two layers there on top, the hreflang between the canonical versions. And within each language and country version, you'd have the canonical pointing to your preferred one or that set.
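The setup John describes can be sketched in markup. This is an illustrative sketch with hypothetical URLs, assuming the HTTPS URLs have been picked as canonical for both the US and UK versions; each hreflang set lists every alternate, including the page itself:

```html
<!-- On https://example.com/page (US, canonical):
     rel=canonical points to itself, hreflang links between canonicals only. -->
<link rel="canonical" href="https://example.com/page">
<link rel="alternate" hreflang="en-us" href="https://example.com/page">
<link rel="alternate" hreflang="en-gb" href="https://example.co.uk/page">

<!-- On http://example.com/page (US, non-canonical):
     only the canonical pointing at the HTTPS version; no hreflang here. -->
<link rel="canonical" href="https://example.com/page">

<!-- On https://example.co.uk/page (UK, canonical): mirror the hreflang set. -->
<link rel="canonical" href="https://example.co.uk/page">
<link rel="alternate" hreflang="en-us" href="https://example.com/page">
<link rel="alternate" hreflang="en-gb" href="https://example.co.uk/page">
```

That gives the two layers John mentions: hreflang across the canonical country versions, and rel=canonical within each country's HTTP/HTTPS pair.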

AUDIENCE: All right. I may have to watch that back again.

JOHN MUELLER: I might have to draw this out for one of the next hangouts. That's a good idea.

AUDIENCE: Well, thanks. It's also within our Webmaster Tools, we found that it's been about three weeks since we moved over to secure, but the impressions within the non-secure have just died. It's down to 5% of what it was, but the secure has in no way caught up with that yet. It seems to know that we don't want one version, but it doesn't seem to know yet that we do want the other.

JOHN MUELLER: That shouldn't be happening like that, though. That's something that should be more or less just moving from one version to the other.

AUDIENCE: It's not like it's 301-ing to a completely different domain. It's the same domain, just with an S.

JOHN MUELLER: I can take a look at that to see what's happening there. Because that should essentially be seen as equivalent, and what goes down by one should come up with the other one within, say, a couple of days. If it's been three weeks, then something weird is happening there. Maybe some weird edge case that we have to take a look at, Barry.

AUDIENCE: But it wouldn't be the first for our site.

JOHN MUELLER: Yeah, but still, we should be able to pick that up appropriately.

AUDIENCE: All right. I mean, the funny thing is if we click into the site map within the secure version, it says, for example, 400 out of 1,400 are indexed. But if you then click through to the next page where the graph is, those lines don't represent 1,400 or 400 at all. I don't know what I would send you to look at that unless you have access to our webmaster tools.

JOHN MUELLER: But that's something where usually from the indexing side, we should be able to pick those up regardless of anything from a quality point of view or otherwise. Especially if you have a redirect set up to the right pages, we should be able to figure that out fairly quickly. I will double check to see what's actually happening there.

AUDIENCE: I'll reply to the previous post, then. Thanks.

AUDIENCE: John, it should match after you switch over, over a certain amount of days? I could share my screen.

JOHN MUELLER: It should be settling down to something similar. It's never going to be exactly the same, but it should be within the normal fluctuations.

AUDIENCE: So for example, here's a search engine roundtable. I switched over on the 12th, I believe. I was in the high graph in terms of impressions, about a million impressions, probably about an average 850,000, and then I switched over. You see it gradually go up, but the highest point ever hit was 750. It's probably averaging about 500,000, 600,000 impressions versus the average of about 800,000 impressions per day, 900,000 impressions per day. Is that what people should expect?

JOHN MUELLER: It should be kind of similar. It's really hard to tell on your graph to see what it looked like before. This looks just like a temporary bump. Sometimes--

AUDIENCE: It looks like it's an average between 600,000 and 800,000. Obviously, weekends dip. Now it's still in the 500.

JOHN MUELLER: I can double check on that.

AUDIENCE: We're getting a different ratio.


AUDIENCE: John, can I ask a question?


AUDIENCE: Thank you. I'm thinking about marking up. If I'm marking up an HTML4 website with microdata, knowing that this markup syntax doesn't validate in W3C, as microdata should be used with HTML5 in order to validate, is this creating any issues with regards to crawling, indexing, and ranking? I mean, can that markup be ignored by Googlebot?

JOHN MUELLER: No. That's completely fine. We don't expect valid markup, and most of the web doesn't have valid markup, so we have to live with that.

AUDIENCE: I know, but I was thinking just to use microdata in HTML4 and if that's fine.

JOHN MUELLER: That's fine.
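As a concrete illustration of the question, microdata attributes look the same on an HTML4 page as on an HTML5 page; a strict HTML4 validator will flag them, but they can still be parsed. The item type and properties below are just a hypothetical example:

```html
<!-- itemscope/itemtype/itemprop are defined for HTML5, so an HTML4
     validator reports them as invalid attributes, but crawlers that
     read microdata can still extract the values. -->
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span> works as a
  <span itemprop="jobTitle">technical editor</span>.
</div>
```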

AUDIENCE: OK. Thank you.

JOHN MUELLER: All right. We have a bunch of questions still here, so let me try to go through them a bit quicker so that we get through a bunch of these. If someone is in the UK based on IP and visits our German site, can I redirect them to the UK site if I have an equivalent page? In general, this is something we don't recommend doing because it's very easy to also redirect Googlebot to the local page as well. If Googlebot crawls from the US and is always redirected to the US version of the page, then they'll never see the German version and we'll never be able to index that. Our recommendation is to show something like a banner on top to let the user view the UK site but also click through to go to the German site or vice versa. That's generally our recommendation there. Do you think it's better to have our blog on our website or to have it on WordPress with links from and to our website? Essentially, that's up to you. Both options work. Sometimes it makes sense to put it on the main domain if you want people to kind of associate strongly with your website. Sometimes it makes sense to put it on a separate domain or to keep it on Blogger, whatever. Will a closeable advertising banner showing once for each visitor trigger the top heavy ads penalty and/or Panda? When we look at the page layout, we look at what we crawl with Googlebot. Since Googlebot doesn't use any cookies, we'll probably see this ad every time we visit, so that's something that we would take into account there. You might want to consider turning it around and showing the first visitors your great content. If you see that these are repeat visitors, encourage them to sign up, for example. Showing them a big banner from the beginning is something we'll probably index or we'll use for the page layout algorithm. Why would Google stop showing our homepage for generic search and show alternating sub-pages after showing the home page in the search results for eight years?
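The banner approach John recommends over automatic geo-redirects can be sketched as follows. This is a hypothetical fragment, not Google-provided code; the URLs and class names are made up:

```html
<!-- Hypothetical banner shown to visitors whose IP geolocates to the UK.
     The page itself remains the German version, so Googlebot (and any
     user who deliberately chose this version) still reaches the German
     content instead of being redirected away. -->
<div class="country-suggestion">
  It looks like you are visiting from the UK.
  <a href="https://www.example.co.uk/equivalent-page">Go to the UK version</a>
  or <a href="#" class="dismiss-banner">stay on this site</a>.
</div>
```

The key point is that the suggestion is a visible link the user can ignore, rather than a server-side redirect that Googlebot would also follow.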
Just about every search result shows the home pages for all of the listings. I'd have to have explicit examples to see what you're looking at there. If you want, you can send me some links to search pages and the titles or the pages that you think should be ranking and what we're actually ranking so that we can take a look with the engineers. In Webmaster Tools, we see that we have backlinks that can be considered bad, so how can we remove them without badly affecting our ranking score and our search position? Essentially, there are two ways you get rid of them. One is to actually remove those from the other sites if that's something you have access to or something where you can email the webmaster to request that. Another thing you can do is to use the disavow file, which essentially tells us to ignore these specific links. With the disavow file, I'd recommend using the domain directive as much as possible so that we can actually take out all of these links, instead of you having to really figure out what exact URL from this shady site is linking. Using the domain directive makes it a lot easier to avoid mistakes there. In Webmaster Central, it tells you you can use JSON-LD, microdata, or RDFa, but the Google Developers site doesn't say anything about RDFa, and even says we recommend using JSON-LD. Alternately, you can use microdata. Can I use RDFa for a site link search setting? I don't know specifically. I'm pretty sure you can use RDFa for that. I just know for a lot of the rich snippets formats that we do use, we don't support JSON-LD for those at the moment, so we expect the formats that we have in the Help Center or that we link to from there. I don't know about RDFa with the site link search box in particular, so I can double check on that. Let's see if we can get some higher click ones here.
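The disavow file John mentions is a plain-text file uploaded through the Disavow Links tool, one directive per line. A minimal sketch, with placeholder domains:

```text
# Disavow file for example.com, uploaded via the Disavow Links tool.
# Lines starting with # are comments.
# Prefer domain: lines, as John suggests, so every URL on a shady
# site is covered without hunting down each exact linking page.
domain:spammy-directory.example
domain:paid-links.example
# Individual URLs are also possible, but easier to get wrong:
http://another-site.example/page-with-link.html
```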
If I use a search monitoring tool to check 500 keyword positions daily for my websites, will this affect Google's search click through rate, and in turn affect my rankings within Google if my click through rate goes down? There's two aspects here. If you're using automated queries to determine the ranking of your pages, that's against our terms of service and that would be against our webmaster guidelines, so that's something I'd like to discourage as much as possible. The other aspect here is if we recognize that these are automated queries, then generally, that's not something that we count towards the click through rate or the impressions. We try to recognize that where we can, but essentially, if you're using tools to scrape our search results to figure out your rankings, that would be something that would be against our webmaster guidelines. You're essentially crawling pages that are blocked by robots.txt, and we really don't want you guys to be doing that. Instead of using a search monitoring tool like that, maybe consider using the Top Search Queries in Webmaster Tools. Could you elaborate on your decision to re-include spammy news sites in Turkey four hours after the ban? Pages with spam are still accessible via Google. How should us white hatters take this? I don't know about the specific situation in Turkey, so I can't really tell you about that. I know it's something where various people from the web spam team have been working on this to try to find a solution and where we saw a lot of spam happening on some really big news sites and where we had to take some pretty strong action there. I don't know what was involved there, how much was cleaned up, what might still be out there. I think you sent me an email as well, and I also forwarded that to the web spam team to double check. Go ahead, Joshua.

JOSHUA BERG: In what you were talking about last, are there scenarios where that might hurt or affect someone's ranking negatively, seeing as it is against the guidelines that they were using any automated query?

JOHN MUELLER: I can't think of anything specific at the moment there, but since it is against our terms of service and against our webmaster guidelines, I imagine we reserve the right to take action where we see that this is really causing a problem for us. That's something where sometimes we'll do things like block IP addresses, block IP ranges, those kind of things. It's something that does cause a lot of problems for us. It causes extreme server loads on our side. At some point, I can't completely rule out that we might take action on these kind of things. At least in general, we wouldn't be removing these sites from the search results.

JOSHUA BERG: And ultimately, might it also have some effect on tools like autocomplete, messing with that in some way?

JOHN MUELLER: It shouldn't. We have some pretty strong algorithms for that, and we've seen this activity for a long time. So it's not completely new and something that our algorithms should be taking into account.

AUDIENCE: One thing interesting that I've seen with autocomplete is different clients have come to me regarding certain ones that might have negative queries that come up alongside their business. Or maybe businesses that didn't operate very well sometimes have an exceptional amount of negative queries that come up in autocomplete, which maybe they need to do some good PR instead. Aside from that, there really isn't much of anything that these businesses should be--

JOHN MUELLER: I can't think of anything specific there. All right. So we ran out of time. Got people waiting for my room here. I'd like to thank you all for your interesting questions and comments. I hope it's been an interesting one for you as well, and I wish you guys a great week.

AUDIENCE: Thanks, John.

AUDIENCE: Thanks, John. You too.

AUDIENCE: See you next week.