Reconsideration Requests
Google+ Hangouts - Office Hours - 03 July 2015

Transcript Of The Office Hours Hangout

JOHN MUELLER: Welcome everyone to today's Google Webmaster Central Office Hours Hangout. My name is John Mueller. I am a webmaster trends analyst here at Google in Switzerland. And part of what I do is talk with webmasters, publishers, like all of you and try to answer any questions you might have around search-- around your websites, those kind of problems. I see we have some newish faces here in the Hangout at the moment. Is there something that's on your mind that you'd like to start off with-- a question maybe?


JOHN MUELLER: Hi, go for it.

MALE SPEAKER 1: Can you hear me?


MALE SPEAKER 1: Hi, John, thank you. I've got a question. We received quite recently-- for the last three months this message from Google Webmaster Tools, or the Search Console should I say, that we've got too many URLs, basically. We've got an e-commerce website, which lists dynamic links generated from faceting. But we tend to control all those links. It's a message we had a long time ago. We've been blocking most of those URLs through robots noindex, nofollow or through the robots.txt, we have been blocking them as well. And we also configured the Webmaster Tools to block those URLs. Now, we've received this message again-- these messages. I'm checking the example URLs of those messages. And none of those pages are indexed. Basically, they are all noindex-- noindex, nofollow. What should I do?

JOHN MUELLER: So this message comes before we actually check those pages. So essentially when we crawl your website, we discover all of these new URLs. And we don't really know what is on those URLs yet. This is usually a sign that something within your website is generating all of these URLs. And that's making it harder for us to understand which URLs we actually should be crawling. So especially if they have a robots noindex on them, we have to crawl those URLs before we can see the noindex.

MALE SPEAKER 1: I understand that. But let's say it's one which is blocked through robots.txt, you shouldn't get to that stage. It should be blocked before.

JOHN MUELLER: Theoretically, we should notice that and not flag those, yeah. So if you want, you can send me the link-- maybe drop it in the chat here. And I can pick it up afterwards and double check our triggering on that. I know this is a bit of a misleading message sometimes in the sense that maybe you are doing all the right things. But it's still showing up. From our point of view, it's really something where we discover a lot of these new URLs on a website when we crawl it. And this is the kind of situation where, oh, it looks like something completely broke on the server or maybe there is a type of navigation within the website that generates infinite URL variations. And those are the kind of situations we want to alert the webmaster about. Maybe you're doing everything right already. And we should have recognized that better.
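
The distinction discussed above, that a robots.txt rule stops a URL from being fetched at all while a noindex tag can only be seen after the page is fetched, can be checked locally. The following is a minimal sketch using Python's standard-library robots.txt parser. The rules, the shop.example host, and the helper name is_crawl_blocked are illustrative assumptions, not taken from the site in question, and the standard-library parser may not match Googlebot's behavior in every detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a shop whose faceted navigation lives under
# /shop/filter/. These rules are an assumption for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /shop/filter/
"""


def is_crawl_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules disallow fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# A faceted URL under the disallowed prefix is blocked before any fetch,
# so a noindex tag on that page would never be seen by a crawler.
facet_url = "https://shop.example/shop/filter/color-red/size-m"
# A normal category page is not covered by the rule and can be crawled.
category_url = "https://shop.example/shop/shoes"

print(facet_url, "blocked:", is_crawl_blocked(ROBOTS_TXT, facet_url))
print(category_url, "blocked:", is_crawl_blocked(ROBOTS_TXT, category_url))
```

Running a check like this over the example URLs listed in the message would show which of them are covered by robots.txt and which are relying on noindex alone.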

MALE SPEAKER 1: One more thing-- I was thinking of the implementation on the site. You put a rel nofollow on the links themselves.

JOHN MUELLER: That can help.

MALE SPEAKER 1: That would help us? OK. And what is the danger of these messages with respect to Panda?

JOHN MUELLER: These messages are purely technical. So this is purely a technical issue on your website. It has nothing to do with how our quality algorithms would view your website. It's purely to do with crawling.

MALE SPEAKER 1: I'll try it there. Thank you.

JOHN MUELLER: All right. More questions from those who are new to these Hangouts. What can I help you with? Nothing special? OK. We have a bunch of-- yeah?

MALE SPEAKER 2: Can you hear me, John?


MALE SPEAKER 2: Great. Yeah, I've been watching the Hangouts for a while. I worked for a company called Columbus Direct. And we were hit with a Penguin penalty back in 2012. The website has been doing some pretty aggressive link building tactics for the 10 years preceding that with good success. And we've spent the last three years really trying to remove links and disavow. And so the penalty-- we got the notification that the penalty was removed more than 18 months ago. But our rankings haven't recovered. So I was just wondering whether you could give us any tips on how we can move this forward.

JOHN MUELLER: That's always hard because you almost have to take a look at the individual situation of the site there. So that's the .com version. Or do you have different versions?

MALE SPEAKER 2: We have different versions. But it's the .com version is the main-- it's for the UK market. And that's where the marketing effort was focused.

JOHN MUELLER: So I think if you had unnatural links, for example, on your website and that was a problem and maybe there was even a manual action there, then that's something that can still linger from an algorithm point of view in that our algorithms are still kind of worried a bit about those unnatural links. So I think that's probably what you're seeing there.

MALE SPEAKER 2: So we've engaged a number of agencies to remove links and to build up the disavow file, which has now turned into a kind of a mammoth thing. And I'm not sure if we've got any natural links left almost. And certainly we are a very large player in the UK travel insurance market, but not next to the people that are dominating the front pages in today's results. It's the aggregators, of course, and some of the bigger insurance players. And I was wondering whether we would ever make it back-- whether we should continue to pay people to remove links-- whether we should be looking at focusing in other areas or what you would suggest.

JOHN MUELLER: Well, I think this is always a tricky situation because you don't really know exactly what you might be doing there that would help your site in the long run or that would help in the short run. I think, in general, if you've worked hard to clean up all of these bad links-- to put them in a disavow-- to remove the ones that you can-- then that's probably the right step there. And moving forward, I'd just continue focusing on your website to make that the absolute best. So instead of focusing too much on links now, I'd try to find a point where you can say, well, I've cleaned these up as much as we can. There might be still some back and forth, but--

MALE SPEAKER 2: Because we'd put so much work into cleaning them up and there was manual re-inclusion because it was a manual action that we were notified about for unnatural links. I think from the business's point of view, when we had the message saying that the manual action has been removed-- but 18 months later, we are still in this kind of penalized position. So whether we've been penalized by algorithmic factors as well and how we could kind of identify what the problem that's still on the--

JOHN MUELLER: Yeah, so in a situation like this, it's certainly possible that some of our web spam algorithms are still picking up issues there or were picking them up when they were last updated. So that's kind of a situation where I'd try to find a point where you can say, well, I've cleaned up all of these web spam issues-- these problematic links as much as possible. And just from here on, I just want to focus on making the website better to really kind of build up the natural strength that's otherwise existing within the website. So instead of focusing on things that are so far behind you that you've already definitely cleaned up, I'd really just focus on the future.

MALE SPEAKER 2: OK, thank you.

JOHN MUELLER: I think there was one question in the chat. "On a .com domain with English and German content, will the German content benefit from English links and vice versa?" Yes, of course, if this is one website and you have different kinds of content and some kind of links, then that's something where, generally speaking, the website itself will grow in value over time as you gain links-- as you gain popularity-- those kinds of things. And the main thing I'd watch out for there is really making sure that you put this German and English content on separate URLs so that you don't have one URL with both German and English maybe side by side, but really separate URLs for each language.

MIHAI APERGHIS: John, more follow up on that. What if you have a ccTLD-- so a Romanian website-- and you get links from international English-based websites? Would those have a lesser effect than usual or would they be ignored? For example, some people wrote about something I did, but in English, they say, look, the Romanian version is over there. But it's in English. It's from an English website.

JOHN MUELLER: I don't see a problem with that. I think that's perfectly fine. That's not something you need to artificially channel to an English version of a page or something. I think that's perfectly fine. That kind of situation happens all the time. You also have that between different versions of your own website. So if you have an English version of the content and a Romanian version, you'll probably have a link back and forth saying this is the English version-- this is the Romanian version. And that's perfectly fine.

MIHAI APERGHIS: Right, but in that situation you know, especially if you have hreflang, that this is the same entity, just a different version of it rather than an external type. I'm curious if it has less effect if it's coming from a different language, for example. Because the users-- that link might be that, well, I don't know that language. So it's not as important.

JOHN MUELLER: No, I think we treat that just the same. The main difference I guess there is that the anchor text might be different. If we see a lot of anchor text going to one page with the specific text, we might assume that page is about the text. And if that text is the wrong language, then that might make it hard for us to figure out, OK, which of these pages should we rank for this English query. Should we rank this Romanian page or should we rank an English page we might have that confirms that on the page.

MIHAI APERGHIS: That makes sense.

MALE SPEAKER 3: Can I touch base on Dan's question?


MALE SPEAKER 3: OK. I know that part of his problem might be that the links that he removed are what was helping him rank. So that might be part of the reason why he's not ranking or getting back to where he was. But my main question is as you said that a lot of times after you revoke a manual review, the algorithm will still be itchy on that URL. How long does something like that last?

JOHN MUELLER: Sorry, I didn't understand the last part.

MALE SPEAKER 3: Oh, I was just saying you also said that at times after a manual review is removed, the algorithm will still be touchy on the actual domain because it's like, hey, this was a manual, penalized site. And how long will something like that take for it to be like, OK, it's cool?

JOHN MUELLER: It really depends on what all happened there. So I guess the main difference when it comes to manual actions and algorithmic changes is that for manual actions, we really focus on the links that you have visible in Webmaster Tools or in the Search Console. And if you clean those up, then from a manual action point of view, we'll probably say that's good. You've done a good job cleaning this up. And we will resolve that. And from an algorithmic point of view, we take a look at everything that we found, which might include some things that aren't directly visible in Search Console. So that's something where if you have a few small things that just aren't shown because they don't match that threshold that we use for Search Console, then that's fine. But if there is this giant mass of links out there that are way below this threshold in Search Console, then those can also add up. And that can be something that algorithms respond to because they see, well, it's not that there are individual links that are really visible that we should show in Search Console. But there's this giant mass of these small things that are hidden around the web with paid links or whatever they are. So that's kind of the difference there and how the algorithms see things compared to how manual actions-- how the web spam team would see that directly. But it's also not the case that our algorithms have any kind of a grudge where they would say, well, this had a manual action. They cleaned up all the problems completely. But because they had a manual action, we will kind of demote it for a while anyway. That's not the case. It's really that they're both looking at the current situation. And algorithms sometimes have a bigger set of data to look at. But they also might not be running as frequently as maybe a manual web spam review might go through them.

MALE SPEAKER 3: So just because a manual review has been revoked does not mean that it could be completely good to go?

JOHN MUELLER: Yeah. So if the manual review is kind of OK, then that's, at least from a manual web spam point of view, that means we're happy there. That doesn't mean that all of our algorithms are going to say this is a fantastic web site.

MIHAI APERGHIS: John, what do you mean about this threshold that you mentioned? Can you expand a bit on that?

JOHN MUELLER: So we try to show the links in Search Console that we think are relevant to your site. It's obviously a sample of the links that we have. It's not just the full set of links that we have. So that's what I meant there. It's not that there is a magic PageRank threshold or something like that. But we try to show what's relevant to the webmasters-- what they might be interested in. And you see that-- mute you guys-- so you see that, for example, when you have one site that has site-wide links to your website. So if you have two websites, and one of them links from every page to your other website, then you'll often see that we know about [INAUDIBLE]. But we show [INAUDIBLE].

MALE SPEAKER 3: So you see that, for example, when you have [INAUDIBLE].

JOHN MUELLER: I am going to mute you guys. Someone has an echo or is watching the YouTube video at the same time. So that's kind of what I was looking at there.

MIHAI APERGHIS: But is it the case that you might miss or might not display any link from a single domain? I noticed that you usually show one, two, three, five links even if it's site-wide and, in reality, there are 1,000. But is it the case that you might miss entire domains? So you have five links from a website. But you don't show any of them?

JOHN MUELLER: That's possible if we don't think that they're really relevant. So it's not something that I'd say that the average webmaster would really notice because maybe this is like a link on some site that nobody ever visits-- some spammy site that exists out there. But it got indexed. But it's not really something that we would say, well, people need to know about this.

MIHAI APERGHIS: Right, I actually [INAUDIBLE] was doing the disavow file for my penalized client. And I noticed he gave me a list of all of the directories he submitted his site into. I noticed that a lot of them weren't in the Search Console part. So I added them to the disavow file anyway, but I just thought it was.

JOHN MUELLER: Yeah, so usually what I do there in a case like that is try to find the pattern. So if you know that they submitted that to directories, and they used the same title or the same description, then search for that separately. So just do like normal Google searches and try to find those directories explicitly-- maybe like directory and then the title that they submitted-- something like that-- to find those as well.

MIHAI APERGHIS: Yeah, that makes sense.

ROBB YOUNG: John, following up on that, you said that the links that you find in there-- Google finds them relevant or there is some reason for them to be in there. But surely they can be good or bad. As Mihai was saying, if you're using Search Console to find bad links. But you're saying there are some that you should probably-- if they are not in there, don't worry about them. Are you advocating using external tools then to find bad links? Surely relevant goes both ways. Relevant is good or bad.

JOHN MUELLER: Sure, it can make sense in some situations. I think for the most part, when webmasters are looking at things like a disavow file, going through cleaning up old issues, then on the one hand, they didn't know what they were doing or they should've known what they were doing and can guide you towards that content like with the directories in this case. And on the other hand, usually you'll find the patterns in the links in Search Console. You'll see there's a bunch of directory sites listed here. So maybe there are more of those I could clean up as well. And, finally, if they're really not listed there, and it's something where you suspect there might be some other ones out there, then it might also be that they are so small that you don't really need to worry about them. But adding them to the disavow file is basically just more work than you gain from actually cleaning that out.

ROBB YOUNG: Right. But if we find links within Webmaster Tools, they are relevant for something, whether that's good or bad? It can go one way or the other, surely, because otherwise you wouldn't need Webmaster Tools to clean up links because they are all positive.

JOHN MUELLER: We try not to do a big quality analysis before we show those links in Webmaster Tools. But sometimes we just have to categorize them like that-- show some of these things in Webmaster Tools because we think they are relevant to the webmaster. And some of them we think, well, this is a sample that we've already shown you before like, for example, the site-wide links. And that's something that you don't really, really need to happen. I think especially the site-wide links is one of the reasons, for example, we recommend using a domain directive in the disavow file. So instead of focusing on those individual URLs that are listed there, try to find the pattern that matches all of that. And disavow that on a broader basis. I know it's not always easy. But you guys are the experts on these things. So it's good that you know a little bit about the background there, I guess, and sometimes you do have to make judgment calls and think about, well, is it worth digging even deeper into this hole of all of these tiny links that might still be out there. Or is this basically just time I should better be spending on making my site even better and saying, well, at some point, I have to make a cut and say I have to leave the rest of these small problems behind and focus on really creating more value on my site in general.
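
The domain directive mentioned above can be generated rather than typed by hand. Below is a minimal sketch that collapses a list of linking URLs into one "domain:" line per host. The helper name build_disavow_lines, the example hosts, and the choice to drop a leading "www." are assumptions made for illustration; which links belong in the file remains a judgment call, as discussed in the conversation.

```python
from urllib.parse import urlsplit


def build_disavow_lines(link_urls):
    """Collapse individual linking URLs into one 'domain:' line per host."""
    domains = set()
    for url in link_urls:
        host = urlsplit(url).hostname
        if not host:
            continue  # skip anything that is not an absolute URL
        host = host.lower()
        # Simplifying assumption: treat www.example and example as one site.
        if host.startswith("www."):
            host = host[len("www."):]
        domains.add(host)
    return [f"domain:{domain}" for domain in sorted(domains)]


# Hypothetical directory listings found in a link report or via searches.
links = [
    "http://www.directory-one.example/listing?id=1",
    "http://www.directory-one.example/listing?id=2",
    "https://directory-two.example/travel/insurance",
]

for line in build_disavow_lines(links):
    print(line)
```

Two listings on the same directory produce a single line, which is the saving described above: the pattern is disavowed once instead of chasing each URL.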

ROBB YOUNG: Surely you benefit, Google benefits, and every user benefits from you showing the ones that you want us to work on. There's no point in giving us a set of links that have no bearing whether it's good or bad. Otherwise, they are not helping to clean up the index or report bad links or to help you use disavow files for future research. You're just getting junk back?

JOHN MUELLER: We try to show those links actually independently of disavowing and the bad links/good links situation. So we try to show them because these are potential traffic sources to your site. And we had the links feature long before we had the disavow problems there or the disavow files. So it's not something that these links are provided in Search Console just so that you're cleaning them out. It's really for the site owner in general that these are potential traffic sources to your site. And some of these might even be nofollowed. Some of them might be disavowed already. So it's more than just a web spam check for your site to look at that.

MIHAI APERGHIS: By the way, if I could give a couple of examples that I've noticed that you show and don't show. For example, I noticed you don't show my grandpa was building some Blogspot websites brand-new and was putting some articles with links for them. I noticed you don't show that which is pretty normal because those weren't existent a day ago. So it's normal that they wouldn't even bring traffic back. But you do show some popular directories, even if it's not a good link-- a natural link. You do show it because maybe it might bring some traffic back even though it should be visible. Is that a good assessment?

JOHN MUELLER: Both of those cases probably fit. Yeah. It's really not the case that we provide this link list as a means of cleaning up web spam issues. But rather, we think that a lot of sites might want to see where they are getting linked from and where they might be getting traffic from. So that's essentially the background information there. But let's go through some of the questions that were submitted because people have been voting on those as well. And I don't want to leave that out.

We have this question from Barry. "In the German Hangout, I am told you said the Panda refresh is coming this week or next. What did you mean? Did you mean it still can come even later than next week?" So a few weeks ago, we said that this refresh would be happening in a few weeks. And so it's coming up. I doubt it's going to happen this week because it's a holiday. And it's Friday. And people are busy doing other things. But I imagine in the next couple of weeks, this is something that might be happening. So we have to set expectations. We try not to provide an exact date when these things roll out because things can always change in the meantime. But I expect this to happen fairly soon.

MIHAI APERGHIS: It will be a popular subject. Are you going to do maybe a special Hangout on quality content and best practices-- things like that, maybe?

JOHN MUELLER: I'm going on vacation next week so probably not. But maybe we can do something when I am back to look at these issues. If people have questions around that because I think especially the blog post from Amit Singhal, which is now four years back-- "23 Questions You Can Ask Yourself about the Quality of Your Content." That's still very relevant. And that's still very important to look at.

"Can Google crawl a hosted video and understand that the same video is hosted in more servers and published in more sites?" We do try to understand when something is a duplicate and treat it appropriately. So we do that with textual content-- web pages, for example. We try to recognize if something is a duplicate and filter those out when we show them in search. We do that with images where we can. And we do try to do that with video as well. So if you go and host your video on a number of different services, that doesn't necessarily mean that your video is going to show up five times instead of once in the search results.

"If you have a site like and under every video I put a button for the user who wants to download the videos he likes more, will this thing help me in Google Search. Can Google track this kind of user action?" If you make a site that works well for your users, then that's something we'll try to pick up indirectly at least. And people will try to recommend it more. But it's not the case that we are going to see what users do within your website because we don't really see that at all. So from that point of view, what exactly you do on your website is really up to you. But if we see that users really love your website and they recommend it to others, then that's something we can pick up.

"The pop up that's opening in another tab can ruin site positions and search results." So this is an almost obnoxious behavior if you visit a site and it just opens up pop ups. It's not something that I am aware of us picking up as a quality signal or using for ranking directly. But, of course, if you are obnoxious towards your users, then that's something that they might reflect in how they recommend your site. So that's again something more indirect that we might notice that people don't really like your site. They don't like to recommend it compared to other people's sites that they do like to recommend.

ROBB YOUNG: John, can I ask a question on that?


ROBB YOUNG: When you track-- actually, it's probably more of an analytics question where Search Console does the same now. When you track time on page or how long a searcher is on the landing page, do you track active window versus-- because let's say there's lots of sites. Let's say it's for cars or holidays or real estate where you're looking down on the map and you open, or I certainly do, open 10 tabs. I want to look at that one, that one, that one, that one. Then you work your way through the tabs because that's how most people browse these days. What about that 10th tab I've had open for five minutes without actually looking at it? So how long have I actually been viewing that page according to Google? The five minutes? Or the 10 seconds I look at it and close it?

JOHN MUELLER: We don't use analytics data directly in search. So that's at least one thing that we wouldn't do there. In general, when we look at this kind of behavior, it's something that we look at on an aggregate level where we'll look at it for maybe algorithm changes where we see, well, people are generally clicking on the first result when we make this algorithm change. And people are generally clicking on the second result with this other kind of algorithm change. So maybe we're doing something wrong there. So on that level, we might look at that. But it's not something that we'd look at on a per page level. And I imagine it's also when you look at the broader public, they don't exhibit that behavior that you're doing there like opening everything up in tabs.

ROBB YOUNG: It's only for a few sites. If it's certain types of site, whether you're looking at properties to rent or looking at cars or if you're trying to book a vacation and you open 10 hotels, I'm sure a lot of people do that. But it only affects certain industries you would imagine.

JOHN MUELLER: We look at this data more on an aggregate level when we look at overall the algorithm changes across all types of searches. So it's not something that we say, well, this would affect your site. People are opening it up in a tab and looking at it later or comparing it to other tabs that they have open. So that's not something I think would ever really affect anything there.

"Does an exact match URL still boost your rankings compared to a non-exact match URL when looking at a brand-new site?" As far as I know, no. So on the one hand, with exact match domains, so in general, that's a practice where you put the keywords for your site into the domain name and hope that your domain name ranks better because it looks like it matches those keywords. But as far as I know, that has no effect at all in search at the moment. It used to be, maybe looking back-- what was it, five years or longer-- that it did have some effect. But at least at the moment, that has no effect at all.

"Is it still more important for rankings to place your keywords in the first 100 words of your content? Or does it not matter anymore?" I suspect that never really mattered. So we try to understand what these pages are about. And we look at the whole page if we can. There's obviously a limit to the size of the page that we can download. But I think that's around 10 megabytes. So if you have your content within those 10 megabytes, then we'll be able to recognize that and show that in search.

"We're using parameters for campaign tracking like utm_source. Is it duplicate content if parameters on each site are unique, but the content is the same?" So technically that would be duplicate content in the sense that you have different URLs with these different parameters leading to the same content. In practice, that's not a problem though. So on the one hand, we recognize that this is exactly the same content where we can filter it out even before we index those pages. On the other hand, if we do manage to index those pages separately, for example, if there's a date or timer on the page or the sidebar that's dynamic, then we'll try to filter those out when we show them to the users. So that's not something I'd really worry about there. The other aspect is if we find those parameters just within your normal analytics code, for example. Then that's something we'll try to filter out on our side anyway. We'll try to recognize that. If we find those parameters in normal links within your website or external links to those pages, then we might try to crawl and index those pages like that. But you can block that by using the URL parameter handling tool to let us know that these parameters are actually totally irrelevant. And chances are if you use analytics anyway with these parameters already that when you go to Search Console to the URL parameter handling tool, it'll already say we recognize these parameters and these probably aren't that relevant for your site.

"Does using keywords in the domain-- is that still bad for SEO? Will I get an EMD penalty for using the keyword in the domain name?" As I mentioned before, that's not really something I'd worry about here. With regards to the EMD penalty that was mentioned there, this is something that we mentioned a while back where we are seeing a lot of sites use exact match domains with really low quality content. And really the combination of having really low-quality content on these exact match domains is what the problem is for us. It's not so much that your keywords happen to be in your URL because a lot of sites have keywords in the URL-- a lot of brands have their brand name in the URL which makes sense. And just because that matches what people are searching for doesn't mean that it's automatically bad.
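
For the campaign-tracking question above, a site owner who wants to see which tracked URLs lead to the same page can normalize them locally. This is a minimal sketch and not a description of how Google filters such URLs; the "utm_" prefix list, the shop.example URLs, and the function name strip_tracking_params are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed prefixes of parameters that only carry campaign tracking data.
TRACKING_PREFIXES = ("utm_",)


def strip_tracking_params(url: str) -> str:
    """Return the URL with campaign-tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


tracked = [
    "https://shop.example/shoes?utm_source=newsletter&color=red&utm_campaign=july",
    "https://shop.example/shoes?color=red&utm_source=partner",
]

# Both tracked variants reduce to the same underlying page URL.
for url in tracked:
    print(strip_tracking_params(url))
```

Parameters that change the content, such as the hypothetical color filter here, are kept, so only the tracking variants collapse together.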

MALE SPEAKER 3: So can I touch base on that?


MALE SPEAKER 3: So basically what you're saying is if somebody owns like, example, If you have crap content, then, yeah, it might be looked at and maybe pushed to the back a little bit. But if you own and you have good information about tools, quality content, and user interaction, then having the will actually help you rank a bit better because you have the EMD? Is that what you're saying?

JOHN MUELLER: No, no. It's not that you would rank better because of that exact match domain name. But it wouldn't cause any problems for your site.


JOHN MUELLER: So just because you have a great site and the keywords for your site happen to be in the domain name doesn't mean that we're going to penalize your site.

MALE SPEAKER 3: So EMD has nothing to do with ranking at all anyway?



JOHN MUELLER: And we've seen this a lot in situations where someone will go off and buy green tools, brown tools, yellow tools, garden tools, all these different domain variations and essentially put the same content up. And that's something that we were trying to target with that specific update. So if these are really doorway sites essentially with low-quality content, then that's something that we might take action on. "Anchor text with internal links that are not part of the navigation bar-- does the text matter if the same text is used too often? Is it penalized?" No, that's absolutely no problem. You can link within your site however you want. The bigger issue that we sometimes see is that people try to use this as a way to create a sitemap page within their content where they'll have their main content on top. And on the bottom, they have this big chunk of text with keyword-rich anchor text links to individual parts of the site. And it's not so much that these links would be unnatural. It's just that it looks like keyword stuffing. So we look at that page. And we find this big chunk of text on the bottom with just keyword, keyword, keyword. And that looks like keyword stuffing and our algorithms might look at that and say, well, I don't know if I can trust this page. So it's not that those links are not natural. But it's just that you've stuffed it with so much text that we don't really know what's actually relevant on these pages. "If you 301 redirect HTTP to HTTPS, will there be a slight drop in ranking similar to what happens when you move from one site to another?" Not really. So this is not something that you would see in practice or where you would see any kind of a drop. So we try to treat HTTP and HTTPS sites as similar as possible. And if you move from one version to another, then we'll try to just forward all the signals that we have there. "What's the quickest way to get Google to read a disavow? 
Do you have any tips on structure or content of this disavow? Should there be more or less detail?" So we read the disavow file right when you submit it. There's nothing special that you need to do there. What I would recommend doing though is using the domain directive as much as possible. So instead of listing individual URLs, try to list the whole domain. That saves you a bit of time because you don't have to chase all of those individual URLs. And it also makes it easier for us to process it because we don't have to match all these individual URLs. So that's maybe a tip to watch out for. Past that, it's really up to you. You can leave comments in a disavow file if they help you. But we don't read them at all. So these are processed by our systems automatically. If there's a comment in there that you want to give Google, then the disavow file is probably a bad place to leave that. "Googlebot found extremely high number of URLs. I think we talked about this briefly before. When using a CDN for speed, the number of images show up indexed even though we have set up the rel canonical." So I guess there are two aspects with regards to images that are kind of relevant that you'd want to watch out for. On the one hand, you need to make sure that your web pages point to the images directly as much as possible. On the other hand, we tend not to recrawl images as quickly as web pages because usually they don't change and usually they're pretty big. So what happens is if you change the URLs of your images, then it takes a lot longer for us to actually find them. So as much as possible, really try to keep the URLs of your images the same for as long as possible. That means if you're using a CDN, try to maybe use the same URLs as you had before if that's possible. If you're moving from one host name to another, then just keep in mind that it takes a lot longer for us to process that when it comes to images. 
So definitely make sure you don't have any kind of session parameters in the image URLs. Really try to make sure that the image URLs that you have specified on the web pages match exactly the ones that you use for hosting the content that you want to have indexed. We also don't use rel canonical on images directly. So you really need to either redirect those image URLs if you're moving from one site to another or at least make sure that the web pages really point at the actual image URLs. Another aspect we sometimes see is that people will set up a CDN that's kind of like a round robin setup where you have different host names where the images are sharded across different servers. And if you do that, just make sure that you're always pointing at the same host name for each image so that it doesn't happen that we crawl the web page and we see one link to version A. The next time we crawl that page, there's a link to version B. The next time we crawl it, to version C. But instead, really make sure that the URL you use for the image is as static as possible and that it stays the same.
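
[Editor's note: the point about sharded CDN host names can be illustrated with a small sketch. It picks the host name from a hash of the image path, so a given image is always referenced at the same URL on every page render. The host names and the function name are hypothetical and are not from the Hangout; this is one possible approach, not a description of how any particular CDN works.]

```python
import hashlib

# Hypothetical shard host names; substitute whatever your CDN actually uses.
CDN_HOSTS = ["img1.example.com", "img2.example.com", "img3.example.com"]


def image_url(path, hosts=CDN_HOSTS):
    """Return the same CDN URL for a given image path on every call."""
    # Normalize first so "/a.jpg" and "a.jpg" land on the same shard.
    normalized = path.lstrip("/")
    # hashlib digests are stable across processes and restarts, unlike
    # Python's built-in hash(), which is salted per process for strings.
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(hosts)
    return f"https://{hosts[index]}/{normalized}"
```

[Because the choice depends only on the path, a crawler that fetches the page several times sees the same image URL each time. Note that changing the list of host names would move some images to a different host, which is the kind of URL change the answer above advises against.]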

MALE SPEAKER 3: Well, what we did is we were trying to get every page to load under two seconds. And the only way that we accomplished that with a 2 megabyte page was to use the MaxCDN service. So we set that up almost a year ago now. And the images are still showing in the Google Webmaster Tools area. And it's about a year that that's been set up. And we haven't made any changes whatsoever. But the CDN itself-- we made it where it's And then, of course, the string. And that's what we've done.

JOHN MUELLER: Is it crawlable? Is there a robots.txt in place maybe?

MALE SPEAKER 3: Yeah, they actually said for us to use the noindex, nofollow that they offer because you don't want to run into duplicate content issues. So should we actually take that noindex, nofollow robots off of there? Because I can see what you're saying. It's not getting it because the robots on that CDN is saying don't index this. So maybe try taking the noindex off? We just didn't want to get in trouble for duplicate content. You know what I mean?

JOHN MUELLER: I wouldn't worry about duplicate content there. Because if we recognize it's exactly the same as we've already seen, that's a technical problem for us. We just try to fold those versions together and treat them as one. But if we don't have any version at all, then--

MALE SPEAKER 3: Yeah, this just happened.

JOHN MUELLER: So I'd definitely look at that. And maybe what I'd also double check is in Search Console-- in Fetch as Google-- you can look at the rendered view of a page. Really make sure that in the rendered view, the images also show up so that it's really kind of a check to see that Googlebot can actually access those images. And if there is no noindex in the way, then actually we should be able to index those too.
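
[Editor's note: the two checks raised in this exchange, whether a robots.txt file blocks the images and whether a noindex is in the way, can be roughed out offline. The sketch below uses Python's standard urllib.robotparser and looks for an X-Robots-Tag response header. The function names and example URLs are hypothetical, and Python's parser does not match Googlebot's robots.txt handling in every detail (for example wildcards), so treat the result as a first pass and confirm with Fetch as Google.]

```python
from urllib.robotparser import RobotFileParser


def can_googlebot_fetch(robots_txt, url):
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


def has_noindex_header(headers):
    """Return True if the response headers carry an X-Robots-Tag noindex."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            return True
    return False
```

[Usage, assuming you have already downloaded the CDN's robots.txt and the response headers for one image: call can_googlebot_fetch(robots_text, image_url) and has_noindex_header(response_headers). An image needs the first to be True and the second to be False before it can be indexed.]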

MIHAI APERGHIS: John, regarding fetch and render, there was actually a product forum thread about somebody asking that if he uses jQuery to generate a rel canonical tag I think that was. But it doesn't show up in the code when he does the fetch and render process. And the rendered image wouldn't show anything else in the head portion of the HTML. Does that mean that Google doesn't trust this?

JOHN MUELLER: It's tricky. If you're doing something like that with JavaScript, then chances are we'll pick it up. But it's not guaranteed. So if, for example, for any reason we can't process the JavaScript file, then we'd crawl the page without the rel canonical. The same thing happens if you use JavaScript to add noindex, for example. So it's a situation where I imagine most of the time we'll get it right because we'll be able to render these pages with JavaScript and see that JavaScript adds a rel canonical or noindex to the head and we can use that. But if for whatever reason we can't process the JavaScript file, then we have that version without that extra metadata. And if that's a problem, then you might want to prepare for that and just have some kind of failsafe within the HTML.
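
[Editor's note: one way to check whether a page has the kind of HTML failsafe described here is to parse the raw source without running any JavaScript and see whether the rel canonical or robots meta tag is already present. The sketch below uses Python's standard html.parser; the class and function names are hypothetical.]

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collect rel=canonical and meta robots from static HTML only."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values:
            self.canonical = attributes.get("href")
        elif tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            self.robots = attributes.get("content")


def static_metadata(html):
    """Return what a crawler would see if the JavaScript never ran."""
    parser = HeadMetadataParser()
    parser.feed(html)
    return {"canonical": parser.canonical, "robots": parser.robots}
```

[If static_metadata returns None for a value that your JavaScript is supposed to add, then the page has no fallback for the case where the script cannot be processed.]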

MIHAI APERGHIS: And one more thing regarding that. So is the code that you see in the fetch and render result from before you render the JavaScript or after? Because that person would have never seen that rel canonical tag in the code.

JOHN MUELLER: Yeah, so with the fetch and render, we show the screen shot of the rendered version. And we show the HTML source that we actually downloaded. So you wouldn't see if there's a rel canonical that we would render with JavaScript. I guess one thing you could do is set up a test page that inserts rel canonical and then uses JavaScript to try to read that out and display it in the big text so that you could see it in a screen shot. But usually we have no problem finding that kind of rel canonical when it's generated with JavaScript.

MIHAI APERGHIS: That was my suggestion.

JOHN MUELLER: All right. "We're currently receiving a large amount of referral traffic from spammy domains. Will this analytics data affect our rankings? Would you suggest blocking these domains in htaccess as well as filtering in analytics?" I can't speak for the analytics team directly. I know they're aware of this problem. And they're working on resolving that. One of the things to think about there is a lot of the spammy traffic that I have seen at least with my sites is from sites that haven't actually visited my website. So it'll look like referral traffic in analytics. But it's not actually from someone who visited the site. So by using htaccess to block them on my server, that wouldn't necessarily block that from appearing in analytics. So you might want to double check that before you spend a lot of time working on an htaccess file. There are ways that you can filter this data in analytics though. So that's something you might want to look into. I've seen a lot of blog posts that have the instructions on how to set that up. "How much does silo structure affect rankings for e-commerce sites?" I don't really understand that question. I guess that's about the URL structure within e-commerce sites where you have a complicated structure sometimes. From our point of view, how you set up the URL is essentially up to you. It's not something that we'd say you need to do it like this or like that. Some people use paths. Some people use dashes in between different parts of the URL. Some people just use the IDs that come directly from the database. And all of that works for us. What's important for us is that we can actually crawl your website and go from one URL to the other to find all of that content. And ideally, that we understand some kind of a structure when we crawl the website so that we see this is the main page. This is a category page. There are lots of detail pages here. Maybe some of these detail pages are related to each other and they have cross links. 
All of that helps. What doesn't work for us so well is if you just have one homepage. And it has a search box in it. And none of your content is actually linked from those pages. So if we have to make up random search keywords to try to search through your website to find those pages, that's going to be very tricky. But if you have a clear URL structure that we can click through and follow through to your pages, that's perfectly fine. How you structure the URLs themselves-- if you use different path elements or all one word with dashes, that's really up to you.
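
[Editor's note: the requirement that every page be reachable by following links, rather than only through a search box, can be checked with a simple breadth-first walk over a site's internal link graph. The sketch below assumes you already have that graph as a dictionary mapping each page to the pages it links to; the function name and the example paths are hypothetical.]

```python
from collections import deque


def unreachable_pages(links, start):
    """Return pages that cannot be reached by following links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # Every page we know about, whether it links out or is only linked to.
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return all_pages - seen
```

[For example, with links = {"/": ["/tvs"], "/tvs": ["/tvs/1"], "/search-only-item": []}, calling unreachable_pages(links, "/") returns {"/search-only-item"}, a page that a crawler following links from the homepage would never find.]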

MIHAI APERGHIS: John, I think the question mainly refers about segmenting your site into separate topics and the topics themselves don't really link from one another. So that's siloing. Like you have a TVs section and gardening products. And those two don't really link to one another thereby separating them.

JOHN MUELLER: That's fine. That's fine too. If we can recognize those individual parts of your site, that's perfectly fine. A lot of people have this structure naturally. And they use maybe different CMS systems on the same domain. Maybe they have an e-commerce site and a blog. And they're kind of different CMS systems. So they don't cross link by default. And that works too. As long as we can really crawl from any place within the website to discover the rest of the website, then that should work for us. "If a person takes a hotlink from my video or photo that I host on my website and publishes it in his website, is that counted by Google as a backlink for me?" I don't think we count that as links directly. But it does help us to understand the images better and to figure out how we should be showing these in image search. "You confirmed in the German Hangout that a Panda update is coming. Is this just for German or for all languages?" These updates try to be as general as possible so that they apply to all languages. "How to implement the sitelinks search box with a Smarty template? I tried using literal. But it's not working." You probably want to talk with someone who has experience working with this kind of templating language. So that's possibly not something that you'd find help for in the Google Webmaster Help Forums because you really need to have someone who knows that templating language and can give you some tips there.

MIHAI APERGHIS: Actually, I used Smarty to implement that. And I didn't have any issues.

JOHN MUELLER: OK, so talk to Mihai. Ha, you just volunteered, sorry. "In regards to 404 pages, how long does Google keep these 404 URLs in Google's index before they get removed? Even if it's removed, can we use 301 to get those pages back to the index?" We do try to remove them as quickly as possible. With 404 pages, we might check the URLs maybe once or twice or three times just to make sure that it's really gone before we actually drop it completely from the index. But there's no fixed time frame where we would say after one week it'll happen because sometimes after one week, we wouldn't have even crawled that URL once. So it really depends a lot on the site and how we crawled that. "Can a drop down navigational menu affect my site's crawling, indexing, and ranking?" Sure, theoretically. So with a drop down navigation menu just like any other kind of navigation, it helps us to understand the site structure and to kind of follow through and find links to the individual parts of the site. And if you don't have links to the rest of your site-- to the rest of the site structure-- that makes it really hard for us to actually pick up and figure out what we should be showing here. So if you have a navigation menu that leads to all the parts of your site, that's perfectly fine. If that's a drop down navigational menu or a sidebar or whatever else, that doesn't really matter so much as long as we can crawl them. Sometimes what's tricky is that you'll use a JavaScript widget for this navigation. And if the JavaScript is blocked by robots.txt, then we won't be able to see that menu at all. Then we'll just see that you're calling some JavaScript file. We don't know what's in that JavaScript file. So we can't use that to crawl the rest of your site. But if this is a navigation menu that we can pick up for crawling, then we'll try to use that to crawl the rest of the site. 
"For new websites that are just beginning the SEO process, is it better to launch the website even when it's not 100% optimized and finished or really make a complete site that's full and optimized with metatags and alt titles before launching?" I think this is kind of up to you. It's not the case that we would count it against the website if they don't have metatags or clean titles on their pages. We try to look at that every time we crawl those pages and take that into account. And if you want to launch now and you know that your site isn't 100% perfect. But you think people are just waiting for it and eager to look at your site and to recommend it to others-- maybe launching a slightly incomplete site makes sense. On the other hand, if you want to make sure that everything is 100% perfect before you launch, then maybe waiting until you're satisfied with this stage makes sense. So it's really up to you.

MALE SPEAKER 3: Can I ask question on that?


MALE SPEAKER 3: This is something that is like-- everybody talks about this all the time-- about new site versus old site. Does that matter? If I have a site that's 20 years old with the same length, same content, as a site that's a month old-- same length, same content-- does it matter if my site is 20 years old versus a site that's a month old? Because it's asked everywhere.

JOHN MUELLER: Yeah, it's a very theoretical question because if your site is 20 years old, then chances are it has a lot more history than something that's maybe a month old. So saying that they are exactly the same with regards to links, for example, that's usually not the case. But it's definitely not the case that there's any kind of old site bonus where we would say, well, this domain has been up for 10 years. Therefore it should receive a ranking boost. That's not the case.

MALE SPEAKER 3: So all things equal, the age doesn't matter.

JOHN MUELLER: It doesn't really matter, no. And I guess one aspect that might come into play is if it's a really new site. And we don't really have any good signals for that site yet. So it takes a while to understand that this is a site that has good signals where we see people are recommending it. It looks like something we should show in search. But if you're talking about something that's maybe a month old, then that gives us time to look at how that is shown-- how we should show that in search. So that's kind of comparing a site that's one-day-old to something that's 20 years old. It's really hard. But once you're past like a month or something, then we understand how the site should be shown in search. Let's see. "Can I include canonical URLs in site maps for SEO? For example, url.htm? Here's a duplicate of example I use this tag in the sitemap." I recommend trying to do that on the pages themselves or in the HTTP header instead of doing it in the sitemap. I don't think it's directly possible in the sitemap file, but I might be wrong there. I think with mobile pages, for example, we do have the option of having the link rel alternate in the sitemap file. But I'd really try to keep that in the HTML as much as possible-- the canonical. "In Search Console, I recently received a message, add app property or URL error reports. But I have already added the app index app property. And reports come there. Is this message just a notification?" Yes, this is essentially a notification. So to back up a little bit, we recently added the ability to add Android apps directly to Search Console. So that if your app is indexed for search using the app indexing API, then we can show you information directly there if you have that app verified in Search Console. In the past, if we recognized that your app belongs to a website, then we would have shown your app information on the website in Search Console or rather within the messages that we sent to the website in Search Console. 
So it's kind of, I think, maybe a month or two ago we added that ability. So if you have an app that you're using app indexing for, make sure you have it verified separately in Search Console so that you actually get this information about clicks and impressions-- any crawl errors that we might have for the app-- all of that. "Is app indexing for app-only content already reflected in the search results? At Google I/O, Scott Huffman said that some of this would finish in a few weeks." I don't believe it's live yet in search. So app-only indexing means you have an app that has content. And we try to show that in search results so that if someone is searching for your content and they're on a smartphone that allows for this app, then we can recommend that to those users directly in search. But I don't think that's live just yet. I know they're working on some of the issues there to make that easier. It's always a bit trickier if you have an app with content in it that doesn't have an associated website because then we really have to crawl through that app to actually get all of that content. And that's definitely not as easy as crawling HTML pages. All right, we don't really have time. But I have a bit more time. So if any of you have any questions, if you want to hang around a bit, let's chat.

MIHAI APERGHIS: John, this is actually regarding one of the questions left. And I've noticed this in Romania as well. There are some sites that hide the content behind a pay wall, but do allow Google to actually crawl the content that would be otherwise accessible only if you pay. And so that content in Google is crawled. And when you search in Google, you see the result. And you see the meta description containing those keywords that you searched. But when you actually access the website, you don't see any of that. They ask you to get a subscription. And that's not one of the most friendly user experiences.

JOHN MUELLER: Yeah, we classify that as cloaking. So you're showing users something completely different than what you're showing Google. And that might be something that the web spam team would take action on if you submit a web spam report. It's always kind of a tricky situation there. What we usually recommend for sites that do want to have a pay wall or sign up flow is to look at the First Click Free setup where users coming from search directly have the first couple of clicks free to actually see the content. And then they are confronted with the pay wall or the login page.

MIHAI APERGHIS: So I could submit a web spam report or a feedback thingy?


MALE SPEAKER 3: I have two questions.


MALE SPEAKER 3: One is we-- as you know, I'm in the celebrity niche stuff. But we had another blog that we were doing about celebrities-- what's going on in the news and that kind of stuff. And then we have another site that we do but found that we didn't have enough time to continue doing both. So both celebrity related-- so is it OK to 301 redirect the celebrity site that we were doing before to our main one because we want to focus on the main one instead of trying to separate both and put up crappy content versus one site with good content?



JOHN MUELLER: That makes sense. If you can do it in a way that the new content is equivalent to the old content, that's perfectly fine. If you can do it in a way that--

MALE SPEAKER 3: It's closed.

JOHN MUELLER: If you can find something like a page-by-page matching and do the redirects like that, that would be totally awesome. And that would be absolutely no problem at all. If this is the case that you're folding one site into the other and it's essentially the same target audience, then even just doing a site-wide redirect like that to maybe a home page is also an option.
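
[Editor's note: the two options mentioned here, page-by-page matching where an equivalent page exists and a fallback to the home page otherwise, can be combined when planning the redirects. The sketch below builds such a plan from a hand-maintained table of equivalent pages. The function name, the domain and the paths are hypothetical; the resulting map would still need to be turned into 301 rules on your server.]

```python
def build_redirect_map(old_paths, equivalents, new_origin, fallback_path="/"):
    """Map each old path to a full URL on the new site.

    equivalents is a hand-maintained dict of old path -> new path for
    pages with an equivalent; everything else goes to fallback_path.
    """
    origin = new_origin.rstrip("/")
    redirects = {}
    for old_path in old_paths:
        target_path = equivalents.get(old_path, fallback_path)
        redirects[old_path] = origin + target_path
    return redirects
```

[For example, build_redirect_map(["/gossip/story-1", "/about"], {"/gossip/story-1": "/news/story-1"}, "https://main.example.com") sends the story to its matching page and the unmatched /about page to the home page.]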

MALE SPEAKER 3: That's what we did. We did the htaccess. OK, and my other question is that you mentioned that site-wide links can be bad. What about the same scenario-- another site like TMZ is linking to us in their sidebar. But that gives us a site-wide link. But we don't really want to disavow TMZ because they are a strong company. So are you talking about the site-wide links that can hurt you as like I'm a celebrities blog. And this is a blog about gnomes.

JOHN MUELLER: No, site-wide links aren't necessarily bad. And in a case like that, that's a perfectly fine situation where you say, well, this website is linking to my website from all its pages. That's a fantastic endorsement, I think. That's not something you need to block. It's just that we wouldn't show all of those individual links in Search Console separately. So if TMZ is linking to your website, you'll probably see a handful of links from TMZ in Search Console, but you won't see all of the individual URLs from TMZ that are linking to your pages. So it's not that they're bad. It's just that we don't show all of them because we think the sample gives you enough information already.


JOHN MUELLER: One last question from anyone?

ROBB YOUNG: John, I'll ask. I don't know if you'll answer if there's ever been any update change in our original issue from two years ago?

JOHN MUELLER: I'd have to take a look.

ROBB YOUNG: I can wait.

JOHN MUELLER: I thought we looked last time, right?

ROBB YOUNG: No, you looked at the new site because we'd moved completely. But we still have the 10-year-old domain that had that original mystery underlying problem.

JOHN MUELLER: That's the one without the E, right?

ROBB YOUNG: Yeah. And if it will ever lift because we don't want to close it because it's got 10 years of history there. If it comes back, we can then 301 to the new one and we should theoretically get that boost back.

JOHN MUELLER: At least for the moment, I'd continue working on the new site.

ROBB YOUNG: All right, then there's nothing you can give us in terms of-- because it was a mystery problem. So I don't know if that's something that you needed time to work out and whether it's been enough time. It was two years.

JOHN MUELLER: I'd have to check with the team what's happening there. At least for the moment, it looks like it still is affected by this.

ROBB YOUNG: Right. And this is-- affected by what?

JOHN MUELLER: Affected by the algorithms that affect that.

ROBB YOUNG: All right, if you wouldn't mind just asking again.

JOHN MUELLER: Yeah, I'll check.

ROBB YOUNG: I'll drop you a Google+ reminder as if you need one.

JOHN MUELLER: Right. Mihai?

MIHAI APERGHIS: Yeah, this is actually for one of the other product forums that I couldn't find an answer to. This is a URL-specific issue. I've noticed that if you use a site query on that URL, it shows that the URL is indexed. But if you're searching with intitle for the title of the page, or allintitle, it doesn't show any results. And the person was complaining that whenever searching for that specific title, his page isn't showing in the results. It's actually showing another page of his website that links to that page. And that's not very user-friendly for the searcher. It's a really interesting issue because it looks like it's indexed. But it's not indexed.

JOHN MUELLER: It looks like it's indexed normally, actually.

MIHAI APERGHIS: So why wouldn't it show up in intitle or allintitle?

JOHN MUELLER: I don't know. I haven't used those queries in a really long time. So I don't really know what we're looking at there. But if you look at the cache page, you can see that it's actually there. So from that point of view, it's not something where I'd say it's not indexed or we're not actually picking that up. But I'd have to check what the title shows there.

MIHAI APERGHIS: Even if you search for the quote title quote, it shows up a different page of his website that links to that page instead of actually showing the page itself.

JOHN MUELLER: OK. I don't know. I'd have to take a look or maybe talk with the team about that. It's actually a mystery.

MIHAI APERGHIS: If it shows for the site query, should it also show for the inurl query? Because it doesn't.

JOHN MUELLER: They are slightly different. And the way they are processed internally is very different. So it can happen that it shows up in one and not in the other. What I usually do is either an info query or a cache query to see which URL is actually indexed because sometimes what will happen is we have www, non www. And we show it with the site query. But we don't show it with the cache query. According to the cache query, you'll see the different URL there. So that's something I sometimes use to double check is this specific URL actually indexed?

MIHAI APERGHIS: Well, I left you on Google+ the forum thread.

JOHN MUELLER: OK, I'll take a look. That's a weird mystery in that case. All right, so we've come to the end here. I'll be setting up the next Hangout. But next time I think I'm on vacation. So it'll probably be in three weeks-- something like that.


JOHN MUELLER: Yeah, things will go on. Don't worry. All right, so thank you all for joining. Thanks for submitting so many questions. And hopefully we'll see each other again in one of the future Hangouts.

MALE SPEAKER 3: Have a good vacation.

MIHAI APERGHIS: Have a nice vacation, John.

ROBB YOUNG: Thanks, John.

JOHN MUELLER: Bye everyone.