Reconsideration Requests

Google+ Hangouts - Office Hours - 02 June 2015


Transcript Of The Office Hours Hangout

JOHN MUELLER: All right. Welcome everyone to today's-- oh, we have a bit of an echo-- to today's Google Webmaster Central Office Hours Hangout. My name is John Mueller. I'm a webmaster trends analyst here at Google Switzerland. And part of what I do is talk with webmasters and publishers, people who create great websites, and make sure that their input comes to our engineering teams as well. And accordingly, if there's anything from our engineering side that we need to pass on to you, we'll try to do that, too. So as always, if anyone wants to ask a first question-- I don't know, maybe one of the new people or the people who haven't been in the Hangouts so much before. Thomas, is there anything specific on your mind?

THOMAS: Well, I have a lot of questions. But I've been watching. And first of all, thank you very much for doing these. I think I've watched every single one that you've done, and Matt Cutts prior to that. But I've been watching from afar. This is my first time participating. I've had my website now for a little over 11 years. I started it when I was in a Navy command, and it was to find out how we can actually recruit within internet games, which was pretty interesting at the time. But I've kept the site and kept it going. I went to the HTTPS protocol back in August, and I saw an immediate drop in traffic. I was getting tremendous amounts prior to that. And I found out that it was my stupid coding errors in HTTPS, trying to redirect everything to where it was supposed to go for the HTTP to HTTPS. I think I have everything settled now. A few months back, you mentioned taking a look at sites and being able to do evaluations on those. I don't know if you ever plan on doing that in the future. But if you could take a look at my site and tell me if I actually have all my little wickets where they're supposed to be, that would be good. So I don't know how to have you do that.

JOHN MUELLER: One thing you could do is if you have a thread somewhere in the Google community or in the help forum somewhere, if you could send me that link to that thread, I can take a look there and see what we can add from our side there as well.

THOMAS: I'll email that to you.

JOHN MUELLER: OK. Fantastic.

THOMAS: Thank you.

JOHN MUELLER: All right.
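The kind of blanket HTTP-to-HTTPS redirect Thomas describes getting wrong is typically a few lines of Apache configuration. This is only a sketch, assuming an Apache server with mod_rewrite enabled:

```apache
# .htaccess -- send every HTTP request to the HTTPS version of the same URL
RewriteEngine On
# Only rewrite requests that did not already arrive over TLS
RewriteCond %{HTTPS} off
# 301 = permanent, so search engines forward signals to the HTTPS URLs
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
```

A common mistake with rules like this is redirecting to a hard-coded hostname or path that doesn't match the requested URL, which produces redirect chains or loops; preserving `%{HTTP_HOST}` and `%{REQUEST_URI}` avoids that.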

MALE SPEAKER: Hi, John. I was wondering if I could just quickly ask you about-- again, this is a query about a particular site, and maybe valuable information for other webmasters as well. It's for a client of ours. We went through a process of disavowing a lot of fairly dodgy and manipulative links about six months ago. And I think the client actually, prior to that, had spent quite a long time tidying up the backlink profile. And we've still not seen any kind of recovery for the homepage in particular, which seems to have a quite granular penalty for spa breaks and spa days. And this is in Google UK, by the way. So we're just wondering-- we've been waiting for a Penguin data refresh. But I know you've potentially got a penalty server you could look at. I was just wondering, a, whether you could give us a definitive answer whether it's penalized or not, and then perhaps elaborate a little bit more on: is there anything else we need to do, and how long could we be looking at before another data refresh happens?

JOHN MUELLER: That's always kind of tricky. I'm happy to take a look afterwards if we have a bit of time left. But it's usually helpful if you have a link to a forum thread or a Google+ thread somewhere, where I can just take a look and see what other feedback you've been getting, and make sure you're kind of on the right track there. Sometimes there is something we can kind of say about how our algorithms are looking at a site. But oftentimes it's just our algorithms essentially ranking a site as they think it would be relevant there. And improving that ranking is more a factor of just making sure that you're taking that site essentially to the next level-- from a quality point of view, from a user experience point of view-- really making sure that it's a top site for that niche. And sometimes that's easier, and sometimes that's pretty tricky. But I'm happy to take a look, and we can see afterwards if there's anything specific I can add to those threads.

MALE SPEAKER: Yeah. So thanks, John. I appreciate that. I'll ping you a link. [ECHO]

JOHN MUELLER: We've got a bit of an echo. All right. More questions from--

MALE SPEAKER: If I can just jump in, I've got a similar issue. I've got a website. We don't know yet whether it has been penalized in the past. So if people have done any manual reviews or something like that-- we've been working now for over six months, just trying to make everything good or better or the best. So we're working on every single area-- user experience, making our site definitely better technically [INAUDIBLE] for users [INAUDIBLE]. If I compare within our industry, our website is actually on the top. But I've got this feeling that the more we try, the more our rankings have declined. And I was like, what is the problem there? And I've just run out of ideas.

JOHN MUELLER: It sounds like for a lot of the site-specific things, it would be really helpful if you have a public thread somewhere where I can jump in and see--

MALE SPEAKER: Yes. I just talked to you in a private message in Google+. And if you could have a look at it, that would be great.

JOHN MUELLER: I mean, if there's a manual action, if there's something manually that was done wrong, you would see--

MALE SPEAKER: That would be in the past. But we didn't have any manual actions or anything like that. So the only thing I know is that we had some-- I think it's SEO attacks-- and so many dodgy links linked to us. But I don't know if it really affected our website. We don't know.

BARUCH LABUNSKI: All right. John.

JOHN MUELLER: I can take a quick look afterwards at the [INAUDIBLE].

BARUCH LABUNSKI: All right. John. Can I ask a question?


BARUCH LABUNSKI: In terms of the rating system, like the stars, when are they being eliminated? Are they in the works?

JOHN MUELLER: The rating system?

BARUCH LABUNSKI: The rating stars.

JOHN MUELLER: They should be there. I mean, they're like the rich snippets, right? Are you talking about that?

BARUCH LABUNSKI: Yeah. The rich snippets. Yeah.

JOHN MUELLER: Usually when we figure out that this is something relevant for the page, we'll try to show that. At least I don't know of any plans to turn that off completely. Yeah. So from my point of view, that's something that's still completely relevant. If you have content that you have user reviews for, where you have those reviews online on your site, then hopefully you can mark that up with the review markup.

BARUCH LABUNSKI: OK. So there's no plans of removing them, yeah?

JOHN MUELLER: I mean, even if we don't show some types of markup, that's something you don't necessarily need to remove from a website. That doesn't cause any problems. It makes it a little bit easier to recognize what the content is about. And even if we end up not showing it, it's still kind of relevant for those pages and might be useful for other search engines.

BARUCH LABUNSKI: But so it's just the [INAUDIBLE] that are going to be gone.

JOHN MUELLER: Yeah. The Unicode characters and those kind of special symbols, that's something that we've been looking at where we think it might make more sense to kind of clean that up and just hide those from search results.


JOHN MUELLER: All right. Matt. We have a new visitor. Is there something on your mind, some question that we can help you with?

MATT: Uh, me? Yes. So I posted this in the Q&A. So we just noticed over the last couple of weeks-- we've switched everything to HTTPS. I actually asked you about this in a previous office hour about a month ago. Everything was fine for a couple weeks. But now we're seeing something very, very strange, which is we've lost about 80% of videos from our video index. We're a video site, so that's kind of problematic. And Webmaster Tools Search Analytics is saying we're down to basically 100 or 200 clicks a day from search-- like almost nothing happening. But Analytics is showing us that we're getting like two orders of magnitude more traffic a day from Google search. So there's something very weird going on, where something makes it look like we're doing really pretty badly. And the last time that we lost this many videos from the index, we had a catastrophic traffic collapse. Whereas now we're actually doing better than ever. So we're just not sure where the disconnect between Webmaster Tools Search Analytics and Google Analytics is.

JOHN MUELLER: So one really common thing that I've noticed a lot of sites do when they move to HTTPS is that they don't look at the HTTPS data directly in-- what is it-- Search Console now, so not Webmaster Tools anymore. So basically, you have to have your HTTPS site verified there as well. And then, you'll see the data for HTTPS. So what you should be seeing is like a drop from the HTTP data and the rise again in the HTTPS data. But you'd have to look at the different sites separately.

MATT: That makes a lot of sense.

JOHN MUELLER: So that might make it a little bit easier. I don't know about the videos though. I don't think we show that specifically in Webmaster Tools. We do have information if you have a video sitemap file.

MATT: We do. Yes.

JOHN MUELLER: Then, you should see the index counts there for that sitemap file. Do you host the videos yourself? Or do you put them--

MATT: They're streamed videos, so we're hosting the HLS playlist files and the [INAUDIBLE] playlist files. And the play pages are on our domains. But we use Akamai as a CDN for the actual stream.

JOHN MUELLER: And I guess some of that probably also moved to HTTPS, at least the play pages.

MATT: They've been set up on HTTPS for quite a while. It's just we moved everything on the site to HTTPS. And we started a gradual migration of sitemaps over to HTTPS. So we're 301ing everything. But we're now also kind of generating new sitemaps with the HTTPS links, whereas the old site might still be generating the HTTP links for now. And the idea is to switch those old ones over one at a time as we sort them out. So we have quite a large problem with latency. It can take a week or two for a sitemap to come back into the index if it gets pulled, which was the other question I'd like to ask, if that's all right. So we are a video service, so all our content is official record-label-provided content, and we have basic parental guidance ratings for it. And then our assumption is that the reason it takes so long to get into the video index is that a human being has to assess your video to see whether or not it's explicit. And we were wondering: is there a process by which people who already have that kind of industry rating applied to content can use that as a way of preventing you having to go that route, so that would enable you to skip that? And it would mean that we see a reduction in latency in getting into the index, which might be good for both the [INAUDIBLE] and for us. I don't know whether that's feasible.

JOHN MUELLER: I don't know. I'd have to check with the video team on that. I'm not sure completely. I know there's a lot of metadata that you can include in the video sitemap file. But I don't know if the ratings information's in there or if we--

MATT: Well, [INAUDIBLE] family-friendly or not, which we're setting based on the record company's explicit/not-explicit flags. So we're basically setting all the metadata we have-- I think all the metadata that can be related to the video. So if there's some stuff raised to [INAUDIBLE], we're not doing it. But everything else we're doing. So the other question, obviously, is when you get to the various rating systems, making sure that your content is rated in a way that's compatible with them. I don't know-- for, say, [INAUDIBLE]-- Xbox One video stuff-- there's a very complicated multi-tiered rating system you have to go through. But you guys seem to have family-friendly, not family-friendly.

JOHN MUELLER: I don't know what the current standard there is with the video search index. So I think one thing that would help here is if you could send me an example URL, something where you're setting this correctly already, so I can check with the video team to see if we can essentially keep that and just use it directly. And maybe also send me an example with the HTTPS, where you're seeing a video disappear since you've moved to HTTPS, so I can double-check with the video team to see if that's really working the way it should be.

MATT: I'll send you some links. Thank you.

JOHN MUELLER: Sure. I know with images, it's always a little bit trickier with HTTPS. It shouldn't be that much of a problem. But if you move domains, then it just takes a lot longer for images to refresh. But I don't know for sure how we handle that with videos.
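For reference, Google's video sitemap format does include a family-friendly field of the sort Matt describes setting from the labels' explicit flags. A minimal entry, with hypothetical URLs and titles, might look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://www.example.com/videos/some-track</loc>
    <video:video>
      <video:thumbnail_loc>https://www.example.com/thumbs/some-track.jpg</video:thumbnail_loc>
      <video:title>Some Track (Official Video)</video:title>
      <video:description>Official music video for Some Track.</video:description>
      <video:player_loc>https://www.example.com/player/some-track</video:player_loc>
      <!-- "no" marks the video as not family friendly, e.g. an explicit version -->
      <video:family_friendly>no</video:family_friendly>
    </video:video>
  </url>
</urlset>
```

Whether this flag alone shortcuts any manual assessment, as Matt asks, is exactly the open question John says he would need to check with the video team.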

BARUCH LABUNSKI: By the way, are you guys changing the Search Console URL path? Because it still says Webmaster Tools.

JOHN MUELLER: No. We're going to use that to trick everyone. I don't know. We have a big list of things that we can still change. And I think, at some point, we'll probably do a public callout for more things where we're still inconsistent with the naming, just to make sure that we have everything covered. It's all over the place.

BARUCH LABUNSKI: Lots of work.

JOHN MUELLER: These are always fun things. All right. Let's go through some of the questions that were submitted. And then we can get back to questions from you all, if anything comes up in between or towards the end again.

"My company signed up as a beta tester for the new Webmaster Tools API. We completed the NDA. Where can we find out what's happening next?" So essentially, at this point, we're still working on finalizing everything. We're getting people signed up so that they're ready when we're ready to bring that out for testing. But we'll get in touch when things are prepared. I think there's a similar question somewhere else. We can see if we get there later.

"Is there any setting or variable that I can use to tell Google about my mobile website? I use an alternate tag on my desktop and a canonical on the mobile site. I use a separate subdomain for the mobile website." If you set that up correctly, then that's essentially what you need to do. There's nothing special that you need to do in addition to our best practice recommendations here.

"How does Google handle linking from blog post to blog post using keyword anchor text?" We pick those up as normal links. So it's not something where we would say, this is a blog post, therefore the link is more important or less important. Essentially, it's a link. We can follow it. We have our algorithms to figure out how we should treat it specifically. But essentially, it's a link like any other.

"I'm using the GData Webmaster Tools API to fetch keyword data, and now I can't log in." So if you're using the old API to download the CSV files from Webmaster Tools, you need to make sure you're using the right service name. So in our examples, we have-- I think it's called the app ID there-- where you can specify an ID that you should use. And this is the one that we're looking for at the moment. And with that one, you can still get to the CSV downloads.
And as I mentioned before, we're working on the new API for the search query data. If you're interested in that, send me a quick note with your email address so that I can send you the NDA to get you started on that as soon as we're ready.

"Hello, John. Is PageRank still used in the Google algorithm? And is it still an important factor? Also, are you able to pass PageRank on by using a canonical tag, hreflang, or 301 redirect?" So we do still use PageRank internally. It's not completely dead. We don't update the toolbar PageRank anymore. So if you look at your PageRank in the toolbar, then that's not really going to be that relevant. For some aspects of our search algorithms, it's still a relevant factor. It's still something that we use there. If it were something that we didn't use at all, or that had only a tiny amount of value, then we probably wouldn't be using it anymore. We try to keep our algorithms a little bit lean. And if there's something that we notice doesn't affect search much, or doesn't affect search at all anymore, then we'll try to take that out instead of keeping it running.

Passing PageRank: with the canonical tag, you're kind of forwarding the signals there, and you're giving us some information that you'd like to have these pages combined. And we'll try to do that if we can figure out that it really makes sense there. The hreflang tag is essentially different, because we want to index these pages separately. It's not that we're forwarding PageRank. These are essentially separate pages. And you're just telling us, this is German, this is French. And we'll pick out whichever one makes sense. And with a 301 redirect, essentially you're forwarding any signals that you have for that page to the next one, so we pass PageRank there as well.
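The separate-subdomain mobile setup from the earlier question is usually annotated with a pair of tags, one on each version of the page (all URLs here are hypothetical):

```html
<!-- On the desktop page, e.g. https://www.example.com/page -->
<link rel="alternate"
      media="only screen and (max-width: 640px)"
      href="https://m.example.com/page">

<!-- On the mobile page, e.g. https://m.example.com/page -->
<link rel="canonical" href="https://www.example.com/page">
```

The annotations are bidirectional: the desktop page points to its mobile alternate, and the mobile page points its canonical back at the desktop page, which is the "set that up correctly" that John refers to.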

MALE SPEAKER: John, can I ask a follow up question in regards to canonicals and 301s?


MALE SPEAKER: We've got two sites in the UK. One is the same as our USA one-- lots of general experiences. And we have a second site, which is only extreme stuff-- the skydiving, bungee jumping stuff. And it's in conjunction with a license partner. But we're looking to move all that stuff away from that one to our normal site. So over time, we were going to use either 301s or canonicals between the sites. Is there a saturation point where, if you've got, say, 50% of your URLs pointing from one site to another with either a 301 or a canonical-- actually, these might essentially be the same thing? Or can you do 99% of it, and it'll still consider the 1% to be its own domain? Or is it different in every case?

JOHN MUELLER: We try to do it on a per-URL basis. So if this is something where we can see that a part of the site has moved to a new domain, and maybe a part of the site stays there, we'll try to keep it like that. It's trickier if, like you said, 99% of the site has moved to the new domain. The homepage has moved. The robots.txt file has moved. The sitemap file has moved. Then, at some point, our algorithms are probably going to say, well, probably the whole site has moved. But if they're really still separate sites, if the home page is still separate, if the robots.txt file--

MALE SPEAKER: The home page will be left.

JOHN MUELLER: Then, that's perfectly fine. And I think a rel canonical is a perfect use there, where you essentially have two different markets that you're targeting, and some of the products are overlapping. And you just pick one of those two sites, essentially, and say, this is my primary site for this specific product. And the other site can still list that product, but it has a rel canonical pointing to the primary site. So the user could stay within that other site and see all that content. But for search, we would concentrate on the link we specified as the primary version. So that's essentially a perfect use case there.
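The setup John describes, where an overlapping product stays visible on the secondary site but is consolidated for search, comes down to one cross-domain rel canonical on the secondary site's page (domains and path are hypothetical):

```html
<!-- On the secondary (licensed-partner) site's product page,
     pointing search engines at the primary site's version -->
<link rel="canonical"
      href="https://www.primary-site.example/products/skydiving-experience">
```

Users browsing the secondary site still see the page as normal; only search consolidates onto the primary URL.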

MALE SPEAKER: Right. OK. Thanks.

MIHAI APERGHIS: All right. John, if I can follow up as well on the hreflang tags. Since you say that PageRank isn't really forwarded in any way when you use hreflang tags, is it really worth it to use them when you target English-UK and English-US and you just change the currency from dollars to pounds? Should you have two separate pages for that? Is that too excessive for just changing a symbol, basically?

JOHN MUELLER: It depends. I think it really depends on your website, on your users. If it's something that is really completely the same, then I might just use a rel canonical there, especially if it's the same currency. And that's something where, maybe in, like, Germany and Austria, you could say, well, it's the same currency, same description, everything-- I could just use a rel canonical. But especially if it's a different currency or a different price, then what might happen is we pick up the price and show it in the snippet. And then you're searching for your product in the UK, and you get the US dollar price in the snippet. And then that looks kind of weird. Then the user goes, well, they're selling it in the US. It's probably not for me. So that's the kind of situation where you want to separate these pages and make sure that they're indexed separately, that they really show that they are for different users.
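The en-US/en-GB case in this exchange would be annotated with reciprocal hreflang tags on both pages (URLs here are hypothetical):

```html
<!-- Included on BOTH the US and the UK version of the page -->
<link rel="alternate" hreflang="en-us" href="https://www.example.com/us/product">
<link rel="alternate" hreflang="en-gb" href="https://www.example.com/uk/product">
<!-- Optional fallback for visitors from other locales -->
<link rel="alternate" hreflang="x-default" href="https://www.example.com/us/product">
```

The annotations must be reciprocal: each version lists itself and all its alternates, which is what lets Google swap the right version into the results as John describes below.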

MIHAI APERGHIS: But since no page rank is forwarded whatsoever, then you could be ranking very well with the US page in US and very badly with the UK. I mean, other than the metrics that apply domain-wide, for example.

JOHN MUELLER: Yeah. I mean, theoretically that's possible. What happens with the hreflang tag is we swap it out. So if the US page were to rank in the UK in a higher place, then we would swap that out with the UK page. So that's kind of an advantage there. But of course, you're creating separate pages. And they kind of have to rank by themselves. And if you really have one really strong page and a lot of really mediocre pages, then maybe it makes sense to just say, I'll focus on my really strong version-- and the snippet might be wrong, the currency might be wrong in the snippet, but it's worth it because I have one really strong page that's more visible than the individual pages would be. But it really kind of depends on your website.

MIHAI APERGHIS: And is geo-targeting recommended? If you had that example with the English-US and English-UK, should I just do subfolders and geo-target them instead?


MIHAI APERGHIS: Would that help? Or would it be-- I mean, geo-targeting usually does affect ranking in [INAUDIBLE].

JOHN MUELLER: I mean, that would help. But of course, if you have a subdirectory for the UK and a subdirectory for the US, then those are separate pages again. So you can't say, this page is geo-targeted for the US, UK, and Canada, and this page for a list of other countries. You have to kind of do that separately. I mean, for some sites it makes sense to separate things. For other sites, it makes sense to combine things. There's always this extra administrative overhead involved when you separate things into separate sites. And I don't know. Sometimes it makes sense. Sometimes it doesn't. Trust your judgment.

MALE SPEAKER: John, can I ask?


MALE SPEAKER: I have a question. For example, in Google Webmaster Tools, we have a website that is linking to our home page with 10,000 links. The website has been disavowed for more than a year, but Google never removed those links. That's the first question. And the second question is, we have done most of the sweep of the website. We cleaned up, I hope, everything, and we would like to go to HTTPS. Will Google recrawl the [INAUDIBLE] website faster if we move to HTTPS? Because that's our last resort. If that doesn't work, we have to change the domain.

JOHN MUELLER: So first off, if you disavow links, they will still be shown in Webmaster Tools. So it's not that you should expect them to disappear there. So maybe things are essentially just working the way that they should be, and it's not something you have to worry about. So I think maybe that's the first thing to know. If you do move to HTTPS, then you would redirect, probably, from the HTTP to the HTTPS versions. And essentially, all of that would be kind of forwarded to the new versions. It wouldn't affect the recrawling of those external links. So we would recrawl the internal pages, but it wouldn't change how the external pages are recrawled in the process.
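For context, the disavow file discussed here is a plain text file uploaded through the disavow tool. Entries can be individual URLs or whole domains, and they don't make the links vanish from the Webmaster Tools link report, which is why they still show up a year later (the domains below are made up):

```text
# Links we could not get removed manually (comment lines start with #)
# Disavow a single page:
http://spammy-directory.example/links/page1.html
# Disavow every link from a domain:
domain:dodgy-links.example
```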

MALE SPEAKER: But would Google recrawl our website? Because we have a daily index rate now-- I think about 200,000 links-- 200,000 pages we have indexed. But Google is still moving down; it's about 80,000 errors left now. It's recrawling everything now, but would it make it faster? You don't know?

JOHN MUELLER: I don't think so. I think it would take about the same time. So we do try to recognize when a site is moving, and crawl it a little bit faster to process that site move faster. So you might have a small effect there, but I don't think it would be a significant effect. But of course, if you want to move to HTTPS, I don't want to stop you. I think that's a great thing to do.

MALE SPEAKER: Well, we have to do something. I don't know. You know my story already, so this is the final resort before we move to a new domain. If this doesn't work, then we have to 301 if I'm not wrong.

JOHN MUELLER: I mean, if you removed those pages already, then I think you could keep it like that. You could move to HTTPS if you want. I think they're kind of equivalent. I don't think it's something where you need to force those pages to disappear from the internet. They'll drop out automatically when we recrawl them. And that's essentially fine.

MALE SPEAKER: It's easy when you have 10 pages. But when you have half a million pages, then it takes Google like a year to do it.

JOHN MUELLER: That's true. What you can also do is use a sitemap file to let us know that these pages changed.

MALE SPEAKER: Yeah, we do that.

JOHN MUELLER: With the change date. OK, good.
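Flagging changed pages through the sitemap, as discussed, is done with the `lastmod` date on each URL entry; a fresh date encourages a recrawl, after which a page that now returns 404 or 410 drops out of the index. A minimal sketch with a hypothetical URL:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- A removed page, listed with the date it changed so it gets recrawled -->
  <url>
    <loc>https://www.example.com/old-page</loc>
    <lastmod>2015-06-02</lastmod>
  </url>
</urlset>
```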



MALE SPEAKER: John, Can I ask a question before I go?


MALE SPEAKER: Is that a "yes," or just like, OK, fine?


MALE SPEAKER: About two weeks ago, you said that you're working on making Penguin and Panda run a little bit faster. Could you describe what that means by run a bit faster?

JOHN MUELLER: I don't have anything specific that I can share about that though. So run a bit faster is about as much as you can get.

MALE SPEAKER: Penguins don't run too fast.

JOHN MUELLER: Well, I don't know. You could put them in your car. I don't know. Don't read too much into these analogies.

MALE SPEAKER: And one last shot. Could you go ahead and show us that Google penalty server?


MALE SPEAKER: No? OK. Had to ask. Thanks.

JOHN MUELLER: I mean, we keep our soccer games to ourselves. And we kind of keep track of what's happening there.


JOHN MUELLER: All right.

MALE SPEAKER: Well, it would be cool if there's a way to submit a form, because people do ask you all these things. Am I penalized? Do I have a Panda penalty? Do I have a Penguin penalty? And you do sometimes answer people. Wouldn't it be cool to have some kind of form in Google Webmaster Tools, where people could submit it and maybe get a response?

JOHN MUELLER: Yeah. I don't know. The "maybe" part kind of bothers me there. I think if we have something like this, we should be fair towards the webmasters and provide that to everyone if we can. To a large extent, I think the difficulty is just that the information that we have on the way our algorithms are working is really, really hard to map to something specific on the webmaster side. So our algorithms are created to kind of adjust the ranking. They're not created to give specific feedback to webmasters. So while sometimes I can look at these internal things and say, well, it looks like you have this problem, or it looks like you don't have any problem at all, that's almost, I guess, an exception-- when I can look at a site and kind of see, oh, this is obviously like this, or this is obviously something I can tell people. Most of the time, it's really just our algorithms doing their thing, and there's nothing specific I can really point out to you. Obviously, our algorithms think this part here is good, or we've seen lots of good signals here, lots of bad signals here. And we kind of have to even it out and say, well, in ranking, that means this for these specific communities or locations or whatever. But that's really, really hard to point back from and say, well, as a webmaster, you should mention this word twice more on your page, and then you'll rank one point higher. That's not really the way our algorithms work. But I totally appreciate the constant pressure to get more information out there for webmasters.

MALE SPEAKER: OK. Thank you. Have a good day.

JOHN MUELLER: Sure. Thanks. All right. Let's see. "Is there a long-term solution to eliminate referral spam? The [INAUDIBLE] option doesn't seem effective, and the number of ghost referrers keeps increasing." So I think this is related to Analytics, where there are all kinds of crazy things happening with the referrers. I know the team is working on something there to make that a little bit easier. But I don't have anything specific to share there. I don't know what they've announced so far, what they're posting in the forums. I think there are some options in Analytics that you can use to kind of clean this up yourself. But I imagine it makes sense to have a more general solution on our side for some of these things.

"Testing a site before indexing. I want to check it in Google Search Console, but I can't because the site has a robots.txt disallow, so I can't use Fetch as Google." Because it obviously says it's blocked. So that's something that's always kind of tricky. We recommend using robots.txt, especially for testing sites, dev sites, so that they don't actually get indexed. What you can do for testing, if there's something specific you want to test, is to kind of remove the robots.txt file for a bit and use an X-Robots-Tag HTTP header, which you can add with your .htaccess file, for example, if you're on an Apache server, that essentially tells us not to index any of the content there. So we'd be able to crawl this briefly. We'd see the X-Robots-Tag noindex there, so we wouldn't index any of the content. And that way you can kind of test it out. But this is something where, obviously, the site will be available to the general public if someone were to find the URL and try it out, or share it on Twitter even. So it's something you kind of have to be cautious with. And it's also something where, internally at Google, we don't have any secret tools to handle this any better.
So if a team on our side says, oh, I'm working on this fantastic new feature to catalog-- I don't know-- all the key presses that we found on the internet where people have taken screenshots of keyboards, there might be millions of pages there. But they can't actually test it until they actually make it live. So you're kind of stuck behind the robots.txt block, I guess, at that point.

"Does Google consider Trustpilot reviews as a brand signal? What's the best way to implement it-- linking to the Trustpilot page, or creating a static testimonial page on your website and copying the reviews?" I don't know how we would use that. So I imagine this is for Google+ Local, for the local reviews. I don't know how we would use that there, actually. So I don't have any information about that. When it comes to the rich snippet side of things, we do want reviews markup that is specific to the page, that is specific to the product that you're selling there. So if it's something where you have a review about your website in general, or about your company in general, then that's not really what we'd want to see for rich snippets. We'd really want to see something specific to the primary content of that page.

"I've looked in Webmaster Tools and was very surprised to see the Remove URLs page with some hundreds of link removal requests with the status of expired. Less than 1% has a status of removed. This looks like we've done all this work for nothing. Why are you doing this, Google?" So from our side, when it comes to URL removal requests, when we can tell that they've been processed through our indexing system-- when we can see that actually this page that you removed from your website has been removed from the index anyway-- then we'll expire the URL removal request. Because there's no reason to keep that around unnecessarily in our system. So if we can tell that you don't actually need this request anymore, we'll just expire it. So that's probably what you're seeing there.
It's not that you're doing the work for nothing. It's essentially just a sign that our algorithms are pretty fast, and they pick up these changes, too. And you don't need to submit them anymore for those specific URLs.
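The X-Robots-Tag approach John describes for staging sites might look something like this-- a minimal sketch, assuming an Apache server with mod_headers enabled; it applies the header to everything the server returns:

```apache
# Staging-site sketch: tell crawlers not to index anything on this host.
# Assumes mod_headers is enabled; remove this once the site goes live.
<IfModule mod_headers.c>
  Header set X-Robots-Tag "noindex, nofollow"
</IfModule>
```

As John notes, this only keeps the pages out of the index-- the site itself is still reachable by anyone who finds the URL.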

BARUCH LABUNSKI: So how long does it take for the URL to disappear?

JOHN MUELLER: I think the URL removal requests remain in place for 90 days. So usually we expect that within these 90 days, we'll have recrawled and reprocessed that URL. And if it has a noindex or it has a 404, then we'll remove it from the index anyway. So that's kind of the time frame that we think is relevant there. Sometimes it takes a little bit longer. Usually, it takes a lot less. "Our international website has different regions appearing in sitelinks-- so US pages appearing in French results. Anything we should be doing aside from hreflang that can stop the wrong links from showing?" So hreflang is definitely the first thing that comes to mind there. The other thing that comes to mind is sometimes our algorithms just think this makes sense. So for example, if you have a really strong US site and a somewhat medium-strong French site, sometimes it makes sense that if someone is searching for your brand in France, we show the French pages. But maybe we also show a sitelink from the US site, because that's something that we've noticed people tend to go to and tend to try out. So sometimes our algorithms try to mix that in. But with hreflang, that's really the primary way that you can tell us these pages are equivalent and we should swap them out. If we don't swap them out, if we can't swap them out because we don't have the hreflang markup, then sometimes we will still show them in the sitelinks there. So that's something where I'd make sure you have the hreflang markup. And if that's in place and you're still seeing this, I'd think about whether or not it actually makes sense to sometimes show it like this. And if you think this really doesn't make any sense at all, then feel free to send me some examples so that we can share them with the team to figure out how we can improve our algorithms there. "Using hreflang, our global website isn't ranking higher than our country ccTLDs, but in other countries the global version of the website is ranking higher on brand searches.
We set the code up equally on both. What could be the possible reason? Should I use 'en' on the global site or x-default?" If it's really a global site that's equivalent, on a per-page basis, to the local sites, then obviously you can use hreflang on these pages. Maybe it makes sense to use x-default there. So essentially, with x-default, what happens is if we don't have any match within the hreflang pairs that you provide-- so maybe a German and a French version-- and someone from the US is searching, then we wouldn't know which one of these to use. If you have an x-default, then we use the x-default for all of the locations and languages that you don't explicitly cover with your hreflang markup. One thing that might be the case here-- it's really hard to say without looking at the example-- is that maybe you're not using the hreflang markup properly. So in particular, we need to make sure that the hreflang markup is on a per-page basis and uses the exact URL as we have it indexed for those individual countries. So for example, if you have a subdirectory /fr for the French content and a subdirectory /de for the German content, you need to map those pages exactly. You shouldn't use /de/index.html. I mean, it's the same page, but it's a different URL. So you really need to make sure that those URLs map one to one between those pairs and across all the pairs that you have. So if you have a global home page and a local home page, make sure that the local home page links to the exact URL of the global version, and the global version links back to the local version, so that you kind of have that confirmed from all sides. So my guess would be that maybe you have these URLs set up a little bit incorrectly, and we're actually showing both of these versions because we can't process the hreflang. "Sitewide internal links like imprint, terms and conditions-- I found different recommendations, from 'no problem' to 'you must use rel=nofollow.' What's up here?"
So essentially for internal links, we recommend just using normal links. There's no need to use nofollow for internal links. We can crawl those pages directly. We understand that on a lot of websites you have something like a disclaimer, or terms and conditions, or an about-us page that's linked from every page across the website. And that's perfectly fine. That's not something that you artificially need to suppress. I'd use nofollow more internally if there are parts of the site where you don't necessarily want us to start crawling. So that could be maybe a calendar section where you say, well, if nothing is explicitly linked, I don't want you to kind of randomly follow my calendar until the year 2099 or whatever, however far your calendar goes. So those are kind of the places where I'd recommend using nofollow. I wouldn't use it for normal content that you're linking to within your website. "We're using the Google Custom Search Engine, set to specific sites. What's the best practice for removing dead links?" I don't actually know how the Custom Search Engine is set up there. In general, I'd just recommend using a 404. That's something that we use for essentially all kinds of crawling, where if we see a 404, then we'll try to drop that URL from the search results. There's no need to kind of redirect it to a higher-level page, or redirect it to any other page. A 404 is essentially the strongest signal you can give us to say, well, this page doesn't exist anymore. You can forget about it. "Some of my page URLs use 301 redirects, but I can still see the URLs in web search. How long does Google take to figure those 301s out?" It kind of depends. Sometimes, we can figure that out fairly quickly. Sometimes, it takes a long time. The thing to keep in mind here is we'll follow those redirects right away when we crawl, and we'll try to forward the signals right away when we've picked those up.
But what might happen is we still show those URLs if you search specifically for those old URLs. So if you have an old site and you're redirecting to a new site, and you do a site: query for the old site, the chances are we'll still show those URLs for a really long time, because we think you're explicitly looking for those old URLs. We might know that they all moved to the new site. But you are really looking for those old URLs, so we'll show you kind of what you're trying to find. So that's, I think, the general situation where people get confused, in that they think, oh, Google isn't processing my 301s. But actually we are processing them. If you do a normal search, we'll show those new URLs. But if you explicitly search for the old ones, we'll try to pick them up and show them again, even if we know that they don't actually exist there anymore.
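The one-to-one URL mapping John describes for hreflang can be checked mechanically. Below is a minimal sketch-- not a Google tool; the data structure and function name are made up for illustration. The idea is that every page in an hreflang cluster should carry the identical language-to-URL map, including an entry for itself:

```python
def check_hreflang_cluster(annotations):
    """Check that hreflang annotations are consistent across a cluster.

    annotations maps each page URL to its lang->URL annotations, e.g.
    {"https://example.com/fr/": {"fr": ".../fr/", "de": ".../de/"}, ...}.
    In a valid cluster, every listed alternate has annotations of its own,
    and those annotations are identical to this page's (exact URLs, so
    /de/ and /de/index.html would not match). Returns a list of
    human-readable problems; an empty list means the cluster is consistent.
    """
    problems = []
    for page, langs in annotations.items():
        for lang, alt_url in langs.items():
            alt_langs = annotations.get(alt_url)
            if alt_langs is None:
                problems.append(f"{page}: no annotations found on {alt_url}")
            elif alt_langs != langs:
                problems.append(f"{alt_url}: annotations differ from {page}")
    return problems
```

For example, if the French page points its German alternate at /de/index.html while the German page annotates itself as /de/, the two maps differ and the check flags both pages-- exactly the "same page, different URL" mistake described above.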

MALE SPEAKER: I have to ask one question. I have noticed that content from our website-- images that are uploaded-- when somebody uses the embed code, similar to what YouTube has, those images, if you go search, I find those images indexed on that website. I'll send you a link in the chat, for example. Those are all from my website. But they are indexed on somebody else's. Why is that happening?

JOHN MUELLER: I'd have to take a look at your specific example. But with images--

MALE SPEAKER: I'll send it.

JOHN MUELLER: I mean, with images, it's always harder, because you have, on the one hand, the image landing page URL that's shown in the browser and, on the other hand, the images that are actually embedded in there. So, for example, it could be completely normal for you to say, well, I have my website, but I have a really slow server, so I'll host my images on-- I don't know-- Picasa or Flickr or somewhere else. So you kind of embed the images from a different site. And that's something where we try to index the landing page anyway, even if for the image itself we say, well, this actually belongs to another site. And which landing page we show there kind of depends on the query that you would do. So a canonical example is if you're a photographer, and you created your fantastic photographs, and you put them on your website, and on your website you just have the image file name on top, like DSC2517.jpg. And we know this is actually the original source for this image. But if someone takes that image and embeds it in an article, for example, and they're writing about their fantastic vacation that they had in this one location, then obviously if someone is looking for a vacation in that location in image search, we'll show that, because this page has that content, even though the image might actually come from someone else's site. So that's sometimes what you would see there with image search. I don't know if that's the same as what you're seeing in that specific example. But I'll take a look after the Hangout. But that's something to kind of watch out for. And if you are the original source of these images, if you're a photographer and you're creating this content, I'd just make sure that you really have some context for those images on those pages so that we understand that this is a great landing page, not just because it's the original source, but also because it has some context, some information about what people might be searching for with that image.
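The "context for your images" advice above can be as simple as descriptive markup around the file instead of a bare file name-- a minimal sketch, with illustrative URLs and text:

```html
<!-- Landing page with context, rather than just DSC2517.jpg on its own -->
<figure>
  <img src="https://img.example.com/DSC2517.jpg"
       alt="Sunset over the Matterhorn, photographed from Zermatt">
  <figcaption>Sunset over the Matterhorn, photographed from Zermatt.</figcaption>
</figure>
```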

MALE SPEAKER: We had to introduce an algorithm. We had to create something similar to what Google is doing. We have this ability to track the title, description tags, and everything inside. And we are also integrating TinEye. It's an API for finding out the source of the content. So everything that is not sourced from our website is going to get a noindex.

JOHN MUELLER: That sounds good. All right. Let me run through a bunch of the questions that were submitted. And we'll probably have some time for more questions afterwards. "I have a client that has a local business with a My Business page. Is there a way to completely remove the Google+ page from the search results?" I think if you remove the Google+ page, you should be able to submit a URL removal request for that, but I'm not completely sure. So I'd double-check with the Google My Business team to see if you can actually remove that page from their side so that it drops out of search, too. "Why is it better to submit one sitemap index that has multiple sitemaps rather than multiple sitemaps directly?" It doesn't really matter which way you do it. Sometimes, it's easier from an administrative point of view to say I have one URL for my sitemap, and it automatically points to the individual sitemap files. Sometimes, it makes sense to just have these directly listed. It's really up to you. No preference on our side. "What happens if we use a page-level meta robots index,follow and use rel=canonical targeting a different page? Which page gets indexed?" Well, first off, if you have the default robots meta tags on your page, so index and follow, we essentially ignore those. They have no new information for us, because by default a page is indexed and the links are followed. So we kind of ignore that. If you have a rel=canonical on any page, whether it has an index,follow on it or not, essentially we'll try to follow the rel=canonical and pick that up and use that. So you don't need to use robots meta tags like index and follow for the default behaviors. It has no negative effects, but essentially it's not something that you need to do. "Suppose Googlebot sees /0/0/0 in a URL while crawling a website. Is it possible that it might not crawl and index those URLs just because it suspects a spider trap?" No. We would try to recrawl it anyway and see what we find there.
So I think there are very, very few patterns where we say, well, we don't necessarily want to crawl this for web search. But something that looks like a URL path like this seems completely normal to us. What would be a bit different is if we start seeing it recursively. So if we crawl /0 and you have a link to /0/0, and then from there to /0/0/0, kind of recursively onwards, at some point our systems will just say, well, this website is broken somehow. We're going to stop crawling this specific recurring pattern. "After launch, do Google's core algorithm updates work using real-time data which is updated by crawling, or do they rely on periodic data inputs and pushes, like with certain website algorithms?" We do both. So a lot of our algorithms work in real time. When we crawl something, we'll process it right away. Some of our algorithms work more in batch mode, or kind of separated from the real-time crawling. So we try to do both. "Any differences between app indexing for iOS and Android? Are these correct: iOS won't work as a ranking signal; iOS won't cause app install buttons in search to show up?" I don't know. So we're just starting with some beta tests for iOS app indexing. So I expect to see some news over time. But I don't think there's anything specific that we have to share around that just yet. "Is a high ratio of broken internal links a signal of low quality?" No. Not necessarily. Sometimes things just get broken. Sometimes a site gets hacked and has a lot of URLs that used to exist that don't work anymore. And all that is essentially normal. We have to live with that. And it's not a sign of the website being lower quality just because there's something technical that's not perfect there. "Everything I found on hash-bang URLs, URL shortening, and pushState is one to three years old. So I want to get the newest information." I recently did a Hangout with the AngularJS team, maybe three or four weeks ago. So I'd check out that Hangout video.
I can link to it from the event entry. And there's a bunch of information there about the current state of where we are now. "How do you get your website indexed fast?" A really quick way is to use Fetch as Google and Submit to Index. That's essentially the fastest way to get your pages into our index. You can also use a sitemap file, which tells us about multiple parts of your site. So those are really fast ways to get indexed. "Saw a massive increase in crawl time and a massive drop in pages crawled on April 21 for web pages that have a very slow mobile page load time. Is that due to the algorithm update that happened, or more of a coincidence?" I'd go for more of a coincidence. But essentially, if you have a mobile site that's really slow, that seems like something you probably want to improve. Because especially if it's a mobile site, you want to make sure it's as snappy as possible. "We've used a single directory to contain our products. We're starting to rewrite it into one directory per product category. Is that easier for Google to understand? What should we watch out for?" I'd set up redirects from the old URLs to the new ones. But essentially it's up to you. That's something where, from our point of view, we wouldn't tell you which way you should do that. I'd just make sure that you really have a clean and strict URL structure, so that you don't end up with different parts of the path that are actually irrelevant and that cause more crawling problems than they actually help. So I wouldn't make this kind of change just for Google. You don't need to do that for SEO. If you think it makes sense for your usability, or if it's necessary for your CMS, then of course, fine-- make this kind of change. But any time you make URL changes on a website, especially broadly across a website, you're going to see some fluctuation-- some drop in ranking, some drop in visibility in search, at least for a certain period of time, which might be a week. It might be up to a month.
So this isn't something that you'd want to do on a regular basis. And you wouldn't want to do it on a whim, just because someone said, well, this makes things a lot easier for Google. We can work with both of these variations. All right. We just have five minutes left, so I'll jump to your questions instead.
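For the sitemap-index question above, the two setups differ only in whether you submit one index URL or each file separately. A minimal sitemap index looks like this (file names are illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- One URL to submit; it points at the individual sitemap files. -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-articles.xml</loc>
  </sitemap>
</sitemapindex>
```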

MIHAI APERGHIS: I've got one, John. This is a [INAUDIBLE] issue that I actually asked you about, I think a month ago, regarding breadcrumbs and rich snippets. So the site was using an icon for the home page in the breadcrumbs. And the rich snippet testing tool was picking it up fine, but the rich snippets weren't showing up in the search results.

So we changed it to text, but with display: none. We're actually still showing the icon to the users, but the text is-- let me just give you the link, maybe. That could help you.

The text is still set to display: none. And you said that shouldn't really be there-- Google kind of expects that the text will be visible to users as well. And we're still not showing the breadcrumbs rich snippets there. But I've seen that the product prices, for example, which also have product markup, are showing up fine. So I'm not sure what the issue could be.

JOHN MUELLER: I can take a look at that with the team afterwards and see if there's something specific I can let you know about. I know I passed that onto the team, but I'm not sure what the result was there. But I can take a look at that specific example again.

MIHAI APERGHIS: I'll just follow up next time, then.

JOHN MUELLER: All right. More questions. What's on your mind?

KREASON GOVENDER: Hi, John. Tell me, with regards to user-generated content, is it considered quality content? For example, if you open up [INAUDIBLE] on a page and you get a thousand comments from different people, is this regarded as good-quality content? Because generally from the way you look at it, it looks pretty spammy because it gets quite long on the page. But we've seen in search that these generally rank higher up.

JOHN MUELLER: It depends. So just because it's user-generated content doesn't make it spammy or lower quality. I think some sites do a fantastic job with user-generated content. For example, Wikipedia is essentially user-generated content. So just because it's user-generated content doesn't mean it's lower-quality content. But on the other hand, if you let the user-generated content go completely wild, and it's just filled with spam, or irrelevant comments, or-- I don't know-- the crazy abusive stuff that sometimes gets posted on the internet, that's something where users will probably look at that and say, well, this looks really kind of cheap and not something I'd want to recommend to other people. And our algorithms might also look at that and say, well, overall this page has some good content here, but there's this big chunk of content that's essentially just cruft that we can kind of drop. So that's something where I'd just watch out and make sure that overall your pages are gaining value through the user-generated content, not losing value through it.

KREASON GOVENDER: Is there any specific guideline as to how much counts for the page-- like once it reaches 2,000 words, should we cut it off and not add [INAUDIBLE]?

JOHN MUELLER: No. Word count is totally up to you. Whatever you think makes sense for your site, for your users. That's totally up to you.


JOHN MUELLER: No. And I think with user-generated content, it's important to keep in mind that when we look at these pages, we think that this is the content that you are publishing. So it's not that our algorithms say, well, this is user-generated content, therefore I don't have to count it for or against the webmaster. You're essentially providing the whole page the way that it is. So by publishing this user-generated content, you're kind of saying, this is what I want to stand for and what I want my site to stand for. All right.

KREASON GOVENDER: John, but just a question.

JOHN MUELLER: Guys, I have to head out. So it's been great talking with you. And I hope I'll see you guys again in one of the next Office Hour Hangouts.

MALE SPEAKER: Cheers. Thank you, John.

JOHN MUELLER: Thanks all.

MALE SPEAKER: Thank you very much.


MALE SPEAKER: Thanks, John.