Reconsideration Requests

Google+ Hangouts - Office Hours - 15 August 2014


Transcript Of The Office Hours Hangout

JOHN MUELLER: OK. Welcome everyone to today's Google Webmaster Central Office Hours Hangout. My name is John Mueller. I'm a Webmaster Trends Analyst at Google in Switzerland, and I try to connect webmasters, publishers like you all with our engineers and make sure that information is flowing freely in both directions. There are a bunch of questions submitted already. There are a handful of people here already in the Hangout. As always, if one of you wants to take the stage first and ask the first question, feel free to go ahead.

AUDIENCE: Hi, John. I have a quick question about videos in search results. So, a few months ago, when you did a site search for my website and then you filtered by video results, it used to show all the pages that had YouTube embeds. And it doesn't do that anymore. So I pasted a link in the Q & A, basically a site search. It shows YouTube embeds on the blog, which is on a different subdomain, but it doesn't show any of the pages that have embedded YouTube videos on the main website. So, is that because Google is not treating it as embedded video, or what happened?

JOHN MUELLER: I imagine that's just one of the usual changes that we make in Search, in that sometimes we change the way that snippets are shown, the way that embedded content is shown directly in search results. And so some sites will see changes like that. Other sites will see changes in other directions. So it's essentially just a normal change in the way that we bubble up this information in the search results. Usually, you should still be able to see them if you go directly to Video Search specifically. But essentially, these types of snippets, and other types of rich snippets, are things that we don't always show in the search results. Sometimes we'll understand that they're there on these pages, but that doesn't mean that we think it always makes sense to show them directly in the search results.

AUDIENCE: Right. But my question is not about video snippets. This is specifically-- so if you look at the link that I shared-- this is specifically for video results-- in the group chat.


AUDIENCE: And so what I've noticed is that for certain websites, you'll still show the fact that there's a YouTube embed as a video on the page. So is this like a quality signal for the site, where you think this is not of good enough quality, which is why, even if there's a YouTube embed, you don't see it as a video?

JOHN MUELLER: Do you submit video sitemaps for those pages?

AUDIENCE: No, but we never did. And Google automatically picked it up.

JOHN MUELLER: Yeah. That's something I might consider doing. If the video is the primary part of the content on these pages, I'd definitely consider setting up a video sitemap on your site. It doesn't have to be that the video is hosted on your site-- it could be hosted on YouTube or Vimeo or wherever else. But it's important to let us know about this connection, and that you think that this is the primary content of those pages.

AUDIENCE: OK. It's not the primary content. It's just embedded because it's relevant to the content of the page.

JOHN MUELLER: Yeah. I guess in those situations, it's something that our algorithms will have to make a decision about-- whether or not it makes sense to show that as a video search result. Because if people are explicitly looking for videos, and this is just-- let's say an extreme case-- a random video that's included somewhere on the bottom of the page, then that doesn't seem so relevant for the user in a case like that. So that's something where I imagine our algorithm is just trying to figure out how relevant those pages are for people explicitly searching for videos.
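The video sitemap setup John suggests might look something like this for a page with a YouTube embed. This is a minimal sketch: the page URL, thumbnail URL, and video ID are all placeholders, and the tags come from the video sitemap extension format.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://www.example.com/articles/some-page.html</loc>
    <video:video>
      <video:thumbnail_loc>https://www.example.com/thumbs/some-video.jpg</video:thumbnail_loc>
      <video:title>Example video title</video:title>
      <video:description>Short description of the embedded video.</video:description>
      <!-- The player itself can be hosted elsewhere (YouTube, Vimeo, etc.) -->
      <video:player_loc>https://www.youtube.com/embed/VIDEO_ID</video:player_loc>
    </video:video>
  </url>
</urlset>
```

This tells Google explicitly which page the video belongs to, even when the video file is hosted on a third-party platform.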


JOHN MUELLER: OK. All right. Let's start off with the Q & A.

AUDIENCE: John, sorry, can I ask a follow-up to that, then? Because I think the question that he was asking there was because he's using site:. It should bring back absolutely everything with a video on that site, supposedly. Does that mean then, given the end of your answer, that if you use site: with just normal web results, that won't necessarily bring back absolutely everything that's on the site either? It will only bring the important stuff? I thought that operator was supposed to bring everything.

JOHN MUELLER: Site queries are always a bit of a special case. I don't know exactly how Video Search handles it, or how News and Image Search handle it. But within Web Search, one of the things there is, we see this as a restrict; we don't see it as a way of saying you want everything. Essentially, what you're saying with a site: query in Web Search is: restrict the results to only this site. So that's why in some cases when you do a site query you'll see completely, almost arbitrary numbers on top, like about 50 million pages, when you know you only have 5,000 pages. That's because we see this kind of as a restrict, and we see the counts, especially, as something that we optimize more for speed than for accuracy. So that's something where I would use the site: query for specific URLs, if you're looking for something specific to see if it's indexed. But I wouldn't use it as a way to determine which pages are actually indexed. Because there are lots of reasons why we might filter something out from a site query, why we might change the counts a little bit for the site query. It's definitely not something I'd use, in general, for diagnostics.

AUDIENCE: Right. Does that follow-up question help?

AUDIENCE: No. I mean, it's a good question. The thing is, there are hundreds of pages that used to show up in that query. And it makes me feel like somehow Google has put this black mark on the site, saying, you know what, we're not going to treat any videos on this domain as videos, because we got hit by Panda or whatever it is. And I've seen this on other sites that I think have been hit by Panda. But other sites that have not been hit by Panda-- like VentureBeat or whatever-- they have long articles that also have embedded video, and those videos show up. But my site has long articles with embedded videos, where a specific site: search for videos doesn't show them.

JOHN MUELLER: I wouldn't necessarily use that as a signal that you're somehow demoted by one of our algorithms. I think this is really just our algorithms trying to figure out what is the most relevant there for a query like that. And especially when you mention that these are videos on the page together with a lot of text, then I imagine that's a hard situation for algorithms to determine how specific these pages are for video search. But I'll definitely take a look with the team to see if there's something more specific that I can pass on there. But I don't see anything that's technically broken in that regard. It's essentially the way our algorithms are classifying those pages on your site, and to some extent both variations could make sense.

AUDIENCE: OK. Thank you.

JOHN MUELLER: OK. "A lot of tube sites publishing adult or non-adult content are growing their link popularity thanks to iframes they give out. Is this a correct link building strategy? They put a Powered By link under the iframe. In the future, are they at risk?" So I'd love to see some examples of this. But, in general, if this is something that the webmaster has a choice about when they put it on their site, that's usually fine. On the other hand, if this is something that's specifically tied to these embeds, then that could be a bit problematic, in the sense that the webmaster might not have a choice to actually remove this type of link. Or they might not even see this link. But I'd definitely love to see some more examples before really saying this is OK or this is not OK, because I could see this going both directions.

AUDIENCE: This is-- do you hear me, John?


AUDIENCE: This was my question. I asked it because I work in this industry too, and it's been one of my curiosities for a long time. I can't find anything that confirms whether a link building strategy like this is good or bad. It can be a link baiting strategy, because people see your content, like it, grab it from your website, and publish it on theirs. And I'm not really sure if this is a correct strategy, or whether in the future Penguin would touch the people who use it.

JOHN MUELLER: If you have some examples, I'd love to take a look with the Web spam team, and maybe we can give you a more specific answer next time.

AUDIENCE: Look on all the big websites. Everybody does it.

JOHN MUELLER: Everybody.


JOHN MUELLER: If you can send me some examples of pages that are embedding these, I'd love to pass those on directly.

AUDIENCE: If I can send it through the chat, I will send it to you.

JOHN MUELLER: Sure. You can put in the chat window here, and I'll copy it out afterwards and take that.

AUDIENCE: OK. Thank you.
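For reference, the pattern being discussed is an embed snippet whose "Powered by" attribution link passes PageRank. A hedged sketch of what a safer version of such a snippet might look like (the markup and URLs are illustrative, not from the discussion):

```html
<!-- Embed snippet handed out to webmasters (illustrative markup) -->
<iframe src="https://videos.example.com/embed/12345"
        width="640" height="360"></iframe>
<!-- The "Powered by" attribution link. Adding rel="nofollow" keeps it
     from passing PageRank when the embedding webmaster has no real
     choice about including it. -->
<a href="https://videos.example.com/" rel="nofollow">Powered by Example Videos</a>
```

The distinction John draws is whether the webmaster embedding the content has a genuine choice about the link; a nofollowed attribution sidesteps the question entirely.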

JOHN MUELLER: Sure. OK. There are many types and properties the Structured Data Testing Tool doesn't report about, or reports about falsely. Do you have any indication if and when the Structured Data Testing Tool will be updated? I know they're working on updates there, so I think at some point you'll see some changes. I don't have any specific timeline for when those changes will be active. What kind of problems are you specifically seeing there?

AUDIENCE: Hi, John. Can you hear me?


AUDIENCE: Great. Well, one of the problems with the Structured Data validation at the moment is that there really are tons of properties it reports about that are not part of a schema. Yet, when you go to schema.org, they are part of certain entities. At the same time, with certain combinations of types-- multi-type entities-- the Structured Data Testing Tool completely skips the second entity. And that way, it's becoming really close to impossible to go beyond anything rich snippet-like. And that's starting to become problematic, because schema.org is growing and growing and growing, and the Structured Data Tool isn't growing with it.

JOHN MUELLER: So, the Testing Tool isn't in sync with the changes that are happening on schema.org?

AUDIENCE: Exactly. There are a lot of types and properties that the Structured Data Tool doesn't recognize.

JOHN MUELLER: OK. I can see if we can do something there to maybe do an update in the meantime. But I know the team is working on bigger changes there. So maybe they'll just want to focus on those bigger changes, and make sure that when that's updated that it kind of keeps up automatically. But I can definitely pass on that feedback to the team so that they can see if there's something shorter term that they can do to help out.

AUDIENCE: Can I ask another question in regard--


AUDIENCE: Last time I spoke to you, we spoke about the ROI of schema.org, but most of all, [INAUDIBLE] were showing more information about what it does with the entities you marked up. And you said last time you would take it back to the team. Is there any chance of any movement there? Is there going to be some more reporting about it?

JOHN MUELLER: It's hard. I don't know of anything short term that's coming there to add more reporting for the structured data side. I know the Testing Tool is one thing that they're working on revamping there. But I'm not aware of anything big that's shortly before being launched at the moment. So I don't have anything interesting I can share with you there.

AUDIENCE: Unfortunately.

JOHN MUELLER: Yeah. Sorry. But I know it's something that the teams here are very keen on. And they're very keen on also making it more interesting for the webmasters to also provide this kind of information.

AUDIENCE: Right. We'll keep waiting, then.
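As an illustration of the multi-type entity problem raised above: microdata allows a single item to declare more than one schema.org type, and a validator that only honors the first type will skip properties belonging to the second. The markup and values here are made-up examples, not from the hangout:

```html
<!-- One item declared with two schema.org types. A validator that only
     reads the first type would ignore the Restaurant-specific property. -->
<div itemscope itemtype="http://schema.org/LocalBusiness http://schema.org/Restaurant">
  <span itemprop="name">Example Bistro</span>
  <!-- "menu" belongs to the Restaurant/FoodEstablishment side of the item -->
  <link itemprop="menu" href="http://www.example.com/menu">
</div>
```

This is the kind of valid-but-unrecognized markup the audience member describes the Testing Tool mishandling.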

JOHN MUELLER: Sorry. Let me see. "We have a multilingual page and redirect recognizing browser settings to /en or /de. Should we use a 301 or 302 redirect? Should we use a main language for the homepage and use redirects for the sub pages?" We have this, I believe, documented on our general information pages for multilingual sites. We also recently did a blog post on how to handle homepages. And essentially, if you have a redirecting homepage-- so the normal page-- you can use a 301 or a 302 redirect there. That's not necessarily a problem on our side. What we'd like to see, however, is that the [INAUDIBLE] is marked up properly, so that the homepage is set up as the [INAUDIBLE] default, and that the lower level pages-- so /en, /de-- are all accessible directly. So, essentially, we need to be able to access the individual pages. But from the homepage you can redirect if you think that you can do that in a smart way. And just let us know that this is the default version of the page, then.

OK. I think this is a two-part question. "On a catalog that is paginated, and with faceted navigation, we use rel=next and rel=prev on pages. But for filtered subsets, what's our canonicalization strategy for a subset of pages that are crawlable on unique URLs? For example, the intersection of pagination and rel canonical"-- and it goes on here-- "One should presumably not canonicalize page one of a red shirts faceted navigation to page one of all shirts results. Canonicalization is not obvious here. This problem occurs on many large catalog sites." So, we have two fairly detailed blog posts on pagination and faceted navigation on the Webmaster Central blog. I'd definitely check those out, because there are a lot of subtle things there that you probably want to watch out for. So for example, to some extent faceted navigation is OK for us to crawl and index. But sometimes there are elements of faceted navigation that result in problems.
It could be that you end up on a page that's actually more like a 404 page. Like if you search for red shirts in the color blue, maybe that's something you could pick out in faceted navigation, but of course it would lead to no results. So, that's something where you want to watch out for those kinds of issues. But all the details are in those blog posts, and I'd definitely go through them, because there are things that are very specific to some sites that don't make a lot of sense for other sites to watch out for. And it's probably not that helpful if I go into all of that in general here.

"SSL-- does Google take into account which type of SSL certificate is used? For example, self-signed, domain, or organization validation. What about free certificates-- are they any good? Is there any weight given according to the new soft ranking signal?" We don't take into account what type of certificate it is, but it has to be a valid certificate that browsers accept. So if you use something self-signed, that's probably not that useful, because browsers will flag it as problematic. You can use a free certificate if you want. There are some providers of free certificates, for example for non-profits or open source software. One thing to watch out for with free certificates-- I believe one provider offered free certificates, but then started charging for renewals. So if you need something fairly low cost, maybe you have to watch out for the recurring cost as well. From our point of view, it's important that the certificate has 2048-bit or larger keys. But that's not something that we're currently taking into account for the ranking signal. So for us, at the moment, if it's indexed as HTTPS and it has a valid certificate, then that's enough to trigger this slight ranking signal that we have there. Yeah. I think that's about it with regards to certificates.

"Any trick to make analytics filters for Europe?"
"I'd like to make a property for the European sales team, and fail to [INAUDIBLE] the whole continent, but the property filter neither has a continent option nor enough characters to add all European countries." That's something you'd probably want to talk with the Analytics team about. One aspect I could think of is making sure that your site is split up into the sections that you want to track, to make it a little bit easier. So, if you want to track this in Webmaster Tools, for example, then one idea might be to use subdirectories or subdomains, and to verify those separately in Webmaster Tools, so that you'd have that information separately within Webmaster Tools as well. I imagine for Analytics you could do something similar. But maybe there are other tricks you can use in Analytics that I'm not aware of. So I'd definitely check with the Analytics team on that.

"After all the changes that have been going on the last couple of years, are most blog websites better off using Blogger? Do the same guidelines that apply to sites also apply to blogspot sites?" Yes, we treat blogspot sites the same as any other website. It's essentially a website. You don't have to have a traditional blog, in the sense that you always publish daily snippets or whatever, on a Blogger site. You can use it for any traditional website. I've seen it used for restaurant websites, where they published the menus. I've seen it used for all kinds of other websites as well. So using Blogger is fine. Using other platforms is also fine. It's not that Blogger has any inherent advantage on our side when it comes to search. It's just a platform that works as well as many others.

"GIF pictures-- are they good for a website and SEO, instead of PNG or JPEG?" You can use any of these. If it's a supported image format, we can pick it up for Image Search and use it there. That's essentially up to you.



AUDIENCE: Can you hear me?


AUDIENCE: Are you viewing the chat to view the example you told me to show you?

JOHN MUELLER: OK. Great. I'll follow up on that.



AUDIENCE: I said don't put it on screen share.

JOHN MUELLER: OK. OK. I'll be careful. Yeah. I copy these texts out before I close the Hangout all the time, so I can follow up on what's happening there.

OK. "When can we expect the next Panda and Penguin updates?" At the moment we don't have anything to announce. I believe Panda is one that is a lot more regular now, so that's probably happening fairly regularly. Penguin is one I know the engineers are working on. It's been quite a while now, so I imagine it's not going to be that far away. But it's also not happening just this morning. Since we don't pre-announce these kinds of things, it's not something that I can give any date on.

"I have some keywords ranking in first position according to Webmaster Tools. Some keywords are in third, fifth, ninth position. But the problem is the clickthrough rate is too low for them. How can I increase the clickthrough rate?" That's always something that, I guess, everyone wants to know-- how can I get people to click on my site in the search results? And there's no real magical answer. There's no technical solution to that, essentially. I imagine there are two aspects there. On the one hand, the pages that are ranking there should be relevant to the user. They should be something that the user finds matches their query, their intent, why they're searching. So that's something you can double check on your side-- take those keywords that these pages are ranking for, and think about whether or not this is relevant for the user. And if they're not relevant for the user, then maybe the clickthrough rate isn't something you should be looking at there. If they are relevant for the user, then maybe the user is confused with regards to what kind of content is on these pages. So one thing you could think about is whether or not you might want to try a different title. Maybe test different titles for these types of pages.
Think about the meta description you have on these pages-- whether this is something that makes it really clear to the user, in the snippet, what this page is about. And whether this is something that maybe encourages the user to click through, if it's something they care about. So, those are the kinds of things you can look at there. There's no technical solution to this question. This is essentially a matter of your site showing up in the search results and the user recognizing that this is actually what they were looking for.

"Do you think Google AdSense publishers ever want to get approved quickly?" I don't know what the process is for the ads side of things. I'm sure that there are publishers that want to get approved quickly. But I'm also sure that our advertisers want to make sure that these publishers are vetted accordingly, and the right ones are approved. But I don't really know how the AdSense side handles these things. So, if you think that there are problems there, I'd check in the AdSense help forum, and give them the information that they need to look at your site and double check there. Maybe there are things that appear when we look at your site, where we'd say, oh, this is something that should be fixed before you apply to AdSense, or this is something that makes your site look really bad. And those are sometimes important things to hear, even if you don't want to hear them at that point in time. So, that's the kind of feedback I'd try to get there.

"I've uploaded multiple disavow files, but have seen no change in rankings. When will they kick in?" These disavow files, if they're technically correct, essentially get processed automatically. And the next time we crawl the URLs that you have mentioned in your disavow file, we'll drop the links from those pages to your site. So, that's essentially a technical element that happens ongoing and automatically.
And that doesn't necessarily mean that you'll see immediate changes in rankings, even when they do kick in. Sometimes it can even be that a site was partially supported by problematic links, and if you disavow them, if you remove those links, then that support is also missing. And maybe the site will even go down a little bit during the time when things stabilize. So, that's something where these disavow files are processed automatically. They're processed ongoing, and taken into account as we recrawl things. So, you would generally see that take effect there. But it's not something where you'd see an immediate effect on your rankings.

"What needs to be done if my site is taken down for spam for no reason? And if there was a real reason, how can I find out what to do?" I imagine this is in the Manual Actions section in Webmaster Tools. So that's usually where you'd see this information, if it's taken down, for example, for pure spam reasons. Usually if something is taken down for pure spam reasons, then it's pretty clear what was happening there. A lot of times, we'll see sites that just aggregate content, that rewrite or spin content, and these are things that we tend to take down. If there's really no value in this site being shown in the search results, and in us sometimes even crawling or indexing this content, then that's something that we might take down. So, that's the kind of thing I'd watch out for. If you really don't see what you might have done that is so problematic, I'd definitely take it to the help forum and get some hard feedback from your peers to see what they find there. And maybe there are some things that are really problematic that you weren't thinking about when you created that website. So, that's kind of where I'd head there.

"Should I expect a rise in traffic for a site that made a partial recovery after the Panda 4 release to be one-off? Or is it normal to see the visitor counts rise gradually, continuously?
Ever since the Panda 4 release, we've noticed the site did not change." Usually if it's a good website, you'd see continuous changes like that. So, that's something where, if you made significant changes and an algorithm update has happened in the meantime, then you'd see, on the one hand, a step when the update happens-- the site changing its rankings, changing its visibility in the search results. But it can be normal that you also see a continuous or gradual rise over time as well. And that isn't necessarily tied to the algorithm update. It might just be that things are working out for the site, and users like the site better, and things are just generally trending up.

AUDIENCE: In reality, that site didn't change. It was already suffering from Panda for like two years. And we were just getting into a new round of investigations to see what we could improve to that site to see if we could escape the Panda situation. Then the Panda 4 update came. And besides the big step up, we suddenly saw the site also-- [INAUDIBLE] like 10% per week-- starting to grow. But we didn't change anything at that moment. So that's where the question came from. Because the step up I can understand. But the gradual incline, where does that come from-- after being frozen for like three years and on a flat line?

JOHN MUELLER: That's weird. Yeah. I'm not aware of any other algorithms being such that they'd gradually ramp up like that. So I wouldn't necessarily say that that's something from Panda, or something specific like that. If you want, I can take a quick look at the site afterwards, if you can post it in the chat. But I think that doesn't sound like anything specific to one of our algorithms saying, oh, we'll try 10% more this week, and next week we'll try another 10% more. That seems more like a natural progression in search. Maybe it's also something that takes into account what else is happening for those search results. Maybe the other sites in the search results are doing worse than they were before. It's really hard to tell. But I'm happy to take a quick look.

AUDIENCE: I pasted a link in the chat already.

JOHN MUELLER: All right. Great.

AUDIENCE: Thanks.

JOHN MUELLER: "Mobile or HTTPS-- what should I work on first?" I think you'll probably see the bigger impact if you make things mobile-friendly first. That's something where users, at the moment, notice this very well. There are lots of users that use mobile devices as their primary internet devices now. So if you have a website that doesn't work well on mobile, you'll definitely see a fairly noticeable change, in user activity at least, if you do make it mobile-friendly. And that's something where, at the moment, if we notice that there are real problems with a website on mobile, we'll take that into account. In the future, it might be that we also take into account how mobile-friendly a website like that is in general. But that's something where I think at the moment there is an extreme change in how people are accessing the web, and it's really going towards mobile. There is a really strong push from users who are trying to do as much as possible on their mobile phones. And that's something where, if your site currently doesn't work at all on mobile, then you will almost certainly see a reasonable change in how users are interacting with the site, and how people are basically taking your site and converting in the way that you want them to convert. HTTPS is something that I imagine will become more and more important in the long run. But it's not something where, if you change to HTTPS, users will automatically notice that and say, oh, this is fantastic, this is just what I've been waiting for. It's a little bit different on the mobile side, because that's something where users are actively trying to reach these websites already. So, from my point of view, if you have to make a decision between these two, I'd definitely work on mobile first.
If you have a chance to revamp your website to work on mobile, maybe you can include HTTPS at the same time as a forward-thinking idea, in the sense that you take care of it now instead of having to take care of it later. But I'd definitely focus on mobile first.

"What are your suggestions for optimizing site revenue from AdSense?" I don't really have any suggestions there. I can't really speak for the AdSense side of things. We keep things completely separate, so I don't have anything I can add there.

"In a recent reconsideration request for a manual action for links, the sample URLs in my deny letter were URLs I had listed in my disavow file. Any idea why this may have happened?" I'd probably need the URL to see what specifically happened there, but there are a few things that could have happened here. On the one hand, maybe the disavow file is formatted in a way that we can't read completely. So maybe there's something in the URLs that you have listed there that doesn't quite 100% match the URLs that were included in the deny letter from the reconsideration request. That could be, for example, if you list individual URLs in your disavow file and there are other URLs on that website that look almost the same, but are slightly different. So, in cases like that, I'd just make sure that you use a domain directive in the disavow file, to make sure you're covering everything from the site. Another thing that could reasonably have happened is that maybe there's a timing issue there, in that you submitted a disavow file just while someone was already processing the reconsideration request. Theoretically, that's conceivable. Finally, it's also possible that some mistake happened on our side-- that someone processed the reconsideration request, and for whatever reason accidentally included the disavowed links in there as well, and didn't realize that you had already submitted them.
So, if you're sure that technically your file is correct, if you're submitting with the domain directives to make sure that you're catching all of these URLs, and you're sure that there couldn't have been any kind of timing issue there, then I'd definitely submit another reconsideration request, just following up on yours to say, hey, these links were already disavowed, are you sure that this is still a problem? The other thing worth mentioning here is that usually the reconsideration request team, when they process these files, doesn't take into account just the 3 or 5 URLs, or whatever is listed as a sample. They really look at the overall picture for your site. And if the overall picture for your site is still bad, then it doesn't necessarily mean that you need to take those individual URLs out. It's really a sign that you need to work on the overall picture first and clean all of that up, not just those individual sample URLs. So, that's generally what I'd recommend there.



AUDIENCE: Have you-- perhaps I was thinking of the wrong thing, but I thought you'd said before, in response to a similar question, that the sample URLs you might receive back are exactly that-- they're samples of the types of URLs that might have, or could have, caused the problem. They're not actual URLs that you need to go away and fix. You probably should, but you should actually use those as a guide: these are the types of sites or URLs we don't like; go away and fix all of them like that. Not just fix these three, because then that'll stop the blockage.

JOHN MUELLER: Exactly. Yeah. But, at least, the sample URLs that we specify should be relevant samples. They shouldn't be things that you've already taken care of. So, from my point of view, if we send back URLs that we can see you've taken care of already, even if they're representative of the type of problem, then that's not really that helpful. So, we essentially try to avoid that situation if we can. But, as you said, it's really more of a sign that there's a general problem that's still left there, and the person who was processing this file maybe made a bad choice of sample URLs.

AUDIENCE: All right.
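For reference, the disavow file John mentions is a plain-text file, one entry per line; the `domain:` directive he recommends covers every URL on a host, which avoids the exact-match problem described above. The domains and URL here are placeholders:

```
# Links we could not get removed (comment lines start with #)
domain:spammy-directory.example.com
domain:link-network.example.net
# Individual URLs can be listed too, but must match the crawled URL exactly
http://www.example.org/some-page.html
```

Because individual URL entries must match exactly, the `domain:` form is the safer choice when a site links from many slightly different URLs.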

JOHN MUELLER: "Is Google planning to index lazy-loaded images using data URL or data source? That's not working right now." At the moment, we don't support that for Image Search. I know that's something the team has been looking at, but I don't know what the timeline is on that. So, I don't really have anything specific I can add there.

"Why are only a small number of submitted sitemap pages indexed, yet most of my webpages show up in the Google index?" This is a fairly common question; this is something that we see from time to time. Essentially, what's worth keeping in mind here is that for the sitemap's indexed count, we take into account the exact URLs that you have specified in your sitemap files. So for example, if you submit your sitemap files with the www version of your site, and your site is generally indexed with the non-www version, then even though those URLs lead to the same content, we won't count those as being indexed. So if you look at the sitemap's indexed count, maybe you'll see a really low number there, just because it doesn't match one-to-one exactly what you have indexed on the website. So, what I'd recommend doing there is making sure that you're as consistent as possible within your website-- that you have a clear preferred domain setting for www versus non-www, and that you use the exact same URL structure within your website, so that when we crawl your website, we find exactly the same URLs as you have specified in your sitemap files, and that your sitemap files don't include URLs that we don't index like that. Sometimes, for example, we'll see a website internally linking with rewritten URLs, while the sitemap file has parameterized URLs that are actually rewritten on the server once we crawl them. So, those are the kinds of things where, if the URLs don't match exactly what we have indexed for your website, we won't count them as being indexed for that sitemap file.
You will still see the actual index count in the Index Status feature in Webmaster Tools. It's just within the Sitemap feature you won't see that this content is indexed. We'll essentially just focus on the URLs, not on the general content.
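The www/non-www mismatch described above is easy to catch mechanically. A minimal sketch in Python (the sitemap content, URLs, and preferred host here are hypothetical examples, not anything from the discussion), assuming the standard sitemaps.org namespace:

```python
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Namespace used by <urlset>/<url>/<loc> in standard sitemap files.
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def mismatched_urls(sitemap_xml, preferred_host):
    """Return <loc> URLs whose hostname differs from the preferred host."""
    root = ET.fromstring(sitemap_xml)
    locs = [el.text.strip() for el in root.iter(NS + "loc")]
    return [u for u in locs if urlparse(u).netloc != preferred_host]

# Hypothetical sitemap mixing www and non-www URLs.
sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/a</loc></url>
  <url><loc>https://example.com/b</loc></url>
</urlset>"""

# Flags the non-www entry as inconsistent with the preferred www host.
print(mismatched_urls(sitemap, "www.example.com"))
```

Running a check like this against your real sitemap and preferred domain would surface exactly the kind of URLs that inflate the "submitted but not indexed" gap John describes.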

AUDIENCE: John, can I ask another indexing question?


AUDIENCE: We have a sitemap that has HTML pages which are kind of like Wikipedia's image pages. So the URLs for those pages end in a .jpg or .png, but these are actually HTML pages. And what I found was that Google was not able to index these pages, crawl them properly. Is it because you expect a .jpg URL to be an image? Even though--

JOHN MUELLER: Possibly. Probably. Yeah. So, to some extent we can recognize that HTML URLs that end with .jpg, for example, are still HTML pages. But what probably happens is we primarily crawl them with our Image Search crawler, and the Image Search crawler goes, hey, this isn't an image, so I can't include it in Image Search. And we won't even really try it with our normal Googlebot crawler. So, if you can avoid the misleading file types there, that probably makes it a lot easier for us to actually crawl those pages.

AUDIENCE: It's a MediaWiki setting, so Wikipedia has the same issue. But, of course, you have an exception for them.

JOHN MUELLER: I don't think we have that much of an exception for them. But maybe we just learned it better for their website already, because we have more information about how that crawls. But, if you can avoid doing that, I think that would make it a lot easier for us to actually crawl those pages.

AUDIENCE: Would a trailing slash help? So if the URL ended with .jpg and then a trailing slash, would that help?

JOHN MUELLER: I'm pretty sure that would be fine.

AUDIENCE: Thanks.

JOHN MUELLER: "In Webmaster Tools there are pages on my site with duplicate meta descriptions. These are pages within the same category, for example, mountain p=1, mountain p=2. How do I overcome this problem? Also, there are hreflang tags on my bilingual sites. Why the duplicate title tags then?" Essentially, we give this information in Webmaster Tools as a guide for potential issues on your website. We're not saying that this is causing any problems with your crawling, indexing, or ranking. But during our normal crawling, we noticed that these pages have the same description or the same title. So that's why we bubble that up in Webmaster Tools. We don't make any kind of a judgment call on that. We don't take into account other aspects of those pages. So maybe they even have a rel=canonical. Or, like you mentioned, maybe they have hreflang tags to say that these are essentially different variations of the same content. We don't take that into account for the Webmaster Tools HTML suggestions there. So, that's something that's fairly low level in Webmaster Tools. We bring it up as a suggestion there. It's not a sign that this is causing any problems.

"The AdWords team tells me I violated the webmaster guidelines and asked me to file a reconsideration request. However, Webmaster Tools says there is no manual action and won't let me. What can I do?" I passed the site on to the team to take a look at, and I think they resolved that already. But in general, what might have happened in a case like this is that something really old got stuck somewhere and just needed someone to take a quick look at it. So in a case like this, escalating back to us to take a quick look and pass it on to the team is always a possibility.

"Is there a model of Google being put together that eliminates the use of links as a ranking factor? If so, do you have a projected date for that release?" We tend not to pre-announce these kinds of things.
I wouldn't have any date for that, and wouldn't be able to really say whether we're doing this or not. But I believe the ranking teams do take these kinds of issues into account, and think about what they can do to move away from links, or move to the next bigger, more important ranking factor. And as we find those kinds of ranking factors, and as we can double-check to make sure that they actually work really well, I'm sure the team will be looking into taking that step. It's not the case that we're holding onto links arbitrarily. It's just that, from our point of view, they still make quite a lot of sense to use for some amount of ranking.

"Our price comparison sites are considered low quality sites by Google. What's your recommendation on improving keyword rankings for price comparison sites?" So, from our point of view, there are definitely some variations of these sites that would be considered low quality, that would be considered to be just aggregating content from various other sites, with no unique value or no unique information or content of their own. So, there is definitely an aspect there where, if you're just aggregating content from other sites and showing it on your pages, that doesn't necessarily make your site something useful and compelling of its own. So we really recommend, as with any other type of affiliate site, that you have something really unique and compelling of your own on your website, where we can say, if someone is looking for a specific product or a specific type of service, then this page has something unique that nobody else has. And if all you're doing is aggregating feeds from other providers and showing them next to each other, then maybe that's not really as compelling as it might look.
So, that's something I'd try to take into account-- be it a price comparison site, an affiliate-based site, or any other type of site-- you really need to have something specific, something really high quality of your own on your website that gives us a reason to send visitors to your site and not to any of the other sites that also process these feeds.

"We're a news site and we get some of our content through syndication from a press release newswire site. PR Newswire seems to have been hit by Panda. How will this affect sites that display their news? We have an RSS feed from them." Similar to the previous question, if you're just aggregating content from feeds and not providing anything of value of your own, then that's not really so compelling for our users. It's definitely not so compelling for our algorithms. It's not something where we say there's something really unique that we'd like to show up here. And when I talk to sites in the forum and bring those issues up with our engineers, our engineers generally come back and say, well, if they're doing the same as all of these other sites, if they're just aggregating these feeds and not providing anything additional of value, why should we show that at all in the search results? We already have enough other sites that are doing exactly the same. Why should we even include them in the search results, for example? So, that's something where you'd want to take a step back, take a good look at your website, and think about what you can do that is significantly better than everything else out there. And that's the kind of content where, if we take that to the engineers and say, hey, look at this site-- they're aggregating content from feeds, but they're doing this fantastic thing here on the side that's like nothing else we have in our search results, and that provides a lot of unique value to our users-- we should be doing better at showing them in the search results.
And generally our engineers take that kind of feedback very seriously, and think about what they can do in the long run for sites like that. But if your primary content is just aggregated from somewhere else, and there are lots of other sites doing exactly the same thing, our engineers are going to say, well, we already have this content in our search results, we don't need to add it again.

"Hi, John. My website is being filtered by SafeSearch. The website is now clean of any adult material. I think it was the adverts. How long does it take to be re-included?" We have a form in our help center for review of SafeSearch sites, so that's where I'd submit it. If this is something that we picked up algorithmically, then generally what you need to do is let us recrawl and reprocess your pages. And depending on the type of site, that can take anywhere from maybe a few weeks to a few months or even longer. So that's something that can take a bit of time. So I'd definitely make sure that you really have everything covered there. If you're saying that some of your advertisements were for adult content, then I'd double-check the rest of your content as well, to make sure that those advertisements weren't targeting something specific on your site. So, really make sure that your website as a whole isn't something that might be considered adult in any way.

"Hi, John, do you believe in domain authority?" Hard to say. I mean, I don't really know what specifically you're looking at there. But we do have some algorithms that look at websites on a domain or site level and try to understand, in general, how good or how bad this site is as a whole. And, in a way, you could see that as domain authority if you wanted to. So, for example, our high quality sites algorithm is something where we look at the website overall and try to make a judgment call on how high or low quality the content is there.
That helps us a lot when we see new pages from this website, because we can categorize them more easily and say, hey, overall this was really great content on the site, so new content that we don't really know so much about is probably going to be good as well. So in that regard, that's something that could be seen as working at a domain or site level.

Let's see. Woo, lots of questions left. Let me try to take some that are more general.

"I'm showing GUIs on the basis of an IP address so they are dynamic, and when I use Fetch as Google it doesn't show them-- different location." Actually, this is kind of like cloaking, in the sense that Googlebot sees something different than your users would see. So that's something to watch out for. The other thing to watch out for is that Googlebot generally crawls from the US, and if you show US users specific content, then that's what we'll currently index. So, if you have dynamic content like that, you just have to take into account that we'll be indexing one version of that content and not all the other ones. If you have something specific for individual locations, I'd use something like href [INAUDIBLE] to let us know about that, so that we can crawl and index these variations separately.
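Assuming the annotation being referred to here is hreflang, the markup for separately crawlable language or country variants looks roughly like this (the URLs are hypothetical; the full set of alternates, including a self-reference, goes on every variant page):

```html
<!-- Hypothetical URLs; repeat this block on each variant page's <head> -->
<link rel="alternate" hreflang="en-us" href="https://example.com/us/balloon-rides">
<link rel="alternate" hreflang="en-gb" href="https://example.com/uk/balloon-rides">
<link rel="alternate" hreflang="x-default" href="https://example.com/balloon-rides">
```

Note that hreflang values are language and country codes, not regions within one country, which is why the within-US case discussed next still needs separate URLs per location rather than an annotation.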

AUDIENCE: John, I have another question in regards to that.


AUDIENCE: Because of the way our site is-- it's a US site, but it also has activities from all over the US-- so balloon rides in California, or a wine tasting in New York. And we used to deliver content based on IP, because if you land on the New York gifts page, you want to see activities in New York, and the same for California. We found that we suffered from that same problem-- where the New York pages were being spidered, but you were seeing California content for them. So if you've got two different countries, I understand being spidered from the US is a big problem for delivering IP-based content. But what if you've got a US site and you're delivering IP-based content? Shouldn't you guys be able to pick up that not everyone lives, or is based, in California?

JOHN MUELLER: It's tricky, because for the most part I believe our IP addresses are just based in California.

AUDIENCE: Right. That's what we saw. That's what we saw, because we had a page-- each page might have specific stuff. But then it would have Recommended or Closest for You, and you would see everyone was close to California. And so we'd get-- then it would look like we were delivering the same results to everyone and having duplicate content issues.

JOHN MUELLER: Yeah. But if someone goes to your New York site, they'd see New York content regardless of where they're located?

AUDIENCE: For some of it. There would be other related products-- this is the [INAUDIBLE] you're looking at-- but let's say you're looking at a balloon ride-- but here's other stuff for you. And it would show the closest stuff. Because of the way our site is broken down, they wouldn't necessarily land on the New York gifts page. They might end up on a hot air ballooning page. So we show all the hot air balloons we've got across the country, 50 of them. But we'd deliver it via IP, so everyone would see different content, apart from Google, which would see all California. But our whole site [INAUDIBLE] based in California basically. And New York took a big hit in terms of--

JOHN MUELLER: Yeah. I don't know what the best solution there is. This is something where our IP addresses are primarily based in California and I think that's regardless of which data center we crawl from. And the other part is that we generally have one copy of the content per URL in our index, so we wouldn't-- even if we saw like the New York content for this specific URL, we wouldn't necessarily be able to differentiate that, if it's exactly the same URL. So that's kind of a tricky thing there. What I'd watch out for is to see as much as possible if you can serve general content on these pages and personalize that as well. That's a great thing to do. But the personalized content-- if that's not the primary piece of content on these pages that makes it a lot easier. Because then we would focus on the primary content and say, OK this is a general page about balloon rides, there's a lot of balloon information here, there's various events on this page, but it's not only focused on one location. But it's something--

AUDIENCE: Because we were worried about cloaking as well. So you have the-- I don't want to cloak, because I don't want to get banned. But for my users, it's surely better to show a balloon ride that isn't 3,000 miles away. So I want to show them that.


AUDIENCE: So does it become a business decision, rather than an algorithmic, content, or indexing decision?

JOHN MUELLER: Yeah. I'd see it more as a business decision, because, like I said, from our point of view, we take one URL and we assume that the copy of the content that we get through crawling is representative for this URL. We don't assume that maybe if we crawl from different locations we'd see different content there. So, especially within the same country, that's something where we wouldn't even know that there might be New York-based content on this general page if we never see that when we crawl from California.

AUDIENCE: Right. Would it maybe be better to treat it similarly to a lot of tablet and mobile sites? I know it delivers the same page, but then it has a "Show me your location" or "Share your location with us," which would then deliver content. Like most-- you know, if you browse on a mobile or a tablet now, it will go, "Will you share your location?"

JOHN MUELLER: Yeah. That might be a possibility. Or it might also make sense to split these up into pages per region, for example.

AUDIENCE: Href state or href metro-- introduce those.

JOHN MUELLER: It depends a lot on your website and what kind of content you want to show there.

AUDIENCE: I assume Yelp and those sort of people have similar issues.

JOHN MUELLER: Yeah. I've seen that. I believe Yelp and Craigslist regularly try different things. So, this is something where I think there's no one-size-fits-all solution. So I'd see how they're handling this. You can usually check by just looking at the cached page and seeing which version ends up being indexed for their site. From our point of view, it's not something where we'd say this is a webspam issue, that you're cloaking to us and being a spammer. It's essentially more of a user-side issue, where if you're looking for balloon rides in New York and all you find is balloon rides in California, because that's what we indexed, then that could be confusing to the user. That might not be optimal from our point of view, or from your point of view. But that's something that you can generally control by giving us separate URLs to crawl and index. So, to some extent there is the option of splitting it up into separate URLs, or finding ways to generalize this content so that the general page makes sense for all users.


JOHN MUELLER: OK. I think we're kind of out of time, so I'll just open it up to you guys. If we could have like one or two more questions. Then we should be all good.

AUDIENCE: I have a question, John.


AUDIENCE: Why is it not possible to move a subdirectory in Webmaster Tools, only www?

JOHN MUELLER: You can do that with a 301 redirect. But-- Yeah. So we use this feature primarily to recognize significant site moves-- so, if you're moving from one domain to another and we need to forward everything to that domain. And that wouldn't work so well from a technical point of view on our side for sub-directories. But, the problem I have with that is, of course, that this is an internal decision on our side-- how we handle this information-- and that shouldn't necessarily be something that the webmaster has to worry about-- how Google internally coordinates their data. So, I imagine at some point we'll be able to improve that a bit so that either you have a way of more generally giving us information about site moves, or we are just able to focus on the signals that you give us through redirects and rel=canonical and just, OK, we can trust you on this, we can take your word and just process that directory.
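The 301 redirect approach for a subdirectory move can be set up server-side. A sketch in Apache syntax, with hypothetical paths and domain:

```apache
# Permanently redirect everything under /old-section/ to /new-section/,
# preserving the rest of the path (directories are hypothetical examples).
RedirectMatch 301 ^/old-section/(.*)$ https://example.com/new-section/$1
```

As John notes below, redirects like this are picked up automatically; the Webmaster Tools site-move feature only adds a signal on top for whole-domain moves.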

AUDIENCE: How necessary is it to use the--

JOHN MUELLER: It's not necessary. So, it gives us an additional signal. In general, for site moves, if we start seeing a lot of 301 redirects, we'll crawl a little bit faster to just double check that it's really a whole site that's moving over. And then we'll process that. So, we'll generally pick that up automatically as well. We'll pick it up maybe a little bit faster if you give us that information in Webmaster Tools as well.

AUDIENCE: All right. Thank you.

JOHN MUELLER: Sure. All right. One last question, who wants to grab it?

AUDIENCE: Can I have it?


AUDIENCE: I left it for you in the Q&A, and it's very interesting. He has a website that was outdated, and he changed the entire layout and put in some useful content. The site is five years old, and he's asking about-- would that affect his current rankings in Google?

JOHN MUELLER: Yes. It could. So, any time you make significant changes on your site-- with the layout, with the way that the pages are interlinked-- that's something that our algorithms have to learn first. So that could be something where maybe you'll see some fluctuations briefly, maybe you'll even see some changes in the long run. So, taking an extreme example, if you have a website that's completely based on Flash, it's one Flash file, and it's been like this for years, Google indexes it more or less. If you change that into something that's a nice and clean HTML-formatted site, then we'll probably be able to pick that up a lot easier and be able to crawl and index that easier, and probably be able to rank that better in the search results as well. So, even if it's an old site, if you do a revamp of the design, if you do a revamp of the structure of the website, the way the pages are linked to each other, then that's something that can, and generally does, have an effect on the ranking.

AUDIENCE: OK. Thank you very much.

AUDIENCE: Would that mean changing the CSS file around?

JOHN MUELLER: If you just change the CSS file, then that's probably not something that we'll recognize that quickly. So, there's one, maybe, exception there, in the sense that if you use the CSS file to make the site more web-friendly, then, of course, that's something we could take into account. But if you're just tweaking things and changing the font colors, changing the font sizes, then those are the kinds of things where we'd probably say, well, these things happen all the time, we don't necessarily need to take that into account for rankings. All right.

AUDIENCE: On mobile-- one more--


AUDIENCE: If you have, for example, an m. subdomain, but your normal site is not mobile friendly because you have the m. site-- will you still be punished for not having a real mobile site, or compatible, or-- on your desktop version?

JOHN MUELLER: So, what will generally happen there is, in the best case, we'll recognize that these sites are related. For example, if you have the-- what is it-- the rel alternate link between those pages so that we recognize the mobile site belongs to the desktop site, then we'll show the mobile site. We'll be able to focus on the mobile site for the mobile search results. If we can't tell that they're related, we'll treat them as separate sites. So, what could theoretically happen there, in a worst case, is that the desktop site just ranks a little bit lower and the mobile site ranks a little bit higher in the mobile search results. And you'd see changes like that. But it's not--
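The rel alternate link mentioned here is conventionally a bidirectional annotation between the desktop and mobile versions. A sketch with hypothetical URLs:

```html
<!-- On the desktop page, e.g. https://www.example.com/page (hypothetical): -->
<link rel="alternate" media="only screen and (max-width: 640px)"
      href="https://m.example.com/page">

<!-- On the corresponding mobile page: -->
<link rel="canonical" href="https://www.example.com/page">
```

With both halves in place, the two URLs are recognized as one page with two presentations, which is what lets the mobile version be surfaced in mobile results as John describes.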

AUDIENCE: So desktop version could show lower in a desktop search then?


AUDIENCE: Doesn't affect desktop search, only mobile search.

JOHN MUELLER: It's only for mobile. Yes. And at the moment, we only take action on issues that are serious for mobile users. So, if your desktop site is all Flash, and that's something we recognize is not working for mobile, then that's something we'd either flag in the search results or kind of demote in the search results for smartphone users. Not for the desktop users, but just for smartphone users.

AUDIENCE: And where do your smartphones stop and your tablets start? I mean, my phone may be 6 inches.

JOHN MUELLER: Yeah. That's always tricky, right? From our point of view, we tend to treat tablets sometimes as desktops, sometimes as mobile phones. I believe in search we would treat them more like smartphones, because the capabilities are more like smartphones there. So, for instance, Flash is something that's rarely available on smartphones and on tablets. There are also these types of faulty redirects that we find websites doing for smartphones and for tablets, where you click a desktop URL and instead of taking you to that desktop page or the equivalent mobile page, it redirects you to the homepage of the mobile website, which is really frustrating. Those are the kinds of things that tend to be similar for smartphones and tablets, so that's kind of why we treat them together. But it's tricky, because some tablets have a higher screen resolution than my laptop. And it's not always exactly clear which version you should be showing to which user. So, that's something where I imagine in the future there'll be a little bit more shuffling around happening and more refining of which element goes where.


JOHN MUELLER: All right. So thank you all for all of your questions, and all of the feedback. I'll take a look at those URLs that you guys posted in the chat, and see what we can do there, if there is something that the team needs to work on. And hope to see you guys again in one of the future Hangouts.

AUDIENCE: [INAUDIBLE] 10 days, isn't it?

JOHN MUELLER: Yeah. I do two every 2 weeks, something like that. Every other week. Yeah.

AUDIENCE: Goodbye. All right. Have a great weekend. Bye everyone.

AUDIENCE: Thanks, John.