Reconsideration Requests

Google+ Hangouts - Office Hours - 21 July 2014


Transcript Of The Office Hours Hangout

JOHN MUELLER: OK, welcome everyone to today's Google Webmaster Central Office Hours Hangout. My name is John Mueller. I'm a webmaster trends analyst at Google in Switzerland and currently in the US. We have a bunch of questions that were submitted already. If you're watching this, you can add questions while we have the session as well. And if you're in here live, feel free to ask questions during the course of the Hangout as well. Maybe we can start off with a live question as well. Does any one of you want to ask a question first?

AUDIENCE: Sure. Can I start, John? Hey, nice to see you again. I have an SEO strictly related question. Are keywords using URL composition used as a ranking signal?

JOHN MUELLER: You mean keywords in the URL?


JOHN MUELLER: I think that might be some small factor that we take into account, but it's definitely not the main factor, and not something where we'd say, if you put your keywords in your URLs, then you'd see any kind of visible ranking change. So that's something more subtle than that. So it's not the case that you need to put your keywords in your URLs. When we talk to the engineers here, they even say you shouldn't artificially rewrite your URLs because chances are, you'll get it wrong. And you'll cause more problems than you would even partially fix there. So that's something where we do take that into account slightly, but it's not enough of a factor that I say you should ever change your URL structure to do that.

AUDIENCE: I understand. So if you choose between a short URL and a keyword-rich URL, you would go with the short URL?

JOHN MUELLER: It's really hard to say because there are so many other factors that are always involved there. But if, for example, you had a choice between a URL that was fairly long and included all your various keywords, and a URL that was short and used parameters, then I'd definitely go with the one that uses parameters instead.

AUDIENCE: OK, John. Thank you.

JOHN MUELLER: All right. Let's grab some more from the Q&A here. These are kind of ranked by the number of people who clicked +1 on there. So in the future, if you see a question that you like, make sure you click the +1 button in the Q&A feature. Let's see, the first one here.
"How can you notify a change of address if you move a site to a folder? For example, I had, and now I have, like in a subdirectory. The old domain redirects directly to the new /fr folder. Webmaster Tools says I can't change the address unless it redirects from root to root." Yes, that's currently-- let's say the way the Webmaster Tools feature works, in that you have to redirect from your site's homepage from one domain to another one to make a clear change of address signal for us. But if you're using clear redirects, if you're not blocking those redirects from being seen with the robots.txt file, then that's enough of a signal for us as well. This change of address feature mostly helps us just if you have a one-to-one move from one domain to another one. As soon as you change folders, as soon as you change the URL structure, then this tool doesn't really help you. It doesn't make sense for that. So setting up the normal redirects and letting those run naturally is essentially what you should be doing here. What I'd also do there is set up a rel="canonical" so you have that as an extra signal as well, so that we don't just see one domain redirecting to another one. Having that confirmed on the target domain really helps us there.
"A keyword our old SEO used in 2011 and 2012, and we've been removing, disavowing, ever since, has been used by a high quality news website as an anchor text to us totally naturally in the last three weeks. Should I disavow and remove that to be on the safe side?" If this is a completely natural link, you don't have to disavow that. You can keep that. That's a great link to have, I guess.
So just because it uses a keyword that you previously used for kind of problematic reasons doesn't necessarily make this link a bad one. "What's the longest time Google has waited before refreshing an algorithm that holds websites back? Does Google get to the point of one or two years and decide to make the algorithm redundant?" I don't know for sure how long the longest time an algorithm has been in place is, but essentially a lot of our algorithms, they stay in place the way they are, and they just keep running with updated data, and they keep doing the same things. Some algorithms are a little bit more problematic in that we need to double check the data that they generate. For example, the Penguin algorithm is something like that. So that's something that I believe has been almost, eight, nine months, something like that. Some people will probably know for sure. That's something I know the engineers are working on to create an update to that as well. So that's definitely coming at some point. It's really hard to say otherwise past that because there are lots of things that just stay in place for a really long time, and they work really well the way that they are. It's not the case that we have to kind of update the algorithm all the time just to keep it fresh, because maybe it's doing the right thing. That said, if we recognize that there's a chance to kind of deprecate an algorithm because it's no longer necessary, that's something we love to do. Anything that, where, if you're running a website, if you're running any kind of software systems, if you know that you can remove something that you don't really need anymore, then that saves you complexity from maintenance. That makes it a lot easier for you to kind of move forward as well because you don't have to keep thinking about this old thing that you have in place that doesn't really do anything useful there. 
So if there's a chance that we can remove some algorithms because they're no longer needed, we love doing that. It's not that we just stick to them for no reason. So if we can remove something because it's no longer needed, we'll try to do that. And that's the case, I think, across the board with all of our systems, where if we can recognize that some code is no longer needed because it works really well without that code, then we'll take it out and make things a little bit leaner and easier to move forward with.
"Moved to SSL sitewide, created a new HTTPS site in Webmaster Tools, but I can't use the change of address tool because the new HTTPS site doesn't appear in the available domains." Yes, that's a problem. We're looking at that with the engineers to see what we can do to update the change of address tool to handle that a little bit better. We think moving to HTTPS is a great thing to do, so we've been kind of cataloging all the places where Google uses HTTP and HTTPS and making sure that, from a webmaster's point of view, we make it as easy as possible to move from one type of connection to another type, and kind of making it so that there's as little holding you back from moving to HTTPS as possible.



AUDIENCE: That was me for this answer. Is there a real benefit to using SSL, HTTPS, an SEO benefit? Is there a real benefit?

JOHN MUELLER: At the moment, we're not using that as a ranking signal, so it's not the case that if you have everything on HTTPS, then we'd automatically promote your website in search. But that's something we might look at in the future, if we think that this is a good thing to do, if the metrics that come out of our analysis there say that this is the right thing to do. But purely for SEO reasons, I don't think there is any real difference between HTTP and HTTPS. I imagine there are a lot of second order factors that are involved, such as user trust, and depending on the type of data that you have, whether or not they recommend your website, if they want to stay on your website, if they do more things on your website than just the basic things. But that's something that doesn't really have a direct SEO aspect to it.

AUDIENCE: OK, thanks.

JOHN MUELLER: "I work for a pharma company. The problem is that some of our products' description is the same, except for strength. For example, some product 10 and some product 20 are two different products with different strengths. If I create two separate product pages, will it be treated as duplicate content?" First off, I guess yes, we would recognize the duplicate content on these pages because it's duplicate content. If the rest of the page is the same, and you're just changing the 10 against the 20, then that's something where we'd see the rest of the page as being duplicate. But from a practical point of view, that's not something you absolutely need to be worried about. It's not something where we would penalize a website or demote it in search just because it has duplicate content on it. So essentially what happens here is we recognize that these pages are almost the same. We'll index both of them. And depending on what the user is searching for, we'll show one or the other. We won't show both of them in search, but we might show maybe the 10 if we recognize the user's searching for something with a 10 in it, or we might show just one or the other kind of depending on the other factors that we have, if someone is searching just generally for this page.

AUDIENCE: I just wanted to go back to the question previously about the SSL. So for instance, there's the $30 one, the $10 one, and there's the $100 one, which is the premium one. But if you get the $30 one, Google Chrome will say it's not as-- the user hasn't identified their identity. My question to you is, for the future, do you think it's just better to get the $100 certificate, the one that really verifies the business all the way?

JOHN MUELLER: I don't know. I imagine the pricing depends a lot on where you get those certificates. So it's not that I'd say the $30 is this specific certificate type, but rather it's just this product that this one supplier is essentially offering. So I don't know exactly what's behind those different numbers there. What we do say is to make sure that your certificate uses 2048 bits at least. And we've seen, I think most of the suppliers offer those kind of certificates. What you definitely don't need is the extended validation certificate, the EV one with the little green bar on top. That's something that may make sense for your users, but it's not really what we're looking for at the moment. So at the moment, if you make sure that the connection between users and your website is just secured appropriately, then that's something that we think makes a lot of sense in general and something that we would like to recommend that webmasters do.

AUDIENCE: Yeah, I know. The reason I'm saying is [INAUDIBLE] it could continue with the transition because the user is used to seeing the green. So it goes to the bank, the bank sees HTTPS, and it goes to somewhere else. It's all green. And then if it sees the red, it's crossed with the next.

JOHN MUELLER: That sounds more like something broken with that specific certificate then. It shouldn't be that Chrome is making a decision and saying, oh, this certificate looks kind of shady. It's technically OK, but it looks kind of weird. I imagine something might be broken with that certificate. There's I think a site-- what is it, Qualys. Let me find the URL. They have a testing tool for certificates.

AUDIENCE: Yeah, yeah. I have it, yeah.

JOHN MUELLER: That essentially gives you a grade and says this is working as it should be. It's like grade A or whatever kind of scoring they have there. That's something I do with the certificates just to make sure that you've got it set up correctly, that you have it set up with subdomains or wildcard domains or whatever you're using there.

AUDIENCE: Yeah, I saw the video with Igor and Pierre.




AUDIENCE: So if a site is using SSL, and say there's a problem with their certificate, maybe it hasn't been renewed properly, or sometimes it's a server issue, even though the SSL's not a ranking factor, if a site did have problems with a certificate for a continued amount of time, can that be a low quality signal that could cause a demotion related to Panda or something like that? I saw something once that gave me the idea. It just makes me wonder if it was that or something else.

JOHN MUELLER: Yeah. So we wouldn't actively demote a site just because it has a broken certificate. But there are two things that could happen. On the one hand, we could send a message via Webmaster Tools saying hey, the certificate is broken. You should fix that. Which I think is kind of like pointing out the technical problems because if the user clicks on that search result, then they'll see that warning first. They won't see the content. So that's kind of pointing out the technical issue first. The other thing that we might do is if we see the same content on HTTP and HTTPS, and we recognize that the certificate on HTTPS is broken, then we might choose to kind of change our canonicalization towards HTTP for those URLs. So instead of showing the broken HTTPS link, we'll show the HTTP link in search results. But that doesn't mean that we would demote that site or that we would change its ranking in any way. We just-- essentially, if we know that the content is the same, and one of them has a broken certificate, we'll show you the version that actually works.
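The "hasn't been renewed properly" case in this exchange is something a site owner can watch for before visitors or search engines ever see a warning. What follows is a minimal sketch, not anything shown in the Hangout: it assumes you already have the certificate's `notAfter` string (the text format Python's `SSLSocket.getpeercert()` returns), and the function names and the 30-day threshold are illustrative choices.

```python
import ssl

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter time.

    `not_after` uses the text format found in the dict returned by
    SSLSocket.getpeercert(), e.g. "Jan  1 00:00:00 2015 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now) / SECONDS_PER_DAY


def needs_renewal(not_after: str, now: float, warn_days: int = 30) -> bool:
    """True if the certificate has expired or expires within `warn_days`.

    The 30-day default is an arbitrary threshold chosen for this sketch.
    """
    return days_until_expiry(not_after, now) < warn_days
```

Passing `now` explicitly (for example `time.time()`) keeps the check easy to test. It only covers expiry; other certificate problems, such as a hostname mismatch or a broken chain, need a fuller testing tool.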

AUDIENCE: OK. Yeah, I've seen that when a site has both, and sometimes the links get a little bit mixed up, then I've seen that common problem with links getting bumped down.

JOHN MUELLER: So this is something that sometimes happens with hosters, that they set up their website on HTTP and HTTPS automatically, even if the website owner doesn't have a certificate. And from our point of view, that's technically not really correct. We'll try to recognize that and point to the HTTP version in search. But from a practical point of view, that's essentially a problem that the website owner should be fixing in that either the site should be responding to HTTPS and have a certificate that's legitimate, or they shouldn't be responding to HTTPS at all if they don't have a certificate that's really valid there.


AUDIENCE: And when we're changing to Secure Socket Layer, John, do you prefer us to use page level 301, or would you prefer a rel=canonical, or is either fine? What would be better there?

JOHN MUELLER: I'd use both. So redirecting is definitely a good thing. You need to redirect if you want to use something like HSTS, where essentially you're telling everyone they should use HTTPS or not access that site at all. So that's where you need to use a redirect. But essentially, from a point of canonicalization, having a redirect helps. Having a rel=canonical helps. Having clear internal linking helps, where you're kind of not pointing to the version that's actually redirecting, but pointing to the version that you want to keep. All of that helps. And these are essentially normal canonicalizations that you would do with any move from www to non-www as well, or from one domain to a different domain. And I think most of you guys have a lot of practice with site moves. So this isn't really that much more complicated, past setting up the certificate and all that.
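To make the "use both" advice concrete, here is a rough sketch of the two page-level pieces: the HTTPS URL a 301 redirect would point to, and the matching rel=canonical element. The helper names and the example.com URLs are hypothetical, and the HSTS `max-age` is only an example value; none of this comes from the Hangout itself.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def https_equivalent(url: str) -> str:
    """Map an http:// URL to the same host, path and query on https://.

    Explicit ports (e.g. ":80") are not rewritten in this sketch.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url  # already HTTPS (or not a web URL): leave unchanged
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))


def canonical_tag(url: str) -> str:
    """Build the rel=canonical element for the HTTPS version of `url`."""
    return f'<link rel="canonical" href="{escape(https_equivalent(url), quote=True)}">'


# Response header that switches on HSTS once the redirects are in place.
# The one-year max-age is just an example value.
HSTS_HEADER = ("Strict-Transport-Security", "max-age=31536000")
```

A server would answer a request for the HTTP URL with a 301 whose `Location` is `https_equivalent(url)`, and the HTTPS page would carry `canonical_tag(url)` in its `<head>`, so the redirect and the canonical point at the same URL.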

AUDIENCE: The certificate can take a week, by the way. So if you want certain content.

AUDIENCE: So aggregate signals are a better signal than a single signal, if I could put it that way.

JOHN MUELLER: Well, the clearer you could give us your signals, the more likely we'll take them into account. That's, I think, always kind of the case when it comes to SEO. And if you give us conflicting signals, then chances are we might not do it the way that you expect. So for example, if you have rel=canonical set to HTTP, and the redirect is pointing to HTTPS, then we're seeing, hey, he wants HTTP indexed, but on the other hand, the URL says no, well actually, HTTPS. And in a case like that, we essentially say, well, we have to make a decision somehow. We'll just pick one or the other. And if the webmaster had a strong preference one way or the other, then maybe we'll choose the wrong one. So if you can give us really clear, consistent technical signals and say I'm redirecting, I'm really confirming this link, and internally I'm also linking to this version, then that's a really strong sign for us to say hey, I think maybe he wants this URL indexed like that. He doesn't want the other one.
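The conflicting-signals case described here (rel=canonical set to HTTP while the redirect points to HTTPS) is easy to check for mechanically. The sketch below assumes you have already collected, for one page, its redirect target, its HTML, and the hrefs other pages on the site use to link to it; the function names and the idea of a single "preferred" URL are illustrative, not a description of how Google evaluates these signals.

```python
from html.parser import HTMLParser
from typing import Iterable, List, Optional


class _CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        is_canonical = (attr.get("rel") or "").lower() == "canonical"
        if tag == "link" and is_canonical and self.href is None:
            self.href = attr.get("href")


def find_canonical(html_text: str) -> Optional[str]:
    finder = _CanonicalFinder()
    finder.feed(html_text)
    return finder.href


def conflicting_signals(
    preferred: str,
    redirect_target: Optional[str],
    html_text: str,
    internal_links: Iterable[str],
) -> List[str]:
    """List every signal that disagrees with the URL you want indexed."""
    problems = []
    if redirect_target is not None and redirect_target != preferred:
        problems.append(f"redirect points to {redirect_target}")
    canonical = find_canonical(html_text)
    if canonical is not None and canonical != preferred:
        problems.append(f"rel=canonical points to {canonical}")
    for link in internal_links:
        if link != preferred:
            problems.append(f"internal link points to {link}")
    return problems
```

An empty list means the redirect, the canonical, and the internal links all confirm the same URL, which is the "clear, consistent" situation described in the answer.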

AUDIENCE: I understand, John, but there is a time, an amount of time, to get this [INAUDIBLE] indexing. [INAUDIBLE] use 301 redirects but still lost positions in search for going HTTP to HTTPS.

JOHN MUELLER: That shouldn't really be happening. That's something where if we see-- I mean, we do this on a per URL basis, so that when we crawl one URL, we see the different versions, we'll pick one of those and index it like that. So that's something where if you do something like a site query and look at those URLs directly, if you look at the index status information in Webmaster Tools, you'll see that for the HTTP version, it slightly goes down, and for the other one, it slightly goes up. Like a clear transition where it goes from HTTP to HTTPS. So you'll always see a slight transition there, but you shouldn't really see a drop in rankings if you're moving from HTTP to HTTPS. That should essentially be a more granular, gradual move from one version to the other, and the other version should essentially just be picking that up normally. It might be different if you're moving domains at the same time, if you're changing your site's structure at the same time. But just moving from HTTP to HTTPS, with everything else remaining the same, you shouldn't really be seeing any drops in rankings for that kind of [INAUDIBLE].

AUDIENCE: Yeah, but the funny thing is we drop. We lose all our rankings in [INAUDIBLE].

JOHN MUELLER: I'd love to have some examples of that. Yeah. So if you can send me some examples where you're seeing that, then that would be really useful for us. My guess is that you're seeing effects from something that's unrelated to this move from HTTP to HTTPS. But there definitely shouldn't be anything where we'd say, they're doing a site move from one version of their domain to a different version of the same domain, that we would change any of the ranking signals associated with that. But please do send me some examples there if you're seeing that with your site or with the client site or something like that. I'd love to take a look at that and make sure that we're doing the right thing there.

AUDIENCE: OK, thanks.

JOHN MUELLER: All right. "If a site blocks CSS and JavaScript files through robots.txt, does Google place less trust in that site, which will be reflected in the search results? And are you going to start a blog similar to Matt Cutts?" I do have a blog, but because it hasn't been updated since, I don't know, a really long time, I guess nobody really knows about that. Which is probably a good thing at this point.

AUDIENCE: And your Twitter.

JOHN MUELLER: I'm on Google+. With regards to CSS and JavaScript and robots.txt, we recommend not blocking CSS and JavaScript files with robots.txt because then we won't be able to pick up all the information from those pages. So it's less of a problem in the sense that we'd lose trust in that site. But if CSS or JavaScript are being used to generate anything really unique and compelling on that site, if we can't see it, then we can't credit that site with that content. So if you're using AJAX to pull in more content, or if you're using other kind of JavaScript tricks to pull in more content or display it in a really neat way, then if [INAUDIBLE] that, then we can't really credit that site with that unique stuff that they have there. So it's not that we lose trust in that, but we just wouldn't be able to rank it for any content that might be created through those files.
"I've noticed that many large sites have a link to a short summary about them sourced from Wikipedia. If we create a Wikipedia page about our company, will it be seen as spammy or unnatural by Google?" We can't really speak for the content that's created on Wikipedia. So from my point of view, you can do whatever Wikipedia's guidelines allow on Wikipedia. But essentially, they have, as far as I know, fairly strict guidelines on which content they want to keep for the long run and which content--


JOHN MUELLER: Needs to be updated and referenced. So that's something where you probably need to check with them and not really check with us.
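Returning to the robots.txt point from the start of this answer: one way to see whether a robots.txt file hides CSS or JavaScript from a crawler is to run the file through a parser and test the resource URLs against it. This is only a sketch using Python's standard `urllib.robotparser`; the function name, the sample paths, and the "Googlebot" user agent string are assumptions for illustration, and the standard-library parser does simple prefix matching, so it will not reproduce the wildcard handling that real crawlers support.

```python
from typing import Iterable, List
from urllib.robotparser import RobotFileParser


def blocked_resources(
    robots_txt: str,
    resource_urls: Iterable[str],
    user_agent: str = "Googlebot",
) -> List[str]:
    """Return the resource URLs that `robots_txt` disallows for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not parser.can_fetch(user_agent, url)]
```

Feeding it the script and stylesheet URLs referenced by a page gives a quick list of files a crawler would not be allowed to fetch, which are the files whose content could not be credited to the page.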

AUDIENCE: Now, be careful [INAUDIBLE] they can ban you for life too, so. John, that was my question just because I don't know-- the results I'm talking about, like some of [INAUDIBLE] competing terms, there's a little gray link in the search results, that you kind of hit the little arrow, and it just gives you a little summary of the company. So I don't know whether we'll do it or not, but I just wanted to make sure that we wouldn't be doing anything against Google's policies by doing that on Wikipedia. So thank you.

JOHN MUELLER: Yeah. So we do pull I think some of that at least from Wikipedia. And we kind of show a short summary of the website or the business behind that. And I really don't know what the guidelines are from Wikipedia's point of view about what you can create yourself, whereas what other people create for you, what you can update yourself. I have no idea what the guidelines are there. But double check with them or read up on what's allowed there, and see if it makes sense. See if it's something that's notable or whatever for Wikipedia. And maybe it makes sense to kind of just double check to see that there's an entry there. I really can't speak for Wikipedia.

AUDIENCE: It's just they don't want branded information, like the sources must be reliable. And it has to be written by a Wikipedia contributor, or else it will get deleted.

JOHN MUELLER: Yeah, I really don't want to recommend that everyone goes off and fills Wikipedia with a blurb for their website. I imagine that's not in Wikipedia's best interests. So you really-- just kind of be careful with those kind of things. And this isn't something where we'd say this website has an entry on Wikipedia, therefore we'll rank it higher. Whether or not we can show that information or not is something slightly different. Maybe we can pull content from various sources to get a more objective view of the site. But this is really more something where I think you need to be kind of careful that you don't just spam other people's websites in the hope that it helps your SEO [INAUDIBLE].

AUDIENCE: And even if you do it, John-- I just wanted to add to that because I have some Wiki experience-- they'll consider you as a sock puppet, and you just get deleted. So their bot is very sophisticated. So that's it.

AUDIENCE: John, what about the issue we spoke about in the last Hangout, where I showed you an example of where it's [INAUDIBLE] being abused? There was an example in the last Hangout we looked at the keyword was, what is a virtual office.

JOHN MUELLER: Yeah, I passed that onto the team. I think that wasn't actually from Wikipedia, though.

AUDIENCE: No, it wasn't.

JOHN MUELLER: That was from some other site that we pulled that from, yeah. I passed that onto the team to take a look at there. That's something where I think it's not a matter of people spamming Google to get that in there, but kind of a technical problem on our side that we need to be careful about what we show as answers in the snippets there.

AUDIENCE: Yeah. And it all comes down to the same thing over and over, and that is the reaction time to these things being resolved. This comes back to the same question I asked many, many, many months ago, and there is now, for the key term virtual office, or virtual office London, there is one business in there that is spamming. And if there's not one person who could say they're not. And yet they still exist, which takes us to the whole churn and burn thing. And it's perpetuating the success of it. And with those things not being dealt with quickly, we're seeing it happen more and more and more often in multiple niches. If they're doing it in my niche, it's happening everywhere else.

JOHN MUELLER: Yeah. I mean, this is something where we try to take the appropriate steps. And sometimes, manual action isn't the right thing to do in steps like this, whereas it makes more sense to update our algorithms to be a little bit more sophisticated in that regard. So I think that's the direction we're heading with these answers in the sense that we're not going to manually vet all of these or manually go through all of these. But if we can make sure that our algorithms are doing the right things by bringing useful information up in the search results like that, then that's something we can fix not just for one search result, but maybe for millions of other search results as well. [INAUDIBLE]

AUDIENCE: [INAUDIBLE] this is what you've been doing for years, though. You've been doing this for years and years and years, ever since I've been using Google, which is from its inception, is that you fix things with algorithms. But the algorithms take six months in many cases to catch up. And the result of that is that these sites are ready. They're waiting for when their site gets pulled down, they have another one waiting. This particular business has been doing this for many years, and has been successful in staying up there with a new website and being in the top three continuously. And I've seen this in many, many industries. So that solution doesn't work. What you need is some kind of marker that you can basically say, right, remove these sites, and then let's write some code to factor them into the next algorithm. Start clean sheet again, and start marking sites up again. Because otherwise, you're just telling everybody right now, concentrate, if it's worth it for your keyword, concentrate on the churn and burn site, get up there, and then just start a new one. And there's plenty of businesses that have a lot of money where this is very beneficial for them to do so. And they will do it and are doing it. For smaller people, it's very difficult to maintain multiple sites. But believe me, for these big sites, affiliates, for people who are selling car-related, sell your car, things like that-- I've highlighted many things to you in the past-- they're doing it, they're still doing it, and they will continue to do it. And they're bad results. They're not good for the customer support. It's the wrong approach. We all want to make the results better.

JOHN MUELLER: That's good, yeah. I mean, this is something that, as you've seen in the past, it's kind of a balance between what we do on a manual side and what we do on the algorithmic side. And there are definitely some situations where we could and should be doing better, and there's lots of situations where we push back a lot on the engineers to say OK, instead of waiting another half a year to work on this algorithm, we need to find a manual solution in the meantime to make sure that at least these sites that are causing a lot of problems here are kind of taken care of. But it's tricky to find a balance there, and as you can imagine, like I said, especially with churn and burn sites, doing things manually is a lot of work. A lot of cycles that are essentially lost on things that just get flipped over to something else on the next turn. So finding a balance between manually taking out some of these and algorithmically handling them better in the long run is always a problem. And I think you'll always find these kind of situations where some of these sites are still getting through, and others are kind of waiting to be kind of let through the search results because they're doing everything right, and that's something where we're definitely working with engineers to make sure that those that are doing things right continuously also get shown appropriately in the search results. But I totally understand your frustration, and we do bring that to the engineers as well when we're working on these problems.

AUDIENCE: Yeah. Isn't the reality of the situation, if you clean up the problem right now, and you basically show people, look, you're not going to last more than a couple of days or a week, then the system itself will fix the issue? People won't have to actually do this anymore. You will have stamped out the program, and people will know not to waste their money doing it. It won't work. And therefore, you don't have a lot of overheads to deal with once you've proven your point.

JOHN MUELLER: There are a lot of sites out there. So manually doing the whole internet is really tricky. When I talk with the web spam team, I see they're busy all the time. And there are lots of people on the web spam team that are working on these kind of manual problems. But there are really a lot of sites out there, a lot of search results, that could theoretically be cleaned up manually, but it's a lot of work. So that's why we put so much effort also into creating these algorithms that catch these a little bit more broadly. But your point is definitely taken, and it's something that we do bring back to the teams regularly.

AUDIENCE: There's a lot of free workers here as well. We're all here to work for free. So don't forget to use us.

JOHN MUELLER: I can imagine you're not working for free, but I totally accept your point that we could be doing more to kind of use those spam reports a little bit better.

AUDIENCE: Thanks, John.

JOHN MUELLER: All right. "We had a link attack on our main keyword that we had ranking between one and four. When I noticed that we had this attack, I found spammed links and disavowed them, but our keyword ranking fell down to between 12 and 20. What can I do to recover?" I think, first of all, if this is something that is timewise fairly close together, then chances are those two situations aren't related. And maybe there's something else that our algorithms picked up on your site where we're seeing problems. So this is something where I'd first try to get help from the community as well, from other webmasters to at least vet your site and make sure that it's of the highest quality possible. So this is something where theoretically there could be some connection here, but in practice, this is something that we really rarely see. So I'd really recommend, first of all, taking it to peers and having them take a look at that. If you absolutely can't find anything there, if it really looks like something [INAUDIBLE] on our side, then feel free to send that to me, for example, on Google+, and I'll pass that on to the team here to make sure that we're doing the right thing there. I can't always get back to a lot of these reports, but I do pass them onto the team to make sure that they're aware of this problem, and that they can double check.
"If you have two TLDs, a .com and a .co.uk, both using hreflang, should links between those sites be nofollow?" I think you can make those links normal. This is something where if you have a handful of sites, and you're linking to a different version, [INAUDIBLE] reason [INAUDIBLE] language or the geotargeting, those kind of things, then linking between those is completely natural and not something that you'd have to block unnaturally. Obviously, if you have 200 sites, and they're all on the same topic, and they're all cross-linking like that, then that can look a lot like a collection of doorway pages, doorway sites.
But if these are two sites, then I really don't see any problem linking between those two.

AUDIENCE: Yeah, John, that was my question. That was just a quick one, if I was to push people to my UK site, or push them to the .com at the bottom or something like that as a reference point, I just wanted to make sure if you're doing that, there's nofollow or dofollow issues with my internal linking because that's not the intention.

JOHN MUELLER: That's absolutely fine. And a lot of sites do this across the site, so between the individual page versions as well, in that they'll take that hreflang link and also put it into the text, where you have like a flag you could switch between UK and US, or UK and global, or whatever you have. And those links are absolutely fine. That's not something that you need to block from passing PageRank.
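As an illustration of the setup described in this answer, here is a minimal Python sketch that emits both the hreflang annotations and the visible country-switcher links for one page. The example.com and example.co.uk URLs, the labels, and both function names are hypothetical and do not come from the Hangout.

```python
from html import escape

# Hypothetical URLs for the two country versions of the same page.
ALTERNATES = {
    "en-us": "https://www.example.com/widgets",
    "en-gb": "https://www.example.co.uk/widgets",
}

# Hypothetical labels for the visible switcher.
LABELS = {"en-us": "United States", "en-gb": "United Kingdom"}


def hreflang_links(alternates):
    """Build one <link rel="alternate"> tag per language/country version."""
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in alternates.items()
    ]


def language_switcher(alternates, labels):
    """Build plain anchor tags for the visible switcher, with no rel attribute."""
    return [
        f'<a href="{escape(alternates[code])}">{escape(labels[code])}</a>'
        for code in alternates
    ]
```

The switcher links are deliberately ordinary anchors, matching the advice that links between two versions of the same content can simply be normal links.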

AUDIENCE: OK. Thanks, John. Is there any benefit SEO wise? Not that I'm doing it for that reason, but [INAUDIBLE]?

JOHN MUELLER: Well, they would pass PageRank, so that's something where you'd be sharing your PageRank across those two sites. So to some extent, that can make sense. It can help to get those pages indexed a little bit faster. As far as I understand, in your case, it's not a matter of indexing. So theoretically, that's totally up to you. Personally, I would just link normally between those two versions, because they're essentially two versions of the same content. It's not something that is seen like an advertisement or that kind of thing.

AUDIENCE: OK. Thanks, John.

JOHN MUELLER: "All of our website article pages have a link to the PDF version of the article for people to download. Will these PDF files be seen as duplicate content and lead to a penalty? Should the PDF files be blocked in the robots.txt file?" So first of all, we don't treat duplicate content within a website in a bad way. You won't get a penalty for having duplicate content. So it's not something that you absolutely need to kind of block or take care of or handle in any special way. What will probably happen is we'll index these PDF files as well. We'll index them as PDF files and show them in search when we think that they make sense to users. Usually what will happen is for normal queries, we'll show your normal HTML pages because they're really well linked, and they give that information really well. And if we can recognize that someone is looking for something specifically like a PDF, then we'll show that in search instead. But there's no downside to doing this, essentially.

The only thing I can think of at the moment is that PDF files, like images or doc files that you have on your pages, tend to change less frequently than HTML content, so we'll probably not crawl them as frequently. So if you have content on your website that changes fairly quickly, and you have all of that in a PDF file as well, then chances are we'll index the PDF file once and maybe leave it the same for a couple of months or even a year or longer, and we'll just be updating the HTML content in that time regularly during our normal crawls. So what might happen is that the PDF that we have indexed is kind of out of sync with the HTML page that you actually have on your website. Sometimes that doesn't matter. Sometimes maybe that's more of a problem, if you have, for example, a news website, and you don't want the old versions to be found in search. Then maybe it makes sense to kind of block those PDF files from being indexed. But past that, there's no penalty for having PDF files indexed like that. There's also no inherent advantage where we'd say, oh, if you have all of your content as a PDF, we'll treat that as being higher quality content than your HTML content. So if you're doing this for the users, if end users kind of like being able to access those PDF files, or you think people might be searching for PDF files specifically, then go ahead. Leave them indexed. If you're kind of-- if you have a very dynamic website that changes quickly, then probably having everything as a PDF file doesn't make that much sense. So it's essentially up to you.

"Is it recommended for webmasters to place link-worthy content like blogs in subfolders rather than subdomains?" From our point of view, whatever you want. You can put it in subfolders, subdomains. It's not something that we'd say you need to do in any special way. Sometimes there are technical reasons for subdomains. Maybe you have to put it on a different host, a different server. It's essentially up to you. It's not something where we'd say, this way is better than that way.
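For a site that does decide to keep its PDF copies away from crawlers, as in the news-site case in the PDF answer above, one way to check a robots.txt rule before deploying it is Python's standard `urllib.robotparser`. This is only a sketch under stated assumptions: the `/downloads/pdf/` directory and the example.com URLs are hypothetical, and the rule is a plain path prefix because the standard-library parser, as far as I know, does simple prefix matching rather than wildcard patterns. Note also that robots.txt controls crawling, not indexing as such.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of a PDF download directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /downloads/pdf/
"""


def build_parser(robots_txt):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawlable(parser, url, user_agent="Googlebot"):
    """Return True if the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


PARSER = build_parser(ROBOTS_TXT)
PDF_URL = "https://www.example.com/downloads/pdf/article-1.pdf"
HTML_URL = "https://www.example.com/articles/article-1.html"
```

With these rules, `is_crawlable(PARSER, PDF_URL)` reports the PDF as blocked while the HTML article stays crawlable.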

AUDIENCE: Can we just ask a quick question about that little green thing on top of the browser?

JOHN MUELLER: That little green thing. OK, go for it. Which one do you mean?

AUDIENCE: That little thing-- is that little green PR thing, when is that going to get updated?

JOHN MUELLER: That little green-- oh, the PageRank, toolbar PageRank.


JOHN MUELLER: I have no idea. Is it stuck, or is it--

AUDIENCE: It's not moving.

JOHN MUELLER: It's not moving. So I don't know. I don't even have a browser that shows the PageRank.

AUDIENCE: Well, I use a browser when I need to just look at it once in a while, and it's called IE.


AUDIENCE: [INAUDIBLE] counted, and it's 227 days since the last PageRank updated.

JOHN MUELLER: OK. I know last time we had more of a technical issue, that we basically didn't update it for a while. I don't know if there is anything similar like that at the moment. But essentially, we don't recommend using PageRank as any kind of an actual metric. It's something we've used in a browser for a while, and we've traditionally updated it from time to time, but I wouldn't be that surprised if it disappeared at some point or another. But I will definitely check with the team to see if something got stuck there or what's happening there.

AUDIENCE: My question was, that the PageRank algorithm is so old, and you guys patch your algorithms so much, you have so many patches on top of patches, is the link-based PageRank algorithm really even relevant anymore?


JOHN MUELLER: I mean, we still use PageRank as one of the signals that we use for crawling and indexing. So that's something-- it still makes sense. I mean, there have been updates to the algorithms I believe for a while. But it's still something that's being used. And we do still see links within the web as being relevant and helpful for recognizing the kind of content that we need to crawl more frequently or that we need to write differently maybe, those kind of [INAUDIBLE].

AUDIENCE: Before you said you use-- you just said a second ago you use PageRank for crawling and indexing.

JOHN MUELLER: And ranking.

AUDIENCE: And ranking, OK. [? We're being ?] [? clear, ?] OK.

JOHN MUELLER: I totally didn't want to throw that in there, but I missed the word, yeah.

AUDIENCE: OK, just making sure, thanks.

JOHN MUELLER: That would have been [INAUDIBLE].

AUDIENCE: [INAUDIBLE] the headlines are coming.

JOHN MUELLER: Oh, gosh. Now I need to make sure I put my poker face on, that you don't read into anything about the words that I missed.


AUDIENCE: Hours and hours reading everything into everything that you say, John.


JOHN MUELLER: [INAUDIBLE] PageRank. We just don't think it's that much of an actionable metric for webmasters to actually focus on. So I don't know if this is something that we would update the algorithm for to kind of-- I don't know, update the new toolbar PageRank, or if this is something where the engineers have said, oh, well, we haven't updated it now, and since hardly anyone's using the toolbar nowadays, maybe it doesn't really make sense to keep focusing on that data.

AUDIENCE: All the SEOs are using the toolbar.

AUDIENCE: No, we're not. But John, don't you see a bit of a problem for Google's guidelines in this sense, that the PageRank algorithm was published, and then the guidelines were published, and people decided to make some links, rightly or wrongly or whatever. And then the guidelines were changed to say, don't ask for links. And I believe that's still Google's policy. But let's say, just hypothetically, Google wanted to use a different recommendation factor, like social shares or whatever. Now you're in the position where, if everyone thinks that's a factor, you're going to have to put in your guidelines, please don't ask for social shares either. You know what I mean? So you're kind of in a catch-22, where people made links, and now you have to tell them don't ask for links. And now people think you're going to use shares, and so they're going to ask for shares. Is there any point in the future where you're going to have to ask, don't ask for social shares?

JOHN MUELLER: I don't know. I don't know. Good question. I mean, this is something where-- this is one of the reasons why we use so many different factors in our algorithms for crawling, indexing, and ranking, in that we try not to rely on just one factor completely because we know that this is something that could be gamed by spammers who are trying to create content in a really bad way, and it could be confusing to those who are trying to create content in a good way, in that they don't know, what should I be focusing on? Should I focus on making my website great, or should I be focusing on this individual factor that Google has mentioned in one of their technical documents a while back? So that's something where we try to have a diverse set of factors that we use for crawling, indexing, and ranking, so that we don't have to rely on any one factor too much. But sometimes it's very tricky, and it's problematic in that some of these factors might be stronger than others, or might have different kinds of effects in different situations. So we need to make sure that we can figure out when abuse is happening in some algorithmic way, if at all possible. And if we can't handle it completely algorithmically, then maybe some manual way as well to handle the abuse, and also to give webmasters guidelines on what they should be doing to create great content in the right way.

So for example, we have the same kind of situation when it comes to rich snippets, in that people can use rich snippets in a variety of ways. And we'll try to use that to understand the page content a lot better and to show that in search appropriately to the users when they're searching for something. And for example, recipe rich snippets show this nice little picture of the final item that you're cooking or creating and let you specify things like calories, cooking time, those kind of things. And we've noticed that people try to abuse that by saying, OK, well, everything on my site is a kind of a recipe, and I just want you to show this picture of my logo instead of this picture of the cookies that someone else might be creating. And they're essentially abusing the whole system to kind of promote their site in a way that we think isn't really fair, that doesn't really fit in with the rest of the web. So those are the kind of things that we try to catch algorithmically. We try to catch them manually as well where we can't catch them algorithmically. And these are the type of things where we put that into our guidelines as well and say, hey, you should only be using rich snippets for the primary content of your pages. And if you're abusing it in any way like this, then maybe we'll have to take action on the way that we understand your site, the way that we use the content from your site within search.
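To make the recipe example concrete, here is a small Python sketch that builds structured data for a page whose primary content really is a recipe. It assumes the schema.org `Recipe` vocabulary expressed as JSON-LD, which is only one of several possible markup formats and is not named in the Hangout; the function name and the sample values are hypothetical.

```python
import json


def recipe_markup(name, image_url, cook_minutes, calories):
    """Return a JSON-LD script tag describing a single recipe."""
    data = {
        "@context": "https://schema.org",
        "@type": "Recipe",
        "name": name,
        # The image should show the finished dish, not a site logo.
        "image": image_url,
        # schema.org expects an ISO 8601 duration, e.g. PT12M for 12 minutes.
        "cookTime": f"PT{cook_minutes}M",
        "nutrition": {
            "@type": "NutritionInformation",
            "calories": f"{calories} calories",
        },
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical sample page.
MARKUP = recipe_markup(
    "Chocolate chip cookies",
    "https://www.example.com/images/cookies.jpg",
    12,
    210,
)
```

The point of the sketch is the guideline itself: the markup describes the actual recipe on the page, which is the use the answer above calls appropriate.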

AUDIENCE: Well, I think-- I hope you get my point that if you have published guidelines of what to do and what not to do, [INAUDIBLE] play all those cards completely close to your chest because you have to at least tell us some things in the guidelines of what to do or not to do. Or if Google's going to take the public relations strategy of not publishing any guidelines, and just saying, we use all the signals. Don't ask us about signals. There are no signals. Then people are going to use all the signals. So you're kind of darned if you do and darned if you don't.

JOHN MUELLER: I think from our point of view, we try to err on the side of transparency nowadays, in that we think if we help webmasters to create really great websites, then there are some people that will actually take that information and use that in the right way to kind of create something that's really helpful and useful that we can also understand from an algorithmic point of view a lot better. So that's something where I believe if you go back, maybe 10 years or so, you would have heard almost nothing from Google, and Google would have basically said, we'll take a look at the web and index and rank it accordingly. Whereas nowadays, I think it makes a lot more sense to actually talk to webmasters and make sure that they're doing the right thing, that they understand where we're headed, where we want to show more content, where we can see, OK, there are lots of people, for example, on mobile who are looking for your website, but every time they go to your website, they get this really bad experience. You should be doing more to create a good mobile website. Whereas in the past, maybe we wouldn't have said anything about that. But by bringing it out into the open, I think we can generally do more good than we can do harm. Obviously, we can't bring out all of our ranking factors into the open because a lot of that is really hard work, and there are lots of other search engines out there that want to reuse that too. And there are spammers that want to kind of focus on individual items and really try to tweak their websites to kind of get past this one little hurdle that they find. But as much as possible, we'd really like to bring this information out to the webmaster because we think they can make a lot better sites if they have a little bit of guidance.

AUDIENCE: [INAUDIBLE] do want to thank you for these Hangouts and all the help you give to us, and just had a real quick follow-on question to that. About two months ago, after being penalized in some way for almost two years, we finally seem to have been released from the penalty that was really holding us back. And we've noticed that the rankings didn't recover completely to where they were before. A lot of stuff that was page one is now page three kind of things. And the question is, is there an explicit time factor that no matter what, now that we're out of the penalty, it's going to take x amount of time for the algorithm to take things into account, or do we need to assume that the algorithm has changed so much since then that we need to do other things to work on our quality and improvement?

JOHN MUELLER: If it's been over a year, and you say two years, then I imagine the algorithms have just changed over that time. So that's something where the algorithms are constantly evolving. I think we make over 600 changes a year. And that's probably, to some extent, what you're seeing there. For the most part, unless we block your site completely with a manual action, and we say we don't really want to take a look at this site at all, then your site is always being affected by the algorithms in some way. And there's not this period of time where things kind of fluctuate back into normal again. It's essentially always kind of in this flux. So if your site hasn't been removed completely from search during that time, then I imagine what you're seeing now is kind of the steady state of the current situation.




AUDIENCE: I'll let Barbara hop in. Thank you.

AUDIENCE: Thanks. Hey, John. I want to-- I put this question in the queue, but it's way down at the bottom, and since I'm in the US, this might be my only opportunity to do without getting up at 4:00 in the morning. Can you hear me OK?


AUDIENCE: OK. Google considers both the quality and the quantity of backlinks. And I have a travel blog, and I'm often-- quite often, because I'm one of the old time travel blogs-- asked to do interviews, which then result in a story on some other site. And a whole lot of them tend to be low quality sites. And I'm wondering whether I should pick and choose and refuse to do the ones that might be less trustworthy.

JOHN MUELLER: I generally leave that up to you. That's not something where we'd say the site overall is low quality, therefore all links from that site are always low quality. But it kind of depends on how you want to promote your website in general. What's more problematic for us is if you were to go out and create content for these other websites just so that a link to your website could be placed. But if this is a legitimate interview, then that's essentially up to you. You can go either way with that. From our point of view, both are fine.

AUDIENCE: So if I have a whole bunch of-- we're getting back to the PR issue, but PR zero or unranked sites, it's not going to harm my ranking in the search?

JOHN MUELLER: No. That's completely fine. You don't have to look at the PageRank of the sites that are linking to your site and say, oh, this is a low-PageRank site. Therefore, I don't want any links from that site. I'd just look at the sites naturally and say, this is a site I want to be associated with or not, and not look at artificial factors like PageRank.

AUDIENCE: Great, thank you.

JOHN MUELLER: All right. We just have a couple minutes left, and a whole bunch of questions left. Let me grab a few of these and just answer them really quickly.

"We run a website with 2 and 1/2 million pages, of which 1.1 million have been indexed. Each week we add 10,000 pages to our sitemap, so we'll be releasing another country. How do we steadily build SEO without waiting before a complete [INAUDIBLE]?" Essentially, you could add all of your pages to the sitemap file as soon as they're ready. So you don't have to gradually build up a sitemap file. And if you're building a website at this scale, with millions and millions of pages, it's natural for some fraction to be indexed and some fraction not to be indexed, at least until we've been able to crawl and reindex a lot of content from your website. So you don't artificially need to hold yourself back. You can put everything in your sitemap files. At the same time, I'd just caution against creating millions and millions of pages just for the sake of having millions of pages. I'd really make sure that this content is actually useful and compelling.

"What's the best way to optimize a news website? How can we get registered into Google News? Is there someone who personally reviews a website, or is it an automated process?" There's a form in the Google News Publisher Help Center where you could submit your site, and as far as I know, that will be reviewed manually. So there are some factors that Google News takes into account, and I believe they're all listed in their Help section.

"If certain keywords have been targeted by Penguin, does this then stop those keywords from progressing in search results until a refresh, or is it the case that Google simply no longer takes into account those links [INAUDIBLE] naturally?" Penguin is a web spam algorithm that generally affects a whole website, so it's not something that would be based on individual keywords or individual links there. It usually affects the whole website.
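The advice on the sitemap question, to list every page as soon as it is ready, runs into a practical limit at this scale: the sitemaps.org protocol allows at most 50,000 URLs per sitemap file, so 2.5 million pages need at least 50 files plus a sitemap index. The following Python sketch shows one way to split the list. The example.com URLs and the function names are hypothetical, and the per-file size limit in the protocol is not checked here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of urls, each at most size entries long."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Serialize one <urlset> document listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_index(sitemap_urls):
    """Serialize a <sitemapindex> document pointing at each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = url
    return ET.tostring(index, encoding="unicode")
```

A caller would write each `build_sitemap` result to its own file and submit only the index, so new pages can be appended to the last file as soon as they exist.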
"What's the real importance of SSL and HTTPS versus HTTP for having good positions in search?" As I mentioned before, at the moment, we don't take that into account for ranking. At some point, that might change. But at least for the moment, we don't take that into account.

"We've experienced a drop in traffic, and we're attributing it to the latest Panda update. What's the best way to recover from this setback?" Panda is a quality algorithm that looks at the quality of your site. There's a blog post by [INAUDIBLE] maybe two or three years back nowadays, with 23 questions you can ask yourself regarding high quality content. And I'd really go through that together with someone who's not associated with your website and make sure that you're covering everything as completely as possible.

All right, one question left. And I think someone else is going to grab this room. Do any of you have one real quick question that I can answer for you?

AUDIENCE: Barry wanted to know, on the chat, when the next Penguin update is.

JOHN MUELLER: Penguin, yeah.

AUDIENCE: He can't answer that.

JOHN MUELLER: I mean, we usually announce these like every week, right?


AUDIENCE: I have a question for you, John, if nobody else does.

JOHN MUELLER: OK, go for it.

AUDIENCE: It's a quick one. You often say that there's 500 to 600 updates a year, divided between ranking, indexing, and crawling. Can you tell us what the rough [INAUDIBLE] is? Is it 200 ranking, 200 indexing, or 200 crawling? Or is it 500 updates a year on just crawling, and two updates to ranking? What's the split? What's the breakdown approximately?

JOHN MUELLER: I don't know. But essentially, everything before ranking affects ranking as well. So if we don't crawl or index something, we can't rank it. So it's really hard to say this is only something that affects crawling, and you'd never see any change from it in the search results. Essentially, everything that we do for web search, we try to make sure that it is visible somewhere in search, or at least is visible for the webmaster in the sense that maybe we crawl less frequently and still get the right amount of content. But essentially, we try to make sure that they're actually doing something. Because if they don't do anything, we'd rather just delete that code.

AUDIENCE: Except for Hummingbird.

JOHN MUELLER: I don't know. We delete that too, if we don't need that anymore. OK. So with that, we're out of time. Thank you all for your questions, and thank you all for joining. I put the new Hangouts into the calendar. So if you want to join then, feel free to jump on in. They're at the usual European-based times, but you're welcome to add your questions to the Q&A feature. And if they're voted up, I'll try to get to them. Thanks a lot.

AUDIENCE: Thanks, John.

AUDIENCE: Thanks, John.

AUDIENCE: Thanks, John.

AUDIENCE: Bye, John.


AUDIENCE: All right. Is this Josh? Arthur. Arthur, are you there?

AUDIENCE: Can you hear me?

AUDIENCE: Hey, Arthur.


So it's only your time 4:00 AM. In my time, it's--

AUDIENCE: It's 5:00 AM my time, but it's the Friday ones.


And in my time, it's 5:00 PM. It's perfect time I'm in. Ah, no, no, no.

Friday I have it at 12:00, at noon.

AUDIENCE: Yeah, but you also joined the Jim Hangout, right?


AUDIENCE: Wow. I mean, I don't know how you do it. I don't know how you do it.

ARTHUR RADULESCU: I can tell you. For me, it's in the middle of the day. So it's OK. Yeah, and it's Thursday, so it's OK. It doesn't get over the John Mueller's Hangouts. So it's quite OK.

AUDIENCE: So how's Romania right now, man? How's everything over there? Is the cost of living high? What's going on over there? Are you guys OK there?

ARTHUR RADULESCU: Well, the cost of living, it's almost the highest in Europe. But well, we're not complaining so much. I mean--

AUDIENCE: Yeah, you don't hear too much about Romania in the news.

ARTHUR RADULESCU: Yeah, because we are quite, I don't know, steady people? How can I put that? We don't do too much noise about things, you know? But the only thing I am pissed off is the price of the gas. Wow.

AUDIENCE: [INAUDIBLE] too much, you drive a very expensive car?

ARTHUR RADULESCU: No. But I drive a very expensive gasoline and diesel fuel. I mean, it's twice your price. Just think about it. And salaries are probably less than half. So the gasoline is killing everything because everything goes on the road. Food, non-food things like furniture and everything. Everything goes by the road. So everything gets expensive.

AUDIENCE: So what do you think? Is Penguin going to come or what?

ARTHUR RADULESCU: For sure. For sure.

AUDIENCE: October, right? October something, right?

ARTHUR RADULESCU: Yes, September at the end, October.

AUDIENCE: You think it's going to be a shake-up again, same shake-up like before?

ARTHUR RADULESCU: Well, no. I hope they can fix something they broke last year.

AUDIENCE: They know it was a crazy shake-up for everybody.

ARTHUR RADULESCU: Yeah. Probably they will do the same. But maybe something gets fixed. I don't know.

AUDIENCE: And then they adjusted everything back, right? So yeah. Because [INAUDIBLE].

ARTHUR RADULESCU: [INAUDIBLE] point. They did two updates in a row, you see. They did the bad one, and then they tried to [INAUDIBLE] a little bit.

AUDIENCE: [INAUDIBLE]. So basically kind of like running over you, and then kind of reversing.


AUDIENCE: That's what they did.

ARTHUR RADULESCU: They've realized they've touched too many URLs on the first one. Probably they didn't want to touch so many, but well--

AUDIENCE: You see how Gary's really mad, eh? He's mad about his site. Have you seen his virtual site, man? I don't know what happened there, but he's really pissed, eh?

ARTHUR RADULESCU: Well, mate, I don't know. At a certain point, he was pissing me off. I mean, I wanted to go in private with him and tell him, hey man, just let John do his job because he's not there only for you. I mean, we're laughing today, but sometimes he's just over the top.

AUDIENCE: Yeah, [INAUDIBLE] for 15 minutes, he'll go crazy.

ARTHUR RADULESCU: Yeah, so it's not that good. I mean, OK, he comes in, and he put one question, like all the rest of us. But then he just have to stop. I don't know what's wrong with his site, but he's very pissed because his site is down on its knees. But anyway, I mean, from my point of view, if I was to have his own problems so much, I would've just dumped his main domain name and start with a new one. No, come on. I mean, he's losing a lot of money.

AUDIENCE: Eh. You don't know if he's losing a lot of money. He could be just-- you know.

ARTHUR RADULESCU: Maybe he's only teasing. But anyway, he had a lot of problems. I mean, he's complaining all the time.

AUDIENCE: So [INAUDIBLE] are you taking any clients right now or what?

ARTHUR RADULESCU: I do, yeah. That's beside my work, where I have a lot of clients.

AUDIENCE: Right on. Yeah, if there's anything, if I'm busy, I'll pass something over to you.

ARTHUR RADULESCU: OK, no problem. Do you know somebody who is working on [? the-- ?]