Reconsideration Requests

Google+ Hangouts - Office Hours - 23 May 2014


Key Questions Below

Transcript Of The Office Hours Hangout

JOHN MUELLER: Welcome to today's Google Webmaster Central office hours Hangout. My name is John Mueller. I'm a Webmaster Trends Analyst at Google in Switzerland. So I work together with webmasters like you and with our search engineers to make sure that we're doing the right things. In this Hangout, we have a bunch of questions that were submitted already. There are a bunch of you also live here in the Hangout. Feel free to ask questions in between or to comment on things. You're welcome to join in. As we get started, do any of you want to ask the first question?

AUDIENCE: John, I will, if you don't mind.

JOHN MUELLER: Sure, go for it.

AUDIENCE: Would you mind having a look at a result for me? It's related to the question that I posted to you previously, where the snippets and the cache that are showing within the results show our site but come up with something else.

JOHN MUELLER: Sure. Let's see. OK, yeah. That doesn't look like a great experience, yeah.

AUDIENCE: No, and it's-- I don't know if you remember our site. I know you looked at it a couple of times, but I assume this is related to the issue. And our developers think it's a DNS issue that might have happened months ago, at the time of our drop. But that's where my technical knowledge ends. So I'm wondering if you see that sort of thing often, and what usually causes it.

JOHN MUELLER: So from my point of view, that kind of looks just like some scraper site. If you do something like a site query, you're going to find those kinds of results, because that's essentially what they have on their site. But that doesn't necessarily mean that they show up in the normal search results. So if I take one of the titles and search for that-- let me just drop that in the chat. I hope there's nothing crazy on the URLs. Essentially, it doesn't show that site, at least--

AUDIENCE: No, but the real problem, obviously, from our side, is that it doesn't show ours either. And it's all unique content to us. And we only found this because we found that they had scraped our analytics code. We saw, within a service, that they were reporting as having the same analytics as us. So we thought, OK, well, it's not that unusual, because scrapers take code all the time. And it wasn't affecting our analytics, in so much as it wasn't showing a bump in visitors or anything. But it means that it's almost like they've replaced us. And there's the Payday Loan update recently; last year's one coincides with when our site dropped. And this happens to be a Payday Loan scraper. I put two and two together, but I don't know whether I'm completely wrong. And that's where our issues lie.

JOHN MUELLER: Yeah, that shouldn't have anything to do with the Payday Loan updates. So I see that site more as a generic scraper site. I mean, if they're copying your content, and this is your copyrighted content, you might want to look into the DMCA process to see if that makes sense. But if they're not showing up in the normal search results, I don't know, I wouldn't bother submitting individual pages. Maybe do the DMCA to the hosters so that they can take care of it that way.

AUDIENCE: Well, we also-- their site is down. And via Webmaster Tools, we did a removal request. They're 404'd. So we submitted a removal of the URLs two weeks ago, and they're back. And the cache shows-- when you look at the cached version, it's us. It says, this is the cached version of this page. That's our URL, not theirs.

JOHN MUELLER: Yeah, usually what happens in those cases is we recognize that they're the same content, and we use your URL as the authoritative version. But if you do something like a site query or a specific info query, then you might see their URL as well. But essentially, we've recognized that this content is the same as yours, and we see your URL as the one that we want to focus on. So I think that's kind of working as it should be. It might still make sense to do something like the DMCA to the hoster so that they can take care of it completely. But if you're not seeing them in search, I wouldn't worry about it. There are no exotic effects from them using your analytics code or anything in that regard.

AUDIENCE: All right, so you don't-- I mean, obviously, you can take this offline later. But you don't think it's related to the same issue I sent you.

JOHN MUELLER: No, no, that's-- yeah.

AUDIENCE: All right.

JOHN MUELLER: Something different. Some of these algorithms are things where we do work together with the Search Quality team to see if we can get some of these updated a little bit faster. And I think your site might be in one of those situations. So it's tricky in the sense that there's no specific fix that you can apply, like tweaking the HTML, or finding some scraper site that's copying your content in a weird way and having it taken out. It's more a matter of these algorithms being updated, and generally seeing that your site is really the absolute best of its kind for these kinds of queries.

AUDIENCE: All right.

JOHN MUELLER: So I think, from what I've seen of your site, your site itself is in a pretty good position. But obviously, I don't have as in-depth a view as the Search Quality people with regards to what all of our algorithms are picking up on, and maybe there are individual parts that they find problematic. But I think it's kind of lined up in the right direction; it just takes time for these things to get pushed and updated.

AUDIENCE: All right, but it's a year now. July last year is when it--

JOHN MUELLER: Yeah, yeah.

AUDIENCE: So all right. Well, as I said, I've still got the query outstanding with you. I don't want to hog this one.

JOHN MUELLER: Yeah, I mean keep pushing. If you see that these things don't change, we work with engineers--

AUDIENCE: Oh, I will.

JOHN MUELLER: --to let them know that people are waiting. So yeah.

AUDIENCE: OK. All right, thanks John.

JOHN MUELLER: Sure. All right. Let's grab some of the questions here.

"Being an e-commerce site, there are many coupon sites that feature our products on their site and copy our content. Due to this, original content published on our site is duplicated on many other coupon sites. Will those affect us in search? What's the best way to mitigate this?"

In general, if these are just your product descriptions, and these are legitimate coupons, I wouldn't necessarily worry about that. If people are searching for your product, in general, we're probably going to be showing your site directly instead of these other sites. If people are searching for coupons for your products, then probably they'll find those sites that have the coupons. So from that point of view, that's not something you need to take care of. At a recent conference, I saw someone mention how one site handles that fairly well and ranks above these coupon sites, in the sense that they provide coupons as well, or information about coupons. So instead of just trying to rank for your product, maybe it makes sense to put something on your site about coupons as well. Let me see. Oh yeah, Zappos. So if you search for "Zappos coupons," you'll get a page from Zappos about their coupons. So if you find that these coupon sites are problematic, maybe it makes sense to at least provide some content on your site that we can show for these search results as well.

"A lot of people think that if you file a disavow file, you'll be alerting Google to the fact that you have unnatural links, and possibly triggering a manual review. As such, many people are afraid to use the disavow tool. What are your thoughts, John?"

That's absolutely not the case. So that's not something that happens on our side. The disavow file submission is something that happens automatically, in the sense that you submit it and it gets processed automatically. There's no message that's sent to the webspam team about these kinds of situations.
So that's definitely not the case. And in any case, if you're trying to clean up an issue with regards to links, I wouldn't see that as something where the webspam team would say, oh, you're trying to clean it up, that's a bad thing. Rather, they want you to clean it up as well. So that's definitely not something where I'd hold myself back and do a sub-optimal job of cleaning things up just for fear that maybe someone will manually look at it as well. I mean, lots of people use a disavow file. Lots of people use Webmaster Tools to clean things up in other ways. It's not something that would make sense for the manual webspam team to actually use as a signal and try to penalize sites based on things they're trying to clean up and improve. So from that point of view, I definitely wouldn't hold myself back from using the disavow tool if you have any problems with regards to links, or anything that you'd like to just clean up because you want to make sure that it doesn't cause any problems in the future.

"What do you know about the Panda 4 update? Was there a Penguin refresh this week as well?"

I don't think there was a Penguin refresh, at least not that I was aware of. The Panda 4 update is just generally something that's headed in the same direction as the previous updates. So there's nothing more specific that I can really share in that regard.

"Is there any way to download the Top Pages report along with the keywords from Webmaster Tools?"

I think you can download the Top Pages report separately. So if you go to that feature, there is a Download button that you can use to download the CSV file. We used to have it so that you could push it directly to Google Docs. We had a small technical problem there, and at the moment, we've had to disable that. But that should be back, I think, next week. So next week, you'll be able to push it directly to Google Docs again, as well as download the CSV, if you prefer that. The Keywords feature is separate.
So you can download that separately. If you're talking about the keywords that we found on your site while crawling, that's something where I'd take that information with a grain of salt as well, because these are keywords that we found while crawling your site. It doesn't necessarily mean that these are pages that we think are important for your site, or keywords that we think are critical or important for your site, or anything that we'd rank your site for. It's essentially a raw feed of the keywords that we found while crawling. And sometimes we find keywords that don't make much sense, that are common items on your site. So maybe you'll have something like "image" or "click here," or those kinds of keywords mentioned there. And it's not the case that we'd see your site as being less relevant for your main keywords just because we find those unrelated ones there as well. It's just that we want to give you an overview of the keywords that we found. I'd primarily use the keywords data there to try to recognize if your site was hacked, for example, and see if we find some totally unrelated keywords on your site that you don't want associated with your site, and use that to clean those things up as well. But apart from that, I wouldn't focus so much on that keyword data.

"If we tag our duplicate pages with noindex, follow, do we get SEO benefits from external links pointing to those pages?" SEO benefits, that's a good way to call it, I guess. "For example, multiple pages covering a mobile phone and its color variations, where the content is practically the same." OK, so essentially what happens with a noindex, follow is that we won't index that page itself, but we'll forward the PageRank to the pages that you have linked from there. So in that sense, it does forward that information.
With different variations of pages like that, you'd probably want to do something like rel=canonical instead, though, which helps us to really understand that these are essentially duplicates, and we should fold all of this information into your preferred version. So instead of noindexing pages that are essentially duplicates and that you don't want to have indexed separately, it probably makes more sense to use rel=canonical, so that we can combine all of that information, all of those signals, into the individual pages that you actually do want to have indexed.

"Do internal website links with exact-match keyword anchors hurt a website's ranking?"

No. So no, they don't.

"Are too many internal links with the same anchor text likely to result in a ranking downgrade?"

No, that's also not the case. Essentially, you can link within your website however you want to. That's not something that we'd see as unnatural. The main thing I'd watch out for here is that you don't run into a situation where you're keyword stuffing on the page. So it's not so much an issue with the links themselves, but more with the anchor text that you're using there. For example, if you take your whole website and you link to it with the titles of each page in your footer in a 2-point font, then that's something where, if we look at that page, we'll see, oh, all of these keywords are mentioned on all of these pages on the whole website. And it makes it really hard for us to recognize which keywords are actually relevant on these pages. So stuffing things onto the page with anchor text and links is something that looks more like keyword stuffing at some point. And so, from that point of view, I'd try to avoid that. If you're just doing natural linking within your website, and you happen to use the right anchor text to let users know what they should expect on the other page of your website, then that's absolutely fine. Nothing I'd really worry about there.
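To sketch the duplicate-pages point above in markup (the URLs are hypothetical placeholders): a color-variant page can either be noindexed with its links still followed, or, preferably, point a rel=canonical at the main version. The tags go in the page's head; in practice you'd pick one approach, not both:

```html
<!-- Option A: keep the variant out of the index, but still pass
     PageRank through its links -->
<meta name="robots" content="noindex, follow">

<!-- Option B (preferred for near-duplicates): fold this variant's
     signals into the main product page -->
<link rel="canonical" href="http://www.example.com/phones/phone-x">
```

With option B, the variant URLs can stay reachable for users while search signals are combined into the preferred URL.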
"Can you give us some tips on writing online content, recommendations on formatting and style to be fully compatible with devices and search engines?"

Oh, wow, that's a big topic. I don't have anything specific that I could point to. I'd probably recommend taking a look at our "SEO Starter Guide," which covers a lot of this information with regards to making it easy for search engines to actually read your content. If you're putting your content on HTML pages in a way that you can copy and paste that content and put it into a text file, then that means, for the most part, that we can also read it for search. So in those cases, that's perfectly fine. But what I'd try to avoid is putting important content only into things like Flash files or into images, in ways that make it really hard for search engines to actually find that content. But if you're writing normal content, and you're using a blog like WordPress or some other CMS, then chances are that content is going to be crawlable and indexable, and we'll be able to bubble it up in search. And at that point, it's not so much a matter of using the right formatting for the HTML, or bolding, or headings, and those kinds of things, but really a matter of making sure that your content is really unique, compelling, high quality, something that people will recommend when they see it. And that's not something that I'd say is a technical matter. It's really more a matter of what you actually write.

"Why am I getting results from US and UK websites when I'm searching from India?"

Sometimes we have websites that are globally relevant, and we'll show them in all locations. So from that point of view, it's not something that would show up only within those locations. It can definitely be the case that you're looking for something that's globally relevant, and we have content that's based in the US or based in the UK, and we'll show that in India as well.
Sometimes, it can also happen that we understand things a little bit wrong-- for example, that we take some city in the UK and think, oh, this sounds like a city in India as well, and we'll assume that it's relevant for India even though it isn't. And those are the kinds of situations where it could help us if you could file feedback. So in the search results, at the bottom, there's, I believe, a Feedback link that you can use to give us more information, so that we can take a look to see what specifically is happening. But in general, some content is relevant globally, and we'll show it in all locations.

Multilingual sites. "Is it enough to set up the [INAUDIBLE] in the meta section, or should the multilingual URLs be freely discoverable via crawlers? On our site, language-specific content is usually delivered based on a cookie. Will this have a negative impact?"

Yes, we should be able to recognize and crawl all of the URLs individually. So that's something that we need to be able to do, so that we can actually crawl and index all of these pages separately. If you have content, for example, on a home page that dynamically changes based on the user's location, that's possible as well. You can set that up with an hreflang x-default annotation, for example. But we'd still need to be able to separately crawl and index the individual language versions. We recently did a blog post on that topic. It's probably one of the last ones on the Webmaster Central blog, regarding ways that you can handle your home page. That has a lot of information that will probably help you in this regard. So the main thing, I think, is to avoid having only one URL that always dynamically changes the content, because in situations like that, we'll never be able to see the individual language content. So we really need to be able to crawl and index the separate language versions as well.
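The hreflang setup just described can be sketched like this, with hypothetical example.com URLs: each language version lives at its own crawlable URL, and the dynamically adapting home page is annotated as x-default. Each version would carry the full set of annotations, including a link to itself:

```html
<!-- In the <head> of each language version: -->
<link rel="alternate" hreflang="en" href="http://example.com/en/" />
<link rel="alternate" hreflang="de" href="http://example.com/de/" />
<!-- The cookie/location-based page shown when no language matches: -->
<link rel="alternate" hreflang="x-default" href="http://example.com/" />
```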
"John, why is Webmaster Tools data always a few days behind? Why can't it be faster?"

So usually what happens with a lot of the data that we collect here for web search and for Webmaster Tools is that we first collect it for a day, then we aggregate it, kind of combine it into the relevant parts that we need to keep and show, and then we bubble that up into the individual parts of our pipeline, into the individual tools that are actually using this information. So that's something that's essentially based on the architecture that we have on our side. That data is always a few days behind, just because of the way that we have to process and combine it. Essentially, we're not artificially delaying this data. It's not that we're saying, oh, we have to keep this real-time data away from webmasters. It's just that, the way our infrastructure is set up, this is how we handle the data. We've looked into ways to speed that up, but so far, it's been really problematic and really tricky to actually get something that's maybe a day old or a couple of hours old. So we've tried to set our priorities and say, OK, instead of speeding this data up by a couple of hours, maybe it makes more sense to provide another useful feature in Webmaster Tools that works for everyone and provides more value than data that's just a few hours fresher. But we'll try to see what we can do. And maybe at some point, the infrastructure will change a little bit in a way that makes this a lot easier. And at that point, I'm sure we'll try to get the data out a little bit faster.

Structured data. "Is there a schema I can use for product listing pages? The product schema is intended only for individual products. What's the best way to mark up pages, for example, a page that only lists various laptops or site offers?"

I'm not absolutely sure. So I'd have to check the Help Center on that as well.
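One hypothetical option for such a listing page is schema.org's ItemList type, sketched here in microdata with placeholder products. Note that, as a non-rich-snippet type, markup like this wouldn't be expected to change how the page is displayed in search:

```html
<div itemscope itemtype="http://schema.org/ItemList">
  <h1 itemprop="name">Laptops</h1>
  <ul>
    <li itemprop="itemListElement">Laptop A</li>
    <li itemprop="itemListElement">Laptop B</li>
    <li itemprop="itemListElement">Laptop C</li>
  </ul>
</div>
```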
In general, what's worth keeping in mind is that the kind of markup you'd probably add to a page like that would be something from schema.org. But that's something that we wouldn't use directly in search at the moment. So if you want something that's visible in the search results, I'd really limit myself to the information we have in the Help Center, those, I think, five or seven types of rich snippets that we would show, and use those kinds of markup where they're relevant on your pages. And if you want to go past that and give us and all the other search engines more information, well knowing that, at the moment, this information won't be used to show more things in search, then going through and looking at the various markup types and adding those is definitely a good idea. For us, it's always a bit of a chicken-and-egg problem. On the one hand, we'd like to show more information in the search results as rich snippets. But on the other hand, if nobody is using this kind of markup, we won't have anything to show there. So it's always a good idea, from my point of view, to go to schema.org and think about what else you could add. But if you're on a limited budget, and you need to show some kind of a return on investment to the people who own the website, then maybe it makes sense to stick to the visible rich snippet types-- at least for the moment, until maybe you do a bigger redesign or something like that. Oh, [INAUDIBLE] added an ItemList, information, yeah. Maybe that's something that could be used there.

"My couple-of-months-old site is getting hit by negative SEO. Someone is pushing comments and trackback links on a lot of sites with super-optimized anchor text."

In general, we'd recognize these kinds of situations. It seems kind of odd, if a website is just a couple of months old, that someone would even bother doing something like this. So it may make sense to look into what's actually happening there.
Maybe these are also things that took place before you started your website. Maybe this was a domain name that was used before, for example. But if you see these kinds of things, using a disavow file is definitely a good way to proactively prevent them from causing any problems. But in general, for the most part, we do recognize these situations and try to handle them appropriately, from a manual and an algorithmic point of view. But like with any other problem, if you see this happening, instead of worrying about it too much, there are tools that you can use to solve it and to prevent it from causing any problems.

"When a good article links to my site, I link back to them from the web mentions section on my website as a form of bragging. Am I potentially reducing the impact of those inbound links by making them all related to my domain and reducing diversity?"

I wouldn't see a problem with that, assuming that you're taking a look at the sites that are linking to your articles and not just linking to random spammers that happen to copy or link to your post as well. So in the past, there used to be a problem with trackback spam, in the sense that people would set up scripts to ping a bunch of blogs in the hope that those blogs would say, hey, someone linked to my blog post, and link back to them. So that's something to watch out for-- that you're not randomly linking to all kinds of sites. But if a good site links to an article on your website, and you think it's awesome that they linked to your article, then linking back is fine. If your site appears in the news or in a newspaper, and you're proud that you've made it into the news, linking back to that news article is perfectly fine and natural. So that's not something that I'd worry about.

Page structure. "Is ranking affected by the position of text on a page-- for example, text coming very early versus late on a page?
Is it advisable to keep text at the beginning of the page so that, at the time of the crawl, it's read earlier than other links or information?"

We do try to take a look at how the text appears on a page. But it's not something where I'd say you have to unnaturally tweak things there, especially with regards to CSS. It could easily happen that the topmost text on your page, when you look at the page in a browser, is actually the bottommost text in the HTML code. So it's not something where you'd need to unnaturally tweak your HTML code to make it look like this text is more relevant. It's usually more important for us that we can recognize the context of the text on a page, so that we can understand, for example, how it fits in-- so having a clean title on the page, having clean headings. All of those things make it a lot easier for us to understand the general role of the information that you're providing there. And if there's something really important and it's way at the bottom of the page, then generally, that's going to be harder for users to recognize as well. So maybe focus these pages a little bit more. Making sure that you have one clear topic on each page, and that the information there is structured in a way that's easy to recognize, always makes it a little bit easier for us to understand how we should be treating the page with regards to relevance and ranking.

AUDIENCE: Hey, John.


AUDIENCE: So regarding the layout on a page, we know there's the above-the-fold algorithm issues that are important, and then ads-above-the-fold. What are some other things regarding layout that we should be careful about or problems that you typically see? In the past, one that's been discussed has been sliders. But apparently Google's doing better with sliders. But maybe those still are not so good. What are some of the layout challenges or issues you see that might also be a problem?

JOHN MUELLER: Yeah, let's see. I think sliders and tabs, in general, are something that we sometimes have a little bit of trouble with, just because we don't understand how much of them is actually directly visible on a page, for example. So if you have really important and critical information, and it's in one of the tabs on that page that isn't visible by default, then on the one hand, users going to that page might be a bit confused because they can't find the information that they're looking for. And on the other hand, when we look at that page and we see that there's a section that's set to display: none, then we might say, well, this section can't be that important for us in search, because the webmaster is actually hiding it by default. So those are the kinds of situations where I'd say, if this is critical to the primary content that you're providing on those pages, then make sure it's visible by default. Make sure it's something that we can see when we look at the page the first time, that the user sees when they look at the page the first time-- maybe that it's even above the fold, so that users don't have to scroll around to try to find that information. But if this is secondary information, various attributes that the normal user wouldn't necessarily, primarily be looking for, then putting that into a tab can definitely make sense. So it's something where you have to balance what you're trying to do for the user there and what you'd like to have them actually find really quickly. So if there's really important information and you have it in a tab, maybe moving it out of the tab is one possibility. Maybe letting that tab be indexed as a separate page is another possibility. So those are the things I'd watch out for there.

AUDIENCE: Another question I have related to that would be, should we be cautious about using too many-- for example, in Blogger, there's the Tags function. And you do want to be cautious about not over-- your tag should be natural and not too keyword stuffed. But sometimes, when you just use the tags in a natural way, but you're going to have a tag for page rank or a tag for certain kinds of functions, et cetera, then a lot of those tags-- and if people put those on the front page or they're automatically added on every page, those can tend to add an extra layer of keywords and be problematic for-- I mean, I've seen, for example, someone using too many of them.

JOHN MUELLER: That's-- yeah, I think that's something kind of to watch out for. But then it also kind of goes into becoming a usability issue more than actually something that we'd see as a problem with regards to web search. So just by adding five, six tags to individual articles, I don't think we'd see that as a big problem for keyword stuffing, because the articles themselves are usually still much more relevant or much more visible than these individual tags. But it's definitely something-- I wouldn't just randomly add 20 or 30 tags to every article in the hope that it becomes more relevant for those keywords. Because usually, we pick up those keywords from the content itself anyway.

AUDIENCE: All right, thanks.

JOHN MUELLER: With regards to layout, I think one thing to also mention is that we're getting better and better at understanding how CSS and JavaScript and all of that work together. So if there are things that are relevant on your page, making sure that they're actually visible when a browser views that page is, I think, also a good idea. So hiding it in the HTML and then using CSS to set display: none, or putting it into a one-pixel font, is something that more and more we're picking up on as well. And if we can recognize that it's in the HTML but not actually shown to users, then we might choose to ignore it. So if you have something important that you'd like search engines to pick up on, make sure it's actually visible on the page. Make sure it's something that users can find as well.

"I've added author markup, and it tests fine in the testing tool, but the author images don't always show in Google. Sometimes the image is visible, sometimes the author name, and in certain cases, nothing. Is there anything which could be done on our end?"

Essentially, if the name shows up, and the testing tool shows that it's OK, then you're doing things right, and that's working as expected. We don't always show the images. For some sites, we might not show any images at all. And that's essentially normal. That's the way our algorithms handle these kinds of situations. So if you see your name, and the testing tool says it's OK, then that's working as expected. We can understand that you've marked up the authorship there. We can use that in our algorithms if we need to. One thing to keep in mind there is, of course, that we don't use authorship for ranking at the moment. So just adding authorship to your pages isn't going to make them bubble up higher in search results.
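For context, the author markup being discussed here is the rel=author link pointing at a Google+ profile. The profile URL below is a placeholder; either form was used on article pages:

```html
<!-- Visible byline link on the article page: -->
<a rel="author" href="https://plus.google.com/[profile-id]">Author Name</a>

<!-- Or, equivalently, in the page's <head>: -->
<link rel="author" href="https://plus.google.com/[profile-id]" />
```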
The only place we might use that is for the In-Depth Articles feature, to recognize authors who are commonly writing in-depth articles that we'd like to show in that kind of feature. But apart from that, we don't use it for ranking at all.

"Does Panda run on a single day for all websites, or does it only affect a site when it's next crawled?"

When we do an update of the Panda algorithm, usually that affects the whole site. And that's based on data that we've collected over usually a longer period of time. So it's not the case that, if we do a Panda update today, it will be based on the data that we crawled yesterday. It's usually based on the data that we've collected over, I don't know, maybe a month or a couple of months. So from that point of view, if you feel that you're affected by the Panda algorithm, working to really clean up your site and making sure that it's clean, so that the next time we recrawl and reprocess all of this information we'll pick that up, is always a good idea. Doing that as early as possible makes sense. Doing that across the whole website makes sense, because the Panda algorithm is something that looks at the site overall. So from that point of view, doing it as early as possible and as completely as possible always makes sense.

"Good morning from London. Our website received a manual action-- unnatural links, impacts links. And I've evaluated 6,000 links pointing to our website, according to Webmaster Tools. And just before I was about to send emails, it expired."

I'm not sure what you mean by "expired." Are you here in the Hangout? OK, doesn't look like it. So what sometimes happens is that the manual actions we take expire after a certain time. But usually, that's more on the scale of months or years. It's not something where we'd say, we'll send you a notice of a manual action today, and in two weeks, it will expire automatically. It's really something that usually is more of a long-term issue.
So depending on what you mean by "expired"-- if you're not seeing this in the Manual Actions section of Webmaster Tools anymore, then that essentially means this manual action has been removed. So from that point of view, that seems pretty good. On the other hand, if you have gone through your links and found a bunch that were unnatural and that you'd like to clean up, then cleaning them up is a good thing regardless. But if there's no manual action shown in Webmaster Tools anymore, then you're probably on our good side there.

"Our site dropped heavily again [INAUDIBLE] Germans." Yeah, we should do that again. So one thing I've been thinking about doing-- I don't know how you guys would feel-- is something like one-to-one office hours, where people can sign up for maybe three, four, five minutes together with their site and me, and we'll take a look at things specifically for your website. I think that might be interesting; it might help cases like this, or other cases as well. It'd be a little bit similar to how people can come up at conferences and grab me for five minutes, and we look at something really quickly. OK, in the chat, someone says, great idea. So I need to figure out the mechanics of how that could work, because I imagine there are probably more than a handful of you interested in something like that. But I'll definitely see if I can set something like that up for one of the future Hangouts.

"Is my robots.txt file wrong? Googlebot keeps crawling pages with large view in the URL." I can take a quick look, but I'd probably have to see exactly what's happening there. One thing to keep in mind-- I don't know if this is the case with your specific site, Gary-- is that the robots.txt file is case sensitive. So if large view is a parameter in your robots.txt file, and sometimes that parameter appears uppercase or lowercase, then you'd want to list those individual case versions as well.
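To illustrate that case sensitivity, here's a small sketch using Python's standard `urllib.robotparser` and a made-up `/LargeView/` path (the actual parameter wasn't shown in the question):

```python
from urllib import robotparser

# A minimal robots.txt with a mixed-case Disallow rule.
rules = """User-agent: *
Disallow: /LargeView/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Matching is case-sensitive: only the exact-case path is blocked,
# so the lowercase variant would still be crawled.
print(rp.can_fetch("*", "http://example.com/LargeView/page"))  # False
print(rp.can_fetch("*", "http://example.com/largeview/page"))  # True
```

To block both spellings, each case variant needs its own Disallow line.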
But I can double-check afterwards to see if there's anything specific happening there.

"What are your recommendations for Freebase related to SEO?" I don't know. So I believe Freebase is essentially what we use for a large part of our Knowledge Graph information. So that's something that we compile, I believe, from various sources, and some of it can be updated by users as well. So if you find things in there that are completely wrong regarding your website, it might make sense to take a look at that and clean it up. But in general, I've seen very, very few issues where people have run across anything that would need to be updated there. But taking a look and double-checking that things are working might be an idea.

"In large e-commerce sites, sometimes content gets duplicated across multiple pages and canonicals are not set up. Should I expect serious search impact due to this? Will that impact only affect the product which has duplicate content, or will it have a wider impact?" Usually we'd see this kind of duplicate content as a technical issue that we'd try to solve on our side. We understand that this is a tough problem, and that for some sites there are tons of URLs that essentially show the same content. So that's not something that we'd hold against a website. What usually happens is, if we can recognize that the main page is identical, we'll fold those into one URL and index it like that. If we can recognize that parts of the page are identical, we'll index those pages separately and fold them together in the search results when someone is searching for something generic with regard to that duplicated section. So that's something where you wouldn't see your site ranking lower. You wouldn't see any kind of quality or webspam penalty because of that. It's essentially something technical that we'd try to solve on our side. If you can solve it on your side, then we don't have to worry about it, and that makes it easier for us.
So that's always a good thing. If you have a really large website with a lot of this duplication, then the duplication itself could cause technical issues, in the sense that we have to crawl all of these versions first before we can recognize the duplication. So if our crawling is causing problems for your website-- if we're crawling too much, and you know that you have this duplicate-content problem-- then I'd try to fix that duplicate-content problem on your side, so that we don't have to crawl all of these duplicates to see what's actually there. And especially with a large e-commerce site, as you mentioned, it's probably worth checking your server logs to see what we're actually crawling, and thinking about what you can do to simplify our crawling, which, in turn, makes it easier for us to follow your directions. So if you have specific ideas about which URLs should be findable in search, that's something you can control.

"About the visually impaired: is it better to have a separate version that people who need it can enter, or better to have only one version with a sort of generic optimization for visually impaired people?" That's a good question. I haven't looked into that in detail. Offhand, I would think that the easiest solution there would be something like a responsive design, where maybe these users can switch to a different CSS file that displays your content in a little bit of a different way. If you can't do that kind of responsive design, then I would recommend just making sure that you have the rel=canonical set up properly, so that it points to the preferred version that you'd like to have indexed-- probably the normal, default desktop version. That way, we can crawl both of these versions, we can see the rel=canonical, and if anyone links to that visually-impaired version, we can forward all of those links to your preferred version through the rel=canonical.
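As a sketch of that two-version setup (the URLs here are made up for illustration), the separate visually-impaired version would carry a rel=canonical in its head pointing at the preferred desktop page:

```html
<!-- On the visually-impaired variant, e.g. https://www.example.com/accessible/page
     (hypothetical URL), point to the preferred desktop version so that any
     links to this variant are consolidated onto the indexed page: -->
<link rel="canonical" href="https://www.example.com/page">
```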
It's a little bit easier with responsive design because you don't have to worry about individual URLs and the rel=canonicals and those kinds of things. You essentially have one URL that's relevant for all of these versions. But I don't know what specifically you'd change for visually impaired users. If you have any examples, I'd love to take a look, and maybe we can even include something like that in a blog post in the future.

Let's see, we're running kind of low on time. Do any of you have any questions to add in the meantime? No questions? Or is everyone accidentally muted? All right, then we'll plow through the rest of the questions and see how far we can get.

"Recently, our site got revoked from a partial penalty, and we got a confirmation last week via email and a message in Webmaster Tools. But still, after a week, we can still see the partial match in the Manual Action viewer. What needs to be done in a case like this?" What might be happening here is that part of this penalty got revoked and the other part is still there, or the penalty got updated in some way. If the message you received clearly said that this is completely cleaned up now, then what might have happened is that something just got stuck on our side. So I'd recommend sending me your URL. You can send that to me via Google+ in a private thread, for example, and I'll forward it to the webspam team so that they can double-check. I won't always be able to respond to your emails directly, but I do pass these on to the webspam team or the other teams internally, so that we can see what happened there and what we need to do.

"What do you suggest as the best practice for affiliate links? Does it make sense to use a redirect URL-- like example, slash go, slash site name-- and block it with robots.txt and nofollow? Will this ensure that no [INAUDIBLE]?" In general, for affiliate links, we recommend [INAUDIBLE]. Someone has an echo. That you?
Oh, you're here twice, Gary. In general, for affiliate links, we recommend making sure you're not forwarding PageRank. You can do that by using rel=nofollow on those links, if you want. You can also use a redirect on your side that's blocked by robots.txt, which can help out there as well. It depends on what you want to do. Some people like to use a redirect because then they can track it a little bit better. Other people just use rel=nofollow because it's easier to implement. It's essentially up to you. From our point of view, we just want to be sure that these affiliate links-- since there's a commercial link between the sites there-- don't forward any PageRank. So whichever version you choose is essentially up to you. For a lot of the more common affiliate systems, we recognize the situation automatically and essentially ignore the link there. But if you're setting up your own affiliate system, or if you're using a system that's not so commonly used, making sure that it doesn't pass PageRank is always a good idea.

"There are some files listed in my Sitemap that are blocked by my robots.txt, thanks to a plugin I use. Could this be a reason for bad page ranking?" No-- unless, of course, you're trying to rank the pages that are blocked by the robots.txt, which I assume is not the case here. So just because some URLs are blocked by the robots.txt doesn't mean that the site is low quality. Usually that's a sign that you're trying to pay attention and do the right thing. And if they happen to be listed in your Sitemap file, then from our point of view, that looks like a mistake that we might want to let you know about. But it's not something that we'd count against a site. And we'll still understand the rest of the URLs in your Sitemap file; we'll still use them to optimize our crawling of your site. So there shouldn't be any problem there.

"Regarding the manual action, it doesn't appear in my Webmaster Tools anymore.
Shall I proceed and disavow the low-quality links?" If you're aware of low-quality or bad, unnatural links to your site, then disavowing them is a good idea, I think. If you don't have a manual action, you don't know of anything problematic out there, and you know that you never hired an SEO who did crazy things in the past, then, from my point of view, you don't need to dig for those kinds of links. But if you see them, if you know that they're out there, if you know that a previous SEO did some shady things, then cleaning that up is always a good idea. I never want to hold you back from trying to clean things up.
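For reference, the disavow file mentioned here is just a plain-text list, one entry per line, uploaded through the Disavow Links tool in Webmaster Tools. The domains and URLs below are made up for illustration:

```
# Hypothetical entries for illustration only.
# A "domain:" line disavows links from that entire site.
domain:spammy-directory.example
# A bare URL disavows links from that single page.
https://article-network.example/low-quality-page.html
```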

AUDIENCE: Hey, John. I have another question here.

JOHN MUELLER: All right.

AUDIENCE: In the links to your site in Webmaster Tools, I have one site which was a secondary site to mine that I had once linked to my site in the navigation, so it showed hundreds of links back to this site. But it wasn't related, and I actually took down that site. So there's been no website there for like a year now. But it still shows up at the top of Links to Your Site as the second most linked site to that website. Should I still, at this point, be concerned with disavowing those links or disavowing any links that may be-- I removed a month or more ago? And why do you think that would still show up after the site's actually been removed for like a year?

JOHN MUELLER: A year seems like a long time. It seems like we should be able to update that data a little bit better-- maybe something got stuck there. So if you could send me the URLs, that would be useful to look at. In general, if those links don't exist anymore, if that site doesn't exist anymore, then you don't need to disavow them. Because if we try to re-crawl those pages, we'll see that they're gone, we'll take those links out of our link graph, and they'll essentially be irrelevant. So that's not something where you'd have any advantage from disavowing those links as well-- if a link is removed, then when we crawl it, we won't need to look at the disavow file, because it's already removed. But if you're seeing links there that are really, really old, that come from sites that have been removed for a year or longer, that seems more like a data-quality problem in Webmaster Tools, and something that we could look at with the Webmaster Tools team to make sure that we're garbage-collecting these old things that don't actually exist anymore.

AUDIENCE: OK, yeah, I can look that up. It has been a long time, and I was concerned because it had like 1,000 something links pointing in to the site that doesn't exist anymore.

JOHN MUELLER: Yeah. I mean, sometimes the counts can be a little bit misleading. If it's a site-wide link and we've crawled a lot of pages from a site, then we'll show a high count there. But that's not necessarily saying that this is the most important link to your site. I understand it can be confusing, though. And I think, from a data-quality point of view, Webmaster Tools shouldn't be showing things that are so-- maybe something got stuck there. Maybe that's something we can fix in our pipelines in general. For web search, these things usually clean up a lot faster and are not something that you'd have to worry about there. But sometimes the Webmaster Tools data, because of the way we aggregate it from search, can get stuck if we don't recognize that something actually disappeared from search. So maybe something like that is happening there. It would be good to check that out with the Webmaster Tools team directly.

AUDIENCE: Oh, so maybe if the site was taken down, rather than the links just removed, that might have made [INAUDIBLE].

JOHN MUELLER: I don't know. To me, if you're saying this site hasn't existed for over a year now, Webmaster Tools should be able to reflect that state. So I see that more as a data quality issue than something that would be affecting your site in search.

AUDIENCE: All right. I'll send that to you. Thanks.

JOHN MUELLER: OK. "My site got hacked and got a malware attack. I immediately cleaned up and requested, through Webmaster Tools, to remove the warning. And the warning was removed immediately, but rankings have dropped 10 times now for one year. What can I do?" In general, when a site is hacked with malware, that's not something that would affect your site's ranking. That's something where we just show a warning in the search results and in browsers; it wouldn't affect the ranking of your site. On the other hand, if your site gets hacked in a way that the hacker adds hidden content, hidden links, redirects to affiliate sites, or maybe includes content on your website that hosts, I don't know, affiliate or spammy content, then that's something that could affect your site's rankings. So you might want to double-check that this is actually completely cleaned up-- not just the malware side of things, but also the SEO hacking that sometimes happens. Another thing to keep in mind is that maybe this drop in rankings has nothing to do with the hacking at all, and is just a normal way that the algorithms have responded to your website in general. So if you're not sure whether your site still has some remains of the hacking-- maybe hosted pages, hidden links, those kinds of things-- or whether this is more of a general drop in rankings, going to one of the Webmaster forums and asking some peers to take a quick look can help a little bit. There are also some tools out there that try to recognize different kinds of hacking, which could give you a little bit more information as well. So, depending on how far you want to go, using these various tools to double-check that your site is actually completely clean and not hacked in other ways is a good idea-- as is getting feedback from peers about whether they can see remains of hacks, or maybe other issues that just generally affect your site.
"What happens when we point with a rel=canonical to a URL with a noindex, nofollow? Does Googlebot go crazy, or does it get stuck?" So if you point a rel=canonical at a page that has a noindex, it's hard to say exactly what will happen there. What might happen is that we just index the URL that has the rel=canonical instead of the page that has the noindex. So if there's a page on your site with a rel=canonical pointing to a different page, and the first page doesn't have a noindex but the one you're pointing at does, then maybe we'll just stick with the one that doesn't have the noindex. It's important to keep in mind that rel=canonical is a signal for us, not a directive. It's not like a 301 redirect. So it's something where maybe we'll index a page with a rel=canonical on it anyway and keep it like that, because we think the rel=canonical is wrong or broken or something like that.

All right. Wow, we've run out of time, and there are just a few questions left. Let me go through these really quickly, and maybe we'll be able to make it this time.

"Do you know if Trusted Stores is coming to the UK soon?" I don't have any information on that, sorry.

"In the last Hangout, I asked about a manual action regarding thin content. You were going to take a quick look. The reconsideration is at six weeks with no reply from Google." I'll double-check that. I'm not sure which site that was; I think I lost the connection to the other question with the URL. I'll see if I can find that again. "We placed a meta noindex, follow on all Geo pages, but the noindex is very slow to take effect. What else can we do to get the manual action lifted, or get further advice to meet Google's requirements?" Oh, OK-- here's the URL. The thing with thin content sometimes is that if you have a lot of pages with thin content, it's just going to take a lot of time for those to actually be re-crawled and re-indexed. And that's not something that you can trivially speed up.
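One concrete way to hint at re-crawling of pages like these is a Sitemap entry that carries an accurate last-modification date. A minimal sketch, with a hypothetical URL:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- Hypothetical page that was recently set to noindex -->
    <loc>https://www.example.com/geo/some-thin-page</loc>
    <lastmod>2014-05-20</lastmod>
  </url>
</urlset>
```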
So using something like a Sitemap file helps us. If you send a Sitemap file with the last-modification date set for those pages, then that can help us, if we can recognize that the lastmod date is correct. But to a large extent, it's just going to take quite a long time for us to actually re-crawl a very large site that has a lot of thin pages that are now noindexed. So I'd definitely see something in the range of three to six months as being normal for that to be cleaned up. And once that's cleaned up, I'm sure the Manual Actions team will be able to take that into account. They may even be able to take it into account a little bit earlier if they see that you've actually cleaned things up significantly, even if everything hasn't been processed completely yet.

"Should we expect a new Authorship option in Webmaster Tools?" At the moment, we only have a beta version in Labs. I'm not aware of that changing any time soon, so I'd use the beta version in Labs for the moment.

"If my site responded strongly to the Panda update, does that mean my content has problems when I'm being let off for them? If so, what can I do to ensure future updates don't hurt me again?" Usually, that means the content is something that we're seeing problems with. If this is a relatively new site, it might also just be that we're not sure where things are going to settle down, and it might take an iteration or two for things to really settle at the right level. But it depends on what specifically is happening there.

"Could an SEO work at Google someday?" Sure. We have people who have been SEOs, so don't let that hold you back. I think it's good to get information, to practice these things, to understand how websites work. Those are the kind of people that we like to have here at Google.

"A comprehensive PDF will [INAUDIBLE] the page perfectly, like usually [INAUDIBLE] an internal link will lead to the PDF.
Is Panda able to recognize its relevance and benefit for the user, even if it's password protected?" If it's password protected, we obviously can't crawl and index that PDF. If there's just a link to the PDF, then that's not something we're going to take into account that much. But if we can crawl and index that PDF and show it in search, then we'll try to do that. But as with other situations where linking to a good site seems like a good thing for users, it's not something where we'd say that just because there's a link to something good, this page is automatically good. I remember, way in the beginning, spammy websites would all link to Google and Yahoo and AltaVista in the footer, in the hope that, oh, this will make my website look authoritative-- because, look, I'm linking to these good sites here. But that doesn't really change the website overall. Just because you're linking to a good site doesn't make your content good. So first make your content good, and make sure that you're doing the right things for users, especially if these links are things that we can't actually follow.

All right, wow, I think we made it through. OK, so with that, thank you very much for your time. Thanks for joining, and thanks for all the questions. I hope to see you guys again in one of the future ones. I'll try to set up something where people can reserve four- or five-minute slots, and we can talk one-to-one about your specific site. Maybe that will help with some of these cases as well.

AUDIENCE: All right, thanks, John.

JOHN MUELLER: All right. Have a good weekend, everyone then.

AUDIENCE: Thanks, John.

JOHN MUELLER: Bye.