20. December 2006 · Comments Off · Categories: Google

Google is always controversial, and nowhere more than with their decision to index “all of the world’s information” and to put the content of ALL BOOKS online. That certainly is an ambitious project, and Google is known for being ambitious. But not everyone is happy with their vision, and alternative ideas are coming to the fore.

Google’s restrictions on its digital book copies stem in part from the company’s decision to scan copyrighted material without explicit permission. Google wants to ensure only small excerpts from the copyrighted material appear online – snippets that the company believes fall under “fair use” protections of U.S. law.

Since Google plans to make its own search engine the only one able to use the scanned books, the questions of fair practice take on greater significance.

The Open Content Alliance might have some answers. Their alternative approach is to scan only books for which they have written permission, or whose copyright has expired. That does seem to be a considerably more noble goal.

06. December 2006 · Comments Off · Categories: Google

Google has made it very easy to add a sitemap to their Google Sitemaps program. All you need to know are a few basics about their XML protocol, which is laid out here.

I was able to set up some rather minimalist Sitemaps which seem to have done the job. I used the “loc” tag to enter the URL of each page.

You can go as deep as you like with the program, even going so far as to indicate the desired crawl frequency and the desired “priority” for each page. Obviously, though, these are only suggestions, and Google can choose to ignore them quite easily.

For simplicity’s sake, I decided to just give them the location and let them decide the rest. After I observe what actually occurs, I’ll be in a better position to judge what the tweaks might mean for me. The program is very easy to use.

Here’s the Sitemap in XML for the crawlers that can use it. I know that MSN and Yahoo now support the format, so I’ll see if they pick up on it naturally, or if I have to submit it.
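For anyone who would rather generate the file than type it by hand, here is a minimal sketch in Python. It assumes the sitemaps.org 0.9 namespace that Google, Yahoo and MSN agreed on. The `build_sitemap` helper and the example.com URLs are made up for illustration and are not taken from my actual Sitemap. Only “loc” is required; the optional crawl frequency and priority hints are written out only when you supply them.

```python
import xml.etree.ElementTree as ET

# Namespace of the shared Sitemap protocol (sitemaps.org, version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return Sitemap XML (as a string) for a list of page dictionaries.

    Each dictionary needs a "loc" key; "changefreq" and "priority"
    are optional hints that the search engine is free to ignore.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        for tag in ("changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: the first is location-only, the second adds hints.
example_pages = [
    {"loc": "http://www.example.com/"},
    {"loc": "http://www.example.com/about.html",
     "changefreq": "weekly", "priority": 0.8},
]

if __name__ == "__main__":
    print(build_sitemap(example_pages))
```

The location-only entry mirrors the minimalist approach described above; the second entry shows where the extra tags would go if you decide to tweak later.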

29. November 2006 · Comments Off · Categories: Google

I’ve seen a lot of strange Google updates this year and this latest one is very weird. On Monday, a lot of new traffic came in, and now on a few select websites I see traffic way down. This is now completely typical to me. You can expect these violent swings in traffic, and it’s one way to keep you on your toes.

Updates like this are the reason I don’t make all my websites exactly the same. They’re also the reason I’m always trying to diversify my revenue from search engines as much as possible. If you don’t, you might end up holding the bag at the end of a nasty update gone bad.

I guess this is all just part of the game, so it’s nothing to get too excited about, but this time I’m immediately adding content to the affected website to see if I can hasten its return to the index. We’ll see how it goes.

20. November 2006 · Comments Off · Categories: Google

These Japanese researchers have come up with something interesting concerning how to determine proper names in Google SERPs.

The software tool, developed by researchers at the University of Tokyo in Japan, picks apart the results of a search engine query, identifying unique identities within these results. For example, it can tell the difference between Michael Jackson the pop singer and a travelling beer expert of the same name, who also appears on the first page of results produced by Google.

How does the software accomplish its task? Basically through a combination of semantic analysis and on-page analysis that checks the first 100 results from Google. The software examines the relationships between those results to determine who is actually meant by the name.

Another example of how the use of LSI can affect indexing.
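To make the idea concrete, here is a toy sketch in Python. To be clear, this is not the Tokyo researchers' method (which is only described at a high level above) and it is not LSI either; it is a crude word-overlap stand-in that I made up for illustration. It groups result snippets for the same name by how many context words they share, using Jaccard similarity. The snippets, the stop-word list and the 0.2 threshold are all assumptions.

```python
# A few common words to ignore; a real tool would use a much longer list.
STOP_WORDS = {"the", "a", "of", "and", "in", "is", "on", "his"}


def context_words(snippet, name):
    """Lower-cased words of a snippet, minus the name itself and stop words."""
    ignored = STOP_WORDS | set(name.lower().split()) | {""}
    return {word.strip(".,").lower() for word in snippet.split()} - ignored


def jaccard(a, b):
    """Share of words two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_results(snippets, name, threshold=0.2):
    """Greedily assign each snippet to the first group it overlaps enough with."""
    groups = []
    for snippet in snippets:
        words = context_words(snippet, name)
        for group in groups:
            if jaccard(words, group["words"]) >= threshold:
                group["words"] |= words
                group["snippets"].append(snippet)
                break
        else:
            groups.append({"words": set(words), "snippets": [snippet]})
    return [group["snippets"] for group in groups]


# Made-up snippets in the spirit of the Michael Jackson example.
example_snippets = [
    "Michael Jackson pop singer releases new album",
    "Michael Jackson beer expert reviews Belgian ales",
    "New album from pop singer Michael Jackson tops charts",
    "Beer expert Michael Jackson on Belgian ales and whisky",
]

if __name__ == "__main__":
    for group in group_results(example_snippets, "Michael Jackson"):
        print(group)
```

With these four snippets the singer results and the beer expert results end up in two separate groups, which is the basic behavior the real tool is said to achieve with far more sophisticated analysis.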

08. November 2006 · Comments Off · Categories: Google

If you haven’t heard the news yet, there’s some pretty big news you should read concerning changes made at Google.

Here’s where you can find the information:

I’ve also written an analysis about these radical changes, if you’re interested in reading.

Check it all out. These are major, watershed changes that will affect all AdSense publishers.

01. November 2006 · Comments Off · Categories: Google

I know it can change in minutes, but the results have been stable for a few days. I’m still noticing the PageRank for one of my sites strangely blinking on and off, but I’ve decided to ignore it. Odd occurrences are nothing new in Google.

Lately my main concern has been spider frequency. The more the spiders come in, I figure, the better off we all really are. This is almost always the truth. I can’t help but notice that pages don’t get spidered at all anymore if they sit in sections that don’t have many incoming links.

It’s a dog-eat-dog internet world these days, and it appears that you need deeplinks even more than you used to. I guess that comes as no surprise: if someone took the time to link to a content page, it’s probably a good indicator that the page is high-quality.
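A toy calculation shows why a deeplink can matter for an interior page. The sketch below uses the simplified, published form of PageRank (power iteration with a 0.85 damping factor); it is certainly not what Google actually runs today, and the five-page site and the `pagerank` helper are hypothetical. It compares the score of a deep page before and after an outside site links to it directly.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score everywhere.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical site: the home page links to a section, which links to two pages.
site = {
    "home": ["section"],
    "section": ["page_a", "page_b"],
    "page_a": ["home"],
    "page_b": ["home"],
    "external": ["home"],  # an outside site linking only to the home page
}

# Same site, but the outside site now also deeplinks to page_b.
site_with_deeplink = dict(site, external=["home", "page_b"])

if __name__ == "__main__":
    before = pagerank(site)["page_b"]
    after = pagerank(site_with_deeplink)["page_b"]
    print(f"page_b without deeplink: {before:.4f}")
    print(f"page_b with deeplink:    {after:.4f}")
```

In this toy graph the deep page scores higher once it has its own incoming link, while its sibling page, which still depends on the section page alone, does not benefit.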

It does appear that Google has restored some of their emphasis on relevance, which is always a good thing for me.

18. October 2006 · Comments Off · Categories: Google

Another strange Google update/whatever. This time I see a perfectly great blog of mine get completely wiped out. All backlinks to Blog Republic have been wiped out in Google, despite the fact that Yahoo shows 66,000 of them. The site has also been reduced to a PR 0.

I’m sure there’s no good explanation for this absurd behavior. I can’t think of any logical reasons, and I’m sure I’m not alone. Just one more example of how badly Google sucks. They can’t get one routine thing done anymore as it relates to their functions as a search engine.

I guess this would even be demotivational, if I really cared what they “thought” of my website in the first place. As it stands, I seriously suggest they fix their broken search engine, before someone builds a better one and kicks them in the ass once and for all.

12. October 2006 · Comments Off · Categories: Google

I’ve noticed for quite some time that Google is placing tons of forum posts made on vBulletin software into their supplemental search index. Undoubtedly vBulletin does create a lot of duplicate content by default, but even on websites that use robots.txt exclusions to limit Google’s crawling, the trouble is rampant.

I’m sure the things I’m noticing aren’t related to using vBulletin software as such. It’s just that forums tend to have a ton of interior pages, and many of those pages won’t have deeplinks to them. Even the internal linking structure of the website will make it tough to pass around much PR, so it’s understandable that as time goes on, more unimportant pages may go supplemental.

I’ve reviewed a number of sites where this has gone on for the past few months, and I haven’t seen the slightest improvement.
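If you want to sanity-check the kind of robots.txt exclusions mentioned above, Python's standard `urllib.robotparser` module can tell you which URLs a given set of rules blocks. This is only a sketch: the rules and the example.com forum URLs below are assumptions for illustration, not copied from any of the sites I reviewed, and blocking a URL from crawling says nothing by itself about what ends up in the supplemental index.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules that keep crawlers away from pages which tend to
# repeat content already shown on the main thread pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/printthread.php
Disallow: /forum/showpost.php
Disallow: /forum/member.php
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt rules forbid for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical forum URLs to test against the rules.
example_urls = [
    "http://www.example.com/forum/showthread.php?t=42",
    "http://www.example.com/forum/printthread.php?t=42",
    "http://www.example.com/forum/showpost.php?p=99",
]

if __name__ == "__main__":
    for url in blocked_urls(ROBOTS_TXT, example_urls):
        print("blocked:", url)
```

Here the main thread URL stays crawlable while the printable and single-post versions of the same content are excluded.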

11. October 2006 · Comments Off · Categories: Google

It’s been a few weeks in the making, but it looks like the most recent Google backlink and PR update has settled in. This one took a few weeks to run its course, and it has made quite a few changes in the SERPs.

I’d say the index has been very volatile lately. I’ve seen a return of pages that have been gone since 6/22/2005, and I’ve finally seen fixes for some long-overdue issues. This week has also seen the return of the crawlers in earnest, and some of the biggest deep crawl statistics I’ve seen in a while.

I assume that Google is finally completely on the new infrastructure they’ve been moving to for over a year. At least, I imagine that to be the case. We’ll see if the results continue to shift madly, or if everything settles down.

04. October 2006 · Comments Off · Categories: Google

It’s true: despite the number of times that people “in the know” have said “PR doesn’t matter anymore”, nothing really stops anyone’s fascination with the Little Green Bar. PageRank has been one of the greatest accomplishments in another form of PR (Public Relations) in the last 10 years. You take a proprietary formula and make it something that everyone’s interested in. That is the engine on which Google has built their business: free marketing through word-of-mouth.

The PageRank value is hopelessly outdated by the time it hits the Toolbar. It’s a “snapshot” that is many months out of date. For this reason, it’s very tough to use the Toolbar as a visual tool to help determine rankings. It’s better to consider it a form of entertainment. Does this stop it from being of utmost importance to some people? Not at all.

I actually don’t use the Toolbar anymore, but when the updates come, I still am curious to check. The current update is still rolling along, but should be complete soon.