Google now Indexing Flash Content

It seems Google are really on a mission to uncover the invisible web, announcing algorithm tweaks here and there to improve their coverage and search results.

Today, Google announced that they have begun indexing Flash files. Pretty big news really, and with the promise of executing JavaScript in the near future, one wonders why you should even optimise your website. At this rate, you might as well just sit back and wait for Google to solve your current issues. Knowing the way some companies work, this might actually be a more efficient strategy.

But it’s not quite perfect yet. There are a few issues that we still need to think about when using Flash on our websites.

First the limitations of Google
Google still can’t execute JavaScript, and when they can, it will most likely be clunky inline event handlers rather than sophisticated external onload functions. So if your Flash is embedded using JavaScript, you’re out of luck.
Google only includes content embedded in the original Flash file, not external movies or text (including HTML or XML) so again, you are probably out of luck if you are pulling in your content externally. Edit: External files are indexed by Google, but as individual files not related to the parent Flash file, similar to how iframes are indexed.
Google is trawling through your Flash file, looking for anything that makes sense and grabbing it. This may also apply to HTML websites that are built incorrectly, but you can bet they can work out an HTML website (semantics, basic information architecture, etc.) much more easily than a messy Flash file with the odd URL and text snippet here and there.
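To make the JavaScript limitation a bit more concrete, here is a rough Python sketch of what a crawler that never runs scripts would pick up from a page where the Flash is injected by JavaScript. This is only an illustration, not how Google's crawler actually works. The page markup, the `NoScriptCrawler` class, the `visible_text` helper and the `embedFlash` call are all made up for the example.

```python
from html.parser import HTMLParser


class NoScriptCrawler(HTMLParser):
    """Collects the text a crawler would see if it never executes JavaScript."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script bodies are skipped: they are never run, so nothing they
        # would inject (such as a Flash movie) ever reaches the crawler.
        if not self.in_script and data.strip():
            self.text.append(data.strip())


def visible_text(html):
    parser = NoScriptCrawler()
    parser.feed(html)
    parser.close()
    return parser.text


# Hypothetical page: flat HTML first, Flash swapped in by a script afterwards.
PAGE = """
<div id="content">
  <h1>Our products</h1>
  <p>Plain HTML fallback describing the products.</p>
</div>
<script>
  // embedFlash is an invented name standing in for a JavaScript embed call.
  embedFlash("products.swf", "content");
</script>
"""

print(visible_text(PAGE))
```

The only text collected is the flat HTML fallback. The movie that the script would have injected never enters the picture.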
Then, the limitations from our side
Flash isn’t always accessible. Most Flash developers are still unaware of accessibility options available through the new Flash working environments.
Flash is still a pain when you need to update content. You have to republish the movie unless you are working with convenient external data (e.g. XML, HTML) that is being pulled in by the movie. But hold on, now you aren’t getting indexed by Google anymore…
Flash is still overused. Just because we can use it doesn’t mean we should. It needs to be the right tool for the job, but I fear I can hear another splash page being built in the distance because “Google indexes Flash!”

I think Google are doing some interesting work in trying to uncover more and more information on the web, despite its current state. Soon HTML forms, JavaScript and embedded media (including Flash and video) may be semantically parsed and interpreted. We are far away from this being perfect, but I really see Google stepping up in this area lately.

I do find it entertaining, though, that with every enhancement Google makes in uncovering the invisible web, we as web developers lose a bit more of the argument arsenal we carry around in the name of web standards.

I do hope this is just a transitional, experimental stage of exploring the web. I would like to see Google uncover content that can’t be spidered, but then move on to the next stage and continue to promote best-practice accessibility, information architecture, web design and development.

Let’s cross our fingers and just watch how it all unfolds.

Comments

john Allsopp says: July 1, 2008 @ 7:30 pm

Wonder what percentage of the web’s indexable text content is in flash format? Wonder what percentage is microformatted?
I wonder if this is going to give new life to 100% flash based web sites?

john

Mike Busch says: July 1, 2008 @ 10:02 pm

It probably has a better chance of giving life to sites developed entirely in Flex.

Good point about Google’s enhancements diluting the argument for web standards. I definitely agree that they should (at some point) reward sites that use best practices and make Google’s algorithm sing.

Pat says: July 2, 2008 @ 10:15 am

Thanks for this post Scott.

“…I can hear another splash page being built in the distance because, “Google indexes Flash!””

Sadly, me too. This will definitely be one of those catch phrases that tear through agencyland like a wild fire. Just like “users don’t scroll” and “the 3 click rule”. *sigh*

With regard to full Flash sites, I don’t think this will have much of an impact yet. Most people building these sites have ‘progressed’ to the point of loading in sub-movies or content from external files and are probably using SWFObject to put it on the page (because these things are “best practice”). So none of it will be indexed yet.

Standardzilla says: July 2, 2008 @ 11:34 am

@john - there has been talk about that and the general consensus is that it will be great for Flash sites already out there, but really, building a site in Flash is not equivalent to building one in HTML. Meaning if you still want to rank number one, then forget all-Flash sites for now.

@mike - still a very contentious issue with me, but no, I figure they will not actively reward sites for best practice.

Patrick says: July 2, 2008 @ 1:58 pm

If Google gives me a decent “view as HTML” link for flash in search results then it works for me.

Andrea Hill says: July 3, 2008 @ 2:06 pm

I am still not sure about the whole “Google can’t execute JavaScript” disclaimer that went with this announcement. SWFObject is incredibly widely used - I’d argue by the people whose sites actually have good Flash with content worth indexing! Although perhaps this isn’t about how the Flash is embedded in the page so much as how the use of ExternalInterface may not be supported?
The ‘no-JS’ thing was one major caveat that I neglected to mention in my post about flash indexing yesterday. :(

We seem to have some conflicting information on external files, however. I understood that they would be indexed, but as separate linked files, not as embedded content. Which basically means your content would be indexed (good from a “site-ranking” perspective) but, if accessed via search results, not displayed within the Flash experience (horrible from a user perspective).

I’ll definitely be interested to see how bad the initial search results are! :)

Standardzilla says: July 3, 2008 @ 3:03 pm

@andrea - you are correct, my mistake in the post above. Google does index external files as you say, but they consider them to be separate files with no relationship to the initial Flash movie. So basically you are better off than before, but it’s still very messy. Iframes are spidered in the exact same way.

re: SWFObject, all of these JS implementations use flat HTML and then inject the Flash content afterwards with JS. So really, you shouldn’t want that Flash content indexed, as the HTML would have taken care of that. It would be interesting to see what happens when they can actually index both versions.

Maxine Sherrin says: July 4, 2008 @ 10:07 am

He he he ….. ever since I read this I have been sitting down here waiting for the guys in the agency to start talking about it. 3 days and counting :)

Cheryl says: July 4, 2008 @ 11:35 am

C’mon Max, you know it takes agency guys about 3 months to get the same news the rest of us do.

I bet they’ll start banging on about it in September…

Brad says: July 8, 2008 @ 4:53 am

From what I can gather, the only big difference between indexing swiffies and indexing Flash-authored comment tags (which Google has been doing for years now) is that Google can now follow the links embedded in the initial swf.

That’s a win for standards. Nothing has changed in Flash’s inability to climb out of the gutter of search results, but links from a swf can now be added to an HTML page’s PageRank.

trouwjurken says: March 4, 2009 @ 10:43 pm

Are there any examples of well-indexed flash sites yet?
