Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Improved Flash indexing

Monday, June 30, 2008 at 9:31 PM



We've received numerous requests to improve our indexing of Adobe Flash files. Today, Ron Adler and Janis Stipins—software engineers on our indexing team—will provide us with more in-depth information about our recent announcement that we've greatly improved our ability to index Flash.

Q: Which Flash files can Google better index now?
We've improved our ability to index textual content in SWF files of all kinds. This includes Flash "gadgets" such as buttons or menus, self-contained Flash websites, and everything in between.

Q: What content can Google better index from these Flash files?
All of the text that users can see as they interact with your Flash file. If your website contains Flash, the textual content in your Flash files can be used when Google generates a snippet for your website. Also, the words that appear in your Flash files can be used to match query terms in Google searches.

In addition to finding and indexing the textual content in Flash files, we're also discovering URLs that appear in Flash files, and feeding them into our crawling pipeline—just like we do with URLs that appear in non-Flash webpages. For example, if your Flash application contains links to pages inside your website, Google may now be better able to discover and crawl more of your website.

Q: What about non-textual content, such as images?
At present, we are only discovering and indexing textual content in Flash files. If your Flash files only include images, we will not recognize or index any text that may appear in those images. Similarly, we do not generate any anchor text for Flash buttons which target some URL, but which have no associated text.

Also note that we do not index FLV files, such as the videos that play on YouTube, because these files contain no text elements.

Q: How does Google "see" the contents of a Flash file?
We've developed an algorithm that explores Flash files in the same way that a person would, by clicking buttons, entering input, and so on. Our algorithm remembers all of the text that it encounters along the way, and that content is then available to be indexed. We can't tell you all of the proprietary details, but we can tell you that the algorithm's effectiveness was improved by utilizing Adobe's new Searchable SWF library.
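
As a rough illustration only (the proprietary details are not given here, and this is not Google's code), the idea of exploring a Flash file by "clicking" through it and remembering the text seen along the way can be sketched as a traversal over application states. The `APP` dictionary, its state names, and `collect_visible_text` are all hypothetical; a real SWF is not structured like this.

```python
from collections import deque

# Hypothetical model of a Flash application: each state lists the text a
# visitor would see and the states reachable by clicking a button.
# Illustration only; this is neither the SWF format nor Google's algorithm.
APP = {
    "intro": {"text": ["Loading", "Welcome"], "buttons": ["menu"]},
    "menu": {"text": ["Products", "Contact"], "buttons": ["products", "contact", "intro"]},
    "products": {"text": ["Our product line"], "buttons": ["menu"]},
    "contact": {"text": ["Write to us"], "buttons": ["menu"]},
}


def collect_visible_text(app, start):
    """Visit every state reachable from `start`, recording text in first-seen order."""
    seen_states = {start}
    queue = deque([start])
    collected = []
    while queue:
        state = queue.popleft()
        for line in app[state]["text"]:
            if line not in collected:
                collected.append(line)
        for target in app[state]["buttons"]:  # simulate clicking each button
            if target not in seen_states:
                seen_states.add(target)
                queue.append(target)
    return collected


print(collect_visible_text(APP, "intro"))
```

In this toy model, the "Loading" message is collected along with everything else, which is why less informative text can end up visible to an indexer.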

Q: What do I need to do to get Google to index the text in my Flash files?
Basically, you don't need to do anything. The improvements that we have made do not require any special action on the part of web designers or webmasters. If you have Flash content on your website, we will automatically begin to index it, up to the limits of our current technical ability (see next question).

That said, you should be aware that Google is now able to see the text that appears to visitors of your website. If you prefer Google to ignore your less informative content, such as a "copyright" or "loading" message, consider placing that text within an image, which will make it effectively invisible to us.

Q: What are the current technical limitations of Google's ability to index Flash?
There are three main limitations at present, and we are already working on resolving them:

1. Googlebot does not execute some types of JavaScript. So if your web page loads a Flash file via JavaScript, Google may not be aware of that Flash file, in which case it will not be indexed.
2. We currently do not attach content from external resources that are loaded by your Flash files. If your Flash file loads an HTML file, an XML file, another SWF file, etc., Google will separately index that resource, but it will not yet be considered to be part of the content in your Flash file.
3. While we are able to index Flash in almost all of the languages found on the web, currently there are difficulties with Flash content written in bidirectional languages. Until this is fixed, we will be unable to index Hebrew language or Arabic language content from Flash files.

We're already making progress on these issues, so stay tuned!

Update: Everyone, thanks for your great questions and feedback. Our focus is to improve search quality for all users, and with better Flash indexing we create more meaningful search results. Listed below, we’ve also answered some of the most prevalent questions. Thanks again!

Flash site in search results before improvements


Flash site after improved indexing, querying [nasa deep impact animation]


Helping us access and index your Flash files
@fintan: We verified with Adobe that the textual content from legacy sites, such as those scripted with AS1 and AS2, can be indexed by our new algorithm.

@andrew, jonny m, erichazann, mike, ledge, stu, rex, blog, dis: For our July 1st launch, we didn't enable Flash indexing for Flash files embedded via SWFObject. We're now rolling out an update that enables support for common JavaScript techniques for embedding Flash, including SWFObject and SWFObject2.

@mike: At this time, content loaded dynamically from resource files is not indexed. We’ve noted this feature request from several webmasters -- look for this in a near future update.

Interaction of HTML pages and Flash
@captain cuisine: The text found in Flash files is treated similarly to text found in other files, such as HTML, PDFs, etc. If the Flash file is embedded in HTML (as many of the Flash files we find are), its content is associated with the parent URL and indexed as a single entity.

@jeroen: Serving the same content in Flash and an alternate HTML version could cause us to find duplicate content. This won't cause a penalty -- we don’t lower a site in ranking because of duplicate content. Be aware, though, that search results will most likely only show one version, not both.

@All: We’re trying to serve users the most relevant results possible regardless of the file type. This means that standalone Flash, HTML with embedded Flash, HTML only, PDFs, etc., can all have the potential to be returned in search results.

Indexing large Flash files
@dsfdgsg: We’ve heard requests for deep linking (linking to specific content inside a file) not just for Flash results, but also for other large documents and presentations. In the case of Flash, the ability to deep link will require additional functionality in Flash with which we integrate.

@All: The majority of the existing Flash files on the web are fine in regard to filesize. It shouldn’t be too much of a concern.

More details about our Flash indexing algorithm
@brian, marcos, bharath: Regarding ActionScript, we’re able to find new links loaded through ActionScript. We explore Flash like a website visitor does; we do not decompile the SWF file. Unless you're making ActionScript visible to users, Google will not expose ActionScript code.

@dlocks: We respect rel="nofollow" wherever we encounter it in HTML.

180 comments:

Jennifer Mathews Somogyi said...

I saw this coming when Adobe made the PDF have recognizable text and then bought out Macromedia (including Flash), bringing recognizable text to Flash as well (though I have to admit I had trouble with clickable text buttons in Flash there for a while). Despite what my SEO colleagues say, I always believed that Flash could be optimized; it's just a matter of how (and when).
Keep it up!

Jeff Schiller said...

I know this is not related to Flash, but can someone point me in the direction of Google Folks I can talk to about indexing SVG textual content? Thanks,

Jeff Schiller
W3C SVG IG Chair
Email: codedread (using gmail)

npdoty said...

Should Flash websites which accept input of email addresses for newsletters expect some garbage input data each time they're crawled now?

Regarding making unimportant text into an image: is this recommended for HTML websites as well? (I assume that it isn't.) Could there be some way to mark up text in a Flash file to signify which text is important and which isn't? Will Google's algorithm start to learn which Flash text to ignore?

Andrew said...

Thanks for the info, this is exciting news. I have a couple questions:

"1. Googlebot does not execute some types of JavaScript. So if your web page loads a Flash file via JavaScript, Google may not be aware of that Flash file, in which case it will not be indexed."

--What about the commonly used SWFObject technique that you guys are now supporting? You can index flash loaded by that right?

"2. We currently do not attach content from external resources that are loaded by your Flash files. If your Flash file loads an HTML file, an XML file, another SWF file, etc., Google will separately index that resource, but it will not yet be considered to be part of the content in your Flash file."

--Wait, you aren't indexing content loaded by XML files? Isn't that like all the content worth crawling? Large Flash sites rely on this for ALL their content, can you elaborate further?

http://search-engines-web.com/ said...

When one considers that Text is vector and crosses all fonts - this mashup technology should have come along YEARS ago.

It just appears that no one felt it would deliver an ROI to research and implement - so Webmasters had to compensate by using the NOEMBED or NOSCRIPT or ALT tags


This also may have been the reason some Webmasters were using Hidden Text (although against Google's guidelines)

But one had to choose between aesthetics and SEO - so either give up on one or use strategies to compensate for the limitations in search technology

Samiq said...

pingback from samiqbits

[Google talks about how the Flash indexing will work]

dotcompals said...

Thank you google. Thumbs up.

http://www.grrajeshkumar.com said...

This is quite a relief to several design artists. My worry is that clients as well as creative artists should not start building websites full of Flash intros.

Even if it isn't for SEO, at least for better user experience.

beussery said...

Do sites using SWFObject for example, with content in "hidden" divs or other need to go back and remove this content?

Is this type of content now considered duplicate content?

Thanks for your time and this great news!

-Brian Ussery

Jeroen said...

This is good news indeed. But in the past, when a client requested a Flash-only website, I created an HTML version as well for search engines. Will these sites now be penalized for duplicate content? And what results will be shown to visitors?

Jonny M said...

I echo andrew's comments!

Andrea Vit said...

Wow, Google indexes Flash websites... And what about W3C principles and accessibility?

Thorsten said...

And what about the indexing of Silverlight-"documents"?
Questions over questions...

David Hulbert said...

"If you prefer Google to ignore your less informative content, such as a "copyright" or "loading" message, consider placing that text within an image, which will make it effectively invisible to us."

Nooo! Please, Google, don't tell people to do this. How are blind people supposed to read this? How can images be translated or reliably magnified?

Olaf Lederer said...

It would be nice to have a tool that shows what Google gets while indexing a Flash file. That way it's possible to decide whether we continue with SWFObject (using alternative content) or switch to the new method.

Preeti said...

This will mean that we can actually now have more photo galleries and Google can crawl them :)

ezuk said...

Oh, no Hebrew... :( And I was getting all excited.

PingaTM ePages - - said...

Hi Google

If this is a genuinely useful function that indexes SWF files along with their text and links, then it will make things much easier for SEOs as well as designers.

I could not find when this will go live.

Thanks

Raj
seo:http://www.pingasolutions.com

Theo said...

Why do you bother?

It is just as meaningless for Google to index SWF files as it would be for Google Desktop to index EXE files.

There are a number of reasons why what you do is meaningless for all but the most basic Flash websites:

1) Most Flash websites are embedded using JavaScript. I would go so far as to say that the overwhelming majority of Flash sites are embedded using JavaScript, mostly because that is the default behaviour of the Flash authoring environment. Unless you execute that JavaScript you will not discover that a page contains an embedded SWF. You will not even get so far as to discover that there is something there to index.

2) A significant percentage of Flash sites for various reasons use a bootstrap SWF which loads the main SWF. Unless you execute the first SWF you will not find the actual SWF. This means you will see no content at all except perhaps "Loading...".

3) A significant percentage of Flash sites load their text content dynamically in the form of XML files. Unless you execute the SWF you will not index that content. Again, you will see very little content.

4) How would you make sense of this:

myTextField.text = "Hello " + (Math.random() < .5 ? "world" : "Google");

without executing the SWF? How would you make out any meaningful structure from a file format that not only can, but almost always is manipulated by scripts? How do you know in which order things will be presented to the user if you don't execute the SWF?

These are just a few things off the top of my head. This is by no means a simple problem, and definitely nothing that you can solve by indexing the SWF file format. If I were you I would stop spreading the kind of misinformation that you do and instead start a dialogue with us, the Flash developers, to find a real solution, a solution that actually works in the real world for real Flash websites.

Shital Jethva said...

That's great news.

Brian said...

My understanding of "2. We currently do not attach content from external resources that are loaded by your Flash files. If your Flash file loads an HTML file, an XML file, another SWF file, etc., Google will separately index that resource, but it will not yet be considered to be part of the content in your Flash file."

is that Google WILL index the XML content but it will treat it as a separate page of the site and not part of the Flash content it finds.

Jason said...

This is neat, but it reduces the places where we can legitimately obscure piecemeal content from Google.

For example, if we had a categorized list of our content included on all of our pages (say a nav bar), then Google would see that list as part of the content for every page of our site. So we can legitimately hide it to fight against duplicating content and improving relevancy by putting it into JavaScript or Flash ... now it's just JavaScript and I'm sure it's only a matter of time before that's no longer an option.

We need some sort of way to tell Google that parts of the page are not actually relevant to the core content within.

Old School reference, but how about earmuffs? ;) A div class or an HTML element itself?

Anyway, kudos and great work!

Brian said...

I really hope this doesn't take us back to the days of out of control all Flash sites with Skip Intro pages.

wLinkin said...

This is great news! By the way, for fun, do a google search for "loading filetype:swf". ;)

Theo said...

Brian:

The key part of the quote is "it will not yet be considered to be part of the content in your Flash file".

That means that any searches that would match the content in the XML wouldn't match the SWF, so searchers would still not find the site.

There is also another problem with indexing the XML loaded by a Flash site: XML has no given semantic structure and any XML found will not have any meaning unless it's loaded, parsed, interpreted and displayed by executing the SWF.

XML by itself is just a random collection of words. It could contain HTML-tagged text, or it could be a configuration file; how would you know?

dsfdgsg said...

• Will I be able to disable this feature in my search preferences?

I tend to avoid Flash-based sites and I'm worried this new ability will "spam" search results with content I don't want to see.


• How will you link to content inside "100% Flash" websites?

Are the days of linking directly to relevant content over? Will users have to click through intros and Flash menus to get to the subpage displayed inside a Flash animation?

Pre said...

I agree with Theo's comments. However, I feel that this is the start of more complex indexing processes. This is a great improvement from where we were.

My thoughts are that the use of flash in websites will grow to deliver multi-media content such as videos, audio, images etc. More effort should be placed in tying such content in relation to content of the page.

Brian said...

Theo:

I see your point. Even if Google indexes the XML file, it's not useful information until it interacts with the Actionscript in the SWF file.

Between this and the Javascript problem, it means that the content in most professional level SWF files still won't be indexed.

I'd really like to hear some more from Google on these subjects.

shade said...

1. I don't understand why they have to execute javascript to be able to index swf content though. It seems to me that most javascript embed patterns have a pretty clearly identifiable "filename.swf" string appearing somewhere in an object property or function call. Couldn't they easily parse all script elements for any .swf referenced files and then parse those, assuming that those swf files will in fact get shown to a non-googlebot browser?

Is that perfect? NO! But it would go a long way toward making their indexing feature actually be useful. At this point, it seems like this will be pretty moot.

2. What about Flash SWFs (AS3) that have things like captions that appear over and in conjunction with playing FLV video? This "content" is inside the SWF player (well, technically it's probably in a loaded XML file, not the FLV). I wonder if that kind of content can be indexed?

mike said...

Well Google. Marks for trying. But as usual, you fall short. The development is inadequately explained to help us [your policy of non-transparency is at odds with your mission of do no evil] and then to cap it all, you don't answer any questions that are posed as a result. The exceptions you posit for the indexing of flash mean that this development would seem to essentially be 'no change'.

I'm surprised you didn't title this article 'Googlebot's Got Eyes'. That seems to be about the normal reading age level of your public statements. [http://www.blog.zoozoom.com/press/] Or perhaps you could have used some anonymous semi-official whomever [Google Guy] to write about it in an even less helpful way.

You really are unhelpful and self-interested Google.

Nathaniel Stott said...

Excellent work, we need more visibility of content online, in whatever form it may be published! Text is and will remain the essence of the internet.

What I am missing, however, is the integration with Web Analytics reporting tools. How will Google Analytics pick up interaction and "navigation" within a Flash site?

Perhaps this is the next step?

dis said...

It's great that Google is bothering to do this, but when they say "Googlebot does not execute some types of JavaScript" this is worse than useless. "Some"? Which some? Specifically, does it know what SWFObject is doing or not? SWFObject is probably the most often used script to embed Flash files, and it's even distributed through Google's code site, so Google couldn't possibly have done all this work and not know exactly what it is. I don't mean to get all negative, but how hard is it to state "It works with SWFObject" or "It does not work with SWFObject".

This is especially important because SEO is currently done in conjunction with SWFObject by presuming that search bots are reading the HTML page and NOT the Flash, while users get the actual Flash content. Is Google now also indexing some random bits and pieces from the Flash as well?

To dsfdgsg, 100% Flash sites can link to sub-pages through the use of the hash in the URL, e.g. mysite.com/flash.html#subpage. The world would be a better place if this were implemented on more Flash sites, and an even better place if Adobe could work with Google and other search engines to develop a standard recommended way for Flash devs to do this so that search engines could index the separate pages.
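
A minimal sketch of the hash-based deep link described in this comment, using Python's standard `urllib.parse.urldefrag`. The `split_state` helper and the example URLs are hypothetical; the point is only that the part after `#` can name a state inside the Flash site, while anything that drops the fragment sees a single page.

```python
from urllib.parse import urldefrag


def split_state(url):
    """Split a deep link into the embedding page and the in-Flash state name.

    Returns (page_url, state), where state is None if there is no fragment.
    """
    page_url, fragment = urldefrag(url)
    return page_url, fragment or None


# Two deep links into the same (hypothetical) Flash site.
first = split_state("http://mysite.com/flash.html#subpage")
second = split_state("http://mysite.com/flash.html#contact")

print(first)   # page URL plus the state the SWF should open
print(second)
print(first[0] == second[0])  # both links share one embedding page
```

If an indexer keeps only the page URL, both links collapse into the same entry, so the fragment helps visitors reach a state but does not by itself give each state a separately indexable address.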

Arul Prasad said...

>>Also note that we do not index FLV files, such as the videos that play on YouTube, because these files contain no text elements.

Technically, an FLV file can have text metadata. I'm not sure if people use it, probably because there is no driving factor now, but if Google starts indexing that metadata, we'll see more videos have metadata injected in. Think about YouTube videos all having metadata that details the title, author, a link back to the video's YouTube page, etc. If the Google web search results page had a way to play an FLV file, the metadata would help in indexing videos directly.

John said...

Most of our sites use flash menu text in the nav bar -- then we have a bottom menu of navigation copy as well.

Now that google reads flash, will it be too keyword heavy to include both the top nav and bottom nav on the site?

Marcos said...

From what I've read, this is just the beginning of what may be the solution for rich content developers who would also like to be able to SEO their work.

However, I feel it is still not time to celebrate, at least when it comes to dynamically created Flash content.

Most of what I do in Flash is Database and script based and the SWF contains just the code to make things happen. So, no cigar yet.
Also, 100% of all the Flash sites I do are loaded via JavaScript.

An idea: if the Flash content can be indexed and there's not a lot of fast changing dynamic text in the site, maybe a hidden scene with a copy of the text - or maybe just keywords - would help the site SEOwise.

A question: will Google Bot expose our ActionScripts?

Theo said...

shade (about why not just look for "filename.swf"):

if google looked for patterns like *.swf or *.pdf all over the place it would get an enormous amount of false positives, how would it discern a file that is embedded from a filename that is mentioned in a blog post? should all *.* patterns be treated as links?

also, not all URLs have file suffixes, dynamically generated URLs usually don't.
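
The trade-off in this exchange (scanning page source for "filename.swf" strings versus the risk of false positives and suffix-less URLs) can be illustrated with a small sketch. The regular expression, the `find_swf_candidates` helper, and the sample strings are assumptions made for illustration; nothing here reflects how Googlebot actually works.

```python
import re

# Naive pattern: any quoted string that ends in ".swf".
SWF_REFERENCE = re.compile(r"""["']([^"']+\.swf)["']""")


def find_swf_candidates(page_source):
    """Return quoted strings ending in .swf, without executing any script."""
    return SWF_REFERENCE.findall(page_source)


# Sample inputs (hypothetical page fragments).
embed_script = '<script>swfobject.embedSWF("site.swf", "content", "800", "600", "9.0.0");</script>'
blog_text = "<p>I renamed 'old_banner.swf' yesterday.</p>"
dynamic_embed = '<script>swfobject.embedSWF("movie.php?id=3", "content", "800", "600", "9.0.0");</script>'

print(find_swf_candidates(embed_script))   # a real embed is found
print(find_swf_candidates(blog_text))      # a mere mention is found too (false positive)
print(find_swf_candidates(dynamic_embed))  # an embed without a .swf suffix is missed
```

The second and third cases correspond to the two objections raised here: a filename that is only mentioned looks the same as one that is embedded, and dynamically generated URLs may carry no suffix at all.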

Theo said...

>> A question: will Google Bot expose our ActionScripts?

Answer: no. When you export a SWF the ActionScript code is compiled, i.e. turned into executable code. What the Google spider (at least claims to) do is that it runs this executable code, just like Flash Player would.

WebSite Design Orange County said...

Great news for the design community. But it seems like in the short term adding in all the Flash sites to the SERPs will cause a massive fluctuation in the rankings.

Inzearchfox said...

Great news. Keep up the good work!

Scumola said...

This is just the same thing that http://mediawombat.com has been doing for a long while now.

Shamus said...

Thirding the SWFObject question.

dleeg910 said...

Does this work the same for a Flex web application, since it is basically a SWF file?

Illah said...

Forgive me for saying so, but this announcement falls a little flat IMO... SWFs seem to have been indexed to some degree for years now, and this seems to be more of a baby-step than a huge announcement. Also the issues surrounding loading text from external sources - which represent pretty much all Flash sites besides designer portfolio sites - means that this will not change much in the way of getting quality Flash sites indexed well.

Don't get me wrong, I love the progress that's being made, but I'm just saying... :-)

Brad said...

I've understood that Google can index Flash for some time now. How is this different?

More importantly, does this have any positive effect on Flash's core indexing problem, its lack of granularity? Indexing every piece of text from a SWF still won't crack the top hundred results when pitted against HTML pages with concise, relevant content.

Webmonkey-in-Ireland said...

I really hope this doesn't excite the blackhat SEO community.

I have visions of keyword stuffed 1 pixel SWF files, now corrupting results.

In my own opinion with SWFObject allowing alternative content in place of flash, this forced developers to properly decide when flash should be used for text, and when it shouldn't i.e. not for expanses of text.

benlong said...

Will flash written out via noscript be indexed?

Thanks.

fintan said...

will this work for legacy sites (i.e., those scripted with AS1 and AS2) or is it only relevant to Actionscript 3 sites?

erichazann said...

Theo: Did you read the Adobe release? It clearly specifies that Google has been provided with technology to execute and run the Flash files to discover all the application "states", ie pages. Therefore, they *are* executing the files. They are not just dredging through the file format to discover text in the file, they are simulating user interaction.

However, the one type of execution they are not currently indexing, XML content feed-in, is a major drawback. But at least they are aware of it and working to improve on that.

All: I think it is safe to assume that SWFs loaded via SWFObject will be indexed, since Google is now hosting/sponsoring that project.

Dis: fragment URLs and deep-linking don't solve anything as Google ignores fragments.

What we really need to know is how Google is going to prioritize the content it finds by simulating user navigation of the Flash site. From the Adobe release, it seems as if it dumps *all* the content it finds back into the container HTML page. That is totally going to dilute relevancy. Imagine having your whole Flash site indexed as one page.

Secondly, it would be nice to know if Google will expand to develop a system that recognizes (with some spam/dupe content filters) application states as separate pages/URLs, and if so how would it work (both externally and internally w/i the application)? Externally, with the fragment URL & deep-linking? Internally, how would you designate a link to a new "page" (for the spider) that would not be an external URL that would load in the browser and ruin the Flash experience.

Adobe suggests multiple HTML entry pages each loading a different state of the application to trigger a certain "page" and focus relevancy. But no matter where the spider enters, if it moves through the Flash and indexes as much as it can, it will eventually dump all that content back to the container HTML, which will, again, dilute the focus. What sort of "recursion" duplicate content checks are built into indexing? Is the first state/"page" it finds given more priority for a given URL? And if so, yes, there would be a problem with bootstrapping SWFs providing no primary content, and then the spider considering additional states as secondary content.

erichazann said...

Brad: Google could extract links and text from a Flash file, but it could not execute every application state to discover all the text in the proper context. Adobe has given them the technology to run the Flash files. Also.. Google will extract the anchor text from buttons and associate it with external links!

Brad said...

@erichazann: Great explanation of the REAL search limitations of Flash.

Dlocks said...

[quote]
In addition to finding and indexing the textual content in Flash files, we're also discovering URLs that appear in Flash files, and feeding them into our crawling pipeline—just like we do with URLs that appear in non-Flash webpages.
[/quote]

And how do we use 'rel=nofollow' in the embed code for these kind of URLs?

Brad said...

@erichazann: Linking out. yes. So now the HTML that a SWF links to will gain even more page rank.

That just means that there will be even tougher competition for a Flash site vs. all of the HTML blog posts, articles, forums, and comments that overload any keywords of broad interest.

mike said...

@erichazann Your comments are really helpful. Many thanks.

My question to Google: why aren't you answering any of these questions? $16 billion in income and you're letting us answer our own questions. You're on to a great thing here, Google. Say very little, let the rest of us work it out for ourselves, sell some text ads whilst we're all trying to figure out what you've said. Why can't you make a comprehensive statement and then back that up with the resources to answer the questions it generates?

And regards SWFObject, why can't they just give a straight answer.

Google you are worse than a politician.

Patrick said...

the lack of XML or HTML being pulled in is going to be a major hold up for quite a while as most "good" flash sites interface with some outside data source for all the relevant flashy reasons.

hopefully they'll fix that and it will lead to better flash content instead of not fixing it and it leading to GIANT flash files with all text embedded in the swf.

dan said...

Thank you!

( * v * ) said...

It seems that for this to really change how developers develop, it would need to be possible for the user to not just get results for .swf files, but to get to that content directly by clicking the link. Just like in an HTML site.

If you search for a term that appears deep inside an all-Flash site, the link that appears is presumably the embed page for the whole .swf.

The user would want to click on the link in the search results and go to that state within the flash site to see the content they searched for. I would need to be able to associate a URL with a state within the overall swf so that when the user goes to the single html page that embeds my swf, the swf will default to show that content.

It's one thing to index, it's another thing to make that content available to the user.

If this ever turns out to be possible, I think we will see a return to entire websites built in Flash.

ryan said...

What we really need is a way to effectively block text from being searchable in our flash files. This is easily the most useless piece of technology ever introduced by Google, because taking simple text information out of the context of the timeline, layout, and programmatic logic and trying to generate a relevant search result from that is ridiculous at best.

On top of that, it removes an inherent layer of security flash developers had in ensuring text set in flash would not be readable by potential spammers.

Further, it makes already ridiculous search results generated by tons of irrelevant out-of-context words and phrases (because search engines still aren't smart enough to "get" context and cannot make value judgements like humans can) even more bloated with all the text in Flash files that was designed with the intent that it would never be spidered by a search engine.

Ridiculous waste, and it's going to have nothing but a massively negative effect on the internet. Well done! (There, with that last statement a search engine will obviously list this comment as a positive one.)

Marcelo Lienlaf Pincheira said...

Thank you, Google. It is very good that Google reads Flash...

Marcelo Lienlaf Pincheira said...

Thank you for this great contribution to webmasters. It is very good for promoting the development of new and better sites, now that this is strengthening what Web 2.0 is.

bharath said...

This changes a lot of things and how SEO marketers see Flash based content.
BTW, does Google recognize the links loaded through ActionScript? I think that's the only way to include links inside Flash.
How far can Google index into a programming language like ActionScript, while we still have problems with indexing JavaScript?

Nisse J. Krenchel said...

Will it be possible for you to show information on how widespread the use of flash is around the internet - and the development over time?

icommercepage said...

Thank you, Google and all of Adobe's staff. Finally our site shows on the top Google page.

Adelaide SEO said...

why even worry about flash?

http://www.duivesteyn.com.au
http://www.professionalcomputing.net

Molokoloco said...

I hope my Flash sites won't appear directly in search...
Most of them aren't meant to be displayed at 100%, but at the size I decide in the embed code.
And how does Google treat the associated flashvars? (Like language, allow fullscreen, script access...)

This seems to be a very hard challenge...

Web Hosting Service Provider said...

Thanks for this happening..
How much aggregated content value on flash it is fine or unexpected?

Handy said...

This is really good news, even though I prefer not to browse any Flash site, because it's too heavy on bandwidth.

I like Google's interactive CSS better, like what you did in Analytics.

Handy said...

That's great news... but I never liked Flash websites, as they are heavy to load.

ledge said...

Please start allowing the googlebot to see and execute JavaScript. While the Flash search is great, you are missing out on 99% of the Flash websites out there by ignoring JavaScript that is used to embed the flash on the page.

http://thewarp.org/blog/index.cfm/2008/7/1/Adobe-works-with-search-providers-to-improve-Flash-indexing

ledge said...

Since swfObject also makes use of JavaScript it is doubtful that Google will make an exception for that. Currently they ignore all script tags.

World Wide SEO said...

So at last, after years of collaborative hard work and dedication, Google and Macromedia have settled down to resolve the issue of indexing Flash-based sites.

Well, at least for me it's good news. Thanks for sharing.

Jeremy said...

When I create a Flash site (which is rare), I usually convert all the text to outlines so I don't have to worry about fonts matching up. This renders the font as essentially vector shapes, rather than editable text.

Will Google be able to index this type of content, or will it still need to be in an editable format?

Pete said...

@andrew Excellent questions

wereweedoncewas said...

Hi all,
Is it possible to put an Adsense box at the end of an individual blog post? Sorry to ask a question unrelated to the topic, I'm a newbie. Forgive my naivety....

wereweedoncewas said...

Also, how do I add a description meta tag??? Again, sorry to be a pest, but if anyone could help that would be great.
Thanks.

Stu said...

It would be laughable if, after all of this time, Google finally came up with a solution for indexing flash and then completely ignored SWFObject. Every flash designer worth his salt knows that SWFObject is the only true way to embed flash into a webpage. It's also been leaked that Adobe are going to integrate it directly into Flash CS4.

If Google haven't accounted for this they are effectively kissing off 99.9% of all the decent flash websites out there. In fact, they would have been better off staying quiet about this recent development.

So let's just give them the benefit of the doubt on this one :)

mikey said...

hm..... do they realize that the newer versions of flash use Javascript to load the module?

This is the current standard because of IE's attempt at making Flash files difficult to use...

So, basically, what they're saying is that they can't index anything that's come out in the past 2 years...

ledge said...

Stu, that is in essence what they are still doing. From the above:

"Googlebot does not execute some types of JavaScript. So if your web page loads a Flash file via JavaScript, Google may not be aware of that Flash file, in which case it will not be indexed."

Flash has always been ignored because of this. SwfObject uses JS to function so it falls into the same boat.

ryan said...

"Google will not index files loaded by Javascript"

Great! So that means old flash content that was designed entirely under the assumption that it would never be indexed is going to be the only content that gets indexed. This gets better and better.

Ryan Miller said...

I think we may be missing the most important question. So Google finds that a scene within my SWF relates to "Beetle Juicers"...where do they return the searcher? To my home page? To some arbitrary point within the SWF? How exactly is that going to be for their search result quality and for the consumer's experience on my site?

mike said...

Still no answers from Google.

Can't you afford to pay someone to answer the questions your own press release has raised?

Oh, I forgot, you're having those discussions in private and in person with major adwords users. They get the inside skinny, can ask questions and can generally feel listened to and in some sort of process with you. The rest of us, we have to ask 'is Google Guy Matt Cutts?' and get no answer, let alone get an answer to a simple technical question.

Your non-transparency and refusal to answer simple questions whilst we know you are talking to major players and listening to them tells us you are incapable of 'doing no evil'. You're the same as all the others. You're just there to make your buck however you see fit.

I can't believe people have thanked you in these comments. I have no idea why.

Susan Moskwa said...

Hi Mike,
Actually, we think you guys have raised a lot of great questions and we're working on an addendum to try to answer them; so stay tuned.

mike said...

@susan moskwa. About time.

Will we know the job title and relationship of the person who answers our questions to the people who implement any indexing of flash? Or will we have to assume that if they have 'I work for the Google' on their blogger profile they must be an official representative of Google?

Will they engage in a conversation? Will they make policy transparent? Do you even have policy at Google, I mean any policies, with respect to the inclusion of assets in your index? Will the reading age be above that of previous posts you have made on this issue?

[Note: please see this post, the most up-to-date statement I can find from Google on how they view Flash. It suggests web developers should use an animated gif over Flash and suggests Google has difficulty with Flash because 'Googlebot's got no eyes'. I am not 5 years old. What on Earth does this mean? http://googlewebmastercentral.blogspot.com/2007/07/best-uses-of-flash.html]

Or will it be an anonymous voice that gives a set answer and then disappears leaving as many questions as are raised?

Google states 'You can make money without doing evil.' [http://www.google.com/corporate/tenthings.html] and it may be true. But Google's absolute failure to build any transparency or accountability into its culture means this is highly unlikely, if not impossible by definition.

searchtools said...

"Googlebot does not execute some types of JavaScript."

That means they do execute *some other* types. I think they should clarify, but I suspect that they will be pretty reasonable.

The Adobe Q&A says "The improved SWF search also includes the capability to load and access remote data like XML calls and loaded SWFs."

(http://www.adobe.com/devnet/flashplayer/articles/swf_searchability.html)

I'm learning a lot from people's questions, but I think this discussion would be more useful for everyone if we use concrete examples rather than general statements.

For example, if Google indexes an xml file, it will probably index all the text between the tags. I'm not sure about attributes. So, if you get a request referred by Google for that .xml file, you could do whatever provides your site visitors with the information they seem to be looking for: deep link into the Flash, provide an HTML landing page, whatever.
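
As a rough sketch of that idea (not anything Google or Adobe has documented), a server-side check could look at the Referer header and send search-referred hits on a raw data file to a page meant for people. The `LANDING_PAGES` mapping, the file paths, and the function name here are all made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical mapping from raw data files to the pages that present them.
LANDING_PAGES = {
    "/data/products.xml": "/products.html",
}


def landing_page_for(path, referer):
    """Return a redirect target for Google-referred hits on a raw data file.

    Returns None when the request should be served as usual.
    """
    # A missing Referer header is treated as "not from a search engine".
    host = (urlparse(referer or "").hostname or "").lower()
    from_google = host == "google.com" or host.endswith(".google.com")
    if from_google and path in LANDING_PAGES:
        return LANDING_PAGES[path]
    return None
```

The Flash file itself would still load the .xml directly, because those requests do not carry a Google referrer, so only visitors arriving from a search result get pointed at the landing page.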

ryan said...

Can you post instructions on how we can block google from indexing flash files on our server please? Also can you post how we can redirect google to only index files that we have prepared/optimized for a search engine, such as content manifest XML so that we can help make google results more relevant without giving carte blanche access to every individual file on our servers in the form of search results?

Josh said...

Thanks for the improvements! Keep on enhancing, this is a great stride forward.

mike said...

@searchtools. I agree, more specific discussion would be very useful. I assume it's as frustrating for others as it is for me to post specific discussion when you learn over time it will not get answered specifically. So I suppose others, like me, have given up. Google 'can't tell you[us] all of the proprietary details', which seems to me to say, we can't answer anything specifically. I think this is to protect their competitive advantage. With an increasingly monopolistic position in the search marketplace, this lack of transparency seems at odds with their value 'do no evil'. I suppose it might also be to prevent search engine index spamming. Spamming could be prevented in other ways (a human filter for example). It seems sad to me that the cost of being transparent outweighs the benefit of non-transparency and it also seems at odds with 'do no evil'.

Some specific questions.

Question: What formats of swf files are compatible (flash authoring tool version, actionscript version) with Googlebot? [What kind of eyes does it have, can it see in color, or just black and white?]

Regards: 'We currently do not attach content from external resources that are loaded by your Flash files. If your Flash file loads an HTML file, an XML file, another SWF file, etc., Google will separately index that resource, but it will not yet be considered to be part of the content in your Flash file.'

My understanding of flash development is in keeping with many of the voices here and I believe indexing any of the files loaded into a swf will often lead to a terrible user experience if they are ever followed from a search results page.

Question: Will content from external resources that are loaded by Flash files appear in the Google index as a direct link to the file itself?
(i.e. Finding a link to a swf as a search result rather than a link to the page URL of the Flash file that loads it.)

Question: If they will, what do we do to prevent Google indexing content from external resources that are loaded by Flash files as a direct link to the file itself?
(i.e. Finding a link to a swf as a search result rather than a link to the page URL of the Flash file that loads it.)

Question: Do you index swf files embedded with SWFObject?

Question: How long roughly until you can attach content from external resources that are loaded by Flash files to the URL it is loaded at? Is it your intention to head to this specific goal?

Question: Will you be able to recognize 'deep-linking' URLS as advocated on the blog of Kevin Lynch [http://www.klynch.com/apps/flashlinking/howto.html] and attach content from external resources that are loaded by Flash files (and accordingly display that state as a browser URL when they have done so) when you have completed the work below? Is it your intention to head to this specific goal?

'There are three main limitations at present, and we are already working on resolving them: ... 2. We currently do not attach content from external resources that are loaded by your Flash files. If your Flash file loads an HTML file, an XML file, another SWF file, etc., Google will separately index that resource, but it will not yet be considered to be part of the content in your Flash file.'

Question: Did you consult any flash developers before this implementation, and if so, whom and on what basis?

Question: When exactly was the Flash indexing algorithm implemented? [A date.]

Question: have the results of the Flash indexing algorithm been incorporated into search results yet, and if so, when did that happen?

rex said...

I'm also concerned about the swfobject issue.

2. We currently do not attach content from external resources that are loaded by your Flash files. If your Flash file loads an HTML file, an XML file, another SWF file, etc., Google will separately index that resource, but it will not yet be considered to be part of the content in your Flash file.

--so if I load an swf with the loadmovie function, inside another swf, Google will never see the loaded swf, right?

Nand said...

Great Info, Thanks

Autocrat said...

Rather than asking questions here, note that there are a few in the Google Discussion Group:
Crawling, indexing, and ranking >
Google - how will you handle Flash sites that have html versions as well?
http://groups.google.com/group/Google_Webmaster_Help-Indexing/browse_thread/thread/a82cd0d274ed1643#

ledge said...

I am still upset with the way that Google and Yahoo went about announcing this. Instead of first coming to the developer community and saying "Hey we have this nifty new feature, what do you think?", they went out and touted this all over the web.

Basically Google and Yahoo put the Flash developer community in a bad light. It makes us seem like we are the bad guys. I had a major client come to me after this announcement because they had read about this in the New York Times.

I had to be the one to tell them that at this time it isn't possible for that to happen because of the fact that Google and all major search engines ignore javascript.

It makes us look like we don't know what we are t