*** I’ll better organize this post later, busy right now ***
Awesome. Jim Kremens from Flashcoders, who started and took part in a thread about how to definitively get Flash websites indexed by search engines, went directly to the source, Google, and got an answer.
Since searching the archives of the chattyfig lists sucks major cheese now, I’m copying part of the thread here. There were many other good points brought up, so if you can scrounge up password #450-3b that you have to remember, you can find the thread. Variety of opinions.
Anyway, good information from this email about duplicating Flash content inside an HTML file to help get the site indexed. Basically, you put the HTML within CDATA sections of XML, and have Flash get its content from that static source (which of course can be generated from a dynamic one). There were fears that techniques like this, which duplicate content, may be perceived as intended to fool users and search engines, and thus get you blacklisted. One person responded that this had worked for them on a few sites, and they weren’t blacklisted. Anyway, for the meat, you’ll have to dig into the thread for the technique, but I’m posting just the 2 emails here.
I actually wrote to the Google team to ask them some of the questions
raised in this thread. Just wanted to share their response. Note
where they say:
“The practice of creating HTML copies of these Flash pages for our
crawler is actually our recommended solutions to this kind of issue.”
That’s in agreement with what pretty much everyone on this list said,
but in direct contradiction with what the non-Flash developers here
said. Interesting how people make up their own minds about stuff…
Thanks to all of you for your ideas.
From: email@example.com [mailto:firstname.lastname@example.org]
Sent: Wednesday, April 06, 2005 1:19 PM
Subject: Re: [#24081437] Flash and Search Engine Optimization
Thank you for your note. The Google index does include pages that use
Macromedia Flash. However, this is a new feature, so our crawlers may
still experience problems indexing Flash pages. If you are concerned
that Flash content on your pages may be inhibiting Google’s ability to
crawl your site, you may want to consider using a text browser such as
Lynx to examine your site. If features such as Flash keep you from
seeing all of your site in a text browser, then search engine spiders
may have trouble crawling your site.
The practice of creating HTML copies of these Flash pages for our
crawler is actually our recommended solutions to this kind of issue.
If you do this, please be sure to include a robots.txt file that
disallows the Flash pages in order to ensure that these pages are not
seen as duplicate content.
We hope the information we have provided above is helpful to you. Due
to the tremendous volume of information and help requests we receive,
we are not always able to provide personal attention to questions
pertaining to individual websites. For additional information, please
visit http://www.google.com/webmasters/. Also, you may want to comb
suggestions from our users and webmasters or to post a question of
The Google Team
What’s interesting is that they don’t say it’s OK to put hidden html
content on your Flash site. All they’re really advising us to do is
to make two sites: one html and one Flash.
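For what it's worth, the robots.txt part of Google's advice can be sanity-checked offline. Below is a minimal sketch using Python's standard `urllib.robotparser`. The `/flash/` and `/html/` directory layout is purely an assumption for illustration; nothing in the thread says how the two sites are laid out.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout (an assumption, not from the thread):
# Flash pages live under /flash/, the HTML copies under /html/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /flash/
"""


def crawler_may_fetch(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules above let `agent` crawl `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/html/about.html", "/flash/index.html"):
        print(path, crawler_may_fetch(path))
```

The idea is just that the HTML copies stay crawlable while the Flash pages are kept out of the index, which is what the reply asks for to avoid duplicate content.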
And so, after much thrashing, my fellow developers here and I have
come up with a development plan that allows Flash and SEO to coexist.
Note that I work at a pretty big shop (idsociety.com) with some serious
back-end programmers. So some of what we’ve come up with may be
difficult for the average Flash developer to use. I’m hoping that
gaps will be filled in by future contributions to this thread. That
said, here goes:
1. Develop Flash site and html site that load content from the same
XML source(s). This way they can both be updated easily. Html site
can be as simple or elaborate as you like. It’s there for the few
people who don’t have Flash, for users with accessibility issues and
search engine robots. Per Google’s recommendation, include a
robots.txt file that disallows the Flash pages in order to ensure that
these pages are not seen as duplicate content.
2. Provide one entry point to your site. In our case, what might seem
like ‘pages’ in the site are just paths written by the Apache server
using mod_rewrite. (Google ‘mod_rewrite’ for more info). So, users
who click on a link in Google to come back to an ‘internal’ page in
your site are really just coming to the site’s entry point. There,
the server would typically use mod_rewrite to serve them up a page.
3. In this case, however, the server will do a Flash check. If the
user doesn’t have Flash, it’ll serve up the html page (duh). If they
do, it will serve up the Flash page, passing in the ‘url’ via
FlashVars. Again, the url written by mod_rewrite has no
correspondence to actual directories. It’s made up. So, you can set
Flash to interpret it however you want.
4. Configure your Flash file to correctly interpret the mod_rewrite
path passed in and navigate to the appropriate content.
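
Steps 2 to 4 above could be sketched roughly as follows. This is Python standing in for whatever the server and the Flash movie actually run, and it is not the setup described in the email. The function names, the `url` FlashVars key and the returned dictionary shape are all assumptions made for illustration; the Flash check itself is reduced to a boolean.

```python
from urllib.parse import urlencode


def route(path: str, has_flash: bool) -> dict:
    """Decide what the single entry point serves for a rewritten path.

    `path` is the made-up path produced by the rewrite rules; it does
    not have to match any real directory on disk.
    """
    clean = "/" + path.strip("/")
    if not has_flash:
        # No Flash: serve the HTML version of that "page".
        return {"template": "html", "path": clean}
    # Flash present: serve the movie and hand it the path via FlashVars.
    return {"template": "flash", "flashvars": urlencode({"url": clean})}


def sections_from_url(url: str) -> list:
    """What the Flash side would do (step 4): split the path into
    section names it can navigate to."""
    return [part for part in url.split("/") if part]


if __name__ == "__main__":
    print(route("/products/widgets/", has_flash=True))
    print(route("/products/widgets/", has_flash=False))
    print(sections_from_url("/products/widgets"))
```

Because the path is invented by the rewrite rules, the only contract is that the server and the movie agree on how to read it.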
And there you have it. More work, to be sure, but you can give your
client the Flash content they (and you) want, it will be indexed by
the robots, and, if you build it right, users will be routed to the
correct location in the Flash site.
I’m sure some of you have better ways of doing this. And I’m guessing
Peter Hall’s way is among them. Also note that I know almost nothing
about the server-side stuff. So I’m curious to learn if there are
alternate ways to do this that don’t require dealing with an Apache
Kudos to Google for the swift reply.
> And a question to your “Flash/XHTML-engine” in Flash:
> how did you combine XML content and HTML code? I’ve seen the
> HTML-sourcecode but don’t know, when you load a further HTML site into
> Flash, how you can ignore the HTML-Tags and only read out the XML
“All is XML” in the source of the HTML document.
The trick is to
1) Make sure your HTML is put in a CDATA tag (so invalid structures are
2) Put your content in a structure you can read easily
3) “Navigate” to the right XML tag.
Again, this is my own version of Peter Joell’s thingy as presented here:
http://www.peterjoel.com/ripple/ (the slideshow does not work on my
check out www.instantinterfaces.nl/demo/htmlparser.txt
(Use the “view source” option of your browser if the text does not appear.)
It is written for the FFIE (using an XML parser with callbacks), but it
gives you some idea.
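
To make the three steps above a bit more concrete, here is a small sketch in Python rather than ActionScript. It assumes the point of the CDATA wrapper is that the XML parser hands the HTML back as plain text instead of trying to parse it. The `page` and `content` tag names and the `id` attribute are hypothetical and are not taken from the htmlparser.txt demo.

```python
import xml.etree.ElementTree as ET

# Hypothetical page source: sloppy HTML (unclosed <br> and <b>) is
# wrapped in CDATA so the document as a whole is still valid XML.
PAGE = """<page>
  <content id="home"><![CDATA[<p>Welcome<br>to the <b>site</p>]]></content>
  <content id="about"><![CDATA[<p>About us</p>]]></content>
</page>"""


def content_for(source: str, content_id: str) -> str:
    """'Navigate' to the matching tag and return its raw HTML text."""
    root = ET.fromstring(source)
    for node in root.iter("content"):
        if node.get("id") == content_id:
            return node.text or ""
    raise KeyError(content_id)


if __name__ == "__main__":
    print(content_for(PAGE, "home"))
```

The parser never looks inside the CDATA section, so the HTML can be as messy as the browser tolerates.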
Peter Kaptein wrote
> > The trick is to publish your site both as Flash and as XHTML and include
> > your Flash-movie, then let the Flash-movie load that specific page.
I’ve put a site up using the technique as described.
The “HTML” content is presented as visible (normally you would hide it.)
Click on the links to see different “HTML” pages of this “site”
It is still a prototype of the Flash/XHTML-engine (the menu loses its
“active item” when reloaded, but hey!)
Click in the menu in the site to see only the content being refreshed (and
the menu when you open another group)
As you can see in the hyperlink, the HTML page remains the same, thus
utilizing both the HTML / Google-esque findability and the strength of the
“single page model” you can do with Flash.
Click on the hyperlinks on the bottom of the screen (scroll when not
visible) to go to another page.
Open the source of the HTML to see the “XHTML” setup.
<OBJECT classid= contains the SWF call. E.g:
<NAVIGATION type="content"> contains the "XML" to build the navigation
(basically, the XML structure is scanned for the <A HREF=""> "XML" nodes, which are
passed to the menu on the left _when changed_. If the checksum of the
<NAVIGATION> items is the same as the previous one, nothing changes in the menu and
nothing is done with it.)
<FLASHFORM type="content"> contains the XML to build the form as presented.
“FLASHFORM” is not an official thing or something, just a personal choice
for this solution.
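
The "only rebuild the menu when the checksum changes" idea from the NAVIGATION description can be sketched like this, again in Python rather than ActionScript. The email does not say which checksum is used, so CRC32 here is an assumption, as are the `Menu` class and the function names.

```python
import zlib
import xml.etree.ElementTree as ET


def nav_links(navigation_xml: str) -> list:
    """Collect (href, label) pairs from the <A HREF=""> nodes."""
    root = ET.fromstring(navigation_xml)
    return [(a.get("HREF", ""), (a.text or "").strip()) for a in root.iter("A")]


def checksum(links: list) -> int:
    """CRC32 over the link list (the checksum choice is an assumption)."""
    joined = "\n".join(f"{href}\t{label}" for href, label in links)
    return zlib.crc32(joined.encode("utf-8"))


class Menu:
    """Hypothetical stand-in for the menu on the left."""

    def __init__(self) -> None:
        self._last = None
        self.items = []
        self.rebuilds = 0

    def update(self, navigation_xml: str) -> bool:
        """Rebuild only if the navigation differs from the previous page."""
        links = nav_links(navigation_xml)
        digest = checksum(links)
        if digest == self._last:
            return False  # same navigation as before: leave the menu alone
        self._last = digest
        self.items = links
        self.rebuilds += 1
        return True


NAV = (
    '<NAVIGATION type="content">'
    '<A HREF="home.html">Home</A><A HREF="about.html">About</A>'
    "</NAVIGATION>"
)

if __name__ == "__main__":
    menu = Menu()
    print(menu.update(NAV), menu.update(NAV), menu.items)
```

Loading a page with the same navigation leaves the menu untouched, which matches the behaviour described for clicking around inside one group.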