Here is my SXSW Interactive trip report from three years ago. Over the next few days, I’ll post my thoughts on the sessions from this year.
Organizing the World’s Information — Google and Blogger.com
Google’s mission statement is “to organize the world’s information, making it universally accessible and useful.”
They complete a crawl of the entire web every 3 to 4 weeks. The crawl covers about 2 billion pages. If you include graphics and 20 years of Usenet postings, there are 4 billion web documents.
Their ranking system is heavily weighted by how many times other sites link to a particular page. They consider their system to be very similar to a book index. Google looks at over 100 factors, including font size and proximity to other words, when looking for keywords within a page. They also display a snippet from the most relevant part of the page, based on the factors in their algorithm.
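Here is a minimal sketch of what "weighted by incoming links" can look like in code. It is a simplified power iteration in the spirit of published PageRank descriptions, not Google's actual ranking system (which, per the talk, combines over 100 factors). The page names, the damping value of 0.85, and the function name `link_scores` are all assumptions made up for illustration.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Score pages by incoming links (simplified, illustrative only).

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical three-page web: both "a" and "b" link to "c", and "c" links to "a".
example_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = link_scores(example_links)
```

In this toy web, "c" ends up with the highest score because two pages link to it, and "b" ends up lowest because nothing links to it.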
The hardware is cheap, cheap, cheap, and they use a distributed search mechanism with 10,000-plus machines running Linux. The speaker had a funny graphic of a bunch of bowed motherboards in a rack. Okay, it was funny to us geeks in the room. Basically, they have hardware failures all the time since the hardware is so cheap, but they have so much redundancy in the system that failures are easily fixed in time to avoid any problems.
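The redundancy idea can be sketched in a few lines: keep several copies of the index, and if one machine is down, ask the next. This is only an illustration of failover under my own assumptions, not a description of Google's architecture; `ReplicaDown`, `make_replica`, and `search_with_failover` are hypothetical names.

```python
class ReplicaDown(Exception):
    """Raised by a replica that has failed (hypothetical error type)."""


def make_replica(name, index, healthy=True):
    """Return a function that searches one copy of the index."""
    def search(term):
        if not healthy:
            raise ReplicaDown(name)
        # Return the ids of documents whose text contains the term.
        return sorted(doc for doc, text in index.items() if term in text)
    return search


def search_with_failover(replicas, term):
    """Try each replica in turn and return the first successful answer."""
    failures = []
    for replica in replicas:
        try:
            return replica(term)
        except ReplicaDown as error:
            failures.append(str(error))
    raise RuntimeError("all replicas failed: " + ", ".join(failures))


index = {"doc1": "cheap linux hardware", "doc2": "bowed motherboards in a rack"}
replicas = [
    make_replica("machine-1", index, healthy=False),  # simulated dead machine
    make_replica("machine-2", index),
]
results = search_with_failover(replicas, "rack")
```

The caller never notices that the first machine was dead, which is the point of having so many cheap, redundant boxes.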
They have noticed that their queries in English are declining. For a fascinating study, read the trends at http://www.google.com/press/zeitgeist.html. Google is inventing new areas and researching new features. Examples:
- Google glossary
- Google API, which allowed for a map of the web
- content-targeted advertising on other sites (speaker’s example was when you’re on Edmunds and you do a search for an air filter, Google’s air filter ad hits are displayed on the side)
- news.google.com: 4,500 news-only sites crawled daily
- froogle.google.com (Love this name!): price, name of product, category, and picture
In ranking pages, Google doesn’t use meta tags because it doesn’t trust those tags to tell the real “importance” or “relevance” of a page. (People obviously load up their meta tag data to try to get picked up by more search engines.) They are working on natural language queries, but people often don’t “talk” their questions. Apparently we’ve become so accustomed to writing in two or three words that we don’t do a good job of writing complete questions. This was particularly interesting to me, as someone looking for search trends in online information.
And, without further ado, the (seemingly) surprise presenter at this presentation was Evan Williams from blogger.com.
Blogger.com was acquired by Google this year (2003) in a surprising move that had the blogging community speculating wildly about what this meant for blogging and for Google. While Evan didn’t offer much more information than “there was a Slate article with some good ideas,” that gave a better idea of how this came about. (I believe the Slate article to which Evan referred is Are Weblogs Changing Our Culture?) There are approximately 3,000 to 4,000 new blogs created each day on blogger.com, and maybe a half million to 1 million blogs exist in total. There are approximately 20,000 new posts per day. A lot of information is being posted on these blogs, which Google can help organize and make easier to access.
Blogger consists of 6 guys in one room, and Google added 1 or 2 more. They basically moved from one San Francisco office to another, this one without a window. They are getting more hardware and people resources and are continuing maintenance and new coding on the Blogger product.
Database-driven Websites
The main thing I learned from this presentation was that server-side includes are often more efficient than pulling text that is stored in a database. Server-side includes are a way to embed files within HTML so that they appear seamlessly in the browser window; in effect, you’re writing a shortcut to the other files’ content. This approach makes sense for boilerplate information used over and over again. I guess it’s similar to the “Library” items used in Dreamweaver.
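For anyone who hasn't seen one, an include directive is just a specially formatted HTML comment, such as `<!--#include virtual="/footer.html" -->` in Apache's mod_include. The sketch below imitates what the web server does when it meets that directive. The Python function is a stand-in of my own for illustration, not anything shown in the session, and the file name and fragment text are made up.

```python
import re

# Matches directives such as <!--#include virtual="/footer.html" -->
INCLUDE_PATTERN = re.compile(r'<!--#include\s+(?:virtual|file)="([^"]+)"\s*-->')


def expand_includes(html, fragments):
    """Replace each include directive with the named fragment's content.

    fragments maps a path to its text; a real server reads the files from disk.
    """
    def replace(match):
        return fragments[match.group(1)]
    return INCLUDE_PATTERN.sub(replace, html)


# Hypothetical boilerplate shared by every page on the site.
fragments = {"/footer.html": "<p>Copyright notice</p>"}
page = '<body><h1>Home</h1><!--#include virtual="/footer.html" --></body>'
expanded = expand_includes(page, fragments)
```

After expansion, the browser receives the footer text in place of the comment, so the boilerplate lives in one file and no database query is needed to fetch it.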
The speakers were from these sites: 13colony.com, brainfood.com, rackspace.com, and sxsw.com. They recommended that you cache query results and create files out of those results for more efficiency. If you want to teach yourself about database-driven web sites, they suggested that you start with PHP with PostgreSQL or MySQL on Linux, and then pick an open-source application and figure out how it works.
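The "cache query results into files" advice can be sketched roughly as follows. This is a minimal illustration under my own assumptions, not the speakers' code: it uses Python with the built-in sqlite3 module as a stand-in for PHP with PostgreSQL or MySQL, and the function name `cached_query`, the table, and the cache layout are hypothetical. It also never expires a cache file, which a real site would need to handle.

```python
import hashlib
import json
import os
import sqlite3
import tempfile


def cached_query(connection, sql, cache_dir):
    """Run sql once, then serve later calls from a file in cache_dir.

    Returns (rows, came_from_cache).
    """
    # Name the cache file after a hash of the query text.
    key = hashlib.sha256(sql.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".json")
    if os.path.exists(path):
        with open(path, encoding="utf-8") as cache_file:
            return json.load(cache_file), True
    rows = [list(row) for row in connection.execute(sql)]
    with open(path, "w", encoding="utf-8") as cache_file:
        json.dump(rows, cache_file)
    return rows, False


# Hypothetical in-memory database standing in for the site's real one.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sessions (title TEXT)")
connection.execute("INSERT INTO sessions VALUES ('Database-driven Websites')")

cache_dir = tempfile.mkdtemp()
query = "SELECT title FROM sessions"
first_rows, first_from_cache = cached_query(connection, query, cache_dir)
second_rows, second_from_cache = cached_query(connection, query, cache_dir)
```

The first call hits the database and writes a file; the second call reads that file and never touches the database.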
Putting Online Conversation to Work
The only speaker for this panel was one of the WELL’s original directors, Cliff Figallo; the other speaker couldn’t make it, I’m not sure why. Figallo was an interesting person who had lived on The Farm and felt that too many free-riders broke that community financially. He applied this experience to The WELL’s online community: if too many people are “lurkers” instead of actively posting and contributing to the online conversation, then the community doesn’t grow well and has breakdowns.
He observed that “attention is energizing.” So, the people who ranted and raved got attention, but usually that just fed the fire.
He noted that these factors influence online conversations:
- who is talking
- intentions
- commitment
- tolerance
- traction
They didn’t want to become ‘benevolent despots’ at The WELL, although those with time and money took over sometimes. He reminded us that in those days, you paid per minute for your online time. (In 1985, membership on the WELL was $8 per month plus $2 per hour.) So, those with the most time, the most money, and the best modem could take over conversations, sometimes for the worse.
He thinks that for success online, you need founders, implementers, and sustainers in any online conversation and community.
He mentioned the idea of subtext: you had to read between the lines and ask whether there was a lack of credibility under the surface. He thinks this happens often on corporate intranets. Often this shows up as non-participation by specific groups.
He noted that customers are getting more say in online conversations. For example, Edmunds.com and epinions.com allow users/customers to give feedback to companies. During the Q&A session, in response to a question about how to build a new community, he suggested implementing a ‘full value contract’: make participants sign an agreement that they will listen, will contribute, and will make the conversation useful.
One of the audience members was a doctor on a web community site like WebMD, and he said that as soon as a doctor joins a conversation they are flooded with questions and that anyone who identifies themselves as a doctor is treated differently than other community members. I was hoping for information and tips for real-time conversations embedded into products but the session was much more geared towards message board-type communities. Still, a thought-provoking session.
Beyond the blog
dollarshort.org is Mena Trott’s weblog.
stupidfool.org is Ben Trott’s weblog. (The Trotts are co-founders of Movable Type.) This session discussed the evolution of features on blog tools like Movable Type or Blogger.
Ideas from the session’s panelists:
- All blogs today are written in reverse chronological order, like a journal. Could there be another order or method of organization?
- Why can’t you post a book review on Amazon and also place it on your weblog? The site a.wholelottanothing.org is Matt Haughey’s blog, and he has a ‘posted elsewhere’ section.
- Audio blogs: the general consensus is that “these suck.” How could they be improved upon?
- There’s a relatively new feature called friend of a friend (FOAF): basically you can use XML to build a map of relationships. One of the enhancements might be an “enemies” or “nemesis” list.
- You can subscribe to weblogs using a Mac-based tool.
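To show what the friend of a friend idea looks like in practice, here is a minimal sketch that reads a small FOAF document and lists who knows whom. The sample XML and the names in it are invented for illustration, and the function name `relationships` is hypothetical. The namespace URIs are the standard FOAF and RDF ones, and the parsing uses Python's built-in ElementTree rather than a full RDF library, so it only handles this simple nested layout.

```python
import xml.etree.ElementTree as ET

# Standard FOAF namespace, written in the form ElementTree expects.
FOAF = "{http://xmlns.com/foaf/0.1/}"

# Invented sample data: Alice says she knows Bob and Carol.
SAMPLE_FOAF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Alice</foaf:name>
    <foaf:knows><foaf:Person><foaf:name>Bob</foaf:name></foaf:Person></foaf:knows>
    <foaf:knows><foaf:Person><foaf:name>Carol</foaf:name></foaf:Person></foaf:knows>
  </foaf:Person>
</rdf:RDF>"""


def relationships(foaf_xml):
    """Return (person, friend) name pairs from a simple FOAF document."""
    root = ET.fromstring(foaf_xml)
    pairs = []
    for person in root.findall(FOAF + "Person"):
        name = person.findtext(FOAF + "name")
        # Each foaf:knows element wraps the person who is known.
        for friend in person.findall(FOAF + "knows/" + FOAF + "Person"):
            pairs.append((name, friend.findtext(FOAF + "name")))
    return pairs


pairs = relationships(SAMPLE_FOAF)
```

Each pair is one edge in the relationship map; an “enemies” list would presumably just be another kind of edge alongside `knows`.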
Many of the panelists read weblogs rather than watch TV. Their claim is that it’s better than reality TV: it’s mundane everyday life, usually well written. This type of feature could evolve into a non-professional personal news feed. The panelist envisioned feeds from 10,000 blogs coming to your desktop, all with info that the filter thinks you’d find interesting. An early example is found at http://topicexchange.com/. I came away from this session with a burning desire to write a blog entry daily or at least weekly and to play with this technology. I don’t yet know how it will be used in tech pubs, but it’s fun to watch the technology grow and evolve.
Keynote speakers
I only got to see two keynote speakers, but both were highly creative, entertaining, energizing speakers. Joshua Davis is an amazing Flash developer/creative genius who paired up with a “mathlete” to do diagrams/drawings of how things grow in nature. I can’t begin to describe what he showed us on screen. Imagine a tree and leaves growing, made out of black sticks on a white background, in an exact pattern that matches what happens in nature. Then, imagine that he has about 10 of those patterns randomized together. Well, you can’t imagine the results, but it was amazing. I was completely blown away. He has done many other things for museums and artistic types as well as for Nike corporation and Pontiac. See his portfolio at http://www.praystation.com.
Richard Florida is a professor at Carnegie Mellon who specializes in regional economics. He had a great talk about the creative class and how they shape local communities and help make them successful. Austin is one of his top ten areas of economic growth and success (ranked 2 among major regions). Read more at http://creativeclass.org/. Highly inspiring and thought-provoking.
Other links for speakers I saw: