Cote passed along SplunkBase to me a while back as an example of a wiki that houses technical information, and it looks like Slashdot got hold of SplunkBase early in April with the story “SplunkBase Brings IT Troubleshooting Wiki to the Masses.”
I had challenged everyone on my internal BMC blog to send me good examples of wikis for technical documentation. SplunkBase is the most interesting and most discussed example I’ve found so far.
SplunkBase is not Splunk
The name (which took some flak from Slashdotters) does derive from the term spelunking, or exploring caves for fun.
To be clear, the freely available SplunkBase is not the same as the fee-based product Splunk, which indexes your log files; a user has written a nice review and a better explanation here. His company generates a gigabyte of log files a day, and he posted a sample log file to SplunkBase for an Input/output error from courier imapd when there are FAM problems. His description also offers a fix (either install or restart portmap and fam). To me, this is a great example of users helping users through collaborative content generation.
Industry analyst Dana Gardner has a good discussion of it in a recent podcast with Chief Executive Splunker Michael Baum and Chief Community Splunker Pat McGovern (of SourceForge fame).
Livin’ in your logs
Those two talk about how sysadmins live in their log files, constantly troubleshooting and walking through the highly unstructured data trapped there. Lots of people have compared Splunk Base to grep and awk with a more Google-search-like interface. Search and navigation are the two biggest productivity boosts when it comes to working through unstructured data. Couple those boosts with the power of a large user community contributing content and I think they’ve got something there. Imagine Wikipedia, but for discoveries in your log files rather than an encyclopedia.
Are you kidding? Share my log content with others who might be hackers?
An immediate concern about sharing content such as the contents of your log files is keeping data scrambled and anonymous. In other words, how do you ensure that you aren’t giving away your IT infrastructure when you upload your log files as examples, or broadcast to the world via a wiki page what you learned while troubleshooting your company’s IT environment? About halfway through the podcast, they talk about how they’ve built in an event anonymizer that runs before anything is shared with others. Most IT data is timestamps, usernames, machine names, and IP addresses that occur repeatedly. The anonymization process scrambles that kind of repeated data in a way that lets you still recognize the repeated event or IP address, but you can’t reverse engineer the company’s infrastructure (except for which version of Sendmail or Microsoft Exchange is in use).
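To make the idea concrete, here is a minimal sketch of that kind of consistent scrambling. It is not Splunk’s anonymizer, whose internals the podcast doesn’t describe; it only assumes one plausible approach, a keyed hash, so that the same input always maps to the same stand-in. The patterns, the `user=` field format, and the function names are all made up for illustration.

```python
import hashlib
import hmac
import re

# Hypothetical patterns for illustration; real logs need many more field types.
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
USER_PATTERN = re.compile(r"\buser=(\w+)")


def pseudonym(value: str, key: bytes, prefix: str) -> str:
    """Map a value to a stable stand-in using a keyed hash (HMAC-SHA256).

    The same value and key always give the same stand-in, so repeats stay
    recognizable, but the original can't be read back without the key.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}-{digest[:8]}"


def anonymize_line(line: str, key: bytes) -> str:
    """Replace IP addresses and user=NAME fields in one log line."""
    line = IP_PATTERN.sub(lambda m: pseudonym(m.group(0), key, "ip"), line)
    line = USER_PATTERN.sub(
        lambda m: "user=" + pseudonym(m.group(1), key, "user"), line
    )
    return line


if __name__ == "__main__":
    secret = b"keep-this-private"  # assumption: a key that never leaves your site
    for entry in (
        "Failed login user=alice from 10.0.0.5",
        "Accepted login user=alice from 10.0.0.5",
    ):
        print(anonymize_line(entry, secret))
```

Both sample lines come out with the same stand-in for the user and the same stand-in for the address, so a reader can still see that one account on one machine failed and then succeeded, without learning the real name or IP. Note that the rest of the message, such as product names and versions, passes through untouched.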
In the podcast, they do congratulate Cisco on doing a nice job documenting log files, but most vendors aren’t really focusing on that information. A wiki just might be the right way to document log files. What do you look for in good log documentation?
In closing, I’ll challenge all of you as well — where are you seeing good examples of wikis or other collaborative authoring environments for technical information?