October 15, 2010

Integrating Community Content with Commercial Content

One of the challenges facing a doc team working on commercial open source products is navigating the space between community sourced content and commercially sourced content. Does it make sense to use community sourced content? How much content does the team push back into the community? What policies need to be in place to facilitate the transfer of content across the boundary?
To maximize efficiency, it makes sense to incorporate community sourced documentation into the commercial documentation. Leveraging the community multiplies the number of writers without increasing costs. The community also, in many cases, is where the most knowledgeable people (users and developers) live.
Using the community content creates a number of dilemmas:
The first is the legal ramifications of using the content. Can the content be reused? What citations and notices need to be included? Does using community content mean that all of the commercially generated content becomes owned by the community?
Once the legal issues are resolved, the next dilemma is a product question. If the bulk of the content is community sourced, what is the value being added by the doc team? Is it just repackaging to align with specific versions and ease accessibility? Does the doc team edit the content? Or is there some percentage of content that is added exclusively to the commercial documentation?
The product team also needs to determine how much of the work done by the doc team is kept internal. For the code, the answer is that most of it is pushed back into the community. Only very targeted features are kept back as a value add. For documentation, the question is more nuanced. Documentation is almost exclusively a value add proposition for a commercial open source offering, so figuring out how much to dilute that value is difficult. If content is being taken from the community, the doc team has an obligation, morally, to return something back. At the very least, the doc team should provide editing support to the community. Beyond that, however, what is the right amount of back flow?
Once the product team has decided on the strategic approach, the technical dilemmas rear their heads:
Is the community content in a format that is easily consumable by the doc team? Many open source products use wikis of one flavor or another for their documentation. While wikis are easy to edit and provide some nice community features, they are not great for commercial documentation. They have versioning problems, limited formatting capabilities, limited work flow control, and a number of other deficiencies. Commercial doc teams typically work in either a dinosaur product, like FrameMaker or Word, or an XML format, like DocBook or DITA. Some wikis have tools for exporting to XML formats with varying levels of success. Some open source projects are willing to switch to XML. In either case, there are hurdles that need to be overcome if content is to be shared.
Many open source projects are not great about versioning documentation. They end up with a single set of documentation with a mishmash of content and a lot of "in version x, but in version y". Commercial documentation cannot function that way. How do you ensure some level of version sanity when importing the community content?
Community content is either very stale or in a constant state of flux. Stale content is easy to merge, but constantly changing content poses a problem. Is there a single person responsible for handling merges? Is there a merge schedule? What about outbound merges?
While many communities generate good quality content, it is often in need of editing and vetting. How is that handled? Are the edits made in the community version and imported? Are they made internally and exported on a case by case basis? How is the community content vetted? Does it need to be reviewed by internal engineers? Or can it be assumed that the community's self-policing ensures that the content is technically accurate?
FuseSource has taken a firewall approach to solving the problem. The community content is used as an information source, but not directly copied. When content is contributed back into the community, it is added to the project's wiki alongside the other content. We do provide some editing support to the community sites. There have also been cases where the product team decided that a piece of content made more sense in the community, so it was simply contributed.
Initially, we chose this approach for technical reasons. We didn't have a clean way to get content out of a Confluence wiki and into DocBook. Fintan Bolton solved that problem with his Confdoc plug-in, but we have continued the same firewall approach. Now it is for simplicity's sake. Building an import/export system and a set of policies about moving content back and forth across the divide seems to be of dubious value in many cases.
Much of the community sourced content is excellent for highly technical users who are comfortable off-roading. It needs some serious work to be made appropriate for the average corporate developer. In many ways, it would be inappropriate to dumb down the content in the community. Solving the versioning issues is tricky. Is it worth the effort if the community does not seem to care?
We do directly import some reference content. The import is one way. We make edits in the community and then suck the content into our repository. It works because the amount of material sucked in is massive and easy to edit. There is, however, a decent amount of post-processing that needs to be done after the content is inside the wall.
Neither method is particularly efficient. I'd love to hear how other groups solve this problem.

October 8, 2010

Commercial Open Source Documentation

I'm back working on the Fuse products again and couldn't be happier. The fact that they are commercial offerings of open source projects makes working on them more interesting than working on commercially developed software. It is not that the products themselves are necessarily more interesting (although in this case they are); it is the challenges around documenting them that are more interesting.
In a purely commercial world, the whole process is controlled. The engineers are located within the boundaries of the company. They answer to managers that you can ping. The feature sets and release cycle are well defined and mostly static. The documentation requirements are usually spelled out by the product manager with some input from the writers. They are usually well understood early in the cycle. When the product ships, the documentation is frozen until the next release is planned.
In a commercial open source world, things are different. While some of the engineers work for the company, most of them are part of a larger community that is beyond the corporate wall. Feature sets and release cycles are planned, but the plans are usually changed due to unpredictable changes from the community. Documentation requirements tend to be fluid to match the product development process. Customers have a large influence on setting requirements for documentation. There is an expectation that improvements will roll out over the course of a product's life cycle.
In addition, there is the ongoing struggle between what to take from the community, what to offer back to the community, and what to keep as part of the commercial value add. Do you offer cleaned up versions of the community written documentation? Do you push content written internally back to the community? If so, what? What is the process for sharing content between the community and the internal documentation team?
Coping requires fluidity and focus. Being capable of changing when needed is crucial, but so is staying focused on the core value of what is being delivered to the customers. If a change doesn't make sense, you need to be able to see it.
The other thing that is crucial is a dedication to quality. It is far too easy to let quality slip in an effort to meet all of the demands. When you let quality slip, you let your value slip. The community can write documentation of questionable quality without paying for a writer, or an offshore writer can be hired to do some rudimentary editing. Neither outcome is good for you or the customers. In the commercial open source market, customers do read the documentation.


What's more important: technical or writer?

Lately I have seen, and heard, a number of discussions of what skills are most important in a technical writer. The conclusion reached in most of these discussions saddens me. It seems that conventional wisdom is that technical skills are considered primary. One recruiter told me that all of the positions she has open are for programmer/writers with an emphasis on programming.
I can see why businesses would want technical writers that are highly technical. It lightens the burden on the engineers because the writers don't ask as many questions. It also means that the writers can do more than just write documentation. They can do QA or possibly code. The business doesn't have to waste as much money on documentation.
Ideally, businesses, and engineers, would like to see technical writers cease to exist. They cost money, ask too many questions, delay delivery dates, and whine about usability issues. The only value they serve is to create a bunch of content that customers demand, but never read.
What I cannot understand is why technical writers believe that technical skills are primary. The "technical" in the title is an adjective describing "writer." The value of a technical writer is that they can take jargon laden technical information from engineers and turn it into something readable by the uninitiated. They can write a process in a way that makes it clear. They can distill complex technical topics into chunks that a user can digest. Writing is the primary skill.
I'm not arguing that some technical skills are not important. My background in software engineering has been invaluable to me. However, it is my writing skills that make me good at my job. I've worked with several technical writers with excellent technical skills who were terrible technical writers. Sadly, the poor quality of their content usually is overlooked because they fit in with the engineers.
Writing first; technical second.