October 15, 2010

Integrating Community Content with Commercial Content

One of the challenges facing a doc team working on commercial open source products is navigating the space between community sourced content and commercially sourced content. Does it make sense to use community sourced content? How much content does the team push back into the community? What policies need to be in place to facilitate the transfer of content across the boundary?
To maximize efficiency, it makes sense to incorporate community sourced documentation into the commercial documentation. Leveraging the community multiplies the number of writers without increasing costs. The community also, in many cases, is where the most knowledgeable people (users and developers) live.
Using the community content creates a number of dilemmas:
The first is the legal ramifications of using the content. Can the content be reused? What citations and notices need to be included? Does using community content mean that all of the commercially generated content becomes owned by the community?
Once the legal issues are resolved, the next dilemma is a product question. If the bulk of the content is community sourced, what is the value being added by the doc team? Is it just repackaging to align with specific versions and ease accessibility? Does the doc team edit the content? Or is there some percentage of content that is added exclusively to the commercial documentation?
The product team also needs to determine how much of the work done by the doc team is kept internal. For the code, the answer is that most of it is pushed back into the community. Only very targeted features are kept back as a value add. For documentation, the question is more nuanced. Documentation is almost exclusively a value add proposition for a commercial open source offering, so figuring out how much to dilute that value is difficult. If content is being taken from the community, the doc team has an obligation, morally, to return something back. At the very least, the doc team should provide editing support to the community. Beyond that, however, what is the right amount of back flow?
Once the product team has decided on the strategic approach, the technical dilemmas rear their heads:
Is the community content in a format that is easily consumable by the doc team? Many open source products use wikis of one flavor or another for their documentation. While wikis are easy to edit and provide some nice community features, they are not great for commercial documentation. They have versioning problems, limited formatting capabilities, limited work flow control, and a number of other deficiencies. Commercial doc teams typically work in either a dinosaur product, like FrameMaker or Word, or an XML format, like DocBook or DITA. Some wikis have tools for exporting to XML formats with varying levels of success. Some open source projects are willing to switch to XML. In either case, there are hurdles that need to be overcome if content is to be shared.
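To give a feel for the conversion hurdle, here is a minimal sketch of turning a tiny subset of wiki markup into a DocBook section. It assumes Confluence-style "h1. " headings and "* " bullets, and the function name wiki_to_docbook is made up for illustration. It is not how any real export tool works, and it ignores everything that makes real wiki content hard: tables, macros, links, inline formatting, and nested lists.

```python
import xml.etree.ElementTree as ET


def wiki_to_docbook(wiki_text: str) -> str:
    """Convert a tiny, assumed subset of wiki markup to a DocBook <section>."""
    section = ET.Element("section")
    current_list = None  # the open <itemizedlist>, if any

    for raw in wiki_text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends any open bullet list.
            current_list = None
            continue
        if line.startswith("h1. "):
            current_list = None
            ET.SubElement(section, "title").text = line[4:]
        elif line.startswith("* "):
            # Consecutive bullets share one <itemizedlist>.
            if current_list is None:
                current_list = ET.SubElement(section, "itemizedlist")
            item = ET.SubElement(current_list, "listitem")
            ET.SubElement(item, "para").text = line[2:]
        else:
            current_list = None
            ET.SubElement(section, "para").text = line

    return ET.tostring(section, encoding="unicode")


if __name__ == "__main__":
    sample = "h1. Setup\nInstall the broker.\n* step one\n* step two"
    print(wiki_to_docbook(sample))
```

Even this toy version has to make structural decisions (where a list ends, what counts as a title) that the wiki never made explicit, which is where export tools tend to struggle.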
Many open source projects are not great about versioning documentation. They end up with a single set of documentation with a mishmash of content and a lot of "in version x, but in version y". Commercial documentation cannot function that way. How do you ensure some level of version sanity when importing the community content?
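One way to picture the version problem is to imagine each imported chunk tagged with the range of versions it applies to, so that a version-specific doc set can be filtered out of the mishmash. The sketch below assumes exactly that. The Block type, the since/until fields, and select_for_version are all hypothetical, and community content rarely arrives with this kind of metadata, which is the real difficulty.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '5.4.1' into (5, 4, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class Block:
    """A chunk of imported content with hypothetical version tags."""
    text: str
    since: Optional[str] = None  # first version the text applies to
    until: Optional[str] = None  # last version it applies to (inclusive)


def select_for_version(blocks: List[Block], target: str) -> List[str]:
    """Keep only the blocks that apply to the target version."""
    wanted = parse_version(target)
    kept = []
    for block in blocks:
        if block.since is not None and parse_version(block.since) > wanted:
            continue  # introduced after the target version
        if block.until is not None and parse_version(block.until) < wanted:
            continue  # dropped before the target version
        kept.append(block.text)
    return kept


if __name__ == "__main__":
    blocks = [
        Block("Applies to every version."),
        Block("Old configuration syntax.", until="5.3"),
        Block("New configuration syntax.", since="5.4"),
    ]
    print(select_for_version(blocks, "5.4"))
```

The filtering is the easy part. Somebody still has to read every "in version x, but in version y" passage and decide what the tags should be.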
Community content is either very stale or in a constant state of flux. Stale content is easy to merge, but constantly changing content poses a problem. Is there a single person responsible for handling merges? Is there a merge schedule? What about outbound merges?
While many communities generate good quality content, it is often in need of editing and vetting. How is that handled? Are the edits made in the community version and imported? Are they made internally and exported on a case by case basis? How is the community content vetted? Does it need to be reviewed by internal engineers? Or can it be assumed that the community's self-policing ensures that the content is technically accurate?
FuseSource has taken a firewall approach to solving the problem. The community content is used as an information source, but not directly copied. When content is contributed back into the community, it is added to the project's wiki alongside the other content. We do provide some editing support to the community sites. There have also been cases where the product team decided that a piece of content made more sense in the community, so it was simply contributed.
Initially, we chose this approach for technical reasons. We didn't have a clean way to get content out of a Confluence wiki and into DocBook. Fintan Bolton solved that problem with his Confdoc plug-in, but we have continued the same firewall approach. Now it is for simplicity's sake. Building an import/export system and a set of policies about moving content back and forth across the divide seems to be of dubious value in many cases.
Much of the community sourced content is excellent for highly technical users who are comfortable off-roading. It needs some serious work to be made appropriate for the average corporate developer. In many ways, it would be inappropriate to dumb down the content in the community. Solving the versioning issues is tricky. Is it worth the effort if the community does not seem to care?
We do directly import some reference content. The import is one way. We make edits in the community and then suck the content into our repository. It works because the amount of material sucked in is massive and easy to edit. There is, however, a decent amount of post-processing that needs to be done after the content is inside the wall.
Neither method is particularly efficient. I'd love to hear how other groups solve this problem.
