Documenting It: 2013

December 30, 2013

The OT is Still Terrible

I'm once again trying to use DITA to make content. I know that a lot of large companies use DITA to manage large content sets, so I figured that the OT would have grown up a little. It has, but it is still terrible. Just doing a few simple things, I have run into functionality that should have been implemented and isn't, a number of inexplicable bugs, and a number of other things that are just harder than they need to be.

For example, why is it that using a topicgroup element means that my child links table breaks? That element is not supposed to have any bearing on the content. Or why do I get random topics that don't generate based on how the files are organized on disk? Why doesn't the repsep element actually do anything when it is supposed to?

I hope that with a little more experience and time, I will come to see that most of the problems are user error. That, however, won't make me any more generous towards the OT or DITA in general. It is clearly a system that was made for complex things and was intended to be used only by big shops with lots of money. Why offer a free toolkit that doesn't really work? It is like one of those annoying freemium sales.

Here is a taste of what you could get if you pony up a bunch of cash....

It would be better to just be up front about the fact that this will cost money to be useful.

August 9, 2013

Markdown Prototyping

I have been working on a large set of documentation for a prototype that will eventually make it into a product. One issue with the publishing system at my company is that it makes doing prototypes, or really anything not within the rigidly and narrowly scoped model, difficult to set up. It takes a lot of work just to get to the point where you can begin writing.
Since time was of the essence and writing is expensive, I decided to do the prototype outside of the publishing system. I also decided to avoid using the rigid DocBook variant we use. Instead, I figured I would do the prototype in Markdown using a combination of Daedalus and Ulysses III. It gives me reasonably full-featured Markdown support, flexibility to work wherever I want, easy HTML and PDF exporting, and the stack/sheet metaphor fits nicely with topic based writing. The doc set is a stack and each sheet is a topic. The other nice thing about the plan was that it would be easy to take the content back into XML since, despite the complexity of the variant we use, it is mostly formatting markup.
I wasn't sure how well the experiment was going to work when I started, but a few weeks in I think it is great. I was able to rapidly prototype fifteen topics in about ten days. The prototype pages look fairly close to what our actual system generates. I can quickly make changes to the content as needed and republish. The fact that I can work multi-platform is great. I am not chained to my desk. I can demo changes easily. I can even make updates on the fly using my iPhone.
The flexibility does come at a price. Daedalus, the mobile editor, has limited Markdown support. It does not support things like internal linking, images, footnotes, or tables. Ulysses does support all of these things; however, it defers to the more limited capabilities of its mobile peer when sharing. It can also be hard to make use of all the Markdown features on an iPhone or iPad unless you are using a Bluetooth keyboard. For example, I still haven't found the backwards single quote used for code on the native keyboard.
These limitations are minor compared to the effort and time saved using the combination. I'm pretty sure that I could not have gotten as far as fast using the normal tool chain. In fact, I'm not sure I could have done it this fast using XML and a more flexible tool chain.
The fact that I am working in text that doesn't have to be structured in a rigid format makes the work flow faster. It provides flexibility for quick changes, yet also allows for topic orientation.
For final production and long term maintenance, unstructured Markdown is not a great solution. There the benefits of the rigidity outweigh the cost. The rigidity enforces uniformity that large groups of variously skilled writers need to create and maintain content at scale.
For small, fast projects or prototyping, Markdown with Ulysses and Daedalus has proven to be an excellent solution.

June 16, 2013

Change is Funny

I changed jobs about two months ago. My prior gig was like eating ramen noodles. You survive, but have heartburn, a headache, and lethargy. In trying to identify the issue I kept coming back to the crazy way they did things. The processes were painfully ridiculous and there was no will to change them. There was also the crazy need to build all the tools in house.

The funny thing is that my new gig, which is like a breath of fresh air, has all of the same problems. The current processes are, if anything, more crazy. The tools are also built entirely in house.

The biggest difference is that I believe in the new company. The entire company is focused on building the best quality product possible. They believe in investing where it is needed. They take the long view on product planning. It feels good to believe.

At my previous company, I didn't believe. I felt like a cog in a machine that was finely tuned to poop out good enough product as efficiently as possible. It wasn't a good feeling. When I feel like I could work at 1/2 speed and still be overachieving, I check out.

So I checked out.

May 23, 2013

I've got a Byline

Despite being a full time writer, it has been a long time since I've had my own byline. The latest issue of Adoptive Families has a story I wrote in it. The story is just a little, personal reflection on first meeting our little love bug.

Reuse Statistics and Self Justifications

A writer I know recently boasted that his team is reusing around 85% of the topics they write and that it totally justifies their move to topic based writing. I was moderately impressed until I found out that the reuse was across a single product. At that point my skepticism became full blown disbelief.
85% reuse is an incredible amount of efficiency across multiple products. Across a single product's library it is absurd, particularly when the writers claim to be following the mantra "do not repeat yourself." At 85% reuse, the writers may only be writing content once, but they are definitely repeating themselves. It doesn't matter what the content looks like from the writers' perspective; it is what the readers see that matters.
Of course when someone is bragging about something like this, what really matters to them is reuse. They had some reuse goal in mind or spent a lot of money to implement reuse and needed to prove they could do it. This may make the team look good to the efficiency experts, management types, and the metric mavens, but it is a lousy way to make content.
The other telling thing about this conversation, which happened over a longish e-mail thread, was how the team responded to questions and criticism. The most common criticism was that the content was choppy, disjointed, and repetitive.
The responses were all self-justifying: It is that way by design because it fits into how Google searches land readers into the middle of pages. (I'm not saying that how Google searches land people into your pages is not a valid concern, I am saying that I have never once heard of a documentation team that designed their content around Google search results.) It isn't choppy, it is streamlined. The "flow" content is wasted effort.
There was no reflection. There was no listening. There was no attempt to address the concerns.
As someone whose sole job is to communicate information, I find the unwillingness to think about criticism about how that information is presented unconscionable. There is always room for improvement. Also, acknowledging an issue doesn't mean you have to do anything about it.
You cannot get too attached to your content in this game. It isn't your baby. It isn't a reflection of your soul. It is information that somebody else created. You are merely the conduit through which it is communicated. Learn to be the best conduit you can.

April 11, 2013

The Importance of Looking Good

One of the things that really bugs me is content that looks bad or amateurish. I don't think that looks can change the essential nature of a piece of content.
Bad content is still bad regardless of how pretty it looks.
Looks do, however, have some bearing on how seriously a piece of content is taken. If a well written or particularly interesting piece is presented in an amateurish or simply ugly way, I may just skip it without bothering to find out if it is good. On the other hand, a well laid out piece of crap may get the chance to waste a few minutes of my time.
The care with which something is presented says a lot about how much the presenter values it or about the skill of the presenter. If something looks thrown together or looks like it was pooped out by some kid with a free Web publishing kit, why should I take it seriously? The person creating it didn't.
This is much worse when it is done by professional companies where there is knowledge and experience. If the documentation is laid out to look like something out of the 90s or has the worst qualities of print with none of the Web goodness, what does that say about the quality of the content?
If the content lacks even the basics for ease of access, why should I trust that I will be rewarded for my struggles to find anything of use?

March 15, 2013

Map Existing Structures Instead of Using the Three Topic Types

It is not that I don't like the kernel that germinated topics. I do like the idea of breaking big ideas into smaller, more manageable, and reusable chunks. It is one of the cornerstones of good writing.
What I don't like is the reduction of all things into three containers that are both too restrictive and not specific enough. Tasks, for example, cannot contain any conceptual information despite the fact that for most complex actions a reader will need some conceptual information to ground the task and explain its purpose in the larger scheme. Also, given the context free nature of topics, a task cannot depend on any other tasks despite the fact that many tasks are meta-tasks where each step in the task is itself another task.
One way to solve the need for adding context to a task is to redefine task to include an overview block that allows for conceptual information. Another way is to define a concept type that, by definition, precedes a task to provide the required context. Both cases create a more specific, and more useful, architecture for writing.
Similarly, to solve the meta-task issue one could define a new task type that allows dependencies on other tasks. This type, called a procedure, doesn't need to have hard dependencies; it could allow for output generation without inclusion of the sub-tasks. However, it would make it harder to ignore the need for the sub-tasks.
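As a rough sketch of what those two ideas might look like as a data model, here is a task type with an overview block and a procedure type whose steps are other tasks. The class and field names are made up for illustration; they are not taken from DITA or any other standard.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A task redefined to carry an overview block for conceptual context."""
    title: str
    overview: str
    steps: list[str]

    def body_lines(self) -> list[str]:
        # The overview comes first, then the numbered steps.
        numbered = [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        return [self.overview, *numbered]

    def render(self) -> str:
        return "\n".join([self.title, *self.body_lines()])


@dataclass
class Procedure:
    """A meta-task: each step is itself another task."""
    title: str
    overview: str
    subtasks: list[Task] = field(default_factory=list)

    def render(self, include_subtasks: bool = True) -> str:
        lines = [self.title, self.overview]
        for i, task in enumerate(self.subtasks, start=1):
            # The dependency is soft: every sub-task is always named,
            # but its body is only inlined when requested.
            lines.append(f"{i}. {task.title}")
            if include_subtasks:
                lines.extend(f"   {line}" for line in task.body_lines())
        return "\n".join(lines)
```

Rendering a procedure with `include_subtasks=False` still lists every sub-task by title, which is the point: the output can be generated without the sub-tasks, but their absence is visible.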
It is not that information architects are not free to make new content types; it is that most don't. They have their three types and try to force everything into them. They ignore the fact that an existing information set will have organically developed topic types that make sense for it. In most instances the argument is that the starting point set was narrative and therefore flawed. It needs to be tamed into the three canonical types for its own good.
The mistake here is that by assuming the new model is better, they lose the native intelligence in the existing structure. They assume it has none and impose their own. Unfortunately, this approach typically results in more work and no net increase in the value of the information.
A far better approach is to analyze the structures used throughout the existing set and attempt to build types, based on the canonical types, into which the old structures map. This requires some upfront work, but makes the move into topics, or modules, smoother. It also retains the knowledge encoded into the existing architecture, which has grown up as a reflection of the needs of the information, the needs of the consumers, and the needs of the authors. Hopefully, the standardization of the existing structures will result in a net increase in value because it will smooth out the bumps in the existing set instead of chopping it up. It will also give the authors more investment in the task of migration and make them more able to spot places where it can be improved.
The other benefit of remembering that the structure of existing sets has value is that it sparks an iterative process. The architecture can be modified as needed. New types can be introduced; old types can be refined or removed.

January 21, 2013

Just a Bunch of Books

One of the things that has been occupying my brain lately is the difference between thinking of a product library as a bunch of books versus thinking of it as a bunch of knowledge modules. In both models a library will have something called books that are used to organize the content because book is such a well understood concept for organizing written text. Readers expect to see a list of books that will contain smaller divisions called chapters. They understand how to navigate inside that abstraction.
The difference between a bunch of books and a bunch of knowledge modules is mostly a production concern. It will have an impact on the reader's experience since the resulting library can be very different, but it is not something a reader will need to have knowledge of to work with the published content. While I am an advocate of the module centric approach because I think it provides more flexibility for the production side and the potential for a richer experience for the reader, I do not believe that a library constructed using a book centric approach cannot have the same richness as a module centric one.
The major difference between the two approaches is how content is chunked into buckets. In a book centric model, the book is the primary chunk-level. Every smaller block used to flesh out a book is written with a view to building a single entity. This may or may not lead to what is currently derided as the narrative style of writing where there is a flow from one small block to the next and each block is contextually dependent on the other small blocks in the book. It does mean that the possible small blocks are predetermined by the predetermined set of books. So library design goes something like:

* What users will our product have?

* What set of high level knowledge will they need to work with the product?

* What set of books should we create to cover the knowledge requirements of the users?

* Create a set of books

* For each book, what specific knowledge will the users need or expect?

* For each book, create the content to satisfy the user.

In a module centric model, the basic chunk level is much smaller. It would typically be at a level where each module contains a digestible block of knowledge that a user will find useful. It is an intentionally vague definition since I believe that for any given project, the module is best determined by the writers working on the project. This model can be used to create things that feel narrative since a module can contain content that bridges between modules or provides context that glues modules together. It doesn't, however, predetermine the set of possible knowledge modules around a set of big chunks. The books can be decided on late in the game as the content builds up and the way to organize them becomes clearer. Library design goes like this:

* Who will use the product?

* What will the users want to do with the product?

* What specific tasks will the users need to do to accomplish their goals?

* What knowledge will the users need to accomplish these tasks?

* What knowledge modules does this map to?

* Create the knowledge modules.

* What modules need to be grouped to illuminate a task?

* Create collections that map into book-like structures.

* What glue is needed to hold the collections together?

* Create the glue.

The books are created after most of the content is written. This gives you some added agility in creating the library because you can modify the organization as new information arrives. It also gives you flexibility in terms of reusing information. There may be modules that go in more than one book and you can simply include the module without cloning or you can clone it if it makes more sense.
A collection-centric model does change the way writers work and does require some extra discipline. Instead of working on a book and not needing to worry about what other writers are doing, writers in this model work on a set of modules and must consider how their work fits into the whole. For example, instead of writing a security guide, a writer might write all of the security related modules. Those modules may all be built up into a security guide, but a few may also be used in other books. Writers need to communicate with each other more to coordinate updates to shared modules.
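To make the include-without-cloning idea concrete, here is a minimal sketch in which books are collections that point at modules by id. Everything in it (the `Module` and `Collection` classes, the `shared_modules` helper, the sample guide titles) is hypothetical and only meant to illustrate the model, not any particular publishing system.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """One digestible block of knowledge."""
    module_id: str
    title: str
    body: str


@dataclass
class Collection:
    """A book-like structure that references modules by id instead of copying them."""
    title: str
    module_ids: list[str] = field(default_factory=list)

    def render(self, library: dict[str, Module]) -> str:
        parts = [self.title]
        for module_id in self.module_ids:
            module = library[module_id]
            parts.append(f"{module.title}: {module.body}")
        return "\n".join(parts)


def shared_modules(collections: list[Collection]) -> dict[str, list[str]]:
    """Map each module id used by more than one collection to those collections' titles."""
    usage: dict[str, list[str]] = {}
    for collection in collections:
        # dict.fromkeys drops repeats within one collection while keeping order.
        for module_id in dict.fromkeys(collection.module_ids):
            usage.setdefault(module_id, []).append(collection.title)
    return {mid: titles for mid, titles in usage.items() if len(titles) > 1}


library = {
    "tls": Module("tls", "Enabling TLS", "Configure certificates first."),
    "users": Module("users", "Managing users", "Create accounts and assign roles."),
}
security_guide = Collection("Security Guide", ["tls", "users"])
admin_guide = Collection("Administration Guide", ["users"])
```

Because both guides reference the same `users` module, an update to it shows up in both books, and `shared_modules` gives writers a list of the modules where they need to coordinate.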

January 3, 2013

Translation or Internationalization

A while back someone passed around a quote from the Firefox team that said something like "We should strive to ensure that every user, no matter what language they speak, can have a consistent experience with our product." The reason for sending it around was to prod the writing team to strive for the same thing.
I generally agree with the sentiment of the statement. Language shouldn't be a barrier to accessing knowledge or a software product. Documentation and user interfaces should be usable regardless of a person's native language. I won't attempt to argue what subset of languages are useful because that is purely a business/resourcing issue.
The statement got me thinking about the differences between making a UI available in multiple languages and making documentation available in multiple languages. I often hear translation used to describe both efforts, but I think that papers over a lot of differences.
UIs are not so much translated as they are internationalized. In general, making a UI available in a second language involves translating all of the labels and warning messages into a second, or third, language. So there is some translation being done, but it is fairly simple stuff. Most of the labels and warning messages are single words or short direct statements. It takes skill to be sure, but it is a pretty straightforward task.
Documentation really does need to be translated. In general, documentation requires more than a simple parsing of labels and direct statements into a second language. Yes, there are plenty of instances where documentation is little more than steps and reference tables, which are just labels and short direct statements, but that is pretty low hanging fruit. I would also argue that even though steps are short and direct, because they are part of a larger whole they really should be treated as more than strings that can be changed without consideration of the context. With documentation, because it is a dense collection of language, you really need to consider the whole body of the work and translate it into a new language. This may mean rewriting parts of the content to be more understandable to speakers of the second language. For example, cultural references always sneak into content because they can help explain complex ideas. There are also structures like glossaries that don't always have a direct mapping into the second language.
I have seen strategies for translation that attempt to streamline the process by treating the text like a collection of strings. It seems to me that while it may grease the wheels a little, it cannot produce truly good quality content. The systems all place a number of restrictions on the content originator to make sure the strings can be easily translated. Often it seems you end up with something that is mediocre in multiple languages, but is done quickly and efficiently.
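A minimal sketch of why UI strings are comparatively simple: each label is a short, self-contained entry in a per-language lookup table. The `CATALOG` data and the `label` function below are invented for illustration and are not any real toolkit's API.

```python
# Hypothetical message catalog: one table of short strings per language.
CATALOG = {
    "en": {"save": "Save", "cancel": "Cancel", "disk_full": "The disk is full."},
    "de": {"save": "Speichern", "cancel": "Abbrechen"},
}


def label(key: str, locale: str, default_locale: str = "en") -> str:
    """Look up a UI string, falling back to the default language when a translation is missing."""
    translated = CATALOG.get(locale, {})
    if key in translated:
        return translated[key]
    return CATALOG[default_locale][key]
```

Each entry can be translated on its own because it carries almost no surrounding context, which is exactly what does not hold for a page of documentation.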
Wouldn't it be better to create great content in one language and then, if required, have full translations done? It may not be as efficient, but it will probably make for happier readers.