Where is the wisdom? Lost in the knowledge.
Where is the knowledge? Lost in the information.
-- T.S. Eliot
Where is the information? Lost in the data.
Where is the data? Lost in the #@$$%?!& database.
-- Joe Celko, database consultant/writer
As these authors imply, there are many kinds of information and many ways to lose the use of it. Whatever the reason, if you can't locate information when you need it, you have lost the benefits of having it. Access to information is critical to the success of any enterprise. Its loss can result in confusion, frustration, and inefficiency, to say nothing of direct financial impact.
Any significant enterprise can be thought of as a
system:
an assemblage of inter-related elements comprising a unified whole.
In computer-based systems,
large amounts of detailed information are collected
about these elements.
However, even well-managed systems may suffer
from "indirect" forms of information loss:
Model-based Documentation (MBD) addresses these problems (among others)
by providing an integrated approach
to the semi-automated generation of detailed system documentation.
A consistent "model" (e.g., a set of
entities and relationships,
based on the system being documented)
assists and guides the process:
Let's look at how this is accomplished...
Systems are composed of entities and relationships.
Some entities are tangible:
buildings, computers, employees.
Others are more abstract:
bug reports, meetings, programs, releases, telephone numbers.
A system's capabilities and internal structure
are defined by its entities and the relationships between them.
Because most enterprises store their operational data on computers,
a large amount of detailed information is likely to be stored online.
Unfortunately, the information is frequently
balkanized by technical or organizational boundaries,
format incompatibilities, etc.
Even without these impediments,
the complexity of production systems can be overwhelming.
Essentially, all models are wrong, but some are useful.
-- George E. P. Box
MBD addresses these problems by
modeling key entities and relationships.
This mapping, like those used in
object-oriented programming and
computer-based ontologies,
takes advantage of the fact
that large numbers of instances
can be described by small numbers of classes.
For example, we can define classes for methods, data structures,
and the relationships between them.
We can then track (and report on) arbitrary numbers
of data structures, method calls, etc.
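As a minimal sketch of this idea (not a prescribed MBD schema), a handful of classes can describe any number of instances. The names `Entity`, `Relationship`, and `Model`, and the sample methods and data structure, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str  # class of entity, e.g. "method" or "data_structure"
    name: str  # name of this particular instance


@dataclass(frozen=True)
class Relationship:
    kind: str  # class of relationship, e.g. "calls" or "reads"
    source: Entity
    target: Entity


class Model:
    """A small in-memory catalog of entities and relationships."""

    def __init__(self):
        self.entities = set()
        self.relationships = []

    def relate(self, kind, source, target):
        # Register both endpoints, then record the relationship itself.
        self.entities.update((source, target))
        self.relationships.append(Relationship(kind, source, target))

    def count_by_kind(self):
        # Report how many relationships of each class have been tracked.
        return Counter(rel.kind for rel in self.relationships)


# Hypothetical sample data: two methods sharing one data structure.
model = Model()
parse = Entity("method", "parse")
emit = Entity("method", "emit")
tree = Entity("data_structure", "SyntaxTree")
model.relate("calls", parse, emit)
model.relate("writes", parse, tree)
model.relate("reads", emit, tree)
counts = model.count_by_kind()
```

Two entity classes and three relationship classes are enough here; adding more methods or data structures changes only the data, not the classes.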
By cataloging high-level information on available data sources,
MBD makes it easier to identify and locate needed data.
We can start to think about questions that this wealth
of detailed information could be used to answer.
As information is extracted and used,
the model will increase in detail,
aiding the investigation of more complex questions.
MBD normally begins with the construction of a (rough) model
of the system in question.
This should catalog and describe the system's major entities
(i.e., components and subsystems)
and relationships (e.g., control and data flows).
Note that this isn't a physical model (as in model airplanes),
let alone a simulation.
Instead, it follows this definition:
A schematic description of a system ...
that accounts for its known or inferred properties
and may be used for further study of its characteristics ...
-- The American Heritage Dictionary of the English Language,
Fourth Edition
Some parts of the model will become visible in the documentation,
providing a consistent structure, convenient navigation, etc.
Other parts will remain hidden from the user,
but will motivate and guide the development process,
allow disparate data sources to be used together, etc.
In either case, the model will improve the utility and convenience
of the documentation.
Meanwhile, the presence of the model will help
to motivate and support analytical thinking.
A software development manager might, for example,
ask for comparative frequencies of pre- and post-release bugs.
Alternatively, it might be useful to rank software modules
by bug frequency, check-in activity, and/or size.
A simple chart or table can be very illuminating,
if it shows the right information in the right way!
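The ranking described above could be prototyped along the following lines. This is only a sketch: the `ModuleStats` fields, the helper names, and the figures are invented for illustration, and real numbers would come from a bug tracker and a version control system.

```python
from dataclasses import dataclass


@dataclass
class ModuleStats:
    name: str
    bugs: int      # number of bug reports filed against the module
    checkins: int  # number of check-ins touching the module
    lines: int     # size of the module in lines of code


def rank_modules(stats, key):
    """Return the modules sorted by the given attribute, largest first."""
    return sorted(stats, key=lambda m: getattr(m, key), reverse=True)


def format_table(stats):
    """Render the statistics as a simple fixed-width text table."""
    rows = [f"{'module':<10}{'bugs':>6}{'checkins':>10}{'lines':>8}"]
    for m in stats:
        rows.append(f"{m.name:<10}{m.bugs:>6}{m.checkins:>10}{m.lines:>8}")
    return "\n".join(rows)


# Hypothetical figures, for illustration only.
stats = [
    ModuleStats("parser", bugs=12, checkins=40, lines=1800),
    ModuleStats("lexer", bugs=3, checkins=15, lines=600),
    ModuleStats("codegen", bugs=7, checkins=55, lines=2400),
]
by_bugs = rank_modules(stats, "bugs")
by_checkins = rank_modules(stats, "checkins")
table = format_table(by_bugs)
```

Printing `table` gives a one-screen view of the modules ordered by bug count; passing `"checkins"` or `"lines"` instead yields the other rankings.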
Of course, getting to the right view
of the right data can be challenging.
Consequently, MBD efforts
tend to be exploratory, collaborative, and iterative:
Before a report can be generated,
the necessary data must be extracted from the correct data source.
Some amount of analysis may also be needed.
In some cases, everything may be highly organized and readily accessible.
In other cases, substantial effort may be required
to extract the needed data.
It's very useful to collect all of the extracted data
into an MBD-specific
data warehouse (roughly, database).
This allows consistent access methods to be used,
reduces duplication of effort,
keeps the MBD project's storage
under its own administrative control, etc.
Views of the data can be generated and presented, of course,
before a data warehouse is in place.
In fact, this is a common approach for prototyping reports, etc.
Folding interesting data into the warehouse, however,
may expedite future inquiries.
It also opens the door to the use of existing analysis programs.
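One lightweight way to try this out is to fold extracted records into a single SQLite database using Python's standard `sqlite3` module. The sketch below assumes a deliberately simple, hypothetical schema (one `entity` table) and made-up source names; a real warehouse would need tables for relationships and attributes as well.

```python
import sqlite3


def build_warehouse(conn):
    # One table for all entities, whichever data source they came from.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS entity (
            id     INTEGER PRIMARY KEY,
            kind   TEXT NOT NULL,
            name   TEXT NOT NULL,
            source TEXT NOT NULL,
            UNIQUE (kind, name)
        )
        """
    )


def load(conn, source, rows):
    # Insert (kind, name) pairs; entities already in the warehouse are skipped.
    conn.executemany(
        "INSERT OR IGNORE INTO entity (kind, name, source) VALUES (?, ?, ?)",
        [(kind, name, source) for kind, name in rows],
    )
    conn.commit()


# Hypothetical extracts from two separate data sources.
conn = sqlite3.connect(":memory:")
build_warehouse(conn)
load(conn, "bug_tracker", [("bug_report", "BUG-1"), ("bug_report", "BUG-2")])
load(conn, "version_control", [("module", "parser"), ("bug_report", "BUG-1")])

# A consistent access method: count the entities of each kind.
counts = dict(conn.execute("SELECT kind, COUNT(*) FROM entity GROUP BY kind"))
```

Because both extracts land in the same table, the second mention of `BUG-1` is recognized as a duplicate rather than stored twice, and any later report can query one place instead of two.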
Once the data is in a consistent and easily accessible form,
mechanized analysis can be employed to find "implicit" information:
correlations, patterns, hidden structures, etc.
Some of this analysis will require special-purpose programming,
but many tools and libraries are available to reduce this burden.
Let's consider some examples...
Business Intelligence (BI) techniques (e.g.,
data mining,
OLAP)
are commonly used in large enterprises
to mechanically extract information and even knowledge from data.
With the advent of
Open Source BI tools,
even small enterprises can now take advantage of these techniques.
Documentation generators can provide detailed documentation for software projects,
reducing development time, errors, and inefficiency.
Documentation frameworks can also help to organize and
increase the consistency of programmer-generated documentation.
Extremely powerful tools are now available
for mathematical analysis and display,
using techniques from
graph theory,
operations research,
statistics,
symbolic mathematics, etc.
Most enterprises have large bodies of existing documentation
and other "unstructured information".
Many tools exist to help with this material,
including document indexing and digital library suites.
Presentation spans a wide range,
from specialized "one-shot" reports (possibly created interactively)
to integrated, comprehensive sets of detailed documentation
(e.g., for a software project's code base, requirements, tests, etc.).
Throughout this range, the user's "mental model" is a key consideration.
In the case of a custom report,
the user's background and interests are clear.
So, the report can be "tuned" to meet known requirements.
Detailed documentation, intended for arbitrary users,
has no such luxury.
It must be organized and presented so that
any user can easily assimilate it.
MBD addresses this problem by presenting the user
with a consistent (though possibly simplified) model
of the system in question.
As the user navigates through the documentation,
s/he will learn about the system's basic organization.
A typical MBD-based web site will have several kinds of web pages.
Some, such as tutorials and class descriptions,
will be manually edited.
Others, such as entity descriptions and indexes,
will be mechanically generated.
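To make the "mechanically generated" half concrete, here is a rough sketch of how entity description pages and an index might be rendered from model data. The function names, the page layout, and the `<name>.html` link convention are assumptions made for this example, not part of any existing MBD tool.

```python
from html import escape


def entity_page(kind, name, related):
    """Render one entity description page as an HTML fragment.

    `related` is a list of (relationship, target_name) pairs.
    """
    items = "\n".join(
        f'  <li>{escape(rel)}: '
        f'<a href="{escape(target)}.html">{escape(target)}</a></li>'
        for rel, target in related
    )
    return f"<h1>{escape(kind)}: {escape(name)}</h1>\n<ul>\n{items}\n</ul>\n"


def index_page(names):
    """Render an index that links to every entity page."""
    links = "\n".join(
        f'  <li><a href="{escape(n)}.html">{escape(n)}</a></li>'
        for n in sorted(names)
    )
    return f"<h1>Index</h1>\n<ul>\n{links}\n</ul>\n"


# Hypothetical model data for a single method.
page = entity_page(
    "method", "parse", [("calls", "emit"), ("writes", "SyntaxTree")]
)
index = index_page(["parse", "emit", "SyntaxTree"])
```

Since every page is produced by the same two functions, the generated pages share a single structure, which is what lets the reader carry one mental model from page to page.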
MBD bridges the gap
between traditional documentation and report generation techniques,
leveraging the strengths of each.
It works well for generating timely, integrated,
and detailed documentation for large systems.
At the same time,
it facilitates the rapid prototyping
of specialized documents and reports.
Like the term AJAX,
MBD describes an existing (though not commonplace) way
of using existing technologies.
It is my hope that,
by coining the term and pointing out the utility of this approach,
I can convince others to think about it,
and perhaps to experiment with it.
Extraction and analysis can fill a data warehouse,
but the data doesn't become useful
until it becomes part of a human's "mental model".
Modeling is fundamental to both intelligence and communication,
so thinking about the underlying model(s)
is an inseparable part of good documentation design.
MBD encourages the developer to keep these models in mind,
at all stages of the development process.