
In Part I, I described XBRL and a bit of its background. In Part II, I'll talk more about why it is useful, and about one major caveat to keep in mind.

As mentioned in Part I, XBRL has now been mandated for use by companies filing disclosures with the SEC, starting with the biggest companies and eventually requiring compliance from all submitters. Clearly the SEC believes that the XBRL filings will be superior to the previous ASCII and HTML text submissions, and I think they are right. While I don't have a lot of visibility into the SEC's thinking process, I can speculate on the advantages XBRL brings to them:

  • Validation - Built into XBRL is a notion of calculation linkbases, which describe how the values of certain taxonomy concepts must balance. This is as much for the submitter as for the SEC: it can easily eliminate bookkeeping or data entry mistakes, since it will catch errors such as two values not summing to a third value, or two values not summing to zero. These checks are not intended to catch a malicious document submitter, who could easily manipulate multiple values to avoid the calculation checks, but they should help cut down on errors such as dropping a zero or missing a negative sign. More of these validation capabilities are being added over time.
  • Reporting - Because XBRL documents are easily processed by computers, and must conform to the US GAAP Taxonomy, reporting is greatly simplified. You can take the underlying data encoded in the document and present it in different formats, compare and contrast different companies using different criteria, and even write queries against the data like you would in a standard database. To see some of this in action, check out the SEC Interactive Financial Report Viewer, which allows you to drill down into XBRL submissions for a glimpse of how having data in a consistent format can unlock serious reporting capabilities. More importantly, the old EDGAR system is being replaced with the new IDEA system, which is built around the notion of businesses providing data, which can be queried and analyzed, instead of providing documents, which can only be searched.
  • Communication - Similar to reporting, the ability to communicate financial reporting data to interested parties is greatly simplified due to the standard format of XBRL and the UGT. Right now, I can go to the SEC website and pull down the 10-K for Microsoft Corp. and I get a huge HTML or PDF file with dozens of paragraphs and numbers. This information is in a relatively standard format, but even so, it is not an ideal communications format for anyone who cares about particular aspects of the company's business, or desires to know particular pieces of information. SEC forms are great for the SEC's purposes, but since the data is stuck in document form, it is difficult to make it serve any other purpose. Once the data has been converted to XBRL, any tool that supports the format can use it for a novel application. Mark Cuban hinted at this in his recent blog post, in which he envisions XBRL as the solution to the problem of regulating new financial instruments, and as a way for government watchdogs to more easily monitor government spending.
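
To make the Validation point above a little more concrete, here is a minimal Python sketch of the kind of check a calculation linkbase enables. It is not the XBRL linkbase syntax or any real processor's API: the CALC_RULES table, the check_calculations function, and the concept names are all assumptions made up for illustration. Each rule says that a total should equal the weighted sum of its children.

```python
from decimal import Decimal

# Hypothetical rule table: total concept -> list of (child concept, weight).
# A weight of +1 adds the child to the total; -1 would subtract it.
CALC_RULES = {
    "Assets": [("AssetsCurrent", Decimal(1)), ("AssetsNoncurrent", Decimal(1))],
}

def check_calculations(facts, rules=CALC_RULES):
    """Return (total_concept, reported, computed) for every rule that fails."""
    failures = []
    for total, children in rules.items():
        # Skip rules whose facts were not all reported.
        if total not in facts or any(c not in facts for c, _ in children):
            continue
        computed = sum((facts[c] * w for c, w in children), Decimal(0))
        if computed != facts[total]:
            failures.append((total, facts[total], computed))
    return failures

good = {
    "Assets": Decimal(150),
    "AssetsCurrent": Decimal(100),
    "AssetsNoncurrent": Decimal(50),
}
bad = dict(good, AssetsCurrent=Decimal(10))  # simulates a dropped zero

print(check_calculations(good))
print(check_calculations(bad))
```

The first call reports no failures, while the second flags the "Assets" total because the children no longer add up to it. As noted above, a check like this catches slips, not deliberate manipulation: a submitter who changes the total and a child together would pass.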

The move to XBRL is going to be a major step forward; that much should be clear. However, one thing I have noted on occasion is that people who first encounter XBRL look at it and see many of the same goals and capabilities as the Semantic Web, particularly RDF/XML, RDFS, and OWL. But XBRL is a syntactic standard, not a semantic one. It defines how things should be said, not what they mean. Looking at a taxonomy concept, it is easy to see a term called "Cash" and assume that anywhere you see a taxonomy concept called "Cash" in any taxonomy, it must mean the same thing. However, not only is there no guarantee that all "Cash" terms are equivalently defined by the users, but information about what defines "Cash" is only lightly covered in XBRL, via the reference info that names an external resource where a concept definition can be found.

We can hopefully assume that as long as different instance documents use the same element from the same schema, they have the same intent, but across taxonomies, all bets are off. Unlike with RDF, where particular concepts are defined as resources and can be referred to by any other resource, XBRL is rooted in its taxonomies and not in a set of core concepts. There are some efforts underway to bridge this gap and create more formal definitions of concepts, but as you might imagine, defining concepts in a consistent way across accounting systems is a daunting task. Someday though, perhaps individual taxonomies will begin to reference certain core, shared concept ontologies.
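
One rough way to picture the "Cash" problem is to treat each concept as a namespace-qualified name, the way XML does. The sketch below is only an illustration: the QName class and both example.com namespace URIs are hypothetical, not taken from any real taxonomy.

```python
from typing import NamedTuple

class QName(NamedTuple):
    """A namespace-qualified concept name (hypothetical helper)."""
    namespace: str
    local_name: str

# Two made-up taxonomies that each define a concept called "Cash".
cash_a = QName("http://example.com/taxonomy-a", "Cash")
cash_b = QName("http://example.com/taxonomy-b", "Cash")

same_label = cash_a.local_name == cash_b.local_name
same_concept = cash_a == cash_b

print(same_label, same_concept)
```

The labels match, but a tool has no syntactic basis for treating the two as the same concept. Deciding whether they mean the same thing takes information from outside the documents, which is exactly the gap described above.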

I hope you found this brief introduction to XBRL clear and useful. Questions? Clarifications? Criticisms? Looking for XBRL development? Drop us a line!

Technology Focus: XBRL

Distance Software takes on a very wide variety of projects. That's part of the fun of being a consulting company, and a major reason why I decided to start a business. In our work, we often become very knowledgeable about certain applications, tools, libraries, and specifications. As many of these are niche items that do not receive the same level of discussion and review that bigger technologies enjoy, I am starting up a new feature here on Shouting Distance called "Technology Focus" to help broaden the coverage of these items.

The technology in focus today is XBRL, the eXtensible Business Reporting Language. XBRL is, at its core, a set of specifications for the encoding of enterprise data in XML format. XBRL can be used to describe two main things:

  1. Concept Taxonomies - An encoding of a set of business concepts along with significant additional metadata for describing the presentation, validation, and source of the concepts. For instance, the taxonomy might contain a concept like "Cash and Cash Equivalents" with metadata indicating that it is an instantaneous measurement, i.e. it represents a value at a point in time, not a change over a period of time, and that it must be represented by a number.
  2. Instance Documents - An encoding of a certain set of data for a particular business that refers to a particular taxonomy. For instance, the instance document using the taxonomy described above might have an entry for the "Cash and Cash Equivalents" concept with a value of 100, with unit information indicating that the number is in millions of USD, and context information indicating that the number represents the period ending March 31, 2008. The document is an "instantiation" of the concepts described in the taxonomy, hence the name Instance Document.
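
The relationship between the two can be sketched as a simplified in-memory model. This is not the XBRL XML syntax: the Concept and Fact classes, their field names, and the way the "millions" scale is stored as a power of ten are all assumptions chosen for illustration, using the example values from the list above.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

@dataclass(frozen=True)
class Concept:
    """Taxonomy side: what can be reported, plus metadata about it."""
    name: str
    period_type: str  # "instant" (point in time) or "duration" (over a period)
    numeric: bool

@dataclass(frozen=True)
class Fact:
    """Instance side: one reported value for one concept."""
    concept: Concept
    value: Decimal
    unit: str
    scale: int     # power of ten applied to value; 6 means "in millions"
    instant: date  # the context: the point in time the value describes

    def scaled_value(self) -> Decimal:
        return self.value * Decimal(10) ** self.scale

cash = Concept("CashAndCashEquivalents", period_type="instant", numeric=True)
fact = Fact(cash, Decimal(100), unit="USD", scale=6, instant=date(2008, 3, 31))

print(fact.concept.name, fact.scaled_value(), fact.unit, fact.instant)
```

The taxonomy owns the definition, and the instance document only points at it and supplies a value, a unit, and a context. Many instance documents can share one Concept in this way.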

XBRL got a major boost recently when the SEC mandated that all companies with a market capitalization of more than $5B would be required to file their financial disclosures using XBRL starting in 2009, with the remaining companies required to comply by 2011 (See this Gartner article for more information). The SEC spent a long time creating their own taxonomy explicitly for this purpose, the US GAAP Taxonomy (UGT). The SEC filing system has been updated to accept XBRL instance document submissions that are built using the UGT, and under a voluntary XBRL program, filers could attach an additional XBRL document (not considered part of the "real" filing) to their standard filing to test the new system. However, starting very shortly, XBRL submissions will be the standard.

The XBRL specification itself is fairly complex. While it makes use of standard XML concepts, such as that each taxonomy is a valid XML Schema Document, it also uses other standards such as XLink and XPointer to allow a taxonomy to be dynamically composed. Using an initial taxonomy document as a jumping off point, a valid XBRL schema processor must follow the various linkages and pointers to grab all of the directly and indirectly referenced documents and concepts to produce what is referred to as the Discoverable Taxonomy Set (DTS). The specification and the rules for generating the DTS can be found at xbrl.org, the international XBRL organization.
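
At its heart, DTS discovery is a graph traversal: start from one document, follow every reference, and stop when nothing new turns up. The sketch below shows only that idea. The REFERENCES table and its file names are made up, and the discover_dts function is a hypothetical helper. A real processor would parse each document and apply the discovery rules in the specification instead of reading a hard-coded dictionary.

```python
from collections import deque

# Hypothetical reference graph: document -> documents it imports or links to.
REFERENCES = {
    "company-2008.xsd": ["us-gaap.xsd", "company-2008_cal.xml"],
    "us-gaap.xsd": ["us-types.xsd"],
    "company-2008_cal.xml": ["us-gaap.xsd"],
    "us-types.xsd": [],
}

def discover_dts(entry_point, get_references):
    """Breadth-first walk collecting every directly or indirectly referenced document."""
    seen = {entry_point}
    queue = deque([entry_point])
    while queue:
        doc = queue.popleft()
        for ref in get_references(doc):
            if ref not in seen:  # guards against cycles and repeated references
                seen.add(ref)
                queue.append(ref)
    return seen

dts = discover_dts("company-2008.xsd", lambda d: REFERENCES.get(d, []))
print(sorted(dts))
```

Tracking the set of documents already seen matters because taxonomies commonly reference each other, so a naive recursive walk could loop forever.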

This can all seem overly complex at first glance, but ultimately, it needed to equal the complexity of the representation problem, which is only all possible financial data. Consider the following challenges. XBRL needs to:

  1. Allow for the representation of data across many types of reports, from simple balance sheets and cash flow statements to the risk/return information in a mutual fund prospectus.
  2. Support taxonomies that are generally useful, such that a large number of companies can all report on a similar set of concepts in a consistent way, but at the same time, be flexible enough to allow an individual company to represent additional data, from concepts that are generally acknowledged but optional down to concepts that are totally idiosyncratic to a single company's operations.
  3. Work for financial reporting concepts around the world. It can't just work for GAAP-based accounting, but it must work for IFRS reporting, the British system, the Australian system, the Dutch system, the Japanese system, and so on. While these different accounting systems share many common features, they each have a unique set of standards and concepts that must be supported.
  4. Be able to adapt as the standards change. Not only must it function across all financial reporting systems, but those standards change over time, often extremely rapidly.

As much as I appreciate simple, elegant standards, the world of financial reporting is neither simple nor elegant and it requires a solution that can meet these and many other challenges.

Tomorrow: Part II - What XBRL can do, and a few Caveats