How to cite this paper

Paoli, Jean. “We Created Document Dysfunction: It Is Time to Fix It.” Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 - August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). https://doi.org/10.4242/BalisageVol23.Paoli01.

Balisage: The Markup Conference 2019
July 30 - August 2, 2019

Balisage Paper: We Created Document Dysfunction

It Is Time to Fix It

Jean Paoli

CEO

Docugami Inc.

Jean Paoli is the Founder of Docugami Inc., a startup that uses AI to transform the unique document business processes of individual companies, making frontline users more efficient while giving COOs better compliance and insights – inspired by his deep belief that openness and interoperability raise all boats.

He was formerly President of Microsoft Open Technologies, Inc., and one of the co-creators of the XML 1.0 standard with the World Wide Web Consortium (W3C). Throughout his career, Jean has worked in startups: before Microsoft, with Inria, the renowned French research Labs (Gipsi S.A. and Grif S.A.); and within Microsoft creating four new startups: XML, InfoPath, opening the Office formats and MS OpenTech (Microsoft’s open source subsidiary). The startups he built created breakthrough platform technologies used today by millions.

He is the recipient of multiple industry awards for his work on XML, semi-structured data, the convergence of documents and data and openness at large.

In addition to core technical design, Jean takes great care in building healthy ecosystems at worldwide scale. He is credited as one of the key leaders responsible, under the guidance of the CEO, for fundamentally shifting Microsoft’s strategy to embrace and love open source.

Copyright © Docugami Inc.

Abstract

Some of us building software need to take a hard look in the mirror. For years, we have promised that technology would solve the world’s information management problems, but 85% of business information is still dark data, with potentially useful insights lost in a rising tide of disconnected documents, emails, Slack conversations, voice-to-text messages, etc. We need an effective approach to documents and want to start a public conversation about these issues. We believe that effective solutions should be based on: Declarative Markup; AI sympathetic to Small Data; focus on company-specific documents; applying AI to documents as a whole; and solutions that do not disrupt existing workflows or require massive investment. The future is not about AI making human beings obsolete; the future is about AI making human beings and companies more productive, effective, and creative.

Table of Contents

Introduction
What does document dysfunction look like?
Five principles
Starting a public conversation

Introduction

It is time for some of us building software to take a hard look in the mirror.

For years, we promised technology would solve the world’s information management problems, but 85% of business information is still dark data, with potentially useful insights lost in a rising tide of disconnected documents, emails, Slack conversations, voice-to-text messages, and myriad other forms.

As the digital transformation accelerates, the sheer volume and opacity of documents make it harder to ensure quality, consistency, accountability, and regulatory compliance.

We call this problem document dysfunction, and it affects nearly every type of organization, from finance to health care to real estate to government and more, impacting millions of citizens, customers and companies.

What does document dysfunction look like?

  • It is a bank with thousands of loan documents, but zero visibility into the terms and conditions that impact the value of those loans.

  • A government agency with hundreds of project agreements that need to be audited and updated due to a regulatory change.

  • A commercial real estate firm with hundreds of contracts, but no insight into millions of dollars in underlying obligations.

  • A health care system with dozens of doctors spending pajama time every night recording and writing patient notes in a laborious and disconnected process.

Now multiply those cases by hundreds of thousands of companies and organizations around the world. That is document dysfunction.

Five principles

We see five principles that can lead us to more effective solutions:

  • First, we need to bring together multiple scientific domains in innovative and powerful ways, including diverse AI approaches and Declarative Markup (https://markupdeclaration.org/).

  • Second, instead of “Big Data,” we need AI that understands Small Data: the unique sets of business documents distinctive to individual companies.

  • Third, this focus on company-specific Small Data will enable us to maintain the privacy and security of each individual customer.

  • Fourth, past attempts to use AI to solve business data and document problems have failed because they focused on the wrong altitude: helping to complete words or sentences instead of applying AI to the document as a whole.

  • And fifth, to be truly effective, we need solutions that do not disrupt existing workflows or require massive investments in staff training, IT development, or armies of consultants.

The future is not about AI making human beings obsolete. The future is about AI making human beings and companies more productive, effective, and creative.

Starting a public conversation

Our goal is to start a public conversation about these issues. We published an Open Letter about Document Dysfunction at https://rebrand.ly/DocumentDysfunction. We believe that the problem runs deep, but at the same time we see an opportunity for the industry at large and envision how a set of new technologies can impact the core of the document industry.

Tell us your document dysfunction horror stories, or your dream for how technology could give you greater efficiency and control. Or maybe you completely disagree and have never met a dysfunctional document in your life. Or maybe you think our principles are all wrong; we would still like to hear from you!

Author's keywords for this paper:
Document Dysfunction; AI; Small Data; Markup Declaration