Kennedy, Dianne. “Case Study: Applying an Agile Development Methodology to XML Schema Construction.” Presented at Balisage: The Markup Conference 2016, Washington, DC, August 2 - 5, 2016. In Proceedings of Balisage: The Markup Conference 2016. Balisage Series on Markup Technologies, vol. 17 (2016). https://doi.org/10.4242/BalisageVol17.Kennedy01.
Balisage: The Markup Conference 2016 August 2 - 5, 2016
Balisage Paper: Case Study: Applying an Agile Development Methodology to XML Schema Construction
Dianne Kennedy is an independent consultant and serves as the XML Evangelist Emeritus
for Idealliance, an international association providing leadership in publishing and
information technologies. Ms. Kennedy facilitates the development of XML specifications
and best practices on behalf of Idealliance to support platform agnostic, cross-media
publishing. She currently serves as technical editor for the Idealliance PRISM 3.0
Specifications, PRISM Source Vocabulary Specifications, MailXML Specification and
the Print Quality eXchange Message Specification. Kennedy serves on the Board of Directors
for the College of Graphic Communications at Illinois State University.
To assure brand integrity, brands such as Coca-Cola and Procter & Gamble, assisted
by third-party brand quality experts, receive, score and track the quality of their
print suppliers over time. Currently this is difficult and expensive because printers
use many different measurement tools and report print quality using a wide variety
of proprietary formats that cannot be directly utilized by brand scoring and tracking
systems. In 2015 Idealliance members launched an effort to develop a standard XML-based
print quality exchange message. This specification, known as Print Quality eXchange
(PQX), was developed by applying agile software development techniques to the construction
of the PQX XSD. This case study highlights how agile software development principles
can be applied to the construction of an XML schema.
The Basics: Brands, Brand Identity and the Brand Print Quality Ecosystem
A Brand is a name, term, design, symbol or other feature that distinguishes one seller's
product from those of others. Brands you might recognize are Tiffany, Coca-Cola,
Ford Motor Company and P&G. For each brand, the reproduction of its name and symbol
or logo is of utmost importance as the printed reproduction is a measure of brand
quality. While brand names and logos are reproduced in many formats, print reproduction
is high on the list. From billboards to packaging to advertising to the print representation
of a brand on a targeted mailing piece, the quality of each printed item is a reflection
on the quality of the brand itself.
Since brands spend billions on print each year, it should come as no surprise that
brands wish to automate tracking the print quality of their suppliers on an ongoing
basis. Today this is difficult and expensive because printers use many different
quality measurement tools. As a result, they send print quality reports to the brand
in a wide variety of formats. And since a standard industry reporting format does
not exist, quality data cannot be directly utilized or imported into the brand's quality
tracking systems. As a result, when brands receive print quality data from their printers,
they spend a great deal of time and money to have that data converted or keyed into
their quality management systems. Measurement tools from different manufacturers
do not interconnect and do not report print quality in the same format. In some cases,
printers must use multiple sets of tools to service different brands, which have different
specifications and requirements, reducing printers' efficiency and increasing their costs.
Note
Brands judge print quality based on color fidelity (the accuracy of the color of a
printed image based on spectral measurements), registration (the accuracy of the alignment
of inks on a printed image) and absence of defects (lines, smudges, streaks, etc.).
Since the methods to score quality vary significantly by printing type/sector
and vary from brand to brand, those developing PQX were highly challenged to create
a model that would work equally well for those printing packages, those printing signage
and those printing marketing materials.
Adopting an XML-based standard print quality reporting message would be beneficial
for both the printer and for the brand. Having a single message format would enable
supply chain automation because new tools or plug-ins for existing tools could be
developed to streamline the print quality reporting and analysis process. In 2015
Idealliance began working with its printer and brand members, along with the larger
ISO Graphic Technology community, to develop an XML print-quality exchange specification
to serve as the standard message format for printers to report print quality to Brands
(print buyers).
The print quality ecosystem is made up of print service providers (Printers), print
buyers (Brands) and third-party print quality service providers employed by the brands
to act as intermediaries to manage print quality on behalf of a brand. Because the
print quality requirements and reporting criteria are quite varied, we believe that
two standardized messages are required. The first is a message from the brand to
the printer that defines the requirements for the printer to conduct press runs to
report quality. This message is the Print Requirements eXchange message or PRX.
The second standard message goes from the printer back to the brand. This message,
the Print Quality eXchange, or PQX, delivers raw print quality data according to the
parameters set by the first message.
Note
PQX is an Idealliance Specification. In the fall of 2016, it will be fast tracked
into ISO TC130 WG2 to become ISO 20616 Part 2.
The Basics: Agile Software Development
Agile software development is a software development methodology in which the requirements
and solutions evolve through collaboration between development partners. Typically
agile software development methodology involves incremental, evolutionary iterations
of code.
Iterations between versions of the software often take place over a short time frame.
A highly engaged cross-functional team participates in each iteration. Typically
the tasks that occur within an iteration include:
Requirement Analysis
Code Development
Testing
Feedback
The goal of agile development is to have a new version of software produced for each
iteration. Clearly, to enable a short feedback loop and adaptation of software, it is
critical to establish an efficient communication / collaboration mechanism. Following
an agile software development methodology means that the software can evolve and yet
be delivered rapidly with consensus of all stakeholders.
Why Use Agile Development for PQX?
Several factors pushed the working group toward employing an agile software development
methodology to develop the XML schema. First was the urgency to develop a workable
standard for the industry. Today, with the decreasing volumes of printing being purchased,
both printers and print buyers are looking to automation to improve the bottom line.
Second was the urgency to move from the development of the XML message to immediate
implementation of that message. In the case of PQX, all parties were eager for immediate
implementation. Technology stakeholders did not see new products being developed
to support PQX-based communication, but rather believed that PQX would be implemented
by a set of transforms from their proprietary reporting formats into the standard
PQX format. So our project was highly influenced by the technology vendors' programmer
staffs, who are accustomed to using agile techniques as they develop their own products.
They insisted that iterations and testing of transforms to PQX occur as the message
was developed to assure that all current reporting features in their commercial products
be included in PQX and to assure that writing transforms required to implement PQX
was as straightforward as possible.
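The transform-based implementation path that the vendors anticipated can be sketched roughly as follows. This is a minimal illustration in Python, not part of the PQX specification: the proprietary key=value report layout, the namespace URI and the element names PrintedFor, PrintedBy and LotId are all hypothetical stand-ins chosen for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the real one is defined by the PQX specification.
PQX_NS = "urn:example:pqx"
ET.register_namespace("pqx", PQX_NS)

# Hypothetical proprietary report: one key=value pair per line, in any order.
PROPRIETARY_REPORT = """\
lot=701-123-3331
printer=Printer ABC
brand=Jupiter Candy, Inc.
"""

# (proprietary key, hypothetical PQX element name), listed in the fixed
# order in which the elements are written to the output message.
FIELD_MAP = [
    ("brand", "PrintedFor"),
    ("printer", "PrintedBy"),
    ("lot", "LotId"),
]


def parse_proprietary(text):
    """Read key=value lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


def to_pqx(text):
    """Transform the proprietary report into a PQX-like element tree."""
    fields = parse_proprietary(text)
    root = ET.Element(f"{{{PQX_NS}}}PQX")
    for key, element_name in FIELD_MAP:
        child = ET.SubElement(root, f"{{{PQX_NS}}}{element_name}")
        child.text = fields[key]
    return root


message = to_pqx(PROPRIETARY_REPORT)
print(ET.tostring(message, encoding="unicode"))
```

Because the mapping table fixes the output order, the transform produces the same element sequence regardless of how the proprietary source orders its fields, which is the kind of property that could be re-tested after each schema iteration.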
Note
Another factor that made our iterative approach important was that we were bringing
together experts from different printing communities. While to outsiders, all types
of printing may appear to be quite similar, in reality this is far from the truth.
Printing cartons is different from printing on items such as a can or a bottle. And
printing labels to be adhered to a can or bottle is different from directly printing
on those items. In each case the sampling techniques, measurement techniques and observational
techniques vary. We found that the participants in any one development session
brought their print-sector bias to the model. Only by showing the model and samples
following each session could we solicit input from other sectors to assure that the
model would work across all print industry sectors.
Step 1: Developing a Terms of Reference
The first task of the PQX Working Group was to define what we call the Terms of Reference.
Basically, the ToR is a statement of scope and mission of the working group. According
to Wikipedia, Terms of Reference is defined as:
Terms of reference describe the purpose and structure of a project, committee, meeting,
negotiation, or any similar collections of people who have agreed to work together
to accomplish a shared goal. The terms of reference of a project are often referred
to as the project charter.
It took several working group calls to develop and refine the Terms of Reference for
PQX. We wanted to develop a concise statement for the reference of the working group
and to communicate the intent of the project to the outside world. Over the past year,
we have revisited the Terms of Reference any time there was confusion about the scope
and mission of PQX. Each new version of the ToR included a more precise description
of our charter. It is interesting to note that although the statement has been updated
over time, the essence of the ToR did not change.
The current Terms of Reference:
PQX is intended to facilitate the one-way transmission of performance data for one
or more printed samples from a single press run from print service providers to relevant
stakeholders and brand owners; thus allowing brand owners to assess and track relevant
business, production, color and quality data of printed materials of all forms. PQX
is only intended to transmit raw quality data. The PQX message intentionally excludes
tolerance and evaluative information, allowing the receiver to determine acceptability
by applying their own scale and tolerance values. PQX incorporates color using the
same data containers that are defined in ISO 17972 (CxF). While PQX and CxF are different
formats with different parsing requirements, developers can use the same strategies
for reading and writing color data in a PQX file that they use for reading and writing
color data in a CxF file. PQX is compatible with both spectral and non-spectral color
data.
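The division of labor described in the Terms of Reference, where the message carries raw data and each receiver applies its own tolerances, can be illustrated with a short sketch. The CIE76 color difference (Euclidean distance in CIELAB) is used here only because it is simple; a receiver may prefer another difference formula. The Lab values and the two tolerance values are invented for the example.

```python
import math


def delta_e_76(lab_a, lab_b):
    """CIE76 color difference: Euclidean distance between two CIELAB triples."""
    return math.dist(lab_a, lab_b)


def within_tolerance(sample_lab, reference_lab, tolerance):
    """Receiver-side check; the tolerance never travels in the message."""
    return delta_e_76(sample_lab, reference_lab) <= tolerance


# Hypothetical raw data as it might be recovered from a received message.
reference = (50.0, 60.0, 40.0)
sample = (51.0, 62.0, 38.0)

# Two hypothetical receivers judge the same raw measurement differently.
for receiver, tolerance in (("Brand A", 2.0), ("Brand B", 4.0)):
    verdict = "accept" if within_tolerance(sample, reference, tolerance) else "reject"
    print(receiver, verdict)
```

The same raw measurement is rejected under the tighter tolerance and accepted under the looser one, which is why keeping evaluative information out of the message leaves the decision with the receiver.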
Step 2: Developing Formal Design Principles
Since employing an agile software development methodology means that we will be producing
new versions of the XSD as a series of rapid iterations, it was critical that we formalize
the design principles for the schema. Looking back, this was one of the most critical
factors that enabled us to successfully employ agile techniques. As development
proceeded, there were times when we developed structures that seemed like a good idea
at the time, but that violated the design principles. In such cases, a disciplined
habit of referencing the design principles brought the direction of our schema development
efforts back on track.
The design principles for the development of the PQX schema were divided into sections
to assist with referencing and understanding the principles. All principles were
developed by consensus of the stakeholders. Again, it took some time to develop and
agree upon the design principles. But our working group agrees that developing the
principles was an important ingredient in enabling agile schema development.
Note
Not all our design principles were developed before schema development began. Some
were developed along the way to guide us as we tackled a new block of the message
model. For example, when we decided to use the ISO color data specification, CxF,
we developed the principles for how we would employ CxF before we began to develop
that part of the schema.
PQX Design Principles
General Design
Structure of the PQX Message will conform to the Terms of Reference as defined by
the WG
Elements defining Tolerances and Scoring are out of scope
The PQX message will be limited to the reporting of press run results, not for returning
the requirements or specs for that press run.
Standard ANSI/ISO nomenclature (terms/definitions) will be used whenever possible
Element/attribute names will be self-documenting; i.e. names will not be abbreviated
Each element and attribute will be documented in place using annotation and documentation
XSD structures
Each element and attribute will be documented as development occurs to enable automatic
generation of documentation for each schema iteration
The namespace for all PQX elements will be pqx:
CxF elements will be imported from the CxF3_core schema and will retain their CxF
cc: namespace
Either CxFX1 or CxFX4 may be used for color data. This means while spectral data
is allowed (for packaging) it is not required.
Message Structure
The root element for the Print Quality Report will be the PQX message
The Schema will be prescriptive and strictly enforceable, rather than flexible, to
allow for little, if any, programmer interpretation of intent
Ordering of fields within PQX will be absolute
Cardinality will be determined by consensus of the stakeholders
Required elements/attributes must be agreed upon by all
Optional elements, within the scope of the project, will be included at the request
of any stakeholder
Message Field Order
Business data will fall first in the PQX Message
Color data for references and samples will be reported in separate blocks of the message
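To make the idea of absolute field ordering concrete, here is a small sketch of the check that a strict sequence implies. In practice a validating parser enforces this from an XSD sequence; the Python below only illustrates the rule. The element names BusinessInformation, ReferenceColorData and SampleColorData are hypothetical, not taken from the PQX schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical top-level blocks, in their one permitted order.
EXPECTED_ORDER = ["BusinessInformation", "ReferenceColorData", "SampleColorData"]


def local_name(element):
    """Strip any namespace prefix in Clark notation from a tag."""
    return element.tag.rsplit("}", 1)[-1]


def follows_prescribed_order(root, expected=EXPECTED_ORDER):
    """True if the children appear in the prescribed order.

    Optional blocks may be absent, but no block may appear out of
    sequence, be repeated, or carry an unknown name.
    """
    position = 0
    for child in root:
        name = local_name(child)
        if name not in expected[position:]:
            return False
        position = expected.index(name, position) + 1
    return True


good = ET.fromstring("<PQX><BusinessInformation/><SampleColorData/></PQX>")
bad = ET.fromstring("<PQX><SampleColorData/><BusinessInformation/></PQX>")
print(follows_prescribed_order(good), follows_prescribed_order(bad))
```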
Employing CxF Data
CxF will be employed as the data store for reference and sample color data
The PQX message will have an attribute of CxFVersion to allow for inclusion of any
version of CxF that may be supported by a printer's color measurement tools
Only Core CxF will be employed as the data store for color data
The CxF schema will be imported into the PQX schema so that CxF elements can be included
in the PQX message in their native cc: namespace
CxF will be employed as a complete CxF hierarchy (blob) with the cc:CxF element as
the root to ensure direct importability from color measurement devices
No fragment of CxF will be allowed within the PQX model
CxF elements will be employed for only those mechanisms where the intent of CxF is
a match for the intent of PQX
The use of CxF Tags to customize CxF to fit the intent of PQX is not appropriate
The use of CxF CustomResources and CustomAttributes is not appropriate
Valid but non-appropriate CxF elements may be written into a PQX message by sending
software systems but will be ignored by receiving systems
Quality reporting objects not explicitly supported by CxF Core will be implemented
in PQX outside of the CxF data store
Note
CxF is ISO 17972. It is the standard data transport for color data. It was very
important to the working group to define exactly how PQX would carry color data using
CxF. Hence this section of design principles was developed.
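The rule that CxF appears only as a complete hierarchy rooted at cc:CxF, never as a fragment, lends itself to a simple structural check. The sketch below is an illustration under stated assumptions: the pqx namespace URI and the pqx element names are hypothetical, and the CxF3 core namespace URI is assumed rather than quoted from the specification.

```python
import xml.etree.ElementTree as ET

PQX_NS = "urn:example:pqx"  # hypothetical URI
CC_NS = "http://colorexchangeformat.com/CxF3-core"  # assumed CxF3 core namespace


def cxf_fragments(root):
    """Return every cc: element that is not inside a cc:CxF hierarchy."""
    fragments = []

    def walk(element, inside_cxf):
        if element.tag.startswith(f"{{{CC_NS}}}"):
            if element.tag == f"{{{CC_NS}}}CxF":
                inside_cxf = True
            elif not inside_cxf:
                fragments.append(element)
        for child in element:
            walk(child, inside_cxf)

    walk(root, False)
    return fragments


# A complete CxF hierarchy embedded in a (hypothetical) PQX block.
ok_doc = ET.fromstring(
    f'<pqx:PQX xmlns:pqx="{PQX_NS}" xmlns:cc="{CC_NS}">'
    "<pqx:SampleColorData><cc:CxF><cc:Resources/></cc:CxF></pqx:SampleColorData>"
    "</pqx:PQX>"
)

# A bare CxF element dropped into a PQX element: a fragment.
bad_doc = ET.fromstring(
    f'<pqx:PQX xmlns:pqx="{PQX_NS}" xmlns:cc="{CC_NS}">'
    "<pqx:Measurement><cc:Object/></pqx:Measurement>"
    "</pqx:PQX>"
)

print(len(cxf_fragments(ok_doc)), len(cxf_fragments(bad_doc)))
```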
PQX Iteration Process
About Our Process
Once we had the Terms of Reference and the Design Principles in place, we began the iterative process. Since one of our motivations for employing
an agile methodology was to develop the message in a relatively short timeframe, we
began with two working group sessions each week. Once we began working on detailed
models for color reporting we met only once each week with a break for winter holidays.
In the period of 12 months we completed 47 iterations before posting the schema and
its documentation for public comment as an Idealliance Specification.
Note
The PQX Working Group is made up of Idealliance members and invited experts from ISO
Technical Committee 130 (Graphic Technology). The Working Group was made up of
two Co-Chairs and 47 members, representing the United States, Canada, Great Britain,
Germany, France, Italy and Japan. I serve as the Idealliance Program Manager and
Editor of the PQX Specification.
Each working group session was conducted using the Idealliance Webex service. Each
session was recorded so that any participant that missed a session could be brought
up to speed for the next session. Each iteration consisted of the following steps:
The latest version of the PQX XSD structure was reviewed by the facilitator. Changes
from the previous version were highlighted.
The test data sample was reviewed so all could understand how their quality data would be
encoded for exchange. Transformations of the test sample to/from any technology vendor’s
proprietary format could be tested by individual software vendors participating in
the effort and faults were reported.
The schema was advanced by modification of structures in the current version or the
addition of new fields/functions.
Documentation about new elements and attributes was developed by the group to assure
a stable, working nomenclature was in place.
Following each working session, the facilitator produced a new version of the PQX
XSD, complete with inline documentation.
The facilitator also produced one or more samples to demonstrate the latest schema version.
Complete documentation was generated from the schema by the facilitator.
The new version of the XSD, its documentation and related samples were posted in the
library of the PQX online collaboration website, Idealliance Connect (http://connect.idealliance.org) where all members of the working group could access the materials for review and
comment before the next scheduled session. Minutes were posted to the library each
week. Connect was also used for group discussions to advance our efforts.
The facilitator prepared a slide deck to lead the discussion and advance the XSD for
the next working session.
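The step in which documentation is generated from the schema depends on every element and attribute carrying inline annotation. The sketch below shows one way such a listing could be extracted from an XSD; it is not the tooling the facilitator used, which the paper does not name. The schema fragment and its documentation strings are invented for the example; only the XML Schema namespace and the CxFVersion attribute name come from established sources.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Invented schema fragment with inline annotation/documentation structures.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="PQX">
    <xs:annotation>
      <xs:documentation>Root of the print quality report.</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:attribute name="CxFVersion">
    <xs:annotation>
      <xs:documentation>Version of CxF carried in the message.</xs:documentation>
    </xs:annotation>
  </xs:attribute>
  <xs:element name="Undocumented"/>
</xs:schema>"""


def extract_documentation(xsd_text):
    """List (kind, name, documentation) for each named element and attribute."""
    root = ET.fromstring(xsd_text)
    entries = []
    for node in root.iter():
        if node.tag in (f"{{{XS}}}element", f"{{{XS}}}attribute") and "name" in node.attrib:
            doc = node.find(f"{{{XS}}}annotation/{{{XS}}}documentation")
            if doc is not None and doc.text:
                text = " ".join(doc.text.split())
            else:
                text = "(undocumented)"
            kind = node.tag.rsplit("}", 1)[-1]
            entries.append((kind, node.get("name"), text))
    return entries


entries = extract_documentation(SCHEMA)
for kind, name, text in entries:
    print(f"{kind} {name}: {text}")
```

Flagging undocumented components in the same pass gives a quick check, at each iteration, that the in-place documentation principle is still being followed.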
Iteration Snapshots
The following graphics illustrate changes and advances in the PQX schema over the
development lifecycle.
Version 1
This was our first attempt to model what we imagined might be in the PQX message.
This version was developed by the facilitator based on a free-ranging discussion from
our first several sessions.
Version 2
In the second version of the schema we continued our work on the high level structure.
As you can see the fields changed and the idea of reporting color measurements, registration
and visual (defects) began to emerge.
Version 10
During the development of Versions 3-10, we focused our work on modeling the business
data that must be passed between the printer and the brand. We developed separate,
clearly defined fields for the party the evaluation samples were printed for, printed
by and which party did the measurements of these samples. At this time we were working
primarily with sheetfed offset printers and the assumption was that quality data would
be based on press sheets pulled randomly during the press run. During this time we
also decided to report color from sample measurements and began the task of determining how to
report color accuracy in XML. That task would be our focus for the next 12 versions.
Version 22
In Version 22 we added a block for reporting production information as an additional
business information field. We did so because, once printers and technologists from the packaging
and labeling printing sector were added to the working group, we came to see that
quality might be judged from more than a randomly pulled press sheet. Instead we
might be reporting print quality from labeling printed on cans or bottles. Or we might
be reporting quality from labels for packaging, where it is common practice for labels
for several products to be printed simultaneously across the substrate base. All these
new considerations needed to be factored into the iterative schema models we produced
each week.
We also discovered that a brand might require that on each printed sample, measurements
should be taken at several locations. As a result, we would need to create a model
to report the quality from each location on a sample individually. Additionally,
by the time we developed Version 22, we added a block to hold ISO standard color data
that would be dumped directly from color measurement systems into the PQX message
to send to the brand.
Version 30
During the development of Versions 23-30 the group focused on how to report color
measurements. It turns out that this is quite a complex task. As part of this task
we needed to add the ability to report the ink laydown order on the press, the substrate
color information, patch information as well as tints and the color measurement values
for the parent ink solids. Because there is no single, standard formula for calculating
and reporting tone values, we had to assure that we included all color factors in
the PQX message so that a receiving system could calculate tone values using their
formula of choice. All this was a challenge for print technologists to model for
the PQX XML message.
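To see why the choice of formula matters to a receiving system, consider two well-known density-based tone value formulas, Murray-Davies and Yule-Nielsen. The paper does not say which formulas PQX receivers use, so these are offered only as familiar examples, and the density readings below are invented. Both formulas need the substrate, the solid and the tint measurements, which is why all three must be present in the message.

```python
def tone_value_murray_davies(d_tint, d_solid, d_paper):
    """Murray-Davies tone value in percent, from optical densities."""
    return 100.0 * (1 - 10 ** -(d_tint - d_paper)) / (1 - 10 ** -(d_solid - d_paper))


def tone_value_yule_nielsen(d_tint, d_solid, d_paper, n):
    """Yule-Nielsen tone value in percent; n is the empirical Yule-Nielsen factor."""
    return (
        100.0
        * (1 - 10 ** (-(d_tint - d_paper) / n))
        / (1 - 10 ** (-(d_solid - d_paper) / n))
    )


# Hypothetical density readings for the substrate, a solid and one of its tints.
D_PAPER, D_SOLID, D_TINT = 0.10, 1.40, 0.50

md = tone_value_murray_davies(D_TINT, D_SOLID, D_PAPER)
yn = tone_value_yule_nielsen(D_TINT, D_SOLID, D_PAPER, n=2.0)
print(f"Murray-Davies: {md:.1f}%  Yule-Nielsen (n=2): {yn:.1f}%")
```

The two functions return different tone values for identical readings, so a message that carried only a precomputed tone value would tie every receiver to the sender's formula.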
Versions 31-47
Once the issue of modeling color reporting was complete, the working group turned
its focus to developing methods to report registration (data about the accuracy of
the alignment of inks on a printed image) and how to report printing defects such
as lines, smudges and something printers call "hickeys". We were additionally challenged
to support the transport of data from the new on-press quality systems that report
on every print impression using averaging techniques to minimize the reporting message
size.
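The paper does not describe the averaging scheme itself, so the following is only a sketch of the general idea: readings taken on every impression are collapsed into one summary record per batch before reporting. The batch size, the record fields and the readings are all hypothetical.

```python
from statistics import mean


def summarize_impressions(readings, batch_size):
    """Collapse per-impression readings into one summary record per batch."""
    summaries = []
    for start in range(0, len(readings), batch_size):
        batch = readings[start:start + batch_size]
        summaries.append(
            {
                "first_impression": start + 1,  # 1-based impression number
                "count": len(batch),
                "mean": mean(batch),
                "min": min(batch),
                "max": max(batch),
            }
        )
    return summaries


# Hypothetical readings, one per impression.
readings = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
summaries = summarize_impressions(readings, batch_size=3)
for record in summaries:
    print(record)
```

Seven readings become three records here; with realistic run lengths and batch sizes the reduction in message size is correspondingly larger, at the cost of per-impression detail.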
Summary
Because we employed an agile development strategy to develop the PQX message, programming
teams of print quality reporting products stand ready to commercialize PQX immediately
following the publication of the Idealliance / ISO Standard. Without having engaged
these programming teams from the beginning and employing agile techniques, implementation
might not have happened for some time. As a result of the experience gained in developing
the PQX schema, our team believes that using agile techniques to develop XML schemas
can serve as a best practice for the development of many other XML schemas where immediate
software implementation is a goal.
Appendix A. Tagged Example
In order for readers to understand more about data exchanged using PQX, reviewers
have requested the inclusion of a sample. In this sample, a press run at Printer
ABC in Queens, New York is conducted for Jupiter Candy, Inc. The product is the 12
oz WoWs bag with the SKU 1234. The measurements are not being taken by the printer.
Measurement is being handled by their contractor, Q-TraX in Cincinnati. The job
was run on November 15, 2015 under the printer’s LOT Id 701-123-3331. There are 5
ink channels on the press that lay down ink in this order: Cyan, Yellow, Magenta,
Black and the “Wow Red” spot color. The data capture method for both color and registration
was by manual pressman pulls throughout the run. Defects were reported from an automated
camera system, with a defect reporting 80 percent.
Color quality is reported in the Color Report block and linked back to the Ink Channel
information using the InkChannelIdLink attribute. The Color Report consists of a Measurement
Set at position “1” on the package. In a full PQX report, multiple Measurement Sets
at Brand-specified positions on the printed item would be sent. This sample illustrates
only a single measurement set. Each Measurement Set is made up of measurements of
one or more patches. Each measurement has an indication of the patch type being measured.
All patches except the substrate patch are linked back to ink channels using the InkChannelIdLink
attribute. Patches that are made up of overprints of more than one color are linked
back to multiple ink channels. Tint patches have tint values specified and are linked
back to their parent ink channel. This example shows coding for measurements of the
substrate, 4 solids, 2 tints, an overprint, a graybalance and a build patch.
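The linking described above can be sketched in a few lines of Python. Only the InkChannelIdLink attribute name and the "Cyan01" identifier come from the text; the element names, the other attributes and the use of a space-separated list for multiple links are assumptions made for illustration, and namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical message fragment.
SAMPLE = """
<PQX>
  <PressInformation>
    <InkChannel Id="Cyan01" Name="Cyan" PrintOrder="1"/>
    <InkChannel Id="Yellow01" Name="Yellow" PrintOrder="2"/>
  </PressInformation>
  <ColorReport>
    <MeasurementSet Position="1">
      <Measurement PatchType="Substrate"/>
      <Measurement PatchType="Solid" InkChannelIdLink="Cyan01"/>
      <Measurement PatchType="Tint" TintValue="50" InkChannelIdLink="Cyan01"/>
      <Measurement PatchType="Overprint" InkChannelIdLink="Cyan01 Yellow01"/>
    </MeasurementSet>
  </ColorReport>
</PQX>
"""


def resolve_ink_links(xml_text):
    """Pair each measurement's patch type with the names of its linked inks."""
    root = ET.fromstring(xml_text.strip())
    channels = {c.get("Id"): c.get("Name") for c in root.iter("InkChannel")}
    resolved = []
    for measurement in root.iter("Measurement"):
        # A substrate patch has no link; an overprint links to several channels.
        link_ids = measurement.get("InkChannelIdLink", "").split()
        inks = [channels[link_id] for link_id in link_ids]
        resolved.append((measurement.get("PatchType"), inks))
    return resolved


links = resolve_ink_links(SAMPLE)
for patch_type, inks in links:
    print(patch_type, inks)
```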
Within a PQX message, color data must be expressed as CxF. Rather than embedding
a CxF object into a color measurement, the CxF data is stored, in its entirety, within
separate CxF blocks of the message. So each PQX Measurement element in the Color
Report must be linked to the Object in the CxF Sample data block. Links for each measurement
may also be made back to standard (aim) CxF data if the Brand requires a return of
that data.
The quality of print registration is reported in the Registration Report element.
The data capture method for registration was by manual pressman pulls throughout the
run. The registration reporting method used was the “Channel Registration” method
where the alignment of each ink channel is compared to the reference ink channel.
In this example the reference ink channel is “Cyan01”, which can be referenced
back as the first in print order in the Press Information. The Registration Report
in the sample is made up of two Registration Sets. In each set the alignment of non-reference
ink channels is compared with the reference on both the X (horizontal) and Y (vertical)
axes.
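The comparison behind the “Channel Registration” method can be illustrated as follows. This is a sketch of the arithmetic only, not the structure of the Registration Report: the mark positions, the millimetre units and the channel identifiers other than “Cyan01” are hypothetical.

```python
def channel_offsets(positions, reference_id):
    """Offset of each non-reference channel from the reference, per axis.

    positions maps a channel id to the measured (x, y) position of its
    registration mark; the result maps each non-reference channel id to
    its (dx, dy) relative to the reference channel.
    """
    ref_x, ref_y = positions[reference_id]
    return {
        channel_id: (x - ref_x, y - ref_y)
        for channel_id, (x, y) in positions.items()
        if channel_id != reference_id
    }


# Hypothetical mark positions for one registration set, in millimetres.
registration_set = {
    "Cyan01": (10.0, 20.0),
    "Yellow01": (10.0, 20.25),
    "Magenta01": (10.5, 19.75),
}

offsets = channel_offsets(registration_set, reference_id="Cyan01")
for channel_id, (dx, dy) in offsets.items():
    print(f"{channel_id}: X {dx:+.2f} mm, Y {dy:+.2f} mm")
```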
Reports about any printing defects are carried in the Defect Report element. The
report was generated from an automated system that has reported at a 100% rate of
inspection. Two defects were reported. Both were determined to be defects when compared
to the proof. The first defect was a hickey in the center zone with a severity of
“4”. The count for this defect was 200. The second defect was a line in the top margin
zone with a severity of “1”. The count for this defect was 1000.