Nguyen, Quyen L., and Betty Harvey. “Agile Business Objects Management Application for Electronic Records Archive Transfer
Process.” Presented at Balisage: The Markup Conference 2009, Montréal, Canada, August 11 - 14, 2009. In Proceedings of Balisage: The Markup Conference 2009. Balisage Series on Markup Technologies, vol. 3 (2009). https://doi.org/10.4242/BalisageVol3.Harvey01.
Balisage: The Markup Conference 2009 August 11 - 14, 2009
Balisage Paper: Agile Business Objects Management Application for Electronic Records Archive Transfer
Process
Quyen L. Nguyen
National Archives and Records Administration, Systems Engineering Division of ERA
Program Management Office
Quyen Nguyen is currently working in the Systems Engineering Division of the ERA
Program Management Office at the U.S. National Archives and Records Administration.
Before joining the National Archives, he worked for telecommunications software companies.
His experience is in developing software systems for large-scale deployment. He has a BS
in Computer and Information Science and Applied Mathematics from the University of
Delaware and an MS in Computer Science from the University of California at
Berkeley.
As President of Electronic Commerce Connection, Inc. since 1995, Ms. Harvey has led
many federal government and commercial enterprises in planning and executing their
migration to the use of structured information for their critical functions. Over
the past
14 years she has helped develop strategic XML solutions for her clients. Ms. Harvey
has
been instrumental in developing industry XML standards. Ms. Harvey is a member of
OASIS
Open and is currently an active participant in the Universal Business Language initiative.
Previously she was a member of the Core Components subcommittee of the ebXML initiative.
She is the co-author of "Professional ebXML Foundations" published by Wrox. Ms. Harvey
founded the Washington, DC Area SGML/XML Users Group in 1995. She still coordinates the
users group, which is the longest-standing XML users group. Ms. Harvey is also a member
of
"The XML Guild" and recently coauthored the book "Advanced XML Applications From the
Experts at The XML Guild" published by Thomson. Currently, Ms. Harvey is working with
the
National Archives and Records Administration (NARA) on developing future system evolution
for the Electronic Records Archive (ERA) system.
In order to continue to fulfill its mission in the information technology age, the
National Archives and Records Administration (NARA) has made the decision to develop
the
Electronic Records Archives (ERA) system. One of the goals is to provide to the archivists
a
modernized system with automatic workflow that can streamline the digital archive
business
process.
For an archival system, Ingest is one of the core components. As part of the ingest
process, this component allows the record Producer to negotiate a submission agreement
before transferring digital materials into the system. Within the framework of a
service-oriented architecture with business process management, the ERA system uses XML to
represent business objects and metadata. In this paper, we show how the synergistic
combination of XForms and Genericode makes the system agile and responsive to business user
requirements. Furthermore, the approach fits well with ERA's design principle of using
international and industry standards, and facilitates the integration of XML business
objects and the electronic records metadata. We believe that the standards-based approach of
XForms+Genericode presented in this paper can be generalized to develop any e-Forms system
with a set of control values and vocabularies.
In response to the growing use of information technology for conducting business by
federal agencies, NARA has made the decision to build the ERA system. Its main goal is "to
preserve electronic records independent of the hardware and software that created
them" [1]. For the ERA system to store, preserve, and provide access to electronic
records, it has to cope with the following challenges:
Scope. By mandate, records will come from the entire federal government.
Variety. Since the federal government deals with different application domains, such as
health care, education, defense, space exploration, energy, and environmental protection,
records will contain various types of knowledge. Moreover, their manifestation and
representation may have different formats such as Microsoft Office documents, relational
database files, or GIS artifacts.
Obsolescence. Added to the variety of domains and formats above is the constantly changing
technology and application software used to create the records. By the time records are
ingested into ERA, these applications will most likely be obsolete or superseded by newer
versions.
Volume. The total volume of incoming records is expected to be enormous and to continue
growing over the years. Data in the petabyte and exabyte range is not unimaginable.
In the face of these functional challenges, the ERA architecture is designed to satisfy the
following system qualities:
Extensibility. New record types, data types, and
services can be added to the system without extensive redesign.
Evolvability. New technologies in software and
hardware can be inserted using standard APIs and interfaces.
Scalability. The system must have the ability to
adapt to the growth of record volume and user community.
Besides these qualities, the ERA system and its components have to be user-friendly,
secure, and highly available to protect the assets and serve the public.
This paper is organized as follows. In section 2, we give an overview of the
authority lists and the business objects that are used to manage and govern the archival
system. Sections 3 and 4 discuss the benefits of XForms and Genericode in general, as well as
how they are applied to ERA. Then, we describe our approach of combining XForms and
Genericode, which fits into our overall system architecture, and report on the
implementation process with concrete examples.
In section 6, we show how business objects contribute to the Archival Management Taxonomy,
which can help govern the content. We summarize our design approach in the conclusion.
Archival Business Objects Management
Archival Business Objects
Before the transfer of any set of records, the Producer (usually a government agency)
has to submit two business documents to NARA, namely the Record Schedule and the Transfer
Request. The Record Schedule contains instructions for the general disposition and
maintenance of various types of records, such as permanent versus temporary records, and
the retention period. Over the lifecycle of a Record Schedule, a Transfer Request will be
created for every physical transfer to specify:
Transfer mode with detailed information.
Access restrictions.
The transfer mode can be electronic transmission or physical media such as audio cassette,
microfilm, CD, DVD, video cassette, parchment, or photographic print. Each business object
goes through multiple business processes and negotiation of disposition between NARA and the
Producer before records are transferred from the contributing entity to NARA. Access
restrictions may be applied to records that contain privacy data.
Within the ERA system, the Record Schedule, the Transfer Request, and other archival business
objects (ABOs) are implemented as e-Forms. Currently, ERA e-Forms are encoded using
JavaServer Pages (JSP). One main advantage of e-Forms over traditional paper documents is
that e-Forms facilitate deterministic processing by the system thanks to:
Elimination of free-text entry.
Structured fields that conform to a pre-defined data model.
System validation of input data. Such validation can be performed based on the data
model with respect to data type specification, or required/optional
characteristics.
Elaborate validation based on embedded business rules while business objects are being
created or edited. For instance, fields may be dependent on each other, and only values
from authority lists are allowed.
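The kinds of validation listed above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not how ERA implements its e-Forms. The field names, the `FIELD_SPECS` layout, and the `validate_form` helper are all hypothetical; the agency codes are borrowed from the authority list examples in this paper.

```python
# Hypothetical field specifications: required/optional, data type, and the
# authority list (if any) that constrains the allowed values.
FIELD_SPECS = {
    "agency": {"required": True, "type": str, "authority": "agencies"},
    "retention_years": {"required": False, "type": int, "authority": None},
}

# Hypothetical authority lists, keyed by list name.
AUTHORITY_LISTS = {"agencies": {"NARA", "USPTO", "NOAA"}}


def validate_form(form, specs=FIELD_SPECS, authorities=AUTHORITY_LISTS):
    """Return a list of error messages for a form given as a dict."""
    errors = []
    for name, spec in specs.items():
        value = form.get(name)
        if value is None:
            # Required/optional check.
            if spec["required"]:
                errors.append(f"{name}: required field is missing")
            continue
        # Data type check against the (assumed) data model.
        if not isinstance(value, spec["type"]):
            errors.append(f"{name}: expected {spec['type'].__name__}")
            continue
        # Controlled vocabulary check against the authority list, if any.
        allowed = authorities.get(spec["authority"])
        if allowed is not None and value not in allowed:
            errors.append(f"{name}: value is not in the authority list")
    return errors
```

Because every check is driven by data (`FIELD_SPECS` and `AUTHORITY_LISTS`) rather than by code, adding a field or a code value in this sketch does not require changing `validate_form`.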
ERA users manage these ABOs via the Archival Business Object Management Application, which
is defined by:
Functional requirements for CRUD (create, read, update, and delete) operations on
ABOs.
Non-functional requirements of flexibility, extensibility, user-friendliness, open
standards, performance, and security.
With respect to flexibility and extensibility, the design should allow a field to be added
to or modified in an ABO form at low development cost. Moreover, changes to
business rules that govern the forms or the fields within a form should be easily
accommodated without requiring a lot of coding effort.
Authority Lists
Authority Lists, also known as code lists, consist of values used to establish normalized
values for certain key fields in a business object. The goal of Authority Lists is to
institute a controlled vocabulary to be used in e-Forms and archival descriptions, which are
part of the NARA metadata. Examples of authority lists are:
50 states with 2-letter abbreviations.
Table I
CA     California
MD     Maryland
VA     Virginia
...    ...
Federal agency names.
Table II
NARA     National Archives and Records Administration
USPTO    United States Patent and Trademark Office
NOAA     National Oceanic and Atmospheric Administration
...      ...
On the access side, Authority Lists play an important role in building queries from
various forms of search terms. For instance, with the use of Authority Lists, searching for
NARA is equivalent to searching for National Archives and Records Administration. Because of
the importance and occasional fluidity of this data, the ERA system is required to give
privileged business users the ability to manage the Authority Lists. Any update to the
Authority Lists should take effect immediately and be available to subsequent creation and
updates of business objects and forms.
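As a minimal sketch of how an authority list can make two search terms equivalent, the following Python fragment resolves either a code or a full name to the same code key. The dictionary layout and the helper names (`normalize_agency`, `same_agency`) are assumptions made for illustration; only the three entries themselves come from Table II.

```python
# Hypothetical in-memory authority list: code key -> display value (from Table II).
AGENCY_AUTHORITY_LIST = {
    "NARA": "National Archives and Records Administration",
    "USPTO": "United States Patent and Trademark Office",
    "NOAA": "National Oceanic and Atmospheric Administration",
}


def normalize_agency(term, authority_list=AGENCY_AUTHORITY_LIST):
    """Return the code key for a search term given as a code or a full name.

    Matching ignores case and surrounding whitespace. None means the term is
    not part of the controlled vocabulary.
    """
    wanted = term.strip().casefold()
    for code, name in authority_list.items():
        if wanted in (code.casefold(), name.casefold()):
            return code
    return None


def same_agency(term_a, term_b):
    """True when both search terms resolve to the same authority list entry."""
    key_a = normalize_agency(term_a)
    return key_a is not None and key_a == normalize_agency(term_b)
```

With this sketch, `same_agency("NARA", "National Archives and Records Administration")` holds, and an update to the dictionary takes effect on the next lookup without any change to the search code.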
Functional Requirements
Notably, the design for the application should exhibit low cost for development and
maintenance. Moreover, since ERA is a Service-Oriented Architecture (SOA) system with
Business Process Management (BPM) and an Enterprise Service Bus (ESB), the design should be
easy to integrate with the workflow. The functional requirements for ABOs can be
summarized as follows:
Presentation. ABOs should be viewable and browsable via W3C standards-compliant browsers.
Moreover, the system should provide user-friendly rendering and printing of ABOs.
Management. CRUDVS (Create, Retrieve, Update, Delete, Versioning, Search) operations
will be supported. The creation of ABOs can also be based on predefined and
pre-configured templates.
Workflow. According to the requirements, "The system shall provide the capability to
integrate forms into workflows" [18]. The system will implement a
workflow for ABOs, which consists of a simple Draft-View-Approve cycle. In terms of
governance, the ABOs will play an essential role in managing the lifecycle of electronic
records to be ingested into the system. Therefore, the management of ABOs should be
integrated with BPM and BPEL-based system orchestrations [9].
XML Format. In order to persist the ABOs for the long term and in a fashion that is
independent of any specific software and hardware environment, the ABOs will be stored as
XML documents.
The following diagram depicts the components and services of the ABO Application from
the SOA layer pattern perspective. The applications for Business Object Management, Business
Object Review, and Business Object Approval can be linked together by a Business Process,
which can be expressed in Business Process Modeling Notation (BPMN) [10]
and executed by a BPM engine.
XForms
Overview
XForms [2] is a W3C specification for implementing XML-based Web
forms. In some sense, XForms can be viewed as the next generation of HTML Forms. While HTML
Forms mix presentation markup with form data, XForms takes the approach of
separating data from presentation. Data in XForms can conform to an XML schema, which makes
data validation against the schema straightforward. For Ajax server-side XForms
[4], real-time validation provides immediate feedback in case of an error in
user input. Therefore, the user doesn't have to finish a lengthy form to learn that an error
has occurred. Note that in the case of an ABO, a form may span more than one scrolled page.
Unlike an HTML form, the output of an XForms form is an XML instance ready to be stored in an
XML-aware repository. In addition, XForms has constructs to control user input events,
but these controls do not depend on a particular presentation modality. Therefore, it
is possible to create a form that takes input from various input devices, such as
keyboard and voice. In the paper Multimodal Interaction with
XForms [6], Mikko Honkala et al. proposed XForms
Multi-modality (XFormsMM) to facilitate concurrent support of multiple modalities using
XForms. The XFormsMM architecture comprises the following components:
Separation of abstract user interface (UI) controls from modality-specific elements such
as style sheets.
An interaction manager to switch between and coordinate different modalities.
A set of modality renderers for CSS style sheets.
Multi-modality enables a web form to support various UI
devices such as desktops, wireless handsets, and IVR applications. In the case of our
application, this feature might be exploited to develop a user interface that provides
access to all users, impaired and non-impaired.
Advantages
The advantages of XForms have been identified in the literature and at past conferences
[4][5]:
Data integrity. Since XForms is associated with an
XML schema, data integrity is preserved due to the compliance of the forms with data
constraints specified by the schema.
Data exchange. The output of an XForms is an XML
document itself, which can readily carry data between SOA components and services.
In
the case of WSDL (Web Services Description Language), the data will be embedded in
SOAP
messages. If RESTful Web services are used, then the URL will contain the reference
to
the XML instance.
Performance. Response time and user experience are
greatly enhanced as latency is reduced thanks to Ajax-based implementations.
Consistency. XForms specifies a construct to handle
XML errors, thus facilitating uniform and consistent error handling and error
messages.
Modularity and reuse. An XForms document can be
composed of sections. Consequently, parallel development can be planned and organized.
Building up a library of reusable XForms sections will therefore be possible. For
example, in the case of ABO, we have a section for Personal Contact, and another one
for
Organization Information. Moreover, XForms constructs to control user input events
will
definitely save development time and cost.
Low-cost system requirement. Server-side XForms
processing does not impose any requirements on end-user browsers. Note that ABO Forms
are to be used by NARA archivists and agencies' record managers, and we don't want to
impose any configuration requirements for using the application.
Standard support. Being based on XML itself, XForms
can be easily integrated with other XML-related open standards such as XML Schema,
XSLT,
XSL-FO, XHTML, XPath, and XQuery.
Genericode
Overview
Genericode [3] defines a standardized model to manage code lists using
a defined XML schema. Essentially, the idea is for a code to have a code key and multiple
code values. Every XML project has controlled lists that need to be supported in the XML
application. There are two schools of thought for controlling code lists. The Universal
Business Language (UBL) [19] is an OASIS-Open standards effort to describe
business documents (e.g., invoices and purchase orders) in XML. In version 1.0 of the
standard, all the code lists for codes such as country codes and currency codes were
embedded in the XML schemas. As individual countries started adopting UBL, it became
apparent that placing large code sets in the schemas was a problematic approach. Some of the
codes change rapidly, and during implementation this required modification of the "standard"
schemas, which theoretically meant they were no longer UBL compliant. The definitions of the
codes were in the documentation portion of the schema and not readily available to the
application. Modifying the schema to support changing codes became a configuration problem.
Members of the UBL technical committee developed the Genericode concept and it was
adopted by UBL and other organizations. Genericode now has its own OASIS-Open technical
committee [20], and is currently at version 1.0.
Why Genericode Is Valuable to the NARA Enterprise
Besides the codes that are located in the business objects and metadata descriptions of
records, NARA receives records from agencies in the U.S. Federal government, as well as some
private collections. Every record set has its own set of codes. The same meaning can be
represented by different codes across multiple record sets. For example, if we look
at the records of ship manifests from the Irish Potato Famine, something as simple as the age
of a person can be represented by several codes for the age and/or event, depending on the
individual ship manifest.
The logical expectation is that a person's age is pretty straightforward. However,
looking at the table below, which contains actual values of codes from the passenger list
maintained by NARA [21], it is readily apparent that a person's age in a
record isn't as cut and dried as some would expect. For example, you wouldn't expect to see a
value of 900 to represent a person's age.
Table III
Typical Codelist Representation in NARA Records
Code    Meaning
900     Born at Sea
901     Infant in months: 01
909     Infant in months: 09
800     Unknown
1       age 01
001     age 01
2       age 02
002     age02
003     age 03
3       age 03
03      age 03
The table below shows a representation of how the Genericode is organized. The advantage
of the Genericode approach is that multiple codes can represent a single concept.
Table IV
Genericode Table Representation
Meaning                 Ship1    Ship2    Ship3
Born at Sea             900      888      766
Infant in months: 01    901      .1       500
Infant in months: 09    909      909      909
Unknown                 800      800      800
age 01                  1        001      1
age 02                  2        002      002
age 03                  3        003      3
The Genericode standard has 3 major sections:
Identification: Identification and location information
(metadata).
ColumnSet: Description for each column in the Genericode
list.
SimpleCodeList: The container for the actual code list.
For Table IV above, the Genericode representation
would be:
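The original listing is not available in this text, so what follows is only a minimal sketch of what such a document could look like, covering three rows of Table IV. It is embedded in a small Python script that reads it back. The Identification values, the column Ids (`meaning`, `ship1`, ...), and the helper names are invented for illustration, and the document has not been validated against the Genericode schema.

```python
import xml.etree.ElementTree as ET

# Illustrative Genericode document for three rows of Table IV.
# Identification values and column Ids are assumptions made for this sketch.
GENERICODE_XML = """\
<gc:CodeList xmlns:gc="http://docs.oasis-open.org/codelist/ns/genericode/1.0/">
  <Identification>
    <ShortName>PassengerAgeCodes</ShortName>
    <Version>0.1</Version>
    <CanonicalUri>urn:example:codelist:passenger-age</CanonicalUri>
    <CanonicalVersionUri>urn:example:codelist:passenger-age:0.1</CanonicalVersionUri>
  </Identification>
  <ColumnSet>
    <Column Id="meaning" Use="required">
      <ShortName>Meaning</ShortName><Data Type="normalizedString"/>
    </Column>
    <Column Id="ship1" Use="optional">
      <ShortName>Ship1</ShortName><Data Type="normalizedString"/>
    </Column>
    <Column Id="ship2" Use="optional">
      <ShortName>Ship2</ShortName><Data Type="normalizedString"/>
    </Column>
    <Column Id="ship3" Use="optional">
      <ShortName>Ship3</ShortName><Data Type="normalizedString"/>
    </Column>
    <Key Id="meaningKey">
      <ShortName>MeaningKey</ShortName>
      <ColumnRef Ref="meaning"/>
    </Key>
  </ColumnSet>
  <SimpleCodeList>
    <Row>
      <Value ColumnRef="meaning"><SimpleValue>Born at Sea</SimpleValue></Value>
      <Value ColumnRef="ship1"><SimpleValue>900</SimpleValue></Value>
      <Value ColumnRef="ship2"><SimpleValue>888</SimpleValue></Value>
      <Value ColumnRef="ship3"><SimpleValue>766</SimpleValue></Value>
    </Row>
    <Row>
      <Value ColumnRef="meaning"><SimpleValue>Infant in months: 01</SimpleValue></Value>
      <Value ColumnRef="ship1"><SimpleValue>901</SimpleValue></Value>
      <Value ColumnRef="ship2"><SimpleValue>.1</SimpleValue></Value>
      <Value ColumnRef="ship3"><SimpleValue>500</SimpleValue></Value>
    </Row>
    <Row>
      <Value ColumnRef="meaning"><SimpleValue>age 01</SimpleValue></Value>
      <Value ColumnRef="ship1"><SimpleValue>1</SimpleValue></Value>
      <Value ColumnRef="ship2"><SimpleValue>001</SimpleValue></Value>
      <Value ColumnRef="ship3"><SimpleValue>1</SimpleValue></Value>
    </Row>
  </SimpleCodeList>
</gc:CodeList>
"""


def load_rows(xml_text):
    """Parse the SimpleCodeList into a list of {column id: value} dicts."""
    root = ET.fromstring(xml_text)
    rows = []
    # Only the root element is namespace-qualified in this sketch, so the
    # child elements are addressed by their plain names.
    for row in root.findall("./SimpleCodeList/Row"):
        rows.append(
            {value.get("ColumnRef"): value.findtext("SimpleValue")
             for value in row.findall("Value")}
        )
    return rows


def meaning_of(rows, ship_column, code):
    """Return the shared meaning for a ship-specific code, or None."""
    for row in rows:
        if row.get(ship_column) == code:
            return row["meaning"]
    return None
```

The three sections named above are all visible: Identification carries the metadata, ColumnSet declares one column per manifest plus the shared meaning, and each Row of SimpleCodeList ties the ship-specific codes to one concept. For example, `meaning_of(load_rows(GENERICODE_XML), "ship2", "888")` and the same call with `"ship1"` and `"900"` resolve to the same meaning.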
As stated above, enumerated data can be problematic in XML documents. One approach is to
enumerate the allowed values for a data element in an XML schema, so that associated
XML documents and their enumerated values can easily be validated by a standard XML parser.
However, if there is a need to add, remove, or change values in the enumeration lists, then
the schema has to be modified. In [12], the author proposed different
solutions to extend the enumeration lists in an XML schema, by using XML Schema mechanisms
such as <xsd:union>, <xsd:pattern>, or <xsd:annotation>.
The author argued that some of these solutions are advantageous because they require only
one-pass validation, thus avoiding a performance penalty.
In contrast, Genericode offers an approach that allows the management of the
enumerations to be independent of the XML schema. Although this implies a separate
validation for the code list values, it does have some advantages [13]:
Genericode is a flexible scheme to manage the code lists for applications where
business logic parsing performance is not a critical requirement. This is the case
for
the ABO Management application.
If a change is confined to the display value of a code, then any application data
using the code key will not be affected.
Adding or removing a code from a list can be done directly in the XML code list. All
forms using that code list will change simultaneously without requiring any
programming, unlike the approach of using JSP and enumerations in the schema. In our system,
we can build a simple application to allow NARA policy makers to manage the code lists
that govern the critical values in the forms for ABOs.
Another key advantage for NARA to maintain all their code lists in a "generic" common
format is the ability to create a standard NARA-wide authoring environment for developing
and maintaining code lists across the enterprise. NARA can ultimately have thousands
of code
lists when preserving and describing electronic records for the entire federal
government.
XForms and Genericode Together
Advantages
By combining XForms and Genericode, our approach benefits from the advantages of both.
Indeed, the introduction of Genericode into XForms provides a
separation of concerns in software and data development:
Evolvability. From the data management perspective, the XML
schemas associated with the ABOs and the controlled enumerations can evolve independently
of each other. Thus, their maintenance will be more efficient. In business practice, the
code lists experience more changes than the schemas. Moreover, with respect to
software maintenance, it is not desirable to change XML schemas too frequently.
Modularity. From the software engineering
perspective, we can easily design and develop two separate modules: one to process the
XForms, and the other to manage the code lists. Due to the separation of data, changes
to the code lists do not affect the XForms processing module.
Separation of Control. The modularity of the software
fits the business organization of NARA, where the group responsible for controlling the code
lists is different from the one managing the ABOs. Access to each of these modules can
be implemented using role-based access control (RBAC). Note that the potential users of the
ABOs include NARA archivists as well as record managers from all federal agencies.
Development Process
The development process for the XForms-based approach consists of the following
steps:
Develop the XForms model based on an XML Schema that conforms to the ERA conceptual data
model.
Develop XForms input controls for the data elements in the XForms model.
Develop XForms data validation rules based on the business rules provided by the
record managers and processors. The implementation makes use of XForms binds derived
from the XML schema constraints.
Develop error handling and error messages that are consistent, and hence
user-friendly and easy to maintain.
Develop CSS (Cascading Style Sheets) for each form. This phase involves
interactions with end users in order to get their feedback and suggestions.
Define the SOAP messages used in the Web Services that implement the business workflow to
process the XForms instance upon XForms submission.
Implementation
Our implementation of XForms/Genericode uses the Apache Tomcat application server
as the
infrastructure. The Orbeon XForms Server is used as a platform for managing the forms.
We
chose Orbeon over other XForms applications for the following reasons:
A large user base
Capability for support in the future
Well deployed
Ability to easily integrate with XML repository
XInclude support
We chose to use a standard XML database as the repository for storing XML components.
Initially we used the open source eXist repository, then moved to MarkLogic (commercial
software) for the prototype that integrates XForms and BPM (see section “XForms and Business
Process Management (BPM)”). The repository stores:
Reusable XForms components
Genericode code lists
Converted code lists used for consumption in the form
XML business objects saved from the forms
The figure below shows the interaction between the forms and the Genericode code lists.
Maintaining the code list external to the form allows the same code list to be used in
multiple forms or multiple applications. When a code list is updated, all
the forms that consume the code list will be automatically updated without any recompilation
of code.
Populating Fields from Code List Lookups
Certain fields can be automatically populated based on the selection of a single code.
There are several places where this becomes important for NARA. A good example is the
"Record Group". NARA classifies all records by a number
that represents the title of the record group. A record group number can represent an agency
or a collection of records. For example, record group 21 represents
"Records of District Courts of the United States". Once a record group is
assigned, it never changes. Every federal agency is associated with one or more record groups
(usually just one).
In an ABO form, a user who selects his/her agency will only be presented with the record
group(s) associated with that agency. Once the user selects the record group, the record group
title is automatically populated in the XML. Below is the XForms code which provides this
functionality:
The example above shows that the pull-down list is populated from the AgencyRecordGroup-instance XML codelist. This codelist is set in
the XForms model section using the <xforms:instance> element shown
below.
The codelist is pulled from an eXist XML repository. A major advantage of having
the codelist maintained in an external XML file is that if an agency name changes (and they
do quite often), the form does not require modification. Once the code list is updated, all
forms using the code list are automatically updated. The record title gets populated by
an <xforms:bind> element using an XPath statement. The
calculate attribute is used to set the value.
The XPath statement in the calculate attribute above
basically says: get the value of the record group title from the code list instance called
"RecordGroupTitle-instance" where the <RecordGroupNumber> in the list matches
the "RecordGroupNumber" variable. The $RecordGroupNumber variable has been set previously in
the form.
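The two lookups described above can be sketched outside XForms as follows. This is a Python illustration, not the XForms markup from the ERA forms. The XML layouts of the two code list instances, the agency label, and the helper names are assumptions. Only the instance names, the RecordGroupNumber element, and record group 21 with its title come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout of the "AgencyRecordGroup-instance" code list.
AGENCY_RECORD_GROUP_INSTANCE = """\
<AgencyRecordGroups>
  <Row><Agency>U.S. District Courts</Agency><RecordGroupNumber>21</RecordGroupNumber></Row>
  <Row><Agency>Example Agency</Agency><RecordGroupNumber>9999</RecordGroupNumber></Row>
</AgencyRecordGroups>
"""

# Hypothetical layout of the "RecordGroupTitle-instance" code list.
# Record group 9999 is invented; record group 21 is the example from the text.
RECORD_GROUP_TITLE_INSTANCE = """\
<RecordGroupTitles>
  <Row>
    <RecordGroupNumber>21</RecordGroupNumber>
    <RecordGroupTitle>Records of District Courts of the United States</RecordGroupTitle>
  </Row>
  <Row>
    <RecordGroupNumber>9999</RecordGroupNumber>
    <RecordGroupTitle>Records of the Example Agency</RecordGroupTitle>
  </Row>
</RecordGroupTitles>
"""

AGENCY_DOC = ET.fromstring(AGENCY_RECORD_GROUP_INSTANCE)
TITLE_DOC = ET.fromstring(RECORD_GROUP_TITLE_INSTANCE)


def record_groups_for(agency, agency_doc=AGENCY_DOC):
    """Record group numbers to offer in the pull-down list for one agency."""
    return [
        row.findtext("RecordGroupNumber")
        for row in agency_doc.findall("Row")
        if row.findtext("Agency") == agency
    ]


def record_group_title(number, title_doc=TITLE_DOC):
    """Title whose RecordGroupNumber matches, or None if there is no match.

    The path has the same shape as the XPath described in the text: select
    the row whose RecordGroupNumber equals the chosen value, then read its title.
    """
    return title_doc.findtext(f"Row[RecordGroupNumber='{number}']/RecordGroupTitle")
```

Because both functions read from the parsed code lists, renaming an agency or adding a record group only changes the XML data, which mirrors the maintenance benefit described above.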
Business Rules
The ABO forms must follow business rules that dictate the dependencies between the
fields within a form, and also between the fields of one ABO and another. For example, if the
"Required" indicator of a group of fields is checked, then valid values must be supplied to
all fields within that group.
Some of the rules for the interaction of code lists in forms can be quite complex. The
use of XForms and XML code lists allows these rules to be defined using XPath
statements.
Figure 3 shows a pull-down selection bar for access restrictions. The form components
change based on the selection in the pull-down list. The values of the pull-down list are
populated from a code list for access restrictions. There are business rules associated with
what information needs to be completed based on the value of the access restriction. For
instance, if the user selects "Presidential Records Act (p)(3)
Statute", the user must select a statutory citation (see Figure 4).
The XForms construct that controls whether the field is displayed based on the selection
is below:
The XPath above states that if the access restriction contains "FOIA (b)(3) Statute" or
"Presidential Records Act (p)(1)" then display the field.
Figure 5 shows the user selecting "PRMPA- Personal Privacy (D)".
You will notice that in Figure 6, no additional field is presented to the user.
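The display rule can be sketched as a simple predicate. This is an illustration in Python rather than the XForms construct itself. The trigger strings are the two values quoted in the description of the XPath, while the dict-based form, its field names, and the helper names are hypothetical.

```python
# Restriction values that make the statutory citation field relevant,
# as quoted in the description of the XPath.
CITATION_TRIGGERS = ("FOIA (b)(3) Statute", "Presidential Records Act (p)(1)")


def citation_field_relevant(access_restriction):
    """True when the selected restriction contains one of the trigger values.

    This mirrors a test of the form contains(x, A) or contains(x, B).
    """
    return any(trigger in access_restriction for trigger in CITATION_TRIGGERS)


def validate_access_section(form):
    """Return error messages for a form given as a dict (hypothetical keys)."""
    errors = []
    restriction = form.get("access_restriction", "")
    # The citation is only demanded when its field is relevant (displayed).
    if citation_field_relevant(restriction) and not form.get("statutory_citation"):
        errors.append("A statutory citation is required for this access restriction.")
    return errors
```

Keeping the trigger values in data rather than in the predicate means the rule can follow changes to the access restriction code list without a change to the logic.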
Content Governance
As mentioned earlier, the ABOs serve to administer the transfer of archival records into
the open archival system. Therefore, the ABOs will provide provenance and management metadata
for the digital objects ingested and stored in the system according to the Open Archival
Information System (OAIS) information reference model [14]. In ERA, the metadata of a
digital object is embodied in an Asset Catalog Entry (ACE).
Asset Catalog Entry
An ACE is represented by an XML document that conforms to a well-defined XML schema. At
a high level, the structure of an ACE is compliant with the various kinds of metadata
described in the OAIS information reference model [14]. The following
aspects have been considered in the design of the ACE structure:
Information type. From the OAIS model, an ACE should contain
information about its associated digital object in terms of:
Reference for uniquely identifying the digital object. Usually,
this identifier is location and protocol independent.
Provenance for preserving the history and chain of custody of
the object.
Context for recording the circumstances of the object's
creation.
Fixity for storing authenticity mechanisms of records such as
digital signature, and checksum.
Descriptive used for object search and discovery.
Standard integration. Given the diversity of data, business and
knowledge domains as well as types, one standard alone cannot cover all the information
types of an ACE. Therefore, the structure of an ACE should facilitate the integration of
different XML standards. For instance, the ACE's schema should incorporate the PREMIS
(Preservation Metadata Maintenance Activity) standard [16] for
preservation metadata. If the digital object is a still image, then the MIX standard [17]
will be included to describe the technical metadata of the image
object. NARA has its own standard, the Lifecycle Data Requirements Guide [15], which
archivists have used to convey descriptive information about an
object.
Metadata aggregation. The processing of an archived digital
object can be performed by different archivist groups using varied archival processing
systems and technologies. Associated metadata will be collected at each stage, and
finally accumulated in the ACE stored in the ERA system. Within this environment, the
ACE structure should be designed to facilitate easy and efficient import
of metadata generated by the processing applications. In order to achieve this, the ACE
can be divided into slots. Each slot is reserved for one archival processing system, and
the slots have disjoint data elements to avoid duplication and complex crosswalks.
Technical aspect. One last challenging aspect to consider is
that some metadata, such as file type and size, never changes after ingest into ERA,
while other metadata of the same object, such as description and access rights, will
certainly undergo changes. Therefore, two types of data management systems have to
be combined to accommodate these two kinds of metadata, the mutable and the immutable parts.
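The slot idea from the metadata aggregation aspect can be illustrated with a minimal in-memory sketch. It is written in Python for illustration only, since the actual ACE is an XML document with its own schema. The class name, method name, slot names, and element names are all hypothetical.

```python
class AssetCatalogEntry:
    """Toy model of an ACE whose metadata is divided into per-system slots."""

    def __init__(self, reference):
        self.reference = reference  # identifier of the digital object
        self.slots = {}             # slot name -> {data element: value}

    def import_slot(self, system, metadata):
        """Store metadata produced by one processing system in its own slot.

        Raises ValueError if the slot is already filled or if a data element
        name is already used by another slot, which keeps the slots disjoint.
        """
        if system in self.slots:
            raise ValueError(f"slot already filled: {system}")
        taken = {name for slot in self.slots.values() for name in slot}
        overlap = taken.intersection(metadata)
        if overlap:
            raise ValueError(f"data elements already used: {sorted(overlap)}")
        self.slots[system] = dict(metadata)
```

Under these assumptions, each processing application can hand over its metadata with one `import_slot` call, and an attempt to reuse a data element name is rejected at import time instead of requiring a crosswalk later.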
Archival Management Taxonomy
The design of the ACE structure should allow the classification of digital objects in ERA
according to different taxonomies. The Archival Management Taxonomy is a special taxonomy
that is mostly used by archivists or record managers from federal agencies. A public user
would most likely be interested in other taxonomies associated with the domain of the content
of the data. In order to support the Archival Management Taxonomy, an ACE must include
information extracted from the ABOs (Record Schedule and Transfer Request) that were created
before the digital object represented by this ACE was ingested. With this organization, we
can develop an application that allows business users to browse all digital objects grouped
under a set of related ABOs.
XForms and Business Process Management (BPM)
Recently the NARA ERA Systems Engineering Division developed a prototype to determine the
feasibility and challenges of introducing new standards-based technologies into the
current system. The ERA system requires that workflow processing be flexible and incorporated
into the system in a timely fashion, avoiding hard-wired business orchestrations with high
maintenance cost. The goals of the prototype were:
Find the best-of-breed BPM tool that supports modeling human interactions with the
business process in compliance with the BPMN (Business Process Modeling Notation) standard.
The BPM workflow should integrate seamlessly with our Forms Management (XForms)
solution.
Find the best-of-breed system orchestration tool compliant with BPEL (Business Process
Execution Language) [22]. BPEL orchestrations running on the ESB should be able
to integrate seamlessly with the BPM workflow above.
One of the challenges was the current lack of support for XForms among BPM
vendors. All BPM vendors have their own forms management system internal to their software.
Since the business objects (forms) are an integral part of the ABO Application capability, it
would be desirable to have a plug-and-play architecture. It would be difficult to replace one
BPM product with another if the forms were tied up in a proprietary format. Therefore, we
only considered products that support the abstraction of the form from the application
layer.
The XForms/BPM prototype simulated the business process capability of XForms interacting
with various candidate BPM software packages through a web service call to the BPM. The
same web service was used to interact with each of these packages for testing and
evaluation. In the current system, LDAP was used to authenticate users and determine their
roles in the portal.
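The value of putting one service interface between the forms and the BPM can be sketched as follows. The form layer hands its XML instance to a small client interface, and any BPM package that implements that interface can be swapped in. The names BpmClient, InMemoryBpm, submit_form, the process name, and the sample instance document are all assumptions for illustration; a real client would call the vendor's web service, which is replaced here by an in-memory stand-in.

```python
import xml.etree.ElementTree as ET
from typing import Protocol


class BpmClient(Protocol):
    """The one interface the form layer depends on (hypothetical)."""

    def start_process(self, process_name: str, payload: bytes) -> str: ...


class InMemoryBpm:
    """Stand-in for a BPM package; records calls instead of using a network."""

    def __init__(self):
        self.started = []

    def start_process(self, process_name, payload):
        self.started.append((process_name, payload))
        return f"{process_name}-{len(self.started)}"


def submit_form(instance_xml: str, bpm: BpmClient, process_name: str) -> str:
    """Send a submitted form instance to whichever BPM is plugged in."""
    root = ET.fromstring(instance_xml)  # raises ParseError on malformed XML
    payload = ET.tostring(root, encoding="utf-8")
    return bpm.start_process(process_name, payload)


# Hypothetical form instance, as an XForms submission might deliver it.
instance = "<transferRequest><title>Sample transfer</title></transferRequest>"
bpm = InMemoryBpm()
process_id = submit_form(instance, bpm, "CreateTransferRequest")
```

Because submit_form knows only the interface, evaluating another BPM candidate means writing another class with the same start_process method; the form itself stays untouched.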
The graphic below demonstrates the flow and interaction of the various components.
For illustration, the following diagram shows a BPMN workflow for the creation of a
Transfer Request (TR). A TR is normally required every time a Producer wants to transfer
electronic materials to the system. This business object must be created and approved
before any actual transfer can occur. Note that the workflow integrates the key
technologies presented in this paper: XForms, Genericode, BPM, BPEL, and XML database.
Thanks to the component and service architecture exhibited in the diagram, the design is
very flexible and inexpensive to maintain should we need to modify the workflow. Moreover,
the implementation and evolution of the components and services involved in the workflow
should not affect each other as long as their interfaces are maintained.
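The rule that a TR must be created and approved before any transfer can occur can be expressed as a small state machine. The sketch below is only an approximation under stated assumptions: the state names, the allowed transitions, and the class TransferRequest are invented for the example and do not reproduce the BPMN workflow in the diagram.

```python
from enum import Enum


class TRState(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    APPROVED = "approved"
    REJECTED = "rejected"


# Assumed transitions; a rejected TR returns to draft for rework.
ALLOWED = {
    TRState.DRAFT: {TRState.SUBMITTED},
    TRState.SUBMITTED: {TRState.APPROVED, TRState.REJECTED},
    TRState.REJECTED: {TRState.DRAFT},
    TRState.APPROVED: set(),
}


class TransferRequest:
    def __init__(self, tr_id):
        self.tr_id = tr_id
        self.state = TRState.DRAFT

    def advance(self, new_state):
        """Move to new_state, refusing transitions the workflow does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state

    def can_transfer(self):
        """A transfer may start only once the TR has been approved."""
        return self.state is TRState.APPROVED


tr = TransferRequest("TR-1")  # hypothetical identifier
before_approval = tr.can_transfer()
tr.advance(TRState.SUBMITTED)
tr.advance(TRState.APPROVED)
after_approval = tr.can_transfer()
```

Keeping the transition table as data, separate from the class, mirrors the goal stated above: the workflow can be modified without touching the components that act on it.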
Conclusion
In this paper, we have described an agile approach for managing Archival Business Objects
that is integrated into our XML-based stack, from presentation through business logic to
the persistence store.
Our approach leverages the synergy of various XML standards and technologies at multiple
layers of the software architecture:
Presentation layer with XHTML, XSL-FO
Form layer with XForms
Data layer with XSD and Genericode
Persistence storage layer with XML database.
We have shown that the combined application of XForms and Genericode provides enough
flexibility for the software process to adapt easily to changing and evolving business
requirements at NARA. Although the paper is based on our experience in developing ERA, we
believe that the scheme described herein can be generalized to other applications that need
flexible management of business objects via web forms with code lists. Furthermore, the
integration with BPM and BPEL greatly enhances the flexibility of the whole design.
[3] OASIS. Code List Representation (Genericode), Version 1.0, Committee Specification 01,
28 December 2007. Available:
http://docs.oasis-open.org/codelist/cs-genericode-1.0/doc/oasis-code-list-representation-genericode.pdf
[4] Erik Bruchez. XForms: an Alternative to Ajax? XTech 2006, Amsterdam, The Netherlands.
[5] Richard Cardone, Danny Soroker, and Alpana Tiwari.
Using XForms to Simplify Web Programming. WWW 2005, May
10-14, 2005, Chiba, Japan.
[6] Mikko Honkala and Mikko Pohja.
Multimodal Interaction with XForms. ICWE '06, July 11-14, 2006, Palo Alto, California.
[7] R. Bourret. XML and
Databases. Available: http://www.rpbourret.com/xml/XMLDBLinks.htm
[8] Orbeon. Available: http://www.orbeon.org
[9] OASIS. Web Services Business
Process Execution Language Version 2.0. Available:
http://docs.oasis-open.org/wsbpel/2.0/wsbpel-v2.0.pdf.
[10] Object Management Group/Business Process Management
Initiative. http://www.bpmn.org/.
[11] eXist, Open Source XML Database. Available: http://exist-db.org/
[12] W. Paul Kiel. Extend enumerated lists in XML schema: Explore options for your
extension solution. Available:
http://www.ibm.com/developerworks/library/x-extenum/.
[13] G. Ken Holman. Introduction to Code Lists in XML (Using Controlled Vocabularies in XML
Documents). Available:
http://www.xmlprague.cz/2009/presentations/G-Ken-Holman-Introduction-to-Code-List-Implementation.pdf.
[14] The Consultative Committee for Space Data Systems.
“Reference Model for an Open Archival Information System (OAIS)”, 2002. Available:
http://public.ccsds.org/publications/archive/650x0b1.pdf [Feb. 16, 2009].
[15] National Archives and Records Administration.
Lifecycle Data Requirements Guide. Available:
http://www.archives.gov/research/arc/about-arc.html#descriptions.