Hymers, Jessica, and Qinqin Lin. “Retractions and Corrections at Scholars Portal Journals.” Presented at Balisage: The Markup Conference 2023, Washington, DC, July 31 - August 4, 2023. In Proceedings of Balisage: The Markup Conference 2023. Balisage Series on Markup Technologies, vol. 28 (2023). https://doi.org/10.4242/BalisageVol28.Hymers01.
Balisage: The Markup Conference 2023 July 31 - August 4, 2023
Balisage Paper: Retractions and Corrections at Scholars Portal Journals
Jessica Hymers
Metadata Production and Electronic Access Specialist
Scholars Portal Journals, a service of the Ontario Council
of University Libraries (OCUL), is an XML-based repository that
hosts e-journal content for universities in Ontario. As part of the
OCUL mission to provide accurate and up-to-date access to scholarly
research, Scholars Portal has recently taken on a project to improve
how article corrections and retractions are handled in our workflow.
This paper introduces the new process of utilizing the JATS metadata element
<related-article> to link between articles and their corrections and
retractions so that users are immediately aware that there have been changes to
the article they are viewing. This paper also discusses challenges in this
process, including the inability to handle articles that have not been registered with
a Digital Object Identifier (DOI) and difficulties with inconsistent
use of attribute values.
Scholars Portal (SP) was founded in 2002 to provide the shared technology infrastructure
for the Ontario Council of University
Libraries (OCUL). A main service is the Journals platform, an XML-based digital repository
hosted in a MarkLogic database which
contains over 66 million articles from 26 thousand journals. The platform provides
access to these articles for faculty and students
from universities across Ontario. Additionally, in 2013 SP became the first Canadian
Trustworthy Digital Repository (TDR) certified by
the Center for Research Libraries, ensuring the long-term preservation of the journals
purchased by OCUL libraries. As a province-wide
access point and preservation service, SP has a significant responsibility to ensure
the accuracy and clarity of our content.
To meet this responsibility and to fulfill OCUL's mission of advancing research by
seeking out
innovative strategies for preserving and curating research resources
(OCUL),
we have implemented a new process of utilizing the <related-article> element of the
Journal Article Tag Suite (JATS) metadata
schema to connect articles with their retractions and corrections. This paper presents
the results of the implementation of this
process as well as challenges and future opportunities. It will also discuss how our
project fits into other standards and communities
of practice, introduce some history of handling retractions at SP, and present four
case studies of the process in our database.
Background
When a published article is found to have an error, publishers will issue a correction.
This is commonly in the form of a published
statement in the journal that published the original article. When an error is found
that calls into question the validity
of the research results, publishers will instead issue a retraction. Retraction is
the process through which a published article
is removed from a journal, and thus the scholarly record, and involves publishing
a retraction notice. Other reasons for retractions
include cases of duplicate publishing or cases of academic misconduct.
Much of the existing research in this area has a stronger focus on the decision to
retract an article and how the publisher accomplishes this,
than on how to handle the retraction in distribution, access, and preservation infrastructures.
This focus on publisher responsibilities is seen
in the retraction guidelines created and maintained by The Committee on Publication
Ethics (COPE) (COPE).
In the past few years, a number of communities have begun to address the role that
aggregators such as SP can have in this space. A detailed
report titled Recommendations from the Reducing the Inadvertent Spread of Retracted Science: Shaping
a Research and Implementation Agenda Project (Schneider et al., 2021) outlines the importance of continued work in this area and provides a number of
recommendations for future directions.
Additionally, NISO created the Communication of Retractions, Removals, and Expressions
of Concern (CREC) working group
(CREC) in the summer of 2022 to guide the creation of standards and best practices for
what to do after the decision to retract
an article has been made. Our project has aimed to align with the early findings and
suggestions of these groups by indicating the status of retracted
and corrected articles in our database and displaying that status for human users.
As the development of standards in this area is ongoing, we have
also aimed to keep our project flexible and adaptable to accommodate future changes
in standards and partner workflows.
Scholars Portal either receives data from publishers via FTP or uses scripts to pull
content from publishers' FTP sites.
Most commonly this content is received in the JATS metadata format. JATS, the Journal
Article Tag Suite, is a NISO standard (ANSI/NISO Z39.96-2021) that
provides a common XML format in which publishers and archives can exchange journal
content (NISO).
The SP team then uses programs referred to as "loaders" to ingest the content into
the database. An
average of 12,000 articles are loaded each day, often as soon as they are published.
Upon initial loading each article is
assigned a collection to determine user entitlements and a corresponding record is
created in a secondary database as part of
the Trustworthy Digital Repository (TDR). This TDR record contains basic bibliographic
information about the article as well
as information about its preservation history. The TDR ensures content is preserved
and reliable and also serves as a method by
which an article's history in the database can be investigated.
Within this loading process, SP had adopted a number of strategies to deal with retracted
content. A small number of publishers
provide occasional lists of retractions, which we then delete from the database or replace
with metadata indicating the retraction. We also
have a process for DOI verification to prevent duplicates within the database. With
this process, if a publisher issues a correction or retraction
notice with the same DOI as the original article, the notice overwrites the original
article in our database. All other cases were handled
by periodic database cleanup projects, which involved searching text fields such as
Title or Abstract for strings that could indicate a retraction,
such as "retraction:" or "retracted:". These strategies were labour-heavy, did not produce
clear results for SP users, and were limited to retractions.
Any retractions missed during this process, and all corrections, remained in our database
with no indication of the changes.
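The periodic cleanup search described above can be pictured roughly as follows. This is a minimal sketch in Python, not the actual SP cleanup code: the function name and the regular expression are assumptions, and only the two marker strings come from the text.

```python
import re

# Marker strings taken from the text; the pattern itself is an assumption.
RETRACTION_MARKERS = re.compile(r"\b(?:retraction|retracted):", re.IGNORECASE)


def looks_like_retraction(title, abstract=""):
    """Return True if the title or abstract contains a retraction marker."""
    return bool(
        RETRACTION_MARKERS.search(title) or RETRACTION_MARKERS.search(abstract)
    )


# Hypothetical titles, for illustration only.
for sample_title in (
    "Retraction: Effects of X on Y",
    "Correction: Effects of X on Y",
    "A history of retraction policies",
):
    print(sample_title, "->", looks_like_retraction(sample_title))
```

A text search of this kind flags only records whose text fields happen to carry the marker, which is why corrections and unlabelled retractions went unnoticed.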
To improve this process, SP has begun utilizing the <related-article> JATS metadata
element, which is meant to be used for the
description of a journal article related to the content but published separately
(JATS). For this process, the <related-article> element must also
include the ext-link-type attribute with the value "doi" and the xlink:href attribute,
which together create the link between articles, and the related-article-type
attribute to identify the type of relationship. JATS describes some suggested
values for the related-article-type attribute, but
publishers' inconsistent use of this attribute created some challenges.
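As an illustration of the three attributes involved, the following sketch reads <related-article> elements from an article record using Python's standard library. The sample record, its placeholder DOIs, and the function name are hypothetical; SP's own loaders are not shown in this paper and may work quite differently.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical sample record; the DOI and URL are placeholders.
SAMPLE = """
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <related-article related-article-type="retracted-article"
                       ext-link-type="doi"
                       xlink:href="10.1234/example.5678"/>
      <related-article related-article-type="companion"
                       ext-link-type="uri"
                       xlink:href="https://example.org/companion"/>
    </article-meta>
  </front>
</article>
"""


def extract_doi_links(article_xml):
    """Return (related-article-type, DOI) pairs for DOI-based links only."""
    root = ET.fromstring(article_xml.strip())
    links = []
    for rel in root.iter("related-article"):
        if rel.get("ext-link-type") != "doi":
            continue  # without a DOI the link cannot be resolved in this sketch
        doi = rel.get(f"{{{XLINK_NS}}}href")
        rel_type = rel.get("related-article-type")
        if doi and rel_type:
            links.append((rel_type, doi))
    return links


print(extract_doi_links(SAMPLE))
```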
Another possible approach would have been to cross-reference SP holdings against retraction
databases such as RetractionWatch (RetractionWatch) or open-retractions (open-retractions).
However, to ensure reliability, save processing time, and conserve server space, it
is preferable to work with data already in the SP database. Additionally,
these external retraction databases identify retractions through many of the same
methods that SP utilized prior to this project (Cheng 2019).
Not only would cross-referencing exclude other types of related articles such as corrections,
it would also miss anything that is not labeled explicitly as a retraction by the
publisher.
Case One
The most straightforward case is when both of the paired articles are sent to SP with
a <related-article>
element in the metadata. This is a common case when publishers retract an article
and then send SP a retraction
notice along with a revised PDF with a "retracted" watermark. In this case, the data
needed to create the link
between the articles is already present in the metadata and only needs to be displayed
by the website. The website
code reads from an XML mapping file to determine what to display based on the related-article-type
and xlink:href
attributes. Additionally, all display text has been translated into French to allow
for bilingual service.
The mapping file for the HTML display indicates the English and French text that will
be displayed for articles that contain
each related-article-type attribute value in each <related-article> element:
<dataset-type name="PublisherA">
<related-article type="retracted-article">
<DisplayText lang="en">This article is a retraction notice for:</DisplayText>
<DisplayText lang="fr">Cet article est un avis de rétraction pour:</DisplayText>
</related-article>
<related-article type="retraction-forward">
<DisplayText lang="en">This article has been retracted:</DisplayText>
<DisplayText lang="fr">Cet article a été retiré:</DisplayText>
</related-article>
</dataset-type>
The DOI in each <related-article> element indicates which article in our database
it will be linked to.
On the platform this is displayed as the display text together with a link to the related article (screenshot not reproduced here).
A limitation found in this case is that due to publishers' inconsistent usage of values
for the related-article-type attribute,
it was impossible to iterate across all <related-article> elements at once. Articles
were checked manually to determine how the
attribute was used by each publisher. A <dataset-type> element was then added to the
mapping file to distinguish a specific publisher
and <DisplayText> for each related-article-type attribute value was nested under <dataset-type>
so that different display texts could be
specified based on each publisher's use of the attribute value. Additionally, creating
the link between articles is only possible if
both articles have a DOI. Possible next steps for this project could investigate how
to recreate this process with other
publisher-specific IDs.
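The per-publisher lookup described in this case can be sketched as follows. The <dataset-type>, <related-article>, and <DisplayText> names come from the mapping file shown above; the wrapping <mapping> root element and the function name are assumptions made so that the example is self-contained, and the SP website code may be organized differently.

```python
import xml.etree.ElementTree as ET

# Mapping entries copied from the example above; the <mapping> root is assumed.
MAPPING = """
<mapping>
  <dataset-type name="PublisherA">
    <related-article type="retracted-article">
      <DisplayText lang="en">This article is a retraction notice for:</DisplayText>
      <DisplayText lang="fr">Cet article est un avis de rétraction pour:</DisplayText>
    </related-article>
    <related-article type="retraction-forward">
      <DisplayText lang="en">This article has been retracted:</DisplayText>
      <DisplayText lang="fr">Cet article a été retiré:</DisplayText>
    </related-article>
  </dataset-type>
</mapping>
"""


def display_text(mapping_root, publisher, rel_type, lang):
    """Return the display text for a publisher, attribute value, and language.

    Returns None when the publisher or attribute value is not mapped.
    """
    for dataset in mapping_root.iter("dataset-type"):
        if dataset.get("name") != publisher:
            continue
        for rel in dataset.findall("related-article"):
            if rel.get("type") != rel_type:
                continue
            for text in rel.findall("DisplayText"):
                if text.get("lang") == lang:
                    return text.text
    return None


mapping_root = ET.fromstring(MAPPING.strip())
print(display_text(mapping_root, "PublisherA", "retraction-forward", "fr"))
```

Keying the lookup on the publisher first is what allows the same attribute value to carry different display text for different publishers.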
Case Two
Case Two occurs when only one article of the pair includes the <related-article> element.
In order to create the link between articles,
the <related-article> element must be inserted into the record of the corresponding
article. This is a common case when publishers retract
an article and send a retraction notice without making any changes to the original
article.
For example, an original article is loaded to the database with no <related-article>
element. The publisher then retracts this article
and sends SP a retraction notice that includes the <related-article> element to indicate
the relationship to the original article which it is
retracting.
Element in retraction notice:
<related-article related-article-type="retracted-article"
id="d24e93"
ext-link-type="doi"
xlink:href="10.7759/cureus.6741">
<article-title>Comparison of Oral versus Intravenous Proton Pump Inhibitors
in Preventing Re-bleeding from Peptic Ulcer after Successful Endoscopic Therapy
</article-title>
</related-article>
To create the link between this retraction notice and the retracted article, SP has
created a program to insert the <related-article> element
into the article that is indicated by the DOI in the <related-article> element of
the retraction notice.
To determine the related-article-type attribute value to use in the inserted <related-article>
element, a <matching-article-type> element was
added to the mapping file:
<dataset-type name="PublisherA">
<related-article type="retracted-article">
<DisplayText lang="en">This article is a retraction notice for:</DisplayText>
<DisplayText lang="fr">Cet article est un avis de rétraction pour:</DisplayText>
<matching-article-type>retraction-forward</matching-article-type>
</related-article>
<related-article type="retraction-forward">
<DisplayText lang="en">This article has been retracted:</DisplayText>
<DisplayText lang="fr">Cet article a été retiré:</DisplayText>
<matching-article-type>retracted-article</matching-article-type>
</related-article>
</dataset-type>
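The insertion step might look something like the sketch below. It is not SP's program: the dictionary stands in for the <matching-article-type> values in the mapping file, the DOI is a placeholder, the function name is hypothetical, and the position of the new element inside <article-meta> is not checked against the JATS content model.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
HREF = f"{{{XLINK_NS}}}href"
ET.register_namespace("xlink", XLINK_NS)

# Stand-in for the <matching-article-type> values in the mapping file.
MATCHING_TYPE = {
    "retracted-article": "retraction-forward",
    "retraction-forward": "retracted-article",
}


def add_reciprocal_link(original, notice_doi, notice_rel_type):
    """Add to `original` a <related-article> pointing back at the notice.

    The element is added only once, so repeated runs do not create duplicates.
    """
    reciprocal_type = MATCHING_TYPE[notice_rel_type]
    for existing in original.iter("related-article"):
        if (existing.get(HREF) == notice_doi
                and existing.get("related-article-type") == reciprocal_type):
            return existing
    meta = original.find("front/article-meta")
    parent = meta if meta is not None else original
    # Simplification: appended at the end of the parent element.
    return ET.SubElement(parent, "related-article", {
        "related-article-type": reciprocal_type,
        "ext-link-type": "doi",
        HREF: notice_doi,
    })


# Hypothetical original article and placeholder notice DOI.
original = ET.fromstring("<article><front><article-meta/></front></article>")
add_reciprocal_link(original, "10.1234/notice.0001", "retracted-article")
add_reciprocal_link(original, "10.1234/notice.0001", "retracted-article")
print(ET.tostring(original, encoding="unicode"))
```

Checking for an existing element before inserting matters for a program that is run periodically over the same records.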
A limitation found in this case is that the use of each related-article-type
attribute value is sometimes inconsistent within a single
publisher. This limits the specificity and accuracy that are possible
with our labels. For example, if a publisher uses the
related-article-type attribute value "retracted-article" for all retraction notices, correction
notices, and retracted and corrected original articles, it
is impossible to differentiate these in the display. In this case, the general display
text "Additional materials:" is used.
Case Three
Because the data in the mapping file is separated by publisher, it was also necessary
to create a default case as a catch-all for publishers that begin
to use this element, or that start using related-article-type attribute values that have not yet
been added to the mapping file.
After the program is run, these default cases can be located in the log file, analyzed,
and manually added to the mapping file.
A limitation found in this case is that it does not account for changes to a publisher's
usage of the related-article-type attribute after that publisher
and attribute value have already been added to the mapping file. In this case, we
rely on user feedback to identify these errors.
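The default case and its log entries can be sketched as follows. The default display text is taken from Appendix 2; the dictionary-based mapping, the logger name, and the function name are assumptions for illustration, not the structure of SP's program.

```python
import logging

logger = logging.getLogger("related_article_mapping")  # hypothetical logger name

# Default display text, as listed in Appendix 2.
DEFAULT_TEXT = {"en": "Additional materials:", "fr": "Matériaux additionnels :"}

# Simplified stand-in for the XML mapping file.
MAPPING = {
    ("PublisherA", "retraction-forward"): {
        "en": "This article has been retracted:",
        "fr": "Cet article a été retiré:",
    },
}


def resolve_display_text(publisher, rel_type, lang, unmapped):
    """Return mapped display text, or the default for unmapped values.

    Unmapped (publisher, attribute value) pairs are logged and collected so
    that they can be reviewed and added to the mapping later.
    """
    entry = MAPPING.get((publisher, rel_type))
    if entry is None:
        logger.warning("unmapped related-article-type: publisher=%s type=%s",
                       publisher, rel_type)
        unmapped.append((publisher, rel_type))
        return DEFAULT_TEXT[lang]
    return entry[lang]


needs_review = []
print(resolve_display_text("PublisherB", "see-also", "en", needs_review))
print(needs_review)
```

Note that a fallback of this kind only fires for values that are absent from the mapping; a value that is present but whose meaning has changed at the publisher still resolves to its old display text, which is the limitation described above.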
Case Four
Unlike the first three cases, which involve only two articles, this case describes
a connection between three articles. In this example, an original article
was delivered to SP as usual. The publisher then issued an expression of concern to
indicate that the article was under review for retraction and then later
issued a retraction notice to indicate that a decision was made to retract the article.
The <related-article> element in the expression of concern creates a link
between it and the original article, and then the <related-article> element in the
retraction notice creates a second link to the original article.
Example:
Original article is received with no <related-article> element and is loaded to the
database as usual.
SP then receives an expression of concern that includes a <related-article> element
with an xlink:href attribute with the DOI of the original article.
A matching <related-article> element is then added to the original article with the
related-article-type attribute of "object-of-concern-forward" from the mapping and
the DOI from the matching <related-article> element in the expression of concern,
creating a link between the two articles.
SP then receives a retraction notice that includes a <related-article> element with
an xlink:href attribute with the DOI of the original article.
A second <related-article> element is then added to the original article with the
related-article-type attribute of "retraction-forward" from the mapping and the DOI
from the matching <related-article> element in the retraction notice, creating a second
link.
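The sequence above can be traced with a small sketch that records, for each original article, the links added as notices arrive. The DOIs are placeholders, the function name is hypothetical, and the incoming attribute value "object-of-concern" for the expression of concern is an assumption; only the two "-forward" values are stated in the example.

```python
from collections import defaultdict

# Reciprocal attribute values; the "object-of-concern" key is an assumption.
MATCHING_TYPE = {
    "object-of-concern": "object-of-concern-forward",
    "retracted-article": "retraction-forward",
}


def apply_notice(links_by_doi, notice_doi, notice_rel_type, original_doi):
    """Record a link from the original article back to a newly received notice."""
    reciprocal = (MATCHING_TYPE[notice_rel_type], notice_doi)
    if reciprocal not in links_by_doi[original_doi]:
        links_by_doi[original_doi].append(reciprocal)


links = defaultdict(list)

# Step 1: the expression of concern arrives and points at the original article.
apply_notice(links, "10.1234/concern", "object-of-concern", "10.1234/original")
# Step 2: the retraction notice arrives later and points at the same article.
apply_notice(links, "10.1234/retraction", "retracted-article", "10.1234/original")

for rel_type, doi in links["10.1234/original"]:
    print(rel_type, doi)
```

The original article ends up with two independent links, one per notice, so both the expression of concern and the retraction remain visible from it.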
The final results of this project include an XML mapping file which is loaded to
the MarkLogic database, a program to insert <related-article> elements into article
XML
records that is run once every three months, and some alterations to the code of the
SP website to display the link and descriptive text based on the mapping file. Maintaining
data in the mapping file instead of directly in the code of the program or the website
allows for ease of changes and updates as well as permitting faster loading and processing.
The program to insert <related-article> elements is currently being run for six of
the 24 publishers that include the element in their metadata. Nine publishers either
do not
send their data in the JATS format or do not include the <related-article> element
and required attributes, and so are not included in this project.
The mapping file includes 26 unique related-article-type attribute values (Appendix 1) but due to varied usage, many are repeated under different publishers resulting
in 73 unique
mappings. To display these, the mapping includes 10 unique DisplayTexts which are
included in both French and English (Appendix 2).
Other attribute values for related-article-type that are used by publishers but are
not included in the mapping file due to inconsistent usage or definitions outside
the scope of
this project include:
article
article-reference
author-rejoinder
author-response
continues
data-paper
editor-report
in-focus
in-this-issue
journal
letter
other-specified
patientsummary-article
point-of-view
preprint
refers-to
related
reply-article
see-also
subset-article
wiki
These values are not yet fully evaluated but include relationships between articles
such as peer review, related datasets, letters to the editor, and companion articles.
Until
these can be further explored, they are set to display as "Additional Materials:/Matériaux
additionnels:"
Conclusion
Scholars Portal loads an average of 12 thousand articles each day and serves over
500 thousand users across the province of Ontario and so has a responsibility to
ensure accurate and reliable content. Improving the clarity of the connection between
retractions, corrections, and original articles not only ensures SP users are
receiving the correct information but also offers users transparency into the process
of scholarly publishing. Utilizing the <related-article> JATS metadata element
was an effective approach to this project because it already exists in the metadata
of many of the corrections and retractions that are received from publishers. The
challenges included automating the process so that it could be implemented within
existing workflows for the high volume of data that SP handles daily, and dealing
with the inconsistency in attribute usage. Increased use of this JATS element, and
improved consistency in usage of attribute values would allow for increased accuracy
in
this project as well as expansion to include other types of related content such as
peer review information, letters to the editor, and comments.
To improve results and further align with CREC and RISRS recommendations, future steps
to this project could include adjusting the search function to filter out retracted
articles and investigating
how this work applies to other SP domains such as datasets and other supplementary
materials.
Acknowledgements
We would like to thank Sabina Pagoto and Jonathan Dorey for their French translations,
and Wei Zhao for her priceless historical knowledge of the SP database.
This appendix (Appendix 1) contains a list of the unique related-article-type attribute values
that are included in the SP mapping file:
addended-article
addended-article-forward
addendum
addendum-article
companion
companion-forward
concerning-article
concerning-article-forward
corrected-article
correction
correction-forward
default
default-forward
expression-of-concern
expression-of-concern-article
object-of-concern
object-of-concern-forward
original
republished-article
retracted-article
retraction
retraction-article
retraction-forward
update-to-article
withdrawn-article
withdrawn-article-forward
Appendix 2. Display Text
Note
This appendix contains a description and English and French DisplayText values of
each of the 10 unique DisplayTexts included in the SP mapping file
Table I
Unique DisplayText values
| Description | DisplayText lang="en" | DisplayText lang="fr" |
| --- | --- | --- |
| Default value for any related-article-type that has not been mapped - temporary until evaluated and added to the mapping | Additional materials: | Matériaux additionnels : |
| An original article for which an addendum has been published - link leads to notice of the addendum | There has been an addendum published for this article: | Un addendum a été publié pour cet article : |
| Published notices of addendum - link leads to the original article | This article is an addendum to: | Cet article est un addendum à : |
| Notices of correction or published comments - link leads to the original article | This article is a correction notice or comment for: | Cet article est un avis de correction ou un commentaire pour : |
| An original article for which comments or corrections have been published - link leads to the comment or notice of correction | There have been published comments or corrections to this article: | Des commentaires ou des corrections ont été publiés pour cet article : |
| Published corrections - link leads to original article that is being corrected | This article is a correction notice for: | Cet article est un avis de correction pour : |
| An original article for which a correction has been published - link leads to notice of the correction | There has been a correction published for this article: | Une correction a été publiée pour cet article : |
| Published notice that an article is under review for potential correction or retraction - link leads to the original article | This article is an expression of concern for: | Cet article est une manifestation de préoccupations pour : |
| Published notices of retraction - link leads to the original article that has been retracted | This article is a retraction notice for: | Cet article est un avis de rétraction pour : |
| An original article that is currently under review for potential correction or retraction - link leads to the expression of concern | There has been an expression of concern published for this article: | Une manifestation de préoccupations a été publiée pour cet article : |
| An original article that has been retracted - link leads to the published notice of retraction | This article has been retracted: | Cet article a été retiré : |
| An article that was retracted and then corrected and re-published - link leads to the original version that had been retracted | This article is a corrected and re-published version of a previously retracted article: | Cet article est une version corrigée et republiée d'un article précédemment rétracté |
References
[Cheng 2019]
Cheng, Y., Parulian, N., Hsiao, T., Dinh, L., Sarol, J., Schneider, J. (19-23 October
2019). ReTracker: Actively and Automatically Matching Retraction Metadata in Zotero.
82nd Annual Meeting of the Association for Information Science and Technology, Melbourne,
Australia.
https://doi.org/10.1002/pra2.32
[Schneider et al., 2021]
Schneider, J., Woods, N.D., Proescholdt, R., Fu, Y., & RISRS Team. (2021, July 29).
Recommendations from the Reducing the Inadvertent Spread of Retracted Science: Shaping
a Research and Implementation Agenda Project [online] [cited 13 July 2023].
https://doi.org/10.31222/osf.io/ms579
Author's keywords for this paper:
Scholars Portal; scholarly communication; article retractions; article corrections; JATS; XML