The quality of the machine-readable metadata associated with academic journal articles matters virtually as much as the quality of the research itself. That’s because scholarly databases and search engines rely on metadata to ingest and interpret information about content so they can serve it in search results. Without rich metadata, research is hard to find.
Most journal publishers know the importance of article-level metadata. But producing it can be challenging for those with limited resources.
In that camp? You’re not alone.
At Scholastica, we believe all journals should be able to have rich metadata without high costs, technical hassles, or unsustainable manual work. So, we’re making metadata production easier. Here’s a quick breakdown:
- Scholastica Peer Review System: Auto-generates JATS XML article-level metadata journals can import to Scholastica’s publishing solutions or export to an external system.
- Scholastica Production Service: Includes full-text XML article files in the JATS standard with rich metadata journals can import to Scholastica’s hosting platform or export to an external system.
- Scholastica Open Access Publishing Platform: Automatically generates JATS XML and HTML metadata from article creation form inputs, with the ability to import metadata from other Scholastica solutions.
In this blog post, we outline the role of metadata in article discovery and how Scholastica is helping journals produce and disseminate the machine-readable metadata they need.
JATS XML metadata with the latest PIDs for ALL Scholastica users
Among the primary discovery outlets for journal articles are scholarly abstracting and indexing databases (A&Is), such as MEDLINE and PubMed Central in the life sciences, many of which require machine-readable metadata deposits to ingest content. So, if you want your journal to be discoverable in relevant A&Is, producing XML metadata is essential.
Journals using any of Scholastica’s three products — our peer review system, production service, or OA hosting platform — don’t have to worry about formatting XML. Scholastica automatically generates article-level XML metadata files with the core fields required by publishing standards organizations and OA initiatives like Plan S, including:
- Publisher
- Journal title
- Article title
- Author names
- Abstract — in line with the Initiative for Open Abstracts (I4OA)
- Citations — in line with the Initiative for Open Citations (I4OC)
- Persistent Identifiers (including ISSN, DOI, ORCID, and ROR)
- Related article DOIs
- CRediT fields (via Contributor Roles Taxonomy integration)
- Copyright license
- Funding information (e.g., Crossref Funder ID)
- Journal issue details (e.g., publication date, volume, and issue number)
Scholastica formats XML files in the Journal Article Tag Suite (JATS) standard published by the National Information Standards Organization (NISO), including files formatted for the Crossref and DOAJ DTDs. We also offer integrations with Crossref and the DOAJ for journals using our OA hosting platform.
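To make the fields above more concrete, here is a minimal sketch of how a few of them could be written into the front matter of a JATS file using Python's standard library. It is an illustration only, not Scholastica's generator: the `build_jats_front` function, the dictionary keys, and all sample values (publisher, ISSN, DOI) are placeholders, and the output has not been validated against the full JATS schema.

```python
import xml.etree.ElementTree as ET


def build_jats_front(meta: dict) -> ET.Element:
    """Build a minimal JATS <article> holding only <front> metadata (hypothetical helper)."""
    article = ET.Element("article", {"article-type": "research-article"})
    front = ET.SubElement(article, "front")

    # Journal-level metadata: journal title, ISSN, publisher
    journal_meta = ET.SubElement(front, "journal-meta")
    title_group = ET.SubElement(journal_meta, "journal-title-group")
    ET.SubElement(title_group, "journal-title").text = meta["journal_title"]
    ET.SubElement(journal_meta, "issn", {"pub-type": "epub"}).text = meta["issn"]
    publisher = ET.SubElement(journal_meta, "publisher")
    ET.SubElement(publisher, "publisher-name").text = meta["publisher"]

    # Article-level metadata: DOI, title, authors (with ORCID iDs), abstract
    article_meta = ET.SubElement(front, "article-meta")
    ET.SubElement(article_meta, "article-id", {"pub-id-type": "doi"}).text = meta["doi"]
    article_title_group = ET.SubElement(article_meta, "title-group")
    ET.SubElement(article_title_group, "article-title").text = meta["article_title"]

    contrib_group = ET.SubElement(article_meta, "contrib-group")
    for author in meta["authors"]:
        contrib = ET.SubElement(contrib_group, "contrib", {"contrib-type": "author"})
        if author.get("orcid"):
            ET.SubElement(
                contrib, "contrib-id", {"contrib-id-type": "orcid"}
            ).text = author["orcid"]
        name = ET.SubElement(contrib, "name")
        ET.SubElement(name, "surname").text = author["surname"]
        ET.SubElement(name, "given-names").text = author["given_names"]

    abstract = ET.SubElement(article_meta, "abstract")
    ET.SubElement(abstract, "p").text = meta["abstract"]
    return article


# All values below are placeholders for illustration.
example_meta = {
    "publisher": "Example University Press",
    "journal_title": "Journal of Example Studies",
    "issn": "1234-5678",
    "doi": "10.1234/example.5678",
    "article_title": "A Sample Article",
    "authors": [
        {
            "surname": "Carberry",
            "given_names": "Josiah",
            "orcid": "https://orcid.org/0000-0002-1825-0097",
        }
    ],
    "abstract": "A short placeholder abstract.",
}

article = build_jats_front(example_meta)
print(ET.tostring(article, encoding="unicode"))
```

A production file would also carry the remaining fields from the list, such as the license, funding information, and issue details, which are omitted here to keep the sketch short.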
Full-text JATS XML for production customers
Journals that use the Scholastica Production Service also get full-text XML versions of all articles, which further enhances their discovery potential because some scholarly databases, like PubMed Central (PMC), require full text. The full-text XML of articles typeset by Scholastica’s production service is formatted to comply with PMC’s indexing criteria, with the option to automate PMC deposits.
With full-text XML files, text and data mining of articles also becomes possible (i.e., using online scripts or machine-learning tools to analyze content). For example, a scholar might employ text and data mining to compile an aggregate of articles that reference a particular subject or to compare related data sets. As more scholars incorporate (meta)data analysis into their work, the value of producing full-text machine-readable files is increasing.
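As a rough illustration of the first use case, the sketch below scans a handful of JATS-style full-text documents and lists the ones whose body text mentions a given subject. The `articles_mentioning` function, the file names, and the tiny inline XML samples are all made up for this example; a real text-mining workflow would read files from disk and likely use more sophisticated matching.

```python
import xml.etree.ElementTree as ET


def articles_mentioning(xml_documents: dict[str, str], term: str) -> list[str]:
    """Return the IDs of documents whose <body> text contains `term` (case-insensitive)."""
    needle = term.lower()
    matches = []
    for doc_id, xml_text in xml_documents.items():
        root = ET.fromstring(xml_text)
        body = root.find("body")  # full-text JATS keeps article content in <body>
        if body is None:
            continue  # metadata-only file: nothing to mine
        # Join all text nodes and collapse whitespace so inline tags don't break phrases
        text = " ".join(" ".join(body.itertext()).split()).lower()
        if needle in text:
            matches.append(doc_id)
    return matches


# Tiny made-up documents standing in for real full-text XML files.
documents = {
    "article-a.xml": (
        "<article><front/><body><sec><title>Methods</title>"
        "<p>We trained a machine <italic>learning</italic> model.</p></sec></body></article>"
    ),
    "article-b.xml": "<article><front/><body><p>This study surveys coral reefs.</p></body></article>",
    "article-c.xml": "<article><front/></article>",
}

print(articles_mentioning(documents, "machine learning"))
```

Note that the metadata-only file is skipped entirely, which is the practical difference full-text XML makes for this kind of analysis.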
HTML metadata optimized for search engines, including Google Scholar, for OA Publishing users
The other key to journal discoverability is having a search-optimized website. Many search engines like Google and Google Scholar index content using “web crawlers,” which are automated internet programs that systematically “crawl” websites to identify and ingest information about them. When crawlers come to a journal website (or any website for that matter), they look for HTML meta tags that provide descriptive metadata in a format they can parse. So, it’s imperative to have them! It’s also critical for crawlers to be able to quickly locate individual articles, so each article should be hosted on its own webpage with a search-friendly URL structure.
Journals that use Scholastica’s OA publishing platform can adhere to the above search optimization best practices with no added work. OA publishing customers get a website template structured to enable search engine indexing (including Google Scholar requirements) with HTML metadata for all articles. Scholastica ensures that article pages are available to web crawlers and that they’re easy for search engines to find.
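For readers curious what crawler-friendly meta tags look like, here is a small sketch that renders a few of the `citation_*` tags recommended in Google Scholar's inclusion guidelines. The `scholar_meta_tags` function, the dictionary keys, and the sample article are hypothetical, and the exact set of tags Scholastica emits may differ.

```python
from html import escape


def scholar_meta_tags(article: dict) -> str:
    """Render bibliographic <meta> tags for one article page (hypothetical helper)."""
    pairs = [("citation_title", article["title"])]
    # One citation_author tag per author, in byline order
    pairs += [("citation_author", author) for author in article["authors"]]
    pairs += [
        ("citation_publication_date", article["publication_date"]),
        ("citation_journal_title", article["journal_title"]),
        ("citation_pdf_url", article["pdf_url"]),
    ]
    # Escape values so quotes or ampersands in titles cannot break the HTML
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in pairs
    )


# Placeholder article data for illustration only.
sample_article = {
    "title": "Rich Metadata & Discovery",
    "authors": ["Carberry, Josiah", "Doe, Jane"],
    "publication_date": "2024/06/18",
    "journal_title": "Journal of Example Studies",
    "pdf_url": "https://journal.example.org/article/5678.pdf",
}

print(scholar_meta_tags(sample_article))
```

These tags would sit in the `<head>` of each article's own webpage, which is one reason hosting every article at its own URL matters.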
Article-level metadata is searchable via Scholastica journal websites
Scholastica’s OA journal hosting platform also includes built-in search functionality, so metadata applied to published articles is searchable via journal websites, including the option to apply granular combinations of specifications.
For example, a reader visiting a technology journal website might specify that they’re interested in articles that include the keywords “Artificial Intelligence” and “Machine Learning” and were published after June 2024, to see the most current content on those subjects.
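Conceptually, that kind of combined search is a filter over article metadata. The sketch below shows the idea with a hypothetical `Article` record and `search` function; it is not how Scholastica's search is implemented, and the sample articles are invented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    title: str
    keywords: list[str]
    published: date


def search(articles: list[Article], required_keywords: list[str], published_after: date) -> list[Article]:
    """Keep articles that carry every required keyword and were published after the cutoff."""
    wanted = {k.lower() for k in required_keywords}
    return [
        a
        for a in articles
        if wanted <= {k.lower() for k in a.keywords}  # all required keywords present
        and a.published > published_after
    ]


# Invented sample records for illustration.
catalog = [
    Article("Neural Nets in Practice", ["Artificial Intelligence", "Machine Learning"], date(2024, 9, 1)),
    Article("Early Expert Systems", ["Artificial Intelligence", "Machine Learning"], date(2023, 2, 14)),
    Article("Robotics Survey", ["Artificial Intelligence"], date(2024, 10, 5)),
]

# "Published after June 2024" is read here as after June 30, 2024.
results = search(catalog, ["artificial intelligence", "machine learning"], date(2024, 6, 30))
print([a.title for a in results])
```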
Easy metadata imports and exports across products and services to save you time
Another critical aspect of metadata management is moving metadata through the publishing lifecycle, which we know can be tricky. No editorial team likes having to manually re-enter the metadata collected with article submissions when it’s time to publish. At least not any that we’re aware of!
Moving metadata is easier with Scholastica. Journals can seamlessly transfer metadata between our peer review and publishing solutions. For example, journals that use the Scholastica Peer Review System can import all their accepted articles and accompanying metadata straight from our peer review software to our production service and/or OA publishing platform.
Scholastica can also support metadata imports to and exports from external publishing tools and services in many cases, including integrating with the Silverchair Platform (we automatically generate SCJATS). If you’re interested in any of our products and want to know if you can import metadata from an existing peer review system or export it to an external production or hosting platform, please schedule a demo to have an informational call with a member of our team. Every case is a bit different and may require development work because structured data/XML can be highly variable across systems and journals, so our team will need to do a standard technical assessment to determine the best next steps.
Helping to make journals more discoverable: Metadata and beyond
At Scholastica, we’re committed to helping journal publishers meet the latest industry standards efficiently and sustainably. Producing rich machine-readable metadata for journals using any of our solutions is just one of the ways we’re doing that.
To learn more about how Scholastica is supporting sustainable publishing and helping journals meet Open Access standards like the Plan S guidelines, visit our Product Roadmap: Plan S, Core Open Access Publishing Standards & Scholastica.
This post was originally published on August 29, 2019, and updated with new feature information on June 18, 2024.