Finding the Tipping Point in Automated Markup During Up-Translation (2017)
In July 2017, I presented a discussion of the challenges of automatic markup in publishing software at the Balisage pre-conference symposium, Up-Translation and Up-Transformation: Tasks, Challenges, and Solutions, in Washington, D.C. Balisage is an annual conference for markup theoreticians and practitioners, data modelers, designers, and architects. My paper is summarized below:
Up-translation can be accomplished automatically or manually: automatic translation introduces errors and misses content; manual translation introduces different errors and is time-consuming. The best results are obtained by finding a middle ground between automation and manual tagging. However, finding that middle ground is a challenge unto itself. Addressing that challenge requires carefully balancing investment in software development for automation, automatic flagging of suspect cases for manual review, and designing a tagging and quality assurance workflow that is robust and efficient. Balancing automation with manual review is the key to dealing with the inevitable inconsistencies, ambiguities, and “gotcha” moments found when up-translating scholarly manuscripts to models such as JATS and BITS.
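To make the idea of flagging suspect cases concrete, here is a minimal sketch of confidence-based triage. It is an illustration only, not the workflow described in the paper: the `TaggedItem` class, the `confidence` score, and the 0.9 threshold are all hypothetical and assume the automatic tagger can report how sure it is about each tag.

```python
from dataclasses import dataclass


@dataclass
class TaggedItem:
    text: str          # source text that was tagged
    tag: str           # element name proposed by the automatic tagger
    confidence: float  # hypothetical tagger score between 0.0 and 1.0


def triage(items, threshold=0.9):
    """Split tagged items into auto-accepted and flagged-for-review lists."""
    accepted, flagged = [], []
    for item in items:
        if item.confidence >= threshold:
            accepted.append(item)
        else:
            flagged.append(item)
    return accepted, flagged


# Hypothetical sample data for illustration.
items = [
    TaggedItem("Journal of Examples", "source", 0.97),
    TaggedItem("2015", "year", 0.99),
    TaggedItem("Spring", "season", 0.55),
]

accepted, flagged = triage(items)
for item in flagged:
    print(f"Review needed: <{item.tag}> {item.text!r} ({item.confidence:.2f})")
```

Raising the threshold sends more items to manual review and lowers the risk of silent errors; lowering it does the reverse. Where to set it is the balancing question the paper addresses.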
Citation: Gebhard, Caitlin. “Finding the Tipping Point in Automated Markup During Up-Translation.” Presented at Symposium on Up-Translation and Up-Transformation: Tasks, Challenges, and Solutions, Washington, DC, July 31, 2017. Proceedings of the Symposium on Up-Translation and Up-Transformation: Tasks, Challenges, and Solutions. Balisage Series on Markup Technologies, 20 (2017). https://doi.org/10.4242/BalisageVol20.Gebhard01.
Wrangling Math from Microsoft Word into JATS XML Workflows (2016)
This paper clarifies the different forms of equations that can be encountered in Word documents and discusses the issues and idiosyncrasies of converting these various forms to MathML, LaTeX, and/or images in the JATS XML model. It also touches on workflow alternatives for handling equations in various rendering environments and how those downstream requirements may affect the means of equation extraction from Word documents. Originally presented at the Journal Article Tag Suite Conference (JATS-Con) in April 2016, this paper was accepted and published in the journal Learned Publishing later that year.
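As a rough illustration of how the three output forms can coexist in JATS, the sketch below builds a `<disp-formula>` whose `<alternatives>` wrapper holds a MathML, a LaTeX, and an image version of the same equation. It uses only Python's standard `xml.etree.ElementTree`. The function name, the `eq1` identifier, and the `eq1.png` file name are hypothetical, and the sketch does not reproduce any particular conversion tool discussed in the paper.

```python
import xml.etree.ElementTree as ET

MML = "http://www.w3.org/1998/Math/MathML"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mml", MML)
ET.register_namespace("xlink", XLINK)


def build_disp_formula(formula_id, mathml_root, tex, image_href):
    """Wrap MathML, LaTeX, and image renditions of one equation for JATS."""
    formula = ET.Element("disp-formula", {"id": formula_id})
    alternatives = ET.SubElement(formula, "alternatives")
    alternatives.append(mathml_root)                    # MathML rendition
    tex_math = ET.SubElement(alternatives, "tex-math")  # LaTeX rendition
    tex_math.text = tex
    # Image rendition, referenced by file name (hypothetical here).
    ET.SubElement(alternatives, "graphic", {f"{{{XLINK}}}href": image_href})
    return formula


# A trivial MathML equation consisting of the single identifier x.
math = ET.Element(f"{{{MML}}}math")
mi = ET.SubElement(math, f"{{{MML}}}mi")
mi.text = "x"

formula_xml = ET.tostring(
    build_disp_formula("eq1", math, "x", "eq1.png"), encoding="unicode"
)
print(formula_xml)
```

Which alternatives a workflow actually emits depends on what the downstream rendering environments can consume, which is the trade-off the paper discusses.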
Citation: Gebhard, Caitlin and Bruce Rosenblum. “Wrangling math from Microsoft Word into JATS XML workflows.” Learned Publishing, 29 (2016): 271-279. https://doi.org/10.1002/leap.1058
Citation: Gebhard C, Rosenblum B. Wrangling Math from Microsoft Word into JATS XML Workflows. In: Journal Article Tag Suite Conference (JATS-Con) Proceedings 2016 [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2016. Available from: https://www.ncbi.nlm.nih.gov/books/NBK350572/