How to modify a pandoc Lua filter to handle multiple citekeys separated by semicolons?


I have a DOCX file with references formatted as citekeys for citeproc. I know little about Lua filters and regex, but I managed to write a filter that identifies and handles a single citekey enclosed in []. How can I adapt it to handle several citekeys enclosed in one pair of []?

With the Lua filter below, the string "Some text by [@van_der_voet_environmental_2019] was relevant." is correctly matched and transformed into "Some text by van der Voet (2019) was relevant.". I expected the same filter to work for "Also [@van_der_voet_environmental_2019; @another_2022; @your_ref_2018] were relevant...", but that string remains untouched. I suppose I am missing something in the match pattern, or something that handles multiple keys.

function Str(el)
  local citekey = el.text:match("%b[]")
  if citekey then
    local s = string.gsub(citekey, "[%[%]@]", "")
    local citation = pandoc.Citation(s, 'AuthorInText')
    return pandoc.Cite({pandoc.Str(s)}, {citation})
  end
end

Bonus point: if the citekey is followed by a ".", the period disappears with the current filter. What am I missing?

Answer by koyaanisqatsi

string.gsub() should be enough.
Example

t = [[With the LUA filter below, the string "Some text by [@van_der_voet_environmental_2019] was relevant." is correctly matched and transformed into "Some text by van der Voet (2019) was relevant.".
I expected that the same filter worked for "Also [@van_der_voet_environmental_2019; @another_2022; @your_ref_2018] were relevant..." but that string remains untouched.
I suppose I might be missing something in the matching statement or something that handles the multiple input.]]

t:gsub('%b[]', function(m) local m = m:gsub('[%[%]%@]', '') print(m) return(m) end)

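The function passed to gsub() prints what it has done with the match m.
It should return only the transformed string, not the substitution count that gsub() also returns.
That is why the result is first assigned to a single local variable, which keeps the string and drops the count, and then returned.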
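
To make the point about return values concrete, here is a small sketch in plain Lua (no pandoc needed). The sample string and the variable names are made up for illustration; only string.gsub() itself comes from the answer.

```lua
-- Plain Lua, no pandoc required. The sample text is made up for illustration.
local sample = "[@another_2022]"

-- gsub returns the new string AND the number of substitutions made.
local stripped, count = sample:gsub("[%[%]@]", "")
print(stripped, count)  --> another_2022    3

-- Assigning to a single local keeps the string and discards the count.
local only_string = sample:gsub("[%[%]@]", "")
print(only_string)  --> another_2022

-- Wrapping the call in parentheses also truncates it to one value.
print((sample:gsub("[%[%]@]", "")))  --> another_2022
```

The count is 3 here because "[", "@" and "]" are each replaced once.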

With a function as the replacement you can also extend the handling of each match, for example to deal with the semicolons.
The following example assumes t is already defined as above and shows only the print() output:

> t:gsub('%b[]', function(m) local m, c = m:gsub('[%[%]%@]', ''):gsub('%;%s', '\10') print(m) return(m) end);
van_der_voet_environmental_2019
van_der_voet_environmental_2019
another_2022
your_ref_2018
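
If the keys are needed one by one rather than as a newline-separated string, they can be collected into a table. The sketch below is plain Lua; split_citation is a hypothetical helper, not part of pandoc. It also returns the text before and after the brackets, so that a trailing "." can be put back instead of being dropped along with the rest of the token. In the filter, each key would presumably be passed to pandoc.Citation() and the whole list to pandoc.Cite(), as in the question's code. Note that, as far as I understand pandoc's AST, a Str element never contains spaces, so a Str filter only ever sees "[@van_der_voet_environmental_2019;" on its own; the helper would have to be applied to the joined text of neighbouring inlines.

```lua
-- Hypothetical helper: split one bracketed group into prefix, keys, suffix.
local function split_citation(text)
  -- Capture the text before "[", the bracket contents, and the text after "]".
  local prefix, inner, suffix = text:match("^(.-)%[(.-)%](.*)$")
  if not inner then
    return nil  -- no bracketed group in this text
  end
  local keys = {}
  -- A key is "@" followed by characters up to the next ";" or whitespace.
  for key in inner:gmatch("@([^;%s]+)") do
    keys[#keys + 1] = key
  end
  return prefix, keys, suffix
end

local prefix, keys, suffix = split_citation(
  "[@van_der_voet_environmental_2019; @another_2022; @your_ref_2018].")
print(#keys)    --> 3
print(keys[1])  --> van_der_voet_environmental_2019
print(keys[3])  --> your_ref_2018
print(suffix)   --> .
```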

That is why I love string.gsub(): called as a method on a string (with the :), it can be chained on m (and/or t).
So keep in mind: the replacement argument of gsub() can be a string, a table (key = matched string, value = replacement string) or a function.
string.gsub(s, pattern, repl) where repl is a string, a table or a function
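
A short sketch of the three kinds of replacement that string.gsub() accepts as its third argument, in plain Lua. The sample text and the labels table are made up for illustration.

```lua
local text = "[@another_2022; @your_ref_2018]"

-- 1. String replacement: delete the brackets and the "@" signs.
local with_string = text:gsub("[%[%]@]", "")
print(with_string)    --> another_2022; your_ref_2018

-- 2. Table replacement: each match is looked up as a key in the table.
--    (Hypothetical labels; matches without an entry are left unchanged.)
local labels = {
  another_2022 = "Another (2022)",
  your_ref_2018 = "Your Ref (2018)",
}
local with_table = text:gsub("[%w_]+", labels)
print(with_table)     --> [@Another (2022); @Your Ref (2018)]

-- 3. Function replacement: called with the capture, its result replaces
--    the whole match (including the "@").
local with_function = text:gsub("@([%w_]+)", function(key)
  return key:upper()
end)
print(with_function)  --> [ANOTHER_2022; YOUR_REF_2018]
```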