Regular Expressions for q

By default, q’s string search and replace function ssr does not handle regular expressions. However, it is possible to obtain regular expression capability by writing a q extension that uses the PCRE library.

Here is my implementation: re_q-20080714.bz2. To see it at work:

q)sub:`re 2:(`sub;3)
q)s:"quick brown fox"
"quick BROWN fox"
"|q|u|i|c|k| |b|r|o|w|n| |f|o|x|"
"*quick* *brown* *fox*"
"uickq rownb oxf"

It can run faster than the ssr function:

q)\t do[1000000;sub[s;"brown";"BROWN"]]
q)\t do[1000000;ssr[s;"brown";"BROWN"]]

I’ve used the C++ interface of PCRE because most of the work is already implemented in pcrecpp’s GlobalReplace() function.
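
To give a rough idea of what a global replace does, here is a minimal sketch that uses std::regex_replace from the C++ standard library as a stand-in for pcrecpp’s GlobalReplace(). It is not the code from the extension: the helper name global_sub is hypothetical, and the q glue (converting q strings to and from C++ strings) is left out entirely. Note also that std::regex uses the ECMAScript replacement syntax ($1, $2), whereas pcrecpp rewrite strings use backslash-escaped digits (\1, \2).

```cpp
#include <regex>
#include <string>

// Hypothetical helper, for illustration only: replace every match of
// `pattern` in `text` with `replacement` and return the new string.
// std::regex_replace stands in here for pcrecpp's GlobalReplace();
// the real extension works on q strings and calls PCRE instead.
std::string global_sub(const std::string& text,
                       const std::string& pattern,
                       const std::string& replacement) {
    // Compile the pattern (ECMAScript grammar by default).
    const std::regex re(pattern);
    // By default regex_replace substitutes all non-overlapping matches,
    // which is the "global" behaviour wanted here.
    return std::regex_replace(text, re, replacement);
}
```

With this helper, global_sub("quick brown fox", "(\\w+)", "*$1*") wraps each word in asterisks, and global_sub("quick brown fox", "(\\w)(\\w*)", "$2$1") moves the first letter of each word to its end. These patterns are my own picks for illustration and are not necessarily the ones used in the session above.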

20080715: Attila pointed out that the speed difference between sub and ssr is not so surprising, since ssr is not a built-in function of k. Rather, it is defined in q.k as:

ssr:{,/@[x;1+2*!_.5*#x:(0,/(0,+/~+\(>':"["=y)-<':("]"=y))+/:x ss y)_x;$[100>@z;:[;z];z]]}

