Set regex flags

Hello,

I was wondering if there is a way to set the regex flag i (case-insensitive) in Kobo. I want my regex to accept URLs regardless of letter case, e.g. it must accept www.google.com and also WwW.google.com.

The www is where I’m having issues.

Should it support either www or WwW only?

No, it shouldn’t. It should also accept .CoM or HtTps or .orG etc. Basically any letter in the URL can be uppercase or lowercase, but it must still be a proper URL.

Is this a finite list or an infinite list?

There is no user-settable flag on the XPath regex() function to make it case insensitive. But if you post your current (case-sensitive) URL regex expression, it can be readily modified to make it case insensitive.
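As a rough illustration of that kind of modification, each literal letter in a pattern can be replaced by a two-letter character class, so that www becomes [wW][wW][wW]. The sketch below uses Python only to show the idea; the helper name case_insensitive_literal is hypothetical and is not part of Kobo or XPath.

```python
import re


def case_insensitive_literal(text: str) -> str:
    """Hypothetical helper: expand each letter of a literal into a
    [lower/upper] character class, e.g. 'www' -> '[wW][wW][wW]'."""
    parts = []
    for ch in text:
        if ch.isalpha() and ch.lower() != ch.upper():
            # A letter: accept either case.
            parts.append(f"[{ch.lower()}{ch.upper()}]")
        else:
            # Anything else is matched literally (escaped if it is special).
            parts.append(re.escape(ch))
    return "".join(parts)


if __name__ == "__main__":
    pattern = case_insensitive_literal("www.")
    print(pattern)
    print(bool(re.fullmatch(pattern, "WwW.")))
```

The resulting pattern needs no flag, so it could be pasted into a regex() constraint as plain text.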

Thanks, this is what I have written so far:

regex(.,'https?://(?:www.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9].[^\s]{2,}|www.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9].[^\s]{2,}|https?://(?:www.|(?!www))[a-zA-Z0-9]+.[^\s]{2,}|www.[a-zA-Z0-9]+.[^\s]{2,}')

To be clear, should it match a full URL (e.g. “http://WwW.google.com/path/TO/resource.TXT”) or a domain name only (e.g. “WwW.google.com”)?

It also doesn’t hurt to post a few examples of what you do and do not want to match.

The regex should accept both the URL and the domain name.
A few examples:

http://WWw.google.Com
http://Www.google.com
https://WWw.google.Com
HttPs://Www.google.com
gooLE.com or Google.coM - for websites that don’t use www
hTTp://www.google.COM
HtTp://WWW.google.com
Https://www.google.com/Path/TO/resoUrce
Https://wWW.google.com/pAtH/to/rEsoUrce
wWw.google.COm/PaTH/tO/resourcE

These examples should represent all the use cases.

Try this regex:

^([hH][tT][tT][pP][sS]?:\/\/)?[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)+([\/?].+)?$

You can test it here: https://regex101.com/
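The pattern can also be checked locally against the examples posted above. This is a minimal sketch using Python's re module, which is not the engine Kobo uses, so edge cases may behave differently; the names URL_PATTERN, SHOULD_MATCH and is_valid_url are made up for this illustration.

```python
import re

# The regex suggested above, copied verbatim into a raw string.
URL_PATTERN = re.compile(
    r"^([hH][tT][tT][pP][sS]?:\/\/)?[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)+([\/?].+)?$"
)

# Mixed-case examples taken from the thread.
SHOULD_MATCH = [
    "http://Www.google.com",
    "https://WWw.google.Com",
    "HttPs://Www.google.com",
    "gooLE.com",
    "Google.coM",
    "hTTp://www.google.COM",
    "HtTp://WWW.google.com",
    "Https://www.google.com/Path/TO/resoUrce",
    "Https://wWW.google.com/pAtH/to/rEsoUrce",
    "wWw.google.COm/PaTH/tO/resourcE",
]


def is_valid_url(value: str) -> bool:
    """Return True if the whole value matches the suggested pattern."""
    return URL_PATTERN.match(value) is not None


if __name__ == "__main__":
    for sample in SHOULD_MATCH:
        print(f"{sample!r}: {is_valid_url(sample)}")
```

Case insensitivity comes from two places: the scheme is spelled out letter by letter as [hH][tT][tT][pP], and the host labels use a-zA-Z, which already covers both cases.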

[Note: a few negative examples don’t hurt either. A regex that is too permissive is as useless as one that is too restrictive; e.g. the following will also happily accept all of the above: ^.*$ ]
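To make the point about negative examples concrete, here is a small Python sketch comparing the suggested regex with ^.*$ on inputs that should be rejected. The negative examples are assumptions chosen for illustration, not values from the original poster, and Python's re module may differ from Kobo's regex engine in edge cases.

```python
import re

# The suggested pattern and the deliberately over-permissive one.
STRICT = re.compile(
    r"^([hH][tT][tT][pP][sS]?:\/\/)?[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)+([\/?].+)?$"
)
PERMISSIVE = re.compile(r"^.*$")

# Hypothetical inputs that a URL constraint should reject.
NEGATIVE_EXAMPLES = ["not a url", "google", "http://", "ftp://google.com"]


def accepts(pattern: re.Pattern, value: str) -> bool:
    """Return True if the pattern matches the value."""
    return pattern.match(value) is not None


if __name__ == "__main__":
    for sample in NEGATIVE_EXAMPLES:
        print(
            f"{sample!r}: strict={accepts(STRICT, sample)}, "
            f"permissive={accepts(PERMISSIVE, sample)}"
        )
```

A pattern that passes every positive example but also accepts all of these inputs would give no real validation, which is why both kinds of test cases are worth keeping.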

Thank you very much. This is the solution I was looking for, it works perfectly.
