Culturally aware initcap
Culturally aware initcap
Any version
SQL
Nothing
Written by Peter Geoghegan ("sternocera")
The following is an implementation of what I call "culturally aware initcap", which is usable as a drop-in replacement for initcap, provided input is in English. It solves a number of problems I've often encountered when using regular initcap, particularly with possessive apostrophes, but also with various common contractions (isn't, can't, won't, I'm, you've, you're). It does this without stepping on the toes of people with Irish names like O'Shaughnessy and O'Sullivan.
with regular initcap: SELECT initcap($$O'SULLIVAN'S friend DOESN'T LIVE HERE anymore$$)
output: "O'Sullivan'S Friend Doesn'T Live Here Anymore"
with cul_initcap: SELECT cul_initcap($$O'SULLIVAN'S friend DOESN'T LIVE HERE anymore$$)
output: "O'Sullivan's Friend Doesn't Live Here Anymore"
CREATE OR REPLACE FUNCTION cul_initcap(input_val text) RETURNS text AS
$function_body$
SELECT
replace(
replace(
replace(
replace(
replace(
replace(
replace(
regexp_replace(initcap($1),
$$'(([MSTD]|Ve|Re|Ll))([^[:upper:][:lower:]]|$)$$,
$$'{@*#!\1!#*@}\3$$,
'g'
)
, '{@*#!M!#*@}', 'm')
, '{@*#!S!#*@}', 's')
, '{@*#!T!#*@}', 't')
, '{@*#!D!#*@}', 'd')
, '{@*#!Ve!#*@}', 've')
, '{@*#!Re!#*@}', 're')
, '{@*#!Ll!#*@}', 'll');
$function_body$
LANGUAGE 'sql' IMMUTABLE STRICT;
One notable limitation of my implementation is that it will incorrectly convert "magical brace enclosed literals" into their corresponding output values: '{@*#!T!#*@}' ,'{@*#!S!#*@}', '{@*#!M!#*@}', '{@*#!Ve!#*@}' (and the rest) become 't', 's', 'm' and 've' respectively. I doubt that this is much of a concern generally. If it is a concern, use more elaborate magical braces that are even less likely to occur in strings that exist in production.