Soundex
From PostgreSQL wiki
Compute string soundex
Works with PostgreSQL
Tested on 9.0, 9.1. Probably works in earlier versions
Written in
PL/pgSQL
Depends on
Nothing
Soundex is a simple algorithm for comparing/normalizing English words based on phonetic similarity.
PostgreSQL 8.3 and later include a soundex(text) function in the fuzzystrmatch extension. Note that this function behaves differently from some other implementations, as it does not skip similar consonants that are separated by 'w' and 'h'.
PL/pgSQL version
If you can't install the fuzzystrmatch extension (due to limited privieges or other reasons), here is an alternative version in PL/pgSQL. This function behaves identically to the fuzzystrmatch implementation. (Tested on PostgreSQL 9.0 and 9.1, using dictionaries in many different languages).
CREATE OR REPLACE FUNCTION soundex(input text) RETURNS text IMMUTABLE STRICT COST 500 LANGUAGE plpgsql AS $$ DECLARE soundex text = ''; char text; symbol text; last_symbol text = ''; pos int = 1; BEGIN WHILE length(soundex) < 4 LOOP char = upper(substr(input, pos, 1)); pos = pos + 1; CASE char WHEN '' THEN -- End of input string IF soundex = '' THEN RETURN ''; ELSE RETURN rpad(soundex, 4, '0'); END IF; WHEN 'B', 'F', 'P', 'V' THEN symbol = '1'; WHEN 'C', 'G', 'J', 'K', 'Q', 'S', 'X', 'Z' THEN symbol = '2'; WHEN 'D', 'T' THEN symbol = '3'; WHEN 'L' THEN symbol = '4'; WHEN 'M', 'N' THEN symbol = '5'; WHEN 'R' THEN symbol = '6'; ELSE -- Not a consonant; no output, but next similar consonant will be re-recorded symbol = ''; END CASE; IF soundex = '' THEN -- First character; only accept strictly English ASCII characters IF char ~>=~ 'A' AND char ~<=~ 'Z' THEN soundex = char; last_symbol = symbol; END IF; ELSIF last_symbol != symbol THEN soundex = soundex || symbol; last_symbol = symbol; END IF; END LOOP; RETURN soundex; END; $$;
Example
db=# select column1, soundex(column1) from (values ('Robert'), ('Rupert'), ('Rubin'), ('Ashcroft')) data;
column1 | soundex
----------+---------
Robert | R163
Rupert | R163
Rubin | R150
Ashcroft | A226
db=# create table words(word text);
db=# copy words from '/usr/share/dict/words';
db=# create index on words(soundex(word));
db=# select * from words where soundex(word)=soundex('hotel');
word
--------
hotel
hotly
huddle
