Normalize whitespace

From PostgreSQL wiki
Jump to navigationJump to search

Library Snippets

Normalize space

Works with PostgreSQL

Any version

Written in

SQL

Depends on

Nothing


This SQL function normalizes the space in a string, removing any leading or trailing space and reducing any internal whitespace to one space character per occurrence. It can be useful in creating domains.

CREATE OR REPLACE FUNCTION normalize_space(TEXT)
RETURNS TEXT
IMMUTABLE
LANGUAGE SQL
AS $$
SELECT regexp_replace(
    trim($1),
    E'\\s+',
    ' ',
    'g'
);
$$;
CREATE OR REPLACE FUNCTION normalize_space(text) 
RETURNS text LANGUAGE plperl AS $$
    $_ = $_[0]; 
    s/(^\s+|\s+$)//g;
    s/\s+/ /og;
    return $_;
$$;

So, if you want to update/insert to a table and normalize the text previously this action, you might create a trigger before insert for each row, like this:

 
CREATE TABLE pepito (wordy text, moody text, bar text);

DROP IF EXISTS trigger n_space on pepito;

CREATE TRIGGER n_space BEFORE INSERT OR UPDATE ON pepito FOR EACH ROW EXECUTE PROCEDURE n_space();

This function trigger, looks up fields of text type and touch NEW values with the replace expression.

CREATE OR REPLACE FUNCTION n_space() 
RETURNS trigger
VOLATILE
LANGUAGE plperl
AS $$
	my $rs = spi_exec_query(<<"    EOT");
    SELECT
        a.attname as name,
        format_type(a.atttypid,pg_attribute.atttypmod) 
    FROM
        pg_class AS c
    JOIN
        pg_attribute AS a
        ON (
            a.attrelid = c.oid
        AND c.oid = '$_TD->{table_name}'::regclass 
        AND a.attstorage <> 'p'
        AND format_type(a.atttypid,pg_attribute.atttypmod) = 'text'
    EOT
        
	my $rs_rows = $rs->{processed}-1;
	foreach my $rn (0 .. $rs_rows){
		my $row = $rs->{rows}[$rn]->{name};
		$_TD->{new}{$row} =~ s/(^\s+|\s+$)//g;
		$_TD->{new}{$row} =~ s/\s+/ /g;		
	}	
	return "MODIFY";
$$;
TRUNCATE TABLE pepito;
INSERT INTO pepito
VALUES
    ('asd asd AS     asd asf      ',' asd  4t45gr g er    ergt',' asd sa'),
    ('asd asd AS     asdbhfg',' asd  4t45gr g er    ergt',' asd sa'),
    ('    asd asd AS     asd asf      ',' asd  4t45gr g er    ergt',' asd sa');
SELECT * FROM pepito;