Difference between revisions of "Don't Do This"

From PostgreSQL wiki
Jump to: navigation, search
(Created page with "A short list of common mistakes. == Don't use psql -W == Don't use <tt>psql -W</tt> or <tt>psql --password</tt>. === Why not? === Using the --password or -w flags will tel...")
(No difference)

Revision as of 23:58, 23 May 2018

A short list of common mistakes.

Don't use psql -W

Don't use psql -W or psql --password.

Why not?

Using the --password or -w flags will tell psql to prompt you for a password, before trying to connect to the server - so you'll be prompted for a password even if the server doesn't require one.

It's never required, as if the server does require a password psql will prompt you for one, and it can be very confusing when setting up permissions. If you're connecting with -w to a server configured to allow you access via peer authentication you may think that it's requiring a password when it really isn't. And if the user you're logging in as doesn't have a password set or you enter the wrong password at the prompt you'll still be logged in and think you have the right password - but you won't be able to log in from other clients (that connect via localhost) or when logged in as other users.

When should you?

Never, pretty much. It will save a round trip to the server but that's about it.

Don't use timestamp

Don't use the timestamp type to store timestamps, use timestamptz (also known as timestamp with time zone) instead.

Why not?

timestamptz records a single moment in time. Despite what the name says it doesn't store a timestamp, just a point in time described as the number of microseconds since January 1st, 2000 in UTC. You can insert values in any timezone and it'll store the point in time that value describes. By default it will display times in your current timezone, but you can use "at timezone" to display it in other time zones.

Because it stores a point in time it will do the right thing with arithmetic involving timestamps entered in different timezones - including between timestamps from the same location on different sides of a daylight savings time change.

timestamp (also known as timestamp without time zone) doesn't do any of that, it just stores a date and time you give it. You can think of it being a picture of a calendar and a clock rather than a point in time. Without additional information - the timezone - you don't know what time it records. Because of that, arithmetic between timestamps from different locations or between timestamps from summer and winter may give the wrong answer.

So if what you want to store is a point in time, rather than a picture of a clock, use timestamptz.

More about timestamptz.

When should you?

If you're dealing with timestamps in an abstract way, or just saving and retrieving them from an app, where you aren't going to be doing arithmetic with them then timestamp might be suitable.

Don't use money

The money data type isn't actually very good for storing monetary values. Numeric, double or even integer may be better.

Why not?

lots of reasons.

It's a fixed-point type, implemented as a machine int, so arithmetic with it is fast. But it doesn't handle fractions of a cent (or equivalents in other currencies), it's rounding behaviour is probably not what you want.

It doesn't store a currency with the value, rather assuming that all money columns contain the currency specified by the database's lc_monetary locale setting. If you change the lc_monetary setting for any reason, all money columns will contain the wrong value. That means that if you insert '$10.00' while lc_monetary is set to 'en_US.UTF-8' the value you retrieve may be '10,00 Lei' or '¥1,000' if lc_monetary is changed.

Storing a value as a numeric, possibly with the currency being used in an adjacent column, might be better.

When should you?

If you're only working in a single currency, aren't dealing with fractional cents and are only doing addition and subtraction then money might be the right thing.

Don't use timetz

Don't use the timetz type. You probably want timestamptz instead.

Why not?

Even the manual tells you it's only implemented for SQL compliance.

The type time with time zone is defined by the SQL standard, but the definition exhibits properties which lead to questionable usefulness. In most cases, a combination of date, time, timestamp without time zone, and timestamp with time zone should provide a complete range of date/time functionality required by any application.

When should you?


Don't use upper case table or column names

Don't use NamesLikeThis, use names_like_this.

Why not?

PostgreSQL folds all names - of tables, columns, functions and everything else - to lower case unless they're "double quoted".

So create table Foo() will create a table called foo, while create table "Bar"() will create a table called Bar.

These select commands will work: select * from Foo, select * from foo, select * from "Bar".

These will fail with "no such table": select * from "Foo", select * from Bar, select * from bar.

This means that if you use uppercase characters in your table or column names you have to either always double quote them or never double quote them. That's annoying enough by hand, but when you start using other tools to access the database, some of which always quote all names and some don't, it gets very confusing.

Stick to using a-z, 0-9 and underscore for names and you never have to worry about quoting them.

When should you?

If it's important that "pretty" names are displaying in report output then you might want to use them. But you can also use column aliases to use lower case names in a table and still get pretty names in the output of a query: select character_name as "Character Name" from foo.

Don't use char(n)

Don't use the type char(n). You probably want text.

Why not?

char(n) is a fixed width type, and any string you insert into it will be padded with spaces to that fixed width. That's probably not what you actually want.

The manual says:

Values of type character are physically padded with spaces to the specified width n, and are stored and displayed that way. However, trailing spaces are treated as semantically insignificant and disregarded when comparing two values of type character. In collations where whitespace is significant, this behavior can produce unexpected results; for example SELECT 'a '::CHAR(2) collate "C" < E'a\n'::CHAR(2) returns true, even though C locale would consider a space to be greater than a newline. Trailing spaces are removed when converting a character value to one of the other string types. Note that trailing spaces are semantically significant in character varying and text values, and when using pattern matching, that is LIKE and regular expressions.

That should scare you off it.

It being a fixed width type does waste space, but doesn't make operations on it any faster.

When should you?

When you're porting very, very old software that uses fixed width fields. Or when you read the snippet from the manual above and think "yes, that makes perfect sense and is a good match for my requirements" rather than gibbering and running away.

Don't use varchar(n) by default

Don't use the type varchar(n) by default. Consider varchar (without the length limit) or text instead.

Why not?

varchar(n) is a variable width text field that will throw an error if you try and insert a string longer than n characters (not bytes) into it.

varchar (without the (n)) or text are similar, but without the length limit. If you insert the same string into the three field types they will take up exactly the same amount of space, and you won't be able to measure any difference in performance.

If what you really need is a text field with an length limit then varchar(n) is great, but if you pick an arbitrary length and choose varchar(20) for a surname field you're risking production errors in the future when Hubert Blaine Wolfe­schlegel­stein­hausen­berger­dorff signs up for your service.

Some databases don't have a type that can hold arbitrary long text, or if they do it's not as convenient or efficient or well-supported as varchar(n). Users from those databases will often use something like varchar(255) when what they really want is text.

When should you?

When you want to, really. If what you want is a text field that will throw an error if you insert too long a string into it, and you don't want to use an explicit check constraint then varchar(n) is a perfectly good type. Just don't use it automatically without thinking about it.

Don't use rules

Don't use rules. If you think you want to, use a trigger instead.

Why not?

Rules are incredibly powerful, but they don't do what they look like they do. They look like they're some conditional logic, but they actually rewrite a query to modify it or add additional queries to it.

That means that all non-trivial rules are incorrect.

Depesz has more to say about them.

When should you?

Probably never. If you're doing something like "ON SELECT DO INSTEAD SELECT" you can probably use a view instead.