MySQL Chapter Five (Administration)

Documentation Version: 0.95
MySQL Version: 3.20.29

Granting Access to a Database

The MySQL database system uses a special database to grant access privileges to its databases. These privileges can be based on hosts and/or users, and can be granted to a single database or many.

User accounts are passworded. When connecting to the database the password is encrypted to assure that it can not be intercepted and used by others.

The MySQL database contains three tables. They are defined below.

Database: mysql Table: db
Field Type Null Key Default Extra
Host char(60) PRI
Db char(32) PRI
User char(16) PRI
Select_priv char(1) N
Insert_priv char(1) N
Update_priv char(1) N
Delete_priv char(1) N
Create_priv char(1) N
Drop_priv char(1) N
File_priv char(1) N

Database: mysql Table: host
Field Type Null Key Default Extra
Host char(60) PRI
Db char(32) PRI
Select_priv char(1) N
Insert_priv char(1) N
Update_priv char(1) N
Delete_priv char(1) N
Create_priv char(1) N
Drop_priv char(1) N

Database: mysql Table: user
Field Type Null Key Default Extra
Host char(60) PRI
User char(16) PRI
Password char(8)
Select_priv char(1) N
Insert_priv char(1) N
Update_priv char(1) N
Delete_priv char(1) N
Create_priv char(1) N
Drop_priv char(1) N
Reload_priv char(1) N
Shutdown_priv char(1) N
Process_priv char(1) N
File_priv char(1) N

The following is an example of adding new users.

$ mysql mysql mysql> INSERT INTO user VALUES ('%','monty',password('something'), -> 'Y','Y','Y','Y','Y','Y','Y','Y','Y','Y'); mysql> INSERT INTO user (host,user,password) values('localhost','dummy','') ; mysql> INSERT INTO user VALUES ('%','admin','','N','N','N','N','N','N','Y','N','Y','N'); mysql> quit $ mysqladmin reload

This makes three new users:

monty: Full superuser, but must use password when using mysql.

dummy: Must be granted individual database privileges through table 'db'.

admin: Doesn't need a password but is only allowed to use 'mysqladmin reload' and 'mysqladmin processlist'. May be granted individual database privileges through the 'db' table.

NOTE: You MUST use the password() function when creating a user that has a password. The MySQL database system expects passwords to be stored encrypted.

Attributes assigned in the user table over-ride attribute assigned to a user in a given DB. If you are paranoid or have a server that provides multiple databases you probably want to create users in the user account with no privileges, and then assign appropriate privileges on a database by database basis.

If you are unfortunate enough to be using the MIT threads package please note that localhost will not work for you since the MIT threads package does not support socket connections. This means you must always specify your hostname when making a connection.

The host, and db fields may contain the SQL REGEXP characters % and _. It is NOT valid to use SQL REGEXP characters in other fields.
A host may be localhost, a hostname, IP or SQL expression. An empty host entry means any host.
A db is a database or a SQL expression.
An empty username means any user.
An empty password means that any password will do. The user table is or'ed to the db table. You can create a 'super-user' by setting all privilege fields to 'Y' in the user table when creating the account. This user will be able to do ANYTHING, regardless of the values you have placed in the DB table!
The host table will only be checked when there is an entry in the db table that contains a blank for the host field.

All tables are sorted according to host-user-db if possible.

user Sorted by host and user.
db Sorted by host, user and db.
host Sorted by host and db.

Starting with version 3.20.19 of the server there is one other thing to consider when trying to figure out how a given user will access a given database. Assume you have a user by the name of joe that logs in under the following rule.

INSERT INTO user VALUES ('%.external.domain.com','','','N','N','N','N','N','N','N','N','N','N');

Checks against the db table will be made using user '', rather than user joe, even if user joe has an entry in the db table.

Creating user accounts is probably the most confusing aspect of the MySQL database system when you first start out, so don't be surprised if it takes you awhile to get it right. Use of the mysqlaccess program will generally make things clearer.

In general it is best to avoid the use of regular expressions in the host and db fields when you first start out. This will make debugging significantly easier.

More specifically, set all host field entries to '%' and leave the host table empty. Once you have things working this way experiment with adding host table entries if necessary.

If you get an 'Access denied' error it means that you successfully connected to the mysqld daemon, but have the wrong information in your user table.

How Passwords Work

The scrambled password is stored in the mysql 'user' table.

On connection the server sends a random number to the client.
The client scrambles the password to get what the server has and calculates a number from the random number and the scrambled password. This number is sent to the server.
The server calculates a number from its saved password and the random number. If this number is same as the client number the connecting is accepted.

Running mysqld (At System Boot)

Solaris

Copy the mysql.server script from the mysql bin directory and into /etc/init.d.

Verify all paths in the mysql.server script. Also check safe_mysqld.

ln /etc/init.d/mysql.server /etc/rc2.d/S99mysql.server

You should be set.

Linux

On RedHat Linux systems you will have to add a line to your /etc/rc.d/rc.local file similar to the following.

/usr/local/bin/safe_mysql &

Modify the path to reflect wherever you have installed the MySQL distribution.

Running mysqld (After System Boot)

The mysqld program can be run as any user. Generally it is run as root. If it is run as some other user please be sure to set all file permissions appropriately.

su to whatever user owns MySQL and type the following command. (This assumes that the mysql binaries are in your path. If not, give the full path to the mysql binary directory)

safe_mysql &

Compile Time Options

There are several values that can be "tweaked" at compile time to modify the behavior of MySQL.

Disabling Alternate User Login

If you do not wish people to be able to use the -u option to mysql to login as a user other than themselves you can comment out the following line in global.h

#define SAFE_USER 1 /* Comment this if you are paranoid */

Disabling Automatic Defaults

By default MySQL will automatically assign a default value to fields that are not implicitly declared. MySQL will do this based on type. If you would prefer fields with undeclared values to be NULL then add the following to the global.h file and recompile MySQL.

#define DONT_USE_DEFAULT_FIELDS

Changing The Usable Key Length

By default MySQL will only look at the fist one hundred and twenty bytes of a key. You can increase this by editing the file nisam.h and changing the value of N_MAX_KEY_LENGTH. Note that this may lead to lower performance. There is no reason to make this value shorter, as it is possible to specify a character based key as being shorter than the field it represents. See the section on Keys in Chapter 3 for more details on this.

Changing Sort Order

By default MySQL will sort strings according to ISO8859-1 (latin1). You may change this when running configure by specifying the following option.

   --with-charset=[charset]

Where [charset] may be one of: latin1 latin2 koi8_ru dec8 dos hp8 swe7 usa7

For clues on how to add new character sets examine the strings directory in the source distribution.

Note: If you change character sets after having already created a database you will have to run isamchk with the -e and -q options on all the tables or MySQL will not work correctly.

Adding Functions To MySQL

If you understand C++ and Yacc it is relatively easy to add function to the MySQL database system.

First add a line to sql_lex.cc defining the function name in the sql_functions array.

Next add two lines to the sql_yacc.y. The first will define the preprocessor symbol yacc can define. You should add this to the start of the file. Next define the function parameters and create an 'item' with these parameters.

In item_func.h declare a class that inherits from Item_num_func or Item_str_func, depending on whether your function returns a number or a string.

In intem_func.cc add:

double *Item_func_newname::val()

If you are defining a numeric function or

String *Item_func_newname::Str(String *str)

If you are defining a string function.

You should also define the following function:

void Item_func_newname::fix_length_and_dec()

This should at least calculate 'max_length' based on the given arguments. max_length is the maximum number of characters the function may return.

If the function can't return a null, you should also set 'maybe_null=0'.

For string functions the 'String *str' argument provided is a string buffer that may be used to hold the result. A string function should return the string that holds a result.

All current string functions try to avoid alloc'ing any memory unless absolutely necessary!

Fast Data Load

The fastest way to load data into MySQL is by way of the LOAD DATA INFILE command. This should be three to twenty times faster than using INSERT statements. For even more speed you can use the following procedure to disable key building on your table while the data is loading. You will need to either be cd'd into your MySQL database directory, or provide full paths to your database.

isamchk -rq -k0 [database_name]/[table_name] Remove keys

LOAD DATA INFILE ... Read in your data

isamchk -rq [database_name]/[table_name] Recreate table keys

mysqladmin reload. Tell mysql to use the new table

Tuning mysqld

When you issue the -O option by itself you will be given a list of all tunable variables and their current values. This should look similar to the following.

$ mysqld -O

Possible variables to option -O are:

Name Default

back_log Specify the size of the listen queue for incoming tcp/ip connections. Be sure to set this higher when you expect a large number of connections in a short period of time. Some Operating Systems may have a max of 128 or 256 for this value. The man page for listen(2) should have more details.

keybuffer Buffer to hold all recently used keys. A big buffer gives the best performance. With the -Sl switch only one buffer is allocated.

max_allowed_packet The server connection buffer may be resized up to this if the client gives long commands. The initial buffer is 'net_buffer_length'. One buffer is allocated per connection.

net_buffer_length Initial size of the connection buffer. One buffer is allocated per connection.

max_connections The maximum number of connections that mysqld can have open at the same time.

table_cache Maximum number of tables kept open on the server. Tables are kept open to give better query speed on frequently used tables.

recordbuffer Buffer used to read records sequentially. One buffer will be allocated per connection.

sortbuffer Buffer used when sorting. One buffer will be allocated per connection.

max_sort_length Maximum time for sort?

Things to Know

All threads share the same base memory.
MySQL does not currently use memmap. This may change.
You may specify a key buffer size when starting mysqld. The key buffer will cache all keys in all tables on a FIFO basis. (Variable keybuffer)
Each connection uses some thread space, a stack and a connection buffer (variable net_buffer_length).
Each request doing a sequential scan over a table allocates a read buffer (variable recordbuffer).
Each request doing a sort allocates a sortbuffer and one or two temporary files. The maximum disk-space needed is (sort_key_length + sizeof(long))*2.
All joins are done in one pass and the result is saved in a memory hash table. Doing a full join and a sort on two large tables can take up a large amount of memory.
Almost all memory used when parsing and calculating is done on a local store which may be freed in a big hunk: No memory overhead is needed for small items and the normal slow memory allocation/freeing is avoided. Memory is only allocated for unexpectedly large strings (This is done with malloc()/free().
Index files are opened once. Data files are opened once for each concurrently running thread. Each concurrent thread is allocated a full table struct and field struct plus memory for three full column rows. A BLOB would take five to eight bytes.
If a table has BLOB's a buffer will be allocated to read the BLOB data. This buffer will grow as necessary, and will end up being as big as the largest BLOB the table contains.
When a table is used it is saved in a cache. This cache will be loaded and flushed on a FIFO basis. By default mysqld will cache 64 tables. If multiple threads access the same table there will be one entry per thread. IE, if two threads access the same table there will be two entries in the cache.
A mysqladmin reload closes all tables that are not being used and marks all used tables to be closed when the running threads finish. This will effectively free most used memory.

If you have an application that will require a large number of threads it is a good idea to lower the values of recordbuffer and sortbuffer to somwhere in the neighborhood of 32K. You can even lower recordbuffer to 8K without to much problem. In addition you might want to look at lowering the keybuffer size to 512K. This will save you a large amount of memory, and lead to better performance.

Other Performance Hints

After you have loaded a fair amount of data into your database you can run 'isamchk -a' Over your tables. The isamchk program will collect statistics on your tables that the optimizer can use. Generally you will only have to run isamchk once on a table. This assumes that the data in the table at the time isamchk is run is representative of the typical distribution in your table.

Query Optimization

What optimizations are done on the WHERE clause?

Here's a partial list:

Brace removal (All unnecessary braces are removed)
((a AND b) AND c OR (((a AND b) AND (c AND d)) -> (a AND b) OR (a AND b AND c AND d)
Constant folding
(a<b AND b=c) AND a=5 -> b>5 AND b=5
Constant condition removal (Needed because of constant folding)
(b>=5 AND b=5) OR (b = 6 and 5 = 5) or (B=7 and 5 = 6) ->= B=5 or B=6
Return zero rows if the select is impossible.
Find all keys that may be used. Use the key that finds the least records. A key is used for the following cases:
=, >, >= <, <=, BETWEEN and a LIKE with a character prefix 'something%'

key=1 or A=10                          -> NULL (Can't use key)
key=1 or A=10 and key=2                -> key=1 OR key=2
key_part_1=const and key_part_3=const  -> key_part_1=const

Read all constant tables
A constant table is:
1. ) A table with only 1 record.
2. ) A table which a uses only other const tables and constants on a full unique key.
const_table.key = constant
const_table.key_part_1 = const_table2.field and
const_table.key_part_2 = constant

Find the best join combination to join tables (By trying all possibilities :( )
For each table use a ranged key if possibly to read records. Each table index is queried and if there exists a key range that spans < 30% of the records then a key is used, else a quick table scan is used.
Before each record is output, skip those that match the HAVING clause.

Optimizing Table Space Usage

There are several things you can do to minimize the amount of space your tables take. First of all, use NOT NULL whenever possible. This will speed up queries, and save 1 bit per field.

You can save a lot of space by using the smaller integer values. For instance, MEDIUMINT is often sufficient.

In general you should choose the smallest data-type you will need if you are concerned with space.

One thing to keep in mind is the fact that fixed length fields are significantly faster in mysql. By avoiding variable length fields ( VARCHAR, BLOBs) it is possible to improve performance by 10 to 200%.

Type Mismatches On INSERT

Currently the 'INSERT statement' is relaxed and only gives an error when inserting a NULL in a NOT NULL column.

When using INSERT, or command that may affect many records the following are done.

Values are converted to the right type (from integer/real/string to integer/real/string) before storing.
Strings that are to long are truncated and the 'warnings' variable is incremented.
If the value is below the minimum value for the column, the minimum value is inserted and the 'warnings' variable is incremented.
If the value is above the maximum value for the column, the maximum value is inserted and the 'warnings' variable is incremented.
When converting a string to a integer, the value is truncated after the first non digit. If there are any non digit characters after the number the 'warnings' variable is incremented.
When converting a string to a float, the 'warnings' is incremented if there are any extra characters after the real number.

To hold down the client/server communication 'warnings information' is only returned when using 'ALTER TABLE' or 'LOAD DATA FROM...'

What MySQL Version Numbers Mean

Here's an explanation of the version numbering scheme for MySQL

111.222.333

111 If the first digit changes it means major modifications have been made. This may include new file formats, a new protocol, etc. Be prepared to make some modifications to your applications before doing and upgrade. At the very least you will need to dump and reload your tables.

222 Significant new code/functionality. May include small changes to the SQL syntax supported or to client interfaces. Check your You shouldn't have to make any major changes to your applications, but check carefully before committing yourself.

333 Small generally safe changes that shouldn't break anything. The biggest problem here is generally the addition of SQL keywords, which then conflict with existing table names. If you avoid using keywords for the SQL-92 standard for your table names you should be OK.
It's a good idea to always check the NEWS file when you down-load a new version. This should give you a good idea of what has changed since your last update.

The Client/Server Protocol

Key:

< = client 
> = server
[] is one byte
[2: ] is two bytes packed low byte first
[3: ] is three bytes packed low byte first
[4: ] is four bytes packed low byte first
[string: ] is a null byte terminated string
[length: ] This is a length coded integer. Check
           net_store_length() for this one!
           The main idea is to get most frequent length bytes in 1
           byte but to allow longer data (and NULL)

The Gory Details

Every sent and received packet has the following prefix.

[3: packet length] [packet number] packet data

The packet reader adds a [0] after each received package (to get easy handling of error strings)

The packet number starts from 0 and increments for each send/receive packet. Its a char (modulo 255)

Packet Types

:connect

> [protocol version] [string: password crypt seed]

< [2: 0] [3: max_allowed_client_packet_length] [string: user name]

[string: scrambled password]

> ok packet

Check the file password.c from the MySQL source release for the method used to encrypt a password. Password should be empty if the user has no password.

:ok

[0] [length: affected_rows] [length: unique id]

:error

(May come at any time)

> [255] [string: error message]

:command

< [command number]

enum enum_server_command {SLEEP,QUIT,INIT_DB,QUERY,FIELD_LIST, CREATE_DB,DROP_DB,RELOAD,SHUTDOWN,STATISTICS, PROCESS_INFO,CONNECT}

:query

< [QUERY command] [string: query string] (The end null is not sent)

> [0] [length: affected rows] [length: insert id] (Insert, delete...)

> [length: column_count] (a query result)

> column field data packets

> row data packets

:data

: This is done repeatedly until a packet which only consist of a single [254] is found. Beware there may be packets which starts with 254. Each data packet has 'column count' fields.

Format for each field:

[:length] == NULL_LENGTH -> NULL field

[:length] [length data] (Not null terminated strings!)

The column data package consists of 5 columns with the following data:

[:string table name]
[:string column name]
[:3 create length of column (may be larger in a few cases)]
[:1 type (as of enum_field_types)]
[:1 flag] [:1 decimals] (a 2 byte binary column!)

When using the command list_fields there is a sixth column:
[:string default]

Look at libmysql.c, net.c and password.c for more information.

Table Details

To get a description of a a table use the following command. (You'll have to be cd'd to the directory that contains the table you wish to look at. Generally this will be $DATADIR/[dbname], where dbname is the name of the database that contains the table.)

prompt> isamchk -d table_name

ISAM file:     table_name
Data records:               215  Deleted blocks:              0
Recordlength:                85
Record format: Packed

table description:
Key Start Len Index   Type
1   37    14  unique  text packed stripped 
2   1     30  multip. text packed stripped 
3   51    4   multip. long                 
4   31    2   multip. short                
    33    2           short                   
    35    2           short                   

For a more verbose description of the table
try the following options.  (Explanations follow.)

isamchk -d -v table_name 

ISAM file:     Customer
Isam-version:  2
Creation time: 1996-12-17 21:40:04
Data records:                 0  Deleted blocks:              0
Datafile: Parts:              0  Deleted data:                0
Datafilepointer (bytes):      4  Keyfile pointer (bytes):     3
Recordlength:               374
Record format: Packed

table description:
Key Start Len Index   Type                 Root Blocksize Rec/key
1   2     2   unique  short                  -1      1024       1
2   4     80  multip. text packed stripped   -1      1024       1
3   2     2   multip. short                  -1      1024       1

Explanation: (As the example above shows, you won't always see all of these.)

ISAM file Name of the ISAM file.
Isam-version Version of ISAM format. Currently always 2.
Creation time When was the data file created?
Recover time When was the index/data file last reconstructed?
Data records How many records/rows.
Deleted blocks How many deleted blocks still have reserved space? (See the examples in section on isamchk above for information on getting rid of the unused space.)
Datafile: Parts For the dynamic record format this shows how many data blocks there are. For a just optimized table without splits this will be the same as Data records.
Deleted data Total bytes taken up by non reclaimed deleted data.
Datafilepointer How many bytes is the datafile pointer? This is generally 2, 3 or 4 bytes. Most tables manage with 2 bytes but this cannot be controlled from MySQL yet. For fixed tables this is a record address. For dynamic tables this is a byte address.
Keyfile pointer How many bytes is the datafile pointer? This is usually 1, 2 or 3 bytes. Most tables manage with 2 bytes but this is calculated automatically by mysql. This is always a block address.
Max datafile length How long in bytes can the data file (.ISD) for this table get?
Max keyfile length How long in bytes can the key file (.ISM) for this table get?
Recordlength How much space does each record/row take?
Record format What format does each record/row have?
table description A list of all keys in the table. For each key some low level information is presented. (More information in next table.)

Key This keys number.
Start Where in the record/row does this index-part start.
Len How long is this index-part? For packed numbers this should always be the full length of the field. For strings it may be shorter than the full length (not supported from MySQL yet).
Index unique or multip.. Can one value exist multiple times in this index>
Type What data-type does this index part have? This is a C data-type with a optional packed or stripped.
Root Address of the root index block.
Blocksize The size of each index block. By default this is 1024, but it may be changed at compile time.
Rec/key This is a statistical value used by the optimizer. It tells how many records there are per value for this key. A unique key always has a value of 1. This may be updated after a table is loaded and somewhat stable by using isamchk -a. If you do not do an update, a default value of 30 is used.

The MySQL FAQ has more extensive information on this topic.

About Keys

The following gives some low level information about how keys are stored and used in MySQL.

The current block size for B-tree is 1024 and all blocks (except the root) are at least 2/3 full. For a typical id (= 4 bytes) this means 1024/(4+4)*2/3 = 85 keys per record and thus one can find the position of the real record in 5 seeks.

If you wish your B-tree indexes to be 100% full run isamchk with the -rq options over your table files.

Only the first key part can be prefix compressed. If two keys have the same prefix the other key part will take 1 byte. All string parts longer than 4 chars have end-spaces removed.

If the key are fixed size then nisam uses a binary search in each block, otherwise it uses a sequential search.

All key blocks are read/write through the key_buffer (default 1M) so the root block is generally always buffered.

Read next is optimized so the library doesn't have to consult any other block when reading from a leaf block. This gives very good performance when you have to read a group of records with the same key.

MySQL Future Plans

The following are a list of goals posted my Monty for the future of MySQL. I've paraphrased and added my observations.

In version 3.21.0 all field types will be written as classes and there will be a new WHERE optimizer.
In version 3.21.0 there will be a method to lock tables.
In version 3.22.0 a simple update language will be added. This will include support for loops and subroutines and will make atomic updates possible.
The addition of a built-in update language in 3.22.0 will make it possible to define stored procedures.
Features that would involve a significant performance hit will be implemented in such a way that they can be excluded from the server during the build phase. This will make it possible for people to pick and choose what they need.

The inclusion of table locking and stored procedures should go a long way towards giving people the functionality they need without making a serious impact on the performance of MySQL.

Features such as views, rollback and triggers will be optional additions at some point. These features incur a lot of overhead.

Inner and outer joins will hopefully make it into the 3.21.0 release.

Linux Threads

When MySQL is running on a Linux system and using Linux threads you will see a minimum of three processes running. These are in fact threads. There will be one thread for the Linux Threads manager, one thread to handle connections, and one thread to handle alarms and signals.