Procedure for MySQL

Picasso Database Query Optimizer Visualizer

©Indian Institute of Science, Bangalore, India

(The following setup procedure is known to work with MySQL 5.1, 5.4.1 and 5.5.9 on Windows with InnoDB as storage engine.)

Install MySQL: Run setup.exe and follow the wizard.
Create the schema. If the schema to be created is in a file, run the following command on the command prompt:

mysql --host=localhost -u UserName -p Password --database=DbName < “filename”

For example,

mysql --host=localhost -u root -p pass123 --database=TPCH < “c:\\tpch\\dss.ddl”

Note: escape sequence “\\” required on windows while giving file path.

Load the data into the database tables. If you are loading data from a file, run the following command after opening MySQL Command Prompt as mentioned in above step with --host –u and --database options:

load data local
INFILE '<filename>'
INTO TABLE '<tablename>'
FIELDS TERMINATED BY '<delimiter>'

For example, if the data is pipe-separated:

load data local
INFILE 'customer.tbl'
INTO TABLE CUSTOMER
FIELDS TERMINATED BY '|'

Create the integrity constraints (primary key, foreign key and indexes) on the data. If you are using a file for this, run the following command in the command prompt:

mysql --host=localhost -u UserName -p Password --port=portno --database=DbName < “filename”

For example,

mysql --host=localhost -u root -p pass123 --port=3308 --database=TPCH < “c:\\tpch\\dss.ri”

Whenever the data is changed, update statistics for tables that may participate in the query templates, using the following command in the MySQL Command Line Client:

ANALYZE TABLE table_name;

You can optionally set the following parameters to fine-tune the optimization environment (the diagrams on the Picasso website were generated using the default values for all these parameters):

a) optimizer_search_depth

b) optimizer_prune_level

For detailed explanation of optimization parameters, see

http://dev.mysql.com/doc/refman/5.5/en/controlling-query-plan-evaluation.html

Notes:

1. PSP should be on an indexed attribute.

2. Query Template should not involve any form of sub-query. Valid query templates include SPJ queries which may contain aggregation functions, GROUP BY, ORDER BY and HAVING clause.

3. For compiled plan tree diagram, cost for intermediate nodes is not available. Only final cost is available.

4. As the histograms are not natively supported by the database, we have used the equi-depth and value based techniques to construct them. The type of histogram depends on the number of distinct values. If the distinct values are fewer than number of buckets (default 75), a value based histogram is constructed otherwise an equi-depth histogram is constructed.

5. The diagrams on the Picasso website have been generated with index on the varying attribute and InnoDB as the storage engine.

Documentation Home