Picasso
Database Query Optimizer Visualizer
©Indian
Institute of Science,
DB2
(The following setup procedure is known to work with IBM DB2 v8, v9 on Windows.)
1.
Install DB2 with complete support for Java
a.
Run setup.exe and follow the wizard.
b.
In the Configure DB2 Instances screen, you can
customize the Protocol settings (including port number and service name) and
Startup settings of the instance.
2.
If the DB2 service is not already running, start it
with the command db2start.
3.
Open a command window using Start → Programs → IBM DB2 → Command
Line Tools → Command Window
4.
Create a database by giving the command: db2 create database <dbname>.
Connect to it using the command: db2 connect to <dbname>
5.
Create the schema.
If the schema to be created is in a file, run the command: db2 –tvf
<filename>
6. Load
the data into the database tables. If
you are loading data from a file, use this command:
db2 import from <filename> of del
commitcount <value> insert into <tablename>
For example,
db2 import from nation.tbl of del commitcount 100 insert
into NATION
7. Create
the integrity constraints (primary key, foreign key and indexes) on the
data. If statements are in a file, run
the command: db2 -tvf <filename>
8. Create
the explain plan tables. These are the tables in which query execution
plans are stored by the optimizer. To create these tables, after connecting to
the database, run the following command in the location \IBM\SQLLIB\MISC:
db2
-tvf EXPLAIN.DDL
9. Create
statistical summaries for all table
columns that may be used as Picasso predicates in the query templates, with the
command:
db2
runstats on table <fully qualified table name> on columns (column_list)
with distribution
For example,
if statistics have to be collected on the columns N_NAME and N_COMMENT of
the table NATION
belonging to schema ‘admin’, the command is:
db2 runstats on table admin.NATION on columns (N_NAME, N_COMMENT)
with distribution
To improve
the scope for optimization, you could optionally create statistics for all columns of each table. This can be done using the command:
db2
runstats on table <fully qualified table name> with distribution on all
columns
where
with distribution specifies
that distribution statistics are requested, and
all columns updates statistics on all the
columns of the table.
Note:
1. The diagrams on the Picasso website have been generated with
statistical summaries created on all
table columns.
2. With recent versions of DB2 (e.g. v9.5), most of the above-mentioned
operations can be conveniently performed through the DB2 Control Center GUI.
3. The explain plan tables and the
data tables should be created under the same schema name.
10.
Optionally, run the following commands to enhance the scope for
optimization:
db2set
db2_extended_optimization=on
db2
update database cfg for tpch using sortheap <memory allocation>