(E0 261) Database Management Systems

BACKGROUND ASSIGNMENT

Total Questions: 15   Total Marks: 100

  1. A stipend accounting application, a mailing label application, and a hostel bill statistics application are running on the IISc student database management system. The physical layout of student records is








    SRNoNameAddressGPAStipendHostelNameMessBill







    To improve operational efficiency, the structure of student records is changed such that all numeric fields appear in the beginning and all character fields appear in the end. Explain how each of the above applications would have to be altered to operate correctly under the new format.

    (5 marks)

  2. Alarmed by the rapid growth in private TV channels, the Government has asked the Information and Broadcasting Ministry to keep a close watch on the activities of this industry. The I&B minister requests you to develop an appropriate database and provides the following description:

    Each private channel has a name, a license number, and an assigned broadcasting frequency. Foreign-owned TV channels are required to have a tie-up with a local channel. A channel is allowed to operate from only a single state in the country but may broadcast its programs to other states by using repeaters (these repeaters are shared by all the channels) set up by the government. Repeater stations may be linked by either satellite or by microwave or by fibre-optic cables. Each of these technologies has an associated signal amplification factor and a signal distortion factor. To project long term growth, it is necessary to keep track of the time delays between repeater stations which is a function of the distance between the stations and the technology used to transmit between these repeater stations. Each TV channel has a president and several reporters and announcers, All channel staff are required by law to belong to the Broadcast Workers Union, which assigns them a membership number. The Union keeps track of member names and addresses, their current salaries and designations.

    TV channels are required to broadcast programs in units of standardized one hour time slots (that begin on the hour). Programs are classified into five types: comedy, mystery, educational, adventure, and romance. Channels are required to keep information about the producers of the programs, the cost of production and the royalty fees. The advertising rates charged by a channel for a specific time slot is based on the time slot, the program aired during this slot, and the day of broadcast. The Advertising Council of India, an independent industry watchdog, monitors these rates to ensure fair prices. Government regulations do not allow program names to be repeated across channels. The number of programs that are currently being broadcast by each channel is monitored by the I&B Ministry.

    Programs are typically composed of several episodes with the episodes numbered consecutively (starting from one). Each episode has an episode director and a list of the actors who feature in that episode is maintained. The audience viewership rating for each episode is received from MARG, a company specializing in market surveys. On the day the episode is to be aired, the channel assigns a staff member as the supervisor for ensuring the proper broadcast of the episode.

    Draw an efficient E-R model for the above domain description. Clearly specify keys, roles, and structural constraints. Note any unspecified information, and list the assumptions you make to complete the specification.

    (10 marks)

  3. In what ways do the relations represented by the following tables differ?







    SRNo Name GPA    Name SRNo GPA






    237091Sridevi 7.1 Aamir 3372926.2
    337292Aamir 6.2 Madhuri3371927.9
    337192Madhuri7.9 Sridevi 2370917.1
    237091Sridevi 7.1






    (5 marks)

  4. The Employment Bureau of the Labour Ministry maintains a database with the following schema:

    COMPANY (cname, type)     PROJECT (pname, cname, pstate)
    WORKER (EBno, wname, wstate)     EXPERIENCE (EBno, cname, years)
    INTERVIEW (cname, EBno, date)    HIRE (cname, pname, EBno, salary)

    The HIRE relation contains information about the current employees hired by companies, whereas the EXPERIENCE relation stores information about the past employment of employees. Is it possible to write an SQL expression to answer the following query: Identify, in lexicographically sorted order, the names of private “fully–centralized” (all projects in a single state) companies and also private “fully-distributed” (maximum of one project per state) companies that, for their projects, hire only local workers who have been previously employed in exactly three companies ?

    If yes, give the expression. If no, explain why.

    (10 marks)

  5. For a relation scheme R = { ACDGLPRSTX }, the following functional dependencies hold:

    S ALX C P DRT C DPT R DST R
    CS G R T L A GR DS

    Your manager claims that the following set of relations is a lossless-join, BCNF, dependency-preserving decomposition of R:

    CP RT AL CDR LSX DGRS

    Do you agree? Explain.

    (10 marks)

  6. Your boss decomposes relation M using a set of functional dependencies, F, and decomposes relation N using another set of functional dependencies, G. One decomposition is BCNF, the other is 3NF. However, your boss forgets which is which and requests your help to identify the decompositions. Could you use just one of the following tests to decide? (Assume that the closures of F and G are available).

    a) Dependency-preservation b) Lossless-join c) BCNF definition d) 3NF definition

    If yes, identify the test and how you would use it to decide. If no, explain why and identify the minimal set of tests you need to make the decision.

    (5 marks)

  7. A B+-tree is constructed from the following set of insert and delete operations (I = Insert, D = Delete):

    I 2, I 3, I 5, I 7, I 11, I 17, I 19, I 23, I 29, I 31, I 9, I 10, I 8, D 23, D 19, D 29, D 11

    Show the final tree after these operations. Assume you start with an empty database, and the maximum number of pointers in a node is 4. Also, assume that the B-tree is constructed with left pointers corresponding to < and right pointers corresponding to .

    (5 marks)

  8. For the same set of operations as above, show the resultant extendible hash index structure. Assume you start with an empty database, the hash function is h(x) = x mod 8, and buckets can hold upto 3 records. Also assume that the MSB bits in the hash value are used for identifying buckets.

    (5 marks)

  9. Let relations R(A,B,C) and S(C,D,E) have the following properties: R has 20000 tuples, S has 45000 tuples, 25 tuples of R fit into one block, 30 tuples ofS fit into one block.

    Estimate the number of block accesses required, using each of the following join strategies for equi-joining R with S on attribute C, which is uniformly distributed over a large domain. (For the first four strategies, assume that 3 buffers are available from the system).

    1. Simple Iteration (assume unclustered data)
    2. Block Iteration
    3. Sort-Merge Join (include the cost of sorting the relations)
    4. Index Join (assume B+tree index on join attribute C exists only for relation R and that upto twenty pointers fit in an index page )
    5. Hash Join (assume that a pair of hash partitions can simultaneously fit in memory)

    (10 marks)

  10. Consider the SQL query on the student database
          select T.cname  
          from Course T, Course S  
          where T.credits > S.credits and S.dept = "CSA"

    Write an efficient (basic) relational algebra expression that is equivalent to this query. Justify your choice.

    (5 marks)

  11. The left outer join, denoted by the symbol LOJ, for relations R and S is defined to be the union of (a) the natural join of R and S, with (b) the set of tuples from R that do not join with any tuples in S, each concatenated with a “null” S tuple (i.e. an S tuple composed of all null fields).
    1. Is there a relational algebra equivalent for LOJ ?
    2. Is LOJ associative ?
    3. Is LOJ commutative ?
    4. What is the minimum cardinality of R LOJ S ?
    5. Ordinary joins can be processed using nested loops, sort-merge, and hash-join algorithms. Which of these algorithms can be used to compute LOJ ?

    (10 marks)

  12. You have implemented a log-based recovery manager for your company’s database system. However, your boss recommends that you remove it since she has been given a guarantee by the vendor that the computer hardware and software will never fail – that is, the database system is permanently operational. Assuming that the guarantee is valid, how would you respond to her request? Explain. Would your answer change based on whether the dbms implemented immediate updates or deferred updates?

    (5 marks)

  13. Your classmate implements the following log-based recovery manager: Log records of data updates are always “flushed” (i.e. synchronously written to the log disk) before the corresponding updates are written to disk. At transaction commit time, any data updates that are still memory-resident are written to disk. The transaction commit record is then written to the log disk.

    Does your classmate’s system provide recoverability? If yes, explain the undo and redo operations that are made during recovery. If no, explain why.

    (5 marks)

  14. The local branch manager of State Bank of India informs you that all transactions at his branch are composed of the following four basic operations:



    OperationMeaning


    R(a,v) Returns in v the current balance in account number a
    S(a,v) Sets the balance in account number a to be value v
    D(a,v) Deposits amount v to balance in account number a
    W(a,v) Withdraws amount v from balance in account number a


    Given that a separate lock is available for each operation, draw a lock compatibility matrix. Justify your matrix entries.

    (5 marks)

  15. Transactions T1, T2 and T3 arrive in that order to a dbms and are executed concurrently according to the (partial) schedule shown below. Based on this schedule, determine which concurrency control algorithm, if any, from among 2PL, Timestamp Ordering and Validation Technique, produced it.

    (Ri(x) means that transaction i reads data item x and Wi(x) means that transaction i writes data item x.)

    Schedule: R2(d) R1(a) W1(a) R3(b) R3(c) R2(a) W2(a) R2(c) W3(c) R1(d) R3(d) W1(d)

    (5 marks)