CS 505, Intermediate Topics in Database Systems

Instructor: Dr. Victor W. Marek
Time: 12:00-12:50, MWF
Place: 243 MMRB (Mine and Mineral Research Building, Rose Street)
Office: 303 Davis Marksbury Bldg.
Office hours: 10:30-11:30, MWF.
Office phone number: (859) 257-3496.
email: marek@cs.uky.edu.
Midterm: TBA
Final examination (comprehensive): date/hour: TBA. The exam will take place in the classroom.

The general idea of CS505 is that it prepares Computer Science students for a future profession as Database Administrators. In particular, it provides them with the basics needed to understand not only the simplest aspects of databases, but also the additional, "intermediate" topics that they will need in their future careers as database professionals. Everything we teach in this class is subject to this general consideration. That said, one needs to realize that databases evolve fast, and that this evolution is driven by applications, which are almost always based on some database support. The topics we cover in this course deal with a wide panorama of issues facing a modern database administrator.

While there are numerous database books, I do not like what I see. For that reason I will distribute my own slides to the students (they are numerous, maybe 400 or more). Innovation in the database area has produced many new topics that were not present, say, 10 years ago. We will attempt to discuss these (and various legacy topics) and, as much as possible, prepare students for future database work. Certainly the older SQL-based databases are not disappearing, and likely will not disappear in the foreseeable future. But new applications require the use and understanding of other classes of databases, and we will discuss these in our lectures as well.

Course Prerequisites: We assume that the student is familiar with introductory database topics such as SQL and database design. Likewise, we assume that participants have taken the Data Structures and Algorithm course (CS315).

Course format: In keeping with current University practice, and because the course is open to undergraduate students, the lecture will be divided into two parts. Each part will have a separate grade, and the final score will be the arithmetic mean of the two scores. According to University rules, the midterm grade will be recorded in the student's record along with the final grade (for undergraduate students).

After each topic is completed there will be a 20-25 minute quiz covering the material presented during the corresponding lectures. The results of that quiz contribute directly to the grade. The two parts of the lecture are subdivided further, as listed below.

  1. Part I
    1. Modern databases (what is going on in databases)
    2. Security in Databases
    3. Transaction Management and Crash Recovery
  2. Part II
    1. Big Data and important examples of their applications
    2. Database Architectures, cloud databases, parallel and distributed databases and graph databases.
    3. Additional topics, depending on the progression of the course. Those topics could include OLAP, Analytics, etc.

Part I will take approx. 8-9 weeks, with topics (1) and (2) taking 5 weeks and topic (3) taking 4 weeks. In Part II, the first topic will take approx. 3 weeks, and the second and third topics 5 weeks.
These are only approximate time spans. The reason is that every audience is different, and the amount of material I will be able to cover depends on the cooperation of the students.

As mentioned above, I will provide a large number of slides (in PDF format) covering the entire material.

Expected outcomes of the course: I expect that by the end of the course the student will understand several fundamental issues related to data processing in database systems. These include:

  1. Understanding the role of databases in modern society
  2. Understanding the security issues in DBMS, including SQL support for Discretionary Access Control and the issues related to Mandatory Access Control and other security models for databases
  3. Understanding the fundamentals of transaction management and concurrency control including theoretical foundations of concurrency control, and crash recovery
  4. Understanding issues related to information extraction from data stored in databases, that is, so-called Big Data, with several examples of applications
  5. Understanding current architectures of databases, cloud databases, graph databases and basics of parallel and distributed databases

Credit for this course:
The credit for the first part of the course will be determined as follows:

The credit for the second part of the course will be determined as follows:

The final grade will be the average of the results of Part I and Part II.

Grading Scale:

However, individuals who earn at least 94% in the first half of the course will have the option of not taking the final exam. This incentive is designed to encourage students to invest a significant amount of effort in our course, in particular in Part I.

Relationship with the first Database Course: We reiterate that this course assumes familiarity with SQL and the other basics of DBMS as taught in the course CS405. CS505 is not a course where we teach SQL, basic issues in the design of databases, JDBC, etc. (in fact, these days we expect the student to know how to connect to a database from a programming language such as Java, Python, PHP, or Perl). The goal of CS505 is to provide the student with an understanding of "behind-the-scenes" DBMS issues, as well as modern applications of databases.

While students could have taken their first DB course anywhere, I will assume that they are familiar with the material covered in our program's CS405 course. During my lectures students should expect references to the entire material covered in CS405, not only SQL proper.

Database Management Systems: Following the practice common these days, I expect students to have a database management system (or several) installed on their own machine. There are several such systems freely available on the Internet. Free SQL systems include Oracle, MS SQL Server, MySQL, and others. In the unlikely case that a student does not have a personal computer with a DBMS installed, the Department offers access to MySQL on Departmental machines. Just in case, accounts on the Multilab machines and on the MariaDB available there will be created for all students of CS505. I expect to discuss a graph database system, OrientDB or Neo4J, and possibly use it for the second project. These systems are, again, freely available for all major operating systems. I expect this system to be available on Multilab as well.

Projects: A significant part of the credit (30%) is assigned to projects. One consequence is that it is not possible to get an "A" in this course without decent performance on the projects.
Each project will be an implementation of some specific topic, such as an unusual access control mechanism for the tables of a database, a simple transaction scheduling mechanism, or the use of a graph database.
Project descriptions will be available through the First project page and the Second project page (but at the time of publication of this syllabus the specific topics of the projects are not yet decided).

Communication with the instructor: I am available during my office hours and by appointment. Additionally, there will be a class mailing list. Students can also communicate with me by e-mail at marek@cs.uky.edu. Please do not expect an instantaneous answer, especially during weekends. Normally, I read my mail both during my working hours and during evening hours.

Attendance: I check attendance in class, in recognition that class attendance and personal contact with the instructor are what distinguish this course from distance learning. Absences from class will be noted, and after three unexplained absences the DUS/DGS will be notified.

Obligatory Office Hours: I utilize a mechanism called "OOH", or "obligatory office hours." When I find (through a quiz result below 60% of the available credit, a midterm exam result, or a project result) that the performance of a student falls below expectations, I call that person to my office hours (or to a special appointment if the student has classes during my office hours). Attendance is obligatory, and students who are called but do not attend will be referred to the DUS/DGS. The reason is that I believe issues caught early can be resolved before they become a major impediment. I also believe that the OOH practice adds value to the course through personalized instruction.

Getting acquainted with the instructor:

I intend to meet every student in the class during the first two weeks of the semester. For this purpose I will bring a sign-up list to the second class (i.e. Friday, January 12) and expect each student to sign up for one of the available slots.

Yet another strange thing: I believe in "active teaching". This means not only the audience addressing questions to the instructor, but also the instructor asking attendees questions to check their understanding. In other words, my classes are a form of dialogue between the class and myself. Again, while some students find such a teaching style intrusive, my experience is that students learn more.

Electronic equipment in class: I expect students to turn off their cell phones while in class. Computers can be used for taking notes, but using them for any other purpose (see Distracting behavior below) will result in a referral to the DUS/DGS.

Final Remarks

  1. Computer science evolves fast, and often new topics that are not covered in the slides, or are covered insufficiently, need to be discussed during the coursework. Consequently, I always extend the material that is in the handbook (if one is used at all). The same will happen during this course. I make my slides available to students. These slides are for the use of students registered in CS505 only. While I publish the slides and allow them to be copied for individual use, I retain the intellectual property rights to them. Further copying, or using them for tutoring, as material for teaching other individuals, or as a reference during future employment, is not permitted unless separate arrangements are made.
  2. Cheating and Plagiarism will be pursued if they occur. See Student's Rights and Responsibilities for applicable penalties.
  3. Group work on quizzes and exams is not acceptable.
  4. Projects can be done by two-person teams or individually. Individuals in a team will get the same grade, hence membership in a two-person team involves trust in the partner. Teams may change composition after the first project is completed.
  5. Students with appointments always have priority over those without appointments, even during office hours. I encourage questions by email.
  6. Distracting behavior. I discourage eating and other types of distracting behavior during my classes (but drinking plain water, tea, coffee, etc. is, in my view, an acceptable activity). Reading newspapers, cruising the web, reading email, solving problems for other lectures, etc. should not take place in the classroom.

Set up: 12/02/17

Last modified: 01/09/2018