CSC 448: Bioinformatics Algorithms FAQ


CSC 448: Bioinformatics Algorithms is a Computer Science technical elective offered annually. The course instructor is Alex Dekhtyar. In the past, students who took this course without realizing what it involves have struggled. This FAQ is for people considering enrollment in CSC 448. Please read it carefully. If you have any outstanding questions afterwards, feel free to contact dekhtyar at calpoly.edu.


Questions

  1. What are the prerequisites for CSC 448?
  2. What will the course cover?
  3. Do we need to know biology?
  4. What's the deal with BIO 441?
  5. How will CSC 448 and BIO 441 work together?
  6. What are these joint labs and how will they work?
  7. How will the labs be graded?
  8. What else is going to go towards the final grade in the course?
  9. Is the class hard? What do students who took the class say?
  10. Why should I take the class?
  11. Why shouldn't I take the class?


Question 1: What are the prerequisites for CSC 448?

Answer. The current (Fall 2014) listing for the course specifies CSC 103 as the prerequisite. This is incorrect. The current official prerequisite is a holdover from an old version of the course taught at the turn of the century. The new version of the course, taught in Spring 2012 and Spring 2013 and scheduled for Fall 2014, has a more serious prerequisite: CSC 349: Introduction to Algorithms.

The prerequisite change will become official in the 2015-16 academic year, when the new catalog comes out. However, it is the true prerequisite for the upcoming Fall 2014 offering.

CSC 448 is essentially an advanced algorithm design and development course. We will be covering a variety of algorithms (primarily operating on strings) with important, immediate implications for DNA sequence analysis and similar problems. Base knowledge of algorithms, and specific knowledge of categories of algorithmic techniques, such as greedy algorithms, dynamic programming, and a variety of tree- and graph-based algorithms, are very important for CSC 448.

In past offerings of the course, students who did not take CSC 349 struggled in the class, and their difficulties affected not only their personal performance but also the performance of their team. In CSC 448, team-based labs are extremely important (see answers to questions below), as they affect not only students from CSC 448 but also students from our sister course, BIO 441: Bioinformatics Applications.

Because of this, the CSC 349 prerequisite will be enforced. As an exception, for the Fall 2014 offering, I will allow students to take CSC 349 concurrently with CSC 448.


Question 2: What will the course cover?

Answer. The course covers a range of algorithms and computational tasks associated with analysis of biological data. While bioinformatics and computational biology are large fields with a wide range of problems that can be studied, CSC 448 concentrates on the basic problems associated with DNA analysis. While the coverage of specific topics may differ from quarter to quarter, the core algorithms studied in the course will remain the same. You can find the topics covered in the past versions of the course on the Spring 2013 syllabus.


Question 3: Do we need to know biology?

Answer. Knowing something never hurts, so if you have taken Biology in college, this may be useful. Having said that, the course itself does not require any specific knowledge of biology beyond what is studied in high school. We will primarily concentrate on studying DNA and solving computational problems associated with DNA analysis. There is a certain amount of knowledge about DNA, amino acids and biological processes that we need for this course.

Fortunately, all this will be presented to you at the beginning of the class. Dr. Anya Goodman, the instructor of the BIO 441: Bioinformatics Applications course, which will take place in parallel with CSC 448, will guest lecture in our class and will discuss the topics that are necessary to successfully complete the course.


Question 4: What's the deal with BIO 441?

Answer. CSC 448 is an advanced algorithms (and applications) course targeted at computational science majors. BIO 441: Bioinformatics Applications (also cross-listed as CHEM 441) is a bioinformatics course targeted at life sciences majors. The course introduces students to computational biology problems and teaches them how to use bioinformatics software to solve those problems. BIO 441 is not a software development course: students taking BIO 441 are expected to have significant expertise in biology, and are not expected to have any expertise in programming. The course is taught by Dr. Anya Goodman from the Department of Chemistry and Biochemistry.

There is a natural synergy between CSC 448 and BIO 441 that we have taken advantage of in the past and will continue to pursue in the upcoming offerings of both courses.


Question 5: How will CSC 448 and BIO 441 work together?

Answer. The two classes are scheduled to take place at exactly the same time. Our lab periods coincide and are scheduled in adjacent labs on the third floor of building 14. We will use the temporal and spatial proximity of our courses in the following ways:

  1. Lecture exchange. As mentioned earlier, Dr. Goodman will present guest lectures on topics related to Biology to our class. In my capacity as the CSC 448 instructor, I will visit BIO 441 students and will discuss the process of software development with them.

  2. Joint lab assignments. The most important aspect of the collaboration between the two classes is the joint labs. Throughout the quarter, the students from BIO 441 and CSC 448 will be working together in teams of 5-6 students (2 to 3 CSC 448 students + 2 to 3 BIO 441 students) on joint lab assignments. The assignments involve developing software that implements (versions of) the algorithms discussed in class for use by the BIO 441 students in their coursework. See more information below in the answer to question 6.


Question 6: What are these joint labs and how will they work?

Answer. BIO 441 students have a quarter-long research project on genome study and annotation. As part of this project, the students need to solve a variety of computational tasks (that roughly correspond to the topics and algorithms we study in CSC 448). This gives rise to the joint labs: BIO 441 students will work with you on developing software for their needs.

We will form joint teams early in the quarter. Each lab assignment is a request to deliver a program (or a small software system) that performs a specific task with the genome data that BIO 441 students need to analyze. Together with the BIO 441 students, you will develop the requirements document for the desired program/system. The CSC 448 part of the team will then design and implement the software and, jointly with the BIO 441 students, test it. Through the rest of the quarter (the lifetime of the software), the CS students may wind up handling maintenance requests on all the produced and delivered software.

The expectation is that throughout the development cycle for each of the lab assignments, each part of the team will play the role of experts in their respective field. You will rely on the expertise of BIO 441 students to present to you the proper requirements and to interpret their needs for you. BIO 441 students will rely on your software development expertise to get the actual job done.


Question 7: How will the labs be graded?

Answer. For CSC 448 students, there is only one set of deliverables for each lab: the software that BIO 441 students will run to complete their tasks. To a large degree, this means that success in CSC 448 labs depends both on how good your BIO 441 partners are and on how well you have implemented their requirements.

This dependence on another course is somewhat unusual for Computer Science coursework. However, it mimics the actual job environment pretty well: how well you have done your job of developing the software depends on how well your customers are able to use it.


Question 8: What else is going to go towards the final grade in the course?

Answer. The course will also have a paper-and-pencil midterm exam and a paper-and-pencil final exam. Both exams are not unlike the CSC 349 exams: your knowledge of specific algorithms studied in the course will be tested.


Question 9: Is the class hard? What do students who took the class say?

Answer. Yes, this is not an easy class, even if you thought CSC 349 was easy. The class combines three aspects: learning advanced algorithms, employing software engineering principles for project development, and doing so while working side-by-side with experts from a completely different field. This is not an easy feat to accomplish in the confines of 10 weeks.

Students who have taken this course routinely comment on the amount of work they have to do. Each offering of the course attempts to fix the most glaring issues with the coursework, and you should expect some of the issues that plagued previous offerings (unclear deliverable specs, unclear processes, too many assignments in parallel) to be rectified. This, however, does not mean that other issues won't show up - this course is and will always be a work in progress.

At the same time, students who wound up doing well in the course have understood its key objectives and valued the uniqueness of the experience they received in the class, as hard as getting this experience sometimes was.

The success of students in CSC 448 depends on their ability to be experts in their own field (CS, software development), as well as on the ability of their BIO 441 partners to be experts in the field of biology. To increase overall chances of success, we are applying more stringent filters for getting students into both classes. BIO 441 students are all expected to be from disciplines with a significant life sciences component (Biology, Biochemistry, Animal Science and similar). For CSC 448 students, we are enforcing the CSC 349 prerequisite. Knowledge of software engineering (CSC 307/308) is a big plus too, although not a requirement, as we cover most of what we need in class and in the lab.


Question 10: Why should I take the class?

Answer. If you qualify, you should take the class because it will give you a number of experiences that are, thus far, unique to the Computer Science (and Software Engineering) curriculum.


Question 11: Why should I NOT take the class?

Answer. First and foremost, you should not take the class if you do not meet the prerequisites. Even if you do meet the prerequisites though, there are still valid reasons for you not to take the class. Don't take the class if:


May 20, 2014, Alex Dekhtyar, dekhtyar(at)calpoly.edu