COMPUTER VISION
[ Attendance | Grades ]
1. Course Description:
The goal of computer vision is to make computers work like human visual perception, namely, to understand and recognize the world through visual information such as images or videos. Human visual perception, after millions of years of evolution, is extremely good at understanding and recognizing objects and scenes. To give computers abilities similar to human visual perception (or beyond), computer scientists have been developing algorithms that rely on various kinds of visual information, and this course is about these algorithms. In case you are wondering why we should care about computer vision, consider this: if you think your visual perception system is important and beneficial, so is computer vision.
The potential practical benefits of computer vision systems are immense. It is anticipated that computer vision systems will soon become commonplace and that their technology will be applied to a broad range of products, such as Google search on images and videos, object/face recognition and tracking, human pose identification, 3D reconstruction from images, image/video enhancement, robot vision, medical imaging, computer graphics, computer gaming, surveillance, remote sensing, user interaction for mobile devices, and intelligent vehicle systems (a car that can drive itself).
To enrol in the course, programming skills in C/C++ are required. This course emphasizes the practical side of computer vision, meaning more projects, which are more fun.
Lecturer: Robby T. Tan
E-mail: tanrobby (at) gmail.com
Official website: computer vision class
Samples from the previous year's results: [ Result 1 | Result 2 | Result 3 ]

2. References:
Main Reference (mandatory): Computer Vision by R. Szeliski: free download.
Literature (optional):
- Online and free books related to computer vision [website]
- G. Bradski, A. Kaehler, "Learning OpenCV: Computer Vision with OpenCV Library", O'Reilly Media 2008, ISBN: 0596516130.
- D. Forsyth and J. Ponce, "Computer Vision: A Modern Approach", Prentice Hall 2003, ISBN: 0-13-191193-7. [old draft in pdf].
- R. Hartley and A. Zisserman, "Multiple View Geometry in Computer Vision", Cambridge 2003, ISBN: 0-521-54051-8.
- O. Faugeras, "Three-Dimensional Computer Vision: A Geometric Viewpoint", MIT Press, ISBN: 0-262-06158-9.
Programming Resources (computer vision libraries and functions):
- OpenCV (the most popular computer vision library in C/C++, download). This is the standard library for this course.
- OpenCV Reference Manual [pdf]
Also, you can search Google Scholar for academic papers by entering the title, the author's name, or the topic.
3. Format:
Assignments
During the course, students are expected to complete a few assignments.
Programming skills in C/C++ are required to complete the assignments.
Written Exam
There is a written exam in the course.
Class Attendance
Attendance is mandatory. When you cannot attend the class, you have to notify the lecturer by e-mail (along with the reasons). Failing to do so incurs a penalty in the attendance score.
Academic Honesty
Academic honesty is compulsory in completing the assignments, projects, and the exam. Exchanging code between groups is not allowed. Using code from the previous year or from the internet is prohibited, unless stated otherwise in the lectures. Copying text of the reports from other groups is strictly prohibited. Generally, cheating, academic misconduct, plagiarism, and fabrication of any submitted material (including code and text) are not tolerated. We will use software to detect any code or text plagiarism. Any violation of academic honesty will result in failing the course.
4. Grading:
The final grade is the weighted average of the following assessments:
- 40%: Assignments
- 50%: Exam
- 10%: Attendance
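The weighting above can be sketched as a small C++ function. This is only an illustrative sketch: the function name `finalGrade` and the assumption that all three component scores are on a 0-10 scale are not stated on this page.

```cpp
#include <cassert>

// Weights taken from the grading list above.
constexpr double kAssignmentWeight = 0.40;
constexpr double kExamWeight       = 0.50;
constexpr double kAttendanceWeight = 0.10;

// Weighted final grade (hypothetical helper).
// Assumption: each component score is on a 0-10 scale.
double finalGrade(double assignments, double exam, double attendance) {
    assert(assignments >= 0.0 && assignments <= 10.0);
    assert(exam >= 0.0 && exam <= 10.0);
    assert(attendance >= 0.0 && attendance <= 10.0);
    return kAssignmentWeight * assignments
         + kExamWeight * exam
         + kAttendanceWeight * attendance;
}
```

For example, `finalGrade(8.0, 7.0, 10.0)` evaluates to approximately 7.7 (3.2 + 3.5 + 1.0).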
Retake exam:
To participate in the retake of the exam, the original grade must be no less than 4. Taking the retake exam should be discussed with the lecturer, particularly when the original score is 6 or above.
5. Period and Time:
Term 2: from November 2011 until February 2012.
Upon Graduation
The master's program Game and Media Technology offers opportunities to work in the field of computer vision and its related areas. If you are interested, please contact Robby T. Tan.
6. Schedule
Notes:
- The schedule is arranged as a rough guideline. The lectures will emphasize clarity over strictly following the schedule.
- Acknowledgements: major parts of the slides and materials used in the lectures are taken from various internet sources. The lecturer thanks the people who made the materials available.
| # | Date | Time | Topic | Materials |
|---|---|---|---|---|
| 1 | 16-11-2011 | 17.15-19.00 | Logistics + Introduction (motivations, applications, and overview) | Slide 1, Exercise 1, Assignment 1 |
| 2 | 18-11-2011 | 09.00-10.45 | Image Formation: Geometry and Radiometry | Slide 2, Exercise 2, Assignment 2 |
| 3 | 23-11-2011 | 17.15-19.00 | Silhouette-based Volume Reconstruction | Slide 3, Exercise 3, Assignment 3 |
| | 25-11-2011 | | No class | |
| 4 | 30-11-2011 | 17.15-19.00 | Silhouette-based Volume Reconstruction: Part 2 | |
| 5 | 02-12-2011 | 09.00-10.45 | Volume-based Visual Tracking from Multiple Views | Slide 4, Exercise 4, Assignment 4 |
| 6 | 07-12-2011 | 17.15-19.00 | Probability Theory and Bayesian Inferences | |
| 7 | 09-12-2011 | 09.00-10.45 | MRF + Graphcuts | Slide 5, Exercise 5 |
| 8 | 14-12-2011 | 17.15-19.00 | Object Classification and Recognition: Part 1 | Slide 6, Exercise 6, Assignment 5 |
| 9 | 16-12-2011 | 09.00-10.45 | Object Classification and Recognition: Part 2 | |
| | 21-12-2011 | | No class | |
| | 23-12-2011 | | No class | |
| 10 | 11-01-2012 | 17.15-19.00 | Image Features and Matching: SIFT | Slide 7, Exercise 7 |
| 11 | 13-01-2012 | 09.00-10.45 | Image Editing: Image Inpainting | Slide 8, Exercise 8 |
| 12 | 18-01-2012 | 17.15-19.00 | Image Editing: Gradient-based Image Editing | Slide 9, Exercise 9 |
| 13 | 20-01-2012 | 09.00-10.45 | Motion: Stochastic Visual Tracking: Particle Filtering | Slide 10, Exercise 10 |
| 14 | 25-01-2012 | 17.15-19.00 | Physics-based Approaches: An Introduction | Slide 11, Exercise 11 |
| 15 | 27-01-2012 | 09.00-10.45 | Fog Removal, Demonstrations and Discussion (on assignment 5 and final exam) | |
| | 01-02-2012 | 17.00-20.00 | Final Exam | |
Old News:
- [09/03/2012]: The maximum score of assignment 5 is 15. Therefore, in calculating your assignment score, you should divide the total score by 5.5. The overall weighting is 10% attendance, 40% assignments, and 50% final exam.
- [08/03/2012]: The grades (including the grades of Assignment 5) have been added.
- [08/03/2012]: The minimum grade to pass the course is 6; however, if your total grade is 5.5 or higher, it will be rounded up to 6.
- [08/03/2012]: If your grade is below 5.5 and you want to pass the course, you have to do the retake. To be able to participate in the retake, you have to e-mail me two days before the retake date.
- [08/03/2012]: The grades shown on the website are not final (they might change). They become final and official when released in OSIRIS. The main reason for announcing them on the website is to receive your feedback on any possible mistakes.
- [06/03/2012]: Information about the retake can be found here.
- [23/02/2012]: The grades for the exam are already available.
- [24/02/2012]: Some grades for assignments 4 and 5 are still blank. The reason is that I could not compile or run the program. For those who cannot find their grades, a schedule for a meeting will be set soon (see the schedule below #3). In the meeting, which takes approximately 30 minutes, you should be able to run your program on my computer. This announcement is a revision of yesterday's announcement, in order to make the meeting schedule more efficient.
- [24/02/2012]: The meeting schedule is here. If you're not available, please e-mail me your available date and time, as soon as possible.
- [24/02/2012]: If you intend to do the retake, you have to e-mail me at least 2 weeks before the retake day.
- [03/02/2012]: Due to a number of requests, the deadline is extended to Sunday, 5/2/2012, at 23.00. Note that, while you are free to ask questions by e-mail over the weekend, I might not be able to answer them. Also, send your submission to the tanrobby.uu gmail account.
- [25/01/2012]: If you are interested in computer vision and intend to do an experimentation project or master thesis project in the subject, there are a few topics you might want to consider. There are also internship positions in both commercial and non-commercial institutions. Please e-mail me if you are interested.
- [25/01/2012]: The questions in the final exam will be taken from the lectures, slides, exercises, and reading materials (available in the schedule).
- [27/01/2012]: Final exam: the topics of fog removal and specular highlight removal will not be included in the exam.
- [25/01/2012]: Additional information for the experimentation and master thesis projects in computer vision: There are also internship positions in both commercial and non-commercial institutions.
- [25/01/2012]: For those who are interested in the implementation of the color constancy and specular highlight removal algorithms, you can find the code here.
- [20/01/2012]: If you are wondering how to implement a particle filter algorithm, here is an implementation in C/C++.
- [20/01/2012]: The code for solving Poisson's equation (Gradient Space Manipulation) is available here.
- [19/01/2012]: For those who are interested in gradient space manipulation, here are the results and a well-written report by a student (Bart Liefers): website.
- [13/01/2012]: If you are interested in image inpainting, here is an implementation in MATLAB. Also, one of my students (Sander van de Ven) has created the following inpainting videos: [video 1 | video 2].
- [20/01/2012]: Assignment 5: Question: "What is the size of the patches when using HSV? How many visual words should I generate?". Answer: these questions are part of the assignment, meaning that you have to find out by yourself through experimentation. This also applies to other parameters (such as the number of bins in color histograms).
- [20/01/2012]: Assignment 5: I found a newer version of MultiBoost (ver.1.1.05) that works on both Windows and Linux. It can be downloaded here. For Windows users, follow the instructions below:
- Download and extract the zip file. Open CMake. In CMake's GUI, enter the folder where the CMakeLists.txt of MultiBoost is located, click "Configure", and then click "Generate" (please also choose the compiler you use, either Borland or Visual Studio). The last step will generate a project file (.sln). Open the project file and compile the code.
- [20/01/2012]: Assignment 5: Question: "When extracting the HSV patches, should they overlap?". Answer: No, they should not.
- [17/01/2012]: Assignment 5: Question: can we have 2 main executable programs, one for the training process and the other for the testing process? Answer: Yes, actually I would like to encourage this. In any case, you should clearly write the instructions on how to run your programs in the report.
- [16/01/2012]: Assignment 5: Question: we use Linux (11.04) and run into problems when compiling the MultiBoost code. Answer: you have to modify the code by including all necessary header files, or simply download the modified code here and just type "make" to compile it.
- [16/01/2012]: Assignment 5: Question: do we need to have a single executable program for the assignment? Answer: No, you can have a few executable programs. However, there should be no human intervention once the main program is started. This implies that you have to call those executable programs automatically from a single main program.
- [16/01/2012]: Assignment 5: a minor correction in the number of missing segmentation images in class 15. Instead of 17 images, it should be 19 images. This might imply that you have to do manual segmentation for 4 flower images.
- [16/12/2011]: The grades of Assignment 3 are already available. If you cannot find your grade or have questions about it, you should let me know.
- [15/12/2011]: Assignment 4: Q: In the off-line process, are we allowed to do the selection of each person manually? A: Yes, you are allowed to select or crop the region of each person manually.
- [12/12/2011]: Assignment 4: The implementation of the back projection is actually available in the provided code: (1) to obtain the camera position, use function getCamLoc (in cvcCam.cpp) which is basically used by function getCamCoord (in xScene3D.cpp); (2) to obtain the 3D position of a pixel, use function pt2W3D (in cvcCam.cpp). If you have problems with the provided back-projection code, let me know.
- [07/12/2011]: For those who were not in the lecture today:
- The deadline of assignment 3 is extended to 12/12/2011 at 17.00.
- The deadline of assignment 4 is extended to 20/12/2011 at 17.00.
- If you have difficulties in generating voxels (in assignment 3), let me know by e-mail. If necessary we can have an appointment to look at the problems together.
- If you have already sent the solution of assignment 3, but want to send a better one, you are allowed to do so (until the deadline). However, if you think the submitted solution is good enough, I suggest that you move on to assignment 4.
- [05/12/2011]: FAQ of assignment 3. Q: we have done everything we can to remove shadows from the silhouettes, but we cannot remove them entirely. What should we do?
A: In general, shadow detection and removal is still an open problem in computer vision. Thus, some noise due to shadows will be tolerated.
- [07/12/2011]: If you have not finished assignment 3, there is a possibility of a deadline extension, which will be discussed in the lecture today.
- [05/12/2011]: FAQ of assignment 3. Q: why are the origins of our cameras not correctly located in the 3D space when showing them in OpenGL?
A: the provided 3D reconstruction code requires the size of the squares (the black or white squares) in the checkerboard in millimeters. So, if you use the automatic calibration from assignment 2, you need to measure the length of one of the squares with a ruler, multiply the translation values by it, and then store the new translation values in the ini files. Note that if you use the additional code for manual calibration (getExtrinsics.cpp), you only need to change the one parameter in the code responsible for that value ("square_size"). In any case, if you still run into problems after doing this, you should let me know.
- [02/12/2011]: Assignment 2: all submitted results have been uploaded and the grades are already available. If you cannot find your grade or the grade is not what you expected, you should e-mail me.
- [01/12/2011]: Correction for the method to project the camera's origin onto the ground. In today's lecture, I mentioned that, having calculated C (the camera's origin in world coordinates), we can set y=0. That is wrong. The correct approach is to set z=0. The reason is that we no longer work in camera coordinates but in world coordinates (the checkerboard's coordinates), where the height of the camera is parallel to the z-axis of the world coordinates. In any case, if you have problems with the back-projection when implementing the method, let me know.
- [01/12/2011]: The voxel-based tracking algorithm I explained in the lecture is not the state-of-the-art (nor the best) algorithm. However, with the knowledge you currently have, I think there are some elements in the algorithm that you can learn from, such as the concepts of appearance modeling, object localization, back-projection, etc. Also importantly, based on the algorithm, you can gain some practical programming experience in dealing with real data.
- [01/12/2011]: Aside from this news section, now and then also take a look at the FAQ section of the assignment websites. You might find important messages related to the assignments.
- [30/11/2011]: In assignments 1 to 3, there is basically no programming required (or only a very small amount of it). Among other purposes, those assignments are meant to give you opportunities to learn OpenCV and to brush up your C/C++, as well as to do some computer-vision-related experiments. However, starting from assignment 4, you will do intensive programming. So, for those who are not good at C/C++ or still have problems with OpenCV, it is time to learn and to solve your problems.
- [30/11/2011]: From the provided code in Assignment 3, you should really try to understand the whole concept of the voxel-based 3D reconstruction by carefully examining the code.
- [30/11/2011]: As I mentioned in the lecture today, the reasons for having the attendance list are: (1) I noticed an increase in the final grades (on average) when I include attendance in the grading, (2) there is less misunderstanding regarding the assignments, the exams, the contents of the lectures, etc., and (3) the materials of the course are not entirely based on the textbook (it is used only as a supplement to the lectures). In any case, if you cannot attend the lectures for good reasons, you should e-mail me.
- [29/11/2011]: Assignment 3: I have added a new section "notes and frequently asked questions" in the assignment website.
- [28/11/2011]: For those who still have difficulties in finishing Assignment 2, you should let me know. Again, I am willing to help.
- [28/11/2011]: The grades of Assignment 1 are already available here. If you cannot find your grade (for any reason), you should e-mail me.
- [28/11/2011]: Some of the broken links in the schedule have been fixed. If you still notice broken links in this website, e-mail me.
- [28/11/2011]: A student (Elena) has informed me that the AI open course of Stanford University this week is discussing image processing and computer vision. Also, in this open course, topics such as "probability in AI", "HMMs and Filters" can be useful for understanding the similar topics in our computer vision course.
- [27/11/2011]: The complete instructions of Assignment 3 are now available.
- [26/11/2011]: Assignment 2: an example of code that draws 3D virtual coordinates in an image (the 3D to 2D projection) is available here. If you cannot run the code or find bugs in the code, please e-mail me.
- [24/11/2011]: Assignment 2:
- There is a small bug in the code I provided (the one written by Peter Prins). At line 89, it should be "corner[j].y" instead of "corner[j].x" (the corrected code is here). If you want to know whether the estimated intrinsic parameters are correct, you can look at the value of fx and fy (the focal-length values in Intrinsic.xml). If they are similar, then the intrinsic parameters are likely to be correct.
- If the radial distortion in your images is considerably small, the process of undistortion might be unnecessary. Removing the process of undistortion in the provided code can be done by adding the following command: "distortion_coeffs = cvCreateMat(5,1,CV_32FC1);" right after cvCalibrateCamera2(). Or, in the corrected code, uncomment line 174.
- [24/11/2011]: If you run into problems or have questions, you should e-mail me.
- [24/11/2011]: The tentative instructions of assignment 3 are available here.
- [23/11/2011]: Assignment 2: if you have difficulties with your calibration code, try this code (written by Peter Prins). Obviously, you should change some parameters, file names, etc.
- [22/11/2011]: On Friday (25/11/2011), there will be no class.
- [18/11/2011]: The instructions of Assignment 2 are available here.
- [18/11/2011]: A student is still looking for a partner for the practical group. If you are also looking for a partner, e-mail me immediately.
- [20/11/2011]: The slides for lecture 2 have been uploaded.
- [18/11/2011]: If you have finished with Assignment 1, and want to submit your work, e-mail your complete submission to: tanrobby.uu (at) gmail.com. If the files are too big, send through dropbox, or other file transfer providers.
- [18/11/2011]: For submitting your work (that contains .exe files) through e-mail, you must change the file name .zip to something else, otherwise it would be filtered out.
- [17/11/2011]: The link of the attendance scores is now available. Click "attendance" above.
- [16/11/2011]: To download OpenCV: website | sourceforge
- [16/11/2011]: To download the textbook: website
- [16/11/2011]: The remaining lectures will be in MIN-208.
- [16/11/2011]: The complete instructions for assignment 1 and how to submit it will be announced here soon.
- [16/11/2011]: The lecture on Friday (18/11) will take place in AARD-KLEIN (see schedule).
- [16/11/2011]: Regarding the assignments, if you have questions or run into problems, you should e-mail me.
- [16/11/2011]: If you cannot find a partner for the assignments, you should come to the lecture on 18/11, so that I can pair you up.
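
The correction in the 01/12/2011 news item (project the camera's origin onto the ground by setting z = 0 in world coordinates) can be sketched as follows. This is a minimal sketch, not the course's provided code. It assumes the extrinsic parameters map world to camera coordinates as x_cam = R * X_world + t (the convention used by OpenCV's calibration functions), so that the camera's origin in world coordinates is C = -R^T * t. The names `cameraOrigin` and `projectToGround` are hypothetical.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Camera origin in world coordinates: C = -R^T * t.
// Assumption: extrinsics follow x_cam = R * X_world + t.
Vec3 cameraOrigin(const Mat3& R, const Vec3& t) {
    Vec3 C{0.0, 0.0, 0.0};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            C[i] -= R[j][i] * t[j];  // R[j][i] is element (i, j) of R^T
        }
    }
    return C;
}

// Project a world-coordinate point onto the ground plane of the
// checkerboard by setting z = 0 (not y = 0, which would only make
// sense in camera coordinates).
Vec3 projectToGround(const Vec3& C) {
    return {C[0], C[1], 0.0};
}
```

With an identity rotation and t = (1, 2, -3), `cameraOrigin` gives (-1, -2, 3), and `projectToGround` then gives (-1, -2, 0).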