CS410 Text Information Systems (Spring 2013)
Instructor: ChengXiang Zhai
The project presentation schedule is available here.
The instructor will hold extra office hours on Monday, May 6, 2:30pm-5pm (in 2116 SC) to help address
any last-minute problems some of you may encounter in finishing your course projects.
Project presentations will be held in 3403 Siebel Center (not our classroom).
Please check out the details about project presentations on the project page.
The midterm for CS410 is scheduled for Tuesday, April 9, 11am-12:15pm, in Auditorium 149, National Soybean Research Center (NSRC), 1101 W. Peabody Drive, Urbana (south campus).
Practice questions for preparing for the midterm exam are available here.
A sample midterm from the past, with solutions, is available here. Note that some questions (namely,
2(c) and 3) are outside the scope of the topics that we will cover in our midterm.
Assignment 4 is available now.
Sample project topics are available here.
Topics to be covered in the midterm are available here.
Project proposals are due Tuesday, March 26, 2013. Please check the project
page for details. Literature review topics are also due at the same time. Please
check this page
for more details.
The assignment schedule has changed. Assignment 3 Part 3 will be due on Mar. 30, and Assignment 4 will be released on Mar. 25.
Detailed guidelines for readings have been posted on the Readings page.
Assignment 3 is available now.
Assignment 2 is available now and due on Sunday, Feb 17, 2013, 11:59pm (extended)
Assignment 1 on Compass is available.
Piazza will be used for announcements and discussion for CS410. Please register ASAP.
Assignment #1 is available and due on Tuesday, Feb. 5, 2013 (see the Schedule page).
The current TA office hours are: (1) Xiaolong Wang: 3:30-5:30pm Wednesdays; (2) Mianwei Zhou: 8-9pm Wednesdays; 3-4pm Fridays, all in 0207 Siebel Center (basement).
As the amount of online textual information (e.g., web pages, weblogs, tweets, email,
news articles, office documents, and scientific literature)
grows explosively, it is increasingly important to develop tools
to help us manage and exploit the huge amount of information. Web search engines,
such as Google and Bing, are good examples of such tools, and they
are now an essential part of everyone's life.
In this course, you will learn the underlying technologies of these
and other powerful tools for managing and analyzing text information. You will
learn the basic principles and algorithms for managing, analyzing, and mining text data, and
gain hands-on experience
using existing information retrieval toolkits to set up your own
search engines and improve their search accuracy. You will also have an opportunity to work on a course project
on a topic of your choice related to the course materials.
Unlike structured data, which is typically managed with a relational database, textual information is unstructured and poses special challenges due to the difficulty in precisely understanding natural language and users' information needs.
In this course, we will introduce a variety of techniques for accessing and mining text information.
The course emphasizes basic principles and practically useful algorithms.
Topics to be covered include, among others, text analysis, text retrieval, text categorization, text filtering, clustering, topic mining and analysis, search engine design and implementation, and applications in Web search and mining.
The course is lecture-based. Grading is based on
a set of assignments, a late midterm examination, and a
course project. Those who registered for the course for 4 credit hours are required to complete a literature
survey on a frontier topic.
For more information about the course policy, please see the "Basic Information" page of the course.