September 2, 2014

[propn] A Tool for Undeclared Mathematicians @UWaterloo

This is my sixth term at university. For my first three terms I was in Management Engineering, which is a UWaterloo hybrid of information technology, business, and industrial engineering. My fourth term I switched out of Management Engineering into Mathematics, with the intent of concentrating in Computer Science. I quickly realized that Computer Science wasn’t for me, and my fifth term I took a motley array of mathematics courses that ended up loosely focused around Combinatorics and Optimization. I’m now in my sixth term, doing an exchange at HKUST, and still an undeclared Mathematics student, roughly equal parts towards a degree in Computer Science, Combinatorics and Optimization, Pure Mathematics, Computational Mathematics, and Statistics.

UWaterloo likes to streamline their students into specific concentrations very early on in their academic career. Talking to other exchange students at HKUST, it sounds like most schools have three to five different concentrations within Mathematics. UWaterloo has eleven.

The reason that other schools can have such a limited number of concentrations is because they group topics into broader classifications. For example, at UWaterloo we take Algebra in five courses: Advanced Algebra, Advanced Linear Algebra 1, Advanced Linear Algebra 2, Groups and Rings, and Fields and Galois Theory. At HKUST they take: Honors in Linear and Abstract Algebra 1, and Honors in Linear and Abstract Algebra 2. In Germany they take one year long advanced Algebra course that covers the material from all the above.

It makes sense that UWaterloo likes to introduce specialization early, because UWaterloo is a co-operative university and industry likes specialization. This is most apparent in the Engineering faculty at UWaterloo. Most schools don’t differentiate their engineering programs until year two or later (and have fewer differentiated engineering concentrations to boot) but UWaterloo differentiates their engineering concentrations from day one. This is because Engineering students at UWaterloo are looking for their first internship in either their first or second term. The more specialized their knowledge, the easier it is for them to find a job.

Specialization is useful when it comes to improving hireability, but is dangerous when it comes to people who don’t know what they want to do - people like me. Despite not getting closer to graduating, I enjoyed my term branching out in mathematics more than any term prior. There’s a lot of interesting types of mathematics and being limited to just one specific concentration isn’t necessarily best for everyone. To that end, I’d like to propose a tool for helping undeclared mathematicians at UWaterloo:

Each degree concentration at UWaterloo has required courses. Each course has zero or many pre-requisite courses. We can think of this data as a sort of path. A student follows the path, by taking courses, which leads them to their destination, a degree. These paths are not independent though. They twist and turn, fork and converge. A useful mathematical model for this type of data is a directed graph, where each class is a vertice with a directed edge to the other classes that it is a pre-requisite for, and all degree concentrations that require it. The leaf vertices in this directed graph are either classes without pre-requisites or degree concentrations.

For example, the Algebra sequence described above would be represented as:

Advanced Algebra -> Advanced Linear Algebra 1 -> Advanced Linear Algebra 2 -> Groups and Rings -> Fields and Galois Theory -> ... -> Concentration in Pure Mathematics

and all of the courses would contain additional directed edges to other courses that they are a pre-requisite for, and degree concentrations that they are a requirement for.

From this model we can construct a visualization to help lend clarity to the relationships between specific classes, degree concentrations, and greater trends in mathematics. Using this visualization, students can plan their courses within a bigger mathematical context, rather than being limited to the scope of a specific degree concentration. For example, I could have used such a visualization to answer questions such as:

Which courses are important to take early because they are requisite for many later courses?
Which courses are required by the most degree concentrations, and could thus be taken as a way of postponing making a degree concentration decision?
Which degree concentrations are similar, and could perhaps lend themselves to double majoring, or minoring?

There are many forms that this visualization could take, but I think a useful one harks back to the path analogy. Courses will be represented as lines with widths dependent on the relative importance of the course as a pre-requisite for other courses. Pre-requisite course lines will fork and converge into new lines representing the courses they are pre-requisite for. All of these lines will ultimately end up flowing into a destination point, representing the degree concentration that they are required for. Degree concentration points with thicker lines flowing into them would represent less flexibility (more required courses).

The thickness of the line representing a course would be determined by a point system. Each course would get one point for each of the following:

The number of degree concentrations that require this course to graduate.
The number of courses that require this course as a pre-requisite. In some cases #2 could end up being a fraction, for when degree concentrations require “any 3 of the following 5 courses” (then each of the five courses would get 3/5 of a point for that degree).

The resulting visualization would look like a network of roads, with the paths starting from the courses without pre-requisites and ending at the different degree concentrations. In order to plan your courses you could simply map out your route along the visualization.

I don’t think building this visualization would be all too difficult. We’ll start by assuming that we have a data source that lets us look up degree concentrations and courses, and that once we have a degree concentration we can additionally look up the courses required for that degree concentration, and that once we have a course we can look up the pre-requisites for that course. For simplicity’s sake, we’ll bunch degree concentrations and courses together into a Node object, and required courses, and pre-requisite courses together into a children property on the Node object.

The data source that we’re assuming does in fact exist, but it would have to be pieced together from the degree specifications and course catalog websites.

let Node be a class with properties id and children
let D be a data source that maps id's to Node objects
let degreeConcentrations be a list of Node ids

The algorithm we’ll use will be based on Breadth First Search, seeded with all the different degree concentrations (which we know). The goal is to get a list of leaf nodes representing the courses without pre-requisites, and to build up the count representing the width of each line.

let Q be a queue
let G be a list

forEach degreeConcentration in degreeConcentrations
    Q.enqueue(D.get(degreeConcentration))

while !Q.empty()
    Node n = Q.dequeue()
    // increment the count representing the width of this line
    n.count += Math.sum(n.parents().map(function(parent){
        return parent.count
    }))
    // apply bread first search
    forEach child in n.children()
        D.get(child).parents().add(n)
        Q.enqueue(child)
    if (n.children().empty())
        // then this is a course without pre-requisites
        // keep a reference to it in the graph
        G.add(n)

After this algorithm terminates we’ll be left with an object G with references to each of the starting nodes in our graph. In order to build our visualization we use the parents and count properties that we built up in the previous section. We’ll assume we have a drawLine(id1, id2, thickness) function that draws a line from the point at id1 to the point at id2 with the given thickness.

The reason that we couldn’t do this in the previous section is because we’re performing our BFS on a graph and not a tree. In other words, interior nodes can have connections to multiple parent degree concentrations.

// starting another round of BFS
// (going the other direction this time for simplicity)
let Q be a queue
// seed with the courses without pre-requisites
forEach Node n in G
    Q.queue(n)
// draw the lines
while !Q.empty()
    Node n = Q.dequeue()
    forEach child in n.children()
        drawLine(n, child, child.count)
        Q.enqueue(child)

Done!

I wasn’t expecting this to be as long as it ended up being, but I guess my subconscious had time to do a lot of thinking while my conscious was busy trying work out my class schedule at HKUST. I probably won’t end up having time to actually build this (mostly due to the two undiscussed complications of accessing the data programmatically and deciding where to place each point) but if anyone else is interested I could be convinced. I think it’s probably too late for existing students (like me) but it could be a very useful for incoming students.

As always, feel free to reach out to me @jomrcr. Thanks for reading : )

Kudos

[propn] A Tool for Undeclared Mathematicians @UWaterloo

Now read this

Ça va