top of page
JMathGLogo1.png

Posets and Python 2 - Depth-First Search

Updated: Jun 28, 2023

In the last post, we built a class in Python that takes in a list of elements and a list of relations and returns a Poset object. Within the Poset class, we defined another class PosetElement so that we could redefine the dunder method "__lt__" (meaning less than) to let us compare elements of our poset using our list of relations. The first thing I want to do is fix an oversight.


We defined "self.relations" to just be the input list, but we really should have converted them into ordered pairs of PosetElement objects. Doing that directly like so

self.relations = []
for r in relations_:
    r0 = self.PosetElement(r[0],relations_)
    r1 = self.PosetElement(r[1],relations_)
    self.relations.append((r0,r1))

actually causes sneaky errors down the line! It is very confusing to figure out what's wrong when the names are the same but equality still throws "False". This is because we're creating a bunch of new PosetElement objects instead of using what we already have in "self.set". And we can get the correct items from "self.set" by looking at indices in "set".

self.relations = []
for r in relations_:
    r0 = self.set[set_.index(r[0])]
    r1 = self.set[set_.index(r[1])]
    self.relations.append((r0,r1))

Here's what we have so far:

class Poset():

     class PosetElement():
          def __init__(self1,name,relations_):
               self1.name = name
               self1.relations = relations_

          def __lt__(self1,B):
               return ((self1.name,B.name) in self1.relations)
                    
     def __init__(self,set_,relations_):
          self.set = []
          for s in set_:
               self.set.append(self.PosetElement(s,relations_))
          self.relations = []
          for r in relations_:
               r0 = self.set[set_.index(r[0])]
               r1 = self.set[set_.index(r[1])]
               self.relations.append((r0,r1))

There are many things we might want to know about a poset, so today I want to update our class with a few methods and attributes that will help us actually study the posets we are interested in. Let's start by adding equality, which will simply be equality for their names.

def __eq__(self1,B):
    return (self1.name == B.name)

Fun Fact: Python checks equality is by checking their id, a unique integer assigned to an object. If two objects have the same id, then they are equal. This is fun to test on a bunch of different things, but do know your id numbers will probably be different!

print(id(2)) #output: 4361099600
print(id(1+1)) #output: 4361099600
print(id(len('Hi'))) #output: 4361099600
print(id('Hi')) #output: 4365278704
print(id(set)) #output: 4359670192

Now let's finish out our comparisons by defining "__le__, __gt__, __ge__" inside PosetElement.

def __le__(self1,B):
    return (((self1.name,B.name) in self1.relations) or self1.name == B.name)

def __gt__(self1,B):
    return ((B.name,self1.name) in self1.relations)

def __ge__(self1,B):
    return (((B.name,self1.name) in self1.relations) or self1.name == B.name)

Whenever two elements A,B of our poset have the property that "A < B = False" and "B < A = False", then we say that X and Y are incomparable. We'll add a method to the Poset class to do this check in a minute.


I also want to add that if we wanted to be careful, we should add checks to make sure both inputs are PosetElement objects. In this case, we'll just assume we're entering things correctly.


Here are a bunch of examples:

P = Poset([1,2,3,4,5],[(1,2), (1,3), (1,4), (2,5), (3,5), (4,5)])
_1 = P.set[0]
_2 = P.set[1]
_3 = P.set[2]
_4 = P.set[3]
_5 = P.set[4]
print(_1 < _2) #output: True
print(_1 == _2) #output: False
print(_1 <= _2) #output: True
print(_1 <= _1) #output: True
print(_1 > _2) #output: False
print( _5 >= _2) #output: True
print(_2 > _5) #output: False
print(_2 >= _2) #output: True

Visually, this poset looks like this:

Poset([1,2,3,4,5], [(1,2),(1,3),(1,4),(2,5),(3,5),(4,5)])

An important distinction is that when we graph a poset like this, we usually assume it's a Hasse Diagram, which means the partial order is transitive. In our case, we are explicitly naming the relations, so the only relations are those arrows shown and transitivity is not assumed. A graph like this, with vertices and directed edges, can also be called a digraph. So our Poset construction could also be looked at as a DiGraph class. So for the P we defined, we have "1 < 5 = False". In the future, we could add a transitive closure method within Poset.


Methods in the Poset Class


Let's start by adding a function to Poset that will check if two elements are incomparable. As we mentioned before, this comes down to checking if "(A = B) = False" and "(A < B) = False" and "(B < A) = False". And we can actually do this very simply using one of De Morgan's Laws, which tells us that the two statements

not (X or Y)

and

(not X) and (not Y)

are logically equivalent. What's great is that now that we've defined our dunder methods, we can use the "==, <, >, <=, >=" operators and Python will understand what we mean.

def incomparable(self,A,B):
          return  (not (A == B or A < B or A > B))

And a few examples:

print(P.incomparable(_2,_5)) #output: False
print(P.incomparable(_1,_5)) #output: True
print(P.incomparable(_1,_2)) #output: False
print(P.incomparable(_2,_3)) #output: True
print(P.incomparable(_2,_2)) #output: False

The next method we'll add is just one to return the elements of the set. Currently printing this gives a list containing a bunch of "<__main__.Poset.PosetElement object at 0x10633f2e0>". But that usually isn't what we want when asking for the elements of a Poset. So we'll add a method to get the elements of a Poset and show them as their names.

def get_elements(self):
          return [x.name for x in self.set]

Let's check the difference between "P.set" and "P.get_elements()"

print(P.set) #output: [<__main__.Poset.PosetElement object at 0x10bbec370>, <__main__.Poset.PosetElement object at 0x10bbec3d0>, <__main__.Poset.PosetElement object at 0x10bbec430>, <__main__.Poset.PosetElement object at 0x10bbec490>, <__main__.Poset.PosetElement object at 0x10bbec4f0>]
print(P.get_elements()) #output: [1, 2, 3, 4, 5]

Notice we call "P.set" because it is an attribute but "P.get_elements()" because it is a method. We'll also add a "P.get_relations()" method to display the relations by their names. This gives:

class Poset():

     class PosetElement():
          def __init__(self1,name,relations_):
               self1.name = name
               self1.relations = relations_

          def ___eq__(self1,B):
               return (self1.name == B.name)

          def __lt__(self1,B):
               return ((self1.name,B.name) in self1.relations)

          def __le__(self1,B):
               return (((self1.name,B.name) in self1.relations) or self1.name == B.name)

          def __gt__(self1,B):
               return ((B.name,self1.name) in self1.relations)

          def __ge__(self1,B):
               return (((B.name,self1.name) in self1.relations) or self1.name == B.name)
                    
     def __init__(self,set_,relations_):
          self.set = []
          self.relations = relations_
          for s in set_:
               self.set.append(self.PosetElement(s,relations_))

     def incomparable(self,A,B):
          return  (not (A == B or A < B or A > B))
         
     def get_elements(self):
          return [x.name for x in self.set]
          
     def get_relations(self):
          return [(r[0].name, r[1].name) for r in self.relations]

Posets carry all their structure in the ordering we give its elements, so what we want to do next is characterize that structure in a...structured way. Then we can use that information to find out details about our poset. This structure will come in the form of chains. A chain of length k in a poset is an ordered list of elements [v1, v2, ..., vk] so that

v1 < v2 < ... < vk

Another word for a chain is a Total Order, which differs from the Partial Orders we've been thinking about because in a total order, any two elements can be compared. Think of the real number line. Just one long line. Our posets can be thought of as being built up (glued together) from chains of maximal length, so finding them will be our next goal!


Depth-First Search


Ok, this will be fun! I've worked with graphs in SageMath a bunch and always wondered how exactly they implemented the many property checks and methods they offer for graphs. Now I'll get the chance to go ahead and implement a very important algorithm for working with our posets: the Depth-First Search (DFS) algorithm. Let's see how it goes.


We'll define a function "DFS(V: list, E: list)" which takes in a list of vertices and a list of edges. This statement is equivalent to "DFS(V, E)", but I wanted to include the type hints for readability. Let's detail our process:

  1. Calculate all in-neighborhoods and out-neighborhoods of vertices.

  2. Use that to form a list of leaves (vertices with no outward arrows) and sources (vertices with no inward edges).

  3. We will choose our root from our list of sources and keep track of which sources haven't been searched yet.

  4. We'll move forward at random if we can, recording the chain we're forming.

  5. If we can't move forward because we reached a leaf, we formed a maximal chain and will record it. This is essentially the info we really want from this!

  6. If we can't move forward because we're at a root, then that means all chains from that root have been traversed. We choose a new root from the list of unsearched sources if it is not empty. If it is empty, then the algorithm terminates and we return the list of maximal chains.

  7. If we can't move forward because we're at a vertex with no outward edges available (this include leaves), then we move back along the last edge we traversed and remove that vertex from our current chain.

Dictionaries are going to make this work much cleaner. To accomplish (1), we initialize two dictionaries whose keys are the vertices and values are empty lists. To fill those lists with the appropriate neighbors, we just look through all edges (x,y) and record x in the in-neighborhood of y, and y in the out-neighborhood of x. All together, this becomes:

def neighborhoods(V: list, E: list):
     n = len(V)
     out_neighborhood = {V[i]:[] for i in range(n)}
     in_neighborhood = {V[i]:[] for i in range(n)}
     for e in E:
          out_neighborhood[e[0]].append(e[1])
          in_neighborhood[e[1]].append(e[0])
          
     return out_neighborhood,in_neighborhood

Remember throughout that "out_neighborhood" and "in_neighborhood" are dictionaries whose values (their list of in/out-neighbors) we retrieve via "out_neighborhood[v]".


Now we go ahead and define "DFS(V: list, E: list)" and form our neighborhoods from V and E. For (2), we form our list of leaves by adding vertices with no out-neighborhood, and we form our list of sources by adding vertices with no in-neighborhood.

def DFS(V: list, E: list):
     out_n, in_n = neighborhoods(V,E)
     leaves = [v for v in V if len(out_n[v]) == 0]
     sources = [v for v in V if len(in_n[v]) == 0]

Before moving on, we also want to initialize a bunch of variables. We do this to be careful later, where we could get an error for referencing variables before they're defined. This includes our "root = 0", our current vertex "c_v = 0", and our current "chain = []". We just use 0 because it doesn't matter what their values are at first - they will be overwritten before ever being referenced.


We also form our list of "maximal_chains = []", our unsearched sources "us_s = sources", and a variable "c_done = True" to check if we're done with the component we're searching from a chosen root. Moving on to the actual algorithm, we start with a "while" loop that always runs. This is because when the algorithm is over, we will return our list of maximal chains and that will end the loop.


For (3), we import all functions from the random module: "from random import *". We use "choice(us_s)" to choose a random unsearched source as our root. We remove that element from "us_s", set our current vertex to the root, begin our chain with the root, and set "c_done = False" since we're exploring a new component. We also recalculate "out_n" and "in_n" with each new root choice to make sure we truly get all chains.

while True:
          if c_done: #will always run first
               out_n, in_n = neighborhoods(V,E)
               root = choice(us_s)
               us_s.remove(root)
               c_v = root
               chain = [root]
               c_done = False

I included the comment "will always run first" to remind ourselves that even though it's an "if" statement, this block of code will always run first since we initialized "c_done = True", and it is this block that truly starts the algorithm.


With "out_n[c_v]" being the available out-neighborhood of the current vertex, we accomplish (4) by checking if it is not empty. If so, we choose a new vertex "n_v = choice(out_n[c_v])" and remove it from the list of out-neighbors. We set it as our current vertex "n_v = c_v" and add it to our current chain.

if len(out_n[c_v]) > 0:
    n_v = choice(out_n[c_v])
    out_n[c_v].remove(n_v)
    c_v = n_v
    chain.append(c_v)

Next we fill in the "else" statement, which means our vertex has no out-neighborhood. We can accomplish goal (5) by checking if "c_v in leaves" and recording our chain as a maximal chain if so.

else:
    if c_v in leaves:
        maximal_chains.append(chain)

For (6), we check if our current vertex is the root and if so, we check to see if there are any more unsearched sources to choose as a root. If there are none, we're done! We'll go ahead and print that for celebration and then return our list of maximal chains. If "us_s" is not empty, we set "c_done = True" to note that we finished this component. This also means the next loop we do will catch that and choose a new root.

else:
    ...
        if c_v == root:
            if len(us_s) > 0:
                c_done = True
            else:
                print('Done')
                return maximal_chains

Finally, we'll add an "else" statement to the "if c_v == root" condition for (7) where we remove our current vertex from the chain and set the current vertex to the last element in the chain.

if c_v == root:
    ...
else:
    chain = chain[:-1]
    c_v = chain[-1]

And there we have it! Here is the whole DFS code put together.


Full Code

def DFS(V: list, E: list):
     out_n, in_n = neighborhoods(V,E)
     leaves = [v for v in V if len(out_n[v]) == 0]
     sources = [v for v in V if len(in_n[v]) == 0]
     us_s = sources
     maximal_chains = []
     chain = []
     c_done = True
     root = 0
     c_v = 0

     while True:
          if c_done: #will always run first
               out_n, in_n = neighborhoods(V,E)
               root = choice(us_s)
               us_s.remove(root)
               c_v = root
               chain = [root]
               c_done = False

          if len(out_n[c_v]) > 0:
               n_v = choice(out_n[c_v])
               out_n[c_v].remove(n_v)
               c_v = n_v
               chain.append(c_v)

          else:
               if c_v in leaves:
                    maximal_chains.append(chain)
               if c_v == root:
                    if len(us_s) > 0:
                         c_done = True
                    else:
                         print('Done')
                         return maximal_chains
               else:
                    chain = chain[:-1]
                    c_v = chain[-1]

Examples

Let's take the poset we've been working with. In addition to "maximal_chains", I'm going to print "c_v" each time we do a loop to see how our algorithm moves along our digraph.

M = DFS([1,2,3,4,5],[(1,2),(1,3),(1,4),(2,5),(3,5),(4,5)])
for c in M:
    print(c)
#output: [1,2,5], [1,4,5], [1,3,5]

And the vertices we travelled along were:

1, 3, 5, 3, 1, 4, 5, 4, 1, 2, 5, 2, 1

Which can be visualized by highlighting the path of vertices the algorithm takes:

Depth-First Search Ex 1 Gif

Now let's test a new poset!

N = DFS([1,2,3,4,5,6,7],[(1,4),(2,5),(3,5),(4,6),(5,6),(5,7),(4,7)])
for c in M:
    print(c)
#output: [3,5,6], [3,5,7], [1,4,6], [1,4,7], [2,5,7], [2,5,6]

What's fun is that once we start working with more complicated posets, the path that DFS takes can be fairly different with each pass. This is fun to watch! Here are a few examples:

3, 5, 6, 5, 7, 5, 3, 1, 4, 6, 4, 7, 4, 1, 2, 5, 7, 5, 6, 5, 2

Depth-First Search Ex 2-1 Gif

 

2, 5, 7, 5, 6, 5, 2, 3, 5, 6, 5, 7, 5, 3, 1, 4, 7, 4, 6, 4, 1

Depth-First Search Ex 2-2 Gif

And that's Depth-First Search! Next time, we'll make use of those maximal chains we got from this process to learn things about the structure of our Poset class.




Thank you for reading!




Jonathan M Gerhard



Recent Posts

See All

Comments


bottom of page