CNK's blog

Postgres Makes Scheduling Easy

At work, we need to build a scheduling system. We want to present the user with a list of possible dates - and then the possible slots on that date. I don’t want to have all the possible empty slots in the database so I thought I would have to build them procedurally using Python.

import calendar
from datetime import timedelta
from pytz import timezone as pytz_timezone

AVAILABLE_DAYS = ['Monday', 'Wednesday', 'Friday']
AVAILABLE_START_TIME = {'hours': 8, 'minutes': 0, 'timezone': 'UTC'}
# start + estimated duration must be earlier than this
AVAILABLE_END_TIME = {'hours': 20, 'minutes': 0, 'timezone': 'UTC'}

def possible_times(start_date, end_date, estimated_duration, granularity=60):
    '''
    Returns a list of times when a user may start a reservation between start_date and end_date (inclusive)
    By default reservations may start hourly from AVAILABLE_START_TIME onwards;
    you may adjust how frequently reservations may start by setting the 'granularity' (in minutes)
    '''
    possibles = []
    date = _first_slot(start_date)
    while date <= end_date:
        if not _is_possible_day(date):
            # skip this day
            date += timedelta(days=1)
            continue

        # find slots on day
        last_slot = _last_slot(date, estimated_duration)
        while date <= last_slot:
            possibles.append(date)
            date += timedelta(minutes=granularity)

        # go to next day
        date = _first_slot(date + timedelta(days=1))

    return possibles


# ############## helper methods #######################

def _is_possible_day(date, available_days=None):
    if not available_days:
        available_days = AVAILABLE_DAYS
    return calendar.day_name[date.weekday()] in available_days


def _first_slot(date, start_time=None):
    '''Returns the first slot of the day'''
    if not start_time:
        start_time = AVAILABLE_START_TIME
    first_slot = date.replace(hour=start_time['hours'],
                              minute=start_time['minutes'],
                              tzinfo=pytz_timezone(start_time['timezone']))
    return first_slot


def _last_slot(date, duration, end_time=None):
    if not end_time:
        end_time = AVAILABLE_END_TIME
    last_slot = date.replace(hour=end_time['hours'],
                             minute=end_time['minutes'],
                             tzinfo=pytz_timezone(end_time['timezone']))
    last_slot -= duration
    return last_slot

The code above loops over the days in the range - and then on available days, loops over the hours in that day and returns a list of datetimes. There is a lot of ugly adding of Python timedelta objects and resetting the time to start iterating on a new day. It works - but the next step, eliminating slots that are already full, is going to be even uglier - lots of tedious “does this interval overlap with existing scheduled events”.

from datetime import datetime, timezone, timedelta
from django.test import TestCase
from ..utils import possible_times

class ReservationUtilsTests(TestCase):
    # ############### integration tests ##########################
    def test_no_possible_times_on_sunday(self):
        start_date = datetime(2017, 4, 30, hour=8, minute=0, tzinfo=timezone.utc)
        end_date = datetime(2017, 4, 30, hour=23, minute=0, tzinfo=timezone.utc)
        duration = 60  # in minutes
        slots = possible_times(start_date, end_date, duration)
        self.assertEqual(slots, [])

    def test_hourly_possible_times_end_earlier_if_takes_longer(self):
        start_date = datetime(2017, 4, 30, hour=8, minute=0, tzinfo=timezone.utc)
        end_date = datetime(2017, 5, 2, hour=23, minute=0, tzinfo=timezone.utc)
        expected = [datetime(2017, 5, 1, 8, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 9, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 10, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 11, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 12, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 13, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 14, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 15, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 16, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 17, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 18, 0, tzinfo=timezone.utc),
                    datetime(2017, 5, 1, 19, 0, tzinfo=timezone.utc),
                    ]
        slots = possible_times(start_date, end_date, estimated_duration=timedelta(minutes=60))
        self.assertEqual(len(slots), 12)
        self.assertEqual(slots, expected)
        slots2 = possible_times(start_date, end_date, estimated_duration=timedelta(minutes=120))
        self.assertEqual(len(slots2), 11)
        self.assertEqual(slots2, expected[0:11])

When I started looking into how to check the overlap, I started to looking into checking overlaps in the database - and found that a) Postgres has a date range data type (tstzrange), b) Django’s Postgres extensions has a field that wraps the Postgres tstzrange field (DateTimeRangeField), and c) the Postgres docs even have an example of how to create indexes that prevent you from scheduling more than one person to occupy a specific room at one time. All that ugly python, turns into:

import calendar
from django.db import connection

AVAILABLE_DAYS = ['Monday', 'Wednesday', 'Friday']
AVAILABLE_START_TIME = '08:00'
# start + estimated duruation must be earlier than this
AVAILABLE_END_TIME = '20:00'


def possible_times(start_date, end_date, estimated_duration, granularity=60):
    allowed_days = [list(calendar.day_name).index(day) + 1 for day in AVAILABLE_DAYS]

    cursor = connection.cursor()
        sql = '''
              SELECT *
              FROM   generate_series (timestamp %(start_date)s
                                      , timestamp %(end_date)s - interval '%(duration)s minutes'
                                      , interval '%(granularity)sm') h
              WHERE  EXTRACT(ISODOW FROM h) in %(allowed_days)s
                AND    h::time >= %(start_time)s
                AND    h::time <= %(end_time)s - interval '%(duration)s minutes'
                ;
              '''
        cursor.execute(sql, {'start_date': start_date,
                             'start_time': AVAILABLE_START_TIME,
                             'end_date': end_date,
                             'end_time': AVAILABLE_END_TIME,
                             'duration': estimated_duration,
                             'granularity': granularity,
                             'allowed_days': tuple(allowed_days),
                             })

    slots = [row[0] for row in cursor.fetchall()]
    return slots

The only slightly tricky part of that was restricting allowed days to MWF. I want my constant to use the day names, not the integers Postgres uses for days of the week. So I needed to import Python’s calendar module to convert “Monday” to an integer. Python uses 0 for Monday, but Postgres thinks Monday is 1, so add 1. Then it took me a little while to figure out how to pass a list into the query in a way that everything is properly interpolated and quoted; the trick: tuple(allowed_days).

Now I just need to join to my reservations table to exclude slots where the schedule is already full.

Comments