PyInteger

Course notes for ESC190 Computer Algorithms and Data Structures

While in C the allowed integer datatypes (mainly intand unsigned int) have a restricted size, Python ==implements== a seemingly infinite integer. We will call this the pyinteger ADT and derive a possible implementation (such a task appeared on a past midterm!). The ADT pyinteger allows us to represent arbitrarily large integers and perform the two basic operations,

We first set up our header file, pyinteger.h:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
#if !defined(PYINT_H) // We want to define PyIntegers exactly once
#define PYINT_H
typedef struct pyint{
    int *buffer;
    int length;
} pyint;

void create_pyint(pyint *p, int length);
void destroy_pyint(pyint *p);
void set_pyint(pyint *p, int value);
void print_pyint(pyint *p);

void add_pyints(pyint *p, pyint *q);        // Add 2 pyints together
void add_pyint_scalar(pyint *p, int value); // Add an integer to a pyint

#endif

where pyint is a struct that contains the following fields:

Here, we are representing an integer via an array where each element in this array is a digit from 0-9. For example, the integer 1234 would be represented as [1, 2, 3, 4]. Note that we will handle cases where the number we want to store exceeds the number of digits currently allocated in the buffer (and will typically double the array to avoid having to reallicate memory too often).

Creating the Struct

Now we work in pyint.c and we wish to implement the ADT. But first, we can create a function that allows us to create a pyint struct which initially stores int n. What are some natural tasks that we must do?

To determine the initial capacity, we need to determine the number of digits int n has.

I claim that the formula for the number of digits of 𝑛 is given bynum of digits=floor(log10⁡(𝑛))+1.

Proof: A 𝑘-digit number can be written as𝑛=𝑎𝑘−110𝑘−1+𝑎𝑘−210𝑘−2+⋯+𝑎1101+𝑎.where the leading coefficient 𝑎𝑘−1≠0. To double-check this, note that 15=1×101+5 so the leading power is 𝑘−1. Note that:$10𝑘−1\le 𝑛<10𝑘$.It is important that the right inequality is strict! This is because 10𝑘 has 𝑘+1 digits while 𝑛 only has 𝑘 digits. For those who want more rigour, note that each 𝑎𝑖 is smaller or equal to 9 so we can bound 𝑛≤∑𝑖=0𝑘−19⋅10𝑖.

Since the logarithm is a monotonically increasing function, we havelog10⁡(10𝑘−1)≤log10⁡(𝑛)<log10⁡(10𝑘)or equivalently,𝑘−1≤log10⁡(𝑛)<𝑘.Therefore, floor(log10⁡(𝑛))=𝑘−1 so𝑘=floor(log10⁡(𝑛))+1as expected.

However: If 𝑛=0 then we get something that is undefined, so we have to treat this case separately.

Obviously, you are not expected to prove this on the midterm, but I provided the proof here for those who are interested and for the rigour. It’s a handy fact to know though, so you don’t need to be stressed about weird edge cases! The process of extracting the digits from the integer is a bit tricky, but it’s something that we’ve done before! To get the last digit, we can compute𝑛%10.To get the second to last digit, we can shift all the digits by diving the number by 10 and ignoring the decimal part. Then we can repeatedly apply this same algorithm until we have no more digits left. Therefore, we should start from the end of the array and work our way backwards. Here is the code:

void create_integer(pyinteger **pyint, int n){
    // Allocate memory for the pyinteger struct
    *pyint = (pyinteger*)malloc(sizeof(pyinteger));

    // Proven in a theorem
    if (n == 0){
        (*pyint)->capacity = 1;
    }
    else{
        (*pyint)->capacity = (int)(log10(n)) + 1;
    }

    // Currently the number of digits is the same as the capacity
    (*pyint)->num_digits = (*pyint)->capacity;

    // Allocate memory for the digits
    (*pyint)->digits = (int*)malloc(sizeof(int) * (*pyint)->capacity);

    // Fill the digits
    for (int loc = (*pyint)->num_digits - 1; loc >= 0; loc--){
        (*pyint)->digits[loc] = n % 10;
        n /= 10;
    }
}

Implementing PlusPlus

Now we need to implement the function plusplus. This function should add 1 to the integer. Why might this be a tricky task? Well, the naive approach is to just add 1 to the units digit, but if the units digit is 9, then something weird might happen. Let’s break it up into two cases, and work with the easier case first (this is a good idea in a test situation, since you know there will definitely be partial marks)

we can implement it (with a small hiccup we’ll discuss in a bit) as follows:

1
2
3
4
5
6
7
8
9
10
11
void plusplus(pyinteger *pyint){
    int loc;
    for (loc = pyint->num_digits - 1; loc >= 0; loc--){
        if (pyint->digits[loc] == 9){
            pyint->digits[loc] = 0;
        } else {
            pyint->digits[loc]++;
            break;
        }
    }
}

Note that at this stage, there is one more hiccup. If everything is 9, then the size of the number can increase! This specific case is a bit annoying, but just doing everything up to this point will give you 8/10 points. If you can’t fully figure how to solve a question, solving it partially can still land you a lot of points!

So how do we fix this? To deal with this case, we ask the following natural questions:

To detect if we even need to consider this case, there are many ways to do so. One way is to check the value of loc after the for loop finishes. If the for loop exits via the break statement, we should have loc >= 0. If it doesn’t exit via the break statement, that means the first digit was also a 9, so loc == 0.

How can we check if we need to re-allocate memory? At this point, we know that the number of digits has increased by 1, so we can update num_digits. Now we can check if the number of digits is greater than the capacity. If it is, then we need to re-allocate memory. We can do this by doubling the capacity.

Finally, we need to be able to shift all the digits over. At least, this was my first thought when I saw the problem. However, if we work through a simple example (or think about it), at this point, every single digit should be 0! Therefore, all we need to do here is set the first digit to 1. Easy!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
void plusplus(pyinteger *pyint){
    int loc;
    for (loc = pyint->num_digits - 1; loc >= 0; loc--){
        if (pyint->digits[loc] == 9){
            pyint->digits[loc] = 0;
        } else {
            pyint->digits[loc]++;
            break;
        }
    }

    if(loc == -1){
        pyint->num_digits++;
        if(pyint->num_digits > pyint->capacity){
            pyint->capacity *= 2;
            // Reallocate memory for the digits
            pyint->digits = (int*)realloc(pyint->digits, sizeof(int) * pyint->capacity);
        }

        // First digit is 1
        pyint->digits[0] = 1;

        // Last digit is 0
        pyint->digits[pyint->num_digits - 1] = 0;
    }
}

Implementing Add

Now we need to implement adding. First, why might this be a hard task? It’s because the digits are not aligned in a “nice” manner. When we learn long addition in elementary school, we align the rightmost digit (ones digit) of the two numbers. But the way we implemented it here, we align the leftmost digit.

Note: we did not get to finish this in class, but here are some ideas for solutions