# Introduction to First-Order Differential Equations

A differential equation is just an equation which involves "differentials", that is to say, derivatives. A simple example is

$$\frac{dy}{dt} = 0,$$

or, in different notation, just $y' = 0$, where we understand that $y$ is a function of an independent variable $t$. (We use $t$ because in many examples the independent variable happens to be time, but of course any other variable could be used.) If $y' = 0$, $y$ must be constant. In other words, the general solution of the given equation is $y \equiv c$, for some constant $c$. Another easy example of a differential equation is:

$$\frac{dy}{dt} = -27.$$

This means that $y = y(t)$ has a graph which is a line with slope $-27$. The general solution of this equation is $y = -27t + c$, for some constant $c$.

An initial value problem is a problem in which we give a differential equation together with an extra condition at a point, like:

$$\frac{dy}{dt} = -27, \qquad y(0) = 3.$$

There is a unique solution of this initial-value problem, namely $y(t) = -27t + 3$. It can be found by first finding the general solution $y = -27t + c$ and then plugging in $t = 0$ to get $3 = -27(0) + c$, so $c = 3$. Of course, the "initial" condition may also be specified at some other value of the independent variable $t$, for example:

$$\frac{dy}{dt} = -27, \qquad y(2) = 3.$$

The solution of this initial-value problem can also be obtained by plugging into the general form $y = -27t + c$: we substitute $3 = y(2) = -27(2) + c$, which gives $c = 57$, and so the solution is $y(t) = -27t + 57$.
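The two computations of $c$ follow the same recipe, which can be sketched in a few lines of Python. The function name `solve_constant_slope_ivp` is made up for this illustration and is not part of the notes.

```python
def solve_constant_slope_ivp(slope, t0, y0):
    """Return (c, y) for y' = slope with y(t0) = y0, where y(t) = slope*t + c."""
    # Plug the condition into the general solution: y0 = slope*t0 + c.
    c = y0 - slope * t0

    def y(t):
        return slope * t + c

    return c, y


# The two initial-value problems from the text.
c_a, y_a = solve_constant_slope_ivp(-27, 0, 3)
c_b, y_b = solve_constant_slope_ivp(-27, 2, 3)
print(c_a, c_b)
```

The same slope with a condition at a different point simply shifts the constant $c$.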

A slightly more complicated example of a differential equation is:

$$\frac{dy}{dt} = \sin t + t^2.$$

The general solution is (by taking antiderivatives) $y = -\cos t + t^3/3 + c$. Another example:

$$\frac{dy}{dt} = e^{-t^2}.$$

This equation has a general solution, but it cannot be expressed in terms of elementary functions like polynomials, trigonometric functions, logarithms, and exponentials. (Up to a constant factor and an additive constant, the solution is the "error function", which is used in statistics to compute probabilities for the Gaussian or normal distribution.) One of the unfortunate facts about differential equations is that we cannot always find solutions as explicit combinations of elementary functions. So, in general, we have to use numerical, geometric, and graphical techniques in the analysis of properties of solutions.
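As an illustration of the numerical point of view, the sketch below compares two ways of evaluating the solution of $y' = e^{-t^2}$ with $y(0) = y_0$: the closed form $y_0 + \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(t)$, using the error function from Python's standard `math` module, and a plain trapezoid-rule approximation of the antiderivative. The function names and the step count are choices made for this example only.

```python
import math


def y_closed_form(t, y0=0.0):
    """Solution of y' = exp(-t**2), y(0) = y0, written with the error function."""
    return y0 + 0.5 * math.sqrt(math.pi) * math.erf(t)


def y_trapezoid(t, y0=0.0, steps=10_000):
    """Approximate the same solution by the trapezoid rule on [0, t]."""
    h = t / steps
    total = 0.5 * (1.0 + math.exp(-t * t))  # endpoint values, each weighted 1/2
    for i in range(1, steps):
        s = i * h
        total += math.exp(-s * s)
    return y0 + h * total


print(y_closed_form(1.5), y_trapezoid(1.5))
```

The two values agree to many digits, even though neither is an "elementary" formula.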

The examples just given are too easy (even if $y' = e^{-t^2}$ doesn't look that easy), in the sense that they can all be solved, at least theoretically, by taking antiderivatives. The subject of differential equations deals with far more general situations, in which the unknown function $y$ appears on both sides of the equation:

$$y' = f(t, y)$$

or even much more general types: systems of many simultaneous equations, higher order derivatives, and even partial derivatives when there are other independent variables (which lead to "partial differential equations" and are the subject of more advanced courses).

One aspect of differential equations is comparatively easy: if someone gives us an alleged solution of an equation, we can check whether this is so. Checking is much easier than finding! (Analogy: if I ask you to find a solution of the algebraic equation $10000x^5 - 90000x^4 + 65100x^3 + 61460x^2 + 13812x + 972 = 0$ it may take you some time to find one. On the other hand, if I tell you that $x = 3/2$ is a root, you can check whether I am telling the truth or not very easily: just plug in and see if you get zero.) For example, if someone claims that the function

$$y = \frac{1}{1 + t^2}$$

is a solution of the equation $y' = -2ty^2$, we can check that she is right by plugging in:

$$\left(\frac{1}{1 + t^2}\right)' = -\frac{2t}{(1 + t^2)^2} = -2t\left(\frac{1}{1 + t^2}\right)^2.$$

But if someone claims that $y = 1/(1 + t)$ is a solution, we can prove him to be wrong:

$$\left(\frac{1}{1 + t}\right)' = -\frac{1}{(1 + t)^2} \neq -2t\left(\frac{1}{1 + t}\right)^2$$

because the last two functions of $t$ are not the same.
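Both kinds of checking can also be done by machine. The sketch below plugs $x = 3/2$ into the polynomial using exact rational arithmetic, and tests the two candidate solutions of $y' = -2ty^2$ by estimating $y'$ with a central difference at a few sample times. The helper names, the step size `h`, and the sample times are assumptions made for this illustration; a small numerical residual is evidence for a solution, not a proof.

```python
from fractions import Fraction


def poly(x):
    """Left-hand side of the algebraic equation from the text."""
    return (10000 * x**5 - 90000 * x**4 + 65100 * x**3
            + 61460 * x**2 + 13812 * x + 972)


def residual(y, t, h=1e-5):
    """y'(t) + 2*t*y(t)**2, with y' estimated by a central difference."""
    dy = (y(t + h) - y(t - h)) / (2 * h)
    return dy + 2 * t * y(t) ** 2


def good(t):
    return 1 / (1 + t**2)


def bad(t):
    return 1 / (1 + t)


root_value = poly(Fraction(3, 2))          # exact arithmetic, no rounding
sample_times = [0.0, 0.5, 1.0, 2.0]
good_residuals = [residual(good, t) for t in sample_times]
bad_residuals = [residual(bad, t) for t in sample_times]
print(root_value, good_residuals, bad_residuals)
```

For the true solution the residuals are at the level of rounding error, while for the wrong candidate they are visibly nonzero (already at $t = 0$).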

In this section, we discuss several simple examples of first-order differential equations.

Most applications of mathematics, and in particular, of differential equations, proceed as follows.

Starting from a "word problem" description of some observed behavior or characteristic of the real world, we attempt to formulate the simplest set of mathematical equations which capture the essential aspects. This set of equations represents a mathematical model of reality. The study of the model is then carried out using mathematical tools. The power of mathematics is that it allows us to draw quantitative and/or qualitative conclusions, and to make predictions about behaviors which may not have been an explicit part of the original word description, but which nonetheless follow logically from the model.

Sometimes, it may happen that the results of the mathematical study of the model turn out to be inconsistent with features found in the "real world" original problem. If this happens, we must modify and adapt the model, for example by adding extra terms, or changing the functions that we use, in order to obtain a better match. Good modeling, especially in science and engineering, is often the result of several iterations of the "model/reality-check/model" loop!

## 2  Unrestricted Population Growth

When dealing with the growth of a bacterial culture in a Petri dish, a tumor in an animal, or even an entire population of individuals of a given species, biologists often base their models on the following simple rule:

The increase in population during a small time interval of length $\Delta t$ is proportional to $\Delta t$ and to the size of the population at the start of the interval.

For example, statistically speaking, we might expect that one child will be born in any given year for each 100 people. The proportionality rule then says that two children per year are born for every 200 people, or that three children are born for each 100 people over three consecutive years. (To be more precise, the rate of increase should be thought of as the "net" rate, after subtracting population decreases.)

The rule is only valid for small intervals (small $\Delta t$), since for large $\Delta t$ one should also include compounding effects (children of the children), just as the interest which a bank gives us on savings (or charges us on loan balances) gets compounded, giving a higher effective rate.

Let us call $P(t)$ the number of individuals in the population at any given time $t$. The simplest way to translate into math the assumption that "the increase in population $P(t + \Delta t) - P(t)$ is proportional to $\Delta t$ and to $P(t)$" is to write

$$P(t + \Delta t) - P(t) = k\,P(t)\,\Delta t \tag{1}$$

for some constant $k$. Notice how this equation says that the increase $P(t + \Delta t) - P(t)$ is twice as big if $\Delta t$ is twice as big, or if the initial population $P(t)$ is twice as big.

Example: in the "one child per 100 people per year" rule, we would take $k = 10^{-2}$, if we are measuring the time $t$ in years. So, if at the start of 1999 we have a population of 100,000,000, then at the beginning of the year $2001 = 1999 + 2$ the population should be (use $\Delta t = 2$):

$$P(2001) = P(1999) + 10^{-2}\,P(1999)\,\Delta t = 10^8 + 10^{-2}\cdot 10^8\cdot 2 = 102{,}000{,}000$$

according to the formula. On the other hand, by the end of January 3rd, 1999, that is, with $\Delta t = 3/365$, we would estimate $P(1999 + 3/365) = 10^8 + 10^{-2}\cdot 10^8\cdot (3/365) \approx 100{,}008{,}219$ individuals. Of course, there will be random variations, but on average, such formulas turn out to work quite well.

The equation (1) can only be accurate if $\Delta t$ is small, since it does not allow for the "compound interest" effect. On the other hand, one can view (1) as specifying a step-by-step difference equation as follows. Pick a "small" $\Delta t$, let us say $\Delta t = 1$, and consider the following recursion:

$$P(t + 1) = P(t) + k\,P(t) = (1 + k)\,P(t) \tag{2}$$

for $t = 0, 1, 2, \ldots$. Then we compute $P(2)$ not as $P(0) + 2kP(0)$, but by recursively applying the rule: $P(2) = (1 + k)P(1) = (1 + k)^2 P(0)$. This allows us to incorporate the compounding effect. It has the disadvantage that we cannot talk about $P(t)$ for fractional $t$, but we could avoid that problem by picking a smaller scale for time (for example, days). A more serious disadvantage is that it is hard to study difference equations using the powerful techniques from calculus. Calculus deals with things such as rates of change (derivatives) much better than with finite increments. Therefore, what we will do next is to show how the problem can be reformulated in terms of a differential equation. This is not to say that difference equations are not interesting, however. It is just that differential equations can be more easily studied mathematically.
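The difference between applying (1) once over two years and applying the recursion (2) twice can be seen with the numbers of the example ($k = 10^{-2}$, 100,000,000 people). This is a minimal sketch; the function names are invented for the illustration.

```python
def one_step_estimate(p0, k, dt):
    """Equation (1) applied once over an interval of length dt."""
    return p0 + k * p0 * dt


def compounded(p0, k, steps):
    """Recursion (2): apply P(t+1) = (1 + k) * P(t) `steps` times."""
    p = p0
    for _ in range(steps):
        p = (1 + k) * p
    return p


p0, k = 100_000_000, 0.01
two_year_linear = one_step_estimate(p0, k, 2)
two_year_compounded = compounded(p0, k, 2)
print(two_year_linear, two_year_compounded)
```

The compounded value is larger by $k^2 P(0)$, the "children of the children" term that the one-step formula leaves out.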

If you think about it, you have seen many good examples of the fact that using derivatives and calculus is useful even for problems that seem not to involve derivatives. For example, if you want to find an integer $t$ such that $t^2 - 189t + 17$ is as small as possible, you could try enumerating all possible integers (!), or you could instead pretend that $t$ is a real number and minimize $t^2 - 189t + 17$ by setting the derivative to zero: $2t - 189 = 0$ and easily finding the answer $t = 94.5$, which then leads you, since you wanted an integer, to $t = 94$ or $t = 95$.
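The two approaches can be compared directly. In the sketch below the brute-force search is restricted to the integers from $-1000$ to $1000$, an arbitrary window chosen for the illustration (a true enumeration of all integers is of course impossible).

```python
def q(t):
    return t * t - 189 * t + 17


# Calculus: the real minimizer is t = 189/2; try the integers on either side.
t_star = 189 / 2
candidates = [int(t_star), int(t_star) + 1]
best_by_calculus = min(candidates, key=q)

# Brute force over a finite window of integers.
best_by_search = min(range(-1000, 1001), key=q)

print(candidates, q(candidates[0]), q(candidates[1]), best_by_search)
```

Because the real minimizer lies exactly halfway between 94 and 95, both integers give the same value of the quadratic.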

Back to our population problem, in order to use calculus, we must allow $P$ to be any real number (even though, in population studies, only integers $P$ would make sense), and we must also allow the time $t$ to be any real number. Let us see where equation (1) leads us. If we divide by $\Delta t$, we have

$$\frac{P(t + \Delta t) - P(t)}{\Delta t} = k\,P(t).$$

This equation holds for small $\Delta t$, so we may let $\Delta t \to 0$. What is the limit of $\frac{P(t + \Delta t) - P(t)}{\Delta t}$ as $\Delta t \to 0$? It is, as you remember from Calculus I (yes, you do), the derivative of $P$ evaluated at $t$. So we end up with our first differential equation:

$$P'(t) = k\,P(t). \tag{3}$$

This is the differential equation for population growth. We may read it like this:

The rate of change of $P$ is proportional to $P$.

The solution of this differential equation is easy: since $P'(t)/P(t) = k$, the chain rule tells us that

$$(\ln P(t))' = k,$$

and so we conclude that $\ln P(t) = kt + c$ for some constant $c$. Taking exponentials of both sides, we deduce that $P(t) = e^{kt + c} = Ce^{kt}$, where $C$ is the new constant $e^c$. Evaluating at $t = 0$ we have that $P(0) = Ce^0 = C$, and we therefore conclude:

$$P(t) = P(0)\,e^{kt}.$$

(Actually, we cheated a little, because $P'/P$ doesn't make sense if $P = 0$, and also because if $P$ is negative then we should have used $\ln(-P(t))$. But one can easily prove that the formula $P(t) = P(0)e^{kt}$ is always valid. In any case, for population problems, $P$ is positive.)
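One standard way to fill in the "easily prove" step, offered here as a sketch and not necessarily the argument the notes have in mind, avoids dividing by $P$ altogether. If $P$ is any solution of (3), differentiate the product $P(t)e^{-kt}$:

$$\frac{d}{dt}\left(P(t)\,e^{-kt}\right) = \left(P'(t) - k\,P(t)\right)e^{-kt} = 0.$$

So $P(t)e^{-kt}$ is constant, equal to its value $P(0)$ at $t = 0$, and therefore $P(t) = P(0)e^{kt}$ whether $P(0)$ is positive, negative, or zero.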

Which is better in practice, to use the difference equation (2) or the differential equation (3)? It is hard to say: the answer depends on the application. Mathematically, differential equations are usually easier to analyze, although sometimes, as when we study chaotic behavior in simple one-dimensional systems, difference equations may give great insight. Also, we often use difference equations as a basis of numerical techniques which allow us to find an approximation of the solution of a differential equation. For example, Euler's method, which we will meet soon, basically reverses the process of going from (1) to (3).
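Euler's method is only named at this point and is treated later, so the following is just a rough sketch of the idea under the assumption that it amounts to stepping forward with equation (1) using a small $\Delta t$. The parameter values are arbitrary, and the exact solution $P(0)e^{kt}$ is used for comparison.

```python
import math


def euler_growth(p0, k, t_end, steps):
    """Apply P(t + dt) = P(t) + k*P(t)*dt repeatedly, with dt = t_end/steps."""
    dt = t_end / steps
    p = p0
    for _ in range(steps):
        p = p + k * p * dt
    return p


p0, k, t_end = 100.0, 0.5, 4.0
exact = p0 * math.exp(k * t_end)
errors = [abs(euler_growth(p0, k, t_end, n) - exact) for n in (10, 100, 1000)]
print(exact, errors)
```

As the number of steps grows (that is, as $\Delta t$ shrinks), the values produced by the difference equation approach the solution of the differential equation.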

Let us now look at some more examples of differential equations.

## 3  Limits to Growth: Logistic Equation

Often, there are limits imposed by the environment on the maximal possible size of a population: not enough nutrients for a large bacterial culture, insufficient food for the human population of an island, or a small hunting territory for a given animal species. Ecologists talk about the carrying capacity of the environment, a number $N$ with the property that no populations $P > N$ are sustainable. If the population starts bigger than $N$, the number of individuals will decrease. To come up with an equation that represents this situation, we follow the same steps that we did before, except that now $P(t + \Delta t) - P(t)$ should be negative if $P(t) > N$. In other words, we have $P(t + \Delta t) - P(t) = f(P(t))\,\Delta t$, where $f(P)$ is not just "$kP$" but should instead be a more complicated expression involving $P$, with the properties that:

• $f(0) = 0$ (no increase in the population if there is no one around to start with!),

• $f(P) > 0$ when $0 < P < N$ (the population increases while there are enough resources), and

• $f(P) < 0$ when $P > N$.

Taking limits just like we did before, we arrive at the differential equation:

$$P'(t) = f(P(t)).$$

From now on, we will drop the "$t$" when it is obvious, and use the shorthand notation $P' = f(P)$ instead of the messier $P'(t) = f(P(t))$. We must still decide what function "$f$" is appropriate. Because of the properties wanted ($f(0) = 0$, $f(P) > 0$ when $0 < P < N$, $f(P) < 0$ when $P > N$), the graph of $f$ as a function of $P$ should look more or less like this:

The simplest choice is a parabola which opens downward and has zeroes at $P = 0$ and $P = N$: $f(P) = -cP(P - N)$, with $c > 0$, or, with $k = cN$, $f(P) = kP(1 - P/N)$. We arrive in this way at the logistic population model

$$P' = k\,P\left(1 - \frac{P}{N}\right). \tag{4}$$

(Remember: this is shorthand for $P'(t) = kP(t)(1 - P(t)/N)$.)

## 4  Solution of Logistic Equation

Like $P' = kP$, equation (4) is one of those (comparatively few) equations which can actually be solved in closed form. To solve it, we do almost the same as we did with $P' = kP$ (this is an example of the method of separation of variables): we write the equation as $dP/dt = kP(1 - P/N)$, formally multiply both sides by $dt$ and divide by $P(1 - P/N)$, arriving at

$$\frac{dP}{P(1 - P/N)} = k\,dt.$$

Next we take antiderivatives of both sides, obtaining

$$\int \frac{dP}{P(1 - P/N)} = \int k\,dt.$$

The left-hand side can be evaluated using partial fractions:

$$\frac{1}{P(1 - P/N)} = \frac{N}{P(N - P)} = \frac{1}{P} + \frac{1}{N - P},$$

so

$$\ln P - \ln(N - P) + c_1 = kt + c_2$$

for some constants $c_1$ and $c_2$, or, with $c = c_2 - c_1$,

$$\ln\left(\frac{P}{N - P}\right) = kt + c \tag{5}$$

and, taking exponentials of both sides,

$$\frac{P}{N - P} = C e^{kt} \tag{6}$$

with $C = e^c$. This is an algebraic equation for $P$, but we can go a little further and solve explicitly:

$$P = C e^{kt}(N - P) \;\Rightarrow\; C e^{kt} P + P = C e^{kt} N \;\Rightarrow\; P = \frac{C e^{kt} N}{C e^{kt} + 1} = \frac{N}{1 + \frac{1}{C} e^{-kt}}.$$

Finally, to find $C$, we can evaluate both sides of equation (6) at $t = 0$:

$$C = \frac{P(0)}{N - P(0)}$$

and therefore conclude that

$$P(t) = \frac{P(0)\,N}{P(0) + [N - P(0)]\,e^{-kt}}. \tag{7}$$

Observe that, since $e^{-kt} \to 0$ as $t \to \infty$ (for $k > 0$), $P(t) \to N$, which is not surprising. (Why?)
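Formula (7) is easy to evaluate on a computer, and, in the spirit of "checking is easier than finding", one can also test numerically that it satisfies equation (4). The sketch below does both; the values of $k$, $N$ and $P(0)$ are arbitrary choices for the illustration, and the derivative is estimated by a central difference.

```python
import math


def logistic(t, p0, k, n):
    """Formula (7): P(t) = P(0) N / (P(0) + [N - P(0)] e^{-kt})."""
    return p0 * n / (p0 + (n - p0) * math.exp(-k * t))


def ode_residual(t, p0, k, n, h=1e-5):
    """P'(t) - k P (1 - P/N), with P' estimated by a central difference."""
    dp = (logistic(t + h, p0, k, n) - logistic(t - h, p0, k, n)) / (2 * h)
    p = logistic(t, p0, k, n)
    return dp - k * p * (1 - p / n)


k, n = 0.8, 1000.0
# Values of P at t = 0, 5, 10, 15, 20 for two starting populations below N.
table = {p0: [logistic(t, p0, k, n) for t in range(0, 21, 5)]
         for p0 in (10.0, 500.0)}
print(table)
```

Feeding lists like these to any plotting tool shows the characteristic S-shaped curves leveling off at $N$.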

Homework assignment: use a computer to plot several solutions of the equation, for various values of N and of P(0).

## 5  Some "Small-Print Legal Disclaimers"

(You may want to skip this section in a first reading.)

We cheated a bit when deriving the solution for the logistic equation. First of all, we went a bit too fast over the "multiply by $dt$" business. What is the meaning of multiplying by a differential? Well, it turns out that it is OK to do this, because what we did can be interpreted as, basically, just a way of applying (backwards) the chain rule. Let us justify the above steps without using differentials. Starting from the differential equation (4), we can write, assuming that $P \neq 0$ and $P \neq N$ (so that we are not dividing by zero):

$$\frac{P'}{P(1 - P/N)} = k. \tag{8}$$

Now, one antiderivative of $\frac{1}{P(1 - P/N)}$, as a function of $P$, is the function $Q(P) = \ln\left(\frac{P}{N - P}\right)$ (let us suppose that $0 < P < N$, so the expression inside the log is positive). So, the chain rule says that

$$\frac{d\,Q(P(t))}{dt} = \frac{dQ}{dP}\,\frac{dP}{dt} = \frac{1}{P(1 - P/N)}\,P'(t).$$

Therefore, equation (8) gives us that

$$\frac{d\,Q(P(t))}{dt} = k$$

from which we then conclude, by taking antiderivatives, that

$$Q(P(t)) = kt + c$$

which is exactly the same as equation (5), which had been obtained before using differentials. In general, we can always justify "separation of variables" solutions in this manner, but from now on we will skip this step and use the formal method.

There is still a small gap in our arguments, namely we assumed that $P \neq 0$ and that $P \neq N$ (so that we were not dividing by zero) and also $N > P$, so the expression inside the log was positive. We'll see later, when we cover uniqueness of solutions, that, because $P = 0$ and $P = N$ are equilibria of the system, any solution that starts with $P(0) > N$ will always have $P(t) > N$, and a similar property is true for each of the intervals $P < 0$ and $0 < P < N$. So we can treat each of the cases separately.

If $N < P$, then the antiderivative is $\ln\left|\frac{P}{N - P}\right|$ (that is, we use absolute values). But this doesn't change the general solution. All it means is that equation (6) becomes

$$\left|\frac{P}{N - P}\right| = C e^{kt}$$

which can also be written as in (6) but with $C$ negative. We can treat the case $P < 0$ in the same way.

Finally, the exceptional cases when $P$ could be zero or $N$ are taken care of once we notice that the general solution (7) makes sense when $P(0) = 0$ (we get $P \equiv 0$) or when $P(0) = N$ (we get $P \equiv N$).
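These special cases can be tried out by evaluating formula (7) directly. In the sketch below, $k$, $N$, the sample times and the starting value 1500 are arbitrary illustrative choices.

```python
import math


def logistic(t, p0, k, n):
    """Formula (7) from the text."""
    return p0 * n / (p0 + (n - p0) * math.exp(-k * t))


k, n = 0.8, 1000.0
times = [0.0, 1.0, 2.0, 5.0]

from_zero = [logistic(t, 0.0, k, n) for t in times]       # P(0) = 0
from_capacity = [logistic(t, n, k, n) for t in times]     # P(0) = N
from_above = [logistic(t, 1500.0, k, n) for t in times]   # P(0) > N

print(from_zero, from_capacity, from_above)
```

Starting at $0$ or at $N$ the formula returns a constant, and starting above $N$ it returns values that decrease toward $N$ without ever crossing it.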

## 6  Equilibria

Observe that if, for some time $t_0$, it happens that $P(t_0) = 0$, then the right-hand side of the differential equation (4) becomes zero, so $P'(t_0) = 0$, which means that the solution cannot "move" from that point. So the value $P = 0$ is an equilibrium point for the equation: a value with the property that if we start there, then we stay there forever. This is not a particularly deep conclusion: if we start with zero population we stay with zero population. Another root of the right-hand side is $P = N$. If $P(t_0) = N$ then $P'(t_0) = 0$, so if we start with exactly $N$ individuals, the population also remains constant, this time at $N$. Again, this is not surprising, since the model was derived under the assumption that populations larger than $N$ decrease and populations less than $N$ increase.

In general, for any differential equation of the form $y' = f(y)$, we say that a point $y = a$ is an equilibrium if $a$ is a root of $f$, that is, $f(a) = 0$. This means that if we start at $y = a$, we cannot move away from $y = a$. Or, put in a different way, the constant function $y(t) \equiv a$ is a solution of $y' = f(y)$ (because $y'(t) = a' \equiv 0$ and also $f(y(t)) = f(a) = 0$). One says also that the constant function $y(t) = a$ is an equilibrium solution of $y' = f(y)$.

The analysis of equilibria allows us to obtain a substantial amount of information about the solutions of a differential equation of the type $y' = f(y)$ with very little effort, in fact without even having to solve the equation. (For "nonautonomous" equations, when $t$ appears in the right-hand side: $y' = f(t, y)$, this method doesn't quite work, because we need to plot $f$ against two variables. The technique of slope fields is useful in that case.) The fundamental fact that we need is that (assuming that $f$ is a differentiable function) no trajectory can pass through an equilibrium: if we are ever at an equilibrium, we must have always been there and we will remain there forever. This will be explained later, when covering uniqueness of solutions.
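As a small illustration of how much the sign of $f$ alone tells us, the sketch below takes the logistic right-hand side $f(P) = kP(1 - P/N)$ from equation (4) and reports, for a few sample values, whether a solution sitting at that value is increasing, decreasing, or at an equilibrium. The values of $k$ and $N$, the sample points, and the function names are assumptions made for the example.

```python
def logistic_rhs(p, k=0.8, n=1000.0):
    """Right-hand side f(P) = k P (1 - P/N) of equation (4)."""
    return k * p * (1 - p / n)


def direction(f, y):
    """Direction in which a solution of y' = f(y) moves when it is at y."""
    value = f(y)
    if value > 0:
        return "increasing"
    if value < 0:
        return "decreasing"
    return "equilibrium"


samples = [-100.0, 0.0, 250.0, 1000.0, 1200.0]
phase_line = {y: direction(logistic_rhs, y) for y in samples}
print(phase_line)
```

Between two consecutive equilibria the direction never changes, which is why a solution starting between them must move monotonically toward one of them.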

For example, suppose that we know that the plot of f(y) against y looks like this:

where we labeled the points where $f(y)$ has roots, that is to say, the equilibria of $y' = f(y)$.

We can conclude that any solution $y(t)$ of $y' = f(y)$ which starts just to the right of $A$ will move rightwards, because $f(y)$ is positive for all points between $A$ and $B$, and so $y' > 0$. Moreover, we cannot cross the equilibrium $B$, so any such trajectory stays in the interval $(A, B)$ and, as $t$ increases, it approaches asymptotically the point $B$. To summarize, if $y(0) = y_0$ with $y_0 \in (A, B)$, then the graph of the solution $y(t)$ of $y' = f(y)$ must look more or less like this:

Homework assignment: For the same function $f$ shown above, give an approximate plot of a solution of $y' = f(y)$ for which $y(0) \in (B, C)$. Repeat with $y(0) \in (C, D)$ and with $y(0) \in (D, E)$.

## 7  More Examples

Let us discuss some more easy examples.

#### Populations under Harvesting

Let us go back to the logistic equation (4),

$$P' = k\,P\left(1 - \frac{P}{N}\right),$$

which describes population growth under environmental constraints. Suppose that $P(t)$ represents the population of a species of fish, and that fishing removes a certain number $K$ of fish each unit of time. This means that there will be a term in $P(t + \Delta t) - P(t)$ equal to $-K\,\Delta t$. When we divide by $\Delta t$ and take limits, we arrive at the equation for resources under constant harvesting:

$$P' = k\,P\left(1 - \frac{P}{N}\right) - K.$$

Many variations are possible. For example, it is more realistic to suppose that a certain proportion of fish are caught per unit of time (the more fish, the easier to catch). This means that, instead of a term $-K\,\Delta t$ for how many fish are taken away in an interval of length $\Delta t$, we'd now have a term of the form $-KP(t)\,\Delta t$, which is proportional to the population. The differential equation that we obtain is now $P' = kP(1 - P/N) - KP$. Or, if only fish near the surface can be caught, the proportion of fish caught per unit of time may depend on the power $P^{2/3}$ (do you understand why? are you sure?). This would give us the equation $P' = kP(1 - P/N) - KP^{2/3}$.
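The equilibrium analysis of the previous section applies directly to the constant-harvesting equation $P' = kP(1 - P/N) - K$. Setting the right-hand side to zero gives the quadratic $(k/N)P^2 - kP + K = 0$, whose roots are $P = \frac{N}{2}\left(1 \pm \sqrt{1 - 4K/(kN)}\right)$. This computation is not carried out in the notes; the sketch below is an illustration with made-up parameter values, and the argument name `harvest` stands for the constant $K$.

```python
import math


def harvest_equilibria(k, n, harvest):
    """Roots of k P (1 - P/N) - K = 0, i.e. (k/N) P^2 - k P + K = 0."""
    disc = 1 - 4 * harvest / (k * n)
    if disc < 0:
        return []  # no equilibrium: the right-hand side is negative for every P
    root = math.sqrt(disc)
    return sorted({n / 2 * (1 - root), n / 2 * (1 + root)})


def rhs(p, k, n, harvest):
    """Right-hand side of the constant-harvesting equation."""
    return k * p * (1 - p / n) - harvest


moderate = harvest_equilibria(0.8, 1000.0, 150.0)
excessive = harvest_equilibria(0.8, 1000.0, 250.0)
print(moderate, excessive)
```

When $K$ exceeds $kN/4$ there are no equilibria at all, and $P' < 0$ for every $P$: in this model the population declines no matter where it starts.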

#### Epidemics

The spread of epidemics is another example whose study can be carried out using differential equations. Suppose that $S(t)$ counts the number of individuals infected with a certain virus at time $t$, that $H(t)$ counts the healthy individuals, and that people mix randomly and get infected from each other if they happen to be close. One model is as follows. The increase in the number of infected individuals $S(t + \Delta t) - S(t)$ during a time interval of length $\Delta t$ is proportional to the number of close encounters between sick and healthy individuals, that is, to $S(t)H(t)\,\Delta t$, because $S(t)H(t)$ is the total number of pairs of (sick, healthy) individuals, and the longer the interval, the more chances of meeting. Taking limits as usual, we arrive at $S'(t) = kS(t)H(t)$, where $k$ is some constant. If the total number of individuals is $N$, then $H(t) = N - S(t)$, and the equation becomes:

$$S' = k\,S(t)\,(N - S(t))$$
which is very similar, it turns out, to the logistic equation. There are many variations of this idea. For instance, if in every $\Delta t$ time interval a certain proportion of infected individuals get cured, we'd have an additional term $-rS(t)$, for some positive constant $r$.
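The similarity can be made precise: $kS(N - S) = (kN)\,S\,(1 - S/N)$, so the basic epidemic model is the logistic equation (4) with growth constant $kN$, and formula (7) applies with $k$ replaced by $kN$. The sketch below checks this numerically by stepping the epidemic equation forward in small time steps (a crude step-by-step scheme, with arbitrary parameter values) and comparing against the formula.

```python
import math


def logistic(t, p0, rate, n):
    """Formula (7) with growth constant `rate`."""
    return p0 * n / (p0 + (n - p0) * math.exp(-rate * t))


def simulate_infected(s0, k, n, t_end, steps=100_000):
    """Small forward steps for S' = k S (N - S)."""
    dt = t_end / steps
    s = s0
    for _ in range(steps):
        s += k * s * (n - s) * dt
    return s


s0, k, n, t_end = 5.0, 0.002, 500.0, 10.0
simulated = simulate_infected(s0, k, n, t_end)
predicted = logistic(t_end, s0, k * n, n)
print(simulated, predicted)
```

In this basic model (no cures), the number of infected individuals approaches the whole population $N$.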

#### Chemical Reactions

Chemical reactions also give rise to similar models. Let us say that there are two reactants $A$ and $B$, which may combine to give $C$ via $A + B \to C$ (for each molecule of $A$ and $B$, we obtain a molecule of $C$). If the chemicals are well-mixed, the chance of two molecules combining is proportional to how many pairs there are and to the length of time elapsed (just like with the infection model, molecules need to get close enough to react). So $c'(t) = k\,a(t)\,b(t)$, where $a(t)$ is the amount of $A$ at time $t$ and $b(t)$ the amount of $B$. If we start with amounts $a_0$ and $b_0$ respectively, and we have $c(t)$ molecules of $C$ at time $t$, this means that $a(t) = a_0 - c(t)$ and $b(t) = b_0 - c(t)$, since one molecule of $A$ and one of $B$ were used up for each molecule of $C$ that was produced. So the equation becomes

$$c' = k\,(a_0 - c)(b_0 - c).$$
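To get a feeling for the behavior of this equation, one can step it forward in small time increments. The sketch below assumes that no product is present initially, $c(0) = 0$, and uses arbitrary values for $a_0$, $b_0$ and $k$; the function name is made up for the illustration.

```python
def simulate_product(a0, b0, k, t_end, steps=100_000):
    """Small forward steps for c' = k (a0 - c)(b0 - c), starting from c(0) = 0."""
    dt = t_end / steps
    c = 0.0
    for _ in range(steps):
        c += k * (a0 - c) * (b0 - c) * dt
    return c


a0, b0, k = 3.0, 5.0, 0.4
c_early = simulate_product(a0, b0, k, 0.5)
c_late = simulate_product(a0, b0, k, 20.0)
print(c_early, c_late)
```

The amount of product increases and levels off at the smaller of $a_0$ and $b_0$, which is an equilibrium of the equation: the reaction stops when the scarcer reactant runs out.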

#### Air Resistance

As a last example, let us take a body moving in air (or another fluid). For low speeds, air resistance (drag) is proportional to the speed of the object, and acts to slow down the object; in other words, it acts as a force of magnitude $k|v|$, in a direction opposite to movement, where $|v|$ is the absolute value of the velocity. Suppose that a body is falling towards the earth, and let us take "down" as the positive direction of movement. In that case, Newton's "$F = ma$" law says that the mass times the acceleration $v'$ is equal to the total force on the body, namely $mg$ (its weight) plus the effect of drag, which is $-kv$ (because the force acts opposite to the direction of movement):

$$mv' = mg - kv.$$

For large velocities, drag is often modeled more accurately by a quadratic effect $-kv^2$ in a direction opposite to movement. This would lead to an equation like $mv' = mg - kv^2$ for the velocity of a falling object.
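Both drag models have an equilibrium velocity, usually called the terminal velocity, at which weight and drag balance: $v = mg/k$ for linear drag and $v = \sqrt{mg/k}$ for quadratic drag. The sketch below computes these and checks, by stepping the linear-drag equation forward from rest, that the velocity does settle there. The numerical values of $m$, $g$ and $k$ are arbitrary illustrative choices, not data from the notes.

```python
import math


def terminal_velocity_linear(m, g, k):
    """Equilibrium of m v' = m g - k v."""
    return m * g / k


def terminal_velocity_quadratic(m, g, k):
    """Positive equilibrium of m v' = m g - k v**2."""
    return math.sqrt(m * g / k)


def simulate_fall(m, g, k, t_end, steps=100_000):
    """Small forward steps for m v' = m g - k v, starting from rest."""
    dt = t_end / steps
    v = 0.0
    for _ in range(steps):
        v += (g - k * v / m) * dt
    return v


m, g, k = 2.0, 9.8, 0.5
v_final = simulate_fall(m, g, k, 60.0)
print(v_final, terminal_velocity_linear(m, g, k),
      terminal_velocity_quadratic(m, g, k))
```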

#### Newton's Law of Cooling

The temperature inside a building is assumed to be uniform (same in every room) and is given by y(t) as a function of the time t. The outside air is at temperature a(t), which also depends on the time of the day, and there is a furnace which supplies heat at a rate h(t) (or, for negative h, an air-conditioning unit which removes heat at that rate). What is the temperature in the building? Newton's law of cooling tells us that the rate of change of temperature dy/dt will depend on the difference between the inside and outside temperatures (the greater the difference, the faster the change), with a term added to model the effect of the furnace:

$$mc\,y'(t) = -k\,(y(t) - a(t)) + h(t),$$

where the mass of air in the building is the constant $m$ (no windows can be opened, and doors are usually tightly closed, being opened rarely and briefly, so we assume that $m$ is a constant), $c$ is a positive constant (the heat capacity), and $k$ is another positive constant (which is determined by insulation, building layout, etc.).
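When the outside temperature $a$ and the heating rate $h$ are both constant, the equation has the single equilibrium $y = a + h/k$ (set the right-hand side to zero). The sketch below steps the equation forward in small time increments and compares the result with this steady temperature. All numerical values, and the choice of passing $a$ and $h$ as Python functions of $t$, are assumptions made for the illustration.

```python
def simulate_temperature(y0, a, h, m, c, k, t_end, steps=100_000):
    """Small forward steps for m c y' = -k (y - a(t)) + h(t)."""
    dt = t_end / steps
    y, t = y0, 0.0
    for _ in range(steps):
        y += (-k * (y - a(t)) + h(t)) / (m * c) * dt
        t += dt
    return y


def steady_temperature(a_const, h_const, k):
    """Equilibrium when a and h are constant: -k (y - a) + h = 0."""
    return a_const + h_const / k


# Constant outside temperature 5 and constant heating rate 750 (made-up units).
y_final = simulate_temperature(10.0, lambda t: 5.0, lambda t: 750.0,
                               m=1000.0, c=1.0, k=50.0, t_end=400.0)
print(y_final, steady_temperature(5.0, 750.0, 50.0))
```

Replacing the constant functions by, say, a periodic outside temperature gives the time-varying situations described in the text.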

#### Mixing Problems

(See book for this.)

## 8  Homework Problem

You should match the following word descriptions and differential equations.
More than one equation may match a description, and vice versa.

Descriptions:

1. The rate of change of the population of a certain country, which depends on the birth and death rates as well as on the number of immigrants, who arrive at a constant rate into the country.

2. The rate of change of the population of a certain country, which depends on the birth and death rates, but there is a net emigration from the country (at a constant rate).

3. Fish in a certain area, which reproduce in proportion to the population, subject to limits imposed by the carrying capacity of the environment, and the population of which is also reduced by fishing which proceeds at a constant rate.

4. The temperature of a building, when the outside temperature varies periodically (it goes down during the night, up during the day) and there is no heating or air-conditioning.

5. The temperature of a building, when the outside temperature varies periodically (it goes down during the night, up during the day) and heating is being applied at a constant rate.

6. The temperature of a building, when the outside temperature is constant, and there is no heating or air-conditioning.

7. The temperature of a building, when the outside temperature is constant, and heating is being applied at a constant rate.

8. The amount of money in a savings account, when interest is compounded continuously, and also additional money is being added at a constant rate (the person always deposits a certain percentage of her paycheck).

9. The rate of change of the volume of a raindrop, which evaporates at a rate proportional to its surface area.

10. The rate of change of the volume of a raindrop, which evaporates at a rate proportional to its diameter.

11. The mass of a radioactive substance which is decaying (at a rate proportional to the amount present).

12. The amount of chlorine in a swimming pool; chlorinated water is added at a fixed rate, the water in the pool is well-mixed, and water is being removed from the pool so that the total volume is constant.

Equations (all constants are positive):