We have derived the density of the chi-squared variable with one degree of freedom, see also Example 3.52, J. Abdey, Guide ST2133.

General chi-squared

For $\chi^2_n=z_1^2+\dots+z_n^2$ with independent standard normals $z_1,\dots,z_n$ we can write $\chi^2_n=\chi^2_1+\dots+\chi^2_1$, where the chi-squared variables on the right are independent and all have one degree of freedom. This is because deterministic (here quadratic) functions of independent variables are independent.
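The decomposition can be checked by simulation. The sketch below is only an illustration: the function names, the seed and the sample size are my own choices. It draws sums of squared independent standard normals and compares the sample mean and variance with the known values for a chi-squared variable with n degrees of freedom (mean n, variance 2n).

```python
import random


def chi2_sample(n, rng):
    """One draw of z_1^2 + ... + z_n^2 with independent standard normals z_i."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


def sample_mean_var(n, draws=20000, seed=0):
    """Sample mean and unbiased sample variance of `draws` simulated values."""
    rng = random.Random(seed)
    xs = [chi2_sample(n, rng) for _ in range(draws)]
    mean = sum(xs) / draws
    var = sum((x - mean) ** 2 for x in xs) / (draws - 1)
    return mean, var


mean, var = sample_mean_var(n=5)
print(mean, var)  # should be close to 5 and 10, up to simulation noise
```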

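Definition. The gamma distribution is a two-parametric family of densities. For $\alpha>0,\ \nu>0$ the density is defined by

$$f_{\alpha,\nu}(x)=\frac{1}{\Gamma(\nu)}\alpha^{\nu}x^{\nu-1}e^{-\alpha x},\quad x>0.$$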

Obviously, you need to know what the gamma function is. My notation of the parameters follows W. Feller, An Introduction to Probability Theory and Its Applications, Volume II, 2nd edition (1971). It differs from the one used by J. Abdey.

Property 1

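It is really a density because

$$\int_0^\infty \frac{1}{\Gamma(\nu)}\alpha^{\nu}x^{\nu-1}e^{-\alpha x}\,dx=\frac{1}{\Gamma(\nu)}\int_0^\infty t^{\nu-1}e^{-t}\,dt=1$$

(replace $\alpha x=t$).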

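Suppose you see an expression $x^{a}e^{-bx}$ and need to determine which gamma density this is. The power of the exponent gives you $\alpha=b$ and the power of $x$ gives you $\nu=a+1$. It follows that the normalizing constant should be $\frac{1}{\Gamma(a+1)}b^{a+1}$ and the density is $\frac{1}{\Gamma(a+1)}b^{a+1}x^{a}e^{-bx},\ x>0.$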
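Here is a small numerical sketch of this recipe. It assumes Feller's parametrization, in which the density is $\alpha^{\nu}x^{\nu-1}e^{-\alpha x}/\Gamma(\nu)$ for $x>0$. The kernel $x^2e^{-3x}$, the helper names and the integration grid are hypothetical choices made only for illustration.

```python
import math


def gamma_density(x, alpha, nu):
    """Gamma density, assumed parametrization:
    alpha**nu * x**(nu - 1) * exp(-alpha * x) / Gamma(nu) for x > 0."""
    if x <= 0:
        return 0.0
    return alpha ** nu * x ** (nu - 1) * math.exp(-alpha * x) / math.gamma(nu)


def params_from_kernel(a, b):
    """For the kernel x**a * exp(-b * x) read off alpha = b and nu = a + 1."""
    return b, a + 1


def integrate(f, lo, hi, steps=100000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


alpha, nu = params_from_kernel(a=2.0, b=3.0)  # kernel x^2 * e^{-3x}
const = alpha ** nu / math.gamma(nu)          # normalizing constant 27 / 2
total = integrate(lambda x: gamma_density(x, alpha, nu), 0.0, 30.0)
print(alpha, nu, const, total)                # total should be close to 1
```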

Property 2

The most important property is that the family of gamma densities with the same $\alpha$ is closed under convolutions: $f_{\alpha,\mu}*f_{\alpha,\nu}=f_{\alpha,\mu+\nu}$. Because of the associativity property it is enough to prove this for the case of two gamma densities.

Alternative proof. The moment generating function of a sum of two independent gamma-distributed variables with the same $\alpha$ shows that this sum is again gamma distributed with the same $\alpha$; see pp. 141, 209 in the guide.
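The closure property can also be checked numerically. This is a sanity check, not a proof. The sketch assumes Feller's parametrization of the gamma density; the values of alpha, mu and nu below are arbitrary, and the function names are mine.

```python
import math


def gamma_density(x, alpha, nu):
    """Gamma density, assumed parametrization:
    alpha**nu * x**(nu - 1) * exp(-alpha * x) / Gamma(nu) for x > 0."""
    if x <= 0:
        return 0.0
    return alpha ** nu * x ** (nu - 1) * math.exp(-alpha * x) / math.gamma(nu)


def convolution_at(z, alpha, mu, nu, steps=20000):
    """Midpoint-rule value of the convolution of the two gamma densities at z.
    Both densities vanish on the negative half-axis, so x runs over (0, z)."""
    h = z / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += gamma_density(x, alpha, mu) * gamma_density(z - x, alpha, nu)
    return h * total


alpha, mu, nu = 2.0, 2.0, 3.0
for z in (0.5, 1.0, 2.5):
    # The two printed numbers should agree up to the integration error.
    print(z, convolution_at(z, alpha, mu, nu), gamma_density(z, alpha, mu + nu))
```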

The gamma function and gamma distribution are two different things. This post is about the former and is a preparatory step to study the latter.

Definition. The gamma function is defined by

$$\Gamma(t)=\int_0^\infty x^{t-1}e^{-x}\,dx.$$

The integrand $x^{t-1}e^{-x}$ is smooth on $(0,\infty)$, so its integrability is determined by its behavior at $\infty$ and $0$. Because of the exponent, it is integrable in the neighborhood of $\infty$. The singularity at $0$ is integrable if $t>0$. In all calculations involving the gamma function one should remember that its argument should be positive.

Properties

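1) Factorial-like property. Integration by parts shows that

$$\Gamma(t)=\int_0^\infty x^{t-1}e^{-x}\,dx=-x^{t-1}e^{-x}\Big|_0^\infty+(t-1)\int_0^\infty x^{t-2}e^{-x}\,dx=(t-1)\Gamma(t-1)$$

if $t>1$.

2) $\Gamma(1)=1$ because $\Gamma(1)=\int_0^\infty e^{-x}\,dx=1$.

3) Combining the first two properties we see that for a natural $n$

$$\Gamma(n+1)=n\Gamma(n)=\dots=n(n-1)\cdots 1\cdot\Gamma(1)=n!.$$

Thus the gamma function extends the factorial to non-integer $t>0$.

4) $\Gamma(1/2)=\sqrt{\pi}$.

Indeed, using the density of the standard normal we see that

$$\Gamma(1/2)=\int_0^\infty x^{-1/2}e^{-x}\,dx=\sqrt{2}\int_0^\infty e^{-y^2/2}\,dy=\sqrt{2}\cdot\frac{\sqrt{2\pi}}{2}=\sqrt{\pi}$$

(replace $x=y^2/2$ and use $\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-y^2/2}\,dy=1$).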
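These properties are easy to confirm numerically with `math.gamma` from the Python standard library. The sketch below also approximates the defining integral directly. The test value t = 3.7, the truncation point and the step count are arbitrary choices of mine, and the helper is meant for arguments not less than 1, so that the integrand is bounded near zero.

```python
import math


def gamma_by_integral(t, upper=60.0, steps=200000):
    """Midpoint-rule approximation of the integral of x**(t-1) * exp(-x)
    over (0, upper). Intended for t >= 1; the tail beyond `upper` is ignored."""
    h = upper / steps
    return h * sum(((i + 0.5) * h) ** (t - 1) * math.exp(-(i + 0.5) * h)
                   for i in range(steps))


def check_properties(t=3.7):
    """Return four booleans: recursion, factorial, Gamma(1/2), integral."""
    recursion_ok = math.isclose(math.gamma(t), (t - 1) * math.gamma(t - 1))
    factorial_ok = all(math.isclose(math.gamma(n + 1), math.factorial(n))
                       for n in range(1, 10))
    half_ok = math.isclose(math.gamma(0.5), math.sqrt(math.pi))
    integral_ok = math.isclose(gamma_by_integral(t), math.gamma(t), rel_tol=1e-6)
    return recursion_ok, factorial_ok, half_ok, integral_ok


results = check_properties()
print(results)
```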

These problems are among the most difficult. It's important to work out a general approach to such problems. All references are to J. Abdey, Advanced statistics: distribution theory, ST2133, University of London, 2021.

General scheme

Step 1. Conditioning is usually suggested by the problem statement: is conditioned on .

Your life will be easier if you follow the notation used in the guide: use for probability mass functions (discrete variables) and for (probability) density functions (continuous variables).

a) If and both are discrete (Example 5.1, Example 5.13, Example 5.18):

b) If and both are continuous (Activity 5.6):

c) If is discrete, is continuous (Example 5.2, Activity 5.5):

d) If is continuous, is discrete (Activity 5.12):

In all cases you need to figure out over which to sum or integrate.

Step 2. Write out the conditional densities/probabilities with the same arguments as in your conditional equation.

Step 3. Reduce the result to one of the known distributions using the completeness axiom.

Example 5.1

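Let $X$ denote the number of hurricanes which form in a given year, and let $Y$ denote the number of these which make landfall. Suppose each hurricane has a probability $\pi$ of making landfall independent of other hurricanes. Given the number of hurricanes $x$, then $Y$ can be thought of as the number of successes in $x$ independent and identically distributed Bernoulli trials. We can write this as $Y|X=x\sim\text{Bin}(x,\pi)$. Suppose we also have that $X\sim\text{Pois}(\lambda)$. Find the distribution of $Y$ (noting that $Y\le X$).

Solution

Step 1. The number of hurricanes $X$ takes values $x=0,1,2,\dots$ and is distributed as Poisson. The number of landfalls $Y$ for a given $X=x$ is binomial with values $y=0,1,\dots,x$. It follows that $x\ge y$.

Write the general formula for conditional probability:

$$p_Y(y)=\sum_{x=y}^{\infty}p_{Y|X}(y|x)\,p_X(x).$$

Step 2. Specifying the distributions:

$$p_X(x)=\frac{\lambda^{x}e^{-\lambda}}{x!}$$

where $x=0,1,2,\dots$

and

$$p_{Y|X}(y|x)=\binom{x}{y}\pi^{y}(1-\pi)^{x-y}$$

where $y=0,1,\dots,x.$

Step 3. Reduce the result to one of the known distributions:

$$p_Y(y)=\sum_{x=y}^{\infty}\frac{x!}{y!(x-y)!}\pi^{y}(1-\pi)^{x-y}\frac{\lambda^{x}e^{-\lambda}}{x!}$$

(pull out of the summation everything that does not depend on the summation variable $x$)

$$=\frac{\pi^{y}e^{-\lambda}}{y!}\sum_{x=y}^{\infty}\frac{(1-\pi)^{x-y}\lambda^{x}}{(x-y)!}$$

(replace $z=x-y$ to better see the structure)

$$=\frac{(\lambda\pi)^{y}e^{-\lambda}}{y!}\sum_{z=0}^{\infty}\frac{[\lambda(1-\pi)]^{z}}{z!}$$

(using the completeness axiom for the Poisson variable with parameter $\lambda(1-\pi)$)

$$=\frac{(\lambda\pi)^{y}e^{-\lambda}}{y!}e^{\lambda(1-\pi)}=\frac{(\lambda\pi)^{y}e^{-\lambda\pi}}{y!},\quad y=0,1,2,\dots$$

Thus $Y\sim\text{Pois}(\lambda\pi)$.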
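The three steps can be mirrored numerically. The sketch below assumes the setup of this example as I read it: the number of hurricanes is Poisson with rate `lam`, and the number of landfalls given x hurricanes is binomial with success probability `p`. The specific values lam = 6 and p = 0.3, the truncation point `x_max` and the function names are hypothetical. The marginal probabilities obtained by summing over x are compared with a Poisson distribution with rate lam * p.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def binom_pmf(y, x, p):
    """P(Y = y) for a binomial variable with x trials and success probability p."""
    return math.comb(x, y) * p ** y * (1 - p) ** (x - y)


def marginal_y(y, lam, p, x_max=150):
    """Step 1 in code: sum the conditional pmf times the pmf of X over x >= y.
    The infinite sum is truncated at x_max, an arbitrary cutoff."""
    return sum(binom_pmf(y, x, p) * poisson_pmf(x, lam) for x in range(y, x_max + 1))


lam, p = 6.0, 0.3
for y in range(5):
    # The two columns should agree: summation result vs Poisson(lam * p).
    print(y, marginal_y(y, lam, p), poisson_pmf(y, lam * p))
```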

Why do we need this link? For simplicity consider the rectangle $R=\{(x,y):a\le x\le b,\ c\le y\le d\}$. The integrals

$$I_1=\iint_R f(x,y)\,dx\,dy$$

and

$$I_2=\int_a^b\left(\int_c^d f(x,y)\,dy\right)dx$$

both are taken over the rectangle $R$ but they are not the same. $I_1$ is a double (two-dimensional) integral, meaning that its definition uses elementary areas, while $I_2$ is an iterated integral, where each of the one-dimensional integrals uses elementary segments. To make sense of this, you need to consult an advanced text in calculus. The difference notwithstanding, in good cases their values are the same. Putting aside the question of what is a "good case", we concentrate on geometry: how a double integral can be expressed as an iterated integral.

It is enough to understand the idea in the case of an oval $A$ on the plane. Let $l(x)$ be the function that describes the lower boundary of the oval and let $u(x)$ be the function that describes the upper part. Further, let the vertical lines $x=a$ and $x=b$ mark the minimum and maximum values of $x$ in the oval (see Chart 1).

Chart 1. The boundary of the oval above the green line is described by u(x) and below - by l(x)

We can paint the oval with strokes along red lines from $l(x)$ to $u(x)$. If we do this for all $x\in[a,b]$, we'll have painted the whole oval. This corresponds to the representation of $A$ as the union of segments $\{(x,y):l(x)\le y\le u(x)\}$ with $x\in[a,b]$:

$$A=\bigcup_{a\le x\le b}\{(x,y):l(x)\le y\le u(x)\}$$

and to the equality of integrals

$$\iint_A f(x,y)\,dx\,dy=\int_a^b\left(\int_{l(x)}^{u(x)}f(x,y)\,dy\right)dx,$$

with the double integral on the left and the iterated integral on the right.
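The painting idea translates directly into code: for each x we integrate along the vertical stroke from l(x) to u(x), and then we add up the strokes over x. The sketch below is only an illustration with a hypothetical oval (the unit disk) and a simple midpoint rule; the function name and grid sizes are my own. The results are compared with the known values of the corresponding double integrals.

```python
import math


def iterated_integral(f, x_min, x_max, lower, upper, nx=400, ny=400):
    """For each x integrate f(x, y) over lower(x) <= y <= upper(x),
    then integrate the result over x_min <= x <= x_max (midpoint rule)."""
    hx = (x_max - x_min) / nx
    total = 0.0
    for i in range(nx):
        x = x_min + (i + 0.5) * hx
        lo, hi = lower(x), upper(x)
        hy = (hi - lo) / ny
        inner = hy * sum(f(x, lo + (j + 0.5) * hy) for j in range(ny))
        total += inner * hx
    return total


def l(x):
    """Lower boundary of the hypothetical oval (unit disk)."""
    return -math.sqrt(1.0 - x * x)


def u(x):
    """Upper boundary of the hypothetical oval (unit disk)."""
    return math.sqrt(1.0 - x * x)


area = iterated_integral(lambda x, y: 1.0, -1.0, 1.0, l, u)
second_moment = iterated_integral(lambda x, y: y * y, -1.0, 1.0, l, u)
print(area, math.pi)               # area of the unit disk is pi
print(second_moment, math.pi / 4)  # integral of y^2 over the unit disk is pi/4
```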

Density of a sum of two variables

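Assumption 1. Suppose the random vector $(X,Y)$ has a density $f_{X,Y}$ and define $Z=X+Y$ (unlike the convolution theorem below, here $X,Y$ don't have to be independent).

From the definitions of the distribution function $F_Z(z)=P(Z\le z)$ and probability

$$P((X,Y)\in A)=\iint_A f_{X,Y}(x,y)\,dx\,dy$$

we have

$$F_Z(z)=P(X+Y\le z)=\iint_{x+y\le z}f_{X,Y}(x,y)\,dx\,dy.$$

The integral on the right is a double integral. The painting analogy (see Chart 2)

Chart 2. Integration for sum of two variables

suggests that

$$\{(x,y):x+y\le z\}=\bigcup_{x\in\mathbb{R}}\{(x,y):-\infty<y\le z-x\}.$$

Hence,

$$F_Z(z)=\int_{-\infty}^{\infty}\left(\int_{-\infty}^{z-x}f_{X,Y}(x,y)\,dy\right)dx.$$

Differentiating both sides with respect to $z$ we get

$$f_Z(z)=\int_{-\infty}^{\infty}f_{X,Y}(x,z-x)\,dx.$$

If we start with the inner integral that is with respect to $x$ and the outer integral with respect to $y$, then similarly

$$f_Z(z)=\int_{-\infty}^{\infty}f_{X,Y}(z-y,y)\,dy.$$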

Exercise. Suppose the random vector has a density and define Find Hint: review my post on Leibniz integral rule.

Convolution theorem

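In addition to Assumption 1, let $X,Y$ be independent. Then $f_{X,Y}(x,y)=f_X(x)f_Y(y)$ and the above formula gives

$$f_Z(z)=\int_{-\infty}^{\infty}f_X(x)f_Y(z-x)\,dx.$$

This is denoted as $(f_X*f_Y)(z)$ and called a convolution.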

The following may help to understand this formula. The function $f_Y(-x)$ is a density (it is non-negative and integrates to 1). Its graph is a mirror image of that of $f_Y(x)$ with respect to the vertical axis. The function $f_Y(z-x)$ is a shift of $f_Y(-x)$ by $z$ along the horizontal axis. For fixed $z$ it is also a density. Thus in the definition of convolution we integrate the product of two densities $f_X(x)f_Y(z-x)$. Further, to understand the asymptotic behavior of $(f_X*f_Y)(z)$ when $|z|\to\infty$, imagine two bell-shaped densities $f_X(x)$ and $f_Y(z-x)$. When $z$ goes to, say, infinity, the humps of those densities are spread apart more and more. The hump of one of them gets multiplied by small values of the other. That's why $(f_X*f_Y)(z)$ goes to zero, in a certain sense.

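The convolution of two densities is always a density because it is non-negative and integrates to one:

$$\int_{-\infty}^{\infty}(f_X*f_Y)(z)\,dz=\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty}f_X(x)f_Y(z-x)\,dx\right)dz=\int_{-\infty}^{\infty}f_X(x)\left(\int_{-\infty}^{\infty}f_Y(z-x)\,dz\right)dx.$$

Replacing $z-x=y$ in the inner integral we see that this is

$$\int_{-\infty}^{\infty}f_X(x)\,dx\int_{-\infty}^{\infty}f_Y(y)\,dy=1.$$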
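A numerical sketch may make the convolution more tangible. It uses two standard normal densities as the bell-shaped example; their convolution is the density of a normal variable with mean 0 and variance 2. The integration limits, step counts and function names are my own choices, not part of the derivation above.

```python
import math


def normal_pdf(x):
    """Density of the standard normal."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def convolve(f, g, z, lo=-10.0, hi=10.0, steps=1000):
    """Midpoint-rule value of the integral of f(x) * g(z - x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += f(x) * g(z - x)
    return h * total


def n02_pdf(z):
    """Density of the normal distribution with mean 0 and variance 2."""
    return math.exp(-z * z / 4.0) / math.sqrt(4.0 * math.pi)


# Values of the convolution shrink as z moves away from zero: the humps separate.
values = {z: convolve(normal_pdf, normal_pdf, z) for z in (0.0, 1.0, 3.0, 6.0)}
print(values)

# The convolution itself integrates to (approximately) one.
hz = 0.1
total = hz * sum(convolve(normal_pdf, normal_pdf, -10.0 + (k + 0.5) * hz)
                 for k in range(200))
print(total)
```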

This post parallels the one about the call debit spread. A combination of several options in one trade is called a strategy. Here we discuss a strategy called a put debit spread. The word "debit" in this name means that a trader has to pay for it. The rule of thumb is that if it is a debit (you pay for a strategy), then it is less risky than if it is a credit (you are paid). Let $P(K)$ denote the price of the put with the strike $K$, suppressing all other variables that influence the put price.

Assumption. The market assigns a higher value to events of higher probability. This is true if investors are rational and the market correctly reconciles views of different investors.

We need the following property: if $K_1<K_2$ are two strike prices, then for the corresponding put prices (with the same expiration and underlying asset) one has $P(K_1)<P(K_2)$.

Proof. A put price is higher if the probability of it being in the money at expiration is higher. Let $S(T)$ be the stock price at expiration $T$. Since $T$ is a moment in the future, $S(T)$ is a random variable. For a given strike $K$, the put is said to be in the money at expiration if $S(T)<K$. If $K_1<K_2$ and $S(T)<K_1$, then $S(T)<K_2$. It follows that the set $\{S(T)<K_1\}$ is a subset of the set $\{S(T)<K_2\}$. Hence the probability of the event $\{S(T)<K_2\}$ is higher than that of the event $\{S(T)<K_1\}$ and $P(K_2)>P(K_1)$.

Put debit spread strategy. Select two strikes $K_1<K_2$, buy $P(K_2)$ (take a long position) and sell $P(K_1)$ (take a short position). You pay $P=P(K_2)-P(K_1)>0$ for this.

Our purpose is to derive the payoff for this strategy. We remember that if the stock price at expiration $S(T)$ satisfies $S(T)\ge K$, then the put with strike $K$ expires worthless.

Case $S(T)\ge K_2$. In this case both options expire worthless and the payoff is the loss of the initial outlay: payoff $=-P$.

Case $K_1\le S(T)<K_2$. Exercising the put $P(K_2)$, in comparison with selling the stock at the market price you gain $K_2-S(T)$. The second option expires worthless. The payoff is: payoff $=K_2-S(T)-P$.

Case $S(T)<K_1$. Both options are exercised. The gain from $P(K_2)$ is, as above, $K_2-S(T)$. The holder of the long put $P(K_1)$ sells you stock at price $K_1$. Since your position is short, you have nothing to do but comply. The alternative would be to buy at the market price, so you lose $K_1-S(T)$. The payoff is: payoff $=(K_2-S(T))-(K_1-S(T))-P=K_2-K_1-P$.

Summarizing, we get:

$$\text{payoff}=\begin{cases}-P, & S(T)\ge K_2,\\ K_2-S(T)-P, & K_1\le S(T)<K_2,\\ K_2-K_1-P, & S(T)<K_1.\end{cases}$$

Normally, the strikes are chosen so that $K_2-K_1>P$. From the payoff expression we see then that the maximum profit is $K_2-K_1-P$, the maximum loss is $P$ and the breakeven stock price is $K_2-P$. This is illustrated in Figure 1, where the stock price at expiration is on the horizontal axis.
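The case analysis can be condensed into a few lines of code. This is a sketch under my reading of the strategy: long the put with the higher strike, short the put with the lower strike, payoff measured net of the initial outlay. The strikes and put prices in the usage example are hypothetical numbers, and the function name is mine.

```python
def put_debit_spread_payoff(s_t, k1, k2, p1, p2):
    """Net payoff at expiration of a put debit spread with strikes k1 < k2.

    s_t: stock price at expiration
    p1:  price received for the sold put with strike k1
    p2:  price paid for the bought put with strike k2
    """
    debit = p2 - p1                    # initial outlay
    long_put = max(k2 - s_t, 0.0)      # gain from the bought put
    short_put = -max(k1 - s_t, 0.0)    # loss on the sold put
    return long_put + short_put - debit


# Hypothetical numbers: strikes 95 and 100, put prices 1.5 and 4.0 (debit 2.5).
for s_t in (110.0, 97.5, 97.0, 90.0):
    print(s_t, put_debit_spread_payoff(s_t, 95.0, 100.0, 1.5, 4.0))
```

With these numbers the maximum loss is the debit 2.5, the maximum profit is 100 - 95 - 2.5 = 2.5 and the breakeven stock price is 100 - 2.5 = 97.5.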

Figure 1. Payoff from put debit spread. Source: https://www.optionsbro.com/

Conclusion. For the strategy to be profitable, the stock price at expiration $S(T)$ should satisfy $S(T)<K_2-[P(K_2)-P(K_1)]$. Buying a put debit spread is appropriate when the price is expected to stay in that range.

In comparison with the long put position alone, taking at the same time the short put position allows one to reduce the initial outlay. This is especially important when the stock volatility is high, resulting in a high put price. In the difference between the two put prices that volatility component partially cancels out.

Remark. There is an important issue of choosing the strikes. Let $S$ denote the stock price now. The payoff expression allows us to rank the following choices in the order of increasing risk: 1) $S<K_1<K_2$ (both options are in the money, less risk), 2) $K_1<S<K_2$ and 3) $K_1<K_2<S$ (both options are out of the money, highest risk). Also remember that a put debit spread is less expensive than buying $P(K_2)$ and selling $P(K_1)$ in two separate transactions.

Exercise. Analyze a put credit spread, in which you sell $P(K_2)$ and buy $P(K_1)$.

A combination of several options in one trade is called a strategy. Here we discuss a strategy called a call debit spread. The word "debit" in this name means that a trader has to pay for it. The rule of thumb is that if it is a debit (you pay for a strategy), then it is less risky than if it is a credit (you are paid). Let $C(K)$ denote the call price with the strike $K$, suppressing all other variables that influence the call price.

Assumption. The market assigns a higher value to events of higher probability. This is true if investors are rational and the market correctly reconciles views of different investors.

We need the following property: if $K_1<K_2$ are two strike prices, then for the corresponding call prices (with the same expiration and underlying asset) one has $C(K_1)>C(K_2)$.

Proof. A call price is higher if the probability of it being in the money at expiration is higher. Let $S(T)$ be the stock price at expiration $T$. Since $T$ is a moment in the future, $S(T)$ is a random variable. For a given strike $K$, the call is said to be in the money at expiration if $S(T)>K$. If $K_1<K_2$ and $S(T)>K_2$, then $S(T)>K_1$. It follows that the set $\{S(T)>K_2\}$ is a subset of the set $\{S(T)>K_1\}$. Hence the probability of the event $\{S(T)>K_2\}$ is lower than that of the event $\{S(T)>K_1\}$ and $C(K_2)<C(K_1)$.

Call debit spread strategy. Select two strikes $K_1<K_2$, buy $C(K_1)$ (take a long position) and sell $C(K_2)$ (take a short position). You pay $C=C(K_1)-C(K_2)>0$ for this.

Our purpose is to derive the payoff for this strategy. We remember that if the stock price at expiration $S(T)$ satisfies $S(T)\le K$, then the call with strike $K$ expires worthless.

Case $S(T)\le K_1$. In this case both options expire worthless and the payoff is the loss of the initial outlay: payoff $=-C$.

Case $K_1<S(T)\le K_2$. Exercising the call $C(K_1)$ and immediately selling the stock at the market price you gain $S(T)-K_1$. The second option expires worthless. The payoff is: payoff $=S(T)-K_1-C$. (In fact, you are assigned stock and selling it is up to you.)

Case $K_2<S(T)$. Both options are exercised. The gain from $C(K_1)$ is, as above, $S(T)-K_1$. The holder of the long call $C(K_2)$ buys from you at price $K_2$. Since your position is short, you have nothing to do but comply. You buy at $S(T)$ and sell at $K_2$. Thus the loss from $C(K_2)$ is $S(T)-K_2$. The payoff is: payoff $=(S(T)-K_1)-(S(T)-K_2)-C=K_2-K_1-C$.

Summarizing, we get:

$$\text{payoff}=\begin{cases}-C, & S(T)\le K_1,\\ S(T)-K_1-C, & K_1<S(T)\le K_2,\\ K_2-K_1-C, & K_2<S(T).\end{cases}$$

Normally, the strikes are chosen so that $K_2-K_1>C$. From the payoff expression we see then that the maximum profit is $K_2-K_1-C$, the maximum loss is $C$ and the breakeven stock price is $K_1+C$. This is illustrated in Figure 1, where the stock price at expiration is on the horizontal axis.
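As with the put spread, the three cases fit into one short function. This is a sketch under my reading of the strategy: long the call with the lower strike, short the call with the higher strike, payoff measured net of the initial outlay. The strikes and call prices in the usage example are hypothetical numbers, and the function name is mine.

```python
def call_debit_spread_payoff(s_t, k1, k2, c1, c2):
    """Net payoff at expiration of a call debit spread with strikes k1 < k2.

    s_t: stock price at expiration
    c1:  price paid for the bought call with strike k1
    c2:  price received for the sold call with strike k2
    """
    debit = c1 - c2                     # initial outlay
    long_call = max(s_t - k1, 0.0)      # gain from the bought call
    short_call = -max(s_t - k2, 0.0)    # loss on the sold call
    return long_call + short_call - debit


# Hypothetical numbers: strikes 100 and 105, call prices 4.0 and 1.5 (debit 2.5).
for s_t in (90.0, 102.5, 103.0, 120.0):
    print(s_t, call_debit_spread_payoff(s_t, 100.0, 105.0, 4.0, 1.5))
```

With these numbers the maximum loss is the debit 2.5, the maximum profit is 105 - 100 - 2.5 = 2.5 and the breakeven stock price is 100 + 2.5 = 102.5.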

Figure 1. Payoff for call debit strategy. Source: https://www.optionsbro.com/

Conclusion. For the strategy to be profitable, the stock price at expiration $S(T)$ should satisfy $S(T)>K_1+[C(K_1)-C(K_2)]$. Buying a call debit spread is appropriate when the price is expected to stay in that range.

In comparison with the long call position alone, taking at the same time the short call position allows one to reduce the initial outlay. This is especially important when the stock volatility is high, resulting in a high call price. In the difference between the two call prices that volatility component partially cancels out.

Remark. There is an important issue of choosing the strikes. Let $S$ denote the stock price now. The payoff expression allows us to rank the following choices in the order of increasing risk: 1) $K_1<K_2<S$ (both options are in the money, less risk), 2) $K_1<S<K_2$ and 3) $S<K_1<K_2$ (both options are out of the money, highest risk). Also remember that a call debit spread is less expensive than buying $C(K_1)$ and selling $C(K_2)$ in two separate transactions.

Exercise. Analyze a call credit spread, in which you sell $C(K_1)$ and buy $C(K_2)$.

Its content, organization and level justify its adoption as a textbook for introductory statistics for Econometrics in most American or European universities. The book's table of contents is somewhat standard, the innovation comes in a presentation that is crisp, concise, precise and directly relevant to the Econometrics course that will follow. I think instructors and students will appreciate the absence of unnecessary verbiage that permeates many existing textbooks.

Having read Professor Mynbaev's previous books and research articles I was not surprised with his clear writing and precision. However, I was surprised with an informal and almost conversational one-on-one style of writing which should please most students. The informality belies a careful presentation where great care has been taken to present the material in a pedagogical manner.

Carlos Martins-Filho, Professor of Economics, University of Colorado at Boulder, Boulder, USA

The number of visits to my website has exceeded 206,000. This number depends on what counts as a visit. An external counter, visible to everyone, writes cookies to the reader's computer and counts many visits from one reader as one. The number of individual readers has reached 23,000. The external counter does not give any more statistics. I will give all the numbers from the internal counter, which is visible only to the site owner.

I have a high percentage of complex content. After reading one post, the reader finds that the answer he is looking for depends on preliminary material. He starts digging into it and then has to go deeper and deeper. Hence the number 206,000: one reader visits the site on average 9 times on different days. Sometimes a visitor follows a link from one post to another on the same day. Hence another figure: 310,000 readings.

I originally wrote simple things about basic statistics. Then I began to write accompanying materials for each advanced course that I taught at Kazakh-British Technical University (KBTU). The shift in the number and level of readership shows that people need deep knowledge, not bait for one-day moths.

For example, my simple post on basic statistics was read 2,300 times. In comparison, the more complex post on the Cobb-Douglas function has been read 7,100 times. This function is widely used in economics to model consumer preferences (utility function) and producer capabilities (production function). In all textbooks it is taught using two-dimensional graphs, as P. Samuelson proposed 85 years ago. In fact, two-dimensional graphs are obtained by projection of a three-dimensional graph, which I show, making everything clear and obvious.

The answer to one of the University of London (UoL) exam problems attracted 14,300 readers. It is so complicated that I split the answer into two parts, and there are links to additional material. On the UoL exam, students have to solve this problem in 20-30 minutes, which even I would not be able to do.

Why my site is unique

My site is unique in several ways. Firstly, I tell the truth about the AP Statistics books. This is a basic statistics course for those who need to interpret tables, graphs and simple statistics. If you have a head on your shoulders, and not a Google search engine, all you need to do is read a small book and look at the solutions. I praise one such book in my reviews. You don't need to attend a two-semester course and read an 800-page book. Moreover, one doesn't need 140 high-quality color photographs that have nothing to do with science and double the price of a book.

Many AP Statistics consumers (that's right, consumers, not students) believe that learning should be fun. Such people are attracted by a book with anecdotes that have no relation to statistics or the life of scientists. In the West, everyone depends on each other, and therefore all the reviews are written in a superlative degree and streamlined. Thank God, I do not depend on the Western labor market, and therefore I tell the truth. Part of my criticism, including the statistics textbook selected for the program "100 Textbooks" of the Ministry of Education and Science of Kazakhstan (MES), is on Facebook.

Secondly, I have the world's only online, free, complete matrix algebra tutorial with all the proofs. Free courses on Udemy, Coursera and edX are not far from AP Statistics in terms of level. Courses at MIT and Khan Academy are also simpler than mine, but have the advantage of being given in video format.

The third distinctive feature is that I help UoL students. It is a huge organization spanning 17 universities and colleges in the UK and with many branches in other parts of the world. The Economics program was developed by the London School of Economics (LSE), one of the world's leading universities.

The problem with LSE courses is that they are very difficult. After the exams, LSE puts out short recommendations on the Internet for solving problems like: here you need to use such and such a theory and such and such an idea. Complete solutions are not given for two reasons: they do not want to help future examinees and sometimes their problems or solutions contain errors (who does not make errors?). But they also delete short recommendations after a year. My site is the only place in the world where there are complete solutions to the most difficult problems of the last few years. It is not for nothing that the solution to one problem noted above attracted 14,000 visits.

The average number of visits is about 100 per day. When it's time for students to take exams, it jumps to 1-2 thousand. The total amount of materials created in 5 years is equivalent to 5 textbooks. It takes from 2 hours to one day to create one post, depending on the level. After I published this analysis of the site traffic on Facebook, my colleague Nurlan Abiev decided to write posts for the site. I pay for the domain myself, $186 per year. It would be nice to make the site accessible to students and schoolchildren of Kazakhstan, but I don't have time to translate from English.

Once I was looking at the requirements of the MES for approval of electronic textbooks. They want several copies of printouts of all (!) materials and a solid payment for the examination of the site. As a result, all my efforts to create and maintain the site so far have been a personal initiative that does not have any support from the MES and its Committee on Science.
