# Time series

In statistics, signal processing, and econometrics, a time series is a sequence of data points, typically measured at successive times spaced at (often uniform) intervals. Time series analysis comprises methods that attempt to understand such series, either to understand the process underlying the data points (where did they come from? what generated them?) or to make forecasts (predictions). Time series prediction is the use of a model to predict future events based on known past events, that is, to predict future data points before they are measured. A standard example is predicting the opening price of a share of stock from its past performance.

As shown by Box and Jenkins in their book, models for time series data can take many forms and represent different stochastic processes. When modeling the mean of a process, three broad classes of practical importance are the autoregressive (AR) models, the integrated (I) models, and the moving average (MA) models (the MA process is related to, but not to be confused with, the concept of a moving average). These three classes depend linearly on previous data points and are treated in more detail in the articles on autoregressive moving average (ARMA) models and autoregressive integrated moving average (ARIMA) models. The autoregressive fractionally integrated moving average (ARFIMA) model generalizes the former three. Non-linear dependence on previous data points is of interest because of the possibility of producing a chaotic time series.
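The distinction between an MA process and a moving average can be made concrete with a minimal sketch. It assumes Gaussian white noise and uses illustrative names (`simulate_ma`, `rolling_mean`, `theta`) that are not from the text: the first function generates data from an MA(q) model, while the second merely smooths an existing series.

```python
import random


def simulate_ma(theta, n, sigma=1.0, seed=0):
    """Simulate an MA(q) process: X_t = eps_t + sum_j theta[j] * eps_{t-1-j}.

    The noise terms eps_t are drawn as Gaussian white noise (an assumption).
    """
    rng = random.Random(seed)
    q = len(theta)
    eps = [rng.gauss(0.0, sigma) for _ in range(n + q)]
    return [
        eps[t] + sum(theta[j] * eps[t - 1 - j] for j in range(q))
        for t in range(q, n + q)
    ]


def rolling_mean(x, window):
    """Moving-average smoother: the mean of each run of `window` values."""
    return [sum(x[i:i + window]) / window for i in range(len(x) - window + 1)]


series = simulate_ma(theta=[0.6, 0.3], n=200)   # a model for the data
smoothed = rolling_mean(series, window=5)       # a smoothing operation
```

The MA(q) process is a weighted sum of unobserved noise terms, whereas the rolling mean averages the observed values themselves.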

Among non-linear time series models, there are models that represent changes of variance over time (heteroskedasticity). These are called autoregressive conditional heteroskedasticity (ARCH) models, and the family comprises a wide variety of representations (GARCH, TARCH, EGARCH, FIGARCH, CGARCH, etc.). Recently, wavelet-transform-based methods (for example, locally stationary wavelets and wavelet-decomposed neural networks) have gained favour. Multiscale (often referred to as multiresolution) techniques decompose a given time series, attempting to illustrate time dependence at multiple scales.
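As an illustration of time-varying variance, the following sketch simulates the simplest member of the ARCH family, ARCH(1), in which the conditional variance is `omega + a1 * eps_prev**2`. The parameter names `omega` and `a1`, the Gaussian innovations, and the zero starting value are assumptions made for this example.

```python
import math
import random


def simulate_arch1(omega, a1, n, seed=0):
    """Simulate ARCH(1) errors and their conditional variances.

    sigma2_t = omega + a1 * eps_{t-1}^2,  eps_t = sqrt(sigma2_t) * z_t,
    with z_t standard normal (assumed). Requires omega > 0 and 0 <= a1 < 1.
    """
    rng = random.Random(seed)
    eps_prev = 0.0          # assumed starting value
    eps, var = [], []
    for _ in range(n):
        sigma2 = omega + a1 * eps_prev ** 2
        eps_prev = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        var.append(sigma2)
        eps.append(eps_prev)
    return eps, var


errors, variances = simulate_arch1(omega=0.2, a1=0.5, n=300)
```

A large shock in one period raises the variance of the next, which produces the clusters of volatility often seen in financial series.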

## Notation

A number of different notations are in use for time-series analysis:

$$X = \{X_1, X_2, \dots\}$$

is a common notation which specifies a time series $X$ indexed by the natural numbers. Another familiar notation is

$$Y = \{Y_t : t \in T\},$$

where $T$ is the index set.

## Assumptions

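There are only two assumptions from which the theory is built: stationarity and ergodicity.

The general representation of an autoregressive model, well known as AR(p), is

$$X_t = \alpha_0 + \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \dots + \alpha_p X_{t-p} + \varepsilon_t,$$

where the term $\varepsilon_t$ is the source of randomness and is called white noise. It is assumed to have the following characteristics:

1. $E[\varepsilon_t] = 0$ (zero mean)

2. $E[\varepsilon_t^2] = \sigma^2$ (constant variance)

3. $E[\varepsilon_t \varepsilon_s] = 0$ for all $t \neq s$ (no correlation across time)

If it also has a normal distribution, it is called normal white noise:

$$\varepsilon_t \sim \text{i.i.d. } N(0, \sigma^2)$$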
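A minimal sketch of simulating an AR(p) process driven by normal white noise may help make the definition concrete. The function name `simulate_ar`, the zero starting values, and the burn-in period are choices made for this example, not part of the definition.

```python
import random


def simulate_ar(alpha0, alphas, n, sigma=1.0, burn_in=200, seed=0):
    """Simulate X_t = alpha0 + sum_k alphas[k-1] * X_{t-k} + eps_t.

    eps_t is normal white noise with mean 0 and standard deviation sigma.
    The first `burn_in` values are discarded so that the (assumed) zero
    starting values have little influence on the returned series.
    """
    rng = random.Random(seed)
    p = len(alphas)
    x = [0.0] * p
    for _ in range(n + burn_in):
        eps = rng.gauss(0.0, sigma)
        lagged = sum(a * x[-k] for k, a in enumerate(alphas, start=1))
        x.append(alpha0 + lagged + eps)
    return x[-n:]


# An AR(2) example; these coefficients give a stationary process.
series = simulate_ar(alpha0=0.0, alphas=[0.5, -0.3], n=500)
```

Setting `sigma=0.0` removes the randomness and leaves the deterministic recursion, which is a convenient way to check the indexing.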

## Related tools

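Tools for investigating time-series data include:

- the autocorrelation function and the spectral density function
- the Fourier transform, used to investigate the series in the frequency domain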

## Applied time series

Time series analysis is practised in numerous applied fields, from astrophysics to geology. Model selection is often based on the underlying assumption about the data-generating process. Take traffic flow, for example: here we would fully expect periodic behaviour (with bursts at peak travel times). In such a situation one may consider applying Dynamic Harmonic Regression (the situation is highly similar to the airline data frequently analysed in the statistics literature).
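To illustrate how periodic behaviour can be modeled, here is a sketch of a plain (static) harmonic regression with a single known period. This is a simplification and not Dynamic Harmonic Regression itself, which allows the coefficients to vary over time. The sketch assumes evenly spaced data covering a whole number of periods, in which case the least-squares coefficients reduce to simple projections; the name `fit_harmonic` and the hourly traffic-like data are hypothetical.

```python
import math


def fit_harmonic(x, period):
    """Least-squares fit of x_t = mean + a*cos(w*t) + b*sin(w*t), w = 2*pi/period.

    Assumes len(x) is a whole multiple of an integer `period` greater than 2,
    so that the constant, cosine and sine regressors are orthogonal.
    """
    n = len(x)
    if period <= 2 or n % period != 0:
        raise ValueError("need a whole number of periods, with period > 2")
    w = 2.0 * math.pi / period
    mean = sum(x) / n
    a = 2.0 / n * sum(v * math.cos(w * t) for t, v in enumerate(x))
    b = 2.0 / n * sum(v * math.sin(w * t) for t, v in enumerate(x))
    return mean, a, b


# Hypothetical hourly counts with a daily (24-hour) cycle, ten days long.
w = 2.0 * math.pi / 24
counts = [3.0 + 2.0 * math.cos(w * t) + 1.0 * math.sin(w * t) for t in range(240)]
mean, a, b = fit_harmonic(counts, period=24)
```

On this noise-free example the fit recovers the generating values (3, 2 and 1) up to rounding error; with real data the residuals would then be modeled separately.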

More recently there has been increased use of time series methods in geophysics (the analysis of rainfall and climate change, for example). Within industry, almost every sector performs time series analysis in some way; in retail, for example, it is used for tracking and predicting sales. Analysts will typically load their data into a statistics package (R and S-Plus are examples of such programs). The most important step is the review of the autocorrelation function (ACF), which indicates the number of lagged observations to be included in any time series model (one should always analyse the partial autocorrelation function as well).
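The sample autocorrelation function can be computed directly from its definition. The sketch below uses the common estimator that divides the lag-k sum of products of deviations by the lag-0 sum; the function name `sample_acf` is illustrative, and statistics packages offer equivalent built-in routines.

```python
def sample_acf(x, max_lag):
    """Return the sample autocorrelations r_0, r_1, ..., r_max_lag of x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares (must be non-zero)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# A strictly alternating series is strongly negatively correlated at lag 1.
x = [1.0, -1.0] * 50
acf = sample_acf(x, max_lag=3)
```

Lags at which the autocorrelation is clearly different from zero are candidates for inclusion in a model.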

In general, financial series often require non-linear models (such as ARCH), because fitting an autoregressive model often yields a model suggesting that tomorrow's value (of a share price, say) depends almost entirely on yesterday's value:

$$X_t = \alpha_0 + \alpha_1 X_{t-1} + \varepsilon_t$$

(where $\alpha_1$ is close to 1).
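This effect can be reproduced with a small sketch that fits an AR(1) model by ordinary least squares, regressing each value on its predecessor. The function name `fit_ar1` and the simulated random-walk "prices" are assumptions for illustration only; no real market data is used.

```python
import random


def fit_ar1(x):
    """Least-squares estimates (alpha0, alpha1) for X_t = alpha0 + alpha1 * X_{t-1} + eps_t."""
    prev, cur = x[:-1], x[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_cur = sum(cur) / n
    sxy = sum((p - mean_prev) * (c - mean_cur) for p, c in zip(prev, cur))
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    alpha1 = sxy / sxx
    alpha0 = mean_cur - alpha1 * mean_prev
    return alpha0, alpha1


# Hypothetical price path: yesterday's price plus a small random change.
rng = random.Random(1)
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] + rng.gauss(0.0, 1.0))

alpha0, alpha1 = fit_ar1(prices)  # alpha1 is typically close to 1 for such a path
```

An estimate of `alpha1` near 1 says little more than "tomorrow looks like today", which is why the variance of such series is usually the more informative thing to model.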

Robert Engle recognised the importance of including lagged values of the series' variance. In general, time series can be considered in the time domain and/or the frequency domain. This duality has led to many of the recent developments in time series analysis. Wavelet-based methods are an attempt to model series in both domains. Wavelets are compactly supported "small waves" which, when scaled, dilated, and convolved with the series, give a scale-by-scale analysis of the temporal dependence of the series. Such wavelet-based methods are frequently applied to climate change problems.
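A scale-by-scale decomposition can be sketched with the Haar wavelet, the simplest compactly supported wavelet. The choice of Haar, the function names, and the requirement that the series length be a power of two are assumptions made to keep the example short; practical work normally uses a dedicated wavelet library.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar transform on an even-length list."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_decompose(x):
    """Full Haar decomposition; len(x) must be a power of two.

    Returns the final approximation and a list of detail coefficients,
    ordered from the finest scale to the coarsest.
    """
    approx = list(x)
    details = []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_decompose(signal)
```

Each list in `details` describes variation at one scale, and because the transform is orthonormal the sum of squared coefficients equals the sum of squared values of the original series.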

One other (and less researched) area of time series analysis considers the "mining" of series to retrospectively extract knowledge. In the literature this is referred to as time series data mining (TSDM). Techniques in this area often depend on "feature detection". In essence this is an attempt to find the "characteristic" behaviour of the series, and to use this to find areas of the series which do not adhere to this behaviour. Current efforts are led by the computer science department at the University of California (Riverside).
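One simple way to look for a region that does not follow the characteristic behaviour is a brute-force nearest-neighbour search: each window of the series is compared with every non-overlapping window, and the window whose closest match is farthest away is flagged. This is only one of several possible approaches and is offered as a sketch; the function name `find_discord`, the window length, and the synthetic series are all assumptions.

```python
import math


def find_discord(x, m):
    """Return (start_index, distance) of the most unusual length-m window.

    A window is unusual if even its nearest non-overlapping neighbour,
    measured by Euclidean distance, is far away. Runs in O(n^2 * m) time.
    """
    n_windows = len(x) - m + 1
    best_index, best_dist = -1, -1.0
    for i in range(n_windows):
        nearest = math.inf
        for j in range(n_windows):
            if abs(i - j) < m:  # skip overlapping windows (trivial matches)
                continue
            d = math.sqrt(sum((x[i + k] - x[j + k]) ** 2 for k in range(m)))
            nearest = min(nearest, d)
        if nearest != math.inf and nearest > best_dist:
            best_index, best_dist = i, nearest
    return best_index, best_dist


# A periodic series with one injected spike at position 55.
series = [math.sin(2 * math.pi * t / 10) for t in range(100)]
series[55] += 5.0
index, distance = find_discord(series, m=10)
```

On this example the flagged window covers the injected spike, since every other window has a near-identical copy one or more periods away.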
