We define a data stream $X$ as a possibly infinite data set in which each sample $x[n]$ becomes available only after the time instant $t_{n}$. The arrival times of consecutive samples need not be equidistant; i.e., $t_{n}-t_{n-1}$ may differ from $t_{n+1}-t_{n}$. The data available from the stream $X$ up to a given time instant $t_{n}$, denoted $X_{n}$, consists of the samples $X_{n}=\{x[0],x[1],x[2],\ldots,x[i],\ldots,x[n]\}$, where each sample $x[i]$ is made up of $p$ features: $x[i]=\{x[i]_{1},x[i]_{2},x[i]_{3},\ldots,x[i]_{j},\ldots,x[i]_{p}\}$. \textit{Note: I am not sure whether to use the square brackets only to denote samples that change over time, or for anything else that changes\ldots\ But I am not sure whether to complicate things further or not\ldots}
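For illustration only, the following Python sketch (the names and values are ours, not part of the formal definition, and NumPy is assumed) instantiates a toy stream with $p=3$ features and non-equidistant arrival times, and retrieves the subset $X_{n}$ available up to a given time instant:
\begin{verbatim}
import numpy as np

# Toy stream: 5 samples x[0]..x[4], each with p = 3 features,
# arriving at non-equidistant time instants t_0..t_4.
t = np.array([0.0, 0.4, 1.1, 1.3, 2.8])
X = np.random.default_rng(0).normal(size=(5, 3))

def available_up_to(t_query):
    """X_n: the samples of the stream received up to time t_query."""
    return X[t <= t_query]

print(available_up_to(1.2))  # contains x[0], x[1] and x[2] only
\end{verbatim}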
In this section we propose a strategy for the efficient calculation of statistics, such as the mean or the covariance, that summarize data streams. These summaries may be used for any purpose, such as computing and updating a prototype that represents the stream in a dynamic evolutionary clustering algorithm. The strategy is based on a weighting scheme that \textquotedblleft progressively forgets\textquotedblright\ old samples as new samples arrive. The \textquotedblleft memory\textquotedblright\ of the weighting scheme is controlled by a single, intuitive parameter that adapts the strategy to faster or slower stream dynamics; this parameter allows a compromise between preserving the structure of the stream in the presence of noise and outliers, and letting the stream summary evolve rapidly in changing situations.
\subsection{Weighting scheme}
We shall assign a weight $w_{i,n}$ to the sample $x[i]$ of the stream $X$ at the time instant $t_{n}$, given by:
\begin{equation}
w_{i,n}=\frac{1}{(n+1-i)^{\frac{1}{m}}},  \label{OldWeight}
\end{equation}
In Eq. (\ref{OldWeight}) the weight of the last sample is always one, whereas the weights of the other samples decrease monotonically, the oldest samples having the smallest weights. The smaller $m$ is, the faster the weights of the oldest samples decrease. If $m=0$ there is no memory; i.e., $w_{i,n}=0$ $\forall i<n$, and only the newest sample is taken into account.
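As a minimal sketch of this weighting scheme (again in Python with NumPy; the function names are ours for illustration), the following code evaluates the weights of Eq. (\ref{OldWeight}) for all the samples available at $t_{n}$ and uses them to compute a weighted mean summarizing the stream:
\begin{verbatim}
import numpy as np

def stream_weights(n, m):
    """Weights w_{i,n} = 1 / (n + 1 - i)^(1/m) for i = 0..n (Eq. OldWeight)."""
    i = np.arange(n + 1)
    return 1.0 / (n + 1.0 - i) ** (1.0 / m)

def weighted_mean(X_n, m):
    """Weighted mean of the samples received up to t_n."""
    w = stream_weights(len(X_n) - 1, m)
    return w @ np.asarray(X_n) / w.sum()

# The smaller m is, the faster the oldest samples are forgotten.
for m in (0.5, 2.0):
    w = stream_weights(10, m)
    print(f"m = {m}: oldest weight = {w[0]:.3f}, newest weight = {w[-1]:.3f}")
\end{verbatim}
Other statistics mentioned above, such as the covariance, can be weighted in the same way to summarize the stream.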
The Bhattacharyya distance measures the divergence between two probability distributions.