diff --git a/Lectures/Lecture - Non Parametric Analysis/Buhlmann (2002) - Bootstraps for Time Series.pdf b/Lectures/Lecture - Non Parametric Analysis/Buhlmann (2002) - Bootstraps for Time Series.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..ed74232e6e578129ce9818e67857441874ea43b2
Binary files /dev/null and b/Lectures/Lecture - Non Parametric Analysis/Buhlmann (2002) - Bootstraps for Time Series.pdf differ
diff --git a/Lectures/Lecture - Non Parametric Analysis/Lecture-NonParametricAnalysis.ipynb b/Lectures/Lecture - Non Parametric Analysis/Lecture-NonParametricAnalysis.ipynb
index c916cab7b983a0c9e2b616278723297bb89762cf..7a11020754dcd7edf7e072fd5c4a98d92f6dac62 100644
--- a/Lectures/Lecture - Non Parametric Analysis/Lecture-NonParametricAnalysis.ipynb
+++ b/Lectures/Lecture - Non Parametric Analysis/Lecture-NonParametricAnalysis.ipynb
@@ -352,7 +352,7 @@
     "## String Length Method\n",
     "***\n",
     "\n",
-    "This algorithm was proposed originally in [Lafler & Kinman (1965](https://ui.adsabs.harvard.edu/abs/1965ApJS...11..216L/abstract) and than discussed in [Dworetsky (1983)](https://ui.adsabs.harvard.edu/abs/1983MNRAS.203..917D/abstract) and extended in [Clarke (2002)](http://localhost:8888/notebooks/Lecture-NonParametricAnalysis.ipynb#:~:text=Clarke%20(2002)%20%2D%20%22String/Rope%20length%20methods%20using%20the%20Lafler%2DKinman%20statistic%22). "
+    "This algorithm was originally proposed in [Lafler & Kinman (1965)](https://ui.adsabs.harvard.edu/abs/1965ApJS...11..216L/abstract), then discussed in [Dworetsky (1983)](https://ui.adsabs.harvard.edu/abs/1983MNRAS.203..917D/abstract) and extended in [Clarke (2002)](http://localhost:8888/notebooks/Lecture-NonParametricAnalysis.ipynb#:~:text=Clarke%20(2002)%20%2D%20%22String/Rope%20length%20methods%20using%20the%20Lafler%2DKinman%20statistic%22). "
    ]
   },
   {
@@ -519,7 +519,7 @@
     "## Generalized Correlation Function: Correntropy\n",
     "****\n",
     "\n",
-    "> Recently, non-parametric methods have attracted a considerable attention and novel appoaches, based on ideas and algorithms developed for machine-learning in a big-data scenario have been developed.\n",
+    "> Recently, non-parametric methods have attracted considerable attention, and novel approaches, based on ideas and algorithms developed for machine learning in a big-data scenario, have been developed.\n",
     "\n",
     "\n",
     "- [Huijse et al. (2012)](https://ui.adsabs.harvard.edu/abs/2012ITSP...60.5135H/abstract) discussed a methodology based on information theoretic (IT) based criteria.\n",
@@ -563,6 +563,12 @@
     "\n",
     "- Kernels can also be viewed as covariance functions for correlated observations at different points of the input domain (see lectures about Gaussian Processes). \n",
     "  - A kernel can be any semi-definite positive function. And it is possible to develop periodic kernels.\n",
+    " \n",
+    "- For instance, one possibility is:\n",
+    "\n",
+    "$$ G_{\\sigma;P}(z-y) = \\frac{1}{\\sqrt{2\\pi}\\sigma} \\exp \\left ( - \\frac{\\sin^2 \\left( \\frac{\\pi}{P} (z - y) \\right)}{0.5\\sigma^2}\\right ) $$\n",
+    "\n",
+    "- where now the kernel depends explicitly on the period.\n",
     "\n",
     "- This brings to a metric combining the correntropy with a periodic kernel to measure similarity among samples separated by a given period. \n",
     "\n",
@@ -598,6 +604,8 @@
     "\n",
     "- Technically speaking, the MI is the divergence (i.e. statistical distance) between the joint PDF of the RVs and the product of their marginal PDFs.\n",
     "\n",
+    "- Shannon's MI for continuous RVs $X$ and $Y$ with joint PDF $f_{X, Y}(\\cdot, \\cdot)$ is defined as:\n",
+    "\n",
     "$$ \\text{MI}_S(X, Y) = D_{KL}(f_{X,Y} || f_X f_Y) = \\iint f_{X,Y} \\log f_{X,Y} \\,dx \\,dy - \\int f_{X} \\log f_X \\,dx - \\int f_{Y} \\log f_Y \\,dy $$\n",
     "\n",
     "where $D_{KL}(\\cdot || \\cdot)$ is the Kullback-Leibler divergence and $f_X (x)= \\int f_{X,Y} (x,y)\\,dy$, $f_Y (y) = \\int f_{X,Y} (x, y)\\,dx$ are the marginal PDFs of $X$ and $Y$, respectively.\n",
@@ -620,7 +628,7 @@
     "where $f_{X,Y}(\\cdot, \\cdot)$ is the joint PDF of $X$ and $Y$ while $f_X(\\cdot)$ and $f_Y(\\cdot)$ are the marginal PDFs, respectively. \n",
     "\n",
     "- The terms $V_J$, $V_M$ and $V_C$ correspond to the integrals of the squared joint PDF, squared product of the marginal PDFs and product of joint PDF and marginal PDFs, respectively. \n",
-    "- In the ITL framework, there are estimators of these quantities that can be computed directly from data samples.\n",
+    "- In the ITL framework, there are estimators of these quantities that can be computed directly from data samples (see the quoted papers for details).\n",
     "  - This estimator is called, in literature, the information potential, IP, of an RV and it corresponds to the expected value of its PDF.\n",
     "\n",
     "- Skipping further technical details that a concerned reader can find in the quoted papers, the analysis follow these guidelines:\n",
diff --git a/Lectures/Lecture - Non Parametric Analysis/Lecture-NonParametricPeriodogram.ipynb b/Lectures/Lecture - Non Parametric Analysis/Lecture-NonParametricPeriodogram.ipynb
index 4baf8779d3c9e6104d900361c345d2c68ac38633..a616cdee626fb6bcd7724b409b9cdf5ccc52e176 100644
--- a/Lectures/Lecture - Non Parametric Analysis/Lecture-NonParametricPeriodogram.ipynb
+++ b/Lectures/Lecture - Non Parametric Analysis/Lecture-NonParametricPeriodogram.ipynb
@@ -744,7 +744,7 @@
     "- In IID bootstrap the data points are randomly sampled with replacement. The IID bootstrap destroys not only the periodicity but also any time correlation or structure in the time series. This results in underestimation of the confidence bars. \n",
     "  - For data with serial correlations it is better to use moving block (MB) bootstrap. In MB bootstrap blocks of data of a given length are patched together to create a new time series. The block length is a parameter. Because light curves are irregularly sampled we set a block length in days rather than number of points. The ideal is to set the length so that it destroys the periodicity and preserves most of the serial correlation\n",
     "\n",
-    "- Bootstrap applied to time series is discussed in [Bühlmann (2002) - \"Bootstraps for time series.\"](https://www.jstor.org/stable/3182810?casa_token=fuWC-ffm10sAAAAA%3A2PyzOtU2pxoXzaTaaYIaWjrCA3nokR5BjvLdggYcci8Rn8G7UPz4ceMEXfuDOOHg1NtuVWO6ZMkXUjwJl5pLcKxt1ojZTK9WpgrjJPdGsE6o83LjNcuG)."
+    "- Bootstrap applied to time series is discussed in [Bühlmann (2002) - \"Bootstraps for time series.\"](https://www.jstor.org/stable/3182810)."
    ]
   },
   {
@@ -759,6 +759,18 @@
     "- [Süveges (2014) - \"Extreme-value modelling for the significance assessment of periodogram peaks\"](https://ui.adsabs.harvard.edu/abs/2014MNRAS.440.2099S/abstract). "
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Further Material\n",
+    "\n",
+    "Papers for examining some of the discussed topics more closely.\n",
+    "\n",
+    "- [Bühlmann (2002) - \"Bootstraps for Time Series\"](https://www.jstor.org/stable/3182810).\n",
+    "    "
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -829,7 +841,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.11"
+   "version": "3.12.7"
   }
  },
  "nbformat": 4,
diff --git a/README.md b/README.md
index 7d1cdd204fd0a590461e27572c2325c5cd308d81..11c2f9814dfe1e53b471f141285cd2188dec2f9a 100644
--- a/README.md
+++ b/README.md
@@ -2,4 +2,4 @@
 
 This is a repository with material (notebooks, papers, etc.) for the **Time Domain Astrophysics** course delivered at the *Università dell'Insubria* by Stefano Covino.
 
-*Last update: 25 November 2024.*
+*Last update: 27 November 2024.*