Tool to explore regularly sampled time series

Original link: https://github.com/rajivsam/tseda

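## tseda: Time Series Exploration App Overview

tseda is a Python application for exploring regularly sampled time series (sampled hourly or at a lower frequency), currently limited to 2,000 samples. It guides the user through a three-step workflow: **initial assessment**, **time series decomposition**, and **observation logging**.

**Initial assessment** uses a kernel density estimate, a box plot, and the autocorrelation and partial autocorrelation functions (ACF/PACF) to reveal the data distribution and potential seasonality. **Decomposition** uses Singular Spectrum Analysis (SSA) to identify the underlying components (trend, seasonality, noise) from the eigenvalue distribution and user-defined groupings. This enables change point analysis and an assessment of the noise structure. **Observation logging** provides AIC rank diagnostics, an automatic summary, and space for user notes, and ends with a generated report.

tseda can be installed with pip or conda and offers both a web application interface and a notebook environment. It requires Python 3.13 or higher and accepts CSV or Excel files with a timestamp column and a numeric column; the data must be regularly sampled and contain no missing values. Development contributions and feature requests are welcome through GitHub.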


Original text

An application for time series exploration.

tseda lets you explore regularly sampled time series with a sampling interval of one hour or longer (hourly, daily, monthly, and so on). It is currently limited to 2,000 samples; the limit is configurable.

Three-Step Exploration Workflow

(a) Initial Assessment of Time Series

Explore the distribution and spread of the values with a kernel density estimate and a box plot, which together show the raw distribution. The ACF and PACF plots provide clues about seasonality and autoregressive components.
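As a rough illustration of what these plots summarize, the sketch below computes a sample autocorrelation function and the five numbers a box plot draws, using only the standard library. It is not tseda's implementation: the helper names `acf` and `five_number_summary` are made up for this example, the series is synthetic, and the PACF is left out for brevity.

```python
import math
import statistics

def acf(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    mean = statistics.fmean(values)
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    return [
        sum(a * b for a, b in zip(centered, centered[lag:])) / denom
        for lag in range(max_lag + 1)
    ]

def five_number_summary(values):
    """Minimum, quartiles, and maximum: the numbers a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

# Synthetic hourly series with a 24-hour cycle (illustrative data, not from tseda).
series = [math.sin(2 * math.pi * t / 24) for t in range(240)]
correlations = acf(series, max_lag=48)
summary = five_number_summary(series)
print("ACF at lags 12 and 24:", round(correlations[12], 3), round(correlations[24], 3))
print("five-number summary:", [round(v, 3) for v in summary])
```

A strongly negative value at half the period and a strongly positive value at the full period is the kind of pattern that hints at seasonality in the ACF plot.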

(b) Decomposition Using Singular Spectrum Analysis

On the basis of the sampling frequency, a window for SSA is determined. This is a heuristic assignment. For example:

| Sampling Frequency | Window Size |
|---|---|
| Hourly | 24 |
| Monthly | 12 |
| Quarterly | 4 |

The window size can be changed in the UI. The eigenvalue distribution, the ACF plot, and the eigenvector plot together indicate whether seasonal components are present. Based on these initial plots, you enter a set of component groupings, and the series is reconstructed from them; the reconstruction plots are then shown. If the series has structure, change point analysis is possible because the reconstructed components are smooth, and a change point plot is shown. The app also reports the variance explained by the signal and noise components, along with an assessment of whether the noise is independent or correlated.
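The decomposition steps described here (window choice, eigenvalues, grouping, reconstruction, explained variance) can be sketched in a few dozen lines. This is a textbook-style basic SSA under stated assumptions, not tseda's code: every function name is hypothetical, the series is synthetic, and a small Jacobi rotation solver stands in for the eigendecomposition so the example needs only the standard library (in practice a routine such as NumPy's `numpy.linalg.svd` would be the usual choice).

```python
import math

# Window heuristic from the documented examples; keys are the documented labels.
WINDOW_BY_FREQUENCY = {"Hourly": 24, "Monthly": 12, "Quarterly": 4}

def trajectory_matrix(series, window):
    """Rows are lagged windows of the series (a K x L Hankel matrix)."""
    return [series[i:i + window] for i in range(len(series) - window + 1)]

def jacobi_eigen(sym, sweeps=50, tol=1e-18):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric matrix."""
    n = len(sym)
    a = [row[:] for row in sym]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        if sum(a[i][j] ** 2 for i in range(n) for j in range(i)) < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                for k in range(n):  # A <- J^T A
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
                for k in range(n):  # V <- V J
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
    return [a[i][i] for i in range(n)], v

def diagonal_average(matrix):
    """Average each anti-diagonal of a K x L matrix back into a series value."""
    rows, cols = len(matrix), len(matrix[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]

def ssa_components(series, window):
    """Return (explained variance shares, elementary series), strongest first."""
    x = trajectory_matrix(series, window)
    cov = [[sum(row[i] * row[j] for row in x) for j in range(window)] for i in range(window)]
    eigenvalues, v = jacobi_eigen(cov)
    order = sorted(range(window), key=lambda r: -eigenvalues[r])
    total = sum(eigenvalues)
    shares, components = [], []
    for r in order:
        vec = [v[k][r] for k in range(window)]
        # Rank-one piece of the trajectory matrix: (row . vec) * vec for every row.
        piece = [[sum(a * b for a, b in zip(row, vec)) * e for e in vec] for row in x]
        components.append(diagonal_average(piece))
        shares.append(eigenvalues[r] / total)
    return shares, components

def reconstruct(components, group):
    """Sum the elementary series whose indices were placed in one group."""
    return [sum(components[r][t] for r in group) for t in range(len(components[0]))]

# Synthetic quarterly series: linear trend plus a four-quarter cycle (illustrative only).
series = [0.5 * t + 3.0 * math.cos(math.pi * t / 2) for t in range(40)]
window = WINDOW_BY_FREQUENCY["Quarterly"]
shares, components = ssa_components(series, window)
everything = reconstruct(components, range(window))
print("explained variance shares:", [round(s, 3) for s in shares])
print("max error when all components are grouped:",
      max(abs(a - b) for a, b in zip(series, everything)))
```

Grouping every component reproduces the input series, which is a handy sanity check; a grouping such as `reconstruct(components, [0])` isolates the strongest component on its own.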

SSA is based on the eigendecomposition of the trajectory matrix. Although the raw signal is correlated, the eigenvectors are uncorrelated; if we assume the signal is Gaussian, being uncorrelated also implies independence. This lets us use the Akaike Information Criterion (AIC) for model selection by computing the AIC as a function of the rank of the model. The result is shown on the observation page, together with an automatic summary of all the observations.
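To make the rank-selection idea concrete, the sketch below scores a few candidate ranks with the least-squares form of the AIC for Gaussian errors, AIC = n ln(RSS / n) + 2k. The source does not say which form of the AIC or which parameter count tseda uses, so taking k equal to the rank is a simplifying assumption, `aic_by_rank` is a hypothetical helper, and the residual sums of squares are invented numbers.

```python
import math

def aic_by_rank(rss_by_rank, n_samples):
    """Gaussian AIC, n * ln(RSS / n) + 2 * k, with k taken to be the rank."""
    return {
        rank: n_samples * math.log(rss / n_samples) + 2 * rank
        for rank, rss in rss_by_rank.items()
    }

# Made-up residual sums of squares for rank-1..4 reconstructions of 100 samples.
rss_by_rank = {1: 50.0, 2: 10.0, 3: 9.9, 4: 9.85}
scores = aic_by_rank(rss_by_rank, n_samples=100)
best_rank = min(scores, key=scores.get)
print("rank with the lowest AIC:", best_rank)
```

The penalty term is what stops the criterion from always preferring the highest rank: once extra components barely reduce the residual, the added 2 per rank outweighs the gain.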

The package also provides a notebook interface to these features. If you have a new dataset that you want to analyze, look at the data loader directory for examples. Download your dataset, clean it, produce your time series, and analyze it with tseda.

Python 3.13 or higher is required to run this package.

Before starting the installation, verify your Python version by running python --version.

Ensure the output shows Python 3.13 or higher. If not, please upgrade Python before proceeding.
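The same check can be made from inside Python, for example at the top of a setup script. This is a minimal sketch; the function name `meets_minimum` is made up for illustration and is not part of tseda.

```python
import sys

MINIMUM_PYTHON = (3, 13)  # minimum version stated in this README

def meets_minimum(version=None, minimum=MINIMUM_PYTHON):
    """Return True when the (major, minor) version is at least the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum

if not meets_minimum():
    print("Python 3.13 or higher is required; found", sys.version.split()[0])
```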

Install And Run From PyPI

Conda is the recommended package manager for development and installation (development was done with conda):

conda create -n tseda python=3.13
conda activate tseda
pip install tseda

Then run the app with the tseda command.

Non-Developer Quick Start

If you just want to run the app with minimal setup:

  1. Install with pipx: pipx install tseda
  2. Launch the app: tseda
  3. Open your browser at http://127.0.0.1:8050.

If pipx is not available, use the standard Python install instructions below.

Verify that Python 3.13 or higher is installed (python --version).

Create and activate a virtual environment, then install from PyPI:

python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate
pip install tseda

You can also launch the app with Python module execution (python -m tseda).

Note: python tseda is not a valid way to run an installed package because Python treats tseda as a local script path.

By default, the app starts at http://127.0.0.1:8050.

Optional runtime overrides:

TSEDA_HOST=0.0.0.0 TSEDA_PORT=8050 TSEDA_DEBUG=false tseda
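
For readers curious how such overrides are typically consumed, here is a hedged sketch of reading the three variables with the defaults mentioned in this README. The function `read_settings`, the accepted spellings for true, and the `false` default for debug mode are assumptions for illustration; the source does not show how tseda parses these values.

```python
import os

def read_settings(environ=None):
    """Read host, port, and debug flag from TSEDA_* variables with fallbacks."""
    if environ is None:
        environ = os.environ
    host = environ.get("TSEDA_HOST", "127.0.0.1")
    port = int(environ.get("TSEDA_PORT", "8050"))
    # Assumption: any of these spellings turns debug on; everything else is off.
    debug = environ.get("TSEDA_DEBUG", "false").strip().lower() in {"1", "true", "yes"}
    return host, port, debug

print(read_settings({"TSEDA_HOST": "0.0.0.0", "TSEDA_PORT": "8050", "TSEDA_DEBUG": "false"}))
```

Binding to 0.0.0.0 makes the app reachable from other machines on the network, so the 127.0.0.1 default is the safer choice unless remote access is intended.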

3. Upload Your Data

  • Click "Drag and Drop or Select Files" in the Initial Assessment panel.
  • Your file must be a CSV or Excel file with at least two columns: a timestamp column (first) and a numeric value column (second).
  • The data must be regularly sampled at hourly or lower frequency (e.g., hourly, daily, monthly).
  • The dataset must contain no missing values (NA / NaN). Clean your data before uploading.
  • Files are limited to 2,000 rows (configurable via MAX_FILE_LINES in ts_analyze_ui.py).
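
A quick pre-upload check along these lines can catch most of the problems listed above. The sketch assumes a CSV with one header row and ISO-style timestamps; `validate_series` is a hypothetical helper, not a tseda function. It only handles fixed-length intervals (hourly, daily): calendar frequencies such as monthly have unequal day counts and would need a calendar-aware check instead.

```python
import csv
import io
import math
from datetime import datetime, timedelta

MAX_FILE_LINES = 2000  # mirrors the documented default limit

def validate_series(csv_text, max_rows=MAX_FILE_LINES):
    """Return a list of problems found; an empty list means the file looks usable."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    body = rows[1:]  # assumption: exactly one header row
    problems = []
    if len(body) > max_rows:
        problems.append(f"{len(body)} rows exceeds the limit of {max_rows}")
    stamps = []
    for number, row in enumerate(body, start=2):
        if len(row) < 2:
            problems.append(f"line {number}: expected at least two columns")
            continue
        try:
            stamp = datetime.fromisoformat(row[0].strip())
            value = float(row[1])
        except ValueError:
            problems.append(f"line {number}: missing or unparseable timestamp or value")
            continue
        if math.isnan(value):
            problems.append(f"line {number}: missing value (NaN)")
        stamps.append(stamp)
    # Fixed-step check only; monthly or quarterly data would fail it by design.
    steps = {later - earlier for earlier, later in zip(stamps, stamps[1:])}
    if len(steps) > 1:
        problems.append("timestamps are not regularly spaced")
    elif steps and min(steps) < timedelta(hours=1):
        problems.append("sampling interval is shorter than one hour")
    return problems

sample = (
    "timestamp,value\n"
    "2024-01-01 00:00:00,1.0\n"
    "2024-01-01 01:00:00,2.0\n"
    "2024-01-01 02:00:00,3.0\n"
)
print(validate_series(sample))
```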

4. Explore In Three Steps

| Step | Panel | What to do |
|---|---|---|
| 1 | Initial Assessment of Time Series | Review distribution plots (KDE, box plot) and the ACF / PACF for autocorrelation patterns. |
| 2 | Time Series Decomposition | Review the eigenvalue plot, then enter component groupings (e.g., Trend, Seasonal, Noise) and click Apply Grouping. |
| 3 | Observation Logging | Review the AIC rank diagnostics, read the auto-generated summary, and add your own observations before saving the report. |

Development Install (From Source)

If you are developing locally from source:

  1. Build source and wheel distributions:
  2. Validate distributions before upload:

To build the documentation locally:

pip install -r docs/requirements.txt
sphinx-build -b html docs/source docs/_build/html

You can also use the Makefile:

The generated site will be available in docs/_build/html.

This repository includes .readthedocs.yaml configured to build docs from docs/source/conf.py.

  1. Push the repository to GitHub (or another supported provider).
  2. Sign in to Read the Docs and import the project.
  3. In Read the Docs project settings:
    • Set the default branch.
    • Confirm the config file path is .readthedocs.yaml.
  4. Trigger a build from the Read the Docs dashboard.
  5. Optionally enable a custom domain and versioned docs.

If the build fails, inspect the Read the Docs build logs and reproduce the failure locally with the pip install and sphinx-build commands shown earlier.

Contributing & Feature Requests

If you'd like to request a feature or report an issue, please open an issue on GitHub. You're also welcome to reach out to me directly.
