codeboy7432's comments

I added class comments to each class which explain the high level implementation details. Clamping is supported with natural cubic splines, and this is done by taking the slopes at each endpoint.

Monotonicity is currently not supported (for cubic splines).
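For readers unfamiliar with clamping, here is a minimal NumPy-only sketch of a clamped cubic spline, where the slope is prescribed at each endpoint. This is not the package's code: the function name `clamped_cubic_spline` and its parameters are hypothetical, and it uses the standard textbook formulation that solves for the second derivatives at the knots.

```python
import numpy as np

def clamped_cubic_spline(x, y, slope_start, slope_end):
    """Return a callable cubic spline through (x, y) with fixed endpoint slopes.

    Hypothetical helper for illustration only; assumes x is strictly
    increasing and has at least two points.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x) - 1                 # number of intervals
    h = np.diff(x)                 # interval widths
    secant = np.diff(y) / h        # slope of each chord

    # Linear system for the second derivatives m[0..n] at the knots.
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)

    # Clamped condition at the first knot: S'(x[0]) = slope_start.
    A[0, 0] = 2.0 * h[0]
    A[0, 1] = h[0]
    b[0] = 6.0 * (secant[0] - slope_start)

    # Continuity of the first derivative at every interior knot.
    for i in range(1, n):
        A[i, i - 1] = h[i - 1]
        A[i, i] = 2.0 * (h[i - 1] + h[i])
        A[i, i + 1] = h[i]
        b[i] = 6.0 * (secant[i] - secant[i - 1])

    # Clamped condition at the last knot: S'(x[n]) = slope_end.
    A[n, n - 1] = h[n - 1]
    A[n, n] = 2.0 * h[n - 1]
    b[n] = 6.0 * (slope_end - secant[n - 1])

    m = np.linalg.solve(A, b)

    def evaluate(t):
        t = np.asarray(t, dtype=float)
        # Index of the interval containing each t (clipped to valid range).
        i = np.clip(np.searchsorted(x, t, side="right") - 1, 0, n - 1)
        left = t - x[i]
        right = x[i + 1] - t
        return (
            m[i] * right**3 / (6.0 * h[i])
            + m[i + 1] * left**3 / (6.0 * h[i])
            + (y[i] / h[i] - m[i] * h[i] / 6.0) * right
            + (y[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * left
        )

    return evaluate
```

As a sanity check, sampling f(x) = x^3 at x = 0, 1, 2, 3 and passing the true endpoint slopes 0 and 27 reproduces the cubic between the knots, so the spline evaluates to 3.375 at x = 1.5. A dense solve is used here for brevity; a tridiagonal solver would be the usual choice for many knots.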


I authored this package as I needed to generate confidence intervals for time series data without using SciPy. Sharing this here, as this could be a useful package for others :)

Included Models:
- Linear regression
- Ridge regression
- Linear spline
- Isotonic regression
- Bin regression
- Cubic spline
- Natural cubic spline
- Exponential moving average
- Kernel functions (Gaussian, KNN, Weighted average)


Thanks for releasing this. I was just wondering if something like this exists. I've worked on a few projects where scipy was banned due to the large dependencies it pulls in.


Have you compared these to StatsModels? Or R? (which includes most of these ootb)


I did look into statsmodels, but it is a large library with more than I needed. I did not look into R. Are there any lightweight alternatives in that language?


I think most or all of these models are in the R Standard lib


Not all, as far as I know. For ridge regression you will want to install glmnet, for example. mgcv, which is usually shipped with R, provides access to a few common fast kernels, which seem to be the ones Python programmers are familiar with.


You can use lm.ridge from MASS instead of glmnet, but yeah, there are going to be some smoothers that aren't in the R standard library.


mgcv also provides many more varieties of splines than base R.


Am I correct to assume that you could not use sklearn/scikit either, because it depends on SciPy? (I am under the impression that sklearn/scikit is the dominant library for implementations of these algorithms.)


That is correct. I had to generate confidence intervals on over 8000 univariate data sets using very small VMs, so I needed to limit large dependencies as much as I could. This package was the result of this!
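To illustrate the kind of task described, here is a rough sketch of fitting a line and drawing a residual-based band around it with NumPy alone. It is an assumption about what such a workflow could look like, not the package's method: the function name `linear_fit_with_band` is hypothetical, and the band is a simple normal approximation (fit plus or minus z times the residual standard deviation), not a rigorous confidence interval.

```python
import numpy as np

def linear_fit_with_band(x, y, z=1.96):
    """Fit y ~ slope * x + intercept and return (fit, lower, upper).

    Hypothetical helper for illustration only. The band assumes roughly
    normal residuals with constant variance; z=1.96 corresponds to about
    95% coverage under that assumption. Needs at least three points.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Least-squares line; polyfit returns the highest-degree coefficient first.
    slope, intercept = np.polyfit(x, y, deg=1)
    fit = slope * x + intercept

    # Residual standard deviation, with two degrees of freedom used by the fit.
    residual_std = np.std(y - fit, ddof=2)
    margin = z * residual_std

    return fit, fit - margin, fit + margin
```

Calling `fit, lower, upper = linear_fit_with_band(x, y)` on each data set gives a constant-width band around the fitted line; the only dependency is NumPy.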


Two questions:

Why the requirement that you can't use scipy?

Have you heard of the package statsmodels?


Well, SciPy depends heavily on NumPy, which as a CPython-specific extension won't run on other Python interpreters in general. Although for example there is ulab for MicroPython which replicates part of NumPy, and PyPy has a compatibility layer for CPython extensions.

Edit: well, Regressio itself also depends on NumPy, but might be able to run on top of ulab whereas I really doubt SciPy would.


The repo in the OP also depends on NumPy.


Responding that there's something out there called ulab doesn't really answer my question, which was: where does OP's requirement to not use SciPy come from?


(Same as comment above)

".. I had to generate confidence intervals on over 8000 univariate data sets using very small VMs, so I needed to limit large dependencies as much as I could. This package was the result of this!"

Based on the comments in this thread, it may be worth trying to make this package not dependent on NumPy as well?

