python - The correct way to implement a piecewise function in pandas / numpy

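I need to create a function to pass to curve_fit. In my case, the function is best defined as a piecewise function.

I know the following doesn't work, but I'm showing it because it makes the intent of the function clear: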

def model_a(X, x1, x2, m1, b1, m2, b2):
    '''f(x) has form m1*x + b1 below x1, m2*x + b2 above x2, and is
    a cubic spline between those two points.'''
    y1 = m1 * X + b1
    y2 = m2 * X + b2
    if X <= x1:
        return y1    # function is linear below x1
    if X >= x2:
        return y2    # function is linear above x2
    # use a cubic spline to interpolate between lower
    # and upper line segment
    a, b, c, d = fit_cubic(x1, y1, x2, y2, m1, m2)
    return cubic(X, a, b, c, d)

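The problem, of course, is that X is a pandas Series, and the expression (X <= x1) evaluates to a Series of booleans, so the `if` fails with the message "The truth value of a Series is ambiguous". It seems np.piecewise() is designed for exactly this situation: "Wherever condlist[i] is True, funclist[i](x) is used as the output value." So I tried this: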

def model_b(X, x1, x2, m1, b1, m2, b2):
    def lo(x):
        return m1 * x + b1
    def hi(x):
        return m2 * x + b2
    def mid(x):
        y1 = m1 * x + b1
        y2 = m2 * x + b2
        a, b, c, d = fit_cubic(x1, y1, x2, y2, m1, m2)
        return a * x * x * x + b * x * x + c * x + d

    return np.piecewise(X, [X<=x1, X>=x2], [lo, hi, mid])

But this fails at the call:

return np.piecewise(X, [X<=x1, X>=x2], [lo, hi, mid])

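with the message "IndexError: too many indices for array". I'm inclined to think it objects to the fact that there are two elements in condlist and three in funclist, but the documentation clearly states that an extra element in funclist is treated as the default.

Any guidance?

Solution: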

This piece of code in NumPy's definition of np.piecewise is list/ndarray-centric:

# undocumented: single condition is promoted to a list of one condition
if isscalar(condlist) or (
        not isinstance(condlist[0], (list, ndarray)) and x.ndim != 0):
    condlist = [condlist]

So if X is a Series, then condlist = [X <= x1, X >= x2] is a list of two Series.
Since condlist[0] is neither a list nor an ndarray, condlist gets "promoted" to a list of one condition:

condlist = [condlist]
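
To see why the check fires, the same `isinstance` test can be run by hand. This is only a sketch of the test quoted above, with illustrative variable names; what `np.piecewise` does after the check may differ between NumPy versions.

```python
import numpy as np
import pandas as pd

X = pd.Series(np.linspace(0, 100, 5))
x1, x2 = 30, 60

# Conditions built from a Series are themselves Series, not ndarrays,
# so they fail the (list, ndarray) test and trigger the promotion.
series_conds = [X <= x1, X >= x2]
is_array_like = isinstance(series_conds[0], (list, np.ndarray))

# Conditions built from the underlying ndarray pass the same test.
arr = X.values
array_conds = [arr <= x1, arr >= x2]
is_array_like_after = isinstance(array_conds[0], (list, np.ndarray))

print(is_array_like, is_array_like_after)
```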

Since this is not what we want to happen, we need to make condlist a list of NumPy arrays before passing it to np.piecewise:

X = X.values
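
`X.values` is one way to reach the underlying array; `Series.to_numpy()` is the accessor that newer pandas documentation recommends. Below is a minimal sketch, assuming the input may be either a Series or a plain array. The helper name `as_float_array` is made up for this illustration. Converting to float also matters because `np.piecewise` returns an array with the same dtype as its input, so integer input would give integer output.

```python
import numpy as np
import pandas as pd

def as_float_array(X):
    """Return X as a float ndarray, whether it is a Series, list or ndarray."""
    if isinstance(X, pd.Series):
        return X.to_numpy(dtype=float)
    return np.asarray(X, dtype=float)

X = pd.Series([10, 30, 45, 60, 90])
x = as_float_array(X)
x1, x2 = 30, 60

# With ndarray conditions, the third function acts as the default
# wherever neither condition holds.
y = np.piecewise(x, [x <= x1, x >= x2],
                 [lambda v: -1.0, lambda v: 1.0, lambda v: 0.0])
print(y)
```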

For example,

import numpy as np
import pandas as pd
def model_b(X, x1, x2, m1, b1, m2, b2):
    def lo(x):
        return m1 * x + b1
    def hi(x):
        return m2 * x + b2
    def mid(x):
        y1 = m1 * x + b1
        y2 = m2 * x + b2
        # a, b, c, d = fit_cubic(x1, y1, x2, y2, m1, m2)
        a, b, c, d = 1, 2, 3, 4
        return a * x * x * x + b * x * x + c * x + d
    X = X.values    # use the underlying ndarray so the conditions are ndarrays too
    return np.piecewise(X, [X<=x1, X>=x2], [lo, hi, mid])

X = pd.Series(np.linspace(0, 100, 100))
x1, x2, m1, b1, m2, b2 = 30, 60, 10, 5, -20, 30
f = model_b(X, x1, x2, m1, b1, m2, b2)
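
The example above uses placeholder coefficients because the question does not show `fit_cubic`. One possible way to fill it in, offered as a sketch and not as the asker's actual implementation, is to choose the cubic that matches both lines in value and slope at `x1` and `x2`. The body of `fit_cubic` and the name `model_c` are assumptions made for this illustration. Note that the endpoint values are evaluated at `x1` and `x2` themselves, not over the whole array.

```python
import numpy as np
import pandas as pd

def fit_cubic(x1, y1, x2, y2, m1, m2):
    """Hypothetical helper: coefficients of a*x^3 + b*x^2 + c*x + d that
    passes through (x1, y1) and (x2, y2) with slopes m1 and m2 there."""
    A = np.array([
        [x1**3, x1**2, x1, 1.0],        # p(x1)  = y1
        [x2**3, x2**2, x2, 1.0],        # p(x2)  = y2
        [3 * x1**2, 2 * x1, 1.0, 0.0],  # p'(x1) = m1
        [3 * x2**2, 2 * x2, 1.0, 0.0],  # p'(x2) = m2
    ])
    return np.linalg.solve(A, np.array([y1, y2, m1, m2], dtype=float))

def model_c(X, x1, x2, m1, b1, m2, b2):
    x = np.asarray(X, dtype=float)   # accepts a Series or an ndarray
    # Endpoint values are scalars taken from the two lines at x1 and x2.
    a, b, c, d = fit_cubic(x1, m1 * x1 + b1, x2, m2 * x2 + b2, m1, m2)

    def lo(v):
        return m1 * v + b1
    def hi(v):
        return m2 * v + b2
    def mid(v):
        return ((a * v + b) * v + c) * v + d   # Horner form of the cubic

    return np.piecewise(x, [x <= x1, x >= x2], [lo, hi, mid])

X = pd.Series(np.linspace(0, 100, 101))
f = model_c(X, 30, 60, 10, 5, -20, 30)
```

Because `model_c` converts its input itself, it should accept either a Series or an array, which may make it easier to hand to `curve_fit`. Whether fitting the breakpoints `x1` and `x2` converges well depends on the data and the starting guess.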
