我想用简单的线性回归预测未来某个日期的值,但我不能因为日期格式.
这是我的数据框:
data_df =
date value
2016-01-15 1555
2016-01-16 1678
2016-01-17 1789
...
y = np.asarray(data_df['value'])
X = data_df[['date']]
X_train, X_test, y_train, y_test = train_test_split
(X,y,train_size=.7,random_state=42)
model = LinearRegression() #create linear regression object
model.fit(X_train, y_train) #train model on train data
model.score(X_train, y_train) #check score
print (‘Coefficient: \n’, model.coef_)
print (‘Intercept: \n’, model.intercept_)
coefs = zip(model.coef_, X.columns)
model.__dict__
print "sl = %.1f + " % model.intercept_ + \
" + ".join("%.1f %s" % coef for coef in coefs) #linear model
我试图将日期转换为失败
data_df['conv_date'] = data_df.date.apply(lambda x: x.toordinal())
data_df['conv_date'] = pd.to_datetime(data_df.date, format="%Y-%M-%D")
解决方法:
线性回归不适用于日期数据.因此我们需要将其转换为数值.以下代码将日期转换为数值:
import datetime as dt
data_df['Date'] = pd.to_datetime(data_df['Date'])
data_df['Date']=data_df['Date'].map(dt.datetime.toordinal)
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 [email protected] 举报,一经查实,本站将立刻删除。