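I've just started using Orange and can't find out how to get basic summary statistics such as n (count), mean, and standard deviation.
Is there a widget that does this that I have simply overlooked?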
I found 3 ways to do this:
For those of us looking for more scripting examples of the Orange3 API on the web, here is a solution that worked for me.
from Orange.data import Domain, Table, ContinuousVariable, DiscreteVariable
import numpy as np

# in_data / out_data are the input and output tables of the Python Script widget
continuous = [d for d in in_data.domain.variables if type(d) is ContinuousVariable]
# Map the input onto the continuous variables only, so X holds exactly those columns
X = in_data.transform(Domain(continuous)).X
rows = ["mean", "std", "min", "max", "range"]
# One output row per statistic, labelled by the "stat" meta column
dom = Domain(continuous, metas=[DiscreteVariable(name="stat", values=rows)])
summary = np.array([
    np.mean(X, axis=0),
    np.std(X, axis=0),
    np.min(X, axis=0),
    np.max(X, axis=0),
    np.ptp(X, axis=0),  # range = max - min
])
meta = [[i] for i in range(len(rows))]
out_data = Table.from_numpy(dom, summary, metas=meta)
Thanks. This worked for me after a few small changes:
# Domain and Table are imported directly; the pandas / table_from_frame
# detour was only there to get hold of a Table object
from Orange.data import Domain, Table, ContinuousVariable, DiscreteVariable
import numpy as np

# in_data / out_data are the input and output tables of the Python Script widget
continuous = [d for d in in_data.domain.variables if type(d) is ContinuousVariable]
# Map the input onto the continuous variables only, so X holds exactly those columns
X = in_data.transform(Domain(continuous)).X
rows = ["mean", "std", "min", "max", "range"]
# One output row per statistic, labelled by the "stat" meta column
dom = Domain(continuous, metas=[DiscreteVariable(name="stat", values=rows)])
summary = np.array([
    np.mean(X, axis=0),
    np.std(X, axis=0),
    np.min(X, axis=0),
    np.max(X, axis=0),
    np.ptp(X, axis=0),  # range = max - min
])
meta = [[i] for i in range(len(rows))]
out_data = Table.from_numpy(dom, summary, metas=meta)
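One more note, since the question also asked for n (count): the scripts above do not compute it, and np.mean / np.std return NaN for any column that contains a missing value. Below is a rough NumPy-only sketch of a NaN-aware version. The function name summarize and the sample array are made up for illustration; in the widget you would pass the numeric array taken from in_data instead. It also uses ddof=1 for the sample standard deviation, whereas np.std defaults to the population version.

```python
import numpy as np

def summarize(X):
    """Return per-column n, mean, and sample standard deviation, ignoring NaN."""
    X = np.asarray(X, dtype=float)
    n = np.sum(~np.isnan(X), axis=0)    # number of non-missing values per column
    mean = np.nanmean(X, axis=0)        # mean over non-missing values
    std = np.nanstd(X, axis=0, ddof=1)  # ddof=1: sample standard deviation
    return n, mean, std

# Hypothetical data standing in for the widget's numeric array; NaN marks a missing value
X = np.array([[1.0, 10.0],
              [2.0, np.nan],
              [3.0, 30.0]])
n, mean, std = summarize(X)
print(n, mean, std)
```

If you want this in the output table, n could be added as one more row next to "mean" and "std", with "n" appended to the rows list.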